This repository was archived by the owner on Nov 19, 2020. It is now read-only.
Looks like a bug in one of the Vector.Sample methods
/// <summary>
/// Draws a random sample from a group of observations, without repetitions.
/// </summary>
///
/// <typeparam name="T">The type of the observations.</typeparam>
///
/// <param name="values">The observation vector.</param>
/// <param name="size">The size of the sample to be drawn (how many samples to get).</param>
///
/// <returns>A vector containing the samples drawn from <paramref name="values"/>.</returns>
///
public static T[] Sample<T>(T[] values, int size)
{
    int[] idx = Vector.Sample(size);
    return values.Get(idx);
}
From my understanding, the method above should draw a random sample from a group of values.
However, the returned sample is not drawn from the whole array and never contains the last elements.
If we look at the inner idx variable, its entries are bounded by the size parameter, so no element of values at an index of size or higher can ever be selected.
For example, say values is [0,1,2,3] and we want a random sample of 2, so we call Sample(values, 2).
The returned result would never contain the elements 2 or 3.
Here is a proposed fix:
public static T[] Sample<T>(T[] values, int size)
{
    int[] idx = Vector.Sample(size, values.Length);
    return values.Get(idx);
}
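To make the difference easier to see, here is a minimal standalone sketch that does not depend on the library. `DrawIndices` is a hypothetical stand-in I wrote for this illustration (it is not the library's implementation); it assumes the index helper returns `size` distinct indices from `[0, max)`. `SampleBuggy` and `SampleFixed` are likewise illustrative names that mirror the two versions of the method discussed here.

```csharp
using System;
using System.Linq;

public static class SampleSketch
{
    // Fixed seed so repeated runs behave the same way.
    private static readonly Random Rng = new Random(0);

    // Hypothetical stand-in for the index helper (an assumption, not library code):
    // returns 'size' distinct indices drawn from [0, max). Requires size <= max.
    public static int[] DrawIndices(int size, int max)
    {
        int[] all = Enumerable.Range(0, max).ToArray();
        // Partial Fisher-Yates shuffle: only the first 'size' slots are randomized.
        for (int i = 0; i < size; i++)
        {
            int j = Rng.Next(i, max); // j is in [i, max)
            (all[i], all[j]) = (all[j], all[i]);
        }
        return all.Take(size).ToArray();
    }

    // Mirrors the reported behaviour: indices are limited to [0, size).
    public static T[] SampleBuggy<T>(T[] values, int size) =>
        DrawIndices(size, size).Select(i => values[i]).ToArray();

    // Mirrors the proposed fix: indices range over [0, values.Length).
    public static T[] SampleFixed<T>(T[] values, int size) =>
        DrawIndices(size, values.Length).Select(i => values[i]).ToArray();

    public static void Main()
    {
        int[] values = { 0, 1, 2, 3 };
        bool buggySawTail = false, fixedSawTail = false;

        for (int trial = 0; trial < 1000; trial++)
        {
            buggySawTail |= SampleBuggy(values, 2).Any(v => v >= 2);
            fixedSawTail |= SampleFixed(values, 2).Any(v => v >= 2);
        }

        Console.WriteLine($"buggy version ever returned 2 or 3: {buggySawTail}");
        Console.WriteLine($"fixed version ever returned 2 or 3: {fixedSawTail}");
    }
}
```

Under these assumptions the buggy variant can only ever return a reordering of the first `size` elements, so its flag stays false no matter how many trials are run, while the fixed variant should reach the tail elements within a handful of trials.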
I stumbled on this while working with the random forest coverage parameter.
I also wanted to say that it's a great project. Thanks for all the hard work!