This repository was archived by the owner on Nov 19, 2020. It is now read-only.

Accord.Math -> Vector -> T[] Sample<T>(T[] values, int size) incorrect #862

@stas-vi

Description

  • [x] bug report

This looks like a bug in one of the Vector.Sample overloads:

    /// <summary>
    ///   Draws a random sample from a group of observations, without repetitions.
    /// </summary>
    /// 
    /// <typeparam name="T">The type of the observations.</typeparam>
    /// 
    /// <param name="values">The observation vector.</param>
    /// <param name="size">The size of the sample to be drawn (how many samples to get).</param>
    /// 
    /// <returns>A vector containing the samples drawn from <paramref name="values"/>.</returns>
    /// 
    public static T[] Sample<T>(T[] values, int size)
    {
        int[] idx = Vector.Sample(size);
        return values.Get(idx);
    }

From my understanding, this method should draw a random sample from the whole group of values.
However, the returned data is not a sample of the whole group: it never contains the last elements.
If we look at the inner idx array, its entries can only be as high as the size parameter allows, which means no value with an index of size or higher can ever be selected.

For example, given values [0,1,2,3] and a desired sample of 2, we call Sample(values, 2).
The returned result would never contain the elements 2 or 3.

Proposed fix:

    public static T[] Sample<T>(T[] values, int size)
    {
        int[] idx = Vector.Sample(size, values.Length);
        return values.Get(idx);
    }
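The difference between the two versions can be sketched without Accord at all. The snippet below is only an illustration under stated assumptions: SampleSketch, SampleIndices, SampleReported and SampleProposed are hypothetical names, not Accord's API, and SampleIndices assumes the behaviour described in this report, namely that the index helper returns distinct indices below its upper bound.

```csharp
using System;
using System.Linq;

// Hypothetical stand-ins for illustration only; these are NOT Accord's implementations.
public static class SampleSketch
{
    private static readonly Random Rng = new Random(0);

    // Assumed behaviour: 'size' distinct indices drawn from [0, populationSize).
    public static int[] SampleIndices(int size, int populationSize)
    {
        if (size > populationSize)
            throw new ArgumentOutOfRangeException(nameof(size));

        // Shuffle all indices by sorting on random keys, then keep the first 'size'.
        return Enumerable.Range(0, populationSize)
            .OrderBy(_ => Rng.Next())
            .Take(size)
            .ToArray();
    }

    // Mirrors the reported code: the index range is bounded by 'size'.
    public static T[] SampleReported<T>(T[] values, int size)
    {
        int[] idx = SampleIndices(size, size);
        return idx.Select(i => values[i]).ToArray();
    }

    // Mirrors the proposed fix: the index range is bounded by 'values.Length'.
    public static T[] SampleProposed<T>(T[] values, int size)
    {
        int[] idx = SampleIndices(size, values.Length);
        return idx.Select(i => values[i]).ToArray();
    }
}
```

With values [0,1,2,3] and size 2, SampleReported can only return 0 and 1 in some order, while SampleProposed can return any two distinct elements.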
  • Stumbled on this while working with the random forest coverage parameter.

Also wanted to say that it's a great project. Thanks for all the hard work!
