This repository was archived by the owner on Nov 19, 2020. It is now read-only.

Liblinear (Linear SVMs) does not train, exits with "index out of range on Math.Accord..." #330

@andy-soft

Hi there, I was trying to reproduce the simple example with the a9a dataset (included).
I compiled the library into train.exe with Visual Studio 2013.
That went fine, so I ran it with the command-line parameters: -s 2 a9a
(the 'a9a' file is in the same directory).
The file is read perfectly, but on calling train(problem, parameters); the program exits with this error
(this is the console output):

L2RegularizedL2LossSvc
iter 1 act 1.174E+003 pre 1.161E+003 delta 5.718E-001 f 2.348E+003 |g| 6.779E+003 CG 2
cg reaches trust region boundary
iter 2 act 1.424E+002 pre 1.253E+002 delta 6.722E-001 f 1.174E+003 |g| 4.085E+001 CG 4
iter 3 act 3.261E+001 pre 2.966E+001 delta 6.722E-001 f 1.032E+003 |g| 3.819E+001 CG 6
cg reaches trust region boundary
iter 4 act 5.202E+000 pre 4.930E+000 delta 7.117E-001 f 9.989E+002 |g| 2.083E+001 CG 14
A first chance exception of type 'System.IndexOutOfRangeException' occurred in Accord.Math.dll
The program '[0x548] linear.vshost.exe' has exited with code 1 (0x1).
Any clue?

I could not debug that deeply; I don't know all the implementation tricks and issues!

Comment
Actually, I saw that all the matrix math and vector training is performed over dense matrices.
Therefore I cannot load a huge sparse problem into memory; I tried, and it cannot be read.

The LIBLINEAR C++ code stores the data internally as sparse arrays of (index, value) pairs and is really very fast: it trains on a whole 99-megabyte text file (240k samples, 70 jagged parameters) in under 2 seconds. The same code in C# does not finish after several hours.
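To make the (index, value) idea concrete, here is a minimal sketch in Python, used only as neutral illustration. It is not the LIBLINEAR or Accord.NET code. It assumes the usual "label index:value ..." text layout of data files such as a9a, with 1-based feature indices. The names `parse_libsvm_line` and `sparse_dot` are hypothetical, and the explicit bounds check is a choice of this sketch.

```python
from typing import List, Tuple

# One sample stored sparsely: only the non-zero features are kept,
# each as a (1-based feature index, value) pair.
SparseRow = List[Tuple[int, float]]


def parse_libsvm_line(line: str) -> Tuple[float, SparseRow]:
    """Parse one 'label index:value index:value ...' line into a sparse row."""
    parts = line.split()
    label = float(parts[0])
    row: SparseRow = []
    for token in parts[1:]:
        index, value = token.split(":")
        row.append((int(index), float(value)))
    return label, row


def sparse_dot(row: SparseRow, weights: List[float]) -> float:
    """Dot product that visits only the stored non-zero entries."""
    total = 0.0
    for index, value in row:
        # Assumption of this sketch: an index outside the weight vector
        # is reported explicitly instead of being silently skipped.
        if index < 1 or index > len(weights):
            raise IndexError(f"feature index {index} outside 1..{len(weights)}")
        total += weights[index - 1] * value
    return total
```

The cost of `sparse_dot` grows with the number of stored entries in the row, not with the total number of features, which is where the speed and memory advantage on sparse data would come from.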

Another thing
I want to know whether the 'model' files are compatible between the C++ version and your C# version, and whether the support vectors are loaded the same way, so that I can train with the original C++ code, load the model file in C#, and just call Decide().
Am I right?
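For reference, a minimal sketch of what reading such a model by hand might look like, written in Python as neutral illustration. It assumes the plain-text layout that LIBLINEAR's C++ tools write: header lines `solver_type`, `nr_class`, `label`, `nr_feature`, `bias`, then a line `w` followed by one row of weights per feature. It covers only the two-class case with a single weight column. A linear model of this kind stores a weight vector rather than support vectors. Whether the C# loader accepts the same file is not something this sketch establishes, and the function names are hypothetical.

```python
from typing import Dict, List, Tuple


def parse_liblinear_model(text: str) -> Dict[str, list]:
    """Read the header fields and weight rows of a LIBLINEAR-style text model."""
    lines = text.strip().splitlines()
    model: Dict[str, list] = {}
    i = 0
    # Header: 'key value value ...' lines until the lone 'w' marker.
    while lines[i].strip() != "w":
        key, *values = lines[i].split()
        model[key] = values
        i += 1
    # Body: one row of weights per feature (plus one bias row if bias >= 0).
    model["w"] = [[float(v) for v in ln.split()] for ln in lines[i + 1:] if ln.strip()]
    return model


def decision_value(model: Dict[str, list], row: List[Tuple[int, float]]) -> float:
    """Linear score w . x for a two-class model; row holds (1-based index, value)."""
    nr_feature = int(model["nr_feature"][0])
    bias = float(model["bias"][0])
    w = model["w"]
    score = 0.0
    for index, value in row:
        if 1 <= index <= nr_feature:  # indices unknown to the model are skipped
            score += w[index - 1][0] * value
    if bias >= 0:  # the extra weight row multiplies the constant bias feature
        score += w[nr_feature][0] * bias
    return score


def predict_label(model: Dict[str, list], row: List[Tuple[int, float]]) -> int:
    """A positive score maps to the first listed label, otherwise the second."""
    first, second = (int(v) for v in model["label"][:2])
    return first if decision_value(model, row) > 0 else second
```

If both implementations compute the same linear score from the same weights, a comparison of `decision_value` outputs on a few samples would be a quick way to check compatibility.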

And thanks for such a good job!

I guess a sparse vector implementation may be faster and less memory-hungry than the dense one (on sparse data, of course).

I do lots of NLP work and I use C#, so I need your code. I have been using some code I developed on my own, but you implement new algorithms faster than I can keep up with.

I even asked you about CRF some time ago and you just did it!
The problem is the sparse data: I have tons of training corpora, and the problem does not fit in memory (I have only a miserable 8 GB on W10x64, and I guess it sometimes needs 120 GB or more).
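A rough back-of-the-envelope sketch of why the dense form blows up, in Python as neutral illustration. The numbers are assumptions: 240,000 samples as in the file mentioned earlier, about 70 stored values per sample, a hypothetical 100,000 distinct features, 8-byte doubles and 4-byte indices. The function names are made up for this example.

```python
GIB = 1024 ** 3  # bytes in one gibibyte


def dense_bytes(n_samples: int, n_features: int, bytes_per_value: int = 8) -> int:
    """Memory for a full n_samples x n_features matrix of doubles."""
    return n_samples * n_features * bytes_per_value


def sparse_bytes(n_nonzeros: int, bytes_per_index: int = 4, bytes_per_value: int = 8) -> int:
    """Memory for storing only the non-zeros as (index, value) pairs."""
    return n_nonzeros * (bytes_per_index + bytes_per_value)


# Assumed sizes (hypothetical, see the text above).
samples = 240_000
features = 100_000
nonzeros_per_sample = 70

dense = dense_bytes(samples, features)
sparse = sparse_bytes(samples * nonzeros_per_sample)
print(f"dense:  {dense / GIB:.1f} GiB")
print(f"sparse: {sparse / GIB:.2f} GiB")
```

Under these assumptions the dense matrix needs about 179 GiB while the sparse pairs need about 0.19 GiB, so the same problem could fit comfortably in 8 GB when stored sparsely.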

I am also thinking about using CUDA and optimized code, because training a deep belief network with more than 10k dimensions and several deep layers becomes impossible in human time (weeks of training),
whereas with CUDA it can come down to a few minutes, rarely going into hours.

Best regards, and I hope we can find this bug, or whatever it is.
