Hey,
I've got a multi-label classification problem.
My dataset has one primary key attribute, 12 numeric attributes and 16 binary label attributes (each 0 or 1). There are 814 instances in total. Of these, 375 have no label data and are therefore the target (prediction) set. The remaining 439 instances have to be used to train and test the model.
This is my first time applying data mining methods (with the MULAN framework), and I'm wondering: should I use all 439 labeled instances for both training AND testing, or split them up? There are so few of them, and I have no experience with what split ratio works best.
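To make the question concrete, here is a rough sketch of the two ways of splitting that I have in mind: a single hold-out split versus k-fold cross-validation. It is plain Python index arithmetic, not MULAN code, and the 70/30 ratio, k = 10, the seed and the function names are all placeholder assumptions of mine, not values I know to be good.

```python
import random

N_LABELED = 439  # 814 instances in total minus 375 without label data


def holdout_split(n, train_fraction=0.7, seed=0):
    """Single random split into training and test indices (ratio is an assumption)."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    cut = int(n * train_fraction)
    return idx[:cut], idx[cut:]


def kfold_splits(n, k=10, seed=0):
    """Yield (train, test) index lists; each instance is tested exactly once."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]  # k nearly equal-sized folds
    for i in range(k):
        test = folds[i]
        train = [j for f, fold in enumerate(folds) if f != i for j in fold]
        yield train, test


train, test = holdout_split(N_LABELED)
print("hold-out:", len(train), "train /", len(test), "test")

fold_sizes = [len(te) for _, te in kfold_splits(N_LABELED)]
print("10-fold test sizes:", fold_sizes)
```

With the hold-out split only part of the 439 instances is ever used for testing, while with k-fold every labeled instance ends up in a test fold once, which seems relevant given how little data I have.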
I'm not sure if this subreddit is the right place for such a question, but I'd appreciate any advice.
Regards.