I am thinking about creating a new descriptor vector from some rather irregular biological data, for classification via SVM. For example, one property I want to measure may occur anywhere from zero to many times. I recently learned about normal forms in database design, and it would be nice if there were a comparable set of general rules or guidelines for handling data with different properties when designing a descriptor vector.