Hi everyone, I've found many great resources here, and now that I've learned a bit (and I'm still overwhelmed by how much there is to know) I have a couple of questions:
What is the process for improving the accuracy of a model? Say I have a data set with n features and have chosen an algorithm. I can't just blast the data through the algorithm and expect a perfect fit. How do you choose the weights for the features, or the features themselves? Are there any other ways of improving the learning process?
How do you handle missing values? I took a look at Kaggle's Higgs Boson competition, and a good part of the data has null values in anywhere from one to x columns. I thought about dividing the data into segments by which columns are missing (a, b, c, ab, bc, ac, abc), but first of all the number of segments grows as 2^x, and I don't know whether doing so would break some statistical relationships in the data.
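A minimal sketch of the segmentation idea described above, to make the 2^x growth concrete. It assumes missing values are encoded as `None` (the actual competition data may use a different marker), the toy rows are made up, and the helper names `missing_pattern`, `segment_by_missing` and `max_segments` are hypothetical.

```python
from collections import defaultdict


def missing_pattern(row, columns):
    """Return the tuple of column names whose value is missing (None) in this row."""
    return tuple(c for c in columns if row[c] is None)


def segment_by_missing(rows, columns):
    """Group rows by their missing-column pattern."""
    segments = defaultdict(list)
    for row in rows:
        segments[missing_pattern(row, columns)].append(row)
    return dict(segments)


def max_segments(columns):
    """Upper bound on the number of segments: every subset of columns, including none."""
    return 2 ** len(columns)


# Toy data, invented for illustration only.
columns = ["a", "b", "c"]
rows = [
    {"a": 1.0, "b": 2.0, "c": 3.0},
    {"a": None, "b": 2.5, "c": 3.1},
    {"a": None, "b": None, "c": 0.4},
    {"a": 0.7, "b": 1.9, "c": 2.2},
    {"a": None, "b": 2.1, "c": 1.0},
]

segments = segment_by_missing(rows, columns)
print("possible segments:", max_segments(columns))
print("segments actually present:", len(segments))
for pattern, members in sorted(segments.items()):
    print(pattern or "(complete)", len(members))
```

In practice the number of patterns that actually occur can be much smaller than the 2^x upper bound, so counting them first seems worthwhile before deciding whether per-segment models are feasible.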
I've looked at job offerings (in Europe) and more than half of them require a PhD. Is this normal?
Thank you all for any guidance.