Hi everyone -- I'm fairly new to machine learning and was hoping to get some guidance...
I'd like to construct a proof-of-concept algorithm that takes an MP3-encoded song and outputs two files of the same length: one "acapella" and one "instrumental". This was the first thing I thought of when reading about the Cocktail Party Algorithm.
My thinking is that one could approach this as an ML problem by training a neural network on studio-quality (song, acapella, instrumental) tuples.
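To make the idea concrete, here is a rough sketch of what one training example might look like. Everything in it is an assumption on my part rather than something I've validated: I'm guessing the network would see a magnitude spectrogram of the mix and be asked to predict a per-bin "ratio mask" computed from the separate vocal and instrumental tracks. The function names (`stft_mag`, `ratio_mask`) and the toy sine-wave "tracks" are made up for illustration, and a real version would decode the MP3s and use NumPy or a proper audio library instead of this slow pure-Python DFT.

```python
import cmath
import math

def stft_mag(signal, frame_len=64, hop=32):
    """Magnitude spectrogram: one list of frequency-bin magnitudes per frame."""
    # Hann window reduces spectral leakage at the frame edges.
    window = [0.5 - 0.5 * math.cos(2 * math.pi * n / frame_len)
              for n in range(frame_len)]
    frames = []
    for start in range(0, len(signal) - frame_len + 1, hop):
        chunk = [signal[start + n] * window[n] for n in range(frame_len)]
        # Naive DFT over the non-negative frequency bins only.
        mags = []
        for k in range(frame_len // 2 + 1):
            acc = sum(chunk[n] * cmath.exp(-2j * math.pi * k * n / frame_len)
                      for n in range(frame_len))
            mags.append(abs(acc))
        frames.append(mags)
    return frames

def ratio_mask(vocal_mag, inst_mag, eps=1e-8):
    """Per-bin fraction of magnitude that belongs to the vocal (0 to 1)."""
    return [[v / (v + i + eps) for v, i in zip(v_frame, i_frame)]
            for v_frame, i_frame in zip(vocal_mag, inst_mag)]

# Toy stand-ins for the studio tracks: two pure tones at 8 kHz sample rate.
sr, n_samples = 8000, 512
vocal = [math.sin(2 * math.pi * 1000 * t / sr) for t in range(n_samples)]
inst = [math.sin(2 * math.pi * 250 * t / sr) for t in range(n_samples)]
mix = [v + i for v, i in zip(vocal, inst)]

mix_mag = stft_mag(mix)                              # hypothetical network input
mask = ratio_mask(stft_mag(vocal), stft_mag(inst))   # hypothetical training target

# Applying the mask to the mix gives an estimate of the vocal spectrogram.
vocal_est = [[m * x for m, x in zip(m_frame, x_frame)]
             for m_frame, x_frame in zip(mask, mix_mag)]
```

With 64-sample frames at 8 kHz the bins are 125 Hz apart, so the 1000 Hz "vocal" lands in bin 8 and the 250 Hz "instrumental" in bin 2, and the mask ends up near 1 and near 0 in those bins respectively. Real songs overlap heavily in frequency, so I'd expect the mask to be much messier there, and turning the estimate back into audio would still need the mix's phase and an inverse transform.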
As a beginner, the questions I have are:
- Is this idea feasible?
- How might you preprocess the data (if at all)?
- What do you think would be the biggest roadblock?
- How much training data would be necessary?
- What kind of classifiers (in addition to or in lieu of a backprop NN) would you use to increase accuracy?
- Other general advice?
Thanks all in advance!