Channel: Machine Learning
Viewing all 57546 articles
Browse latest View live

Google acquires Katango for its Automatic Friend Finder based on ML clustering techniques


How do you decide the bins initially in Real AdaBoost?

Apache Mahout: Scalable machine learning for everyone

How can I acquire the text of reddit headlines from R?

Mentorship?

I am a noob machine learner, starting my career in an NLP job in a few months. I also plan to go to grad school in the next few years. I feel clueless a lot of the time, being mainly self-taught, and I don't have anyone to talk to about machine learning or to discuss research topics and papers with.

Is there any place or website where I can go to find mentors? Is there any redditor here who would be willing to mentor me? I'll be much obliged!

submitted by lone_haranguer
[link] [12 comments]

All code from "Machine Learning for Email" now on Github

Large scale ML

Hi all.

I work for a company that is looking to do large scale data mining. I was wondering if there are any solutions currently available for large scale distributed data mining. Thanks!

submitted by ml_noob
[link] [3 comments]

How to compare models with different dimensions

I have some data (~200 dim) and not that many examples (~300). I am trying to get the best generative model of the data. I am using GMMs to create the model, but I am adding a PCA reduction before this and building the GMM over the reduced subspace. The problem arises in comparing distributions with different dimensionality. I attempted to use cross-validation of the likelihood, but in one model the input is 100 dim and in another the input is some other dimension. My understanding is that I cannot compare these likelihoods directly, since lower-dimensional data will have more approximation error than higher-dimensional data (but, one would hope, fewer parameters and less noise specific to the training data). Any ideas on how to compare these in order to choose the best dimension to represent the data?

submitted by iHeartML
[link] [9 comments]
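
For readers following along, the setup described in this post can be sketched roughly as below. This is only an illustration under stated assumptions: it uses scikit-learn, synthetic Gaussian data in place of the poster's data, and a hypothetical helper named `cv_loglik`; the component count and covariance type are arbitrary choices, not the poster's. Each score is a log-density over a space of a different dimension, so, as the post itself says, the raw numbers are not directly comparable across values of `n_dims`.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.mixture import GaussianMixture
from sklearn.model_selection import KFold


def cv_loglik(X, n_dims, n_components=2, n_splits=5, seed=0):
    """Mean held-out log-likelihood per sample of a GMM fit in a PCA subspace.

    Hypothetical helper for illustration; not from the original post.
    """
    scores = []
    folds = KFold(n_splits=n_splits, shuffle=True, random_state=seed)
    for train_idx, test_idx in folds.split(X):
        # Fit the projection on the training fold only, to avoid leakage.
        pca = PCA(n_components=n_dims).fit(X[train_idx])
        gmm = GaussianMixture(
            n_components=n_components,
            covariance_type="diag",  # assumption: keeps the parameter count small
            reg_covar=1e-3,
            random_state=seed,
        )
        gmm.fit(pca.transform(X[train_idx]))
        # score() returns the average log-likelihood per sample.
        scores.append(gmm.score(pca.transform(X[test_idx])))
    return float(np.mean(scores))


# Synthetic stand-in for the data in the post: ~300 examples, ~200 dimensions.
rng = np.random.default_rng(0)
X = rng.normal(size=(300, 200))

results = {d: cv_loglik(X, d) for d in (5, 20, 50)}
for d, score in results.items():
    print(f"PCA dims={d:3d}  held-out log-likelihood per sample={score:.2f}")
```

On this synthetic data the score shifts with the subspace dimension alone, which is the comparison problem the post raises.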

How well does gradient descent work on extremely noisy data?

I'm interested in using gradient descent to predict the probability that a website visitor will click on an ad, given a number of things we know about them (geographic location, browser, operating system, referrer, etc).

Typical click-rates are around 0.1%, and obviously there are a lot of factors that play a part in whether or not the user clicks that aren't represented in the input attributes. This means that from the learning algorithm's perspective the output data is extremely noisy.

How well does gradient descent work on this kind of problem where you're essentially trying to pick out comparatively subtle relationships between the input and output data amidst a lot of noise?

Would a data mining approach be more effective here?

edit: In response to some comments, yes - I would use a logistic regression of some form because the output must be a probability.

edit2: In response to those asking about my cost function - my ultimate goal is, given multiple ads to choose from to show to a user, pick the one they are most likely to click on.

submitted by sanity
[link] [35 comments]
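
A minimal sketch of what the two edits in this post describe: logistic regression fit by batch gradient descent, then choosing the ad with the highest predicted click probability. Everything here is an assumption made for illustration: the data is synthetic with a click rate in the neighbourhood of 0.1%, the feature rows are random numbers rather than real visitor attributes, and `fit_logistic` and `predict_proba` are hypothetical helper names.

```python
import numpy as np


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def fit_logistic(X, y, lr=0.5, n_iters=2000):
    """Batch gradient descent on the mean log-loss; the bias is the last weight."""
    Xb = np.hstack([X, np.ones((X.shape[0], 1))])
    w = np.zeros(Xb.shape[1])
    for _ in range(n_iters):
        p = sigmoid(Xb @ w)
        # Gradient of the mean log-loss with respect to w.
        w -= lr * Xb.T @ (p - y) / len(y)
    return w


def predict_proba(X, w):
    Xb = np.hstack([X, np.ones((X.shape[0], 1))])
    return sigmoid(Xb @ w)


def log_loss(X, y, w):
    p = np.clip(predict_proba(X, w), 1e-12, 1 - 1e-12)
    return float(-np.mean(y * np.log(p) + (1 - y) * np.log(1 - p)))


# Synthetic impressions: rare clicks, weak signal in the features.
rng = np.random.default_rng(0)
n, d = 20000, 4
X = rng.normal(size=(n, d))
true_w = np.array([0.8, -0.5, 0.3, 0.0])  # made-up ground truth
y = (rng.random(n) < sigmoid(X @ true_w - 7.0)).astype(float)

w = fit_logistic(X, y)

# One hypothetical feature row per candidate ad for the same visitor.
candidates = rng.normal(size=(3, d))
best_ad = int(np.argmax(predict_proba(candidates, w)))
print("observed click rate:", y.mean())
print("log-loss:", log_loss(X, y, w), "chosen ad index:", best_ad)
```

Because only the ranking of the candidate ads matters for the stated goal, the sketch takes an argmax over predicted probabilities and does not threshold them.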

This Guy Broke Jeopardy’s All-Time Record… Using ML Techniques To Train Himself

More courses from Stanford

Using fmin from scipy optimize

Is using the fmin functions from scipy optimize a good way to train neural nets? And in neural nets, why are multipliers used in each weight of the perceptron? Has anyone tried powers or something more complicated on each weight of the perceptron?

submitted by marshallp
[link] [10 comments]
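
To make the first question concrete, here is a small sketch of training a network with one of the `fmin` family of functions in `scipy.optimize`. It assumes `fmin_bfgs` with its default finite-difference gradient, a tiny 2-3-1 network, and the XOR problem as toy data; the helper names `unpack`, `forward`, and `loss` are hypothetical. It shows that the approach can be wired up, not that it scales to large networks.

```python
import numpy as np
from scipy.optimize import fmin_bfgs

# Toy data: XOR.
X = np.array([[0.0, 0.0], [0.0, 1.0], [1.0, 0.0], [1.0, 1.0]])
y = np.array([0.0, 1.0, 1.0, 0.0])
N_HIDDEN = 3


def unpack(theta):
    """Split the flat parameter vector into the layer weights and biases."""
    W1 = theta[: 2 * N_HIDDEN].reshape(2, N_HIDDEN)
    b1 = theta[2 * N_HIDDEN : 3 * N_HIDDEN]
    W2 = theta[3 * N_HIDDEN : 4 * N_HIDDEN]
    b2 = theta[4 * N_HIDDEN]
    return W1, b1, W2, b2


def forward(theta, inputs):
    W1, b1, W2, b2 = unpack(theta)
    hidden = np.tanh(inputs @ W1 + b1)
    return 1.0 / (1.0 + np.exp(-(hidden @ W2 + b2)))


def loss(theta):
    """Mean squared error over the four XOR examples."""
    return float(np.mean((forward(theta, X) - y) ** 2))


rng = np.random.default_rng(0)
theta0 = rng.normal(scale=0.5, size=4 * N_HIDDEN + 1)

# No fprime is supplied, so the gradient is approximated numerically.
theta_opt = fmin_bfgs(loss, theta0, disp=False)

print("initial loss:", loss(theta0))
print("final loss:  ", loss(theta_opt))
```

The optimizer only sees a flat parameter vector and a scalar loss, so the network has to be packed and unpacked by hand. With a numerical gradient, each step costs on the order of one loss evaluation per parameter, which is one reason this gets expensive as the network grows.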

Can Big Data Fix Healthcare?

ML for making code recommendations (people that called X also called Y)

Data Scientist vs Statistician?

Hi, I thought this would be the most appropriate subreddit for this kind of thing. My question is: what exactly is the difference between the two? I tried googling the answer, but most people dodge the question or give an inaccurate description of statisticians.

submitted by SinisterSamurai
[link] [39 comments]

Am I planning it right for a PhD in ML?

I am a Master's student in EE from a pretty decent school, specializing in Image Processing and Machine Learning. I want to do a PhD in the long run, but I want to be fully prepared, armed with a solid knowledge of the basics and good experience in the field before taking my 5-year plunge. Basically, when I start my PhD, I want to hit the ground running, doing research, and not waste the first year just studying the basics.

Toward this end, my plan after my Master's is to do a research internship in some lab for a year and hope to get one or more publications out.

I just want to ask you guys, will this really help my PhD applications? I want to get into a good PhD program.

What about working in an ML startup? I know there are a LOT of companies out there with TONS of data. Would it boost my resume if I worked at such a core ML company (even if only a small startup) and gained good knowledge, but got no publications?

Thanks a lot!

submitted by marshmallowsOnFire
[link] [31 comments]

Ask r/ML: What to do when the size of the feature set is much larger than the training set?

Hi,

As mentioned in the title, what should I do? If there are resources I can read about this, that'd be great too. Thanks.

EDIT: Was going to use NN and SVM...

submitted by tshauck
[link] [14 comments]

I am interested in applying statistics/machine learning to the field of finance

I have a very strong background in finance, but a much weaker background in statistics and computer science. I am interested in learning this topic to apply to my work in financial research.

To be more specific, I am interested in using historical data about a company (the company 10K, for example, which includes the income statement, balance sheet, etc.) and predicting from this data their credit rating (which is given by a third party). There are a number of academic articles written on the topic that suggest high accuracy. However, these articles obviously aren't aimed at someone just exploring the field.

So where do I begin? If I wanted to accomplish something like this and get my feet wet, what software do I need? What books/references should I have to learn more? Avoiding returning to school, what are some of the basic books I should read, if any (such as linear algebra)?

Thanks for your comments.

submitted by rrbest
[link] [27 comments]

Any tips about becoming an AI engineer?

I'm extremely interested in studying AI in graduate school. Any tips on where I should go, and what I can expect as a career?

I am currently working on my Software Engineering degree at Oregon Institute of Technology.

submitted by Rakosman
[link] [24 comments]

Functional and Parallel time series cross-validation
