Hi guys,
I'm a little confused and I'm hoping you can help me out. Suppose you have a non-convex differentiable function that has only one minimum. Gradient descent should find that one minimum (the global minimum) even though the function is non-convex, correct?
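To sanity-check that intuition I put together a quick toy experiment (my own made-up example, not anything from the notes): f(x) = log(1 + x^2) is differentiable and non-convex, but it has a single minimum at x = 0, and plain gradient descent finds it.

    import numpy as np

    # Toy example (made up by me): f(x) = log(1 + x^2) is differentiable
    # and non-convex, yet it has a single minimum at x = 0.

    def f(x):
        return np.log(1.0 + x ** 2)

    def grad_f(x):
        return 2.0 * x / (1.0 + x ** 2)

    x = 5.0            # arbitrary starting point
    step = 0.5         # fixed learning rate
    for _ in range(200):
        x -= step * grad_f(x)

    print(x, f(x))     # x ends up essentially at 0, the unique minimum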
Assuming the above is correct, the only way gradient descent would fail is if the differentiable function had more than one minimum (local minima). Looking at http://holehouse.org/mlclass/06_Logistic_Regression.html, in the "Cost function for logistic regression" section, the author goes over the intuitive default cost function (the squared-error cost carried over from linear regression) and then explains that it won't work because it's non-convex and has local minima.

I've plotted one of these per-example terms here: http://www.wolframalpha.com/input/?i=plot+%281%2F%281%2Be%5E-x%29%29%5E2+ and it doesn't appear to have multiple minima. The entire cost function J(theta) is a sum of terms like this, so I played around with summing terms with different y values, like this: http://www.wolframalpha.com/input/?i=plot+%281%2F%281%2Be%5E-x%29%29%5E2+%2B+%281%2F%281%2Be%5E-x%29+-+1%29%5E2++ but there are still no local minima (only one minimum). So it seems as though gradient descent should be guaranteed to work here. I know I must be missing or misunderstanding something, but I can't figure out what.
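In case it helps, here is roughly what I've been doing in Wolfram Alpha, but as a script so I can throw more terms at it. It's only a sketch under my own assumptions (1-D feature, no bias term, squared-error cost J(theta) = sum_i (sigmoid(theta * x_i) - y_i)^2, made-up x/y values), not the cost from the notes verbatim:

    import numpy as np

    # Squared-error cost for a 1-D logistic model with no bias term.
    # The x_i / y_i values below are hypothetical, just something to scan.

    def sigmoid(z):
        return 1.0 / (1.0 + np.exp(-z))

    x = np.array([-2.0, 0.5, 1.0, 3.0])   # hypothetical feature values
    y = np.array([ 1.0, 0.0, 1.0, 0.0])   # hypothetical 0/1 labels

    def J(theta):
        return np.sum((sigmoid(theta * x) - y) ** 2)

    # Evaluate J on a grid of theta values and flag points that are lower
    # than both neighbours, i.e. numerical local minima.
    thetas = np.linspace(-10.0, 10.0, 2001)
    costs = np.array([J(t) for t in thetas])
    local_min = (costs[1:-1] < costs[:-2]) & (costs[1:-1] < costs[2:])
    print("grid points that look like local minima:", thetas[1:-1][local_min])

If there really can be multiple local minima for this cost, a pointer to a dataset where a scan like this actually finds more than one would also help me see where my reasoning goes wrong.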
Any help would be greatly appreciated. Thanks!