I'm playing around with neural nets in Haskell, and I've got this code:
import AI.HNN.FF.Network
import Numeric.LinearAlgebra

main :: IO ()
main = do
    net <- createNetwork 2 [5] 1 :: IO (Network Double)
    print samples
    let betterNet = trainNTimes 100 0.8 tanh tanh' net samples
    print (output betterNet tanh (fromList [11,10]))
    print "hi"

samples :: Samples Double
samples = [(fromList [a,b], fromList [out]) | a <- [1..25], b <- [1..25], let out = if a > b then 1 else 0]
Basically I give it input of the form [a, b], and the output is [c], where c is 1 when a > b and 0 when a <= b. Using tanh as the activation function gives terrible results: when b >= a, the result is usually around -0.9, and when a > b, the result is ~0.85 but varies wildly, sometimes going as low as 0.3. Then I just swap out every instance of tanh for sigmoid, and suddenly it works flawlessly, giving ~0.9999 when a > b and something like 3×10⁻⁵ when b >= a.
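For reference, the sigmoid version is the same program with the swap applied everywhere; assuming the sigmoid and sigmoid' that AI.HNN.FF.Network exports alongside tanh', the changed lines are just:

    -- same training call, with sigmoid/sigmoid' in place of tanh/tanh'
    let betterNet = trainNTimes 100 0.8 sigmoid sigmoid' net samples
    print (output betterNet sigmoid (fromList [11,10]))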
I really don't have much of an idea what I'm doing, so I'm hoping someone can explain the difference in performance, since everything I've been reading about the two makes it seem like they're almost interchangeable.