Hessian Matrix
We will look at what the Hessian matrix means and at positive semidefinite and negative semidefinite matrices, in order to define convex and concave functions. So let us dive in — it should be fun!
The Hessian matrix is the matrix of second-order partial derivatives of a function: for a function f of n variables, the entry H[i][j] is ∂²f/∂xi∂xj.
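As a quick sketch, here is one way to approximate the Hessian numerically with central finite differences (the function `hessian` and the test function f are my own illustrative choices, not from any particular library):

```python
import numpy as np

def hessian(f, x, eps=1e-4):
    """Approximate the Hessian of f at point x via central finite differences."""
    n = len(x)
    H = np.zeros((n, n))
    for i in range(n):
        for j in range(n):
            x_pp = x.copy(); x_pp[i] += eps; x_pp[j] += eps
            x_pm = x.copy(); x_pm[i] += eps; x_pm[j] -= eps
            x_mp = x.copy(); x_mp[i] -= eps; x_mp[j] += eps
            x_mm = x.copy(); x_mm[i] -= eps; x_mm[j] -= eps
            # Mixed second partial: d²f / dx_i dx_j
            H[i, j] = (f(x_pp) - f(x_pm) - f(x_mp) + f(x_mm)) / (4 * eps**2)
    return H

# f(x, y) = x² + xy + y²  →  exact Hessian is [[2, 1], [1, 2]]
f = lambda x: x[0]**2 + x[0]*x[1] + x[1]**2
H = hessian(f, np.array([1.0, 1.0]))
print(H)  # approximately [[2, 1], [1, 2]]
```

For a quadratic like this the Hessian is the same at every point; in general it varies with x.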
Positive Semidefinite Matrix :
For a given Hessian matrix H, if for every vector v we have
transpose(v).H.v ≥ 0, then H is positive semidefinite.
Similarly, H is negative semidefinite if transpose(v).H.v ≤ 0 for every vector v.
If the Hessian is positive semidefinite everywhere, the function is convex; if it is negative semidefinite everywhere, the function is concave. If it is neither, the function is neither convex nor concave.
Okay, but what is convex and concave function?
For a function of a single variable, the Hessian reduces to the second derivative: the function is convex if f″(x) ≥ 0 everywhere, and concave if f″(x) ≤ 0 everywhere.
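A quick numerical sanity check of this single-variable condition (the helper `second_derivative` is my own sketch, using a central-difference approximation):

```python
import numpy as np

def second_derivative(f, x, eps=1e-4):
    # Central-difference approximation of f''(x).
    return (f(x + eps) - 2 * f(x) + f(x - eps)) / eps**2

xs = np.linspace(0.1, 3.0, 100)

# x² is convex: its second derivative (exactly 2) is >= 0 everywhere.
convex = all(second_derivative(lambda t: t**2, x) >= 0 for x in xs)

# log(x) is concave on (0, ∞): its second derivative -1/x² is <= 0.
concave = all(second_derivative(np.log, x) <= 0 for x in xs)

print(convex, concave)  # True True
```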
Local Minima
What if we get stuck in a local minimum for non-convex functions (which most neural network loss surfaces are)? Well, one common remedy is to use more neurons (caution: don't overfit). Why does it work? I don't know. CS theorists have made a lot of progress proving that gradient descent converges to a global minimum for some non-convex problems, including some specific neural net architectures.