Information Geometry for Neural Networks

D. A. Wagenaar

Term paper for a reading course with A. C. C. Coolen, King's College London, 1998. [Full text (pdf)]

Information geometry is the result of applying non-Euclidean geometry to probability theory. The present work introduces some of the basics of information geometry with an eye toward applications in neural network research. The Fisher metric and Amari's α-connections are introduced, and a proof of the uniqueness of the former is sketched. Dual connections and dual coordinate systems are discussed, as is the associated divergence. It is shown how information geometry promises to shorten the learning times of gradient descent. Thanks to an appendix on Riemannian geometry, the text should be mostly self-contained.
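As a pointer to the central objects the paper discusses, the following display is a sketch under standard conventions (it is not quoted from the text; the family $p(x;\theta)$, the loss $L(\theta)$, and the learning rate $\eta$ are assumed notation). It shows the Fisher information metric on a parametric statistical model and the natural-gradient update it induces:

$$
g_{ij}(\theta) \;=\; \mathbb{E}_{p(x;\theta)}\!\left[\frac{\partial \log p(x;\theta)}{\partial \theta^i}\,\frac{\partial \log p(x;\theta)}{\partial \theta^j}\right],
\qquad
\theta_{t+1} \;=\; \theta_t \;-\; \eta\, G(\theta_t)^{-1}\,\nabla_\theta L(\theta_t),
$$

where $G(\theta) = \bigl(g_{ij}(\theta)\bigr)$ is the Fisher matrix. Replacing the Euclidean gradient by $G^{-1}\nabla_\theta L$ makes the update direction invariant under reparametrization of the model, which is the sense in which information geometry promises faster gradient-descent learning.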
