Information Physics Part One

Creating the physics of learning

This is a work in progress

Typically, when I have worked on projects in the past, I haven’t shared them until they were complete. The advantage is that readers get a complete description of the project. The problem is that writing papers is a long process, so I don’t get any feedback until the end of the project. Today I am going to try something different: I am going to share my work as I go. Keep in mind that since it is incomplete, there may be typos, errors, and logical pitfalls, but I hope that most readers will be able to look past these and focus on the big picture. Please feel free to leave comments and suggestions; your feedback is greatly appreciated!

Background

In the history of AI, models have often been inspired by nature, such as neural networks that mimic the brain; created from intuition, such as word vectors and transformers; or discovered through trial and error, such as the development of ReLU layers. But wouldn’t it be nice if we could derive models from first principles? This is the goal of information physics. The idea is to derive models of learning by first writing down exactly how we want the model to behave, then using mathematics to derive a solution. By constructing models in this way, we hope to break free from the limits set by intuition and trial and error, and instead create models that are optimal with respect to the behavior we specified.

The fascinating thing we discovered along the way is that the equations governing “optimal learning” are equivalent to laws of physics under certain modeling assumptions. Explicitly, the dynamics of continuous model parameters can be mapped to the dynamics of charged particles in a vacuum, and the dynamics of discrete model parameters can be mapped to the dynamics of quantum spin states. Learning becomes equivalent to an external force acting on the system.

Why is this important? Because it allows us to take hard problems, such as language processing or image analysis, and map them onto problems that are easier to solve, such as simulating the motion of particles. The hope is that with this mapping, we can solve hard problems in AI by solving easier problems in physics.

Information Physics

This is part one of a series of three posts that I plan to write on what I am calling “information physics”. In part one, we show that the dynamics of continuous model parameters can be mapped to the dynamics of charged particles in a vacuum. In part two, we show that the dynamics of discrete model parameters can be mapped to the dynamics of quantum spin states. In part three, we show that learning becomes equivalent to an external force acting on the system.

You can read my paper in progress here: Information Physics. Please feel free to leave comments and suggestions. Your feedback is greatly appreciated!