But these models become computationally expensive as their number of neurons and synapses increases, and they require significant computation to solve the complicated math that underlies them. And all of this math, as with many natural phenomena, becomes harder to solve with size, meaning computing many small steps to arrive at a solution.

Now, the same team of scientists has discovered a way to alleviate this bottleneck by solving the differential equation behind the interaction of two neurons through synapses, unlocking a new type of fast and efficient artificial intelligence algorithm. These models have the same characteristics as liquid neural networks (flexible, causal, robust, and explainable) but are orders of magnitude faster and more scalable. This type of neural network could therefore be used for any task that involves learning from data over time, as it remains robust and adaptable even after training, while many traditional models are fixed.

The models, called "closed-form continuous-time" (CfC) neural networks, outperformed their state-of-the-art counterparts on a number of tasks, with significantly higher speeds and better performance in recognizing human activities from motion sensors, modeling the physical dynamics of a simulated walker robot, and event-based sequential image processing. In a medical prediction task, for example, the new models were 220 times faster on a sample of 8,000 patients.

A new paper on the project is published today in Nature Machine Intelligence.

"The new machine learning models we call 'CfCs' replace the differential equation defining the neuron's computation with a closed-form approach, preserving the beautiful properties of liquid networks without the need for numerical integration," says MIT Professor Daniela Rus, director of the Computer Science and Artificial Intelligence Laboratory (CSAIL) and senior author on the new paper. "CfC models are causal, compact, explainable, and efficient in training and prediction. They pave the way for reliable machine learning for safety-critical applications."

Keeping things liquid

Differential equations enable us to compute the state of the world or a phenomenon as it evolves, but not all the way through time, only step by step. To model natural phenomena through time and understand previous and future behavior, such as human activity recognition or a robot's path, the team reached into a bag of mathematical tricks to find just the ticket: a "closed-form" solution that expresses the entire description of a whole system in a single compute step.

With their models, one can compute this equation at any time in the future and at any time in the past. Not only that, but the speed of computation is much higher because you don't have to solve the differential equation step by step.

Imagine an end-to-end neural network that receives driving input from a camera mounted on a car. The network is trained to produce outputs such as the car's steering angle. In 2020, the team solved this by using liquid neural networks with 19 nodes, so that 19 neurons plus a small perception module could drive a car. A differential equation describes each node of that system. With the closed-form solution, if you substitute it inside this network, it would give you the exact behavior, as it is a good approximation of the actual dynamics of the system. The team can thus solve the problem with an even smaller number of neurons, which means it would be faster and less computationally expensive.
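To make the difference between step-by-step integration and a closed-form solution concrete, here is a minimal sketch using a toy one-neuron leaky-integrator equation; the specific equation, constants, and variable names are illustrative assumptions, not the models from the paper.

```python
import numpy as np

# Toy leaky-integrator dynamics for one "neuron": dx/dt = -(x - a) / tau
# (an illustrative stand-in, not the CfC equations from the paper)
tau, a, x0, t_end = 0.5, 1.0, 0.0, 2.0

# Step-by-step numerical integration, as ODE-based networks typically require
dt = 1e-3
x = x0
for _ in range(int(t_end / dt)):
    x += dt * (-(x - a) / tau)        # thousands of tiny Euler steps

# Closed-form solution: evaluate the state at any time in a single step
x_closed = a + (x0 - a) * np.exp(-t_end / tau)

print(round(x, 3), round(x_closed, 3))   # both are ~0.982, up to discretization error
```

The closed-form line can be evaluated directly at any past or future time, which is the speed advantage described above; the CfC models apply the same idea to the much richer neuron-and-synapse dynamics of liquid networks.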
These models can take inputs as time series (events that happened over time), which could be used for classification, controlling a car, moving a humanoid robot, or predicting economic and medical events. Across these various tasks, they can also improve accuracy, robustness, and performance and, importantly, computation speed, which sometimes comes as a trade-off.

Solving this equation has far-reaching implications for advancing research in both natural and artificial intelligence systems. "When we have a closed-form description of neurons and synapses' communication, we can build computational models of brains with billions of cells, a capability that is not possible today due to the high computational complexity of neuroscience models. The closed-form equation could facilitate such large-scale simulations and therefore opens up new avenues of research for understanding intelligence," says Ramin Hasani, an MIT CSAIL research fellow and first author of the new paper.

Portable learning

Additionally, there is early evidence of Liquid CfC models learning tasks in one environment from visual inputs, and transferring those learned skills to an entirely new environment without additional training. This capability is called out-of-distribution generalization, which is one of the most fundamental open challenges of AI research.
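As a rough illustration of how such a model can consume a time series with one algebraic step per sample, without calling an ODE solver, here is a loose, untrained sketch of a time-gated recurrent cell. The gating form, weight names, and dimensions are assumptions made for illustration; this is not the exact CfC update from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Hypothetical time-gated cell: the next hidden state blends two learned
# targets, with the blend controlled by a sigmoid of the elapsed time dt.
# This only mimics the flavor of a closed-form continuous-time update.
hidden, inputs = 8, 3
W_gate = rng.normal(size=(hidden, hidden + inputs))
W_a = rng.normal(size=(hidden, hidden + inputs))
W_b = rng.normal(size=(hidden, hidden + inputs))

def gated_step(x, u, dt):
    z = np.concatenate([x, u])
    gate = sigmoid((W_gate @ z) * dt)        # how far the state evolves over dt
    return gate * np.tanh(W_a @ z) + (1.0 - gate) * np.tanh(W_b @ z)

# Irregularly sampled time series, e.g. motion-sensor readings with gaps
samples = [(rng.normal(size=inputs), dt) for dt in (0.1, 0.5, 0.05, 1.0)]
x = np.zeros(hidden)
for u, dt in samples:
    x = gated_step(x, u, dt)                 # one step per sample, no ODE solver

print(x.shape)  # the final state could feed a small classifier head
```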
"Neural network systems based on differential equations are difficult to solve and scale to, say, millions and billions of parameters. Describing how neurons interact with each other, not just thresholding, but solving the physical dynamics between cells allows us to build larger-scale neural networks," says Hasani. "This framework can help solve more complex machine learning tasks, enabling better representation learning, and should be the core building block of any future embedded intelligence system."

"Recent neural network architectures, such as neural ODEs and liquid neural networks, have hidden layers composed of specific dynamical systems representing infinite latent states instead of explicit stacks of layers," says Sildomar Monteiro, head of the AI and Machine Learning Group at Aurora Flight Sciences, a Boeing company, who was not involved in this paper. "These implicitly specified models have demonstrated state-of-the-art performance while requiring far fewer parameters than conventional architectures. However, their practical adoption has been limited due to the high computational cost required for training and inference." He adds that this paper "shows a significant improvement in computational performance for this class of neural networks… [and] has the potential to enable a wider range of practical applications related to safety-critical commercial and defense systems."

Hasani and Mathias Lechner, a postdoc at MIT CSAIL, wrote the paper under Rus' supervision, along with Alexander Amini, an MIT postdoc; Lucas Liebenwein SM '18, PhD '21; Aaron Ray, an MIT engineering and computer science PhD student and CSAIL fellow; Max Tschaikowski, associate professor of computer science at Aalborg University in Denmark; and Gerald Teschl, professor of mathematics at the University of Vienna.