Every so often, a paper comes along that feels like it bridges two worlds that have been talking past each other. Recently, I came across one of those papers: “Automatic network structure discovery of physics informed neural networks via knowledge distillation” by Ziti Liu and colleagues. It’s an ambitious piece of work that tries to give neural networks a structure that reflects the laws of physics.
Physics informed neural networks, or PINNs, are already fascinating (I first got acquainted with them while working at the German Aerospace Center a few years ago). They learn from both data and the governing equations of a system, blending numerical analysis with machine learning. The problem is that they often rely on carefully designed loss functions to enforce physical laws, which means they don’t really understand structure. They follow the rules we give them, but they don’t internalize the deeper symmetries or conservation laws behind those rules.
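To make that concrete, here is a minimal sketch of what a PINN for the 2D Laplace equation can look like. This is my own toy example in PyTorch, not code from the paper; the network size, the number of collocation points, and the random boundary values are all placeholder choices for illustration. The point is just the shape of the loss: a PDE-residual term plus a data/boundary term.

```python
import torch
import torch.nn as nn

# A small network u_theta(x1, x2); the PINN loss asks its Laplacian to vanish
# inside the domain and its values to match (placeholder) boundary data.
net = nn.Sequential(nn.Linear(2, 32), nn.Tanh(),
                    nn.Linear(32, 32), nn.Tanh(),
                    nn.Linear(32, 1))

def laplace_residual(xy):
    xy = xy.requires_grad_(True)
    u = net(xy)
    grads = torch.autograd.grad(u.sum(), xy, create_graph=True)[0]          # (du/dx1, du/dx2)
    u_x1x1 = torch.autograd.grad(grads[:, 0].sum(), xy, create_graph=True)[0][:, 0]
    u_x2x2 = torch.autograd.grad(grads[:, 1].sum(), xy, create_graph=True)[0][:, 1]
    return u_x1x1 + u_x2x2

interior = torch.rand(256, 2)                                    # collocation points
boundary_xy, boundary_u = torch.rand(64, 2), torch.rand(64, 1)   # placeholder boundary data

optimizer = torch.optim.Adam(net.parameters(), lr=1e-3)
for step in range(1000):
    optimizer.zero_grad()
    loss_pde = laplace_residual(interior).pow(2).mean()          # "obey the physics"
    loss_data = (net(boundary_xy) - boundary_u).pow(2).mean()    # match data / boundary values
    (loss_pde + loss_data).backward()
    optimizer.step()
```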
Liu and colleagues propose an intriguing alternative. Instead of just penalizing the network when it violates physics, they teach it to discover and internalize the physical structure itself. What I love about their paper is that it solves a problem by sidestepping it. The authors note that if you just add standard regularization (a technique for simplifying networks) to a normal PINN, it conflicts with the physics loss function. The two goals, "be simple" and "obey the physics", fight each other, and you often end up with a worse result. So they decouple them. This is the core idea, and it works in three stages.
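Schematically (in my own notation, not the paper’s), the difference is between one objective in which the simplicity penalty and the physics terms compete, and two decoupled objectives in which the teacher handles the physics and the student only imitates the teacher while being regularized:

```latex
% Naive: one objective, where "be simple" and "obey the physics" compete
\min_{\theta}\ \mathcal{L}_{\mathrm{PDE}}(\theta) + \mathcal{L}_{\mathrm{data}}(\theta) + \lambda \lVert \theta \rVert_2^2

% Decoupled: the teacher obeys the physics ...
\min_{\theta_T}\ \mathcal{L}_{\mathrm{PDE}}(\theta_T) + \mathcal{L}_{\mathrm{data}}(\theta_T)

% ... and the student only imitates the teacher, under L2 regularization
\min_{\theta_S}\ \bigl\lVert u_{\theta_S} - u_{\theta_T} \bigr\rVert^2 + \lambda \lVert \theta_S \rVert_2^2
```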
First, they train a "Teacher NN", which is a standard, run-of-the-mill PINN. Its job is to do the initial hard work: it learns to solve the partial differential equation by minimizing all those complex physics-based loss functions.

Next, a "Student NN" is trained, but its job is much simpler: it doesn’t look at the original PDE at all, it just learns to mimic the Teacher’s output. This is classic knowledge distillation. Here’s the trick: while the Student is learning, the researchers apply L2 regularization. Since the Student is decoupled from the complicated physics loss (the Teacher already handled that!), the regularization can do its job properly. It gently nudges the network’s parameters, and parameters that do similar or related jobs start to "cluster" together, converging toward similar values. You can literally watch this happen in a plot from their paper.

This is the payoff. Once the Student network is trained and its parameters are all clustered, they don’t just use it. They analyze its structure: a clustering algorithm finds the centers of these parameter groups, which reveals the hidden relationships. They might find that a bunch of weights are identical, or that one weight is exactly the negative of another. This pattern is the physical symmetry, discovered automatically.

Finally, they build a new, "Re-constructed NN". They call it Ψ-NN, or "Psi-NN". In this new network, those relationships are hard-wired, so the network’s very architecture now reflects the physics. For the Laplace equation, they show how this process results in a network that intrinsically computes a symmetric solution, u(x1, x2) = …(u_a(x1, x2) + u_a(x1, −x2)), purely from its new design.
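To make the pipeline a bit more concrete, here is a heavily simplified toy sketch of stages two and three. It is my own reconstruction under assumptions, not the authors’ implementation (that lives in their GitHub repository): the network sizes, the weight_decay value, the number of clusters, and the use of plain k-means are all illustrative choices.

```python
import torch
import torch.nn as nn
from sklearn.cluster import KMeans

# Stage 1 is assumed done: `teacher` stands for a PINN already trained on the PDE loss.
teacher = nn.Sequential(nn.Linear(2, 32), nn.Tanh(), nn.Linear(32, 1))
student = nn.Sequential(nn.Linear(2, 32), nn.Tanh(), nn.Linear(32, 1))

# Stage 2: the student only mimics the teacher's outputs. Because no physics
# loss is competing with it, the L2 penalty (weight_decay) can act freely and
# pull related parameters toward shared values.
opt = torch.optim.Adam(student.parameters(), lr=1e-3, weight_decay=1e-3)
xy = torch.rand(1024, 2)
for step in range(2000):
    opt.zero_grad()
    loss = (student(xy) - teacher(xy).detach()).pow(2).mean()
    loss.backward()
    opt.step()

# Stage 3: cluster the trained parameters. Groups of (near-)identical values,
# or values that are exact negatives of each other, are candidates for being
# tied together in the re-constructed network.
w = torch.cat([p.detach().flatten() for p in student.parameters()]).numpy().reshape(-1, 1)
kmeans = KMeans(n_clusters=8, n_init=10).fit(w)   # the number of clusters is a guess
print(sorted(kmeans.cluster_centers_.flatten()))
```

And as a final illustration, here is one plausible way such a discovered mirror symmetry in x2 could be hard-wired into a re-constructed network. The 0.5 averaging is my own guess at the symmetrization, not a detail taken from the paper:

```python
# Illustrative only: wrap a base network so its output is symmetric in x2 by construction.
class SymmetricNet(nn.Module):
    def __init__(self, base: nn.Module):
        super().__init__()
        self.base = base

    def forward(self, xy):
        mirrored = torch.stack([xy[:, 0], -xy[:, 1]], dim=1)
        return 0.5 * (self.base(xy) + self.base(mirrored))
```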
So, this all sounds elegant, but does it work? (Spoiler: yes, see their implementation on GitHub.) The team tested their Ψ-NN against a standard PINN and a "PINN-post" (a PINN where they manually added the physical symmetries, the old-fashioned way) on several classic problems. The results are pretty staggering. It’s more accurate: across all test cases (the Laplace, Burgers, and Poisson equations), the Ψ-NN had a significantly lower final error than both the standard PINN and the manually constrained one. The automatically discovered structure beat the human-imposed one. It’s also way faster: the structured network converges in a fraction of the time. For the Laplace problem, it reached the same accuracy (a loss of 1e-3) in about 50% fewer iterations than the standard PINN. By learning the right structure, the network doesn’t waste time searching in the wrong directions. But the real superpower is generalization. This is the part that really got me: the new, structured network isn’t just better at solving the one problem it was trained on. It seems to have learned the underlying physics in a way that lets it generalize.
They tested this on the Burgers' equation, which has a viscosity parameter. They used their method to discover the network structure for one viscosity value, then took that same structure and used it to solve the problem for two new viscosity values. The Ψ-NN found the correct answer for the new problems more efficiently and accurately than the other models, because it had learned the form of the solution, not just a single, brittle answer. Even cooler: they took the network structure discovered from the Laplace equation and applied it to a completely different, much more complex problem, steady fluid flow around a cylinder. And it still worked, performing beautifully and helping to solve this complex fluid dynamics problem. This implies that the fundamental physical structures learned from one system can serve as a powerful building block for understanding another.
This work really does feel like a shift. We’re moving from "black-box" solvers that are just forced to be less wrong by a loss function, to models that learn to internalize the principles of a system. It’s an automated way to do what scientists have done for centuries: find the underlying symmetries that make the world make sense. It’s a beautiful, elegant, and powerful idea. It makes you wonder: what other hidden structures are just waiting to be distilled from our data?
PS. In the couple of months since returning from Brussels, I have been busy finishing two research papers that use PINNs. I mentioned one of them, on the modeling of printed memristors, at the end of this blog post last year 😉; it was submitted to the IEEE Access journal a few weeks ago, so fingers crossed that it will be accepted and I can tell you all about it here. The second one is also done and ready to be submitted to IEEE Access in a few days. The third piece of research I was working on, which I didn’t have great success with in my experiments this summer when using only PINNs, is related to hardware description languages (e.g., Verilog-A). However, after reading the Ψ-NN paper and experimenting with their open-source implementation until late at night over the last two days, I started tinkering with my old PINN implementation, updated it to use Ψ-NN, and wow, it works! I am excited to share the outstanding results soon, and I am already working hard on writing up and updating all my experiments for that third paper. I hope to submit it to IEEE Access within one to two weeks, once I’ve refined it. All thanks to the Ψ-NN paper being open access and its implementation being open source on GitHub! This is also why I prefer publishing only in open-access journals and posting my code on either my blog or my GitHub account: so that everyone has easy and fast access to free knowledge and the opportunity to judge for themselves whether my work can bring them joy in their research, without incurring any costs. The authors of the Ψ-NN paper and its implementation have definitely given me some joy over the past few days: I have started to see the light at the end of the tunnel.



