PINNs are neural networks that encode model equations: the network must fit observed data while minimizing a PDE residual.
Introduction
The “curse of dimensionality” was first described by Bellman in the context of optimal control problems. (Bellman, R.: Dynamic Programming. Science 153(3731), 34-37 (1966))
A review that may be more comprehensive: https://doi.org/10.1007/s12206-021-0342-5. That paper focuses on what deep NN is used, how physical knowledge is represented, and how physical information is integrated; this document only covers PINNs, a framework introduced in 2017.
What PINNs are
PINNs solve problems involving PDEs. A PINN:
approximates the PDE solution by training a NN to minimize a loss function, which includes terms reflecting the initial and boundary conditions and the PDE residual at selected points in the domain (called collocation points)
given an input point in the integration domain, returns an estimated solution at that point
incorporates a residual network that encodes the governing physical equations
can be thought of as an unsupervised strategy when trained solely on the physical equations (forward problems), and as supervised learning when some properties are derived from data
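To make the loss structure concrete, here is a minimal sketch, assuming a 1D Poisson problem $u_{xx} = f$ on $[0,1]$ with zero Dirichlet boundary conditions; the network size, source term `f`, and collocation sampling below are illustrative assumptions, not taken from any particular paper.

```python
import torch
import torch.nn as nn

# Illustrative PINN loss for u_xx = f(x) on [0, 1], u(0) = u(1) = 0.
# Network, source term and sampling are assumptions for the sketch.
u_net = nn.Sequential(nn.Linear(1, 50), nn.Tanh(),
                      nn.Linear(50, 50), nn.Tanh(),
                      nn.Linear(50, 1))

def f(x):                                        # assumed source term
    return -torch.pi**2 * torch.sin(torch.pi * x)

def pinn_loss(x_col, x_bnd, u_bnd):
    # PDE residual at collocation points, via automatic differentiation twice.
    x_col = x_col.requires_grad_(True)
    u = u_net(x_col)
    du = torch.autograd.grad(u.sum(), x_col, create_graph=True)[0]
    d2u = torch.autograd.grad(du.sum(), x_col, create_graph=True)[0]
    residual = d2u - f(x_col)
    # Boundary-condition mismatch term.
    bc = u_net(x_bnd) - u_bnd
    return (residual**2).mean() + (bc**2).mean()

x_col = torch.rand(128, 1)                       # collocation points
x_bnd = torch.tensor([[0.0], [1.0]])             # boundary points
u_bnd = torch.zeros(2, 1)                        # Dirichlet values
opt = torch.optim.Adam(u_net.parameters(), lr=1e-3)
for _ in range(1000):
    opt.zero_grad()
    loss = pinn_loss(x_col, x_bnd, u_bnd)
    loss.backward()
    opt.step()
```

For an inverse problem, a data-misfit term over observed points would be added to the same loss, which is the supervised case mentioned above.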
A DNN (deep neural network) is an artificial neural network with more than two layers.
Feed-Forward Neural Network:
$u_\theta(x) = C_K \circ \alpha \circ C_{K-1} \circ \dots \circ \alpha \circ C_1(x)$, where $C_k(x_k) = W_k x_k + b_k$ is the $k$-th affine layer and $\alpha$ is the activation function.
Equivalently, take a CNN and replace the convolutional layers with fully connected ones.
Also known as a multi-layer perceptron (MLP).
FFNN architectures
Tartakovsky et al. used 3 hidden layers with 50 units per layer and a hyperbolic tangent activation function. Other works use different numbers, but of the same order of magnitude.
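As a sketch, that kind of architecture (three hidden layers of 50 tanh units) is just the composition $C_K \circ \alpha \circ \dots \circ \alpha \circ C_1$ written above; the class name and input/output dimensions here are assumptions.

```python
import torch.nn as nn

# Sketch of the FFNN described above: 3 hidden layers of 50 tanh units,
# i.e. u_theta(x) = C_K ∘ α ∘ ... ∘ α ∘ C_1(x). Sizes are placeholders.
class FFNN(nn.Module):
    def __init__(self, in_dim=2, width=50, depth=3, out_dim=1):
        super().__init__()
        layers = [nn.Linear(in_dim, width), nn.Tanh()]          # C_1, α
        for _ in range(depth - 1):
            layers += [nn.Linear(width, width), nn.Tanh()]      # C_k, α
        layers.append(nn.Linear(width, out_dim))                # final affine map C_K
        self.net = nn.Sequential(*layers)

    def forward(self, x):
        return self.net(x)
```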
A comparison paper: Blechschmidt, J., Ernst, O.G.: Three ways to solve partial differential equations with neural networks – a review. GAMM-Mitteilungen 44(2), e202100006 (2021).
Convolutional Neural Network
performs well with multidimensional data such as images and speech
CNN architectures:
PhyGeoNet: a physics-informed geometry-adaptive convolutional neural network. It uses a coordinate transformation to convert solution fields from irregular physical domains to rectangular reference domains.
According to Fang (https://doi.org/10.1109/TNNLS.2021.3070878), a Laplacian operator can be discretized with the finite volume approach, and the resulting procedure is equivalent to a convolution; padded data can serve as boundary conditions (see the sketch after this list).
convolutional encoder-decoder network
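A minimal illustration of Fang's point above: the 5-point stencil for the Laplacian applied as a fixed 2D convolution kernel. The grid spacing, field, and padding choice are assumptions; zero padding stands in for a homogeneous boundary condition.

```python
import torch
import torch.nn.functional as F

# Discrete Laplacian as a fixed 2D convolution (5-point stencil, unit grid spacing).
kernel = torch.tensor([[0., 1., 0.],
                       [1., -4., 1.],
                       [0., 1., 0.]]).reshape(1, 1, 3, 3)

u = torch.rand(1, 1, 32, 32)          # a sample solution field on a 32x32 grid
# Padding plays the role of the boundary condition (here: zero ghost cells).
lap_u = F.conv2d(F.pad(u, (1, 1, 1, 1), mode="constant", value=0.0), kernel)
print(lap_u.shape)                     # (1, 1, 32, 32)
```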
Recurrent Neural Network
$f_i(h_{i-1}) = \alpha(W \cdot h_{i-1} + U \cdot x_i + b)$, where $f$ is the layer-wise function, $x$ is the input, $h$ is the hidden vector state, $W$ is a hidden-to-hidden weight matrix, $U$ is an input-to-hidden weight matrix, and $b$ is a bias vector. I think the $h_{i-1}$ on the left-hand side of the equals sign should be written as a subscript.
can be used to perform numerical Euler integration (see the sketch after this list)
Basically, the $i$-th output term depends only on the $i$-th input $x_i$ and the previous hidden state $h_{i-1}$.
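A sketch of how the recurrence above relates to forward-Euler integration: the Euler update has the same one-step "previous state in, new state out" structure as an RNN cell. The ODE right-hand side, step size, and initial state below are assumptions.

```python
import torch

# Forward Euler written as an RNN-like recurrence:
# h_i = h_{i-1} + dt * F(h_{i-1})   vs.   h_i = α(W·h_{i-1} + U·x_i + b)
def F(u):                       # assumed right-hand side of du/dt = F(u)
    return -u

def euler_rollout(h0, dt=0.1, steps=10):
    h = h0
    trajectory = [h]
    for _ in range(steps):
        h = h + dt * F(h)       # update depends only on the previous state, like an RNN cell
        trajectory.append(h)
    return torch.stack(trajectory)

print(euler_rollout(torch.tensor([1.0])))
```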
LSTM architectures
Compared with an RNN, an LSTM has more intermediate hidden variables; the technical details of how it integrates long-term memory can be skipped for now.
Other architectures for PINN
Bayesian neural network: weights are distributions rather than deterministic values, and these distributions are learned using Bayesian inference. Only one paper is covered.
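A minimal sketch of the "weights are distributions" idea: a single mean-field Gaussian linear layer with the reparameterization trick. Priors, the variational objective, and the training loop are omitted, and all names are illustrative.

```python
import torch
import torch.nn as nn

# One Bayesian linear layer: each weight has a learned mean and log-std
# instead of a single point value.
class BayesLinear(nn.Module):
    def __init__(self, in_dim, out_dim):
        super().__init__()
        self.w_mu = nn.Parameter(torch.zeros(out_dim, in_dim))
        self.w_logstd = nn.Parameter(torch.full((out_dim, in_dim), -3.0))
        self.b = nn.Parameter(torch.zeros(out_dim))

    def forward(self, x):
        # Reparameterization: sample weights from N(mu, std^2) on every forward pass.
        w = self.w_mu + torch.exp(self.w_logstd) * torch.randn_like(self.w_mu)
        return x @ w.T + self.b

layer = BayesLinear(2, 1)
x = torch.rand(5, 2)
print(layer(x), layer(x))   # two calls differ: predictions carry weight uncertainty
```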
GAN architectures:
two neural networks compete in a zero-sum game to deceive each other
a physics-informed GAN (PI-GAN) uses automatic differentiation to embed the governing physical laws expressed as stochastic differential equations. The discriminator in a PI-GAN is a basic FFNN, while the generators are a combination of FFNNs and a NN induced by the SDE.
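A bare-bones sketch of the zero-sum training loop mentioned above, with plain FFNN generator and discriminator on toy 1D data. The data distribution, network sizes, and optimizers are assumptions, and the SDE-induced part of a PI-GAN is not shown.

```python
import torch
import torch.nn as nn

# Toy GAN: generator maps noise to samples, discriminator scores real vs. generated.
G = nn.Sequential(nn.Linear(4, 32), nn.Tanh(), nn.Linear(32, 1))
D = nn.Sequential(nn.Linear(1, 32), nn.Tanh(), nn.Linear(32, 1))
opt_g = torch.optim.Adam(G.parameters(), lr=1e-3)
opt_d = torch.optim.Adam(D.parameters(), lr=1e-3)
bce = nn.BCEWithLogitsLoss()

for _ in range(200):
    real = torch.randn(64, 1) * 0.5 + 2.0          # assumed "real" data distribution
    fake = G(torch.randn(64, 4))
    # Discriminator tries to tell real (label 1) from generated (label 0) samples.
    d_loss = bce(D(real), torch.ones(64, 1)) + bce(D(fake.detach()), torch.zeros(64, 1))
    opt_d.zero_grad(); d_loss.backward(); opt_d.step()
    # Generator tries to make the discriminator label its samples as real.
    g_loss = bce(D(fake), torch.ones(64, 1))
    opt_g.zero_grad(); g_loss.backward(); opt_g.step()
```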