Due 1/30/2025 before midnight via Learning Suite. 25 possible points.
Reproduce Appendix A.1 of this paper using a physics-informed neural net (PINN) to solve Burgers’ equation.
Create a figure similar to Fig. A6: the top contour plot (but you don’t need all the x’s marking data locations) and the rightmost graph showing a slice through the data at t = 0.75 (you just need your prediction; you don’t need to plot “exact”). I used \(N_u = 100\) and \(N_f = 10,000\). You’ll likely find that you get a pretty good prediction, but the shock wave isn’t captured as well (rounded instead of sharp). That’s sufficient for the purposes of this assignment, but if you’re interested in improving it, see the advanced tips below the regular tips.
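For the figure itself, here is a minimal matplotlib sketch of the two panels; the grid resolution and the array U_pred holding your model’s predictions on that grid are assumptions about how you organize your results:

```python
import numpy as np
import matplotlib.pyplot as plt

# Assumes you've evaluated your trained network on a (t, x) grid:
#   t_grid: shape (nt,), x_grid: shape (nx,), U_pred: shape (nt, nx)
t_grid = np.linspace(0.0, 1.0, 100)
x_grid = np.linspace(-1.0, 1.0, 256)
U_pred = np.zeros((t_grid.size, x_grid.size))  # placeholder for model predictions

fig, (ax0, ax1) = plt.subplots(1, 2, figsize=(10, 4))

# Contour-style plot of u(t, x) over the whole domain
c = ax0.pcolormesh(t_grid, x_grid, U_pred.T, cmap="rainbow", shading="auto")
fig.colorbar(c, ax=ax0)
ax0.set_xlabel("t")
ax0.set_ylabel("x")
ax0.set_title("u(t, x)")

# Slice through the prediction at t = 0.75
i75 = np.argmin(np.abs(t_grid - 0.75))
ax1.plot(x_grid, U_pred[i75, :])
ax1.set_xlabel("x")
ax1.set_ylabel("u(0.75, x)")
ax1.set_title("t = 0.75")

plt.tight_layout()
plt.show()
```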
Note that, as we’ve done in the past, separate train/test datasets are important to make sure the model isn’t overfitting. In this case the problem is small, and the number of data and collocation points provides very high coverage everywhere we are making predictions, so a separate test set won’t matter. But for larger problems you should definitely have a test set.
A few tips:
- torch.autograd.grad: you will need to use the grad_outputs option. We have vectors x and t going in, and vector u coming out, where each element of the vectors corresponds to a different data sample. In other words, \(du_i/dx_i\) and \(du_i/dt_i\) are independent of every other index \(i\). To compute all these derivatives in one shot, pass a vector of ones in the grad_outputs option (i.e., grad_outputs=torch.ones_like(x)). This is called the “seed” for algorithmic differentiation.
- create_graph=True in the call to torch.autograd.grad, since we will need to backpropagate through these derivatives (i.e., compute derivatives of derivatives). A sketch combining these first two tips appears after this list.
- from scipy.stats import qmc for sampling the collocation points, though I’m sure you could do fine for this small problem with just regular random sampling or even evenly spaced sampling. Either way, be sure that these sampling points stay fixed during the training (see the sampling sketch below).
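To make the first two tips concrete, here is a minimal sketch of computing the Burgers’ residual \(f = u_t + u u_x - (0.01/\pi) u_{xx}\) (the viscosity \(0.01/\pi\) matches the paper’s Burgers’ setup) with torch.autograd.grad; the small network net and the sample sizes are placeholders for your own:

```python
import numpy as np
import torch

# Hypothetical network mapping (x, t) -> u; substitute your own model.
net = torch.nn.Sequential(
    torch.nn.Linear(2, 20), torch.nn.Tanh(),
    torch.nn.Linear(20, 20), torch.nn.Tanh(),
    torch.nn.Linear(20, 1),
)

def burgers_residual(x, t):
    """PDE residual f = u_t + u*u_x - (0.01/pi)*u_xx at the given points."""
    x = x.clone().requires_grad_(True)
    t = t.clone().requires_grad_(True)
    u = net(torch.cat([x, t], dim=1))

    seed = torch.ones_like(x)  # one derivative per sample, all in one shot
    u_x = torch.autograd.grad(u, x, grad_outputs=seed, create_graph=True)[0]
    u_t = torch.autograd.grad(u, t, grad_outputs=seed, create_graph=True)[0]
    u_xx = torch.autograd.grad(u_x, x, grad_outputs=seed, create_graph=True)[0]
    return u_t + u * u_x - (0.01 / np.pi) * u_xx

# Example call on a few random collocation points in x in [-1, 1], t in [0, 1]
x_f = 2.0 * torch.rand(5, 1) - 1.0
t_f = torch.rand(5, 1)
f = burgers_residual(x_f, t_f)  # the mean of f**2 goes into your loss
```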
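And a sketch of the sampling tip, generating \(N_f\) fixed collocation points with a Latin hypercube from scipy.stats.qmc; the domain bounds \(x \in [-1, 1]\), \(t \in [0, 1]\) follow the paper’s Burgers’ setup:

```python
import torch
from scipy.stats import qmc

N_f = 10_000  # number of collocation points

# Latin hypercube sample in [0, 1]^2, scaled to (x, t) in [-1, 1] x [0, 1]
sampler = qmc.LatinHypercube(d=2, seed=0)
pts = qmc.scale(sampler.random(N_f), l_bounds=[-1.0, 0.0], u_bounds=[1.0, 1.0])

# Build the tensors once, outside the training loop, so they stay fixed
x_f = torch.tensor(pts[:, 0:1], dtype=torch.float32)
t_f = torch.tensor(pts[:, 1:2], dtype=torch.float32)
```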
Optional advanced tips if you want to really capture that shock:
- Use double precision: create your tensors with dtype=torch.float64, and for the network you need to change all its weights and biases to double precision also: model.double(), where model is your instantiated network.
- Use the LBFGS optimizer with a line search (line_search_fn="strong_wolfe"). In the optimization world, we always use second-order methods like BFGS, but they are not compatible with minibatching, so the DL world almost always uses first-order methods. In this case we don’t have tons of data, so we don’t need minibatching, and the second-order optimizer will do much better. It will be much slower per epoch, but you’ll also need far fewer epochs. LBFGS with a line search is set up to work differently: you will need to create a closure function and call optimizer.step(closure). It’s essentially the same as the train function. Search online or use AI chatbots for examples. In this case you’ll want to call optimizer.zero_grad() at the beginning of the closure function (sketches of both advanced tips appear below). Adam and all the other optimizers work with closure functions too; they just don’t require it.
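A minimal sketch of the double-precision tip; the small model and tensors here are stand-ins for your own:

```python
import torch

# Stand-ins for your network and training tensors
model = torch.nn.Sequential(torch.nn.Linear(2, 20), torch.nn.Tanh(), torch.nn.Linear(20, 1))
x = torch.rand(100, 1, dtype=torch.float64)  # create tensors in double precision
t = torch.rand(100, 1, dtype=torch.float64)

model.double()  # convert all of the network's weights and biases to float64
u = model(torch.cat([x, t], dim=1))
```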
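And a sketch of an LBFGS step with a closure; the model, data, and loss below are placeholders for your PINN and its combined data + PDE-residual loss, and the hyperparameters are arbitrary:

```python
import torch

# Placeholders: swap in your PINN and its combined data + residual loss
model = torch.nn.Sequential(torch.nn.Linear(2, 20), torch.nn.Tanh(), torch.nn.Linear(20, 1)).double()
inputs = torch.rand(100, 2, dtype=torch.float64)
targets = torch.zeros(100, 1, dtype=torch.float64)

optimizer = torch.optim.LBFGS(
    model.parameters(),
    max_iter=50,                    # inner LBFGS iterations per .step() call
    line_search_fn="strong_wolfe",  # enables the strong Wolfe line search
)

def closure():
    optimizer.zero_grad()  # zero gradients at the start of the closure
    loss = torch.mean((model(inputs) - targets) ** 2)
    loss.backward()
    return loss  # LBFGS re-evaluates this during its line search

for epoch in range(20):
    loss = optimizer.step(closure)  # step returns the closure's loss
    print(epoch, loss.item())
```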