Due 2/20/2025 before midnight via Learning Suite. 25 possible points.
Autoencoders are used in many architectures to either reduce or increase dimensionality. In this problem we solve a high-dimensional ODE by first projecting it into a low-dimensional space with an encoder; this low-dimensional space is called the latent space. Next we solve a neural ODE in the latent space. Finally, we project the solution back to the original high-dimensional space with a decoder. Our objective function will simultaneously train the neural net in the ODE and the neural nets for the encoder/decoder.
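As a rough sketch of how these pieces fit together (the layer sizes, dimensions, and class names here are illustrative assumptions, not the assignment's actual setup, and I'm assuming torchdiffeq for odeint):

    import torch
    import torch.nn as nn
    from torchdiffeq import odeint  # assuming the torchdiffeq package

    class LatentODEFunc(nn.Module):
        # Right-hand side f(t, z) of the neural ODE in the latent space.
        def __init__(self, latent_dim=2, hidden=64):
            super().__init__()
            self.net = nn.Sequential(nn.Linear(latent_dim, hidden), nn.Tanh(),
                                     nn.Linear(hidden, latent_dim))

        def forward(self, t, z):
            return self.net(z)

    class AutoencoderODE(nn.Module):
        # Encode -> integrate in the latent space -> decode, trained jointly.
        def __init__(self, full_dim=128, latent_dim=2, hidden=64):
            super().__init__()
            self.encoder = nn.Sequential(nn.Linear(full_dim, hidden), nn.Tanh(),
                                         nn.Linear(hidden, latent_dim))
            self.decoder = nn.Sequential(nn.Linear(latent_dim, hidden), nn.Tanh(),
                                         nn.Linear(hidden, full_dim))
            self.odefunc = LatentODEFunc(latent_dim, hidden)

        def forward(self, x0, t):
            z0 = self.encoder(x0)            # project initial states to the latent space
            z = odeint(self.odefunc, z0, t)  # shape (len(t), batch, latent_dim)
            return self.decoder(z)           # project the solution back to full space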
The paper we are reproducing uses that approach but also adds a physics-based loss term, which gives more accurate solutions outside the training data. Read the methods section to better understand the formulation.
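One way such a physics term can be formed is to require, at each collocation point, that the latent dynamics agree with the known full-space dynamics pushed through the encoder. A hedged sketch (verify the exact residual against the paper's methods section; g, lam, and the model layout are assumptions carried over from the sketch above):

    def physics_loss(model, x_col, g):
        # Physics residual at collocation points x_col, where g(x) is the
        # known full-space right-hand side.  This enforces
        #     f_theta(encoder(x)) ~= (d encoder / dx)(x) @ g(x)
        # via a Jacobian-vector product through the encoder.
        gx = g(x_col)
        z, z_dot = torch.autograd.functional.jvp(model.encoder, x_col, gx,
                                                 create_graph=True)
        return ((model.odefunc(0.0, z) - z_dot) ** 2).mean()

    # The total objective might then look like
    #     loss = data_mse + lam * physics_loss(model, x_col, g)
    # where lam weights the physics term against the data term.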
We'll reproduce the first example: the lifted Duffing oscillator. In this file I've written functions that generate training data, testing data, and collocation points. I've reproduced the results in the paper, but it takes a little while to run. To make things easier I've expanded the range of the training data a bit: rather than limiting it to just the red region, I expanded the training data to a little over half the domain, with collocation points on the remainder of the domain. That should allow you to get by with significantly less training data and fewer collocation points (I used about 600 training points and 10,000 collocation points, but could have gotten away with fewer). Even so, you may find it beneficial to use a GPU. Our goal is to get the MSE on the test set (100 test points) below 0.1.
You should also be able to produce a plot like the lower right of Fig. 3 (except we won't worry about coloring separate regions). I provided a function true_encoder that you can use for this (the paper also uses the true encoder for its visualization). We could use our trained encoder for the projection instead, but it won't necessarily match the paper's plot, since many different latent spaces can work equally well. In general this projection isn't something one would know; it just helps in this particular case, where we know what the projection should look like, to check that training is on track.
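For example, something along these lines (a minimal sketch; x0_test, t, and the tensor shapes are assumptions about the provided data):

    import matplotlib.pyplot as plt

    with torch.no_grad():
        x_pred = model(x0_test, t)      # predicted high-dimensional trajectories
        z = true_encoder(x_pred).cpu()  # project with the provided true encoder
    plt.plot(z[..., 0], z[..., 1], linewidth=0.5)  # one curve per test trajectory
    plt.xlabel('$z_1$')
    plt.ylabel('$z_2$')
    plt.show()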
Tips:
If you run into dtype issues, you can pass options={'dtype': torch.float32} to odeint. Check that .reshape works the way you intend, and make sure you aren't using it when you should be using permute.
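A tiny standalone example of the difference:

    import torch

    x = torch.arange(6).reshape(2, 3)  # [[0, 1, 2], [3, 4, 5]]
    a = x.reshape(3, 2)   # keeps the flattened order: [[0, 1], [2, 3], [4, 5]]
    b = x.permute(1, 0)   # swaps the axes:            [[0, 3], [1, 4], [2, 5]]
    # With odeint output of shape (time, batch, state), reshape will silently
    # scramble trajectories in cases where permute would reorder the axes correctly.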
The training and collocation sets have different sizes, so pick a number of batches (nbatches below), then use that number to calculate what batch size you need for each:
batch_size_train = int(ntrain / nbatches)
batch_size_collocation = int(ncol / nbatches)
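One way to draw matching mini-batches each epoch (a sketch; the data variable names are assumptions):

    perm_train = torch.randperm(ntrain)
    perm_col = torch.randperm(ncol)
    for i in range(nbatches):
        x_batch = x_train[perm_train[i*batch_size_train:(i+1)*batch_size_train]]
        x_col_batch = x_col[perm_col[i*batch_size_collocation:(i+1)*batch_size_collocation]]
        # ...data loss on x_batch, physics loss on x_col_batch, then an optimizer step...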
To run on a GPU, set device = "cuda" (to allow the code to run on the GPU or locally without changes, you can use device = 'cuda' if torch.cuda.is_available() else 'cpu'). You also need to move all the data in your torch tensors, including those in the model, to the GPU device (e.g., model.double().to(device), torch.tensor(x, dtype=torch.float64, device=device)). Note that Google Colab has time limits on free GPU usage.
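Putting the device handling together (a sketch; the numpy array names are placeholders, and AutoencoderODE is the model sketched earlier):

    device = 'cuda' if torch.cuda.is_available() else 'cpu'
    model = AutoencoderODE().double().to(device)
    x_train = torch.tensor(x_train_np, dtype=torch.float64, device=device)
    t = torch.tensor(t_np, dtype=torch.float64, device=device)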