AutoEmulate v0.3.0 release

AutoEmulate contributors - 4/30/2025

Updates on new features in v0.3.0 release

Release: v0.3.0

What's new
What's next

We're pleased to announce AutoEmulate v0.3.0. This release adds several new features that move the package closer to an all-purpose emulation toolkit, along with new tutorials that show the package in action. The two sections below look back at "What's new" and then ahead to "What's next" in the upcoming cycle.

If you'd like to discuss any of our work on AutoEmulate or AI for Physical Systems, feel free to reach out to us at ai4physics@turing.ac.uk.

What's new

History matching

"History Matching" is one of the techniques commonly used in emulation for model calibration.
It aims to narrow the parameter space down to the regions where the emulator's predictions are consistent with real observations.
This is done by iteratively ruling out regions of the parameter space that are implausible given the observed values.
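
To make the idea concrete, here is a minimal sketch of a single history-matching wave in plain NumPy. It is not AutoEmulate's API: the toy emulator, the observation values, and the cutoff of 3 (a commonly used convention) are assumptions chosen for illustration. The implausibility score is the distance between the observation and the emulator's mean prediction, scaled by the combined uncertainty.

```python
import numpy as np

def implausibility(pred_mean, pred_var, obs, obs_var, discrepancy_var=0.0):
    """Distance between observation and prediction in units of total std. dev."""
    total_var = pred_var + obs_var + discrepancy_var
    return np.abs(obs - pred_mean) / np.sqrt(total_var)

def toy_emulator(x):
    """Hypothetical stand-in for a trained emulator: returns mean and variance."""
    mean = x ** 2
    var = np.full_like(x, 0.01)
    return mean, var

# Candidate values of a single input parameter (assumed range).
candidates = np.linspace(-3.0, 3.0, 61)

# Assumed observation and its measurement variance.
obs, obs_var = 1.0, 0.05 ** 2

mean, var = toy_emulator(candidates)
scores = implausibility(mean, var, obs, obs_var)

# Keep only the candidates that are not ruled out at the chosen cutoff.
not_ruled_out = candidates[scores <= 3.0]
print(not_ruled_out)
```

In a full workflow, the emulator would be retrained on new simulations drawn from the surviving region and the wave repeated until that region stops shrinking.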

Dimensionality reduction

To accelerate large-scale simulations, AutoEmulate must address the challenge of high-dimensional data. In this new release, we have integrated dimensionality reduction techniques into the framework, including both statistical methods like Principal Component Analysis (PCA) and deep learning approaches such as Variational Autoencoders (VAEs).
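AutoEmulate automatically selects the combination of dimensionality reducer (preprocessing) and predictive model (model) that performs best, enabling a seamless transformation of your data and moving closer to full auto-emulation of your simulations. The table below shows an excerpt of one such comparison, reporting the root-mean-square error (rmse) and coefficient of determination (r2) for each combination.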

	preprocessing	model	short	rmse	r2
1	PCA	GaussianProcess	gp	0.133731	0.950240
2	VAE	GaussianProcess	gp	0.148129	0.930782
3	VAE	ConditionalNeuralProcess	cnp	0.333605	0.666806
4	PCA	RandomForest	rf	0.369583	0.657539
5	VAE	GradientBoosting	gb	0.376103	0.645977
6	VAE	RadialBasisFunctions	rbf	0.336832	0.643328
...	...	...	...	...	...

Physical simulations often rely on spatio-temporal data with complex, high-dimensional structures. To showcase AutoEmulate's capabilities, we developed a tutorial on constructing an emulator that generates solutions for a reaction-diffusion system governed by parametrized partial differential equations (PDEs). In this example, AutoEmulate selects the optimal combination of dimensionality reduction technique and predictive model to efficiently explore and generate new spatial solutions for different values of the reaction and diffusion coefficients.
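
The reduce-then-predict pattern can be sketched with scikit-learn alone. This is an illustration under stated assumptions, not AutoEmulate's own interface: the toy simulator, the parameter ranges, and the choice of two principal components are all hypothetical. The outputs are compressed with PCA, a Gaussian process is fitted from the input parameters to the compressed coefficients, and predictions are mapped back to the full field.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF

rng = np.random.default_rng(0)
grid = np.linspace(0.0, 1.0, 200)

def toy_simulator(params):
    """Hypothetical simulator: two parameters -> a 200-point spatial field."""
    a, b = params
    return a * np.sin(2 * np.pi * grid) + b * grid ** 2

# Training data: 40 parameter settings and their high-dimensional outputs.
X = rng.uniform(0.5, 2.0, size=(40, 2))
Y = np.array([toy_simulator(p) for p in X])  # shape (40, 200)

# Step 1: compress the outputs to a handful of coefficients.
pca = PCA(n_components=2)
Z = pca.fit_transform(Y)  # shape (40, 2)

# Step 2: learn the map from input parameters to the compressed coefficients.
gp = GaussianProcessRegressor(kernel=RBF(length_scale=1.0), normalize_y=True)
gp.fit(X, Z)

# Step 3: predict coefficients for new parameters, then reconstruct the field.
X_new = np.array([[1.0, 1.5]])
Y_pred = pca.inverse_transform(gp.predict(X_new))  # shape (1, 200)
print(Y_pred.shape)
```

Swapping `PCA` for another reducer, or the Gaussian process for another regressor, gives the kind of preprocessing and model combinations compared in the table above.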

Active learning (experimental)

Physics simulations can be quite accurate but also quite costly. Emulators offer fast approximations to simulations, but only in situations they've been "trained" on. Active learning is the process of choosing which data points to request from a simulation, so that the emulator reaches the highest possible accuracy with as few simulation runs as possible (saving time and money).
Check out our active learning tutorial notebook, where we demonstrate several different active learning algorithms.
A simulator can be seen as a function mapping inputs x to outputs y, and an emulator as a function mapping inputs x to approximated outputs ŷ. We want ŷ to be as close as possible to y. See the schematic below for a basic active learning process.
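
In code, one simple variant of this loop is pool-based uncertainty sampling: fit the emulator, query the simulator at the candidate input where the emulator is least certain, add the result to the training set, and repeat. The sketch below is an assumption-laden illustration in plain NumPy, not one of the algorithms from the tutorial notebook. The toy simulator, the hand-written Gaussian process with fixed hyperparameters, and the query budget of 8 are all hypothetical.

```python
import numpy as np

def simulator(x):
    """Toy stand-in for an expensive simulation."""
    return np.sin(3.0 * x)

def rbf_kernel(a, b, length_scale=0.3):
    """Squared-exponential kernel between two 1-D arrays of inputs."""
    diff = a[:, None] - b[None, :]
    return np.exp(-0.5 * (diff / length_scale) ** 2)

def gp_predict(x_train, y_train, x_test, noise=1e-6):
    """Posterior mean and variance of a zero-mean GP with fixed hyperparameters."""
    K = rbf_kernel(x_train, x_train) + noise * np.eye(len(x_train))
    K_s = rbf_kernel(x_train, x_test)
    mean = K_s.T @ np.linalg.solve(K, y_train)
    var = 1.0 - np.sum(K_s * np.linalg.solve(K, K_s), axis=0)
    return mean, np.maximum(var, 0.0)

# Candidate inputs the learner may request, and two initial simulator runs.
pool = np.linspace(0.0, 2.0, 101)
x_train = np.array([0.0, 2.0])
y_train = simulator(x_train)

for _ in range(8):  # assumed budget of 8 extra simulator runs
    _, var = gp_predict(x_train, y_train, pool)
    query = pool[np.argmax(var)]  # input where the emulator is least certain
    x_train = np.append(x_train, query)
    y_train = np.append(y_train, simulator(query))

# Compare emulator output (y_hat) with simulator output (y) over the pool.
y_hat, _ = gp_predict(x_train, y_train, pool)
max_error = np.max(np.abs(y_hat - simulator(pool)))
print(len(x_train), max_error)
```

Because the posterior variance is close to zero at points already simulated, the loop naturally avoids requesting the same input twice.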

End-to-end workflow example with a user-provided simulator

In this example we demonstrate the simplicity of setting up an end-to-end workflow which trains and calibrates an emulator for a simulator provided by the user.

What's next

We are always working on extending AutoEmulate to handle a wider range of use cases. Our next development goals focus on:

Introducing more complex models. This includes more types of Gaussian Processes, ensemble methods, and support for multimodal and multifidelity data.
Expanding downstream task capabilities (e.g., uncertainty quantification, data assimilation, inverse design, optimal sensor placement) by adding built-in tools for typical emulation workflows, alongside a PyTorch backend refactor to enable seamless integration with other tools in the broader ecosystem.
Continuing development of the active learning functionality, aiming to integrate it into the main release, to facilitate simulator-in-the-loop deployments.