Neural Operators: Learning Functions of Functions

The Big Idea

Regular neural networks learn to map numbers to numbers:

  • Input: A vector (e.g., [1.2, 3.4, 5.6])
  • Output: Another vector or number

Neural operators learn to map functions to functions:

  • Input: An entire function f(x)
  • Output: Another entire function g(x)

A Simple Example

Imagine you have a differential equation:

\[-u'' = a(x)\]

where \(u''\) means “second derivative of \(u\)”.

Given any function \(a(x)\), you want to find the solution \(u(x)\).

This is an operator: it transforms the input function \(a(x)\) into the output function \(u(x)\).

In this notebook, we’ll learn this operator using neural networks!

Setup and Imports

import numpy as np
import matplotlib.pyplot as plt
import torch
import torch.nn as nn
import torch.optim as optim
from torch.utils.data import Dataset, DataLoader
from tqdm import tqdm

# Set random seeds for reproducibility
np.random.seed(42)
torch.manual_seed(42)

# Set matplotlib style
plt.style.use('seaborn-v0_8-darkgrid')
plt.rcParams['figure.figsize'] = (12, 4)

device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
print(f"Using device: {device}")
Using device: cpu

Step 1: A Simple Differential Equation

We’ll work with the 1D Poisson equation:

\[-u'' = a(x)\]

with boundary conditions \(u(0) = u(1) = 0\).

In plain English:

  • We have some function \(a(x)\) (the “forcing function”)
  • We want to find \(u(x)\) whose second derivative is \(-a(x)\)
  • And \(u\) must be zero at the endpoints

We can approximate the solution accurately using numerical methods (finite differences).
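The cells below call a helper named solve_poisson_1d that is never defined in this notebook, so here is a minimal finite-difference sketch matching the call signature used later. It assumes a uniform grid on \([0, 1]\) and the homogeneous Dirichlet boundary conditions stated above.

# Minimal finite-difference solver for -u'' = a(x) on [0, 1] with u(0) = u(1) = 0.
# Discretize -u'' with the standard second-difference stencil on a uniform grid
# and solve the resulting linear system for the interior values of u.
def solve_poisson_1d(a, n_points=50):
    h = 1.0 / (n_points - 1)
    n_int = n_points - 2  # interior points; the boundary values stay at 0
    A = (2.0 * np.eye(n_int)
         - np.eye(n_int, k=1)
         - np.eye(n_int, k=-1)) / h**2
    u = np.zeros(n_points)
    u[1:-1] = np.linalg.solve(A, np.asarray(a)[1:-1])
    return u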

# Simple example: solve one instance
x = np.linspace(0, 1, 50)  # 50 points from 0 to 1
a = np.sin(np.pi * x)  # Our input function

# Solve using finite differences (numerical method)
u = solve_poisson_1d(a, n_points=50)

# Plot
fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(12, 4))

ax1.plot(x, a, 'b-', linewidth=2)
ax1.set_title('Input: a(x)', fontsize=14, fontweight='bold')
ax1.set_xlabel('x')
ax1.set_ylabel('a(x)')
ax1.grid(True, alpha=0.3)

ax2.plot(x, u, 'r-', linewidth=2)
ax2.set_title('Output: u(x) where -u\'\' = a', fontsize=14, fontweight='bold')
ax2.set_xlabel('x')
ax2.set_ylabel('u(x)')
ax2.grid(True, alpha=0.3)

plt.tight_layout()
plt.show()

print("One solution computed!")

Step 2: Generate Training Data

To train a neural network, we need lots of examples!

We’ll create many random functions \(a(x)\) and solve for their corresponding \(u(x)\).

# Generate 500 training examples
n_train = 500
n_points = 50
x = np.linspace(0, 1, n_points)

inputs = []  # Will store all a(x) functions
outputs = []  # Will store all u(x) solutions

print(f"Generating {n_train} examples...")
for i in tqdm(range(n_train)):
    # Create random forcing function (combination of sines)
    a = np.zeros(n_points)
    for k in range(1, 4):  # Use 3 sine waves
        amplitude = np.random.uniform(-1, 1)
        phase = np.random.uniform(0, 2*np.pi)
        a += amplitude * np.sin(k * np.pi * x + phase)
    
    # Solve for u(x)
    u = solve_poisson_1d(a, n_points)
    
    inputs.append(a)
    outputs.append(u)

inputs = np.array(inputs, dtype=np.float32)
outputs = np.array(outputs, dtype=np.float32)

print(f"✅ Created dataset with shape: {inputs.shape}")

# Show 3 random examples
fig, axes = plt.subplots(3, 2, figsize=(12, 8))
for i in range(3):
    idx = np.random.randint(0, n_train)
    
    axes[i, 0].plot(x, inputs[idx], 'b-', linewidth=2)
    axes[i, 0].set_ylabel('a(x)')
    axes[i, 0].set_title(f'Example {i+1}: Input')
    axes[i, 0].grid(True, alpha=0.3)
    
    axes[i, 1].plot(x, outputs[idx], 'r-', linewidth=2)
    axes[i, 1].set_ylabel('u(x)')
    axes[i, 1].set_title(f'Example {i+1}: Solution')
    axes[i, 1].grid(True, alpha=0.3)

axes[2, 0].set_xlabel('x')
axes[2, 1].set_xlabel('x')
plt.tight_layout()
plt.show()
Generating 500 examples...

Step 3: Build a Simple Neural Network

Let’s start with the simplest approach: a standard neural network.

Input: 50 numbers (values of \(a(x)\) at 50 points)
Output: 50 numbers (values of \(u(x)\) at 50 points)

class SimpleNet(nn.Module):
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(50, 100),  # 50 inputs -> 100 hidden
            nn.ReLU(),
            nn.Linear(100, 100),  # 100 -> 100
            nn.ReLU(),
            nn.Linear(100, 50)  # 100 -> 50 outputs
        )
    
    def forward(self, a):
        return self.net(a)

model = SimpleNet().to(device)
print(f"Model created with {sum(p.numel() for p in model.parameters()):,} parameters")
print(model)
Model created with 20,250 parameters
SimpleNet(
  (net): Sequential(
    (0): Linear(in_features=50, out_features=100, bias=True)
    (1): ReLU()
    (2): Linear(in_features=100, out_features=100, bias=True)
    (3): ReLU()
    (4): Linear(in_features=100, out_features=50, bias=True)
  )
)
# Train the network
optimizer = optim.Adam(model.parameters(), lr=0.001)
criterion = nn.MSELoss()

# Convert data to PyTorch tensors
X_train = torch.from_numpy(inputs).to(device)
y_train = torch.from_numpy(outputs).to(device)

losses = []
n_epochs = 200

print("Training...")
for epoch in tqdm(range(n_epochs)):
    model.train()
    optimizer.zero_grad()
    
    predictions = model(X_train)
    loss = criterion(predictions, y_train)
    
    loss.backward()
    optimizer.step()
    
    losses.append(loss.item())
    
    if (epoch + 1) % 50 == 0:
        print(f"Epoch {epoch+1}: Loss = {loss.item():.6f}")

# Plot training curve
plt.figure(figsize=(10, 4))
plt.plot(losses, linewidth=2)
plt.xlabel('Epoch')
plt.ylabel('Loss (MSE)')
plt.title('Training Progress', fontweight='bold')
plt.grid(True, alpha=0.3)
plt.yscale('log')
plt.show()

print(f"✅ Training complete! Final loss: {losses[-1]:.6f}")

Step 4: Test the Network

Let’s create a brand new function and see if our network can solve it!

# Create a test function the network hasn't seen
a_test = 0.5 * np.sin(2*np.pi*x) + 0.3 * np.cos(3*np.pi*x)
u_true = solve_poisson_1d(a_test, n_points)

# Get network prediction
model.eval()
with torch.no_grad():
    a_tensor = torch.from_numpy(a_test.astype(np.float32)).unsqueeze(0).to(device)
    u_pred = model(a_tensor).cpu().numpy()[0]

# Plot results
fig, axes = plt.subplots(1, 3, figsize=(15, 4))

axes[0].plot(x, a_test, 'b-', linewidth=2)
axes[0].set_title('Input: a(x)', fontsize=14, fontweight='bold')
axes[0].set_xlabel('x')
axes[0].grid(True, alpha=0.3)

axes[1].plot(x, u_true, 'k--', linewidth=2, label='True solution', alpha=0.7)
axes[1].plot(x, u_pred, 'r-', linewidth=2, label='Neural network')
axes[1].set_title('Comparison', fontsize=14, fontweight='bold')
axes[1].set_xlabel('x')
axes[1].legend()
axes[1].grid(True, alpha=0.3)

axes[2].plot(x, np.abs(u_true - u_pred), 'g-', linewidth=2)
axes[2].set_title('Absolute Error', fontsize=14, fontweight='bold')
axes[2].set_xlabel('x')
axes[2].grid(True, alpha=0.3)

plt.tight_layout()
plt.show()

mse = np.mean((u_true - u_pred)**2)
print(f"✅ Test MSE: {mse:.6f}")
print(f"   Max error: {np.max(np.abs(u_true - u_pred)):.6f}")

What We Just Did! 🎉

We trained a neural network to learn an operator - a mapping from functions to functions!

Traditional approach:

  • Solve the differential equation numerically each time (slow!)
  • Requires understanding the math and coding finite differences

Neural operator approach:

  • Train once on many examples
  • Instantly predict solutions for new inputs
  • Much faster at inference time! (A rough timing sketch follows.)
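To put the speed claim in perspective, here is a rough timing sketch: 100 finite-difference solves versus one batched forward pass of the trained network over the same 100 inputs. Timings vary by machine, and on this tiny 1D problem the classical solve is already cheap; the gap widens for expensive solvers.

# Rough timing sketch (results depend on hardware): classical solver vs. network.
import time

t0 = time.perf_counter()
for a_i in inputs[:100]:
    solve_poisson_1d(a_i, n_points)
t_fd = time.perf_counter() - t0

a_batch = torch.from_numpy(inputs[:100]).to(device)
t0 = time.perf_counter()
with torch.no_grad():
    model(a_batch)
t_nn = time.perf_counter() - t0

print(f"100 FD solves: {t_fd*1e3:.2f} ms | 100 NN predictions: {t_nn*1e3:.2f} ms")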

Limitations of This Simple Approach

  1. Fixed resolution: Trained on 50 points, it can only accept inputs sampled at exactly 50 points (see the quick check below)
  2. No built-in spatial structure: The fully connected layers have no notion that neighboring grid points are related
  3. Not efficient: Doesn’t exploit the structure of the problem
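As a quick check of the first limitation, we can feed the trained SimpleNet an input sampled on a finer grid (a hypothetical 100-point sample) and watch the shape mismatch:

# The first Linear layer expects exactly 50 values, so a 100-point sample is rejected.
a_fine = np.sin(np.pi * np.linspace(0, 1, 100)).astype(np.float32)
try:
    with torch.no_grad():
        model(torch.from_numpy(a_fine).unsqueeze(0).to(device))
except RuntimeError as err:
    print(f"Resolution mismatch, as expected: {err}")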

What’s Next?

Advanced neural operators like Fourier Neural Operators (FNO) and DeepONet address these issues by:

  • Working in frequency space (FFT); a minimal spectral-layer sketch appears below
  • Being resolution-independent
  • Learning better representations of operators

This simple example shows the core idea: learn to map functions to functions, not just numbers to numbers!
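To give a taste of the FNO idea mentioned above, here is a minimal sketch of a 1D spectral layer (assumed names and sizes, not the reference implementation): transform the input to Fourier space, multiply the lowest modes by learned complex weights, and transform back.

# Sketch of an FNO-style spectral layer: keep only the lowest `modes` Fourier
# modes, mix channels with learned complex weights, then return to physical space.
class SpectralConv1d(nn.Module):
    def __init__(self, channels=16, modes=12):
        super().__init__()
        self.modes = modes
        self.weights = nn.Parameter(
            torch.randn(channels, channels, modes, dtype=torch.cfloat) / channels
        )

    def forward(self, u):
        # u: (batch, channels, n_points) sampled on a uniform grid
        u_hat = torch.fft.rfft(u)                       # (batch, channels, n//2 + 1)
        out_hat = torch.zeros_like(u_hat)
        out_hat[:, :, :self.modes] = torch.einsum(
            'bim,iom->bom', u_hat[:, :, :self.modes], self.weights
        )
        return torch.fft.irfft(out_hat, n=u.size(-1))   # back to physical space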

The Critical Test: Resolution Invariance

This is where Neural Operators shine and MLPs fail!

We trained both models (the MLP and a DeepONet) on data with 101 grid points. Now let’s test them on:

  • Lower resolution: 51 points
  • Higher resolution: 201 points

Expected outcome:

  • MLP: Will fail completely (wrong input dimension!)
  • DeepONet: Will work at any resolution!

def test_resolution_invariance(mlp_model, deeponet_model, test_dataset, resolution_name):
    """
    Test models on different resolution data.
    """
    print(f"\nTesting on {resolution_name} resolution ({test_dataset.n_points} points)...")
    
    # Try MLP
    print(f"  MLP expected input: {mlp_model.n_points}, got: {test_dataset.n_points}")
    if test_dataset.n_points != mlp_model.n_points:
        print("  ❌ MLP FAILS! Input dimension mismatch.")
        mlp_error = None
    else:
        mlp_model.eval()
        errors = []
        with torch.no_grad():
            for i in range(len(test_dataset)):
                a, u_true = test_dataset[i]
                a = a.unsqueeze(0).to(device)
                u_pred = mlp_model(a).cpu().numpy()[0]
                errors.append(np.mean((u_pred - u_true.numpy())**2))
        mlp_error = np.mean(errors)
        print(f"  ✅ MLP works: MSE = {mlp_error:.6f}")
    
    # Try DeepONet
    deeponet_model.eval()
    x_coords = torch.from_numpy(test_dataset.x).float().unsqueeze(-1).to(device)
    errors = []
    with torch.no_grad():
        for i in range(len(test_dataset)):
            a, u_true = test_dataset[i]
            a = a.unsqueeze(0).to(device)
            x_batch = x_coords.unsqueeze(0)
            u_pred = deeponet_model(a, x_batch).cpu().numpy()[0]
            errors.append(np.mean((u_pred - u_true.numpy())**2))
    deeponet_error = np.mean(errors)
    print(f"  ✅ DeepONet works: MSE = {deeponet_error:.6f}")
    
    return mlp_error, deeponet_error

# Test on different resolutions
mlp_low, deeponet_low = test_resolution_invariance(mlp_model, deeponet_model, test_low_res, "LOW")
mlp_high, deeponet_high = test_resolution_invariance(mlp_model, deeponet_model, test_high_res, "HIGH")

Summary and Key Takeaways

What We Learned

  1. Standard MLPs learn fixed-discretization mappings
    • They map fixed-size vectors to fixed-size vectors
    • Cannot handle different resolutions
    • Have no built-in notion of the underlying continuous function
  2. Neural Operators learn operator mappings
    • They learn mappings between function spaces
    • Are resolution-invariant (discretization-invariant)
    • Can evaluate at arbitrary query points after training
  3. DeepONet Architecture
    • Branch network: Encodes the input function
    • Trunk network: Encodes query locations
    • Combination: Inner product for final prediction
    • This separation enables resolution invariance! (A minimal sketch follows this list.)
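To make the branch/trunk split concrete, here is a minimal DeepONet sketch (assumed layer sizes and names, not the exact deeponet_model used above). The branch net encodes the sampled input function, the trunk net encodes the query coordinates, and their inner product gives the prediction at each query point; since the trunk is evaluated pointwise, the number of query points is free, which is what makes the model resolution-invariant.

# Minimal DeepONet sketch with hypothetical sizes.
class TinyDeepONet(nn.Module):
    def __init__(self, n_sensors=50, p=32):
        super().__init__()
        # Branch: encodes the input function a sampled at n_sensors points
        self.branch = nn.Sequential(nn.Linear(n_sensors, 64), nn.ReLU(), nn.Linear(64, p))
        # Trunk: encodes each query coordinate x
        self.trunk = nn.Sequential(nn.Linear(1, 64), nn.ReLU(), nn.Linear(64, p))

    def forward(self, a, x_query):
        # a: (batch, n_sensors), x_query: (batch, n_query, 1)
        b = self.branch(a)                        # (batch, p)
        t = self.trunk(x_query)                   # (batch, n_query, p)
        return torch.einsum('bp,bqp->bq', b, t)   # (batch, n_query)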

Real-World Applications

Neural Operators are revolutionizing:

  • Climate modeling: Learn weather patterns at multiple scales
  • Fluid dynamics: Predict flow fields at any resolution
  • Material science: Design materials with desired properties
  • Medical imaging: Super-resolution and reconstruction
  • Inverse problems: Parameter estimation from observations

Next Steps

To learn more about Neural Operators:

  1. Fourier Neural Operators (FNO): Learn in Fourier space for better efficiency
  2. 2D/3D problems: Extend to higher dimensions
  3. Time-dependent PDEs: Learn evolution operators
  4. Physics-informed approaches: Incorporate PDE constraints

References

  • Lu et al. (2021) “Learning nonlinear operators via DeepONet” (arXiv:1910.03193)
  • Li et al. (2020) “Fourier Neural Operator for Parametric PDEs” (arXiv:2010.08895)
  • Kovachki et al. (2021) “Neural Operator: Learning Maps Between Function Spaces” (arXiv:2108.08481)

This notebook demonstrates the power of Neural Operators for learning mappings between function spaces. The key insight: learn the operator, not just the function values!