Assignment 4 (released 28 Oct, due 8 Nov 7 PM)

Published

October 28, 2023

Instructions

  • Total Marks: 9
  • Use torch for the assignment.
  • For sampling, use torch.distributions and do not use torch.random directly.
  • The assignment has to be done in groups of two.
  • The assignment should be a single jupyter notebook.
  • The results from every question of your assignment should be in visual formats such as plots and tables. Don’t show model’s log directly in Viva. All plots should have labels and legends appropriately. If not done, we may cut some marks for presentation (e.g. 10%).

Questions

  1. [1 mark] Implement Logistic Regression using the Pyro library referring [1] for guidance. Show both the mean prediction as well as standard deviation in the predictions over the 2d grid. Use NUTS MCMC sampling to sample the posterior. Take 1000 samples for posterior distribution and use 500 samples as burn/warm up. Use the below given dataset.
from sklearn.datasets import make_moons
from sklearn.model_selection import train_test_split

X, y = make_moons(n_samples=100, noise=0.3, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(X, y,test_size=0.2, random_state=42)
  1. [2 marks] Consider the FVC dataset example discussed in the class. Find the notebook link at [2]. We had only used the train dataset. Now, we want to find out the performance of various models on the test dataset. Use the given dataset and deduce which model works best in terms of error (MAE) and coverage? The base model is Linear Regression by Sklearn (from sklearn.linear_model import LinearRegression). Plot the trace diagrams and posterior distribution. Also plot the predictive posterior distribution with 90% confidence interval.
x_train, x_test, y_train, y_test = train_test_split(train["Weeks"], train['FVC'], train_size = 0.8, random_state = 0)
Pooled Model Older (wrong) version

\[\begin{align*} \alpha &\sim \mathcal{N}(0, 500) \\ \beta &\sim \mathcal{N}(0, 500) \\ \sigma &\sim \text{HalfNormal}(100) \end{align*}\] \[\begin{equation*} FVC_j \sim \text{Normal}(\alpha + \beta \cdot Week_, \sigma) \end{equation*}\] where \(j\) is the patient index.

Pooled Model Newer (corrected) version

\[\begin{align*} \alpha &\sim \mathcal{N}(0, 500) \\ \beta &\sim \mathcal{N}(0, 500) \\ \sigma &\sim \text{HalfNormal}(100) \end{align*}\] \[\begin{equation*} FVC_j \sim \text{Normal}(\alpha + \beta \cdot Week_j, \sigma) \end{equation*}\] where \(j\) is the week index.


Partially pooled model with the same sigma

\[\begin{align*} \mu_\alpha &\sim \mathcal{N}(0, 500) \\ \sigma_\alpha &\sim \text{HalfNormal}(100) \\ \mu_\beta &\sim \mathcal{N}(0, 3) \\ \sigma_\beta &\sim \text{HalfNormal}(3) \\ \alpha_i &\sim \mathcal{N}(\mu_\alpha, \sigma_\alpha) \\ \beta_i &\sim \mathcal{N}(\mu_\beta, \sigma_\beta) \\ \sigma &\sim \text{HalfNormal}(100) \end{align*}\]

\[\begin{equation*} FVC_{ij} \sim \text{Normal}(\alpha_i + \beta_i \cdot Week_j, \sigma) \end{equation*}\] where \(i\) is the patient index and \(j\) is the week index.

Partially pooled model with the sigma hyperpriors

\[\begin{align*} \mu_\alpha &\sim \mathcal{N}(0, 500) \\ \sigma_\alpha &\sim \text{HalfNormal}(100) \\ \mu_\beta &\sim \mathcal{N}(0, 3) \\ \sigma_\beta &\sim \text{HalfNormal}(3) \\ \alpha_i &\sim \mathcal{N}(\mu_\alpha, \sigma_\alpha) \\ \beta_i &\sim \mathcal{N}(\mu_\beta, \sigma_\beta) \\ \gamma_\sigma &\sim \text{HalfNormal}(30) \\ \sigma_i &\sim \text{Exp}(\gamma_\sigma) \end{align*}\]

\[\begin{equation*} FVC_{ij} \sim \text{Normal}(\alpha_i + \beta_i \cdot Week_j, \sigma_i) \end{equation*}\] where \(i\) is the patient index and \(j\) is the week index.

  1. [4 marks] Use your version of following models to reproduce figure 4 from the paper referenced at [2]. You can also refer to the notebook in the course.
    1. Hypernet [2 marks]
    2. Neural Processes [2 marks]
  2. [2 marks] Write the Random walk Metropolis Hastings algorithms from scratch. Take 1000 samples using below given log probs and compare the mean and covariance matrix with hamiltorch’s standard HMC and emcee’s Metropolis Hastings implementation. Use 500 samples as the burn/warm up samples. Also check the relation between acceptance ratio and the sigma of the proposal distribution in your from scratch implementation. Use the log likelihood function given below.
import torch.distributions as D
def log_likelihood(omega):
    omega = torch.tensor(omega)
    mean = torch.tensor([0., 0.])
    stddev = torch.tensor([0.5, 1.])
    return D.MultivariateNormal(mean, torch.diag(stddev**2)).log_prob(omega).sum()