Assignment 1 (released 10 Aug, due 18 Aug)
Instructions
- Total marks: 6
- Use torch for the assignment
- For distributions use torch.distributions and do not use torch.random directly
- The assignment has to be done in groups of two.
- The assignment should be a single Jupyter notebook.
- The results for every question of your assignment should be presented in visual formats such as plots and tables. Do not show the model's raw logs directly in the viva. All plots should have appropriate labels and legends. If not, we may deduct some marks for presentation (e.g. 10%).
- To know more about a distribution, just look at the Wikipedia page.
Questions
Optimise the following function using torch autograd and gradient descent: f(θ) = (θ₀ - 2)² + (θ₁ - 3)². In addition to finding the optimum, you need to show the convergence plot. [0.5 marks]
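A minimal sketch of the autograd-plus-gradient-descent loop this question asks for. The starting point, learning rate and number of steps are assumptions chosen for illustration, not values prescribed by the assignment.

```python
import torch

def f(theta):
    # f(θ) = (θ₀ - 2)² + (θ₁ - 3)²
    return (theta[0] - 2.0) ** 2 + (theta[1] - 3.0) ** 2

theta = torch.zeros(2, requires_grad=True)  # assumed starting point
lr = 0.1        # assumed learning rate
n_steps = 100   # assumed number of iterations
history = []    # loss at every step, for the convergence plot

for _ in range(n_steps):
    loss = f(theta)
    loss.backward()                  # autograd fills theta.grad
    with torch.no_grad():
        theta -= lr * theta.grad     # plain gradient-descent update
    theta.grad.zero_()               # reset the gradient for the next step
    history.append(loss.item())
```

Plotting `history` against the step index (for example with matplotlib) gives one possible convergence plot. Storing a copy of `theta` at each step would also allow plotting the parameter trajectory.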
Generate some data (100 data points) using a univariate Normal distribution with loc=2.0 and scale=4.0. Plot a 2D contour plot showing the Likelihood or the Log-Likelihood as a function of loc and scale. Please label all the axes including the colorbar. [1 mark]
Find the MLE parameters for the loc and scale using gradient descent. Plot convergence plot as well. [1 mark]
Redo the above question but learn log(scale) instead of scale and then finally transform to learn scale. What can you conclude? Why is this transformation useful? [0.5 mark]
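The pieces above can be sketched as follows: the log-likelihood is evaluated on a (loc, scale) grid for the contour plot, and gradient descent is then run on loc and log(scale). The seed, grid ranges, initial values, learning rate and step count are all assumptions made for illustration.

```python
import torch
from torch.distributions import Normal

torch.manual_seed(0)  # assumed seed
data = Normal(loc=2.0, scale=4.0).sample((100,))

# Log-likelihood surface on a grid (ranges and resolution are assumptions).
locs = torch.linspace(-2.0, 6.0, 81)
scales = torch.linspace(1.0, 8.0, 71)
L, S = torch.meshgrid(locs, scales, indexing="ij")
# Grid parameters of shape (81, 71, 1) broadcast against data of shape (100,).
ll_grid = Normal(L.unsqueeze(-1), S.unsqueeze(-1)).log_prob(data).sum(dim=-1)

# Gradient descent on loc and log(scale); exp() keeps scale positive.
loc = torch.tensor(0.0, requires_grad=True)
log_scale = torch.tensor(0.0, requires_grad=True)
lr, n_steps = 0.1, 2000  # assumed hyperparameters
nll_history = []

for _ in range(n_steps):
    # Mean negative log-likelihood of the data under the current parameters.
    nll = -Normal(loc, log_scale.exp()).log_prob(data).mean()
    nll.backward()
    with torch.no_grad():
        loc -= lr * loc.grad
        log_scale -= lr * log_scale.grad
    loc.grad.zero_()
    log_scale.grad.zero_()
    nll_history.append(nll.item())

scale = log_scale.exp()  # transform back to the original parameter
```

`ll_grid` can be passed, together with `L` and `S`, to a contour-plotting routine, and `nll_history` gives the convergence curve. Learning scale directly would use the same loop with a `scale` tensor in place of `log_scale.exp()`, which makes the two parameterisations easy to compare.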
Generate some data (1000 data points) using a univariate Normal distribution with loc=2.0 and scale=4.0 and using Student-T distributions with varying degrees of freedom (from 1 to 8), with 1000 data points corresponding to each degree of freedom. Plot the pdf (and logpdf) at uniformly spaced points from -50 to 50 in steps of 0.1. What can you conclude? [1 mark]
Analytically derive the MLE for the exponential distribution. Generate some data (1000 data points) using some fixed parameter values and see if you can recover the analytical parameters using a gradient-descent-based solution for obtaining the MLE. [1 mark]
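For the exponential case, a hedged sketch of the comparison between the closed-form estimate and gradient descent. Setting the derivative of the log-likelihood n·log(λ) - λ·Σxᵢ to zero gives λ̂ = 1 / mean(x), which is what `analytic_rate` computes. The true rate, seed, learning rate and step count are assumptions, and learning log(rate) rather than rate is a choice made here, not something the question requires.

```python
import torch
from torch.distributions import Exponential

torch.manual_seed(0)   # assumed seed
true_rate = 1.5        # assumed "fixed parameter value"
data = Exponential(rate=true_rate).sample((1000,))

# Closed-form MLE for the rate: 1 / sample mean.
analytic_rate = 1.0 / data.mean()

# Gradient descent on log(rate) so the rate stays positive.
log_rate = torch.tensor(0.0, requires_grad=True)
lr, n_steps = 0.1, 500  # assumed hyperparameters

for _ in range(n_steps):
    nll = -Exponential(rate=log_rate.exp()).log_prob(data).mean()
    nll.backward()
    with torch.no_grad():
        log_rate -= lr * log_rate.grad
    log_rate.grad.zero_()

gd_rate = log_rate.exp().item()
```

Comparing `gd_rate` with `analytic_rate.item()` shows whether the numerical solution recovers the analytical one. Both will generally differ a little from `true_rate` because they are estimated from a finite sample.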
Generate some data (100 data points) using a univariate Normal distribution with loc=2.0 and scale=4.0. Now, create datasets of size 10, 20, 50, 100, 500, 1000, 5000, 10000. We will use a different random seed to create ten different datasets for each of these sizes. For each of these datasets, find the MLE parameters for the loc and scale using gradient descent. Plot the estimates of loc and scale as a function of the dataset size. What can you conclude? [1 mark]
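One way to organise this experiment is sketched below. The helper `fit_normal_mle` is hypothetical (not part of torch) and fits the ten datasets of a given size in one batched gradient-descent run, one row per dataset. The seeding scheme, learning rate and step count are assumptions.

```python
import torch
from torch.distributions import Normal

def fit_normal_mle(data, lr=0.1, n_steps=2000):
    """Hypothetical helper: gradient-descent MLE of (loc, scale) for each row
    of `data`, which has shape (n_datasets, n_points)."""
    n_datasets = data.shape[0]
    loc = torch.zeros(n_datasets, 1, requires_grad=True)
    log_scale = torch.zeros(n_datasets, 1, requires_grad=True)
    for _ in range(n_steps):
        # Mean NLL per row, summed over rows; each row's gradient depends
        # only on its own data, so the fits are independent.
        nll = -Normal(loc, log_scale.exp()).log_prob(data).mean(dim=1).sum()
        nll.backward()
        with torch.no_grad():
            loc -= lr * loc.grad
            log_scale -= lr * log_scale.grad
        loc.grad.zero_()
        log_scale.grad.zero_()
    return loc.detach().squeeze(1), log_scale.detach().exp().squeeze(1)

sizes = [10, 20, 50, 100, 500, 1000, 5000, 10000]
n_seeds = 10
estimates = {}  # size -> (loc estimates, scale estimates), each of shape (10,)

for size_idx, n in enumerate(sizes):
    rows = []
    for seed in range(n_seeds):
        # A distinct seed for every (size, dataset) pair; the scheme is an assumption.
        torch.manual_seed(1000 * size_idx + seed)
        rows.append(Normal(loc=2.0, scale=4.0).sample((n,)))
    estimates[n] = fit_normal_mle(torch.stack(rows))
```

For each size, `estimates[n]` holds ten loc estimates and ten scale estimates. These can be shown as a scatter or as mean with error bars against dataset size, ideally on a logarithmic x-axis.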