CMSC 25025 / STAT 37601
Machine Learning and Large Scale Data Analysis

Assignment 4

Due: Tuesday, May 19, 2020 at 2:00 pm.

Please hand in this homework in 4 files:

1. A pdf of your jupyter notebook for problem 1. The derivation of the solution to 1(a)(ii) can be written in the notebook markup.
2. The ipynb file for problem 1.
3. A pdf of your jupyter notebook for problem 2.
4. The ipynb file for problem 2.

1. Sparse coding of natural images and digits (40 points)

In this problem you will implement the sparse coding procedure as described in class on image patches of size 12×12. This was proposed as a possible computational mechanism underlying the evolution of neural representations in the visual cortex of mammals [1]. You will use the actual images used in this landmark paper.

To run the sparse coding algorithm over the images, we have provided a function that selects random patches. This can be run on the Olshausen-Field images using the following code:

    import scipy.io
    %matplotlib inline
    import matplotlib.pyplot as plt
    import random
    import numpy

    data = scipy.io.loadmat('/project2/cmsc25025/sparsecoding/IMAGES_RAW.mat')
    images = data['IMAGESr']

    # Show the first image.
    plt.imshow(images[:,:,0], cmap='gray')

    # Function to sample image patches from the large images.
    def sample_random_square_patches(image, num, width):
        patches = numpy.zeros([width, width, num])
        for k in range(num):
            i, j = random.sample(range(image.shape[0]-width), 2)
            patches[:,:,k] = image[i:i+width, j:j+width]
        return patches

We want to run the sparse coding scheme over the images.

(a) We will implement this with SGD, alternating the following two steps.

    i. With the current codebook, find the coefficients α^(i), i = 1, ..., b for each example X^(i) in the batch as a Lasso problem.

    ii. Fix the coefficients α^(i) and compute the gradient of the loss with respect to each vector of the codebook. Write the gradient of the codebook matrix V = [V^(1), ..., V^(L)] ∈ R^{d×L} as one matrix computation using V, X ∈ R^{d×b} (the batch data), and A = [α^(1), ..., α^(b)] ∈ R^{L×b}.

(b) Use the class sklearn.linear_model.Lasso for the coefficient estimation step. We want to fully exploit the vector computation properties of Python. Make sure there are no loops inside the main loop that iterates over steps of the SGD; all computations should be done with matrix operations. For the Lasso step, you can have the fit function of the Lasso class fit all training points in the batch at once. (You may want to compare the time this takes to a loop calling Lasso on each training point separately.)

(c) Monitor the convergence of the SGD algorithm by checking the change in the codebook. How long does it take to converge? Experiment with a step size η for your algorithm, constant or decreasing. Display the codebook after initialization, after convergence, and at several intermediate stages. Comment on your results. Are they consistent with the results presented in the paper?

(d) Show reconstructions of image patches using the sparse representation.

[1] B. Olshausen and D. Field, "Emergence of simple-cell receptive field properties by learning a sparse code for natural images," Nature 381, 607-609, 1996.

2. Convolutional networks for MNIST (60 points)

The code in this notebook allows you to train a particular convolutional neural network (which we call the original model) on MNIST data. It also saves the model in your directory and has code to reload the model and continue training, or to simply test the model on a new data set.
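For orientation only, here is a minimal sketch of what one alternating step in part (a) might look like. It assumes a squared reconstruction loss of the form (1/2)||X - VA||_F^2 per batch, a particular λ, step size, and codebook-column renormalization; all of these are assumptions made for illustration, and your derivation in 1(a)(ii) should follow the loss defined in class rather than this sketch.

    import numpy as np
    from sklearn.linear_model import Lasso

    # Illustrative settings (not prescribed by the assignment).
    d, L, lam, eta = 144, 100, 0.1, 0.01   # patch dim (12x12), codebook size, Lasso penalty, step size
    V = np.random.randn(d, L)
    V /= np.linalg.norm(V, axis=0)         # start from a random codebook with unit-norm columns

    def sgd_step(V, X, lam, eta):
        """One alternating step on a batch X of shape (d, b)."""
        # Step i: with V fixed, solve the Lasso problems for the whole batch at once;
        # fit() accepts all batch examples as columns of a single target matrix.
        lasso = Lasso(alpha=lam, fit_intercept=False, max_iter=2000)
        lasso.fit(V, X)
        A = lasso.coef_.T                  # coefficient matrix, shape (L, b)
        # Step ii: with A fixed, take a gradient step on the codebook.
        # For the assumed loss (1/2)||X - V A||_F^2, the gradient w.r.t. V is -(X - V A) A^T.
        grad = -(X - V @ A) @ A.T
        V = V - eta * grad
        V /= np.linalg.norm(V, axis=0)     # renormalize columns (a common choice in sparse coding)
        return V, A

A usage note: calling fit once per batch, as above, is the "all training points at once" variant mentioned in (b); timing it against a per-example loop is the comparison the problem suggests.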
You have two options:

- You can download the notebook to your computer and remove the first cell, which is specific to Google Drive.

- Or, you can save this notebook to your Google Drive by going to the File menu and choosing "Save a copy in Drive". It will be saved in your Google Colab folder, as explained in an earlier message. Upload the MNIST data to your Google Drive and you're ready to go. The advantage is that you can activate a GPU and have the algorithm run very fast. Once you open your own Colab notebook, go to the Runtime menu, choose "Change runtime type", and pick GPU from the dropdown menu.

(a) Compute the total number of parameters in the original model, and run this model. You shouldn't run more than 20 epochs. (On the RCC with 8 cores it takes about 90 seconds per epoch with all the training data.) You can do this with only 10000 training examples to expedite the experiments. For each experiment, plot the error rate on training and validation as a function of the epoch number. Show an image with the 32 5×5 filters that are estimated in the first layer of the model.

(b) Experiment with changing parameters of the network:

    i. Keep the same number of layers and change the layer parameters, reducing the number of parameters by half in one experiment and doubling the number of parameters in another. Try a few different options. Report the results.

    ii. Design a deeper network with more or less the same number of parameters as the original network. Report the results.

    iii. Once you pick the best configuration, try it on the full training set and report the result.

(c) Handling variability. A transformed data set has been created at /project2/cmsc25025/mnist/MNIST_TRANSFORM.npy by taking each digit, rotating it by a random angle in [-40,-20] or [20,40], applying a random shift of +/- 3 pixels in each direction, and applying a random scale between [.9, 1.1]. Display a few of these examples alongside the original digits.

Use the original architecture to test on this data set. The classification rate drops dramatically. Try to propose changes to the network architecture so that, still training on the original training set, you would perform better on the transformed test set. Perform some experiments using a transformed validation set and show the final results on the transformed test set.
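As a starting point for the display step in (c), here is a small sketch that plots a few transformed digits next to the originals. It assumes the .npy file holds an array of images aligned index-by-index with the original set, and ORIGINAL_PATH is a hypothetical placeholder for wherever the provided notebook loads the untransformed MNIST images from; adapt both to the actual file layout.

    import numpy as np
    import matplotlib.pyplot as plt

    ORIGINAL_PATH = '/path/to/original_mnist.npy'   # hypothetical placeholder, not a real course path
    transformed = np.load('/project2/cmsc25025/mnist/MNIST_TRANSFORM.npy')
    original = np.load(ORIGINAL_PATH)

    # Show the first 5 digits: originals on the top row, transformed versions below.
    fig, axes = plt.subplots(2, 5, figsize=(10, 4))
    for k in range(5):
        axes[0, k].imshow(np.asarray(original[k]).reshape(28, 28), cmap='gray')
        axes[1, k].imshow(np.asarray(transformed[k]).reshape(28, 28), cmap='gray')
    axes[0, 0].set_ylabel('original')
    axes[1, 0].set_ylabel('transformed')
    for ax in axes.ravel():
        ax.set_xticks([]); ax.set_yticks([])
    plt.show()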