Week 6

$$\gdef \sam #1 {\mathrm{softargmax}(#1)}$$ $$\gdef \vect #1 {\boldsymbol{#1}} $$ $$\gdef \matr #1 {\boldsymbol{#1}} $$ $$\gdef \E {\mathbb{E}} $$ $$\gdef \V {\mathbb{V}} $$ $$\gdef \R {\mathbb{R}} $$ $$\gdef \N {\mathbb{N}} $$ $$\gdef \relu #1 {\texttt{ReLU}(#1)} $$ $$\gdef \D {\,\mathrm{d}} $$ $$\gdef \deriv #1 #2 {\frac{\D #1}{\D #2}}$$ $$\gdef \pd #1 #2 {\frac{\partial #1}{\partial #2}}$$ $$\gdef \set #1 {\left\lbrace #1 \right\rbrace} $$

Lecture part A

We discussed three applications of convolutional neural networks. We started with digit recognition and its application to 5-digit zip code recognition. For object detection, we discussed how to use a multi-scale architecture in a face detection setting. Lastly, we saw how ConvNets are used in semantic segmentation tasks, with concrete examples from a robotic vision system and from object segmentation in an urban environment.

Lecture part B

We examined Recurrent Neural Networks (RNNs), their problems, and common techniques for mitigating them. We then reviewed a variety of modules developed to resolve these issues, including Attention, GRUs (Gated Recurrent Units), LSTMs (Long Short-Term Memory), and Seq2Seq.


We discussed the architectures of vanilla RNN and LSTM models and compared the performance of the two. An LSTM inherits the advantages of an RNN while mitigating its weaknesses: it adds a ‘memory cell’ that can store information for long periods of time. In our comparison, the LSTM models significantly outperformed the vanilla RNN models.
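To make the role of the memory cell concrete, here is a minimal sketch in plain Python with scalar states. It is not the code used in the practicum. The gate parameters in `PARAMS` and the RNN weights are hand-picked assumptions rather than learned values, and the names `rnn_step`, `lstm_step`, and `run` are hypothetical helpers introduced only for this illustration. The sequence carries a single non-zero input at the first step, followed by zeros.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def rnn_step(h, x, w_h=0.5, w_x=1.0):
    """Vanilla RNN step with scalar state: h_t = tanh(w_h * h_{t-1} + w_x * x_t)."""
    return math.tanh(w_h * h + w_x * x)


# Hand-picked (not learned) parameters: (w_x, w_h, bias) per gate.
PARAMS = {
    "i": (12.0, 0.0, -6.0),  # input gate: opens only when x is large
    "f": (0.0, 0.0, 6.0),    # forget gate: stays close to 1
    "o": (0.0, 0.0, 6.0),    # output gate: stays close to 1
    "g": (2.0, 0.0, 0.0),    # candidate value to write into the cell
}


def lstm_step(h, c, x, params=PARAMS):
    """LSTM step with scalar hidden state h and memory cell c."""
    pre = {k: w_x * x + w_h * h + b for k, (w_x, w_h, b) in params.items()}
    i, f, o = sigmoid(pre["i"]), sigmoid(pre["f"]), sigmoid(pre["o"])
    g = math.tanh(pre["g"])
    c_new = f * c + i * g          # keep most of the old cell, add gated input
    h_new = o * math.tanh(c_new)   # expose the cell through the output gate
    return h_new, c_new


def run(seq):
    """Feed the same sequence to both models and return their final states."""
    h_rnn = 0.0
    h_lstm, c_lstm = 0.0, 0.0
    for x in seq:
        h_rnn = rnn_step(h_rnn, x)
        h_lstm, c_lstm = lstm_step(h_lstm, c_lstm, x)
    return h_rnn, h_lstm, c_lstm


# One informative input at the first step, then 49 steps of zeros.
seq = [1.0] + [0.0] * 49
h_rnn, h_lstm, c_lstm = run(seq)
print(f"vanilla RNN hidden state after 50 steps: {h_rnn:.2e}")
print(f"LSTM hidden state after 50 steps:        {h_lstm:.3f}")
```

With these settings the vanilla RNN state decays towards zero, because every step squashes and rescales it, while the LSTM cell keeps most of what was written at the first step, because the forget gate stays near 1 and the input gate stays near 0. This only illustrates retention in the forward pass. It says nothing about how either model performs once trained.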