# Week 11

## Lecture part A

In this section, we discussed the common activation functions in PyTorch. In particular, we compared activations with kink(s) against smooth activations - the former are preferred in deep neural networks, as the latter can suffer from the vanishing gradient problem. We then learned about the common loss functions in PyTorch.
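To make the vanishing gradient remark concrete, here is a minimal sketch in plain Python (not PyTorch, so that it stays self-contained). It assumes a deliberately simplified chain of layers in which weights are ignored and every layer sees the same pre-activation value; the helper names `sigmoid_grad`, `relu_grad`, and `chained_grad` are hypothetical. The sigmoid stands in for a smooth activation and ReLU for an activation with a kink.

```python
import math


def sigmoid_grad(x):
    # Derivative of the sigmoid; its maximum value is 0.25, reached at x = 0.
    s = 1.0 / (1.0 + math.exp(-x))
    return s * (1.0 - s)


def relu_grad(x):
    # Derivative of ReLU: 1 on the active side of the kink, 0 otherwise.
    return 1.0 if x > 0 else 0.0


def chained_grad(local_grad, x, depth):
    # Simplifying assumption: the same local derivative is multiplied
    # once per layer, and weight matrices are left out entirely.
    return local_grad(x) ** depth


sigmoid_chain = chained_grad(sigmoid_grad, 0.0, 20)  # 0.25 ** 20
relu_chain = chained_grad(relu_grad, 1.0, 20)        # 1.0 ** 20

print(f"sigmoid, 20 layers: {sigmoid_chain:.3e}")
print(f"ReLU, 20 layers:    {relu_chain:.3e}")
```

Even in the best case for the sigmoid, the chained derivative shrinks geometrically with depth, while an active ReLU unit passes the gradient through unchanged. Real networks also multiply by weights, so this is only an illustration of the trend.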

## Lecture part B

In this section, we continued to learn about loss functions - in particular, margin-based losses and their applications. We then discussed how to design a good loss function for energy-based models (EBMs), along with examples of well-known EBM loss functions. We gave particular attention to margin-based loss functions here and explained the idea of the “most offending incorrect answer”.
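As an illustration of a margin-based loss, the sketch below computes a hinge loss over a small set of discrete answers, where the most offending incorrect answer is taken to be the incorrect answer with the lowest energy. The function names, the list-of-energies representation, and the default margin of 1.0 are assumptions made for this example.

```python
def most_offending_incorrect(energies, correct):
    # Among all answers except the correct one, pick the index with the
    # lowest energy: the wrong answer the model currently likes best.
    candidates = [i for i in range(len(energies)) if i != correct]
    return min(candidates, key=lambda i: energies[i])


def hinge_loss(energies, correct, margin=1.0):
    # The loss is zero once the correct answer's energy is lower than the
    # most offending incorrect answer's energy by at least the margin.
    worst = most_offending_incorrect(energies, correct)
    return max(0.0, margin + energies[correct] - energies[worst])


# Hypothetical energies for three candidate answers; index 0 is correct.
violating = [0.2, 1.5, 0.6]   # gap to the closest wrong answer is 0.4
satisfied = [0.2, 1.5, 2.0]   # gap to the closest wrong answer is 1.3

print(hinge_loss(violating, correct=0))
print(hinge_loss(satisfied, correct=0))
```

The loss only pushes on the pair formed by the correct answer and its closest competitor, and it stops pushing once the gap exceeds the margin.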

## Practicum

This practicum proposed an effective policy learning approach for driving in dense traffic. We trained multiple policies by unrolling a learned model of the real-world dynamics while optimizing different cost functions. The idea is to minimize the uncertainty in the model’s predictions by introducing a cost term that represents the model’s divergence from the states it was trained on.
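The summary above does not say how the uncertainty term is computed. As one possible sketch, the code below assumes that uncertainty is estimated as the variance across several predictions of the same next state (for example, from repeated stochastic forward passes) and that it is added to a task cost with a weight `lam`. The names `uncertainty_cost`, `total_cost`, and `lam`, as well as the toy numbers, are all hypothetical.

```python
from statistics import pvariance


def uncertainty_cost(predictions):
    # predictions: list of predicted state vectors for the same input.
    # Assumption: disagreement between them (mean per-dimension variance)
    # serves as a proxy for being far from the training states.
    dims = list(zip(*predictions))
    return sum(pvariance(d) for d in dims) / len(dims)


def total_cost(task_cost, predictions, lam=0.5):
    # Task cost plus a weighted penalty for uncertain predictions.
    return task_cost + lam * uncertainty_cost(predictions)


# Toy predictions: three samples of a two-dimensional state.
agree = [[1.0, 2.0], [1.0, 2.0], [1.0, 2.0]]
disagree = [[0.0, 2.0], [1.0, 2.0], [2.0, 2.0]]

print(total_cost(1.0, agree))
print(total_cost(1.0, disagree))
```

With equal task cost, the action whose predicted outcomes disagree is penalized more, which steers the policy toward states where the learned model is more reliable.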