Book
Since Feb 2022 I’ve been writing our textbook on Deep Learning with an Energy perspective. It will come in two versions: an electronic one with a dark background for screens (freely available) and a physical one with a white background for print (for purchase).
I finished writing the first 3 chapters and corresponding Jupyter Notebooks:
- Intro;
- Spiral;
- Ellipse.
Once the 4th chapter and notebook are done (end of Aug?), the draft will be submitted to the reviewers (Mikael Henaff and Yann LeCun). After merging their contributions (end of Sep?), a first draft of the book will be available to the public on this website.
Book format
The book is highly illustrated with the $\LaTeX$ packages TikZ and PGFPlots. The figures are generated numerically, with the computations done in Python using the PyTorch library. The outputs of these computations are stored as ASCII files and then read by $\LaTeX$, which visualises them. Moreover, most figures are also rendered in the Notebooks using the Matplotlib library.
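To give a concrete (and entirely hypothetical) flavour of this pipeline, the Python side can dump each curve as a plain-text table that PGFPlots then picks up; the file name and column layout below are made up for illustration.

```python
import torch

# Hypothetical sketch of the Python → ASCII → PGFPlots pipeline described
# above (file name and column layout are made up for illustration).
x = torch.linspace(-3, 3, 101)
y = torch.tanh(x)

with open('tanh.dat', 'w') as f:                 # plain ASCII, one sample per row
    f.write('x y\n')
    for xi, yi in zip(x.tolist(), y.tolist()):
        f.write(f'{xi:.6f} {yi:.6f}\n')

# On the LaTeX side, something like
#     \addplot table [x=x, y=y] {tanh.dat};
# inside a pgfplots axis environment reads the file back in.
```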
Why plot with $\LaTeX$?
Because I can control every single aspect of what is drawn.
If I define the hidden vector $\green{\vect{h}} \in \green{\mathcal{H}}$ in the book, I can have a pair of axes labelled $\green{h_1}$ and $\green{h_2}$ and the Cartesian plane labelled $\green{\mathcal{H}}$ without going (too) crazy.
All my maths macros, symbols, fonts, font sizes, and colours are controlled by one single stylesheet called maths-preamble.tex.
Why colours?
Because I think in colours. Hence, I write in colours. And if you’ve been my student, you already know that at the bottom left we’ll have a pink-bold-ex $\pink{\vect{x}}$ from which we may want to predict a blue-bold-why $\blue{\vect{y}}$ and there may be lurking an orange-bold-zed $\orange{\vect{z}}$.
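To give an idea of what such a stylesheet looks like, here is a hypothetical excerpt in the spirit of maths-preamble.tex (macro names and colour values below are placeholders, not the actual ones).

```latex
% Hypothetical excerpt in the spirit of maths-preamble.tex
% (macro names and colour values are placeholders).
\usepackage{amsmath, bm, xcolor}
\definecolor{mypink}{HTML}{F26DAE}
\definecolor{myblue}{HTML}{6EA8FE}
\definecolor{myorange}{HTML}{FFA94D}
\newcommand{\vect}[1]{\bm{#1}}                      % bold vectors
\newcommand{\pink}[1]{\textcolor{mypink}{#1}}       % observations x
\newcommand{\blue}[1]{\textcolor{myblue}{#1}}       % predictions  y
\newcommand{\orange}[1]{\textcolor{myorange}{#1}}   % latents      z
```

With definitions like these, $\pink{\vect{x}}$ renders as a pink-bold-ex everywhere, and tweaking a colour in one place propagates to the entire book.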
Illustration sneak peeks
To keep myself motivated and avoid going too crazy, I post the most painful drawings on Twitter, where my followers keep me sane by sending copious amounts of love ❤️. You can find a few of these tweets here.
I think I've just acquired the title of TikZ-ninja. pic.twitter.com/dq43bvjcFG
18 hrs writing the book in a row… Let's go home 😝😝😝
A small update, so I keep motivating myself to push forward 😅😅😅
Last update: a preview of the book's “maximum likelihood” section and its generating code.
Achievement of the day 🥳🥳🥳 Vectors and functions 💡💡💡
One giant leap for Alf, one small step forward for the book 🥲🥲🥲 #TeXLaTeX #EnergyBasedModel #DLbook pic.twitter.com/X3FU8Uijys
Just some free energy geometric construction. 🤓🤓🤓 pic.twitter.com/DsIevqzuv2
Negative gradient comparison for F∞ and Fᵦ. «The ellipse toy example» chapter is DONE. 🥳🥳🥳
A small glimpse from the book, achievement of the day 🤓🤓🤓
Another update from the book. 📖 When looking at a classifier, we can consider its energy as being the cross-entropy or its negative linear output (often called logits). The energy of a well-trained model will be low for compatible (x, y) pairs and high for incompatible ones. 📖📖📖 pic.twitter.com/HlfvXQvGWn
Maths operand order is often counterintuitive. We can use SVD to inspect 🔍 what a given linear transformation does. From the diagram below we can see how the lavender oriented circle with axes 𝒗₁ and 𝒗₂ gets morphed into the aqua oriented ellipse with axes 𝜎₁𝒖₁ and 𝜎₂𝒖₂. So, they are ‘stretchy rotations’. pic.twitter.com/0HpOwOPbpf
A neural net is a sandwich 🥪 of linear and non-linear layers. Last week we learnt about the geometric interpretation of linear transformations, and now we're appreciating a few activation functions' morphings.
Chapter 1 (2 and 3) completed! 🥳🥳🥳
Good night World 😴😴😴 pic.twitter.com/kLtw2yeG92
Suggestions and feedback are welcome! 😊😊😊 pic.twitter.com/d5NeKieE5m
🥳🥳🥳 https://t.co/JZeAHuuTnA pic.twitter.com/dgaUIw5bWN
Plenty of pain! 🥲🥲🥲 pic.twitter.com/5BBS5J59bC
A vector 𝒆 ∈ ℝᴷ can be thought of as a function 𝒆 : {1, …, 𝐾} ⊂ ℕ → ℝ, mapping all 𝐾 elements to a scalar value.
Similarly, a function 𝑒 : ℝᴷ → ℝ can be thought of as an infinite vector 𝑒 ∈ ℝ^ℝᴷ, having ℝᴷ elements. pic.twitter.com/ccZREDAal1
For super-cold 🥶 zero-temperature limit we have a single force pulling on the manifold per training sample.
For warmer temperatures ☀️😎 we pull on regions of the manifold.
For super-hot 🥵 settings we kill ☠️ all the latents 😥. pic.twitter.com/cFsGQ3FJFV
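For context, this is the standard free-energy construction used in energy-based models (the book's exact notation may differ): given an energy $E(\pink{\vect{x}}, \orange{\vect{z}})$ and an inverse temperature $\beta$,

$$F_\beta(\pink{\vect{x}}) = -\frac{1}{\beta} \log \int \exp\big({-\beta\, E(\pink{\vect{x}}, \orange{\vect{z}})}\big)\, \mathrm{d}\orange{\vect{z}}, \qquad F_\infty(\pink{\vect{x}}) = \lim_{\beta \to \infty} F_\beta(\pink{\vect{x}}) = \min_{\orange{\vect{z}}} E(\pink{\vect{x}}, \orange{\vect{z}}).$$

At zero temperature ($\beta \to \infty$) only the minimising latent contributes, hence a single pulling force per training sample; at finite temperature every latent is weighted by $\exp(-\beta E)$, so whole regions of the manifold get pulled.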
7.5k words, 1.2k lines of TikZ, 0.8k lines of Python.
I think I got this! 🥲🥲🥲 pic.twitter.com/5uwwrLcXPf
The two soft maxima and soft minima are compared to the minimum, average, and maximum of a real vector (of size 5). This is a fun plot because the y-axis does something funky 🤪🤪🤪 pic.twitter.com/tST48uxmL2
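For reference, the two usual constructions (the book's exact definitions may differ) for a vector $\vect{v} \in \mathbb{R}^K$ and a parameter $\beta > 0$ are

$$\frac{1}{\beta} \log \sum_{k=1}^{K} \exp(\beta v_k) \qquad \text{and} \qquad \sum_{k=1}^{K} v_k \, \frac{\exp(\beta v_k)}{\sum_{j=1}^{K} \exp(\beta v_j)};$$

both tend to $\max_k v_k$ as $\beta \to +\infty$, flipping the sign of $\beta$ gives the corresponding soft minima, and the second expression passes through the plain average at $\beta = 0$.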
A classifier 'moves' points around such that they can be separated by the output linear decision boundaries.
Usually one looks at how the net warps the decision boundaries around the data but I like to look at how the input is unwarped instead. 🤓 pic.twitter.com/M3ZGmUUZI6
For example, 𝒔 = 𝑾𝒓 = 𝑼𝚺𝑽ᵀ𝒓 can be more naturally represented by the following circuit. 🤓🤓🤓 pic.twitter.com/S6rdtBtzuy
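A quick PyTorch check of that factorisation (the shapes below are illustrative):

```python
import torch

# Any linear map W factors as U Σ Vᵀ: a rotation/reflection, an axis-aligned
# stretch by the singular values, and another rotation/reflection.
torch.manual_seed(0)
W = torch.randn(2, 2)
r = torch.randn(2)

U, S, Vh = torch.linalg.svd(W)          # W = U @ diag(S) @ Vh
s = U @ torch.diag(S) @ Vh @ r
print(torch.allclose(s, W @ r))         # True: s = W r = U Σ Vᵀ r
```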
Almost done with the intro chapter! 🥳🥳🥳 pic.twitter.com/9SAIfkKUWk
We've seen a linear and a bunch of non-linear transformations. But what can a stack of linear and non-linear layers do? Here we have two fully-connected nets doing their nety stuff on some random points. 😀😀😀 pic.twitter.com/otExi5h7bb
Last update: 26 Jul 2022.
Oct 2022 update
For the entire month of Aug and half of Sep I got stuck on implementing a working sparse coding algo for a low-dimensional toy example. Nothing worked for a long while, although I eventually managed to get the expected result (see tweets below). Then, I spent a couple of weeks on the new semester’s lectures, creating new content (slides below, video available soon) on back-propagation, which I had never taught at NYU before, a topic that will make it into the book. Anyhow, now I’m back to writing! 🤓
Zooming in a little, for some finer details. pic.twitter.com/i57E0rYwzH
Backpropagation ⏮ of the gradOutput through each of the network's modules allows us to compute the rate of change of the loss 📈 wrt the model's parameters.
To inspect 🧐 its value we can simply check the gradBias of any linear layer. pic.twitter.com/buysxDBGD7
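Here is a minimal PyTorch sanity check of that last claim (toy shapes, not the book's code): for a linear layer, the bias gradient is exactly the gradOutput summed over the batch.

```python
import torch

# Minimal sanity check (toy shapes): for a linear layer y = x Wᵀ + b, the
# gradient wrt the bias equals the gradOutput summed over the batch, so
# inspecting bias.grad reveals the gradient flowing into that layer.
torch.manual_seed(0)
lin = torch.nn.Linear(3, 2)
x = torch.randn(5, 3)
y = lin(x)

grad_output = torch.randn(5, 2)     # pretend this arrives from the loss
y.backward(grad_output)

print(torch.allclose(lin.bias.grad, grad_output.sum(0)))  # True
```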
Last update: 26 Sep 2022.
May 2023 update
Oh boy, this 4th chapter took me a while (mostly because I’ve also focussed on other things, including the Spring 2023 edition of the course)… but it’s done now! In these last few months I’ve written about undercomplete autoencoders (AE), denoising AE, variational AE, contractive AE, and generative adversarial nets. Thanks to Gabriel Peyré, I’ve developed a method to separate stationary sinks and sources for a dynamics field (which I may write an article about), and it’s now an integral part of the book’s explanations.
Moreover, I’ve been pushing out a few videos from the Fall 2022 edition of the course, which give a preview of the chapters I’ve been writing, e.g. neural net components, backpropagation (first time teaching it), energy-based classification, PyTorch training, K-means, and sparse coding (at least for now). Finally, over the Winter break, I taught 12-year-olds about the maths and computer science behind generative AI, and I’m considering using p5.js as a tool to teach programming to beginners.
What’s next? I’m sending this first draft, with its 4 chapters (Intro, Spiral, Ellipse, Generative) and companion Jupyter Notebooks, to Yann for a review. Meanwhile, I’ll be writing the Backprop chapter, possibly an article, and pushing a few more videos to YouTube. Once the review is completed, a first draft will appear on this website for the public.
A 2 → 100 → 100 → 1 → 100 → 100 → 2 hyperbolic tangent undercomplete autoencoder trying to recover a 1d manifold from 50 2d data points. 📖📖📖 pic.twitter.com/ImKbpPTavY
Let’s get some sections done! 🤓🤓🤓 pic.twitter.com/13bllkQ3wx
A variational autoencoder (VAE) limits the low-energy region by mapping the inputs to fuzzy bubbles. The hidden representation can be made uninformative by increasing the temperature during learning, which induces the bubbles to all be centred at the origin and have unit size. pic.twitter.com/qpa8ptsJDD
Done with the VAE chapter! 🥳🥳🥳
We have a caption now!
The contractive autoencoder section is completed. Epoch 0 vs. epoch 18k.
Let's end this year by starting to upload the first video of the NYU Deep Learning Fall 2022 edition! 🥳🥳🥳
Let's start the year by brushing up on the basics of neural nets: linear and non-linear transformations.
The first video of the «Classification, an Energy Perspective» saga shows two nets' data space transformations, introduces the data format, illustrates the predictor-decoder architecture, and explains how gradient descent is used for learning.
The second video of the «Classification, an Energy Perspective» saga teaches backprop, visualises the energy landscape, and explains how contrastive learning works. 🤓
The third and last video of the «Classification, an Energy Perspective» saga covers neural net 5-step training code in @PyTorch, the justification for gradient accumulation, the reproduction of the energy surface for different models, and ensembling for uncertainty estimation. https://t.co/oyEGlgyhTE pic.twitter.com/MaZsSSRg8U
In this lecture, we start with two examples of decoder-only latent-variable EBMs (𝐾-means and sparse coding), move to target-prop via amortised inference, to finally land on the autoencoder architecture. 🤓
I taught 4 hours of Deep Learning to a class of 7th graders. I didn't dumb it down at all. I just used the same analogies and explanations I use with the grown-ups. By the end I was in love with their young and fresh minds and total absolute attention. ❤️ https://t.co/CFP4Mkarwx pic.twitter.com/Ng0veJLftq
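For the curious, here is a rough PyTorch sketch of the bottleneck architecture in the first tweet above (data, training loop, and exact activation placement are my own illustrative guesses, not the book's code).

```python
import torch
from torch import nn

# Rough sketch of a 2 → 100 → 100 → 1 → 100 → 100 → 2 tanh undercomplete
# autoencoder (data, training loop, and activation placement are guesses).
torch.manual_seed(0)
model = nn.Sequential(
    nn.Linear(2, 100), nn.Tanh(),
    nn.Linear(100, 100), nn.Tanh(),
    nn.Linear(100, 1), nn.Tanh(),       # 1d bottleneck → 1d manifold
    nn.Linear(1, 100), nn.Tanh(),
    nn.Linear(100, 100), nn.Tanh(),
    nn.Linear(100, 2),
)

x = torch.randn(50, 2)                  # stand-in for the 50 2d training points
optimiser = torch.optim.Adam(model.parameters(), lr=1e-3)

for _ in range(5_000):
    optimiser.zero_grad()
    loss = nn.functional.mse_loss(model(x), x)  # reconstruction objective
    loss.backward()
    optimiser.step()
```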
Figures from chapter 4
Two sections to go and the first draft ships! 🥳🥳🥳
Yay! 🥳🥳🥳 pic.twitter.com/Lj30urRpZH
One section to go! 🥳🥳🥳 https://t.co/cpid936wDr pic.twitter.com/mTIDqkYSqm
Losses and generator gradients' norm.
Critic learnt energy. pic.twitter.com/7swifi5qNj
Videos from DLFL22
This is an incremental version based on DLSP21. Therefore, only new content will be uploaded.
Enjoy the view.https://t.co/TxaNhQgUbO pic.twitter.com/hVZYWEJMv8
In this episode, we're concerned with inference only. Forward and backwards. We introduce the cost and the energy. 🔋
Website: https://t.co/3yY8CMLiXz https://t.co/zrqH4CG0mr pic.twitter.com/MrSeV3u40S
Enjoy 🤓❤️🤗https://t.co/glH2iGydIJ pic.twitter.com/S33JxwdH83
This lecture alone was the reason DLFL22 has been pushed online. I hope you like it. ❤️https://t.co/5vVQRwLzxK pic.twitter.com/x0lQaT9hKz
Back to using @AdobeAE for the animations! 🥳 https://t.co/ATbVwuxmcC 🎥 pic.twitter.com/kWEF68cE9Q
Teaching Italian 7th graders
Last update: 16 May 2023.
Aug 2023 update
Of course, during the Summer it was unrealistic to expect anyone to review anything… Anyhow, I’ve just got back from O‘ahu (ICML23) and Maui (2 days before Lahaina burnt down) and finished the Backprop chapter, so the first draft will now have 5 chapters in total. Below you can see a few diagrams I’ve developed over these Summer months.
The new semester starts in two weeks, so I’ll be a bit busy with that. I need to plan a possible chapter on joint embedding methods and start working on PART II of the book: ‘geometric stuff’.
Speaking of books, I’ve just received my copy of The Little Book of Deep Learning by François Fleuret. I have to say it is really well made and I really like it. It’s a bit on the terse side, but I haven’t decided whether that’s a pro or a con.
Let's go fancy with inline diagrams!
Backprop, the key component behind training multi-layered deep nets, can sometimes be challenging to digest. What follows is an attempt to illustrate it, starting from the last linear layer's gradWeight and gradBias computation in a regression setup. 🤓🤓🤓 pic.twitter.com/PLeiRjJVYb
A neural net is made of simple building blocks.
«Weights sharing implies tied gradients accumulation.» Since it's not obvious to half of you and only a small fraction can prove it (link to the poll below), let me share this latest book section with y'all! 😀😀😀
The one-hot row routing and branching matrix G 🐢 is a peculiar object. When it's used in a left-multiplication, it acts as a selector and/or branching operator. When it's used in a right-multiplication, it acts as an accumulator via the paths that have previously branched out. pic.twitter.com/DEH7FOxhqr
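Here is a tiny PyTorch demonstration of «weights sharing implies tied gradients accumulation» (a toy example of mine, not the book's):

```python
import torch

# When the same weight is used in two places, autograd accumulates (sums)
# the gradient contributions from both uses.
torch.manual_seed(0)
w = torch.randn(3, 3, requires_grad=True)
x = torch.randn(3)

y = w @ (w @ x)            # the same w appears twice in the graph
y.sum().backward()

# Recompute the two per-use contributions separately and compare.
w1 = w.detach().clone().requires_grad_(True)   # outer use
w2 = w.detach().clone().requires_grad_(True)   # inner use
(w1 @ (w2 @ x)).sum().backward()
print(torch.allclose(w.grad, w1.grad + w2.grad))  # True
```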
LaTeX has no secrets from me, mhuahahaha! 🤪🤪🤪
(Writing the backprop chapter.) pic.twitter.com/bm1knKYCI7
Learning how the output gradient is backpropagated through these basic components helps us understand how each part contributes to the final model performance.
Below we see how the node & sum complementary modules behave. pic.twitter.com/tv5s5A1TFp
This also justifies the backward behaviour of the node module. pic.twitter.com/fA9dpYZiIp
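In autograd terms, the complementarity can be checked with a toy example (mine, not the book's): the branching node sums the incoming gradients on the way back, while the sum module copies the output gradient to each of its inputs.

```python
import torch

# A "node" (branching) module copies its input to several consumers, so its
# backward must *sum* the incoming gradients; a "sum" module adds its inputs,
# so its backward *copies* the output gradient to each input.
x = torch.tensor([1.0, 2.0], requires_grad=True)

a = 3 * x                  # first branch fed by the node
b = 5 * x                  # second branch fed by the node
y = a + b                  # sum module
y.sum().backward()

print(x.grad)              # tensor([8., 8.]): the 3 and the 5 accumulate
```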
Last update: 16 Aug 2023.