MIT Reading Group (Fall 2022):
The Science of Deep Learning

Welcome to the Science of Deep Learning Reading Group! This fall at MIT (2022), we will be reading papers relevant to understanding what deep neural networks do and why they work so well. While deep learning has driven a huge amount of progress in AI over the last decade, this progress has primarily been the result of trial and error, guided by loose heuristics rather than by a principled understanding of these systems. We will be reading work that may help us gain a more principled understanding of deep neural networks.

We are running three sections.

For the sections that run two hours, the first hour will mostly be for reading and the second mostly for discussion, so if you've already done the reading, it may make sense to show up an hour late.

This reading group is being organized by Eric Michaud, Kaivu Hariharan, and Guilhermo Cutrim Costa, with support from the MIT AI Alignment Club (MAIA).

Syllabus

Week 1 - Outlook for the Science of Deep Learning

Is the task of understanding deep learning more like physics or biology?

Week 2 - Scaling Laws

Bigger networks are better, predictably.
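
As a rough illustration of what "predictably" means here (a sketch, not part of the readings themselves): empirical scaling laws are typically reported as power laws in model size, with test loss falling off as

    L(N) \approx (N_c / N)^{\alpha_N}

where N is the number of parameters and N_c, \alpha_N are fitted constants (the symbols are generic placeholders, not from this page).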

Week 3 - Explanations of Scaling Laws

Why are bigger networks better, predictably?

Week 4 - Emergent Capabilities

Neural network performance not so predictable after all?

Week 5 - Induction Bumps

A "phase change" in model performance during training

Week 6 - Grokking

Neural networks sometimes generalize long after memorizing their training data

Week 7 - Adversarial Examples

Deep neural networks are alien minds

Week 8 - The Infinite-Width Limit

Super wide neural networks become simple

Week 9 - Information Theory

Compression vs. fitting in deep networks

Week 10 - The Lottery Ticket Hypothesis

You could have trained a much smaller network, if only you had initialized it right

Week 11 - The Generalization Mystery

Neural networks learn solutions that generalize when they could have simply memorized