|
Part I: Basic gradient methods

| # | Dates | Topic | Materials |
| --- | --- | --- | --- |
| 1 | 08.(23/25) | Intro & Preliminaries | .pdf, .pdf, .ipynb, .ipynb |
| 2 | 08.30-09.(01/06) | Gradient method | .pdf, .pdf, .ipynb |
| 3 | 09.(08/13/15) | Gradient method & Convexity | .pdf, .pdf, .ipynb |
| 4 | 09.(20/22/27) | Conditional gradient (Frank-Wolfe) | .pdf, .pdf, .ipynb |
|
Part II: Going faster than basic gradient descent

| # | Dates | Topic | Materials |
| --- | --- | --- | --- |
| 5 | 09.29-10.04 | Beyond first-order methods | .pdf, .pdf, .ipynb |
| 6 | 10.(06/13) | Momentum acceleration | .pdf, .pdf, .ipynb |
| 7 | 10.(18/20) | Stochastic motions in gradient descent | .pdf, .pdf, .ipynb |
|
Part III: Provable non-convex optimization

| # | Dates | Topic | Materials |
| --- | --- | --- | --- |
| 8 | 10.(25/27)-11.01 | Sparse feature selection and recovery | .pdf, .pdf, .ipynb |
| 9 | 11.(03/08) | Low-rank recovery | .pdf, .pdf, .ipynb |
|
Part IV: Optimization methods in modern ML

| # | Dates | Topic | Materials |
| --- | --- | --- | --- |
| 10 | 11.(17/22) | Landscape properties of general functions | .pdf, .pdf |
| 11 | 11.(22) | Distributed computing + Algorithms for NN training | /schedule/images/chapter11-12.pdf, .pdf/.pdf |
| 13 | 11.29-12.01 | Project presentations | — |
|
Part V: Final exam
|
|
|