- Proceedings 2019: Part 1, Part 2
- Proceedings 2020: Part 1, Part 2
- Proceedings 2021: Part 1, Part 2
The course project can take the form of a literature review, original research, or a literature review that evolves into original research (there is flexibility here).
- Literature review. An in-depth review and analysis of a paper, selected either from a list of papers provided by the instructor or proposed by you with the instructor's approval. The review should give a thorough summary, exposition, and discussion of the paper, which will often require reading other related papers on the subject.
- Original research. You are strongly encouraged to combine your ongoing research with the course project; otherwise, the instructor will provide some ideas to pursue. The project can be either theoretical or experimental.
Milestones
- Pick a project as soon as possible. Deadline: February 2nd (Friday).
- Submit a one-page description of the project: what it is about, your opinion of it, what needs to be done (including related papers to read), and whether you have ideas for improving the approaches involved. Explain why they are important or interesting, and provide appropriate references. If the project is original research, include a plan of the next steps and of what must be completed by the end of the semester to finish the project. Deadline: February 16th (Friday).
- We will probably have in-class presentations towards the end of the semester. These will be spotlight talks (roughly 5-10 minutes). Prepare an oral presentation with slides; focus on the high-level ideas and leave most technical details to your report.
- A written report. A LaTeX template will be provided (most likely in ICML format). The report should be at least six pages long, excluding references. Deadline: end of the semester. Note that a project can continue beyond the end of the semester if it merits publication.
Suggested list of projects/papers (to be updated)
Project ideas
- Computer-assisted worst-case analysis of gradient-based algorithms
- Modernized view of learning rates in optimization methods
- Recent advances in adaptive methods in ML
- Adaptive proximal algorithms for convex optimization under local Lipschitz continuity of the gradient
- Stochastic Polyak step-size for SGD: An adaptive learning rate for fast convergence (a minimal sketch of this step-size rule appears after this list)
- Learning-Rate-Free Learning by D-Adaptation
- Automatic Gradient Descent: Deep Learning without Hyperparameters
- Adaptive FL with auto-tuned clients
- Sparse post-training pruning and update in Neural Network training
- Quantum Approximate Optimization Algorithm from an Optimization perspective
- Empirical evaluation of derivative-free optimization methods on quantum objectives
- Homotopy methods, graduated optimization, and quantum annealing
- Transformer alternatives
- (Recent) advances in asynchrony in distributed SGD
- Review of classical continual learning methods
- Review of modern continual learning methods
- One Size Fits All for Semantic Shifts: Adaptive Prompt Tuning for Continual Learning
- Compress, Then Prompt: Improving Accuracy-Efficiency Trade-off of LLM Inference with Transferable Prompt
- FedJETs: Efficient Just-In-Time Personalization with Federated Mixture of Experts
- Sweeping Heterogeneity with Smart MoPs: Mixture of Prompts for LLM Task Adaptation
- Review of ML models on weather forecasting
- Learning skillful medium-range global weather forecasting
- ClimaX: A foundation model for weather and climate
- FourCastNet: Accelerating Global High-Resolution Weather Forecasting using Adaptive Fourier Neural Operators
- WeatherBench 2: A benchmark for the next generation of data-driven global weather models
- FengWu: Pushing the Skillful Global Medium-range Weather Forecast beyond 10 Days Lead
- (probably more papers will be needed…)
- Review of large-scale ML models in AI
- Literature review on adapters in neural network training and optimization
- Literature review of recent developments on Frank-Wolfe methods
- Revisiting Frank-Wolfe: Projection-Free Sparse Convex Optimization
- (More papers to be announced)
- Recent advances in acceleration methods
- Review of Byzantine distributed optimization
- Machine learning with adversaries: Byzantine tolerant gradient descent
- Byzantine-resilient SGD in high dimensions on heterogeneous data
- The hidden vulnerability of distributed learning in Byzantium
- Byzantine machine learning made easy by resilient averaging of momentums
- An equivalence between data poisoning and Byzantine gradient attacks
- Byzantine-robust learning on heterogeneous datasets via resampling
- Review on efficient distributed protocols: independent subnetwork training (IST)
- Distributed learning of fully connected neural networks using independent subnet training
- GIST: Distributed training for large-scale graph convolutional networks
- ResIST: Layer-wise decomposition of ResNets for distributed training
- Efficient and Light-Weight Federated Learning via Asynchronous Distributed Dropout
- Federated Learning Over Images: Vertical Decompositions and Pre-Trained Backbones Are Difficult to Beat
- Review of theoretical results for various pruning methods
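For concreteness, the stochastic Polyak step-size entry above boils down to a single update rule: sample a component f_i, then step with gamma_t = (f_i(x_t) - f_i*) / (c * ||grad f_i(x_t)||^2), typically capped at some gamma_max. The NumPy sketch below is only an illustration of that rule on a toy, consistent least-squares problem; the problem instance, the choice f_i* = 0, and the constants c and gamma_max are assumptions made here, not code or settings taken from the listed paper.

```python
import numpy as np

# Minimal sketch of SGD with a stochastic Polyak step size (SPS):
#   gamma_t = (f_i(x_t) - f_i*) / (c * ||grad f_i(x_t)||^2), capped at gamma_max.
# Here f_i* is taken to be 0, a common lower bound for non-negative losses
# under interpolation; the least-squares instance below is a hypothetical toy.

rng = np.random.default_rng(0)
n, d = 200, 20
A = rng.normal(size=(n, d))
x_star = rng.normal(size=d)
b = A @ x_star                       # consistent system, so each f_i attains 0

def f_i(x, i):
    r = A[i] @ x - b[i]
    return 0.5 * r ** 2

def grad_i(x, i):
    return (A[i] @ x - b[i]) * A[i]

def sgd_sps(x0, iters=2000, c=0.5, gamma_max=10.0, eps=1e-12):
    x = x0.copy()
    for _ in range(iters):
        i = rng.integers(n)
        g = grad_i(x, i)
        # Polyak step size for the sampled component, capped at gamma_max.
        gamma = min((f_i(x, i) - 0.0) / (c * (g @ g) + eps), gamma_max)
        x -= gamma * g
    return x

x_hat = sgd_sps(np.zeros(d))
print("distance to solution:", np.linalg.norm(x_hat - x_star))
```

In the listed paper, f_i* denotes the minimum of the sampled component; using 0 as a lower bound, as above, is only reasonable for non-negative losses, and non-interpolating problems generally require a better estimate of f_i* or additional safeguards.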