Maksym Andriushchenko

Enjoying the gorgeous 🇨🇭 peaks! This one is Rochers de Naye.

Email | Twitter/X | Google Scholar | GitHub | CV

Short bio. I’m a fifth-year PhD student in computer science at EPFL 🇨🇭 advised by Nicolas Flammarion. My research is supported by the Google and Open Phil AI PhD Fellowships. I did my MSc at Saarland University and the University of Tübingen, and interned at Adobe Research.

Research interests. My primary research goal is to understand generalization in deep learning. Towards this goal, I’ve worked on adversarial robustness, out-of-distribution generalization, implicit regularization, and sharpness-aware minimization. These days, I’m looking more into optimization and generalization properties of language models. My full publication list is available here.

On Ukraine. Since I’m from Ukraine, I’m often asked about the situation in my country and how one can help. The most effective way is to donate to local Ukrainian organizations helping on the ground, e.g., see this list, which includes both trusted military and humanitarian organizations. You can also host displaced scholars and students from Ukraine, e.g., see the #ScienceForUkraine project, where I’m involved as a volunteer. You can also help simply by spreading the word about the war and going to demonstrations in your city. It’s very important that we don’t normalize annexations of territories, numerous war crimes, mass deportations, and nuclear threats. Otherwise, we’ll end up in a world we don’t really want to be in.

news

Sep 1, 2024 Looking for a postdoc position to start in September 2024. If you think my background can be a good fit, I’d be happy to discuss! I’m also attending NeurIPS in New Orleans and would be happy to chat there.
Jan 5, 2024 A talk at the Deep Learning: Classics and Trends (organized by ML Collective) about our recent work Why Do We Need Weight Decay in Modern Deep Learning?
Dec 10, 2023 Going to NeurIPS’23 in New Orleans. Feel free to ping me if you want to chat!
Nov 14, 2023 A talk at the Deep Learning and Optimization Seminar (organized by faculty from Westlake University, City University of Hong Kong, and Peking University) about our recent work Why Do We Need Weight Decay in Modern Deep Learning?
Nov 9, 2023 A talk at the University of Tübingen about our recent work Why Do We Need Weight Decay in Modern Deep Learning?
Oct 30, 2023 A talk at the Efficient ML Reading Group (organized by TU Graz) about our recent work Why Do We Need Weight Decay in Modern Deep Learning?
Oct 23, 2023 Excited to have participated in red teaming of OpenAI models as an external expert! I hope my findings will help improve the safety of their models/services.
Oct 9, 2023 Our new paper Why Do We Need Weight Decay in Modern Deep Learning? is available online. Also check out our new preprint on layer-wise linear mode connectivity.
Sep 21, 2023 Both Sharpness-Aware Minimization Leads to Low-Rank Features and Transferable Adversarial Robustness for Categorical Data via Universal Robust Embeddings got accepted to NeurIPS 2023! See y’all in New Orleans! 🎶🎷
Aug 23, 2023 A talk at the ELLIS Mathematics of Deep Learning reading group about our ICML 2023 paper SGD with Large Step Sizes Learns Sparse Features. Slides: pdf, pptx.
Jul 23, 2023 Going to ICML 2023 in Hawaii to present SGD with Large Step Sizes Learns Sparse Features and A Modern Look at the Relationship Between Sharpness and Generalization in the main track and Sharpness-Aware Minimization Leads to Low-Rank Features at a workshop. Feel free to ping me if you want to chat!
Jul 21, 2023 A talk at Tatsu’s lab group meeting at Stanford about our ICML 2023 paper A modern look at the relationship between sharpness and generalization. Slides: pdf, pptx.
Jun 5, 2023 A talk at the Efficient ML Reading Group (organized by TU Graz) about our ICML 2023 paper A modern look at the relationship between sharpness and generalization. Slides: pdf, pptx.
May 30, 2023 A talk at a mini-symposium of the 93rd Annual Meeting of the International Association of Applied Mathematics and Mechanics about our ICML 2022 and ICML 2023 papers on robustness/flatness in the parameter space.
May 26, 2023 Our new paper Sharpness-Aware Minimization Leads to Low-Rank Features is available online! We investigate the low-rank effect of SAM which occurs in a variety of settings (regression, classification, contrastive learning) and architectures (MLPs, CNNs, Transformers).
May 5, 2023 A talk at the Amazon Research Reading Group about our ICML 2023 paper A modern look at the relationship between sharpness and generalization. Slides: pdf, pptx.
Apr 25, 2023 Both SGD with large step sizes learns sparse features and A modern look at the relationship between sharpness and generalization got accepted to ICML 2023! See you in Hawaii! 🌴
Apr 12, 2023 A talk at the Deep Learning and Optimization Seminar (organized by faculty from Westlake University, City University of Hong Kong, and Peking University) about our paper SGD with large step sizes learns sparse features. Slides: pdf, pptx.
Mar 13, 2023 A talk at the OOD Robustness + Generalization Reading Group at CMU about our paper A modern look at the relationship between sharpness and generalization. Slides: pdf, pptx.
Feb 15, 2023 Our new paper A modern look at the relationship between sharpness and generalization is available online! Do flatter minima generalize better? Well, not really.
Dec 9, 2022 A talk at the University of Luxembourg about our work with Adobe: ARIA: Adversarially Robust Image Attribution for Content Provenance.
Dec 1, 2022 A talk in the ML and Simulation Science Lab of the University of Stuttgart about RobustBench and SGD with large step sizes learns sparse features.
Nov 28, 2022 Going to NeurIPS’22 in New Orleans. Feel free to ping me if you want to chat!
Oct 28, 2022 A talk at the ELLIS Mathematics of Deep Learning reading group about our ICML’22 paper Towards Understanding Sharpness-Aware Minimization. Slides: pdf, pptx.
Oct 12, 2022 Our paper SGD with large step sizes learns sparse features is available online! TL;DR: loss stabilization achieved via SGD with large step sizes leads to hidden dynamics that promote sparse feature learning. Also see this Twitter thread for a quick summary of the main ideas.
Oct 7, 2022 Recognized as one of the top reviewers at NeurIPS’22. Yay! 🎉
Sep 7, 2022 A talk at the Machine Learning Security Seminar hosted by the University of Cagliari about our paper ARIA: Adversarially Robust Image Attribution for Content Provenance (available on YouTube).
Sep 1, 2022 Truly excited to be selected for the Google PhD Fellowship and the Open Phil AI Fellowship!
Jun 13, 2022 Our paper Towards Understanding Sharpness-Aware Minimization got accepted to ICML’22!
Apr 1, 2022 Our paper ARIA: Adversarially Robust Image Attribution for Content Provenance was accepted to the CVPR’22 Workshop on Media Forensics. One of the (few?) applications where \(\ell_p\) adversarial robustness is well-motivated from the security point of view.
Mar 25, 2021 A talk at the NLP club of Grammarly about our paper On the Stability of Fine-tuning BERT: Misconceptions, Explanations, and Strong Baselines (available on YouTube).