Talk_large_ss_sgd_westlake
A talk at the Deep Learning and Optimization Seminar (organized by faculty from Westlake University, City University of Hong Kong, and Peking University) about our paper "SGD with large step sizes learns sparse features". Slides: pdf, pptx.