Arindam Banerjee

Talk: Benefits of Structure and Randomness in Deep Learning

We consider the high-dimensional geometry associated with the first- and second-order structure, i.e., the gradients and Hessians, of deep learning (DL) losses. We begin with a review of ongoing work on the distinctive geometry arising from the spectral decay of such gradients and Hessians, which arguably reflects the substantial redundancy in modern DL models.
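
As a rough illustration of the kind of spectral structure discussed above, the following sketch (our illustration, not the speaker's code) estimates the leading Hessian eigenvalues of a small network's loss via Hessian-vector products and deflated power iteration; the model, data, and iteration counts are placeholder assumptions.

import torch
import torch.nn as nn

torch.manual_seed(0)
model = nn.Sequential(nn.Linear(20, 50), nn.Tanh(), nn.Linear(50, 1))
x, y = torch.randn(128, 20), torch.randn(128, 1)
loss = nn.MSELoss()(model(x), y)

params = [p for p in model.parameters() if p.requires_grad]
grads = torch.autograd.grad(loss, params, create_graph=True)
flat_grad = torch.cat([g.reshape(-1) for g in grads])
dim = flat_grad.numel()

def hvp(v):
    # Hessian-vector product: differentiate <grad(loss), v> w.r.t. the parameters.
    hv = torch.autograd.grad(torch.dot(flat_grad, v), params, retain_graph=True)
    return torch.cat([h.reshape(-1) for h in hv]).detach()

def top_eigenvalues(k=5, num_iters=100):
    # Deflated power iteration: estimates the k largest-magnitude Hessian eigenvalues,
    # whose rapid decay is the kind of spectral structure referred to in the abstract.
    eigvals, eigvecs = [], []
    for _ in range(k):
        v = torch.randn(dim)
        v /= v.norm()
        for _ in range(num_iters):
            w = hvp(v)
            # Deflation: remove components along previously found eigen-directions.
            for lam, u in zip(eigvals, eigvecs):
                w = w - lam * torch.dot(u, v) * u
            v = w / (w.norm() + 1e-12)
        eigvals.append(torch.dot(v, hvp(v)).item())
        eigvecs.append(v)
    return eigvals

print("leading Hessian eigenvalue estimates:", top_eigenvalues())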

We then discuss two lines of work that exploit such structure together with randomness, in particular random projections or sketching. First, we revisit the challenges of differentially private (DP) DL and show that DP-DL with spectral or even random gradient projections holds considerable promise. These approaches extend to the private federated learning setting, where they achieve state-of-the-art performance in federated DP-DL. Second, we revisit the challenges of distributed and federated DL, especially the communication cost of high-dimensional gradients. We show that random gradient projection can be provably effective for communication-efficient DL, with convergence governed by the intrinsic dimension of the loss Hessians, which is substantially smaller than the high ambient dimension. We conclude with an outlook on future work.
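
As a toy illustration of this second line of work (our own sketch under stated assumptions, not the algorithm from the talk), the snippet below has each client send only a k-dimensional random projection of its d-dimensional gradient; the server averages the sketches and back-projects. The dimensions, learning rate, toy quadratic client losses, and the optional noise scale sigma (meant to suggest the DP variant) are all illustrative assumptions.

import torch

d, k, num_clients = 1_000, 100, 8         # ambient dim, sketch dim, number of clients
torch.manual_seed(0)
targets = [torch.randn(d) for _ in range(num_clients)]   # toy per-client optima
w = torch.zeros(d)                        # global model parameters
lr, sigma = 0.1, 0.0                      # set sigma > 0 for DP-style noise on the sketch

def client_gradient(w, target):
    # Gradient of the toy quadratic client loss 0.5 * ||w - target||^2.
    return w - target

for step in range(200):
    # Fresh shared projection each round; clients and server derive it from a shared seed,
    # so only the k sketch coordinates are communicated, not the d-dimensional gradient.
    gen = torch.Generator().manual_seed(step)
    S = torch.randn(k, d, generator=gen) / k ** 0.5   # E[S^T S] = I with this scaling
    sketches = []
    for t in targets:
        g = client_gradient(w, t)
        sk = S @ g + sigma * torch.randn(k)           # k numbers sent per client
        sketches.append(sk)
    avg_sketch = torch.stack(sketches).mean(dim=0)
    g_hat = S.t() @ avg_sketch            # unbiased back-projection of the mean gradient
    w = w - lr * g_hat

print("distance to the average client optimum:",
      (w - torch.stack(targets).mean(0)).norm().item())

The point of the toy example is only that each round communicates k ≪ d numbers per client while a shared seed lets the server reproduce the projection; the talk concerns when such projections retain enough of the gradient and Hessian structure for convergence at a rate governed by the intrinsic dimension.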

Return to the TAU-UIUC Workshop Home Page