Scalable Cost-Efficient Techniques for Machine Learning via Sketching

The ever-increasing size and complexity of modern datasets pose a significant challenge for machine learning researchers and practitioners, as the computational resources and time required to process and analyze these datasets are immense. Randomized Numerical Linear Algebra (RandNLA) techniques offer a promising solution, enabling a quantifiable trade-off between accuracy and computation time. Sketching and sampling methods, central components of modern algorithm design, compress high-dimensional datasets into lower-dimensional representations while preserving properties of interest. This dissertation leverages RandNLA techniques, with a primary focus on sketching methods, to address the challenges posed by large-scale, high-dimensional datasets in common machine learning models. Specifically, this thesis investigates the following areas: (1) designing sparse graph-based sketches for fast numerical linear algebra computations; (2) developing convergent and efficient tensor decomposition algorithms through adaptive sketching; (3) obtaining low-rank approximations for matrix completion using sketching techniques, as exemplified by the novel two-modality sampling NoisyCUR algorithm; and (4) proposing novel algorithms with reduced label complexity for kernel ridge regression models.
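As a minimal illustration of the sketch-and-solve paradigm the abstract describes (not code from the dissertation itself), the snippet below compresses an overdetermined least-squares problem with a sparse CountSketch-style operator before solving it. All dimensions and parameters here are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Overdetermined least-squares problem: min_x ||A x - b||_2
n, d = 10_000, 50
A = rng.standard_normal((n, d))
x_true = rng.standard_normal(d)
b = A @ x_true + 0.01 * rng.standard_normal(n)

# Sparse CountSketch-style operator: each of the n rows of A is hashed
# to one of m sketch rows and multiplied by a random sign, so forming
# S @ A costs O(nnz(A)) rather than dense matrix-multiply time.
m = 500  # sketch size; larger m trades computation for accuracy
rows = rng.integers(0, m, size=n)       # hash bucket for each input row
signs = rng.choice([-1.0, 1.0], size=n)  # random sign for each input row

SA = np.zeros((m, d))
Sb = np.zeros(m)
np.add.at(SA, rows, signs[:, None] * A)  # accumulate signed rows of A
np.add.at(Sb, rows, signs * b)           # accumulate signed entries of b

# Solve the much smaller sketched problem: min_x ||S A x - S b||_2
x_sketch, *_ = np.linalg.lstsq(SA, Sb, rcond=None)
x_exact, *_ = np.linalg.lstsq(A, b, rcond=None)

print("relative error vs. exact solve:",
      np.linalg.norm(x_sketch - x_exact) / np.linalg.norm(x_exact))
```

The m x n sketch never needs to be materialized as a dense matrix: one hashed row index and one sign per input row suffice, which is what makes sparse sketches attractive at scale.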
Date
Location: Lally 102
Speaker: Dong Hu
Advisor: Prof. Alex Gittens