PI
Core/Dual
Tim Kraska
Room
32-G914
Professor Tim Kraska, an Associate Professor of Electrical Engineering and Computer Science in MIT CSAIL, aims to dramatically increase the efficiency of data-intensive systems and to democratize data science through machine learning. In CSAIL, he co-leads the Data Systems Group and is part of the Systems Community of Research (CoR). He is also the founding co-director of the Data Systems and AI Lab (DSAIL) at MIT and a co-founder of Einblick Analytics, Inc. Much of his work in systems has significantly impacted both academia and industry.
His research using machine learning in systems takes two main approaches:
- Systems for ML: Building systems to make the recent advances in machine learning more accessible
- ML for Systems: Leveraging machine learning to improve systems
The goal of Systems for ML is to make the complex analytics techniques of machine learning easier to use, and to enable a broader audience to do more with their data and make data-driven decisions. For example, the Northstar project explores new user interfaces and infrastructure that help experts and non-experts alike become citizen data scientists, enabling visual, interactive, and assisted data exploration and model building. More recently, he has become interested in natural language interfaces for analytic tasks.
In the second category, ML for Systems, Prof. Kraska and the Data Systems Group are investigating instance-optimized systems: systems that automatically adjust themselves to the data and the workload. For example, SageDB is a new type of system that aims to significantly increase the efficiency of data processing through machine learning and other optimization techniques.
Moore’s law is ending while data continues to grow at an unprecedented pace. Prof. Kraska is investigating new methods that accommodate this growth while still allowing the data to be analyzed efficiently. Building instance-optimized systems, applying machine learning to improve systems, and tailoring systems to specific workloads and data distributions are potential ways to address these issues.
Last updated Jul 19 '21