Distributed Analytics for Big Data; Streamlined Data Curation; Use of Provenance

Boris Glavic
Assistant Professor of Computer Science

Several projects of the IIT DBGroup, which Glavic leads, address fundamental challenges that data scientists face today. The HRDBMS system [1] provides efficient distributed analytics that scale easily to Big Data dimensions. Vizier [2] streamlines the data curation process, making it easier and faster to explore and analyze raw data. The GProM system [3] uses provenance to help data scientists understand how data was derived by complex data processing pipelines.