We study problems at the intersection of data management and machine learning to build trustworthy and responsible decision-making systems. Our aim is to develop tools and techniques that make data-driven decision-making explainable, fair, and accountable. We are particularly interested in:
- Explaining and debugging fairness violations in machine learning models and data science pipelines:
  - How can we determine sources of unexpected errors and bias in machine learning model outcomes?
  - How can we decompose unexpected or discriminatory behavior of data science pipelines in terms of the different pipeline stages?
  - Can we effectively generate post hoc explanations for the outcomes of machine learning models?
- Data integration and data quality:
  - How can we leverage expert feedback to improve data cleaning techniques for machine learning?
  - Can we use the final outcomes in data science pipelines to inform intermediate pipeline choices?
  - How can we intertwine pipeline stages with downstream analytics to improve upon the end goals?
We are always looking for motivated Ph.D. students to collaborate with. If you are interested in data management and/or responsible data analytics, feel free to contact us with your CV/resume and a couple of sentences describing your research interests, and consider applying to Purdue CIT!
|Nov 20, 2023|Tanmay defends his M.S. thesis. Congrats, Tanmay!|
|Nov 16, 2023|Dr. Pradhan gave an invited talk at the Brandeis University Computer Science seminar.|
|Aug 11, 2023|The group welcomes Ambarish and Jahid as our newest Ph.D. students.|
|Jun 29, 2023|Excited to receive an NSF CAREER Award.|
|Mar 23, 2023|Dr. Pradhan gave a talk on fairness debugging using Gopher at MIT CSAIL’s Causality reading group.|
- Explainable AI: Foundations, Applications, Opportunities for Data Management Research. In Proceedings of the 2022 International Conference on Management of Data, 2022
- Interpretable Data-Based Explanations for Fairness Debugging. In Proceedings of the 2022 International Conference on Management of Data, 2022
- Explaining Black-Box Algorithms using Probabilistic Contrastive Counterfactuals. In Proceedings of the 2021 International Conference on Management of Data, 2021
- Staging User Feedback toward Rapid Conflict Resolution in Data Fusion. In Proceedings of the 2017 ACM International Conference on Management of Data, 2017