Responsible Data Science Lab at Purdue
We study problems at the intersection of data management and machine learning to build trustworthy and responsible decision-making systems. Our aim is to develop systems that make data-driven decision-making explainable, fair, and accountable. We are particularly interested in:
- Explaining and debugging fairness violations in machine learning models and data science pipelines:
  - How can we determine sources of unexpected errors and bias in machine learning model outcomes?
  - How can we decompose unexpected or discriminatory behavior of data science pipelines in terms of the different pipeline stages?
  - Can we effectively generate post hoc explanations for the outcomes of machine learning models?
- Data integration and data quality:
  - How can we leverage expert feedback to improve data cleaning techniques for machine learning?
  - Can we use the final outcomes in data science pipelines to inform intermediate pipeline choices?
  - How can we intertwine pipeline stages with downstream analytics to improve upon the end goals?
We are always looking for motivated Ph.D. students to collaborate with. If you are interested in data management and/or responsible data analytics, feel free to contact us with your CV/resume and a couple of sentences describing your research interests, and consider applying to Purdue CIT!
Sponsors
We are thankful for the generous funding award and gift from our sponsors: NSF, Google, and CASMI.
news
Sep 5, 2025 | Dr. Pradhan attended the VLDB conference in London, UK (and also presented Shashank’s paper). |
---|---|
Aug 25, 2025 | Welcoming Jingya to the group! Jingya has joined as a Ph.D. student and comes in with the prestigious Frederick N. Andrews fellowship, designed to recruit outstanding Ph.D.-track students to graduate programs at Purdue. Congratulations, Jingya! |
Jun 25, 2025 | Jahid received a Microsoft fellowship to attend the 2025 EDBT Summer School on AI and Data Management. Congrats Jahid! |
Jun 20, 2025 | Our paper on Label Flipping for Group Fairness got accepted to the 14th International Workshop on Quality in Databases (QDB) at the 51st VLDB conference. |
May 7, 2025 | Our paper on Explanations for Machine Learning Pipelines under Data Drift got accepted to the 2025 ACM International Conference on Management of Data (SIGMOD). |