Responsible Data Science Lab at Purdue
We study problems at the intersection of data management and machine learning to build trustworthy and responsible decision-making systems. Our aim is to develop systems that make data-driven decision-making explainable, fair, and accountable. We are particularly interested in:
- Explaining and debugging fairness violations in machine learning models and data science pipelines:
  - How can we determine sources of unexpected errors and bias in machine learning model outcomes?
  - How can we decompose unexpected or discriminatory behavior of data science pipelines in terms of the different pipeline stages?
  - Can we effectively generate post hoc explanations for the outcomes of machine learning models?
- Data integration and data quality:
  - How can we leverage expert feedback to improve data cleaning techniques for machine learning?
  - Can we use the final outcomes in data science pipelines to inform intermediate pipeline choices?
  - How can we intertwine pipeline stages with downstream analytics to improve upon the end goals?
We are always looking for motivated Ph.D. students to collaborate with. If you are interested in data management and/or responsible data analytics, feel free to contact us with your CV/resume and a couple of sentences describing your research interests, and consider applying to Purdue CIT!
Sponsors

We are thankful for the generous funding award and gift from our sponsors: NSF, Google, and CASMI.
News
Date | News |
---|---|
Nov 13, 2024 | Kevin defends his M.S. thesis. Congrats, Kevin! |
Nov 12, 2024 | Tejendra defends his M.S. thesis. Congrats, Tejendra! |
Nov 11, 2024 | Shashank defends his M.S. thesis. Congrats, Shashank! |
Aug 12, 2024 | Welcoming Omkar and Ananya to the group! |
Jul 22, 2024 | Ekta’s paper on Valuation-based Data Acquisition for Machine Learning Fairness accepted to the 13th International Workshop on Quality in Databases (QDB) at the 50th VLDB conference. Congrats, Ekta! |