
This research paper introduces D3M (Data Debiasing with Datamodels), a method that aims to improve the fairness and accuracy of machine learning models. Models can perform poorly on certain groups, especially groups that are underrepresented in the training data: a model trained to predict age, for instance, might be less accurate for older women if the training data mostly contains images of younger women and older men. D3M addresses this by identifying and removing the specific training examples that drive the model's bias against underperforming groups. The researchers found that D3M improves accuracy on those groups while removing only a small number of training examples. They also developed a variant, AUTO-D3M, that works even when group labels are unavailable. Across several datasets, both methods performed well compared to other approaches for improving model fairness.
https://arxiv.org/pdf/2406.16846
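For intuition, here is a minimal sketch of the rank-and-drop step in the spirit of D3M, not the paper's actual implementation. It assumes we already have per-example training-loss gradients and a mean validation-loss gradient for the worst-performing group; the function name, parameters, and the first-order gradient-alignment score are illustrative stand-ins (the paper estimates each example's effect with datamodels rather than a single gradient dot product).

```python
import numpy as np

def rank_and_drop(train_grads, worst_group_grad, drop_frac=0.02):
    """Score each training example by how its gradient aligns with the
    worst group's validation-loss gradient, and return the indices of
    the examples to drop before retraining.

    train_grads:      (n_train, d) per-example training-loss gradients
    worst_group_grad: (d,) mean validation-loss gradient over the
                      worst-performing group
    drop_frac:        fraction of the training set to remove
    """
    # First-order view: a gradient step on example i changes the worst
    # group's loss by roughly -lr * (grad_i . worst_group_grad), so a
    # *negative* dot product suggests training on i tends to increase
    # that group's loss.
    scores = train_grads @ worst_group_grad

    # Drop the most harmful examples (most negative alignment).
    k = max(1, int(drop_frac * len(scores)))
    return np.argsort(scores)[:k]

# Toy usage with random arrays standing in for real gradients.
rng = np.random.default_rng(0)
train_grads = rng.normal(size=(1000, 64))
worst_group_grad = rng.normal(size=64)
to_drop = rank_and_drop(train_grads, worst_group_grad, drop_frac=0.02)
print(f"dropping {len(to_drop)} of {len(train_grads)} training examples")
```

The takeaway matches the paper's finding: only a small fraction of the training set needs to be removed before retraining to lift accuracy on the worst-performing group.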