1. Surveying the basics of bias and fairness in machine learning. The students will learn the basics from two review articles: “A Survey on Bias and Fairness in Machine Learning” by Ninareh Mehrabi, Fred Morstatter, Nripsuta Saxena, Kristina Lerman, and Aram Galstyan, and “An Introduction to Algorithmic Fairness” (arXiv:2105.05595v1 [cs.CY]) by Hilde J.P. Weerts.
2. Identifying fairness libraries that can be used in industry. We will use three libraries created by large technology companies, which makes them more likely to be trusted in industry.
• Fairlearn (By Microsoft)
• AIF360 (By IBM)
• What-If Tool (By Google)
3. Selecting published structured and unstructured datasets. The main goal of the project is to identify bias in a structured (tabular) dataset. If possible, we will extend the bias analysis to unstructured data such as text and images.
• Tabular Dataset: TitanicSexism (fairness in ML), https://www.kaggle.com/code/garethjns/titanicsexism-fairness-in-ml/input
• Text Dataset: Fake and real news dataset, https://www.kaggle.com/datasets/clmentbisaillon/fake-and-real-news-dataset
• Image Dataset: UTKFace, https://www.kaggle.com/datasets/jangedoo/utkface-new
4. Choosing appropriate fairness metrics to identify bias. Below are examples of the metrics that will be used with each library.
• Fairlearn: Demographic parity, Equalized odds, Equal opportunity
• AIF360 (metric classes): Dataset Metric, Binary Label Dataset Metric, Classification Metric, Sample Distortion Metric, MDSS Classification Metric
• What-If Tool: still under study
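To make the three Fairlearn-style group metrics listed above concrete, here is a minimal sketch that computes them from their definitions in plain Python. It does not call Fairlearn itself; the function names echo Fairlearn's naming but are defined locally, the max-minus-min gap and the "worst of TPR gap and FPR gap" rule for equalized odds are one common convention and are assumptions here, and the toy predictions and group labels are invented for illustration.

```python
from collections import defaultdict


def selection_rates(y_pred, groups):
    """Fraction of positive predictions within each group."""
    totals, positives = defaultdict(int), defaultdict(int)
    for pred, group in zip(y_pred, groups):
        totals[group] += 1
        positives[group] += pred
    return {g: positives[g] / totals[g] for g in totals}


def conditional_rates(y_true, y_pred, groups, label):
    """Per-group rate of positive predictions among samples whose true
    label equals `label` (label=1 gives TPR, label=0 gives FPR)."""
    totals, positives = defaultdict(int), defaultdict(int)
    for truth, pred, group in zip(y_true, y_pred, groups):
        if truth == label:
            totals[group] += 1
            positives[group] += pred
    return {g: positives[g] / totals[g] for g in totals}


def gap(rates):
    """Largest between-group difference for a per-group rate."""
    return max(rates.values()) - min(rates.values())


def demographic_parity_difference(y_pred, groups):
    # Compares selection rates only; the true labels are not used.
    return gap(selection_rates(y_pred, groups))


def equal_opportunity_difference(y_true, y_pred, groups):
    # Compares true positive rates across groups.
    return gap(conditional_rates(y_true, y_pred, groups, label=1))


def equalized_odds_difference(y_true, y_pred, groups):
    # Assumed convention: the worse of the TPR gap and the FPR gap.
    tpr_gap = gap(conditional_rates(y_true, y_pred, groups, label=1))
    fpr_gap = gap(conditional_rates(y_true, y_pred, groups, label=0))
    return max(tpr_gap, fpr_gap)


# Invented toy data: four samples per group.
groups = ["female"] * 4 + ["male"] * 4
y_true = [1, 1, 0, 0, 1, 1, 0, 0]
y_pred = [1, 1, 1, 0, 1, 0, 0, 0]

dp_diff = demographic_parity_difference(y_pred, groups)
eopp_diff = equal_opportunity_difference(y_true, y_pred, groups)
eodds_diff = equalized_odds_difference(y_true, y_pred, groups)
print(dp_diff, eopp_diff, eodds_diff)
```

A value of 0 for any of these gaps means the groups are treated identically under that criterion; larger values indicate a larger disparity. In the project itself the library implementations would be used, and this sketch only serves to check that the definitions are understood.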
5. Discussing the results and summarizing the comparison among the libraries. In the discussion, we will classify the fairness metrics along two axes: group versus individual metrics, and metrics that measure fairness in the dataset versus metrics that measure fairness in the model's performance.
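The dataset-versus-model distinction in step 5 can be illustrated with a short sketch: the same group metric is computed once on the ground-truth labels (a dataset-level measurement) and once on a classifier's predictions (a model-level measurement). The helper names, the "unprivileged minus privileged" sign convention, and the toy values below are assumptions made for illustration, not results from any of the datasets listed above.

```python
def base_rate(values, groups, group):
    """Fraction of positive outcomes (1s) among members of `group`."""
    selected = [v for v, g in zip(values, groups) if g == group]
    return sum(selected) / len(selected)


def statistical_parity_difference(values, groups, unprivileged, privileged):
    # Assumed convention: unprivileged rate minus privileged rate,
    # so a negative value means the unprivileged group fares worse.
    return (base_rate(values, groups, unprivileged)
            - base_rate(values, groups, privileged))


def disparate_impact(values, groups, unprivileged, privileged):
    # Ratio of positive-outcome rates; 1.0 means parity.
    return (base_rate(values, groups, unprivileged)
            / base_rate(values, groups, privileged))


# Invented toy data: group "a" is treated as unprivileged.
groups = ["a"] * 5 + ["b"] * 5
labels = [1, 1, 0, 0, 0, 1, 1, 1, 1, 0]  # ground truth in the dataset
preds = [1, 0, 0, 0, 0, 1, 1, 1, 1, 0]   # hypothetical model output

# Dataset-level: bias already present in the labels.
data_spd = statistical_parity_difference(labels, groups, "a", "b")
data_di = disparate_impact(labels, groups, "a", "b")

# Model-level: bias in what the classifier predicts.
model_spd = statistical_parity_difference(preds, groups, "a", "b")
model_di = disparate_impact(preds, groups, "a", "b")

print("dataset:", round(data_spd, 3), round(data_di, 3))
print("model:  ", round(model_spd, 3), round(model_di, 3))
```

In this toy case the model-level disparity is larger than the dataset-level one, which is the kind of amplification the comparison is meant to surface. Both metrics here are group metrics; individual fairness metrics would instead compare outcomes for similar individuals.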
CR-Penn State
CR-Penn State
No
Already behind3Start date is flexible
05/08/2024
06/14/2024