FairSYN-Edu: A Fairness-Aware, Privacy-Preserving Diffusion Model for Educational Data Synthesis

Kadir Kesgin

Abstract

The increasing demand for privacy-preserving, ethically aligned synthetic data generation in education has highlighted the limitations of existing tabular data generators. Traditional approaches often sacrifice fairness or privacy in pursuit of predictive accuracy, rendering them unsuitable for high-stakes academic settings. In this paper, we propose FairSYN-Edu, a novel diffusion-based synthetic data generation framework designed for educational data. By integrating adversarial debiasing and differentially private training into the generative process, FairSYN-Edu jointly optimizes utility, fairness, and privacy. We evaluate our approach on three real-world educational datasets spanning MOOC, K–12 tutoring, and LMS environments. Experimental results demonstrate that FairSYN-Edu achieves significantly lower demographic disparities, maintains competitive predictive performance (RMSE = 0.402), and provides moderate resistance to membership inference attacks (AUC = 0.705). Ablation studies, error gap analysis, and SHAP-based interpretability evaluations confirm the robustness and ethical soundness of our method. We release the full implementation, synthetic benchmark suite, and documentation to foster reproducibility and responsible AI practices in education.
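The three quantities reported in the abstract (predictive error, demographic disparity, and membership inference AUC) can be illustrated with a minimal sketch. It makes several assumptions: the abstract does not say which disparity measure the paper uses, so a demographic parity gap stands in for it here; the function names and toy inputs are hypothetical; and none of this is taken from the authors' released implementation. For the attack AUC, 0.5 corresponds to an attacker who does no better than chance, so values closer to 0.5 indicate stronger privacy.

```python
import math


def rmse(y_true, y_pred):
    """Root mean squared error between targets and predictions."""
    squared = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    return math.sqrt(squared / len(y_true))


def demographic_parity_gap(decisions, groups):
    """Largest difference in positive-decision rate between any two groups.

    Assumed stand-in for the paper's disparity measure; decisions are 0 or 1.
    """
    totals, positives = {}, {}
    for d, g in zip(decisions, groups):
        totals[g] = totals.get(g, 0) + 1
        positives[g] = positives.get(g, 0) + d
    rates = [positives[g] / totals[g] for g in totals]
    return max(rates) - min(rates)


def attack_auc(member_scores, nonmember_scores):
    """AUC of a membership inference attack in pairwise (Mann-Whitney) form.

    Fraction of (member, non-member) pairs in which the attacker scores the
    member higher; ties count as half.
    """
    wins = 0.0
    for m in member_scores:
        for n in nonmember_scores:
            if m > n:
                wins += 1.0
            elif m == n:
                wins += 0.5
    return wins / (len(member_scores) * len(nonmember_scores))


# Toy inputs, invented for illustration only.
print(rmse([1.0, 2.0, 3.0], [1.0, 2.0, 5.0]))
print(demographic_parity_gap([1, 0, 1, 1], ["a", "a", "b", "b"]))
print(attack_auc([0.9, 0.6], [0.6, 0.2]))
```

The pairwise AUC is quadratic in the number of scores, which is fine for a small illustration; a rank-based computation would be the usual choice on full datasets.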

Version published to 10.21203/rs.3.rs-6631139/v1 on Research Square
May 28, 2025

Related articles

Impact of Data Distillation on Fairness in Machine Learning Models

This article has 5 authors:
1. Kamil Sabbagh
2. Hadi Salloum
3. Rafik Hachana
4. Marko Pezer
5. Manuel Mazzara
This article has no evaluations.
Latest version: Jun 30, 2025

Fairness, Justice, and Social Inequality in Machine Learning

This article has 2 authors:
1. Ruben L. Bach
2. Christoph Kern
This article has no evaluations.
Latest version: May 28, 2025

From Fair Graphs to Fair Data: A DAG-Based Approach to Mitigating Bias in AI Systems

This article has 3 authors:
1. Vivian Wei Jiang
2. Gustavo Batista
3. Michael Bain
This article has no evaluations.
Latest version: Jul 3, 2025
