GenAI systems use Large Language Models (LLMs) trained on extensive datasets consisting primarily of human-generated content. Consequently, these models inherently incorporate human and societal biases into the applications and outputs they produce. Dr. Punya Mishra of Arizona State University has highlighted the concern that most contemporary datasets are WEIRD, that is, they "disproportionately represent Western, Educated, Industrialized, Rich, and Democratic societies" (Mishra, 2023; Schulz et al., 2018). This imbalance can distort perspectives, perpetuate narrow worldviews, exacerbate biases that marginalize minority and underrepresented communities, and cause harm to individuals.
According to Artificial Intelligence and the Future of Teaching and Learning: Insights and Recommendations (U.S. Department of Education, Office of Educational Technology, 2023), "Datasets are used to develop AI, and when they are non-representative or contain undesired associations or patterns, resulting AI models may act unfairly in how they detect patterns or automate decisions. Systemic, unwanted unfairness in how a computer detects patterns or automates decisions is called 'algorithmic bias.' Algorithmic bias could diminish equity at scale with unintended discrimination."
Student tracking based on data containing inherent bias can lead to unfair treatment that perpetuates inequities.
Using biased GenAI output without critical evaluation can reinforce narrow worldviews and exacerbate societal biases.
AI-generated teaching materials may reflect and reproduce these biases.
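To make the report's definition of algorithmic bias concrete, the toy simulation below is a minimal, hypothetical sketch (Python with NumPy, using invented data rather than anything from the report or real student records). It shows how a single decision rule fit to a non-representative dataset can produce much higher error rates for an underrepresented group, the kind of disparity that could surface in automated tracking or placement decisions.

```python
# Illustrative sketch only: hypothetical, simulated data showing one simple face of
# "algorithmic bias" -- a rule fit to a non-representative dataset serves the
# majority group well and the underrepresented group poorly.
import numpy as np

rng = np.random.default_rng(0)

def make_group(n, shift):
    """Simulate one group's readiness score; the true pass threshold differs by `shift`."""
    score = rng.normal(50 + shift, 10, n)          # observed feature
    passed = (score > 50 + shift).astype(int)      # ground-truth outcome for this group
    return score, passed

# Group A dominates the training data (90%); group B is underrepresented (10%).
score_a, y_a = make_group(900, shift=0)
score_b, y_b = make_group(100, shift=8)
train_scores = np.concatenate([score_a, score_b])
train_labels = np.concatenate([y_a, y_b])

# "Training": pick the single cutoff that minimizes error on the pooled data.
cutoffs = np.linspace(30, 80, 501)
errors = [np.mean((train_scores > c).astype(int) != train_labels) for c in cutoffs]
learned_cutoff = cutoffs[int(np.argmin(errors))]

# Evaluate the learned rule separately on fresh samples of each group.
for name, shift in [("Group A (majority)", 0), ("Group B (minority)", 8)]:
    s, y = make_group(5000, shift)
    err = np.mean((s > learned_cutoff).astype(int) != y)
    print(f"{name}: error rate {err:.1%}")
# The cutoff fits group A closely but misclassifies far more of group B,
# even though neither group is inherently harder to predict.
```

Because group B contributes only a tenth of the training data, the learned cutoff tracks group A almost perfectly while misclassifying a large share of group B; neither group is harder to predict, the training set is simply non-representative.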