Abstract
Talk Description: The rapid rise of large language models (LLMs) in recent years has transformed language data into a highly sought-after commodity for training these artificial intelligence technologies. Given the relative lack of available Arabic-language datasets, scores of companies have turned to creating, curating, and categorizing language data while also working to train linguistically and culturally localized AI models. The linguistic and technical labor animating Arabic LLMs, however, takes place in friction with broader histories and transnational processes: colonial framings of Arabic as antiquated and inimical to progress, the privileging of English within the knowledge economy, Euro-American assumptions about language embedded into AI design, and inequitable resource distribution within the global tech industry. The process of making Arabic LLMs, in short, is thoroughly shaped by prevailing language ideologies, political-economic relations, and social milieus.
This talk highlights the experiences and perspectives of Arabic-language model makers as they navigate this complex linguistic, ideological, and political-economic landscape. Drawing on 15 months of ethnographic fieldwork in Amman, Jordan, a regional hub for information technology outsourcing, I trace how model makers conceptualize language in relation to their work with AI technologies. While some reproduce tropes about Arabic’s “backwardness,” another camp points to the structural factors stifling the region’s AI industry; other makers, meanwhile, argue that LLMs are now “language-agnostic,” given that they are trained on digits and vectors rather than letters and words. As they see it, we have already entered an era of technological progress unencumbered by language, Arabic or otherwise. By analyzing such stances, this talk shows how ideologies about language and communication shape how emerging technologies like LLMs are designed and imagined. The experiences of model makers, moreover, illuminate how social theories about language are crafted and reconfigured in relation to novel technologies and the labor undergirding them.

About the speaker
Tariq Adely is a PhD candidate in the Department of Anthropology at George Washington University. His research focuses on the intersection of language, technology, and labor in the Arabic-speaking world. His dissertation project examines the everyday work and decision-making rubrics involved in the production of AI language models for Arabic, as well as the social, political, and ethical dimensions of this labor. Before beginning his graduate study, Tariq worked as a reporter and translator in Amman, Jordan.