Research Associate @ UBC

AbdelRahim A. Elmadany, Ph.D.

Research Scientist in the Deep Learning & NLP Group, specializing in Large Language Models (LLMs), data mining, and low-resource NLP for African and Arabic languages.

30+ Top-Tier Papers (ACL, EMNLP)
10+ Large-Scale Models (BERT, T5)
Top 10 Global Projects (UNESCO/IRCAI)
Ph.D. in Computer Science

Research Focus

My research lies at the intersection of NLP, Data Mining, and Scalable Data Management, aiming to democratize AI for underrepresented communities.

Large Language Models

Developing novel techniques to mine and manage massive-scale datasets for training LLMs. Creator of models like ARBERT, MARBERT, and SERENGETI.
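As a quick illustration of how these models are typically consumed downstream, the sketch below loads MARBERT for masked-token prediction through Hugging Face transformers. The "UBC-NLP/MARBERT" identifier follows the group's usual Hugging Face namespace and is an assumption, not something stated on this page.

```python
# Minimal sketch: masked-token prediction with MARBERT via transformers.
# The model ID "UBC-NLP/MARBERT" is assumed from the group's HF namespace.
from transformers import AutoTokenizer, AutoModelForMaskedLM
import torch

model_id = "UBC-NLP/MARBERT"  # assumed Hugging Face model identifier
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForMaskedLM.from_pretrained(model_id)

# Encode an Arabic sentence containing one mask token and predict the fill.
text = f"اللغة العربية {tokenizer.mask_token} جميلة"
inputs = tokenizer(text, return_tensors="pt")
with torch.no_grad():
    logits = model(**inputs).logits

# Locate the mask position and decode the highest-scoring replacement.
mask_pos = (inputs["input_ids"] == tokenizer.mask_token_id).nonzero()[0, 1]
top_token = logits[0, mask_pos].argmax().item()
print(tokenizer.decode([top_token]))
```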

Low-Resource NLP

Pioneering benchmarks for African, Arabic, and Indigenous languages. Leading the "Voice of a Continent" initiative to map speech technology frontiers.

Scalable Data Mining

Analyzing billions of data points using PySpark and custom pipelines. Automating data cleaning and validation for high-stakes domains.
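The sketch below shows the general shape of such a PySpark cleaning-and-validation pass: filter malformed rows, deduplicate, and keep rejected rows inspectable. The column names, language codes, and storage paths are illustrative assumptions, not the actual pipeline.

```python
# Illustrative PySpark pass: validate, deduplicate, and audit a text corpus.
# Columns ("text", "lang"), paths, and rules are hypothetical placeholders.
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("corpus-cleaning-sketch").getOrCreate()

raw = spark.read.parquet("s3://bucket/raw_corpus/")  # hypothetical input path

# Basic validity predicate: non-null text of a sane length, known language tag.
is_valid = (
    F.col("text").isNotNull()
    & (F.length("text") > 10)
    & F.col("lang").isin("ar", "am", "ha", "sw")  # example language codes
)

cleaned = (
    raw.filter(is_valid)
    .withColumn("text", F.trim(F.col("text")))
    .dropDuplicates(["text"])  # exact-match deduplication
)

# Persist both outputs so rejected rows remain available for auditing.
cleaned.write.mode("overwrite").parquet("s3://bucket/clean_corpus/")
raw.filter(~is_valid).write.mode("overwrite").parquet("s3://bucket/rejects/")
```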

Selected Publications (2025)

Recent work accepted at EMNLP and ACL

EMNLP 2025

Voice of a Continent: Mapping Africa's Speech Technology Frontier

A. Elmadany, S. Kwon, H. Toyin, A. Inciarte, H. Aldarmaki, M. Abdul-Mageed

ACL 2025

Palm: A Culturally Inclusive and Linguistically Diverse Dataset for Arabic LLMs

Best Resource Paper

F. Alwajih, S. Magdy, A. Mekki... A. Elmadany... M. Abdul-Mageed

ACL 2025

Where Are We? Evaluating LLM Performance on African Languages

I. Adebara, H. Toyin... A. Elmadany, M. Abdul-Mageed

Awards & Recognition

Best Resource Paper Award

ACL 2025 - Vienna, Austria

For the "Palm" dataset: A Culturally Inclusive and Linguistically Diverse Dataset for Arabic LLMs.

Top 10 Outstanding Projects

IRCAI-UNESCO, May 2023

The Afrocentric-NLP project was ranked among the Top 10 outstanding projects worldwide.

Best Paper Award

OSACT5, June 2022

For "TURJUMAN: A Public Toolkit for Neural Arabic Machine Translation."

Research Grants

Google & AMD

Secured a Google Education Research Grant (2025), along with TPU and GPU compute grants from Google and AMD, for large-scale model training.