Project Description
Although only about 20 languages fall into the high-resource category, most advances in natural language processing (NLP) have been made in those languages. Thousands of low-resource languages, spoken by millions of people around the world, have been left out. This is not only a technological problem; it is also a question of equity. One cause of the gap is the lack of corpora and other linguistic resources for low-resource languages. This study seeks to help close the gap by creating a corpus of Igbo, an African language. We will apply machine learning and deep learning techniques from NLP to analyze the corpus. The project could lead to Igbo-language applications such as text categorization, information extraction, summarization, dialogue systems, and machine translation. We have already started building the Igbo_News corpus with Sketch Engine.
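
As an illustration of the kind of first-pass corpus analysis this could involve, the following is a minimal sketch in Python that tokenizes Igbo text and counts word frequencies. It is not the project's actual pipeline: the sample sentences, the function names, and the tokenization rule are assumptions made for this example, and a real run would read text exported from the Igbo_News corpus instead of the hard-coded list.

```python
import re
import unicodedata
from collections import Counter

# Hypothetical sample sentences, used only for illustration.
# A real analysis would load documents exported from the corpus.
SAMPLE_DOCS = [
    "Ndị mmadụ na-agụ akwụkwọ.",
    "Akwụkwọ akụkọ Igbo dị mma.",
]

# A letter is any Unicode letter, optionally followed by combining marks
# (U+0300 to U+036F), so tone-marked vowels stay inside one token.
LETTER = r"(?:[^\W\d_][\u0300-\u036f]*)"
# A token is a run of letters; hyphens and apostrophes are allowed inside
# a token so that forms such as "na-agụ" are kept together (an assumption).
TOKEN_RE = re.compile(rf"{LETTER}+(?:[-']{LETTER}+)*")


def tokenize(text):
    """Normalize to NFC, lowercase, and split the text into word tokens."""
    text = unicodedata.normalize("NFC", text).lower()
    return TOKEN_RE.findall(text)


def word_frequencies(docs):
    """Count how often each token occurs across all documents."""
    counts = Counter()
    for doc in docs:
        counts.update(tokenize(doc))
    return counts


freqs = word_frequencies(SAMPLE_DOCS)
print(freqs.most_common(3))
```

Normalizing to NFC before tokenizing matters for Igbo because dotted letters such as ị, ọ, and ụ can be encoded either as single code points or as a base letter plus a combining mark; without normalization the same word could be counted under two different spellings.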