International Journal of Advanced Computer Science and Applications (IJACSA), Volume 15, Issue 4, 2024.
Abstract: Word Sense Disambiguation (WSD) is an intermediate task that improves text understanding in Natural Language Processing (NLP) applications such as machine translation, information retrieval, and text summarization. It makes these applications more effective and efficient by selecting the appropriate sense of a polysemous word in a given context. WSD is regarded as an AI-complete problem and has remained difficult since the 1950s. One of the earliest proposed solutions to polysemy in NLP is the Lesk algorithm, which researchers have adapted for different languages over the years. This study proposes a simplified, Lesk-based algorithm to resolve polysemy for Setswana. Instead of the combinatorial comparisons among candidate senses on which Lesk is based, and which make it computationally expensive, this study models word sense glosses using Bidirectional Encoder Representations from Transformers (BERT) and the cosine similarity measure, which have been shown to perform well in WSD. The proposed algorithm was evaluated on Setswana and obtained an accuracy of 86.66 and an error rate of 14.34, surpassing the accuracy of Lesk-based algorithms for other languages.
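The gloss-scoring idea described in the abstract can be sketched roughly as follows: embed the context and each candidate gloss, score each gloss by cosine similarity to the context, and keep the best-scoring sense. This is a minimal illustration under stated assumptions, not the authors' implementation. The names `toy_embed`, `cosine_similarity` and `disambiguate` are hypothetical, `toy_embed` is a bag-of-words stand-in for a BERT encoder so that the example runs without any model download, and the English glosses for "bank" are made up for the example rather than taken from the Setswana data.

```python
import math
from collections import Counter
from typing import Callable, Dict

Vector = Dict[str, float]


def toy_embed(text: str) -> Vector:
    """Hypothetical stand-in embedder: lower-cased bag-of-words counts.

    A real system would return a BERT sentence vector here instead.
    """
    return dict(Counter(text.lower().split()))


def cosine_similarity(a: Vector, b: Vector) -> float:
    """Cosine of the angle between two sparse vectors (0.0 if either is empty)."""
    dot = sum(value * b.get(key, 0.0) for key, value in a.items())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def disambiguate(
    context: str,
    glosses: Dict[str, str],
    embed: Callable[[str], Vector] = toy_embed,
) -> str:
    """Return the sense whose gloss is most similar to the context.

    Each gloss is compared with the context once, so the cost grows
    linearly with the number of candidate senses.
    """
    context_vec = embed(context)
    scores = {
        sense: cosine_similarity(context_vec, embed(gloss))
        for sense, gloss in glosses.items()
    }
    return max(scores, key=scores.get)


# Illustrative glosses (assumed, not from the paper's sense inventory).
glosses = {
    "bank_finance": "an institution that accepts deposits and lends money",
    "bank_river": "the sloping land beside a river or stream",
}

print(disambiguate("she sat on the bank of the river and watched the water", glosses))
# → bank_river
```

Swapping `toy_embed` for a function that returns dense BERT vectors would require changing `Vector` and `cosine_similarity` to operate on arrays, but the selection step stays the same.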
Tebatso Gorgina Moape, Oludayo O. Olugbara and Sunday O. Ojo, “Integrating Lesk Algorithm with Cosine Semantic Similarity to Resolve Polysemy for Setswana Language,” International Journal of Advanced Computer Science and Applications (IJACSA), 15(4), 2024. http://dx.doi.org/10.14569/IJACSA.2024.0150479
@article{Moape2024,
title = {Integrating Lesk Algorithm with Cosine Semantic Similarity to Resolve Polysemy for Setswana Language},
journal = {International Journal of Advanced Computer Science and Applications},
doi = {10.14569/IJACSA.2024.0150479},
url = {http://dx.doi.org/10.14569/IJACSA.2024.0150479},
year = {2024},
publisher = {The Science and Information Organization},
volume = {15},
number = {4},
author = {Tebatso Gorgina Moape and Oludayo O. Olugbara and Sunday O. Ojo}
}
Copyright Statement: This is an open access article licensed under a Creative Commons Attribution 4.0 International License, which permits unrestricted use, distribution, and reproduction in any medium, even commercially, as long as the original work is properly cited.