Currently Available GenAI-Powered Large Language Models and Low-Resource Languages: Any Offerings? Wait Until You See
Abstract
A lot of hype has accompanied the increasing number of generative artificial intelligence-powered large language models (LLMs). Similarly, much has been written about what currently available LLMs can and cannot do, including their benefits and risks, especially in higher education. However, few use cases have investigated the performance and generative capabilities of LLMs in low-resource languages. With this in mind, one of the purposes of the current study was to explore the extent to which seven currently available, free-to-use versions of LLMs (ChatGPT, Claude, Copilot, Gemini, GroqChat, Perplexity, and YouChat) perform in five low-resource languages (isiZulu, Sesotho, Yoruba, Māori, and Mi'kmaq) in terms of their generative multilingual capabilities. Employing a common input prompt, in which the only change in each case was the insertion of the name of a given low-resource language and English, this study collected its datasets by inputting this common prompt into the seven LLMs. Three of the findings of this study are noteworthy. First, the seven LLMs displayed a significant lack of generative multilingual capabilities in the five low-resource languages. Second, they hallucinated and produced nonsensical, meaningless, and irrelevant responses in their low-resource language outputs. Third, their English responses were far better in quality, relevance, depth, detail, and nuance than both their low-resource-language-only responses and their English responses for the five low-resource languages. The paper ends by offering the implications and conclusions of the study in terms of LLMs' generative capabilities in low-resource languages.
https://doi.org/10.26803/ijlter.23.12.9
e-ISSN: 1694-2116
p-ISSN: 1694-2493