Comparative Evaluation of Zero-Shot and Few-Shot Performance of Large Language Models in Low-Resource Language Machine Translation

Authors

  • Minhao Li, Master of Science in Computer Engineering, University of California, Davis, CA, USA
  • Xu Wang, Computer Science, Beijing University of Posts and Telecommunications, Beijing, China
  • Mingzhuo Yu, Computer Science, Northeastern University, MA, USA

DOI:

https://doi.org/10.66372/JGER.v3i2.5

Keywords:

large language models, low-resource machine translation, few-shot learning, zero-shot translation

Abstract

Large language models (LLMs) have demonstrated remarkable translation capabilities for high-resource languages, yet their effectiveness on low-resource languages under varying prompting conditions remains insufficiently understood. This study presents a comparative evaluation of four LLMs (GPT-4, GPT-3.5-Turbo, LLaMA-2-70B, and BLOOM-176B), alongside NLLB-200-3.3B as a supervised baseline, across ten translation directions spanning four resource levels. Using the FLORES-200 devtest set as the primary benchmark and NTREX-128 for cross-validation, we assess zero-shot, one-shot, five-shot, and eight-shot configurations with the BLEU, chrF++, and COMET-22 metrics. Our results reveal three principal findings. First, the few-shot advantage is most pronounced for low-resource languages: GPT-4 achieves an average BLEU gain of 5.3 points when moving from zero-shot to five-shot prompting on low-resource pairs. Second, one-shot prompting consistently degrades performance below the zero-shot baseline, with an average BLEU reduction of 1.4 points across low-resource directions. Third, the supervised NLLB-200 baseline outperforms every LLM in the zero-shot setting on eight of ten directions, while five-shot GPT-4 narrows this gap to within 1.0 BLEU on mid-resource pairs. These findings provide empirical guidance for practitioners selecting prompting strategies for LLM-based translation in resource-constrained settings.
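
To make the zero-shot, one-shot, five-shot, and eight-shot configurations concrete, the following is a minimal Python sketch of how a k-shot translation prompt might be assembled. The prompt template, the function names build_prompt and select_examples, the placeholder example pool, and the language names are all assumptions made for illustration; the abstract does not specify the template or the example-selection procedure the authors used.

```python
from typing import List, Sequence, Tuple

# Hypothetical pool of (source, target) demonstration pairs.
# Placeholder strings only; these are not sentences from FLORES-200.
EXAMPLE_POOL: List[Tuple[str, str]] = [
    (f"source sentence {i}", f"target sentence {i}") for i in range(1, 9)
]


def select_examples(pool: Sequence[Tuple[str, str]], k: int) -> List[Tuple[str, str]]:
    """Take the first k demonstration pairs (assumed fixed-order selection)."""
    if k < 0 or k > len(pool):
        raise ValueError(f"k must be between 0 and {len(pool)}, got {k}")
    return list(pool[:k])


def build_prompt(
    src_lang: str,
    tgt_lang: str,
    source: str,
    examples: Sequence[Tuple[str, str]] = (),
) -> str:
    """Assemble a k-shot prompt; with no examples this is the zero-shot prompt."""
    lines = [f"Translate the following text from {src_lang} to {tgt_lang}."]
    for ex_src, ex_tgt in examples:
        lines.append(f"{src_lang}: {ex_src}")
        lines.append(f"{tgt_lang}: {ex_tgt}")
    # The sentence to translate comes last, with the target side left open.
    lines.append(f"{src_lang}: {source}")
    lines.append(f"{tgt_lang}:")
    return "\n".join(lines)


if __name__ == "__main__":
    # The four shot settings named in the abstract.
    for k in (0, 1, 5, 8):
        demos = select_examples(EXAMPLE_POOL, k)
        prompt = build_prompt("English", "Swahili", "Hello.", demos)
        print(f"--- {k}-shot ---")
        print(prompt)
```

Under this assumed template, the only difference between settings is the number of demonstration pairs placed before the sentence to translate, so any score difference between the zero-shot and k-shot runs can be attributed to the demonstrations rather than to a change in instructions.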

Author Biography

  • Mingzhuo Yu, Computer Science, Northeastern University, MA, USA

Published

2025-07-17

How to Cite

Li, M., Wang, X., & Yu, M. (2025). Comparative Evaluation of Zero-Shot and Few-Shot Performance of Large Language Models in Low-Resource Language Machine Translation. Journal of Global Engineering Review, 3(2), 59-68. https://doi.org/10.66372/JGER.v3i2.5