Here’s our latest paper, which compares how well decoder-only LLMs (ChatGPT, Claude, and Gemini) perform sentiment analysis on short text across multiple languages. We compared their accuracy when evaluating text in its original language versus the same text translated into English. We also compared the decoder-only LLMs with encoder-only LLMs (such as BERT and its variants), Recurrent Neural Networks (RNNs), and lexicon-based sentiment analysis methods (such as VADER) to see which produced the most accurate results.
We found that decoder-only LLMs achieved the highest accuracy of all the sentiment analysis methods when working with the original-language data. The only exception was the French data, where an RNN was the most accurate. Among the three decoder-only LLMs, ChatGPT had the highest accuracy in four of the seven languages and Claude in two; Gemini ranked second in six of the seven.
Link to full paper: https://jsomer.org/index.php/pub/article/view/38