1. Introduction
2. Theoretical Background
3. Research Design and Methods
4. Research Results and Analysis
5. Conclusion
This study compares and analyzes the responses of different Large Language Models (LLMs) to prompts about keywords common to Mainland China and Taiwan, while controlling for hyperparameters and languages. Four major LLMs were used: Qwen2-72B-Instruct (Mainland China), Llama-3-Taiwan-70B-Instruct (Taiwan), GPT-4o (United States), and Mistral-large-2411 (France). Questions on topics such as Taiwan (the island), history, politics, and cross-strait relations were presented in Simplified Chinese, Traditional Chinese, Korean, and English. Each question was repeated three times with different random seeds, yielding a total of 576 responses. Each response was then scored on a three-point scale: Mainland China perspective (-1), neutral (0), and Taiwan perspective (+1). The results showed that responses to political issues such as cross-strait relations varied with the model, the language, and the terminology used in the questions. In particular, Mainland China's Qwen2 consistently provided answers that strongly reflected the Mainland Chinese position, sometimes answering in Simplified Chinese even when questions were asked in Traditional Chinese. Language differences were also significant: when questions were asked in Simplified Chinese, most models produced responses aligned with Mainland China's perspective or remained neutral, whereas questions in Traditional Chinese elicited pro-Taiwan responses from all models except Qwen2.
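The scoring scheme described above can be sketched as follows. This is a minimal illustration, not the authors' actual pipeline: the per-cell averaging and the inferred question count (576 responses divided across 4 models, 4 languages, and 3 seeds implies 12 questions) are assumptions for demonstration.

```python
# Hypothetical sketch of the stance-scoring scheme: each response is labeled
# -1 (Mainland China perspective), 0 (neutral), or +1 (Taiwan perspective),
# and scores are averaged per (model, language) cell.
from statistics import mean

MODELS = ["Qwen2-72B-Instruct", "Llama-3-Taiwan-70B-Instruct",
          "GPT-4o", "Mistral-large-2411"]
LANGUAGES = ["Simplified Chinese", "Traditional Chinese", "Korean", "English"]
SEEDS = 3

# Inferred, not stated in the abstract: 576 / (4 models * 4 languages * 3 seeds)
QUESTIONS = 576 // (len(MODELS) * len(LANGUAGES) * SEEDS)  # = 12

def mean_stance(scores):
    """Average stance for one (model, language) cell; scores in {-1, 0, +1}."""
    assert all(s in (-1, 0, 1) for s in scores), "invalid stance label"
    return mean(scores)

# Example: a cell leaning mildly toward the Taiwan perspective
cell_score = mean_stance([1, 1, 0, -1])  # 0.25
```

A mean near -1 would indicate a cell dominated by the Mainland Chinese perspective, near +1 the Taiwan perspective, and near 0 either neutrality or offsetting responses.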
Qualitative differences between the models were also noted. GPT-4o and Mistral-large generally produced balanced responses that considered both sides and provided more detailed information in English. Interestingly, while Korean responses showed limitations attributable to insufficient training data in some models, Qwen2 took relatively neutral positions on certain Korean queries. This study confirms the political biases of LLMs and demonstrates how the language of a query can shape the political and cultural framing of the answer. The findings suggest that models developed by state-led or state-based AI companies may reflect the political positions of their home countries, highlighting the importance of accounting for language-specific biases when developing and deploying multilingual LLMs.