Chung-Ang University Humanities Research Institute

HK+ Artificial Intelligence Humanities

Print ISSN: 2635-4691 / Online ISSN: 2951-388X
Title: [인공지능인문학연구 Vol. 19] A Comparison of LLM Response Data to Prompts Related to Mainland China and Taiwan, by Kim Yeoju (2025-06-24 13:48)
Attachment: 인공지능 인문학연구 19_01.김여주.pdf (1.06MB)

1. Introduction

2. Theoretical Background

3. Research Design and Methods

4. Results and Analysis

5. Conclusion

This study compares and analyzes the responses of different Large Language Models (LLMs) to prompts about keywords shared between Mainland China and Taiwan, while controlling hyperparameters and language. Four major LLMs were used: Qwen2-72B-Instruct (Mainland China), Llama-3-Taiwan-70B-Instruct (Taiwan), GPT-4o (United States), and Mistral-large-2411 (France). Questions on topics such as Taiwan (the island), history, politics, and cross-strait relations were presented in Simplified Chinese, Traditional Chinese, Korean, and English. Each question was repeated three times with different random seeds, yielding a total of 576 responses.
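The experimental grid above can be sketched as follows. This is a minimal illustration, not the paper's actual harness: the abstract does not state the number of questions per condition, but 576 total responses over 4 models, 4 languages, and 3 seeds implies 12 questions.

```python
from itertools import product

# Models and languages as named in the abstract.
MODELS = [
    "Qwen2-72B-Instruct",
    "Llama-3-Taiwan-70B-Instruct",
    "GPT-4o",
    "Mistral-large-2411",
]
LANGUAGES = ["Simplified Chinese", "Traditional Chinese", "Korean", "English"]
N_SEEDS = 3  # each question repeated three times with different random seeds

# Inferred from the stated total of 576 responses: 576 / (4 * 4 * 3) = 12.
N_QUESTIONS = 576 // (len(MODELS) * len(LANGUAGES) * N_SEEDS)

# One run per (model, language, question, seed) combination.
runs = [
    {"model": m, "language": lang, "question": q, "seed": s}
    for m, lang, q, s in product(MODELS, LANGUAGES, range(N_QUESTIONS), range(N_SEEDS))
]
assert len(runs) == 576
```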

 

These responses were then quantified on a scale: Mainland China perspective (-1), neutral (0), and Taiwan perspective (+1). The results showed that responses to political issues such as cross-strait relations varied depending on the model, the language, and the terminology used in the questions. In particular, Mainland China's Qwen2 consistently provided answers that strongly reflected the Mainland Chinese position, sometimes answering in Simplified Chinese even when questions were asked in Traditional Chinese. Language differences were also significant: when questions were asked in Simplified Chinese, most models produced responses aligned with Mainland China's perspective or remained neutral, whereas questions in Traditional Chinese elicited pro-Taiwan responses from all models except Qwen2.
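The (-1, 0, +1) scoring described above lends itself to a per-cell average over model and language. A minimal sketch of that aggregation, using hypothetical toy scores rather than the paper's actual data:

```python
from collections import defaultdict
from statistics import mean

# Hypothetical scored responses for illustration only:
# -1 = Mainland China perspective, 0 = neutral, +1 = Taiwan perspective.
scored = [
    {"model": "Qwen2-72B-Instruct", "language": "Simplified Chinese", "score": -1},
    {"model": "Qwen2-72B-Instruct", "language": "Simplified Chinese", "score": -1},
    {"model": "GPT-4o", "language": "English", "score": 0},
    {"model": "Llama-3-Taiwan-70B-Instruct", "language": "Traditional Chinese", "score": 1},
]

# Group scores by (model, language) cell, then average within each cell.
by_cell = defaultdict(list)
for r in scored:
    by_cell[(r["model"], r["language"])].append(r["score"])

mean_scores = {cell: mean(scores) for cell, scores in by_cell.items()}
```

A cell mean near -1 indicates consistently pro-Mainland responses, near +1 consistently pro-Taiwan, and near 0 neutrality or mixed answers.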


 Qualitative differences between the models were also noted. GPT-4o and Mistral-large generally produced balanced responses that considered both sides and provided detailed information in English. Interestingly, while Korean responses showed limitations due to insufficient training data in some models, Qwen2 showed relatively neutral positions in certain Korean queries. This study confirms the political biases of LLMs and demonstrates how political and cultural contexts can be shaped by the language used. The findings suggest that models developed by state-led or state-based AI companies may reflect the political positions of their countries, highlighting the importance of considering language-specific biases when developing and using multilingual LLMs.

Rm. 828, Bldg. 310, Chung-Ang University, 84 Heukseok-ro, Dongjak-gu, Seoul 06974 | TEL 02-881-7354 | FAX 02-813-7353 | E-mail: aihumanities@cau.ac.kr
COPYRIGHT(C) 2017-2023 CAU HUMANITIES RESEARCH INSTITUTE ALL RIGHTS RESERVED