[인공지능인문학연구 20권] An Analysis of Domestic and International Datasets for the Safety Evaluation of Large Language Models
강조은, 강채안, 박서윤, 최규리, 정가연, 김민선, 최혜지, 안수빈, 김희재, 왕소남, 나예찬, 변은영, 김한샘 (2025-08-31)
1. Introduction
2. Related Work and Analysis Criteria
3. Domestic and International Datasets for AI Safety
4. Domestic Research Trends in AI Safety
5. Conclusion
Identifying, detecting, addressing, and preventing the potential risks of artificial intelligence (AI) models before they cause real-world harm is essential to ensuring user safety. This requires a comprehensive safety evaluation system and a proactive response framework from the early stages of model development. In this context, this study contributes to the ongoing discussion on AI model safety by examining domestic and international safety evaluation benchmarks and training datasets. The primary focus is on the datasets used to evaluate AI safety and on identifying current trends in safety-related research in South Korea. The study analyzes these datasets in terms of their construction purposes, methodologies, formats, and the inclusion of qualitative evaluation criteria. The findings indicate a shift from simple, single-purpose datasets toward more complex designs that account for multiple evaluation factors. Prominent trends include the use of prompt-based input data aligned with the characteristics of large language models and the large-scale construction of datasets using generative AI technologies.
In South Korea, safety evaluation benchmarks remain limited. Most datasets are created either by directly collecting posts and comments or by incorporating AI-generated data, reflecting a shift toward combining real and synthetic sources. This trend is also evident in recent research presented at the Korea Natural Language Processing Conference. Furthermore, studies on reliability are increasingly being conducted alongside efforts to develop datasets focused on harmful content.