Chung-Ang University AI Humanities Research Institute

HK+ Artificial Intelligence Humanities

Print ISSN: 2635-4691 / Online ISSN: 2951-388X
Title: [Journal of AI Humanities, Vol. 20] A Study Analyzing Domestic and International Datasets for the Safety Evaluation of Large Language Models
Authors: 강조은, 강채안, 박서윤, 최규리, 정가연, 김민선, 최혜지, 안수빈, 김희재, 왕소남, 나예찬, 변은영, 김한샘
Posted: 2025-08-31 12:19
Attachment: 인공지능 인문학연구 20_05.강조은_outline.pdf (23MB)

1. Introduction

2. Related Work and Analysis Criteria

3. Domestic and International Datasets for AI Safety

4. Domestic Research Trends on Safety

5. Conclusion



Identifying, detecting, addressing, and preventing the potential risks associated with artificial intelligence (AI) models before they lead to real-world issues is essential to ensure user safety. This requires the implementation of a comprehensive safety evaluation system and a proactive response framework from the early stages of model development. In this context, this study aims to contribute to the ongoing discussion on AI model safety by examining both domestic and international benchmarks related to safety evaluations and training datasets. The primary focus is on the datasets used to evaluate AI safety and on identifying current trends in safety-related research in South Korea. This study analyzes a range of datasets from both domestic and international sources, focusing on their construction purposes, methodologies, formats, and the inclusion of qualitative evaluation criteria. The findings indicate a shift from simple, single-purpose datasets to more complex designs that consider multiple evaluation factors. Prominent trends include the use of prompt-based input data aligned with the characteristics of large language models and the large-scale construction of datasets utilizing generative AI technologies.

In South Korea, safety evaluation benchmarks remain limited. Most datasets are created through the direct collection of posts and comments or by incorporating AI-generated data, reflecting a shift toward combining real and synthetic sources. This trend is also evident in recent research presented at the Korea Natural Language Processing Conference. Furthermore, in addition to efforts to build datasets focused on harmful content, studies on reliability are increasingly being conducted.

Chung-Ang University Humanities Research Institute
Room 828, Building 310, Chung-Ang University, 84 Heukseok-ro, Dongjak-gu, Seoul 06974, Korea  TEL 02-881-7354  FAX 02-813-7353  E-mail: aihumanities@cau.ac.kr
COPYRIGHT(C) 2017-2023 CAU HUMANITIES RESEARCH INSTITUTE ALL RIGHTS RESERVED