1. Introduction 2. Research Design and Methods 3. Results and Analysis
4. Discussion and Conclusion
This study evaluated GPT-4's ability to perform inductive qualitative coding on Korean documentary scripts addressing low fertility. Unlike prior research that relied on deductive coding schemes, we examined whether GPT-4 could generate emergent codes across 497 quotations. Approximately 90% of the AI-generated codes aligned with core meaning units in the human-coded gold standard, and complete semantic mismatches were rare. An embedding-based metric (BERTScore F1 = 0.791) further confirmed the strong semantic correspondence. Model performance declined as textual complexity increased: character-level F1 scores fell from 0.486 for less complex passages to 0.352 for more complex ones. Notably, few-shot learning with only a single example improved performance by 7%, which highlights the efficiency of the model. While GPT-4 coded concrete concepts effectively, it was less capable with abstract ideas and context-dependent interpretations. Despite these limitations, our findings suggest that large language models could partially replace human labor in inductive qualitative research. We suggest that this approach could enhance scalability and reproducibility while preserving interpretive depth.
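The character-level F1 score reported above can be illustrated with a minimal sketch. The abstract does not specify how the metric was computed, so the version below is an assumption: it treats each code as a multiset of characters (ignoring spaces) and takes the harmonic mean of precision and recall over the overlap. The function name `char_f1` and the two example codes are hypothetical and are not taken from the study's data.

```python
from collections import Counter


def char_f1(predicted: str, reference: str) -> float:
    """Character-level F1 based on multiset overlap (an assumed definition).

    Spaces are dropped so that spacing differences do not affect the score.
    """
    pred_chars = Counter(predicted.replace(" ", ""))
    ref_chars = Counter(reference.replace(" ", ""))

    # Counter intersection keeps the minimum count of each shared character.
    overlap = sum((pred_chars & ref_chars).values())
    if overlap == 0:
        return 0.0

    precision = overlap / sum(pred_chars.values())
    recall = overlap / sum(ref_chars.values())
    return 2 * precision * recall / (precision + recall)


# Hypothetical AI-generated code vs. hypothetical human code.
ai_code = "양육 부담"          # "childcare burden"
human_code = "양육 비용 부담"  # "childcare cost burden"
print(round(char_f1(ai_code, human_code), 3))
```

A surface-overlap score of this kind penalizes paraphrases that share few characters with the reference, which is one reason it is usually read alongside an embedding-based metric such as BERTScore.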