Abstract
This paper explores how the mainstreaming of cyberspace has affected language use and proposes an internet meme translation system to address communication problems between speakers who are familiar with cyberspace language and those who are not. Cyberspace language is a complex compound of various cultural elements, including text, images, sounds, music, gestures, postures, facial expressions, and composition. Before building the translation system, experiments were conducted to validate the internet meme data. First, a dictionary of internet memes was constructed, containing 1035 lemmas and 1654 inflected forms. Based on this dictionary, the proportion of internet memes used in actual cyberspace language data was analyzed. The paper then discusses how to build a translation system using the internet meme data collected in this study. It first addresses the task of distinguishing target data from general data. It then explores the classification of memes that share semantic characteristics with idiomatic expressions and their correspondence to existing idiomatic expressions. Furthermore, it examines the amplification and reproduction of socially stigmatized and provocative content, stemming from the unethical nature of internet meme data. Lastly, it considers a methodology for the continuous construction of internet meme data. Ultimately, it is hoped that a comprehensive approach to internet memes will pave the way for significant research on communication in both cyberspace and physical space.