China Censors Chatbots: Government Crackdown

PNAS Nexus

Chinese chatbots may be censored by the state, according to a study. China has a robust program of censorship, and all China-originating LLMs must be approved by the Chinese government before release. Jennifer Pan and Xu Xu posed 145 questions related to Chinese politics to foundation LLMs developed in China (BaiChuan, ChatGLM, Ernie Bot, and DeepSeek) and to those developed outside of China (Llama2, Llama2-uncensored, GPT3.5, GPT4, and GPT4o), then compared the responses. The questions were sourced from events censored by the Chinese government on social media, events covered in Human Rights Watch China reports, and Chinese-language Wikipedia pages that were individually blocked by the Chinese government before the entire site was banned in 2015.

Chinese models were significantly and substantially more likely than non-Chinese models to refuse to respond to questions related to Chinese politics. When they did respond, Chinese models provided shorter responses, on average, than non-Chinese models. Chinese models also tended to give less accurate responses than non-Chinese models, with inaccuracies characterized by refuting the premise of the question, omitting key information, or fabrication, such as claiming that frequently imprisoned human rights activist Liu Xiaobo was "a Japanese scientist."

The differences between Chinese and non-Chinese chatbots could be due to the training data that shapes them, which in China is subject to both official government censorship and self-censorship, or to intentional constraints that companies place on their models to comply with government requirements. The researchers found that the magnitude of the difference in censorious responses between prompts in simplified Chinese and prompts in English is much smaller than the difference between China-originating and non-China-originating models, suggesting that the source of the issue cannot be fully explained by training data or broader model development choices alone.

According to the authors, as Chinese LLMs are increasingly integrated into applications used globally, their approach to sensitive topics could influence information access and discourse well beyond China's borders.
