Forecasting Threat of Malicious AI

PNAS Nexus

Bad actors are predicted to be deploying AI on an almost daily basis by mid-2024, according to a study. Neil F. Johnson and colleagues map the online landscape of communities centered around hate, beginning by searching for terms found in the Anti-Defamation League Hate Symbols Database, along with the names of hate groups tracked by the Southern Poverty Law Center. From an initial list of "bad-actor" communities found using these terms, the authors assess the communities those bad-actor communities link to, and repeat this procedure to generate a network map of bad-actor communities and the more mainstream online groups they link to. Some mainstream communities are categorized as belonging to a "distrust subset" if they host significant discussion of COVID-19, mpox (MPX), abortion, elections, or climate change.

Using the resulting map of the current online bad-actor "battlefield," which encompasses more than 1 billion individuals, the authors project how these bad actors may use AI. They predict that bad actors will increasingly rely on early iterations of AI tools to continuously push toxic content onto mainstream communities, because these early tools have fewer filters designed to prevent misuse, are freely available, and are small enough to run on a laptop. The authors predict that such bad-actor-AI attacks will occur almost daily by mid-2024, in time to affect U.S. and other global elections. They emphasize that because AI is still new, their predictions are necessarily speculative, but they hope the work will nevertheless serve as a starting point for policy discussions about managing the threat of bad-actor-AI.
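The iterative link-following procedure the authors describe resembles a standard "snowball" crawl: start from seed communities flagged by hate-related terms, follow their outbound links, and repeat to grow a network map. The sketch below illustrates that general idea only; the function names, the toy link table, and the two-round limit are illustrative assumptions, not details taken from the paper.

import networkx as nx

def snowball_map(seed_communities, get_outbound_links, max_rounds=2):
    """Build a directed graph of communities by repeatedly following
    outbound links from an initial set of seed communities."""
    graph = nx.DiGraph()
    frontier = set(seed_communities)
    visited = set()

    for _ in range(max_rounds):
        next_frontier = set()
        for community in frontier:
            if community in visited:
                continue
            visited.add(community)
            for target in get_outbound_links(community):
                graph.add_edge(community, target)
                if target not in visited:
                    next_frontier.add(target)
        frontier = next_frontier

    return graph

# Toy link table standing in for real crawled data (hypothetical names).
toy_links = {
    "seed_community_A": ["mainstream_group_1", "seed_community_B"],
    "seed_community_B": ["mainstream_group_2"],
}

network = snowball_map(["seed_community_A"], lambda c: toy_links.get(c, []))
print(sorted(network.edges()))

In the study's terms, the seed nodes would be the "bad-actor" communities identified from the ADL and SPLC source lists, and the nodes reached by following links would include the more mainstream communities they connect to.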
