Generative AI Lowers Costs, Raises Cyberattack Risks

Cell Press

Using generative AI to design, train, or perform steps within a machine-learning system is risky, argues computer scientist Michael Lones in a paper publishing April 22 in the Cell Press journal Patterns. Though large language models (LLMs) could expand the capabilities of machine-learning systems and decrease costs and labor needs, Lones warns that using them reduces transparency and control for the people developing and using these systems, and increases the risk of malicious cyberattacks, data leaks, and bias against underrepresented groups.

"Machine-learning developers need to be aware of the risks of using GenAI in machine learning and find a sensible balance between improvements in capability and the risks that might come with that," says Lones, a computer scientist at Heriot-Watt University in Edinburgh, UK. "Given the current limitations of generative AI, I'd say this is a clear example of just because you can do something doesn't mean you should."

Machine-learning systems are algorithms that learn to recognize patterns in data, which they can then use to make predictions and decisions regarding new data. Machine learning has been around for decades, and most people encounter it in their daily lives in the form of spam filters, product recommendations on e-commerce websites, and social media newsfeeds. In the last two or so years, there has been a push to incorporate generative AI (in the form of LLMs) into machine-learning systems, but doing so carries risks and limitations that developers and the general public should be aware of, Lones says.
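For readers who want to see what that looks like in practice, here is a minimal sketch of such a system in Python, using the scikit-learn library: a toy spam filter fitted to a handful of invented example messages (the data and labels are made up purely for illustration and are not from the paper), which then classifies messages it has never seen.

```python
# A minimal sketch of a classic machine-learning system: a toy spam filter
# that learns word patterns from labelled examples and then classifies new
# messages. The tiny dataset below is invented purely for illustration.
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline

messages = [
    "win a free prize now", "cheap meds, click here",      # spam
    "meeting moved to 3pm", "see you at dinner tonight",   # not spam
]
labels = ["spam", "spam", "ham", "ham"]

# Learn patterns from the labelled training data...
model = make_pipeline(CountVectorizer(), MultinomialNB())
model.fit(messages, labels)

# ...then make predictions about new, unseen data.
print(model.predict(["free prize, click now", "lunch at noon?"]))
```

The only point of the sketch is that the system's behavior is learned from its training data; the rest of the article is about what changes when generative AI is inserted into that loop.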

Lones explores four ways in which generative AI is currently being applied in machine learning: as a component within a machine-learning pipeline, to design and code machine-learning pipelines, to synthesize training data, and to analyze machine-learning outputs. All of these applications carry risks, Lones says, and these risks are compounded if LLMs are used for multiple tasks within a machine-learning system, or if LLMs are "agentic"—meaning they can autonomously use external tools to solve problems.
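To make one of those applications concrete, the sketch below shows, in hypothetical form, what "using an LLM to synthesize training data" might look like. The function names (`call_llm`, `synthesize_examples`) and the prompt are invented for illustration; they are not from Lones' paper or from any particular LLM provider's API.

```python
# A hypothetical sketch of one use Lones describes: asking an LLM to
# synthesize extra training data. `call_llm` stands in for whatever LLM
# interface a team actually uses; here it returns canned text so the
# example runs on its own.
def call_llm(prompt: str) -> str:
    # In a real pipeline this would call a hosted or local LLM.
    return "Win a free cruise, reply now!\nYour account needs urgent verification."

def synthesize_examples(label: str, n: int) -> list[str]:
    """Ask the LLM for n short messages a human reviewer would label as `label`."""
    prompt = (
        f"Write {n} short, realistic messages that a reviewer would "
        f"label as '{label}'. One message per line."
    )
    lines = call_llm(prompt).splitlines()
    return [line.strip() for line in lines if line.strip()][:n]

# Synthetic data inherits the LLM's errors and biases, which is why Lones
# advises that a human review it before it reaches the training set.
synthetic_spam = synthesize_examples("spam", n=2)
print(synthetic_spam)
```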

"If you have GenAI working in a number of different ways within your machine-learning workflows or system, then they can interact in unpredictable and hard to understand ways," says Lones. "My advice at the moment is to avoid adding too much complexity in terms of how we use GenAI in machine learning, particularly if you're in a sector that has high stakes that impact people's lives and livelihood."

One of the biggest risks is simply that LLMs sometimes make mistakes, reach bad decisions, and fabricate or "hallucinate" information. Lones says that these errors aren't necessarily predictable and may be difficult to evaluate because LLMs operate in a non-transparent way, which also creates problems for legal compliance.

"In areas like medicine or finance, there are laws about being able to show that the machine-learning system is reliable, and that you can explain how it reaches decisions," says Lones. "As soon as you start using LLMs, that gets really hard, because they're so opaque."

Lones advises machine-learning developers to always manually evaluate LLM-generated code and outputs. He also warns that bigger, remotely hosted LLMs often store and share data, which means that using them opens up opportunities for cybersecurity breaches and the leakage of sensitive data.

"It's important for people in the general public to be aware of the limitations of GenAI systems," says Lones. "Companies will deploy these systems to do things like cut costs, and this may improve the experience that end users get, but it may also have negative consequences, such as bias and unfairness."
