CSIRO, Australia's national science agency, has analysed data from a 10-month trial conducted by global cybersecurity firm eSentire exploring how large language models (LLMs) like ChatGPT-4 can help cybersecurity analysts detect and stop threats while reducing fatigue.
The anonymised data was collected at eSentire's Security Operations Centres (SOCs) in Ireland and Canada, where analysts identify, investigate and respond to cyberattacks.
During the trial, 45 cybersecurity analysts asked ChatGPT-4 more than 3000 questions, mostly for routine, low-risk tasks such as interpreting technical data, editing text and analysing malware code.
Dr Mohan Baruwal Chhetri, Principal Research Scientist at CSIRO's Data61, said the study shows AI can be embedded into real workflows to support, not replace, human expertise.
"ChatGPT-4 supported analysts with tasks like interpreting alerts, polishing reports, or analysing code, while leaving judgement calls to the human expert," Dr Baruwal Chhetri said.
"This collaborative approach adapts to the user's needs, builds trust, and frees up time for higher-value tasks."
The study was conducted under CSIRO's Collaborative Intelligence (CINTEL) program, which explores how human-AI collaboration can enhance performance and wellbeing across domains including cybersecurity, where analyst fatigue is a growing challenge.
SOC teams face rising volumes of alerts, many of them false positives, leading to missed threats, reduced productivity, and potential burnout.
Dr Baruwal Chhetri said human-AI collaboration could also benefit other high-pressure environments, such as emergency response and healthcare.
Dr Martin Lochner, Data Scientist and Research Coordinator, explained that the trial is the first long-term industrial study to show how LLMs can be used in real-world cybersecurity operations, helping shape the next generation of AI tools for SOC teams.
"This collaboration uniquely combined academic rigor with industry reality, producing insights that neither pure laboratory studies nor industry-only analysis could achieve," Mr Locher said.
"For instance, we found that only four per cent of analyst requests to ChatGPT-4 asked for a direct answer, such as 'is this malicious?'. Instead, analysts preferred receiving evidence and context to support their own decision making."
"This highlights the value of LLMs as decision-support tools that enhance analyst autonomy rather than replace it."
Building on the 10-month study, the next phase of research will be a longer-term investigation using a two-year dataset to examine how analysts' use of ChatGPT-4 evolves over time. This phase will also incorporate qualitative analysis of analyst experiences, comparing outcomes with log data to better understand how AI tools can improve productivity and be refined for broader adoption in SOC environments.
Discover the insights from the study: LLMs in the SOC: An Empirical Study of Human-AI Collaboration in Security Operations Centres.