Opinion Piece: When Empirical Strikes Back

Australian Treasury

When a German bakery chain wanted to boost sales, it didn't hire consultants or launch a splashy rebrand. Instead, it did something more radical: it ran a randomised trial. Half its stores offered staff a modest group bonus. The other half didn't. After a few months, the results were in: a 3 per cent sales increase in the bonus group. For every dollar spent on bonuses, the company reaped $3.80 in revenue and $2.10 in operational profit.

It's a reminder that in business - and in government - the most powerful tool may not be charisma or instinct, but curiosity. Randomised trials help us figure out not just what sounds good, but what actually works.

In Australia's public service, that ethos is taking root. We're seeing an emerging culture of testing and learning: through small‑scale trials, behavioural nudges, and rigorous evaluation. From tax compliance letters to SMS reminders, government is using evidence to improve how it delivers. Not by guessing. By learning.

Public sector productivity isn't about profit margins. It's about outcomes that matter: fewer people stuck in long‑term unemployment, shorter hospital wait times, better school completion rates. And improving those outcomes begins with one key question: what works?

Randomised trials give us answers. They compare 2 versions of a program - one with a new tweak, one without - and show whether the tweak made a difference. A redesigned letter. A new prompt. A brief coaching call. Some ideas turn out to be duds. Others change everything.

Take Services Australia. In one trial, the department sent a simple confirmation text message to people who'd submitted a form. Just a short note effectively saying 'we've got it'. That tiny tweak cut follow‑up calls by 11 percentage points, saving time for both staff and callers. Another trial found that a well‑worded SMS reminder to income support recipients boosted on‑time earnings reporting by 13 percentage points and cut payment suspensions nearly in half. The message saved 6,000 hours of staff time a year.

The Tax Office tried something similar. Letters to tax agents that gently flagged possible over‑claiming of work‑related deductions resulted in average claims falling by $191 per taxpayer. Across the sample, that meant more than $2 million in reduced deductions.

These are what behavioural economists call low‑friction interventions. They don't require new laws or billion‑dollar budgets. They just make systems work better, at minimal cost. And they let us measure the difference between what we intended and what we actually achieved.

Importantly, not every trial works. And that's the point. A few years ago, the Behavioural Economics Team of the Australian Government developed an app to help university students build resilience. It was well designed and backed by solid theory. And it had no discernible impact. But because it was a properly structured trial, we learned something useful: that this approach wasn't effective.

The team didn't stop there. They tested another idea: sending short, supportive text messages to students. That worked. Recipients reported a 7 per cent improvement in life satisfaction. The failure helped shape the success.

Governments in other countries are learning this way too. In the UK, the Business Basics Programme funded dozens of trials to improve small business productivity. Some interventions flopped. But others - such as targeted management training - showed promise. Knowing what doesn't scale is just as valuable as knowing what does.

This is what it means to treat government as a learning machine. It's not about being cautious. It's about being smart. Better trials make for better policy. Not in theory, but in practice.

That includes policy design, not just service delivery. Right now, the Department of Employment is working with the Australian Centre for Evaluation to test whether offering online jobseekers a voluntary one‑on‑one coaching session leads to better employment outcomes. If it works, it could reshape how we think about delivering digital services. If not, it still helps us decide where best to focus effort.

Of course, individual trials are just the start. We also need the infrastructure to support learning: places to share findings, spot patterns and build on what's already known. That's why the Australian Centre for Evaluation recently launched a new evaluation library - an open, searchable database of government evaluation reports. Instead of lessons gathering dust in filing cabinets, they can now shape future decisions.

Globally, institutions like the Campbell Collaboration and the UK's What Works Network are helping governments synthesise trial results and provide real‑time guidance. Some are even experimenting with AI‑powered tools that update evidence reviews as new data comes in. These aren't just research projects. They're blueprints for a smarter public sector.

Ultimately, this is about trust. A government that tests its assumptions and learns from failure is one that earns confidence. One that values rigour over rhetoric. One that puts public value at the heart of public service.

As my colleague Zhi Soon MP put it in his first speech to parliament, governments must be humble enough to recognise what we don't know, and committed to testing, learning and adapting.

In a world of rising expectations and finite budgets, we can't afford to rely on guesswork. We need to know - really know - what works. That means making learning part of the system. Not a side project, but a core function.

Because the most productive government isn't the one that tries the hardest. It's the one that learns the fastest.
