A team co-led by researchers from Google DeepMind and the University of Nottingham has launched a world-first AI model that can contextualise ancient inscriptions.
The research, published today in Nature, explains how Aeneas, the AI model, could greatly reduce the workload of researchers and draw connections from a wide range of historical evidence.
When working with ancient inscriptions, historians traditionally rely on their expertise and specialised resources to identify 'parallels' - texts that share similarities in wording, standardised formulas or provenance.
Aeneas greatly accelerates this complex and time-consuming work. It reasons across thousands of Latin inscriptions and retrieves textual and contextual parallels in seconds, so that historians can interpret and build upon the model's findings.
The model can also be adapted to other ancient languages, scripts and media, from papyri to coinage, expanding its capabilities to help draw connections across a wider range of historical evidence.
Google DeepMind co-developed Aeneas with the University of Nottingham, in partnership with researchers at the Universities of Warwick and Oxford and at Athens University of Economics and Business (AUEB). This work was part of a wider effort to explore how generative AI can help historians better identify and interpret parallels at scale.
To train Aeneas, the research team curated a large and reliable dataset, drawing from decades of work by historians to create digital collections. They cleaned, harmonised and linked these records into a single machine-actionable dataset, referred to as the Latin Epigraphic Dataset (LED), comprising over 176,000 Latin inscriptions from across the ancient Roman world.
Aeneas helps historians interpret and contextualise a text, give meaning to isolated fragments, draw richer conclusions and piece together a better understanding of ancient history.
The model's advanced capabilities include:
- Parallels search: It searches for parallels across a vast collection of Latin inscriptions. By turning each text into a kind of historical fingerprint, Aeneas identifies deep connections that can help historians situate inscriptions within their broader historical context.
- Processing multimodal input: Aeneas is the first model to determine a text's geographical provenance using multimodal inputs. It analyses both text and visual information, such as images of an inscription.
- Restoring gaps of unknown length: For the first time, Aeneas can restore gaps in texts where the missing length is unknown. This makes it a more versatile tool for historians dealing with heavily damaged material.
- State-of-the-art performance: Aeneas sets a new state-of-the-art benchmark in restoring damaged texts and predicting when and where they were written.
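The 'historical fingerprint' idea behind the parallels search can be illustrated with a minimal sketch. This is not the Aeneas implementation: the model's fingerprints are learned by a neural network, whereas the toy below assumes simple character n-gram counts as a stand-in. The function names (`fingerprint`, `cosine`, `find_parallels`) are hypothetical, and the three sample strings are illustrative Latin phrases, not records from the LED.

```python
import math
from collections import Counter


def fingerprint(text: str, n: int = 3) -> Counter:
    """Toy stand-in for a learned embedding: counts of character n-grams."""
    cleaned = " ".join(text.lower().split())  # lowercase, collapse whitespace
    return Counter(cleaned[i:i + n] for i in range(len(cleaned) - n + 1))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(count * b[gram] for gram, count in a.items())
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # texts shorter than n characters have no n-grams
    return dot / (norm_a * norm_b)


def find_parallels(query: str, corpus: list[str], top_k: int = 2) -> list[tuple[float, str]]:
    """Rank corpus texts by similarity of their fingerprint to the query's."""
    q = fingerprint(query)
    scored = [(cosine(q, fingerprint(text)), text) for text in corpus]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:top_k]


# Illustrative sample texts (assumed for this sketch, not drawn from the LED).
corpus = [
    "dis manibus sacrum",
    "imperator caesar augustus",
    "senatus populusque romanus",
]

results = find_parallels("dis manibus", corpus, top_k=1)
for score, text in results:
    print(f"{score:.2f}  {text}")
```

In this toy version a fragment is matched to the text that shares the most character sequences with it. A learned fingerprint, as described above, would also be expected to capture contextual similarity that surface wording alone does not reveal.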
Ancient inscriptions offer rare, direct insights into past civilizations, but they often survive as incomplete fragments, lacking crucial context. We've developed Aeneas, an AI model that transforms how historians approach these texts.
"Much like finding connections between jigsaw pieces, Aeneas quickly identifies shared names, phrases, and formulas across thousands of Latin inscriptions, enabling us to reconstruct lost information and gain a more complete understanding of ancient history."
Professor Dame Mary Beard, DBE FSA FBA FRSL, Professor of Classics at the University of Cambridge, said: "Aeneas is a really exciting and expert 'test run' for the use of AI in the study of Roman inscriptions (epigraphy). Breakthroughs in this very difficult field have tended to rely on the memory, the subjective judgement and the hunch/guesswork of individual scholars, supported by traditional, encyclopaedic databases. Aeneas opens up entirely new horizons, and it is here tested by some of the leading scholars in the field. It promises to be transformative."