Cosmic Dataset Opens to Scientists, Novices, AI

Pennsylvania State University

Newly released data from one of the largest surveys ever taken of the early universe will allow astronomers to study how the first galaxies formed and evolved, measure how gas and stars were distributed within these galaxies, map the large-scale structure of the cosmos and investigate rare and unexpected objects not easily found in traditional surveys. The Hobby-Eberly Telescope Dark Energy Experiment (HETDEX) has released all of its immense, information-rich database to the public - more than half a petabyte of raw and processed data, according to the research team.

A paper describing the data release was published June 3 in the Astrophysical Journal.

"HETDEX observations cover the era that occurred from 10 to 12 billion years ago that astronomers have named 'cosmic noon,'" said Robin Ciardullo, professor of astronomy and astrophysics at Penn State, co-author of the paper and the observations manager of the HETDEX project. "This was the time when star formation was most vigorous and we believe that galaxies were being assembled."

HETDEX observations make use of a technique called spectroscopy, where light is broken into its various wavelengths to produce a spectrum. The HETDEX dataset contains 600 million of these spectra, which can reveal an object's chemistry, temperature, mass, movement through space and distance from Earth.

"This is a spectral map of the universe," said Erin Mentuch Cooper, HETDEX data manager and lead author of the paper. "It turns every point of light into a barcode of physics. The real excitement is what happens when thousands of astronomers start exploring it."

From 2017 to 2024, the Hobby-Eberly Telescope at McDonald Observatory in Texas surveyed a region of night sky equivalent in size to about 2,000 full moons, creating a map of the distant universe.

"The primary scientific goal of HETDEX is to use the map of approximately one million galaxies to investigate the expansion history of the universe, and thereby understanding its composition, including the nature of dark energy," said Caryl Gronwall, research professor of astronomy and astrophysics at Penn State and co-author of the paper. "The existence of dark energy was discovered three decades ago, when observations revealed that the universe's rate of expansion was increasing, but dark energy remains as much a mystery today as when it was first identified."

While the focus of HETDEX has been distant galaxies, the survey has also gathered data on all of the space within its view.

"The survey is untargeted," said Karl Gebhardt, HETDEX principal investigator, chair of the University of Texas at Austin's astronomy department and co-author of the paper. "We aren't picking and choosing specific objects to observe. Instead, we're pointing one of the world's largest telescopes at the sky and seeing what's out there. We fully expect to find some really cool, wild stuff hiding in the data."

The database consists of 431,000 data cubes that map information into three-dimensional space. When measured on the sky, each is roughly one-thirtieth the size of the full moon. Most correspond to regions around the Big Dipper and Orion.

"The scope of HETDEX allows us to investigate an era when dark energy is predicted to be a minor constituent of the universe, compared to today, when it dominates," said Donghui Jeong, professor of astronomy and astrophysics at Penn State, a co-author of the paper. "The new observations should place strong constraints on evolutionary models of the universe."

In addition to raw data, the release also contains a catalog of every object HETDEX has found so far: over one million distant galaxies, half a million nearby star-forming galaxies, 18,000 supermassive black holes and over 150,000 stars. Scientists, students and citizen researchers can download customized subsets of data based on sky location. Or, thanks to a close collaboration with the University of Texas at Austin's Texas Advanced Computing Center, they can perform large-scale analysis using high-performance, cloud-based supercomputing resources, dramatically lowering the barrier to entry for working with data of this scale.

While the release is based on half a petabyte of data, the equivalent storage of three full years of high-quality video, the team was able to process it down to a more manageable 10 terabytes. It also developed extensive tutorials and tools to help users - both human and artificial intelligence (AI) - to make the most of this massive, complex dataset.

"It's been so important for me to make it as accessible as possible," Mentuch Cooper said. "We've turned more than half a billion spectra into something you can actually explore. It's like compressing a universe of information into something you can hold in your hands."

Due to the size of the HETDEX database, AI is playing a significant role in the creation and analysis of the dataset; for example, software provided by RAIC Labs automatically removed contamination from satellites and meteors crossing in front of the telescope. HETDEX also used automated methods to comb through its observations and identify early galaxies. In parallel, more than 24,000 citizen scientists helped confirm the presence of these galaxies through the Dark Energy Explorers program.

Donald Schneider, distinguished professor of astronomy and astrophysics at Penn State, was also a co-author of the paper.

To access the data and learn more about the HETDEX project, visit the HETDEX website.

The HET is a joint project of Penn State, the University of Texas at Austin, Ludwig-Maximilians-Universität München and Georg-August-Universität Göttingen.
