DigiFAB Hackathon tackles science challenges with large language models

The Institute for Digital Molecular Design and Fabrication (DigiFAB) hosted a hackathon themed around large language models in science.

The Digital Molecular Design and Fabrication (DigiFAB) Institute and the Faculty of Natural Sciences Data Science Theme gathered 47 students from across different Departments in Imperial to compete in their latest hackathon, inspired by large language models (LLMs).

Multidisciplinary teams of undergraduate and postgraduates students as well as early career researchers competed against one another to solve science challenges with LLM technologies, such as OpenAI's ChatGPT, as well as open-source packages.

"This year, large language models like ChatGPT have been really topical, so we thought it might be interesting to do a hackathon on the topic," said Professor Kim Jelfs from the Department of Chemistry.

"We were excited to see teams from the Departments of Chemistry, Life Sciences, Mathematics, Physics, Computing, Chemical Engineering and Aeronautics," said Professor Jelfs.

From deconstructing chemical targets to answering exam questions

Teams were tasked with tackling two challenges with LLMs: one was to predict retrosynthesis routes for new molecules, the other was to extract knowledge from pre-existing literature to answer chemistry and physics exam questions.

Participants had to find the synthetic pathways to around 50 unique molecules. Dr Alexander Ganose, a lecturer from the Department of Chemistry, said that the best-performing teams utilised open-source software – such as the chemical synthesis package known as Molecular Transformer, a machine-learning model inspired by language translation.

Other packages, such as paper-qa, that extracts and organises information, also helped teams tackle the next challenge. Participants were asked to answer 10 Imperial exam questions from past papers, when given textbooks to extract information from.

Discovering the best way to pose questions or requests to LLMs, such as getting them to assume the role of a chemist or providing important contextual information, also improved the results that teams got.

We gave them some example code that they could start from, but the teams were developing their own solutions using completely new technologies... Dr Alexander Ganose Department of Chemistry

"We gave them some example code that they could start from, but the teams were developing their own solutions using completely new technologies," said Dr Ganose, "People were getting very creative."

Teams were also able to hear from keynote speakers: Dr Kevin Jablonka (Helmholtz Institute for Polymers in Energy Applications of the University of Jena and the Helmholtz Centre in Berlin) and Dr Michael Pierler (OpenBioML and StabilityAI). Both speakers were involved in developing natural language processing based software packages, such as ChemNLP as part of OpenBioML.

Congratulations to the winning teams

FIRST PLACE: Team 7

Ruiqi Wu, Yuchen Lou, Shirui Wang (Department of Chemistry) and Chin Yong Tan (Department of Mathematics).

"It was a fun competition overall, and I think most of the fun came from having the liberty to do whatever we wanted with the code," said Yuchen Lou, an undergraduate student from the Department of Chemistry.

Each team member received £100 in prize money.

SECOND PLACE: Team 2

Tanuj Karia, Shubhani Paliwal, Lingfeng Gui, Benjamin Tan and Gustavo Chaparro, who are all PhD students from the Molecular Systems Engineering Group in the Department of Chemical Engineering.

"We are not the most prominent experts on machine experts on machine learning or LLMs, so the DigiFAB Hackathon was a challenging and fun experience," Chaparro said.

"We see a lot of potential in LLMs for our field, like predicting the thermophysical properties of fluids, which could lead to the design of better, more efficient, and greener chemical processes! We really value this experience," he said.

Each team member received £75 in prize money.

THIRD PLACE: Team 10

Suchaya Mahuttanatan (Department of Chemistry), Xiaoyi Sun and Jason Li (Department of Physics).

Each team member received £50 in prize money.

See you next year

Thank you to everyone who participated, the Scientific Committee who were on-hand to help the teams throughout the day (Dr Javist Frost, Dr Antonio del Rio Chanona, Dr Steven Bennett, Friedrich Hastedt, Bradley Martin and Ryo Kuno), as well as Dr Ester Buchaca-Domingo, Strategic Research Coordinator at FoNS, who helped organise the hackathon.

/Public Release. This material from the originating organization/author(s) might be of the point-in-time nature, and edited for clarity, style and length. Mirage.News does not take institutional positions or sides, and all views, positions, and conclusions expressed herein are solely those of the author(s).View in full here.