Sample management for biologists

In biology, even small errors in labelling and data management can lead to severe consequences, particularly when dealing with large, complex projects like tracking COVID-19. To reduce the potential for error, researchers at Queen’s University in Canada have developed a new software package called ‘baRcodeR’. The research is published in Methods in Ecology and Evolution.

Examples of conventional, hand-labelled tubes compared with baRcodeR-generated labels of various types for different use-cases. Image taken from research paper.

How many times have you struggled to interpret messy handwriting – maybe a note from a friend or co-worker, or a label on a meal preserved in your freezer? Biologists face similar labelling challenges but with far more severe consequences. To help address this problem, Yihan Wu and colleagues at Queen’s University in Canada have developed a new research software package called ‘baRcodeR’. The capital Rs are a nod to the R statistical programming environment, which runs the software.

Scientists who work with biological samples typically record additional information, including date, location, measurements, test results, and other observations. Large collaborative projects like those tracking COVID-19 can require samples and data to be coordinated among hundreds or even thousands of scientists and students around the world.

Errors with labelling or data management can have serious consequences. For example, a labelling error rate of just 1% across the 80+ million COVID-19 tests conducted worldwide could yield hundreds of thousands of misdiagnoses, including tens of thousands of infected patients erroneously cleared to return to work. Human errors at this scale are inevitable. In one recent example, hundreds of positive cases near Toronto were not disclosed to public health officials for weeks.

To reduce human error, the open-source baRcodeR software from Wu and her colleagues helps scientists quickly generate unique identifier codes and print scannable barcodes on a basic laser printer. Digital barcodes, like the ones used on consumer packaging, are standard practice for tracking samples in commercial enterprises, but the commercial software lacks the flexibility that many complex experimental designs require, and it can be prohibitively expensive.
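
As a rough illustration of what generating unique identifier codes involves, here is a minimal Python sketch. It is not baRcodeR's own R interface: the function name `make_sample_ids`, the `EX` prefix, and the zero-padded numbering scheme are all assumptions made for this example.

```python
def make_sample_ids(prefix, start, count, digits=3):
    """Return `count` sequential, zero-padded sample IDs (hypothetical scheme).

    Example: prefix "EX", start 1, count 3 gives EX001, EX002, EX003.
    """
    if start < 0 or count < 0:
        raise ValueError("start and count must be non-negative")
    last = start + count - 1
    # Refuse to overflow the padding width, which would make IDs uneven in length.
    if count > 0 and len(str(last)) > digits:
        raise ValueError(f"{last} does not fit in {digits} digits")
    return [f"{prefix}{n:0{digits}d}" for n in range(start, start + count)]


ids = make_sample_ids("EX", start=1, count=5)
# Sequential numbering guarantees that no two labels collide.
assert len(set(ids)) == len(ids)
print(ids)  # ['EX001', 'EX002', 'EX003', 'EX004', 'EX005']
```

Fixed-width codes sort correctly as plain text and are harder to misread than handwritten labels. Turning each code into a scannable barcode and laying the labels out for printing is a separate step that this sketch does not cover.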

baRcodeR will help to make labels from our estimated 10,000 sample collections over the next few months

– explains Dr. Christopher Barnes, Director of Clinical and Translational Science Informatics and Technology at the University of Florida. His team is testing frontline health workers in the State.

Dr. Robert Colautti, a Queen’s faculty member and senior author on the paper, said: “A reproducible analysis on erroneous data is not reproducible science. The emerging field of data science uses computer programming to process and analyze data in a way that can be reproduced by anyone with the right skillset. But earlier stages of sample collection and measurement have not received the attention they deserve.”

Kevin Hanson, Assistant Director of Clinical and Translational Science Informatics and Technology at the University of Florida, said: “Now the nurse can confirm the name and date of birth of the person driving up, peel and affix the labels to the swab collection tube, and the lab can read the barcode for their robot to perform the tests.”

You can read the research for free here:

Wu Y, Lougheed DR, Lougheed SC, Moniz K, Walker VK, Colautti RI. baRcodeR: An open-source R package for sample labelling. Methods Ecol Evol. 2020;00:1-6.
