Using Sex to Identify Mislabeled Samples

Study Results

In 2015, Miriam Lohr and her group from Dortmund University in Germany set out to quantify the frequency of mislabeled samples in 45 publicly available transcriptomic datasets obtained from cancer patients. They did this using sex-specific identifiers: genes that are expressed from either the X or the Y chromosome. They analyzed the expression patterns of these genes to determine whether each sample came from a male or a female patient, then cross-referenced the result against the sex recorded for that patient. Of the 4913 patients evaluated, 1.1% were “misclassified” and 3.0% were “unconfident,” meaning that the sex could not be confirmed from the transcriptomic data. At least one “misclassified” sample was detected in 18 of the 45 datasets (40%).

To demonstrate the effect these mislabeled samples could have on actual study results, Lohr et al. assessed which genes had prognostic value in the cohorts. They found that once mislabeling errors were incorporated, 12% to 53% of the genes significantly associated with patient survival were no longer significant, while another 9% to 39% of genes appeared as newly significant.[1]
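The sex check described above can be illustrated with a minimal Python sketch. It is not the method used by Lohr et al.; the article does not name the genes or thresholds involved. The marker genes (XIST as a female-expressed gene, RPS4Y1 and KDM5D as Y-linked genes), the `margin` cutoff, the function names, and the expression values are all assumptions chosen for illustration.

```python
from statistics import mean

# Assumed marker genes (not named in the article): XIST is expressed from the
# inactive X chromosome in female cells; RPS4Y1 and KDM5D are Y-linked.
FEMALE_GENES = ("XIST",)
MALE_GENES = ("RPS4Y1", "KDM5D")


def predict_sex(expression, margin=1.0):
    """Return 'F', 'M', or 'unconfident' from log2 expression values.

    `margin` is a hypothetical cutoff: the female and male marker averages
    must differ by more than this before a call is made.
    """
    female = mean(expression[g] for g in FEMALE_GENES)
    male = mean(expression[g] for g in MALE_GENES)
    diff = female - male
    if diff > margin:
        return "F"
    if diff < -margin:
        return "M"
    return "unconfident"


def check_labels(samples, margin=1.0):
    """Compare the annotated sex of each sample with the predicted sex."""
    report = {}
    for sample_id, (annotated, expression) in samples.items():
        predicted = predict_sex(expression, margin)
        if predicted == "unconfident":
            report[sample_id] = "unconfident"
        elif predicted == annotated:
            report[sample_id] = "consistent"
        else:
            report[sample_id] = "misclassified"
    return report


# Made-up log2 expression values, for illustration only.
samples = {
    "S1": ("F", {"XIST": 9.5, "RPS4Y1": 3.0, "KDM5D": 2.8}),
    "S2": ("M", {"XIST": 9.1, "RPS4Y1": 3.2, "KDM5D": 3.0}),
    "S3": ("M", {"XIST": 5.0, "RPS4Y1": 5.2, "KDM5D": 4.6}),
}
report = check_labels(samples)
```

In this toy input, S2 is annotated as male but shows a female expression pattern, so it is flagged as misclassified, while S3 has no clear signal in either direction and is left as unconfident rather than forced into a call.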

A similar study was performed in 2016 by Lilah Toker and colleagues. They applied the same approach, using sex-specific genes to identify mislabeled samples in 70 transcriptomic datasets, which included both cancer-related and non-cancer–related studies. This group confirmed Lohr’s initial findings: they discovered mislabeled samples in 46% of the datasets analyzed, with an average mismatch rate of 2%. Though the source of error was usually difficult to determine, the most common cause appeared to be samples that had been physically mixed up, rather than mistakes in recording the participants’ sex.[2]

The main point of both studies was to show how pervasive mislabeling can be in transcriptomic datasets. These mislabeled samples are a serious concern, as they can lead any research group that reuses the data to erroneous conclusions. The authors also note that while sex-specific identifiers can be used to correct mismatches, samples mixed up between patients of the same sex can’t be identified with these methods, so the true error rate is likely higher than what was reported. Altogether, the importance of appropriate labeling can’t be overstated. Every precaution should be taken to ensure that samples are labeled correctly, including making sure your labels are tailored for their environment. Barcoded labels, radio-frequency identification (RFID) labels, and laboratory information management systems (LIMS) can also help reduce errors during the processing of high-throughput data generated from transcriptomic analyses.
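To give a rough sense of why same-sex mix-ups imply a higher true error rate, here is a small back-of-the-envelope sketch in Python. It rests on a simplifying assumption that is not from either study: that mislabeling happens as random swaps between two samples, so only swaps between a male and a female sample are detectable. The function names and the even sex split are hypothetical.

```python
def cross_sex_swap_fraction(female_fraction):
    """Probability that a random pair of samples differs in sex.

    Assumes a large cohort and swaps between randomly chosen samples.
    """
    p = female_fraction
    return 2 * p * (1 - p)


def estimated_total_rate(observed_rate, female_fraction):
    """Scale the detectable mismatch rate up to an estimated total rate."""
    detectable = cross_sex_swap_fraction(female_fraction)
    if detectable == 0:
        raise ValueError("single-sex cohort: no swap can be detected by sex")
    return observed_rate / detectable


# Illustration: a 2% detectable rate in a cohort assumed to be half female.
estimate = estimated_total_rate(0.02, 0.5)
```

Under these assumptions only half of all swaps are visible, so an observed 2% mismatch rate would correspond to roughly 4% overall. In cohorts dominated by one sex, the detectable fraction shrinks further.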

LabTAG by GA International is a leading manufacturer of high-performance specialty labels and a supplier of identification solutions used in research and medical labs as well as healthcare institutions.


  1. Lohr M, Hellwig B, Edlund K, et al. Identification of sample annotation errors in gene expression datasets. Arch Toxicol. 2015;89:2265-2272.
  2. Toker L, Feng M, Pavlidis P. Whose sample is it anyway? Widespread misannotation of samples in transcriptomics studies. F1000Research. 2016;5:1-13.
Alexander Goldberg, Ph.D.
The scientific writer and social media manager at GA International. Dr. Alex Goldberg earned his Ph.D. in biology and previously worked as a post-doc in toxicology and medicine, studying chronological lifespan in yeast, anti-neoplastic small molecules, and the genetics of tuberous sclerosis complex.

