Preclinical Research: Translating Results from Mice to Humans


All clinically approved therapeutics have passed through preclinical testing at some point. Drugs must be tested in animal subjects before human trials are conducted; the key is to translate those findings into meaningful results for patients. However, the success rate for new drugs tested in clinical trials remains extremely low.1 This waste of time, money, and millions of animals has led scientists to develop a variety of programs and models to improve the translation of results from the bench to the bedside.

Internal Validity

Discussions about how to improve the bench-to-bedside pathway of preclinical research often boil down to two things: internal validity and external validity.

Internal validity refers to the basic principles of study design, including conduct, analysis, and reporting. Preclinical tests using animal models should, at the very least, include the proper controls, be replicated on independent sets of animals with enough statistical power to render results meaningful, randomly assign animals to the experimental and control groups, and assess outcomes in a blinded manner. Failure to incorporate these factors can yield biased conclusions and produce false positive or false negative results.1,2
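As a rough illustration of the randomization and blinding described above, here is a minimal Python sketch. The function name, group labels, and code format (S001, S002, ...) are hypothetical and not taken from any of the cited guidelines; it simply assumes equal group sizes and a coded label per animal so that the person scoring outcomes cannot tell which group an animal belongs to.

```python
import random


def randomize_and_blind(animal_ids, groups=("control", "treatment"), seed=None):
    """Randomly split animals into equal groups and give each a neutral code.

    Returns (allocation, blind_codes):
      allocation  maps animal id -> group name (kept by the study coordinator)
      blind_codes maps animal id -> coded label (all the outcome assessor sees)
    """
    if len(animal_ids) % len(groups) != 0:
        raise ValueError("number of animals must be divisible by number of groups")

    rng = random.Random(seed)  # a fixed seed makes the allocation reproducible

    # Randomization: shuffle the animals, then cut the shuffled list into groups.
    shuffled = list(animal_ids)
    rng.shuffle(shuffled)
    per_group = len(shuffled) // len(groups)
    allocation = {
        animal: groups[i // per_group] for i, animal in enumerate(shuffled)
    }

    # Blinding: codes are handed out in a second, independent shuffle so that
    # the code number carries no information about the assigned group.
    coded_order = list(animal_ids)
    rng.shuffle(coded_order)
    blind_codes = {
        animal: f"S{n:03d}" for n, animal in enumerate(coded_order, start=1)
    }
    return allocation, blind_codes


mice = [f"mouse-{i}" for i in range(1, 13)]
allocation, blind_codes = randomize_and_blind(mice, seed=2024)
```

In practice, the allocation table would be held by someone who does not score outcomes, and the assessor would work only from the coded labels until the analysis is locked.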

A 2019 paper by Jankovic et al found that only 7% of the published studies analyzed maintained a strong framework of internal validity. The studies came from journals with relatively high impact factors, so all had undergone a rigorous review process. The authors highlighted two major points where internal validity was missing: lack of randomization and the use of pseudoreplication. Pseudoreplication refers to treating repeated measurements from the same experimental unit (i.e., the same mouse) as independent replicates; it can lead to misleading data because the actual sample size is overestimated.2
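To make the sample-size problem behind pseudoreplication concrete, the sketch below collapses repeated readings to one value per animal before any group comparison. The data and the function name are invented for illustration; averaging per animal is one common way to handle this, not the method used in the cited paper.

```python
from collections import defaultdict
from statistics import mean


def collapse_to_animals(measurements):
    """Average repeated readings so that each animal contributes one value.

    measurements: iterable of (animal_id, value) pairs.
    """
    by_animal = defaultdict(list)
    for animal_id, value in measurements:
        by_animal[animal_id].append(value)
    return {animal_id: mean(values) for animal_id, values in by_animal.items()}


# Hypothetical data: three readings from each of two mice.
readings = [
    ("mouse-1", 1.0), ("mouse-1", 3.0), ("mouse-1", 2.0),
    ("mouse-2", 4.0), ("mouse-2", 6.0), ("mouse-2", 5.0),
]
per_animal = collapse_to_animals(readings)

# Six readings, but the experimental unit is the mouse, so n is 2, not 6.
print(len(readings), "readings from", len(per_animal), "animals")
```

Treating the six readings as six independent samples would triple the apparent sample size without adding a single independent animal.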

The lack of internal validity prompted the creation of not one but three distinct sets of guidelines: the Animal Research: Reporting of In Vivo Experiments (ARRIVE) guidelines in 2010, the European Quality in Preclinical Data (EQIPD) initiative in 2017, and the Planning Research and Experimental Procedures on Animals: Recommendations for Excellence (PREPARE) guidelines in 2018.3,4 High-impact journals like Nature, Science, and Stroke, as well as the National Institutes of Health (NIH), have also implemented programs that aim to improve scientific rigor and reporting.

In fact, Nature and Stroke now have mandatory checklists requiring authors to disclose all study design elements relevant to the article. Ramirez et al measured the effect these initiatives had on the quality of preclinical studies accepted by these journals. Though the initiatives appeared to work to a certain degree, problems remain: around 50% of the studies did not use randomization or blinding, and a staggering 75% had insufficient statistical power to confirm their hypotheses.5
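For a sense of what sufficient statistical power means in terms of animal numbers, here is a minimal sketch using the standard normal-approximation formula for comparing two group means, n per group = 2(z_(1-alpha/2) + z_power)^2 / d^2, where d is the standardized effect size. The function name and default values are assumptions chosen for illustration; an exact t-test calculation gives slightly larger numbers.

```python
from math import ceil
from statistics import NormalDist


def animals_per_group(effect_size, alpha=0.05, power=0.80):
    """Approximate animals needed per group for a two-sided, two-group comparison.

    effect_size is the standardized difference between group means (Cohen's d).
    Uses the normal approximation, so it slightly underestimates the t-test answer.
    """
    if effect_size <= 0:
        raise ValueError("effect_size must be positive")
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # critical value for the test
    z_power = NormalDist().inv_cdf(power)          # quantile for the desired power
    return ceil(2 * (z_alpha + z_power) ** 2 / effect_size ** 2)


print(animals_per_group(1.0))  # large effect: 16 per group
print(animals_per_group(0.5))  # medium effect: 63 per group
```

Halving the expected effect size roughly quadruples the number of animals required, which is one reason small preclinical studies are so often underpowered.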

The use of toxicologic pathologists with a background in veterinary medicine could also help with standardizing protocols and reporting. Often, these specialists have a wealth of experience working with different animal models and are adept at handling subjective, quantitative data, which would help reduce errors when it comes to analyzing pathological findings.6


External Validity

Some feel that a significant improvement in internal validity would raise the rate of clinically successful therapies, but it takes more than strong internal validity to achieve translatable research. In 1999, the Stroke Foundation adopted more stringent standards for preclinical studies; by 2012, the rates of translation had not increased, with only one therapy out of 1000 seeing modest clinical success.1

To boost the rates of successful trials, external validity, which refers to how applicable a study is to other relevant settings, must also improve. While external validity is a crucial aspect of any preclinical study, it's also the hardest to account for, as scientists often don't have access to the tools they need to improve it, such as animal models that fully mimic the characteristics of a given disease.1

Three facets account for the total strength of external validity: face validity, construct validity, and predictive validity.4

  • Face validity refers to how closely the animal model recapitulates the symptoms of the disease.
  • Construct validity refers to the underlying cause of the disease and how similar it is between the animal model and the disease in humans.
  • Predictive validity refers to how closely the animal response to standard-of-care drugs mimics the response in humans.

Unfortunately, there is no standardized method of ensuring all three criteria are met. This led one group from Utrecht University and the Utrecht Medicines Evaluation Board to devise a framework for assessing the validity of preclinical models. Their framework, titled Framework to Identify Models of Disease (FIMD), uses a weighted score of 100 points distributed across 8 domains: epidemiology, symptomatology and natural history, genetics, biochemistry, etiology, histology, pharmacology, and endpoints. Ultimately, the FIMD is supposed to indicate how close a given preclinical model is to humans. A high score is not necessarily needed either, as the best model might be one that closely mimics only the pathways related to the activity of the drug and its mechanism.4
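The idea of a weighted 100-point score across domains can be sketched as follows. This is only an illustration of the arithmetic: the equal weighting of 12.5 points per domain, the 0-to-1 fraction for each domain, and the function name are all assumptions made here, and the actual FIMD questionnaire and scoring rules should be taken from the original publication.

```python
DOMAINS = (
    "epidemiology",
    "symptomatology and natural history",
    "genetics",
    "biochemistry",
    "etiology",
    "histology",
    "pharmacology",
    "endpoints",
)


def weighted_model_score(domain_fractions, weights=None):
    """Combine per-domain similarity fractions (0 to 1) into a 0-100 score.

    weights maps each domain to its share of the 100 points. The default of
    equal shares (12.5 points each) is an assumption made for this sketch.
    """
    if weights is None:
        weights = {domain: 100 / len(DOMAINS) for domain in DOMAINS}
    if abs(sum(weights.values()) - 100) > 1e-9:
        raise ValueError("weights must sum to 100")
    for domain in DOMAINS:
        if not 0 <= domain_fractions[domain] <= 1:
            raise ValueError(f"fraction for {domain!r} must be between 0 and 1")
    return sum(weights[d] * domain_fractions[d] for d in DOMAINS)


# Hypothetical model that matches human pharmacology well but little else.
example = {domain: 0.25 for domain in DOMAINS}
example["pharmacology"] = 1.0
print(weighted_model_score(example))
```

A model like the hypothetical one above would score low overall, yet it could still be the right choice for a drug whose mechanism depends mainly on the well-matched domain, which is the point made in the paragraph above.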

Introducing systematic heterogeneity has also been proposed as a way to increase external validity. Preclinical experimentation is often performed in a fairly homogeneous manner; for example, testing is usually performed on mice of the same sex, age, and genetic background. While this may allow a statistically significant result with as few test subjects as possible, it doesn't reflect the reality of human populations, which are made up of individuals from various backgrounds, both genetic and environmental. Using a heterogeneous population of subjects combined with the right analytical techniques might make results more applicable regardless of the animal's (or human's) characteristics.7
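One way a deliberately mixed cohort could be handled is a randomized block design, in which animals are grouped by characteristics such as sex and age and treatments are randomized within each block. The sketch below is a minimal, assumed implementation; the field names (id, sex, age_weeks) and the function name are hypothetical and not drawn from the cited work.

```python
import random
from collections import defaultdict


def blocked_allocation(animals, block_keys=("sex", "age_weeks"),
                       groups=("control", "treatment"), seed=None):
    """Randomize treatments within blocks of animals that share characteristics.

    animals: list of dicts, each with an "id" plus the fields in block_keys.
    Returns a dict mapping animal id -> group name.
    """
    rng = random.Random(seed)

    # Collect the animals into blocks, e.g. ("F", 8) or ("M", 52).
    blocks = defaultdict(list)
    for animal in animals:
        blocks[tuple(animal[key] for key in block_keys)].append(animal["id"])

    allocation = {}
    for block in sorted(blocks):
        ids = blocks[block]
        rng.shuffle(ids)  # random order within the block
        for i, animal_id in enumerate(ids):
            # Cycle through the groups so each block is split as evenly as possible.
            allocation[animal_id] = groups[i % len(groups)]
    return allocation


cohort = [
    {"id": f"mouse-{n}", "sex": sex, "age_weeks": age}
    for n, (sex, age) in enumerate(
        [("F", 8), ("F", 8), ("F", 52), ("F", 52),
         ("M", 8), ("M", 8), ("M", 52), ("M", 52)], start=1)
]
allocation = blocked_allocation(cohort, seed=7)
```

Because every block contributes animals to both groups, sex and age cannot be confounded with treatment, and the block factors can be included in the statistical model during analysis.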


Translating Results – Humanized Models of Disease

Though all the above may ultimately shore up many of the shortcomings facing preclinical research, the primary problem with using animal models remains the same: they just aren’t human.1 Scientists, fully aware that they need better models to avoid repeated clinical trial failures, have spent a lot of time and effort to develop humanized models, which incorporate human tissues into animal models.

These models are especially prevalent for immunological diseases, starting in the 1980s with immunodeficient mice engrafted with human hematopoietic stem cells and peripheral blood mononuclear cells. They have improved over the years, with some capable of developing a human thymus organoid and generating strong peripheral immune responses.8 Humanized models have also been useful for testing cancer immunotherapies, specifically chimeric antigen receptor (CAR) T cell therapies, where patient T cells are extracted and engineered to attack cancer cells, and immune checkpoint inhibitors, which target proteins such as programmed cell death-1 (PD-1) and programmed cell death ligand-1 (PD-L1).9 Both types of therapy have been approved by the FDA in one form or another, with a variety of PD-1/PD-L1 inhibitors available for many different tumor types and a handful of CAR T cell therapies approved mainly for hematological cancers.

Alzheimer’s disease (AD) research is an area that could potentially benefit most from more humanized models of disease. Hundreds of compounds have been clinically tested for their ability to stop neuronal degeneration in patients with AD-related dementia, and none have worked. The problem likely stems from the fact that while it’s extremely difficult to study AD pathophysiology in humans, mouse models aren’t yet designed to answer the question of what exactly causes the disease, and most don’t fully reflect how long it takes to develop signs and symptoms of dementia.10 Here, induced pluripotent stem cell (iPSC) models can be especially powerful, as cells derived from patients with AD can be reprogrammed into stem cells, then re-differentiated to produce 3D co-culture models that show pathological characteristics similar to AD when aged over just a few months, including amyloid beta and tau aggregation.11 Though this technology is in its infancy, these types of models give preclinical researchers hope that their data can be more meaningful when it comes time to test therapeutics in clinical trials.



  1. Pound P, Ritskes-Hoitinga M. Is it possible to overcome issues of external validity in preclinical animal research? Why most animal models are bound to fail. J Transl Med. 2018;16(1):1-8.
  2. Jankovic SM, Kapo B, Sukalo A, Masic I. Evaluation of Published Preclinical Experimental Studies in Medicine: Methodology Issues. Med Arch (Sarajevo, Bosnia Herzegovina). 2019;73(5):298-302.
  3. Veening-Griffioen DH, Ferreira GS, van Meer PJK, et al. Are some animal models more equal than others? A case study on the translational value of animal models of efficacy for Alzheimer’s disease. Eur J Pharmacol. 2019;859:1-12.
  4. Ferreira G, Veening-Griffioen D, Boon W, et al. A standardised framework to identify optimal animal models for efficacy assessment in drug development. bioRxiv. 2018;14(6):1-17.
  5. Ramirez FD, Jung RG, Motazedian P, et al. Journal Initiatives to Enhance Preclinical Research: Analyses of Stroke, Nature Medicine, Science Translational Medicine. Stroke. 2020;51(1):291-299.
  6. Everitt JI. The Future of Preclinical Animal Models in Pharmaceutical Discovery and Development: A Need to Bring In Cerebro to the In Vivo Discussions. Toxicol Pathol. 2015;43(1):70-77.
  7. Richter H. Systematic heterogenization for better reproducibility in animal experimentation. Lab Anim (NY). 2017;46:343-349.
  8. Shultz LD, Keck J, Burzenski L, et al. Humanized mouse models of immunological diseases and precision medicine. Mamm Genome. 2019;30(5-6):123-142.
  9. Carrillo MA, Zhen A, Kitchen SG. The use of the humanized mouse model in gene therapy and immunotherapy for HIV and cancer. Front Immunol. 2018;9:1-8.
  10. Mullane K, Williams M. Preclinical Models of Alzheimer’s Disease: Relevance and Translational Validity. Curr Protoc Pharmacol. 2019;84(1):1-28.
  11. Penney J, Ralvenius WT, Tsai LH. Modeling Alzheimer’s disease with iPSC-derived brain cells. Mol Psychiatry. 2020;25:148-167.

