Genomic platforms and databases have become integral parts of genome informatics and have grown exponentially in the post-genomic era. A variety of platforms now exist, in varying sizes, collecting a range of information to serve general medical diagnostic needs or more specific applications. These resources organize our knowledge of the genetic etiology of human disorders and catalog numerous genomic variants, so that this information can eventually be useful not only for molecular diagnosis but also for clinicians and researchers advancing translational medicine.
Genomic Platforms and Datasets
Genomic databases allow genome sequence data to be stored, shared, and compared across research studies, data types, individuals, and even organisms. The earliest genomic databases were created to provide broad access to DNA sequencing data of living organisms without any restrictions on the use or distribution of the data. Some of these initial databases are still in operation, available at no cost, and contain sequences for more than 300,000 living organisms collected from an array of sources. A vital feature of any genomic platform is the extent to which it facilitates data sharing between different research groups. This widespread availability of genomic data to the scientific community allows each genomic database to support a broad range of biomedical research activities, permitting research groups with different interests to reapply evolving computational techniques and approaches over time. The scope of genomic databases can, however, be highly variable: some contain previously collected data derived from residual clinical or research samples, while others are created with newly collected data from individuals, families, or sometimes thousands of individuals from the general population.
The recent proliferation of large-scale genomic platforms is tied primarily to the growing interest in population-based research following the publication of the first complete human genome sequence. Studies using these databases rely on the availability of large quantities of data in order to draw accurate conclusions regarding health status and disease outcomes. As such, automated molecular analysis and bioinformatics tools are necessary for the rapid mass analysis required for such studies to help elucidate the role of genomic, genetic, and environmental factors in determining a patient’s susceptibility to both common and complex diseases. The ultimate goal is to obtain an improved understanding of how individuals will respond to various treatments for common diseases and potentially develop new therapeutic interventions as well.
Increasing Genomic Diversity with New Datasets
Unfortunately, the majority of the genomic data collected in existing large databases comes from participants of European descent. This lack of population diversity has hindered scientific discovery, and these platforms need to be updated with genetic data from a more diverse set of participants. Ensuring that these databases represent genomic information equitably will make them an even more indispensable resource for health and medical research. More diverse genomic platforms should help address questions about health and disease that have remained unanswered, enabling more precise approaches to health care for all populations and reducing persistent health disparities.
The National Institutes of Health (NIH) has recently established a new platform, the “All of Us” database, to help overcome this issue. Around 50% of its data comes from individuals who identify with racial or ethnic groups that have historically been underrepresented in research. Available via a cloud-based platform, the database provides information about almost all of an individual’s genetic makeup and also uses genotyping arrays to capture a specific subset of the genome. In addition, the database contains further medical information about the participants that may help researchers better understand how genes can cause or influence diseases in the context of other health determinants. The program also offers participants the opportunity to receive their individual DNA results at no cost, including genetic ancestry data and trait results.
Advancements in Translational Medicine
These new genomic platforms are being used together with clinical data to advance the field of personalized medicine. The aim is to link biomolecular characterizations, patient conditions, and treatment effectiveness so that each patient receives the best possible individual treatment. This has been made possible not only by advancements in high-throughput genome sequencing but also by analyses of the transcriptome, epigenome, and proteome. Comprehensive population-based databases that contain biomarker data along with patient medical history and lifestyle information are potent tools for translational medicine: they help clarify the links between the genetic and environmental factors responsible for disease, and they support estimates of the allele frequency of gene variants in different ethnic groups. Human genomic databases aim to facilitate diagnosis at the DNA level and to correlate genomic variants with specific phenotypic patterns and clinical features. Locus-specific and national/ethnic genetic databases are the main types of human genomic databases frequently used in genomic and translational medicine.
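To make the idea of estimating allele frequency per group concrete, the following is a minimal Python sketch. It assumes a single biallelic variant, diploid genotypes coded as the number of alternate alleles (0, 1, or 2), and entirely hypothetical group labels and sample data; it does not reflect the schema or tooling of any particular database.

```python
from collections import defaultdict


def allele_frequencies(genotypes):
    """Estimate the alternate-allele frequency of one variant per group.

    `genotypes` is an iterable of (group, alt_allele_count) pairs, where
    alt_allele_count is 0, 1, or 2 for a diploid individual (an assumption
    made for this illustration).
    """
    alt_alleles = defaultdict(int)    # alternate alleles observed per group
    total_alleles = defaultdict(int)  # all alleles observed per group
    for group, alt_count in genotypes:
        if alt_count not in (0, 1, 2):
            raise ValueError(f"unexpected genotype code: {alt_count!r}")
        alt_alleles[group] += alt_count
        total_alleles[group] += 2     # each diploid individual carries two alleles
    return {g: alt_alleles[g] / total_alleles[g] for g in total_alleles}


# Hypothetical genotype calls for one variant in two groups.
sample = [
    ("group_a", 0), ("group_a", 1), ("group_a", 2), ("group_a", 1),
    ("group_b", 0), ("group_b", 0), ("group_b", 1), ("group_b", 0),
]
freqs = allele_frequencies(sample)
print(freqs)
```

With only four individuals per group, such estimates would be far too noisy for real use, which is one reason large and diverse datasets matter for this kind of analysis.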
With the growing number of databases, as well as the increasing amount of information being collected, researchers and clinicians require tools to find the relevant information in the maze of biological data available. Several systems have been developed for this purpose, but genomic data must be analyzed together with available clinical data to be useful for translational medicine and beneficial to patients. Most of these new systems do not yet provide a solution for integrating these two data sets. Conversely, clinical data warehouses (CDWs) can now largely incorporate data from various clinical sources, presenting a unified view. As such, they allow populations with common characteristics to be identified and significant associations among phenotypes to be discovered.
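As a rough illustration of what integrating the two data sets involves, the Python sketch below joins clinical records with variant-carrier status on a shared patient identifier and then compares how often a phenotype appears among carriers and non-carriers. All field names, identifiers, and records here are hypothetical, and the comparison is a simple proportion rather than a statistical test.

```python
def integrate(clinical, carriers):
    """Join clinical records with carrier status on patient ID.

    Only patients present in both sources are kept (an inner join).
    """
    merged = {}
    for patient_id, record in clinical.items():
        if patient_id in carriers:
            merged[patient_id] = {**record, "carrier": carriers[patient_id]}
    return merged


def phenotype_rate(merged, phenotype, carrier):
    """Share of the carrier (or non-carrier) cohort with a given diagnosis."""
    cohort = [r for r in merged.values() if r["carrier"] == carrier]
    if not cohort:
        return None  # no patients in this cohort, so no rate can be computed
    return sum(phenotype in r["diagnoses"] for r in cohort) / len(cohort)


# Hypothetical clinical records keyed by patient ID.
clinical = {
    "p1": {"age": 54, "diagnoses": {"type 2 diabetes"}},
    "p2": {"age": 61, "diagnoses": {"type 2 diabetes", "hypertension"}},
    "p3": {"age": 47, "diagnoses": {"type 2 diabetes"}},
    "p4": {"age": 39, "diagnoses": {"hypertension"}},
    "p5": {"age": 70, "diagnoses": set()},  # no genomic data available
}
# Hypothetical carrier status for one variant, keyed by the same IDs.
carriers = {"p1": True, "p2": True, "p3": False, "p4": False, "p6": True}

merged = integrate(clinical, carriers)
rate_carriers = phenotype_rate(merged, "type 2 diabetes", carrier=True)
rate_non_carriers = phenotype_rate(merged, "type 2 diabetes", carrier=False)
print(sorted(merged), rate_carriers, rate_non_carriers)
```

The inner join silently drops patients who appear in only one source (p5 and p6 here), so a real system would need to track and report that loss.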
As diseases are increasingly classified by their genetic makeup, larger datasets will be required to develop accurate diagnoses and treatments. Systems that can integrate genome sequence and clinical data from individual patients could therefore help drive a more precise classification of disease and ultimately lead to more precise personalized treatments. However, this also requires that these large genomic databases be appropriately maintained, with researchers and clinicians having unrestricted access to the recorded patient genomic data.
LabTAG by GA International is a leading manufacturer of high-performance specialty labels and a supplier of identification solutions used in research and medical labs as well as healthcare institutions.