Genome-wide association studies (GWAS) have been a major advancement in the field of genomic medicine, allowing researchers to identify genetic variations associated with specific diseases and conditions, upon which they can build targeted therapies. However, there is a significant problem with the representation of minority populations in these studies. Even though genetics plays a role in the health and well-being of all individuals, regardless of race or ethnicity, the overwhelming majority of GWAS participants are white and of European ancestry.

According to a 2016 study, 81% of all GWAS samples came from individuals of European ancestry [1]. This number is certainly still an improvement from 2009 when 96% of all samples were from people of European ancestry, but the fact remains that minority populations are being left behind and underserved in terms of genomic sampling. The lack of diversity in GWAS samples is detrimental to the development of precision medicine, which relies on a diversity of genetic data to tailor treatments and therapies to individuals. Without data on the genetic variations that may be specific to certain populations, we risk missing opportunities to address the unique health needs of these communities.

The lack of representation of minority populations in GWAS can be attributed to a number of factors, including historical and systemic racism, socioeconomic disparities, and a lack of trust in the medical system among minority communities. Without a diverse range of participants in GWAS, genomic medicine will only be able to benefit a historically privileged few who are often already at an advantage in terms of access to healthcare. This is a major concern as different populations are often disproportionately affected by certain diseases and conditions, and may have unique genetic variations that would not be identified or served by the current forms of diagnostic tests, genomic therapies, or whole exome sequencing.



Why the disparity?

Various logistical and systemic factors create the disparity of representation in genomics studies. One of the primary logistical factors is that homogeneous samples are often easier for researchers to acquire and analyze. Researchers frequently use existing cohorts or other large data sets drawn from people in the same location, which simplifies error analysis because participants share many of the same environmental factors [2]. This approach typically overlooks minority individuals who might not have access to the medical centers where the data are collected.

In addition to the perceived ease of access, geneticists also seem to be preferentially using cohorts of European ancestry, despite datasets from diverse populations being available [3]. This may be due to the difficulties of getting certain kinds of studies funded, a preference for larger sample sizes, a perception that the analysis will be simplified by using data from one ancestry group, or simply a lack of awareness of the diversity of data sets available. For example, the Database of Genotypes and Phenotypes (dbGaP) is a public database of genotypes and phenotypes with a diversity of populations sampled, but it is not well-utilized.

In addition, those who are working to increase diversity in GWAS samples often face systemic disadvantages. In the grant application process, genetic analyses in minority populations in the United States are sometimes criticized by National Institutes of Health (NIH) reviewers, who consider these populations more difficult to analyze than more genetically homogeneous European populations [3]. Some NIH reviewers also see diverse genetic ancestry largely as a potential confounder and do not appreciate that it can be leveraged to reveal new risk factors. Similarly, publishing GWAS results that include minority populations is difficult. Most high-impact journals require that an association be found in samples from two independently recruited studies. That demand is straightforward to meet in European populations because many samples exist, but it is much harder to meet for other groups.


Furthermore, these disparities are self-perpetuating. Minority scientists are often best placed to gain community “buy-in” and trust in minority populations, but these scientists are at a disadvantage in the field overall. According to one analysis, black applicants in the United States were about 13 percentage points less likely to receive NIH research funding than white applicants [4], further exacerbating the lack of representation of minority populations in GWAS.

Corrective Steps

Because precision medicine relies on diverse genetic data to tailor treatments to individuals, steps must be taken to increase the diversity of GWAS samples and ensure that all individuals have access to the benefits of genomic medicine. These steps include increasing outreach and education to minority communities, addressing structural barriers to participation, and fostering a culture of trust and inclusivity in the medical research system.

Dr. Anil Shanker, an immunologist, professor, and Senior Vice President for Research and Innovation at Meharry Medical College, stresses the importance of community outreach in bridging the gap in representation. According to him, public-private partnerships will play a crucial role in increasing diversity in GWAS samples. Through these partnerships, community leaders and organizations can serve as trusted intermediaries who help overcome the cultural or structural barriers that may prevent individuals from participating in these studies.

In addition, Dr. Shanker emphasizes the need for funding and resources to support the efforts to increase diversity in GWAS samples. He points out that historically black colleges and universities (HBCUs) and black researchers often have fewer resources and less funding compared to other universities that have historically received support from large endowment funds. This disparity in funding can make it difficult for HBCUs and black researchers to participate in and contribute to genetic research and precision medicine initiatives.


To address this issue, Dr. Shanker calls for increased investment in HBCUs and black researchers to support their efforts to bridge the gap in representation and ensure that all communities have equal access to the benefits of genomic medicine.

Many research organizations are now working on both the funding barrier and the need for more diverse genomic data. The NIH is overhauling its review process to reduce bias, preventing reviewers from rating researchers’ expertise or their institutions’ access to resources [5]. In addition, the NIH’s All of Us Research Program, which released nearly 100,000 whole genome sequences in March 2022, aims to be more representative of the US population and includes genomic sequencing, array data, and linked electronic health and survey data. Nearly half of the data come from individuals in underrepresented racial or ethnic groups, and the program now sits alongside other large-scale genomic research efforts, including the UK Biobank, the Million Veteran Program, and the TOPMed program.

The National Human Genome Research Institute (NHGRI) and the National Institute on Minority Health and Health Disparities (NIMHD) are also currently working on projects to increase minority sampling in genomics studies and to facilitate minority participation in large-scale projects. The Texome project, funded by the NHGRI, is a program aimed at addressing disparities in genomic medicine between racial and ethnic groups in Texas. Over four years, the initiative will provide free whole exome sequencing to 400 patients with undiagnosed disorders and offer them genetic counseling resources. The project also aims to uncover the barriers preventing low-resource communities from accessing genomic medicine by conducting longitudinal follow-up studies with participants.

Overall, GWAS is a powerful tool with the potential to change the way we understand and treat diseases and conditions. However, this potential can only be fully realized if all populations are represented in these studies. We must act to increase the diversity of GWAS samples, improve precision medicine, and ensure that all individuals have access to the benefits of genomic medicine.


  1. Popejoy, A. B., & Fullerton, S. M. (2016). Genomics is failing on diversity. Nature, 538(7624), 161–164. https://doi.org/10.1038/538161a
  2. Peterson, R. E., Kuchenbaecker, K., Walters, R. K., Chen, C. Y., Popejoy, A. B., Periyasamy, S., Lam, M., Iyegbe, C., Strawbridge, R. J., Brick, L., Carey, C. E., Martin, A. R., Meyers, J. L., Su, J., Chen, J., Edwards, A. C., Kalungi, A., Koen, N., Majara, L., Schwarz, E., … Duncan, L. E. (2019). Genome-wide association studies in ancestrally diverse populations: Opportunities, methods, pitfalls, and recommendations. Cell, 179(3), 589–603. https://doi.org/10.1016/j.cell.2019.08.051
  3. Knerr, S., Wayman, D., & Bonham, V. L. (2011). Inclusion of racial and ethnic minorities in genetic research: Advance the spirit by changing the rules? The Journal of Law, Medicine & Ethics, 39(3), 502–512. https://doi.org/10.1111/j.1748-720X.2011.00617.x
  4. Ginther, D. K., Schaffer, W. T., Schnell, J., Masimore, B., Liu, F., Haak, L. L., & Kington, R. (2011). Race, ethnicity, and NIH Research awards. Science, 333(6045), 1015–1019. https://doi.org/10.1126/science.1196783
  5. Kozlov, M. (2022, December 9). NIH plans grant-review overhaul to reduce bias. Nature News. Retrieved February 8, 2023, from https://www.nature.com/articles/d41586-022-04385-x