UCLA Health researchers identified 376 population clusters based on shared genetic ancestry using data from nearly 36,000 patients enrolled in the UCLA ATLAS Precision Health Biobank. The ATLAS biobank, one of the world’s most diverse genetic repositories, links de-identified electronic health records from consenting UCLA Health patients with information about their genetic makeup obtained from donated biological samples. Using a machine-learning algorithm, researchers were able to identify groups linked by common genetic ancestry, including some that aren’t commonly studied in medicine, such as Iranian Jews and Lebanese Christians.
Researchers found substantial differences in the rates of diagnoses, hospital utilization, and genetic disease among the clusters. The findings underscore the limitations of the health care system’s frequent reliance on broad self-reported race and ethnicity data to assess patients’ risk of developing disease, and they support expanding genetic screening to more groups, the researchers said.
Notably, however, shared genetic ancestry is just one factor to consider when trying to understand a population’s disease risk. While people with shared genetic ancestry may share genetic risk for disease, they may also share an environment – including structural factors like discrimination – that could influence their disease risk and how they interact with the health care system.
“The combination of your genetic risk and your environmental risk are the two most important things in determining whether you get a disease. It’s best for your doctor to have the best understanding of exactly what populations you might be coming from in order to assess things like disease risk or the need for genetic testing,” said lead author Christa Caggiano, a PhD student in the lab of Noah Zaitlen, a professor of computational medicine and neurology at UCLA Health who is also the study’s corresponding author.
Included in the researchers’ findings:
- There was a great amount of genetic diversity among Mexican and Central American patients, which was partly explained by individual patients having different indigenous groups as ancestors. Researchers identified several subclusters among these patients, which correlated with geographic regions of Mexico and Guatemala. Only the Guatemalan subcluster was associated with pregnancy complications, while the Central Mexican subcluster was more prone to nutritional deficiencies. The finding provides further evidence that the health system’s usual practice of grouping patients as Hispanic or Latino is overly broad.
- Researchers also examined specific genetic mutations associated with diseases. While sickle cell disease is widely known to be associated with African ancestry, researchers also found genetic variants implicated in sickle cell disease and thalassemia, another genetic blood disorder, that were at higher frequencies in the Chinese ancestry cluster. Although previous research has indicated this cluster is at higher risk of such blood diseases, the new finding reinforces how this group may benefit from genetic screening that isn’t widely available.
- Other findings revealed how communities across the region access care. For instance, researchers found that people of Iranian Jewish descent were diagnosed at higher rates with adjustment disorder, a psychiatric disorder not commonly coded for in medical records. Digging deeper, the researchers found the diagnoses were connected with a single medical center, indicating how strongly social forces affect care.
The researchers also created a website, ibd.la, to allow people to search through the data themselves. Caggiano said the website could be of interest to a patient curious about disease risk in their ethnic group or to a researcher working with a specific community.
Other authors include Joel Mefford, Ella Petter, Alec Chiu, Defne Ercelen, Rosemary He, Daniel Tward, Kimberly C. Paul, Timothy S. Chang, Bogdan Pasaniuc, Eimear E. Kenny, Brunilda Balliu, Valerie A. Arboleda and Gillian Belbin, all of UCLA; Arya Boudaie, Ruhollah Shemirani, Jonathan A. Shortt and Christopher R. Gignoux.
C.C. was supported by NIH grant F31NS122538. C.C., N.Z., D.E., and E.P. were supported by NIH grants R01CA227237, R01ES029929, R01MH122688, U01HG009080, R01HL155024, R01HL151152, R01GM142112 and R01HG006399. C.R.G. is supported by NIH grants R01HL151152 and R01HG010297. J.A.S. and C.R.G. are supported by NIH grant U01HG011715. N.Z., E.K., C.G., V.A., and G.B. were supported by NIH grant R01HG011345. A.C. was supported by NIH grant T32HG002536 and NSF DGE-1829071. V.A. was supported by NIH grant DP5OD024579.
C.R.G. owns stock in 23andMe, Inc. E.E.K. has received personal fees from Regeneron Pharmaceuticals, 23andMe, Allelica, and Illumina; has received research funding from Allelica; and serves on the advisory boards for Encompass Biosciences, Overtone, and Galateo Bio. All other authors declare no competing interests.