Geography of Ancestry


What’s your geographic ancestral origin? A team of researchers from Tel Aviv University (TAU) and the University of California, Los Angeles (UCLA) has created a method for more precisely pinpointing the geographic origin of a person’s ancestry by developing an understanding of the spatial diversity of genes.

According to the Human Genome Project, single nucleotide polymorphisms, or SNPs, are “DNA sequence variations that occur when a single nucleotide (A,T,C,or G) in the genome sequence is altered.” Such a mutation can be linked to a geographic location once it has been passed on to a larger population. Prof. Eran Halperin of TAU’s Blavatnik School of Computer Science and Department of Molecular Microbiology and Biotechnology explains:

“We wanted to ask, for example, about the probability of having the genetic mutation ‘A’ in a particular position on the genome based on geographical coordinates. When you look at many of these positions together in a bigger picture, it’s possible to group populations with the same mutation by point of origin.”
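The idea in this quote can be sketched in a few lines of Python. The snippet below is only an illustration, not the researchers' code: it assumes that the frequency of an allele follows a logistic gradient over longitude and latitude, and the `slope` and `intercept` values are made up for a hypothetical SNP that becomes more common toward the north.

```python
import math

def allele_frequency(lon, lat, slope, intercept):
    """Probability of carrying the allele at (lon, lat).

    Assumes a logistic gradient over geographic coordinates; the slope and
    intercept are illustrative parameters, not values from the study.
    """
    a_lon, a_lat = slope
    z = a_lon * lon + a_lat * lat + intercept
    return 1.0 / (1.0 + math.exp(-z))

# Hypothetical SNP: no east-west trend, frequency rises with latitude.
slope = (0.0, 0.15)
intercept = -7.0

cities = [("Rome", 12.5, 41.9), ("Paris", 2.35, 48.86), ("Stockholm", 18.07, 59.33)]
for name, lon, lat in cities:
    print(name, round(allele_frequency(lon, lat, slope, intercept), 3))
```

Under these made-up parameters the allele is least common in Rome and most common in Stockholm, with the frequency changing smoothly in between rather than jumping from one population to the next.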

The DNA of 1,157 individuals from across Europe was analyzed for the research.  From UCLA’s press release:

“If we know from where each individual in our study originated, what we observe is that some variation is more common in one part of the world and less common in another part of the world,” said Eleazar Eskin, an associate professor of computer science at UCLA Engineering. “How common these variants are in a specific location changes gradually as the location changes.

“In this study, we think of the frequency of variation as being defined by a specific location. This gives us a different way to think about populations, which are usually thought of as being discrete. Instead, we think about the variant frequencies changing in different locations. If you think about a person’s ancestry, it is no longer about being from a specific population — but instead, each person’s ancestry is defined by the location they’re from. Now ancestry is a continuum.”

Using a newly designed probabilistic model of SNPs geocoded by place of origin, the study was able to identify the location of a person’s ancestry for both the maternal and paternal lineages, pinpointing ancestry more accurately. Previous methods were unable to isolate the ancestral origin, according to UCLA’s John Novembre, an assistant professor in the department of ecology and evolution. The researchers call the method and the accompanying tool spatial ancestry analysis (SPA) and describe it as a “model-based approach for analysis of spatial structure in genetic data.”

SPA Genetic Mapping: Model-based mapping convergence with random initialization. Colors represent the true country of origin of the individual (also represented by country internet code). (a–d) A map generated by SPA. Iteration 1 starts with random positioning of individuals (a). By iteration 4, the northern and southern populations are separated (b). By iteration 7, the positioning of individuals is close to convergence (c). In iteration 10, individuals have reached their final positions (d). (e) A map generated by PCA. (f) Map of Europe.

The research enables the identification of the geographic origins of an individual based on their genes:

“If the location of an individual is unknown, our model can actually infer geographic origins for each individual using only their genetic data with surprising accuracy,” said Wen-Yun Yang, a UCLA computer science graduate student.
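A toy version of this inference is sketched below. It is not the SPA tool's algorithm: it assumes each SNP's allele frequency follows a logistic gradient over normalized map coordinates in [-1, 1], treats SNPs as independent, and uses a plain grid search in place of a real optimizer. The SNP parameters and genotypes are hypothetical.

```python
import math

def allele_frequency(x, y, snp):
    """Allele frequency at map position (x, y) for one SNP.

    snp is a hypothetical (a_x, a_y, b) triple describing a logistic gradient.
    """
    a_x, a_y, b = snp
    return 1.0 / (1.0 + math.exp(-(a_x * x + a_y * y + b)))

def log_likelihood(x, y, genotypes, snps):
    """Log-likelihood of the genotypes if the individual comes from (x, y).

    Each genotype is the number of allele copies (0, 1, or 2), modeled as
    Binomial(2, f) with f the local allele frequency; SNPs are assumed
    independent.
    """
    total = 0.0
    for g, snp in zip(genotypes, snps):
        f = allele_frequency(x, y, snp)
        f = min(max(f, 1e-9), 1.0 - 1e-9)  # guard against log(0)
        total += math.log(math.comb(2, g)) + g * math.log(f) + (2 - g) * math.log(1.0 - f)
    return total

def infer_location(genotypes, snps, grid_size=21):
    """Return the grid point in [-1, 1] x [-1, 1] with the highest likelihood."""
    best = None
    for i in range(grid_size):
        for j in range(grid_size):
            x = -1.0 + 2.0 * i / (grid_size - 1)
            y = -1.0 + 2.0 * j / (grid_size - 1)
            ll = log_likelihood(x, y, genotypes, snps)
            if best is None or ll > best[0]:
                best = (ll, x, y)
    return best[1], best[2]

# Two hypothetical SNPs: one more common toward the east, one toward the north.
snps = [(3.0, 0.0, 0.0), (0.0, 3.0, 0.0)]

# Two copies of the "eastern" allele and none of the "northern" one.
print(infer_location((2, 0), snps))
```

With only two SNPs the estimate simply lands in the south-east corner of the grid. A realistic analysis would combine a very large number of SNPs, which is what lets the likelihood single out a specific location.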

The results of the research have been published in the journal Nature Genetics. The SPA software tool is a command-line program written in C; compiled versions for Mac, PC, and Linux operating systems are available from the SPA download page.

Beyond locating individuals, this research could help build a better understanding of the geographic origins of human populations as well as their migration history. The same approach can also be applied to understand animal migration.

Spatial Ancestry Analysis results. Different colors represent different continents.
