Sunday, August 18, 2019

The Limits of Triangulation

I recently wrote about my difficulties with triangulating DNA cousin matches (see The Trials and Tribulations of Triangulation). Among my challenges were that:
  • Most of my DNA matches tested with AncestryDNA, which does not presently give access to the segment data necessary for triangulation.
  • Encouraging these cousins to transfer their data to a third-party site that does provide access to segment data is sloooow going. 
I really stepped in it, though, when I said that triangulation was the gold standard for identifying a common ancestor among genetic cousins.

While these challenges are real and resonated with many, genetic genealogists were quick to flag that triangulation “was never the gold standard.”

I’ve spent the last couple weeks boning up on the shortcomings of triangulation to better understand its limits.

Let’s look at those limits in the context of the research question I am investigating.

An 18th Century Siblingship Hypothesis


I believe my fifth great-grandfather Thomas Kirk (1778-1846) was the younger brother of his neighbor Mary (Kirk) Geiger (1774-1832).

From beginning to end, their lives followed similar geographical trajectories:
  • Both were born in Virginia
  • Both settled in the southern part of Licking County shortly after Ohio gained statehood 
  • Both are buried in a small cemetery just yards apart 
I've blogged a considerable amount about Thomas and Mary and the many curious tangential links that led to speculation about a family relationship. Recently, I've turned to genetic genealogy to complement the traditional research.

With more than 40 one-to-one autosomal DNA matches between descendants of Thomas and Mary – and each of those matches sharing amounts of DNA expected for their speculated relationship level if Thomas and Mary were in fact siblings – it seemed there was strong supporting evidence for my theory within reach.

Was there a more conclusive way to interpret what the DNA was telling me and credibly substantiate my hypothesis?

An Example Case Study


I got an idea after I read a case study published in the National Genealogical Society Quarterly.

In a nutshell, a professional genealogist was trying to identify the father of a woman born in 1789. Although traditional genealogy had surfaced a candidate for paternity within her small community, it failed to provide a definitive answer. Turning to genetics, there were significant amounts of shared autosomal DNA between descendants of the woman and her suspected paternal family. To reinforce the argument, these one-to-one matches were supplemented with a handful of triangulated matches among third to sixth great-grandchildren of the alleged father. The case study concluded that the paired and triangulated DNA matches supported the theory that the woman was the daughter of the possible paternal candidate.

This example seemed closely analogous to my own research hypothesis. If this approach was good enough for a professional case study, then surely it could help me conclude that Thomas and Mary were siblings. I determined I would replicate the model and search for triangulated matches among the dozens of paired matches already found between descendants of Thomas and Mary.

That’s when I collided head-on with the triangulation trials and tribulations and realized that, while the methodology offers potential value, it was not a gold standard.

Approach with Caution


The odds for finding a triangulated match between descendants of my 5th great-grandfather and his possible sister are not in my favor. In fact, they're downright daunting. I knew that would be the case, but I didn’t realize how poor my chances were.

My DNA is too far removed from Thomas and Mary, and from their parents (the Most Recent Common Ancestor, or MRCA), to be an effective magnet for cousins. As a fifth great-grandson, I could expect, on average, to have inherited less than 1% of my DNA from Thomas (about 0.78%) [1]. It's easy to imagine how grim the odds are for inheriting any DNA from his parents, my sixth great-grandparents. They're edging toward being purely genealogical ancestors who no longer leave an imprint on my genetic family tree [2].
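The 0.78% figure comes from simple halving: on average, each generation passes on half of what it received. Here is a minimal sketch of that arithmetic. The function names are my own for illustration, and the result is an average only, since actual inheritance varies from person to person.

```python
def generations_back(n_great: int) -> int:
    """Generations between a tester and an Nth great-grandparent.

    Parent = 1, grandparent = 2, great-grandparent = 3,
    so an Nth great-grandparent sits N + 2 generations back.
    """
    if n_great < 1:
        raise ValueError("n_great must be 1 or more")
    return n_great + 2


def expected_share(generations: int) -> float:
    """Average fraction of autosomal DNA inherited from one ancestor."""
    if generations < 1:
        raise ValueError("generations must be 1 or more")
    return 0.5 ** generations


# Fifth great-grandfather (my own distance from Thomas): 7 generations back.
print(f"{expected_share(generations_back(5)):.3%}")
# Third great-grandparent (the closer proxies' distance): 5 generations back.
print(f"{expected_share(generations_back(3)):.3%}")
```

A tester two generations closer to the ancestor carries, on average, four times as much of that ancestor's DNA, which is why older-generation testers make better proxies.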

I needed proxies closer in time to Thomas and Mary who inherited more of their DNA and could boost the odds for matches. Fortunately, several third great-grandchildren of Thomas and Mary are still alive and have tested (all would be fourth great-grandchildren to the parents of Thomas and Mary - the MRCA).


Odds of Inheriting DNA


According to AncestryDNA [3], there's a 100% likelihood that we've inherited DNA from our ancestors up to five generations removed. This means that the tested descendants almost certainly inherited some of their DNA from Thomas and Mary, although on average we’re only talking about 3.125% from an ancestor at that distance.

At six generations, AncestryDNA estimates that there’s still a 99.99% chance that we’ve inherited DNA from our ancestors (meaning it’s probable that the tested descendants have inherited DNA from the parents of Thomas and Mary).

Knowing that there’s a 99.99% chance that we inherited DNA dating back at least six generations, but mindful that it’s a slim amount, how likely are these fourth great-grandchildren of the MRCA to share DNA with each other?

Odds of Sharing DNA


According to AncestryDNA, there's a 100% chance that you'll share DNA with siblings through second cousins. Beginning with third cousins there’s a 98% chance, but then the odds dip and decline dramatically for each subsequent generation:
  • fourth cousins (71%)
  • fifth cousins (32%) 
  • sixth cousins (11%) 
  • seventh cousins (3.2%) 
Fourth great-grandchildren of the Kirk MRCA would be fifth cousins to one another and, at least at AncestryDNA given its phasing methodologies [4], would have a 32% chance of sharing DNA.

While the odds aren't great, they're not entirely insignificant and may explain why I’ve found over 40 one-to-one matches between descendants of Thomas and Mary at AncestryDNA.

Odds of a Triangulated DNA Match


When it comes to triangulation, however, it’s not enough to just find a match. It requires that a group of cousins share the same segments of DNA.

What are the chances of that happening?

Not good.

The process of recombination – how our DNA is randomly mixed up before it’s passed to each new generation – ensures we inherit a mishmash of our parents’ DNA [5].

This amalgamation of DNA passed from one generation to the next makes it unlikely that three fourth great-grandchildren of the Kirk MRCA would all share overlapping DNA segments.

Using a threshold of five centiMorgans (cM), AncestryDNA found that three random first cousins shared the same DNA segment 84% of the time. However, when five first cousins were compared, there was only a 40% chance that they all shared the same segment. The likelihood of ten first cousins sharing the same DNA segment plummeted to 0%.

Even though first cousins will share DNA 100% of the time, the recombination process keeps us on our toes and makes it unlikely that cousins will inherit the exact same pieces of DNA.
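A toy model helps show why the odds fall so fast as the group grows. This is my own simplified sketch, not AncestryDNA's method. It looks at one single spot in the genome and assumes every cousin descends from a different child of the shared grandparents, with no other relatedness. At that spot the grandparents carry four copies between them (labeled A to D here), and each grandchild ends up with one of the four at random. The numbers are per spot, so they are not comparable to AncestryDNA's genome-wide percentages, which count a shared segment anywhere.

```python
from fractions import Fraction
from itertools import product

# The four grandparental copies at one spot (two per grandparent).
# The labels are arbitrary and purely illustrative.
HAPLOTYPES = "ABCD"


def all_share_closed_form(k: int) -> Fraction:
    """Chance that k first cousins all carry the same copy at one spot.

    The first cousin can carry any copy; each additional cousin must
    match it, with probability 1/4 apiece.
    """
    if k < 1:
        raise ValueError("k must be 1 or more")
    return Fraction(1, 4) ** (k - 1)


def all_share_by_enumeration(k: int) -> Fraction:
    """Same quantity, by listing every equally likely outcome."""
    outcomes = list(product(HAPLOTYPES, repeat=k))
    hits = sum(1 for outcome in outcomes if len(set(outcome)) == 1)
    return Fraction(hits, len(outcomes))


for k in (2, 3, 5, 10):
    print(k, all_share_closed_form(k))
```

Each extra cousin added to the group cuts the per-spot chance by a factor of four, which is the same direction as the steep decline AncestryDNA reported from three cousins to ten.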

Bottom line, the odds are daunting and not in favor of triangulation for recent relationships. Imagine how much grimmer the prospects are for finding triangulated segments among descendants of the Kirk MRCA.

The lesson learned, according to AncestryDNA's computational scientist, was that "We can't rely on those pieces of [matching segments of] DNA in order to bring people together." The odds aren’t in our favor.

And this doesn’t even touch on other complicating factors like
  • larger shared segments potentially coming from more distant ancestors, or
  • the DNA match actually coming from shared ancestors who are not currently known nor mapped out in the pedigree, or 
  • misleading pile ups on certain DNA segments. 
Clearly, triangulation is not the gold standard for identifying and verifying ancestral matches.

The dim probabilities (coupled with challenges inherent to DNA testing at AncestryDNA) likely explain why I’ve found, to date, only two triangulated matches between descendants of Thomas and Mary from the 40+ paired matches.

I'm not discounting these triangulated matches or the value of triangulation in certain cases, but I am meditating on alternative approaches to determine how best to leverage these DNA matches to prove or disprove that Thomas Kirk and Mary (Kirk) Geiger were siblings.



[1] https://dna-explained.com/2017/06/27/ancestral-dna-percentages-how-much-of-them-is-in-you/

[2] https://gcbias.org/2013/11/11/how-does-your-number-of-genetic-ancestors-grow-back-over-time/
[3] https://www.ancestry.com/academy/course/ancestry-dna-circles
[4] https://cruwys.blogspot.com/2016/01/autosomal-dna-triangulation-part-2.html
[5] https://cruwys.blogspot.com/2016/01/autosomal-dna-triangulation-part-1.html

8 comments:

  1. Have you tried Jonathan Brecher's Shared Clustering, an Open Source tool for generating cluster diagrams from DNA match lists on Ancestry? This may be an alternate approach to finding similar matches, i.e. people who share a segment which comes from one ancestor as opposed to an ancestral couple. I'll send you the link to the GitHub wiki if you are interested.

    1. I'm not familiar with Brecher's tool, but in reading about it I see how it clusters AncestryDNA's shared match list.

      I'd like to give this a go. Is this the link that you used:

      https://github.com/jonathanbrecher/sharedclustering/wiki ?

  2. Great post, Michael. You've done a wonderful job of explaining all this. It is so hard to find anything too conclusive about those descended from an MRCA going that far back based on DNA because of all the reasons (and math) you described. But if the descendants of the two presumed siblings share a fairly telling amount of DNA, that would seem to bring you pretty close to your conclusion without triangulation, doesn't it? Or are the amounts shared too small to be reliable?

    1. The amounts of shared DNA vary depending on the relationship level, but all fit within the expected range for the particular relationship level (so 4th great-grandchildren of the speculated MRCA share more than the 5th and 6th great-grandchildren who I've located).

      For those 4th great-grandchildren matches the average shared cM is 20. Seems like a reliable amount to me.

    2. I agree (especially since your ancestors appear not to have been endogamous so those amounts are more reliable).

  3. You are correct. Working top down from a specific ancestor 5 or more generations back will likely not result in enough triangulations to work with.

    Triangulations work better when you do them bottom up, by determining for a particular triangulation group, first the parent, then grandparent, working up the ancestral path as best you can. This is like DNA painting but using triangulations instead of single matches.

    1. Thank you, Louis. Can you point me to any online guidance on how best to follow this approach? Would my starting point still be these 4th great-grandchildren of the MRCA from whom I speculate Thomas and Mary descended or myself?

  4. A great blog post. I would add that since you are tracing back to common ancestors in Virginia, where the population was very small at that time, you also need to consider the possibility that, because of pedigree collapse and the founder effect, you are related to your matches not just through the ancestral couple of interest but through other pathways as well. Sharing multiple ancestral pathways increases the chances of matching a cousin but also makes it more difficult to determine which pathway the match is on. Bottlenecks and pedigree collapse also have the effect of reducing genetic diversity so you end up with segments that are widely shared in a particular population.

    We also need to remember that publication in a peer reviewed journal is not a guarantee of quality. It merely helps to filter out some of the lower quality articles. Also genealogical journals will not generally have population geneticists on their editorial board so the reviewers might not be in the best position to review such articles.
