Week 9: Found an Incredible Research

May 07, 2021

Hi guys, how’s everyone doing? Last week’s blog explored the effectiveness of genetic simulations in school teaching, but arguably a more important use of these simulations is in lab research. This week, in addition to continuing work on my own simulation program about clonal plants, I found an awesome study showing the effectiveness of genetic simulations:

Howard A Ross, Sumathi Murugan, Wai Lok Sibon Li, Testing the Reliability of Genetic Methods of Species Identification via Simulation, Systematic Biology, Volume 57, Issue 2, April 2008, Pages 216–230, https://doi.org/10.1080/10635150802032990

This study by Howard Ross et al. uses simulators to test methods of species identification. The researchers simulate gene trees of species using a coalescent model, and then compare those simulated gene trees to empirical data from 10 gene trees. The summary measures of the simulated gene trees (gene tree depth and intraspecific distance) match the empirical data reasonably well.

(Graphs: (b) The distribution of mean tree height and mean intraspecific distance for each of the simulated gene trees. (c) The mean distance to the nearest heterospecific and mean intraspecific distance for a sample of 10 gene trees.) The graphs show that the distribution of the simulated data is similar to that of the collected data.


This holds as long as the input parameters (coalescent time and birth rate) match what was sampled in the empirical data. However, it is worth mentioning that in the simulations the species birth rate was drawn uniformly at random, so the distribution of genes on the simulated gene trees does not align exactly with the empirical data. This is shown by the different locations of the dashed line in graphs b and c, which represents the threshold of equality between tree depth and coalescent depth. Other than that, the computed values and distributions for the simulated gene trees match the empirical results well. The result of this research shows that, while not completely accurate, genetic simulations can represent empirical data to a reasonable degree of accuracy.
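
For anyone curious what a coalescent simulation looks like in code, here is a minimal sketch in Python. To be clear, this is not the code or the model from the paper: it is a toy single-population Kingman coalescent with no species tree and no birth rate, and the function names (`simulate_gene_tree`, `summarize`) are made up for illustration. It only shows the general idea of simulating many gene trees and recording summary measures such as tree height and mean pairwise coalescence time (a rough stand-in for intraspecific distance).

```python
import random


def simulate_gene_tree(n_samples, rng):
    """Simulate one toy Kingman coalescent tree for n_samples gene copies.

    Returns (tree_height, mean_pairwise_time) in coalescent time units.
    This is an illustrative assumption, not the paper's simulator.
    """
    # Each lineage is the list of sample indices that descend from it.
    lineages = [[i] for i in range(n_samples)]
    time = 0.0
    pair_time_sum = 0.0
    n_pairs = n_samples * (n_samples - 1) // 2

    while len(lineages) > 1:
        k = len(lineages)
        # With k lineages, the waiting time to the next merge is
        # exponential with rate k * (k - 1) / 2.
        time += rng.expovariate(k * (k - 1) / 2)
        # Pick two distinct lineages uniformly at random and merge them.
        i, j = rng.sample(range(k), 2)
        a, b = lineages[i], lineages[j]
        # Every pair (one sample from a, one from b) coalesces right now.
        pair_time_sum += time * len(a) * len(b)
        lineages = [l for idx, l in enumerate(lineages) if idx not in (i, j)]
        lineages.append(a + b)

    # The time of the final merge is the height of the tree.
    return time, pair_time_sum / n_pairs


def summarize(n_trees, n_samples, seed=0):
    """Average tree height and mean pairwise time over many simulated trees."""
    rng = random.Random(seed)
    results = [simulate_gene_tree(n_samples, rng) for _ in range(n_trees)]
    mean_height = sum(h for h, _ in results) / n_trees
    mean_pairwise = sum(p for _, p in results) / n_trees
    return mean_height, mean_pairwise


if __name__ == "__main__":
    print(summarize(n_trees=2000, n_samples=10))
```

Under this toy model the expected tree height for n samples is 2(1 - 1/n) and the expected pairwise coalescence time is 1, so the averages over many simulated trees should land close to those values. A real study would compare summaries like these against the same measures computed from empirical gene trees.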

In their report, the researchers explain that they turned to simulation in the first place because computing empirical gene trees took a lot of time, which is also why they could only use 10 gene trees as the comparison for the simulation. This highlights the importance of simulations: without them, their theories and methods could not be verified, or it would take a long time to gather and compute enough data from nature. Not to mention that certain data, like genetic information from millions of years back in the tree, can’t be accurately gathered at all using empirical methods, so researchers have to rely on simulations. In my opinion, studies like this show that simulations are irreplaceable when you need to acquire, manipulate, and analyze massive amounts of numerical data, and that using them is crucial for researchers to validate a method and reach a conclusion.

This study’s result is similar to the intended result of my simulation: empirical genetic data, although it does not overlap 100%, is very similar to, and thus can be represented by, the results of genetic simulations.

3 Replies to “Week 9: Found an Incredible Research”

  1. Jiaming Z. says:

    Oh great the picture is messed up 🙁

  2. Peter L. says:

    Can’t claim that I fully understood this piece of research, but it is indeed awesome! Hope your data and simulations come out to match your intended result! Keep up the great work 🙂

  3. Dora X. says:

    Wow, this looks really cool! I look forward to seeing what other research you come across next week!
