Events

Thursday, May 24, 2018 — 4:00 PM EDT

Does your phylogenetic tree fit your data?


Phylogenetic methods are used to infer ancestral relationships from genetic and morphological data. What started as a more sophisticated form of clustering has become increasingly complex machinery for estimating ancestral processes and divergence times. One major branch of inference is maximum likelihood. Here, one selects, from a given model class, the parameters under which the observed data are more likely than under any other parameters of the same class. Most analyses of real data are carried out using such methods.
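As a minimal illustration of the maximum likelihood idea described above (not taken from the talk), consider the simplest phylogenetic setting: two aligned sequences and the Jukes-Cantor (JC69) substitution model, whose single parameter is the evolutionary distance d. The function names and the example counts below are hypothetical. The sketch maximizes the log-likelihood over a grid and compares the result with the known closed-form estimate.

```python
import math


def jc69_log_likelihood(d, n_sites, n_diff):
    """Log-likelihood (up to an additive constant) of distance d under JC69."""
    # Probability that a site differs between the two sequences at distance d.
    p = 0.75 * (1.0 - math.exp(-4.0 * d / 3.0))
    return n_diff * math.log(p) + (n_sites - n_diff) * math.log(1.0 - p)


def jc69_mle_grid(n_sites, n_diff, step=1e-4, d_max=2.0):
    """Pick the grid value of d under which the observed data are most likely."""
    best_d, best_ll = None, -math.inf
    for i in range(1, int(d_max / step) + 1):
        d = i * step
        ll = jc69_log_likelihood(d, n_sites, n_diff)
        if ll > best_ll:
            best_d, best_ll = d, ll
    return best_d


def jc69_mle_closed_form(n_sites, n_diff):
    """Closed-form JC69 estimate, valid when fewer than 3/4 of the sites differ."""
    return -0.75 * math.log(1.0 - (4.0 / 3.0) * n_diff / n_sites)


# Hypothetical data: 100 aligned sites, 10 of which differ.
grid_estimate = jc69_mle_grid(100, 10)
closed_estimate = jc69_mle_closed_form(100, 10)
print(grid_estimate, closed_estimate)
```

The two estimates agree up to the grid resolution. Real analyses optimize over tree topologies, branch lengths and richer substitution models simultaneously, which is where the complexity mentioned above comes from.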

However, one step of statistical inference that is rarely applied in practice is the goodness-of-fit test between the inferred model and the data. There seem to be various reasons for this: users are content with a bootstrap approach to obtain support for the inferred topology, are afraid that a goodness-of-fit test would find little or no support for their phylogeny and thus devalue their carefully assembled data, or simply lack the statistical background to appreciate this step.

Recently, methods to detect sections of the data that do not support the inferred model have been proposed, and strategies to explain these differences have been devised. In this talk I will present and discuss some of these methods, their shortcomings, and possible ways of improving them.

Thursday, May 17, 2018 — 4:00 PM EDT

Combining phenotypes, genotypes and genealogies to find trait-influencing variants


A basic tenet of statistical genetics is that shared ancestry leads to trait similarities in individuals. Related individuals share segments of their genome, derived from a common ancestor. The coalescent is a popular mathematical model of the shared ancestry that represents the relationships amongst segments as a set of binary trees, or genealogies, along the genome. While these genealogies cannot be observed directly, the genetic-marker data enable us to sample from their posterior distribution. We may compare the clustering of trait values on each sampled genealogical tree to the clustering under the coalescent prior distribution. This comparison provides a latent p-value that reflects the degree of surprise about the trait clustering in the sampled tree. The distribution of these latent p-values is the fuzzy p-value as defined by Geyer and Thompson. The fuzzy p-value contrasts the posterior and prior distributions of trait clustering on the latent genealogies and is informative for mapping trait-influencing variants. In this talk, I will discuss these ideas with an application to data from an immune-marker study, present results from preliminary analyses and highlight potential avenues for further research.
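The relationship between latent and fuzzy p-values described above can be sketched in a few lines. This is only an illustration under strong simplifying assumptions, not the speaker's method: each sampled genealogy is reduced to a single hypothetical trait-clustering statistic (larger meaning stronger clustering), and one shared pool of statistics drawn under the coalescent prior serves as the reference for all of them. The function names and numbers are made up.

```python
def latent_p_value(observed_stat, prior_stats):
    """Fraction of prior draws showing clustering at least as strong as observed."""
    at_least_as_extreme = sum(1 for s in prior_stats if s >= observed_stat)
    return at_least_as_extreme / len(prior_stats)


def fuzzy_p_value(posterior_stats, prior_stats):
    """Empirical distribution of latent p-values, one per posterior-sampled tree."""
    return sorted(latent_p_value(s, prior_stats) for s in posterior_stats)


def fraction_below(latent_p_values, alpha):
    """Share of the fuzzy p-value's mass at or below a threshold alpha."""
    return sum(1 for p in latent_p_values if p <= alpha) / len(latent_p_values)


# Hypothetical clustering statistics under the prior and for three sampled trees.
prior_stats = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
posterior_stats = [9.5, 10.5, 8.5]

latent = fuzzy_p_value(posterior_stats, prior_stats)
print(latent, fraction_below(latent, 0.1))
```

Keeping the whole distribution, instead of collapsing it to one number, preserves the uncertainty about the unobserved genealogy. A summary such as the share of mass below a threshold can then be compared across genomic positions.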

Saturday, May 12, 2018 — 8:00 AM to 6:00 PM EDT
Datathon
Friday, May 4, 2018 — 3:00 PM to Sunday, May 6, 2018 — 6:00 PM EDT
Datafest
