Alignment threshold and secondary cluster assignments - not sure it's working properly? #295
Hi Matt!
I'm using dRep v3.4.5.
I've been using the compare command to look at the secondary clusters of genomes that all belong to the same primary cluster. I noticed that some genomes are being assigned to their own secondary cluster even though their ANI is above my threshold. However, their alignment fraction with many other genomes is only 0.12-0.4 because they are SAGs and quite incomplete. At first I thought the minimum coverage threshold was responsible, but the default of 0.1 is lower than these alignment fractions, so those genomes should still be grouped based on their ANI. I tried rerunning with the coverage threshold explicitly set to 0.1, but got the same results.
Here is my code:
dRep compare -p 24 -g ~/Mendota_genomes/1_dRep/starting_genomes/*fna -sa 0.96 --cov_thresh 0.1 dRep
Here is a snippet of my output:
26Sep2015rr0052-bin.134.fna,SAG_2739367632.fna,0.978604,0.1875,1
27Jul2012rr0045-bin.200.fna,SAG_2739367632.fna,0.969617,0.25,1
MAGv2_3300020483-bin.4.fna,SAG_2739367632.fna,0.977835,0.19377162629757785,
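As a sanity check on these rows, here is a minimal Python sketch that tests each comparison against the two thresholds from my command. It assumes the third column is ANI and the fourth is alignment coverage, which matches the values I describe above but is my reading of the output rather than something I have confirmed. The variable and function names are just for illustration.

```python
import csv
import io

ANI_THRESHOLD = 0.96  # value passed to -sa
COV_THRESHOLD = 0.1   # value passed to --cov_thresh

# First four fields of the rows shown above (the last field is left out
# because it is cut off in the third row).
rows_text = """\
26Sep2015rr0052-bin.134.fna,SAG_2739367632.fna,0.978604,0.1875
27Jul2012rr0045-bin.200.fna,SAG_2739367632.fna,0.969617,0.25
MAGv2_3300020483-bin.4.fna,SAG_2739367632.fna,0.977835,0.19377162629757785
"""

def passes_thresholds(row):
    """Return True if a comparison meets both the ANI and coverage cutoffs.

    Assumes row[2] is ANI and row[3] is alignment coverage.
    """
    ani, coverage = float(row[2]), float(row[3])
    return ani >= ANI_THRESHOLD and coverage >= COV_THRESHOLD

results = {
    row[0]: passes_thresholds(row)
    for row in csv.reader(io.StringIO(rows_text))
}

for genome, ok in results.items():
    print(f"{genome} vs SAG_2739367632.fna passes both thresholds: {ok}")
```

Under that column assumption, every one of these pairwise comparisons clears both cutoffs, which is why the split surprised me.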
It seems like 26Sep2015rr0052-bin.134.fna, SAG_2739367632, and MAGv2_3300020483-bin.4 should all be in the same cluster. Is it splitting them because of the way cluster-wide ANI is calculated (i.e., as an average)? Or is there some other reason?
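To show what I mean by the averaging idea, here is a toy Python sketch of average-linkage clustering on 1 - ANI distances. This is only my guess at the kind of mechanism involved, not dRep's actual code. The genome labels and ANI values are hypothetical, and so is the assumption that a comparison failing a filter gets recorded as ANI 0.

```python
from itertools import combinations

def average_linkage(names, ani, ani_threshold):
    """Toy average-linkage clustering; clusters merge while the mean
    pairwise distance (1 - ANI) between them is within the cutoff."""
    cutoff = 1.0 - ani_threshold
    clusters = [frozenset([n]) for n in names]

    def mean_distance(c1, c2):
        # Average of all pairwise distances between members of c1 and c2.
        pairs = [(a, b) for a in c1 for b in c2]
        return sum(1.0 - ani[frozenset(p)] for p in pairs) / len(pairs)

    while len(clusters) > 1:
        c1, c2 = min(combinations(clusters, 2),
                     key=lambda pair: mean_distance(*pair))
        if mean_distance(c1, c2) > cutoff:
            break  # closest remaining clusters are too far apart
        clusters = [c for c in clusters if c not in (c1, c2)] + [c1 | c2]
    return clusters

names = ["MAG_A", "MAG_B", "SAG_S"]  # hypothetical genomes

# Hypothetical case 1: every pair is above the 0.96 threshold.
ani_all_pass = {
    frozenset(["MAG_A", "MAG_B"]): 0.99,
    frozenset(["MAG_A", "SAG_S"]): 0.978,
    frozenset(["MAG_B", "SAG_S"]): 0.97,
}

# Hypothetical case 2: one SAG comparison is recorded as ANI 0
# (for example, if it were filtered out), the rest are unchanged.
ani_one_zeroed = dict(ani_all_pass)
ani_one_zeroed[frozenset(["MAG_B", "SAG_S"])] = 0.0

clusters_all_pass = average_linkage(names, ani_all_pass, 0.96)
clusters_one_zeroed = average_linkage(names, ani_one_zeroed, 0.96)
print(len(clusters_all_pass), len(clusters_one_zeroed))
```

In this toy setup, a single zeroed comparison pulls the average distance between the SAG and the MAG pair far past the cutoff, so the SAG ends up alone even though its other comparison is well above threshold. If something like that is happening here, it would explain the split.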
Love all your tools and they are so accessible. Thank you for your efforts!