Alignment threshold and secondary cluster assignments - not sure it's working properly? #295

@mcmahon-uw

Hi Matt!

I'm using dRep v3.4.5.

I've been using the compare command to look at secondary clusters of genomes that all belong to the same primary cluster. I noticed that some genomes are being assigned to their own secondary cluster even though their ANI to other genomes is above my threshold. However, their alignment fraction with many other genomes is only 0.12-0.4 because they are SAGs and quite incomplete. I first suspected the minimum coverage threshold, but the default of 0.1 is low enough that these genomes should still be grouped based on their ANI. I tried rerunning with the coverage threshold explicitly set to 0.1, but got the same results.

Here is my code:
dRep compare -p 24 -g ~/Mendota_genomes/1_dRep/starting_genomes/*fna -sa 0.96 --cov_thresh 0.1 dRep

Here is a snippet of my output:

26Sep2015rr0052-bin.134.fna,SAG_2739367632.fna,0.978604,0.1875,1
27Jul2012rr0045-bin.200.fna,SAG_2739367632.fna,0.969617,0.25,1
MAGv2_3300020483-bin.4.fna,SAG_2739367632.fna,0.977835,0.19377162629757785,
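
To make the check explicit, here is a minimal Python sketch that tests each of these rows against the two thresholds from the command (`-sa 0.96` and `--cov_thresh 0.1`). The column names are assumptions made up for this sketch, not taken from dRep's output header, and the last field of the third row is left empty because it is cut off in the snippet.

```python
import csv
import io

# Rows copied from the output snippet. The third row is truncated in the
# original, so its last field is left empty rather than guessed.
ROWS = """\
26Sep2015rr0052-bin.134.fna,SAG_2739367632.fna,0.978604,0.1875,1
27Jul2012rr0045-bin.200.fna,SAG_2739367632.fna,0.969617,0.25,1
MAGv2_3300020483-bin.4.fna,SAG_2739367632.fna,0.977835,0.19377162629757785,
"""

# Hypothetical column names, chosen for this sketch only.
FIELDS = ["genome_a", "genome_b", "ani", "coverage", "cluster"]

ANI_THRESH = 0.96  # value passed to -sa
COV_THRESH = 0.1   # value passed to --cov_thresh


def passing_pairs(text, ani_thresh=ANI_THRESH, cov_thresh=COV_THRESH):
    """Return the (genome_a, genome_b) pairs that meet both thresholds."""
    reader = csv.DictReader(io.StringIO(text), fieldnames=FIELDS)
    return [
        (row["genome_a"], row["genome_b"])
        for row in reader
        if float(row["ani"]) >= ani_thresh
        and float(row["coverage"]) >= cov_thresh
    ]


pairs = passing_pairs(ROWS)
for genome_a, genome_b in pairs:
    print(genome_a, genome_b)
```

All three pairs pass both thresholds on a pairwise basis, so pairwise filtering alone would not explain the split.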

It seems like 26Sep2015rr0052-bin.134.fna, SAG_2739367632.fna, and MAGv2_3300020483-bin.4.fna should all be in the same cluster. Is it splitting them because of the way cluster-wide ANI is calculated (e.g. as an average)? Or is there some other reason?
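
As a toy illustration of the "average" hypothesis, the sketch below shows how average-linkage clustering could keep a genome out of a cluster even when some of its pairwise ANI values exceed the threshold. This is not dRep's code. It assumes average linkage on a distance of 1 - ANI, and the genome names and ANI values are invented for the example.

```python
def average_linkage_distance(candidate, cluster, ani):
    """Mean of (1 - ANI) between the candidate and every cluster member."""
    dists = [1.0 - ani[frozenset((candidate, member))] for member in cluster]
    return sum(dists) / len(dists)


# Invented pairwise ANI values (hypothetical genomes, not from my data).
ani = {
    frozenset(("SAG", "MAG_1")): 0.978,
    frozenset(("SAG", "MAG_2")): 0.970,
    frozenset(("SAG", "MAG_3")): 0.900,  # one low or poorly aligned pair
}

cutoff = 1.0 - 0.96  # distance equivalent of -sa 0.96

# Against all three members, the single low pair drags the average down.
d_all = average_linkage_distance("SAG", ["MAG_1", "MAG_2", "MAG_3"], ani)
joins_all = d_all <= cutoff

# Against only the two close members, the candidate would be merged.
d_close = average_linkage_distance("SAG", ["MAG_1", "MAG_2"], ani)
joins_close = d_close <= cutoff

print(joins_all, joins_close)
```

If something like this is happening, one low comparison (or one that is discarded for low coverage) could be enough to push a SAG into its own secondary cluster.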

Love all your tools and they are so accessible. Thank you for your efforts!
