Sequence identifier warnings from get-ncbi-data. #158

@mikerobeson

A user initially reported this issue when running the following command:

qiime rescript get-ncbi-data   \
    --p-query "txid4751[ORGN] AND (ITS1 OR ITS2 OR its1 OR its2) NOT environmental sample[Filter] NOT environmental samples[Filter] NOT environmental[Title] NOT uncultured[Title] NOT unclassified[Title] NOT unidentified[Title] NOT unverified[Title]" \
    --p-ranks kingdom phylum class order family genus species \
    --p-rank-propagation \
    --p-n-jobs 4 \
    --o-sequences ITS-ref-seqs-ng.qza \
    --o-taxonomy ITS-ref-tax-ng.qza \
    --verbose

This produced the following warnings:

WARNING:2023-05-10 08:31:04,095:MainProcess:Using pdb|8E5T|3 as a sequence identifier, because it did not come down with an accession version.
WARNING:2023-05-10 08:31:04,096:MainProcess:Using pdb|7V08|6 as a sequence identifier, because it did not come down with an accession version.
WARNING:2023-05-10 08:31:04,096:MainProcess:Using pdb|7UQZ|6 as a sequence identifier, because it did not come down with an accession version.
WARNING:2023-05-10 08:31:04,096:MainProcess:Using pdb|7UQB|6 as a sequence identifier, because it did not come down with an accession version.
...
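
To see which records are affected, the fallback identifiers can be pulled out of the verbose output. The following is a minimal sketch, assuming the warning text always has the form shown above; the function name `fallback_ids` is hypothetical and not part of RESCRIPt.

```python
import re

# Matches the warning text shown above. The identifier is the
# non-whitespace token between "Using " and " as a sequence identifier".
WARNING_RE = re.compile(r"Using (\S+) as a sequence identifier")


def fallback_ids(log_lines):
    """Return the fallback identifiers named in warning lines, in order."""
    ids = []
    for line in log_lines:
        match = WARNING_RE.search(line)
        if match:
            ids.append(match.group(1))
    return ids


# Two of the warning lines from the report above.
sample = [
    "WARNING:2023-05-10 08:31:04,095:MainProcess:Using pdb|8E5T|3 as a "
    "sequence identifier, because it did not come down with an accession version.",
    "WARNING:2023-05-10 08:31:04,096:MainProcess:Using pdb|7V08|6 as a "
    "sequence identifier, because it did not come down with an accession version.",
]
print(fallback_ids(sample))  # prints ['pdb|8E5T|3', 'pdb|7V08|6']
```

The same function could be applied to a saved log file by passing the open file handle in place of `sample`.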

I was able to reproduce the issue. I exported the resulting FASTA file and observed sequences with headers like those shown above. I also manually ran BLAST on a few of the sequences, and they did appear to contain ITS sequences, though I have not tested this thoroughly. I am not sure why pdb identifiers are used when the returned data might actually contain the requested ITS DNA sequences.
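
One way to gauge how many exported records carry a `pdb|...` style header is to tally the FASTA identifiers by the text before the first `|`. This is a rough sketch only: the function name `header_prefixes` is hypothetical, and the demo records below (including the accession `XX000000.1` and the sequences) are invented placeholders rather than real data.

```python
from collections import Counter


def header_prefixes(fasta_lines):
    """Count FASTA record IDs by the text before the first '|'.

    IDs without a '|' (for example plain accession.version IDs)
    are counted under '(no prefix)'.
    """
    counts = Counter()
    for line in fasta_lines:
        if line.startswith(">"):
            parts = line[1:].split()
            seq_id = parts[0] if parts else ""
            if "|" in seq_id:
                prefix = seq_id.split("|", 1)[0]
            else:
                prefix = "(no prefix)"
            counts[prefix] += 1
    return counts


# Placeholder records for illustration only.
demo = [
    ">pdb|8E5T|3",
    "ACGTACGT",
    ">pdb|7V08|6",
    "ACGTACGT",
    ">XX000000.1",
    "ACGTACGT",
]
for prefix, n in sorted(header_prefixes(demo).items()):
    print(prefix, n)
```

To run this against a real export, pass an open file handle for the exported FASTA file instead of `demo`.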

The warning message comes specifically from these lines in ncbi.py.

This is probably not a true bug, but these identifiers make it difficult to trace the origin of the data.
