request for clarification of noCommonOrganPenalty #689

@bpow

I'm having a little trouble understanding the functionality of noCommonOrganPenalty (used as part of the branch case of phenotype LR calculation when a query term is a child of one or more disease-annotated terms).

It looks like this heuristic may be different from what was originally described in the manuscript, possibly to address issue #231.

From my first reading of the comment, it looks like the intent may have been to scale the probability linearly, mapping the interval ["maximally rare", "very common"] (i.e., [1/diseases.size(), 10%]) to the range [1/500, 1/10], perhaps even with clamping.
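To make that reading concrete, here is a minimal sketch of a clamped linear rescaling. It is only my interpretation of the comment, not the existing implementation; the class name, the method name, and the `nDiseases` parameter are hypothetical, and the interval endpoints are the ones quoted above.

```java
/** Hypothetical sketch of a clamped linear rescaling; not the project's code. */
final class ClampedPenaltySketch {
    // Output range quoted above: [1/500, 1/10].
    static final double MIN_PROB = 1.0 / 500;
    static final double MAX_PROB = 1.0 / 10;
    // "Very common" threshold quoted above: 10%.
    static final double VERY_COMMON = 0.10;

    /**
     * Maps a term frequency f linearly from [1/nDiseases, VERY_COMMON]
     * onto [MIN_PROB, MAX_PROB]. Inputs outside the input interval are
     * clamped to its endpoints, so the result never leaves the output range.
     */
    static double clampedPenalty(double f, int nDiseases) {
        double lo = 1.0 / nDiseases; // "maximally rare": one disease out of all
        if (lo >= VERY_COMMON) {
            throw new IllegalArgumentException("need more than 10 diseases");
        }
        double clamped = Math.min(Math.max(f, lo), VERY_COMMON);
        double t = (clamped - lo) / (VERY_COMMON - lo); // position in [0, 1]
        return MIN_PROB + t * (MAX_PROB - MIN_PROB);
    }
}
```

With clamping, any frequency at or below 1/nDiseases yields 1/500 and any frequency at or above 10% yields 1/10, so the result can never be negative or exceed 0.1.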

The last comment before the return statement says "We multiply the overall feature frequency in our cohort by the penalty factor this will give us a likelihood ratio that varies from 0.1 to 0.002". However, the method name implies that it returns a probability, and I think the calling function treats the return value as a probability rather than a likelihood ratio (lrForObservedTerm uses the larger of this value and the proportional frequency as the numerator of an LR calculation). Is the output supposed to be a probability or a likelihood ratio?

The code also doesn't appear to map the output between 0.1 and 0.002 as implied. The falsePositivePenalty is mapped linearly from the range [DEFAULT_FALSE_POSITIVE_NO_COMMON_ORGAN_PROBABILITY, MAX_PROB] to [MIN_PROB, MAX_PROB], but there is no clamping. Simplifying with the constants in the code, and writing f for the overall feature frequency, if I get my math right, we get:

falsePositivePenalty = (49/45) * f - (2/225)

or 1.0889 * f - 0.0089

The returned value, f * falsePositivePenalty, is therefore a quadratic function of f, again without clamping.

For small enough term frequencies (those between 0 and 2/245), the method can even evaluate to negative values! I guess that ends up being OK, since the calling code passes this value to Math.max together with something that is nonnegative, but it suggests the code may not be doing what was intended.
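The behaviour described above can be reproduced with a small standalone sketch. This is not the project's code: the class and method names are hypothetical, MIN_PROB = 0.002 and MAX_PROB = 0.1 are taken from the quoted comment, and the value 0.01 for DEFAULT_FALSE_POSITIVE_NO_COMMON_ORGAN_PROBABILITY is an assumption I inferred from the slope 49/45 = (0.1 - 0.002) / (0.1 - 0.01) in the simplified formula.

```java
/** Hypothetical reproduction of the unclamped mapping; not the project's code. */
final class UnclampedPenaltySketch {
    // From the quoted comment: "varies from 0.1 to 0.002".
    static final double MIN_PROB = 0.002;
    static final double MAX_PROB = 0.1;
    // Assumed value, inferred from the slope 49/45 of the simplified formula.
    static final double DEFAULT_FALSE_POSITIVE_NO_COMMON_ORGAN_PROBABILITY = 0.01;

    /** Linear map sending 0.01 to MIN_PROB and MAX_PROB to itself, with no clamping. */
    static double falsePositivePenalty(double f) {
        double slope = (MAX_PROB - MIN_PROB)
                / (MAX_PROB - DEFAULT_FALSE_POSITIVE_NO_COMMON_ORGAN_PROBABILITY);
        return MIN_PROB + slope * (f - DEFAULT_FALSE_POSITIVE_NO_COMMON_ORGAN_PROBABILITY);
    }

    /** Frequency times penalty: quadratic in f, negative for 0 < f < 2/245. */
    static double unclampedValue(double f) {
        return f * falsePositivePenalty(f);
    }

    public static void main(String[] args) {
        double[] frequencies = {0.001, 2.0 / 245, 0.01, 0.1, 0.5};
        for (double f : frequencies) {
            System.out.printf("f=%.5f penalty=%.6f value=%.6f%n",
                    f, falsePositivePenalty(f), unclampedValue(f));
        }
    }
}
```

Under these assumptions the penalty crosses zero at f = 2/245, so the product is negative below that point and grows past 0.1 for large f, which is why a clamp (or a different formula) seems to be needed if the output is meant to stay within [0.002, 0.1].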
