Ensuring Convergence #259

tsbernat · 2026-02-05T21:24:55Z

tsbernat
Feb 5, 2026

Hello CAFE5 Forum,

I am new to using CAFE5 and have a few questions about evaluating convergence.

My plan is to follow the methods described in previous forum posts to determine which model best fits my data:

1. Simulate gene family evolution using the global model.
2. Fit the global and 2L models (and later the 3L model) to each simulated dataset to see how well each model fits it.
3. Calculate the likelihood ratios across the simulations.
4. Extract -lnL values.
5. Compute LR = 2 * (lnL_simple - lnL_complex) to ultimately build a null distribution of LR values from 1000 simulations.
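The last two steps could be sketched roughly as below. This is only an illustration, not CAFE5 functionality: the function names are hypothetical, the inputs are assumed to be the -lnL scores already collected from each pair of fits, and the lower-tail comparison assumes the LR sign convention written above (a better-fitting complex model gives a more negative LR).

```python
from typing import Sequence


def likelihood_ratios(neg_lnl_simple: Sequence[float],
                      neg_lnl_complex: Sequence[float]) -> list[float]:
    """Return LR = 2 * (lnL_simple - lnL_complex), one value per simulated dataset."""
    if len(neg_lnl_simple) != len(neg_lnl_complex):
        raise ValueError("need one simple and one complex fit per simulation")
    # Inputs are -lnL scores, so negate them to recover lnL before differencing.
    return [2.0 * ((-s) - (-c)) for s, c in zip(neg_lnl_simple, neg_lnl_complex)]


def lower_tail_p(null_lrs: Sequence[float], observed_lr: float) -> float:
    """Share of null LR values at or below the observed LR (with +1 smoothing)."""
    hits = sum(1 for lr in null_lrs if lr <= observed_lr)
    return (hits + 1) / (len(null_lrs) + 1)


# Hypothetical -lnL scores for three simulated datasets (illustration only).
null = likelihood_ratios([1000.0, 1010.0, 990.0], [999.0, 1009.5, 988.0])
print(null, lower_tail_p(null, observed_lr=-50.0))
```

The +1 in the numerator and denominator is a common Monte Carlo correction that keeps the estimated p-value from being exactly zero; drop it if a plain proportion is preferred.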

In order to complete the second step and compare models, I want to use the appropriate lambda values. My understanding is you want to use ones that have converged, but I am confused about the definition of convergence and what to do if across ten runs, the values do not converge.

In the “Caveats” section of the CAFE5 tutorial, it is advised that one should always perform multiple runs to ensure convergence. I was wondering if you could clarify this caveat. Is convergence considered a prerequisite for using a model? If so, what is considered an appropriate amount of convergence? All the same value? Majority of runs have the same value? Runs being within a certain percentage of each other?

For example, I ran my global model 10 times, each run writing to a separate output directory. All of the runs gave the same -lnL with essentially the same lambda. Once I introduce multiple lambdas with multiple rate categories, however, the -lnL values vary more between runs, and they converge less and less as I increase the number of lambdas and/or rate categories. The tutorial also states that “For all but the simplest of data sets, searching for multiple lambdas with multiple rate categories will result in a failure of convergence to a single optimum between runs”. If failure of convergence is expected, then why test for it in more complex models in the first place? And if the runs fail to converge, how do you select lambda values to test on a simulated dataset?

Relatedly, what if the -lnL are more varied, but the lambda values seem to generally converge?

Example dataset:
*All runs used a uniform root distribution, an ultrametric tree in millions of years (~20 mya), and no error model.
K=2
Global
For each of the 10 runs, -lnL and lambda are the same.

Lambda = 2
Of the 10 runs, 7 have essentially the same lambda values, with -lnL values within 25 of each other (209495-209520) and no families with failures. The remaining 3 have much more variable -lnL values (216712-238671), lambda values that differ both from each other and from the majority of the runs, and at least one gene family with failures.

Lambda = 3
-lnL values range from 203703 to 224745. Seven of the ten runs have no gene family failures, while 3 of the runs have one family with failure (and attempted optimizer values rejected - up to 11%). The third lambda result is essentially zero.

K=3
Global
Of the 10 runs, half have at least one gene family with failures and the other half have many more. The -lnL values range from 212107 to 216571, with 6 of the runs converging on the same value of 212107.

Multiple Lambda
With k = 3, moving to Lambda = 2 and then Lambda = 3 also progressively increases the variance in -lnL.

K=4
Most of the runs, regardless of the number of lambdas, have some degree of attempted optimizer values rejected.

Lastly, what relationship do rejected attempted optimizer percentages have with gene family failures? Is there a threshold of concern for either? I've seen previous posts on gene family failures, but less on rejected optimizer values and the implications of the combination of both.

Thank you in advance for your guidance and patience.

hahnlab-user · 2026-02-11T09:24:54Z

hahnlab-user
Feb 11, 2026
Maintainer

Hello,

Complicated questions! I would say generally that you are looking for convergence in parameter values (like lambda) and not necessarily likelihoods. You could also report a mean and range of parameter values among your runs, if you wanted to be more transparent.

As for failures, this generally means that a particular parameter value used with a particular gene family results in an incalculable number. I would only really worry if your estimated lambda is near a region of parameter space with many failures.

Matt

1 reply

tsbernat Feb 20, 2026
Author

Hello,

Thank you so much for your help.

That makes sense regarding the convergence of lambda. While I agree that reporting a mean and range of parameter values is a good idea when values are not perfectly convergent, I'm still curious whether you have an opinion on the degree of convergence needed for a model to be considered valid.

For example, when I run a model on my data with k=2 and lambda =2, the first lambda value is fairly convergent (mostly the same value) but it is also hitting the maximum possible lambda for the topology. I would guess that hitting this boundary is constraining my results so I should increase to a third lambda. However, increasing to a third lambda strongly decreases the convergence of my lambda values. My global model produces results near a region of parameter space with many failures.

Maximum possible lambda for this topology is 0.0181555

k = 2, L = 2 model

| Run # | -lnL | lambda 1 | lambda 2 |
| --- | --- | --- | --- |
| Run1 | 227199 | 0.00112788681 | 0.00325728642 |
| Run2 | 238671 | 0.0003206611218 | 0.002695611511 |
| Run3 | 209496 | 0.01815371645 | 0.005371707246 |
| Run4 | 209520 | 0.01815554383 | 0.005827456973 |
| Run5 | 209499 | 0.01815554411 | 0.005531776195 |
| Run6 | 209495 | 0.01815554294 | 0.005372491621 |
| Run7 | 216712 | 0.004714158697 | 0.00254966258 |
| Run8 | 209495 | 0.01815554347 | 0.005370515135 |
| Run9 | 209503 | 0.01810237736 | 0.005325417711 |
| Run10 | 209495 | 0.01815554355 | 0.005368106287 |

Runs with a single family failure: 1, 2, & 7
None of the runs with family failures had rejected optimizer values, while those with no family failures had rejected optimizer values ranging from 13-25%.
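One way to make the pattern in this table explicit is to group the runs by how close they are to the best -lnL and to flag runs whose lambda sits at the topology limit. The sketch below uses the values from the table; the helper names and both tolerances (25 likelihood units, 1% of the maximum lambda) are arbitrary choices for illustration, not CAFE5 recommendations.

```python
MAX_LAMBDA = 0.0181555  # "Maximum possible lambda for this topology"

# (run, -lnL, lambda 1, lambda 2) for the k = 2, L = 2 model
runs = [
    (1, 227199, 0.00112788681, 0.00325728642),
    (2, 238671, 0.0003206611218, 0.002695611511),
    (3, 209496, 0.01815371645, 0.005371707246),
    (4, 209520, 0.01815554383, 0.005827456973),
    (5, 209499, 0.01815554411, 0.005531776195),
    (6, 209495, 0.01815554294, 0.005372491621),
    (7, 216712, 0.004714158697, 0.00254966258),
    (8, 209495, 0.01815554347, 0.005370515135),
    (9, 209503, 0.01810237736, 0.005325417711),
    (10, 209495, 0.01815554355, 0.005368106287),
]


def near_best(rows, tol):
    """Run numbers whose -lnL is within `tol` of the smallest -lnL."""
    best = min(row[1] for row in rows)
    return [row[0] for row in rows if row[1] - best <= tol]


def at_boundary(rows, max_lambda, rel_tol=0.01):
    """Run numbers with any lambda within `rel_tol` of the maximum lambda."""
    cutoff = max_lambda * (1.0 - rel_tol)
    return [row[0] for row in rows if any(lam >= cutoff for lam in row[2:])]


print("near best -lnL:", near_best(runs, tol=25))
print("at lambda limit:", at_boundary(runs, MAX_LAMBDA))
```

With these tolerances both lists contain the same seven runs, which matches the observation that the runs agreeing on -lnL are also the ones pinned at the maximum lambda.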

k = 2, L = 3 model

| Run # | -lnL | lambda 1 | lambda 2 | lambda 3 |
| --- | --- | --- | --- | --- |
| Run1 | 203703 | 0.01215777616 | 0.005541325867 | 2.02E-05 |
| Run2 | 224745 | 0.002787252666 | 0.001056747642 | 2.61E-05 |
| Run3 | 203832 | 0.01194362954 | 0.007254228531 | 9.66E-07 |
| Run4 | 206490 | 0.006765164604 | 0.004594020145 | 1.09E-07 |
| Run5 | 212618 | 0.006717156503 | 0.001887578312 | 3.87E-05 |
| Run6 | 207080 | 0.008686749091 | 0.003165000142 | 7.35E-08 |
| Run7 | 208100 | 0.005248259922 | 0.00446111915 | 2.82E-05 |
| Run8 | 207157 | 0.00566869976 | 0.005639790495 | 9.15E-06 |
| Run9 | 216788 | 0.002777281367 | 0.002153966545 | 7.55E-06 |
| Run10 | 207833 | 0.006896917113 | 0.003938972389 | 2.51E-04 |

Runs with a single family failure: 4, 5, & 6
All of the runs had rejected optimizer values ranging from 7-13%
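Following the earlier suggestion to report a mean and range of parameter values among runs, the three lambdas of this model could be summarized as in the sketch below. The `summarize` helper is hypothetical and the values are the ones listed for the k = 2, L = 3 runs.

```python
from statistics import mean

# Lambda estimates from the ten k = 2, L = 3 runs, in run order.
runs_l3 = {
    "lambda 1": [0.01215777616, 0.002787252666, 0.01194362954, 0.006765164604,
                 0.006717156503, 0.008686749091, 0.005248259922, 0.00566869976,
                 0.002777281367, 0.006896917113],
    "lambda 2": [0.005541325867, 0.001056747642, 0.007254228531, 0.004594020145,
                 0.001887578312, 0.003165000142, 0.00446111915, 0.005639790495,
                 0.002153966545, 0.003938972389],
    "lambda 3": [2.02e-05, 2.61e-05, 9.66e-07, 1.09e-07, 3.87e-05,
                 7.35e-08, 2.82e-05, 9.15e-06, 7.55e-06, 2.51e-04],
}


def summarize(values):
    """Mean, minimum and maximum of one parameter across runs."""
    return {"mean": mean(values), "min": min(values), "max": max(values)}


for name, values in runs_l3.items():
    s = summarize(values)
    print(f"{name}: mean={s['mean']:.3g} range=[{s['min']:.3g}, {s['max']:.3g}]")
```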

Based on your advice, these runs do not have a significant number of family failures, so I should not worry. However, I'm wondering whether the low values for the third lambda mean the model is now overfitting. Also, I do not understand whether I should be cautious about the rejected optimizer values.

I would love to better understand how concerned I should be about this type of output:
287 values were attempted (10% rejected)
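To compare this figure across many runs, the line could be pulled out of each run's captured output with a small parser such as the sketch below. The pattern is inferred only from the single line quoted here, so the exact wording in other CAFE5 versions, and where the line is written, are assumptions.

```python
import re

# Pattern inferred from the one line quoted in this post.
ATTEMPTS_RE = re.compile(r"(\d+) values were attempted \((\d+)% rejected\)")


def parse_attempts(line):
    """Return (values attempted, percent rejected), or None if the line does not match."""
    match = ATTEMPTS_RE.search(line)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


print(parse_attempts("287 values were attempted (10% rejected)"))
```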

In short, my three main questions are as follows:

1. How do you determine what level of convergence is valid for testing a model against other models (to determine best fit for your data)?
2. Do very low lambda values point to overfitting?
3. Should I be concerned about the amount of rejected attempted optimizer values (separate from family failures)?

I appreciate your help and look forward to hearing your response.

Best,
Tatum

hahnlab-user · 2026-03-05T14:19:15Z

hahnlab-user
Mar 5, 2026
Maintainer

Hi,

Sorry for the delay in answering. This seems like a rather complicated dataset, so it's a bit hard to know what's going on from here. One thing I'm confused about is k and lambda: for a gamma model with k categories, there will also be k mean lambda values, one for each category. Are these the lambdas you mean, or are you also fitting varying numbers of branch-specific lambdas?

Regardless, I would be interested to know what lambda you are getting without the gamma model. Is it near the limit?

Matt

tsbernat · 2026-03-30T19:20:44Z

tsbernat
Mar 30, 2026
Author

Hi Matt,

Thank you so much for your response.

I previously misunderstood and thought I could not see the output for the k mean lambda values for each gamma rate category and could only obtain the branch-specific lambdas via the Gamma_results.txt file. Your response to issue #246 helped me understand that the Gamma Cat Mean column of Gamma_family_likelihoods.txt represents these k mean lambda values.

If I am now understanding correctly, k mean lambda values from Gamma_family_likelihoods.txt represent multiplier values for rates of gene family evolution and would be what I am more interested in (in addition to which gene families have expanded and contracted). How many discrete rate categories (or k) you define will determine how the model discretizes the gamma curve (or in other words, changes how many mean rates of gene evolution are tested to fit each gene family). For example, a k=2 would give you two mean rates with one high and one low rate of gene evolution and whichever gamma cat mean fits the gene family better, will be significant.
Similarly, k=3 would give a high, a low, and an intermediate rate of gene evolution and whichever gamma cat mean fits the gene family better, will be significant. If the gene family does not fit into any of the gamma rate categories, then it will be N/S, which means the rate category this family belongs to is uncertain. However, this gene family could still be significant within the Gamma_family_results.txt file if it has undergone a contraction/expansion.

Since I now understand better, I can better answer your question about the limit.
The branch-specific lambda I get without the gamma model is not near the limit (and most runs have many families that fail). Once I introduce the functionality of the gamma model, values seem to approach the limit from the perspective of the k mean lambda values.

Without the gamma model, the base lambda is well below the limit:
Lambda: 0.0047895072966212
Maximum possible lambda for this topology: 0.0181555

With the gamma model (k=2 & a single branch lambda not specified by a tree), the single global lambda is again not near the limit:
Lambda: 0.0094473421357532
Maximum possible lambda for this topology: 0.0181555
In this second case, the k mean lambda values are 0.0789083 and 1.92109. When you multiply 1.92109 and 0.0094473421357532, you get 0.018149… which is essentially the limit.

Increasing the number of gamma rate categories (for example, k = 3 with a single branch lambda) follows this same pattern. The single global lambda is 0.0067072431833528, while the k mean lambda values are 0.00443491, 0.288718, and 2.70685. When you multiply 2.70685 and 0.0067072431833528, you get about 0.0181555, which is again essentially the limit.
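The arithmetic can be checked by multiplying each base lambda by its Gamma Cat Mean values and comparing the largest product with the maximum lambda. The sketch below uses only the numbers reported in this post; the variable and function names are hypothetical.

```python
MAX_LAMBDA = 0.0181555  # "Maximum possible lambda for this topology"

# Base lambda and Gamma Cat Mean multipliers reported for each fit.
fits = {
    "k=2": (0.0094473421357532, [0.0789083, 1.92109]),
    "k=3": (0.0067072431833528, [0.00443491, 0.288718, 2.70685]),
}


def effective_lambdas(base, multipliers):
    """Per-category lambda: the base lambda scaled by each category multiplier."""
    return [base * m for m in multipliers]


for label, (base, multipliers) in fits.items():
    top = max(effective_lambdas(base, multipliers))
    print(f"{label}: highest category lambda = {top:.7f} "
          f"({top / MAX_LAMBDA:.2%} of the maximum)")
```

In both fits the highest category lands within a small fraction of a percent of the maximum, even though the base lambda itself is well below it.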

My questions:

1. I cannot specify particular multiplier values for gamma rate categories (only the number of categories, like k=2 or k=3). If I want to determine the best model fit for my data following my original workflow, would I just simulate 1000 datasets under the global lambdas for each of the gamma rate categories? For example, the global lambda for k=2 vs. the global lambda for k=3 while specifying the number of rate categories in the simulation?
2. My goal is to understand which gene families have expanded or contracted in particular groups of the tree, related to traits of interest. Because of my mixup, I now know that introducing branch-specific lambdas improves my model fit, and I do not think I should ignore this finding: all of my runs with a single lambda have one to many gene families that fail, while introducing more than one branch-specific lambda leads to 70% of my runs having no gene family failures, and the latter runs have smaller -lnL values. Does this mean I should first find the number of gamma rate categories with the best likelihood (in my case, k=4) and then test different branch-specific/multiple lambda models against each other for k=4? It seems logical to avoid family failures using the higher lambdas, but then I run into rejected attempted optimizer values.
3. Again, it seems logical to avoid family failures using the higher lambdas, but I run into rejected attempted optimizer values when I start to increase the branch-specific lambda. How should I interpret these, and should I be concerned about the amount of rejected attempted optimizer values (separate from family failures)? What’s more important: the amount of rejected attempted optimizer values or the amount of family failures?
4. Is the maximum possible lambda for the topology something to be worried about?

Thank you for reading my long discussion post. I write a lot, in part, to hopefully help others follow along in the future if they encounter similar questions. I look forward to hearing your response. You have been incredibly helpful while I learn and I cannot thank you enough.

Best,
Tatum

hahnlab-user · 2026-03-31T08:48:04Z

hahnlab-user
Mar 31, 2026
Maintainer

Hi Tatum,

First, a clarification: there is no sense of "significance" with the gamma categories, either for families within a category or in the number of k. That is, families do not fit one category or another "significantly" better--they just have posterior probabilities of membership. Likewise, there is no statistical way to pick the best k, with simulations or otherwise.
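As a rough illustration of what a posterior probability of membership is, the sketch below normalizes per-category likelihoods for one family so that they sum to 1. It assumes an equal prior weight for every rate category, which this thread does not confirm for CAFE5, and the function name and inputs are hypothetical.

```python
import math


def posterior_membership(log_likelihoods):
    """Posterior probability of each rate category for one family.

    Assumes equal prior weight per category (an assumption, not taken from CAFE5).
    """
    top = max(log_likelihoods)
    # Subtract the maximum before exponentiating to avoid numerical underflow.
    weights = [math.exp(ll - top) for ll in log_likelihoods]
    total = sum(weights)
    return [w / total for w in weights]


# Hypothetical per-category log-likelihoods for a single family with k = 3.
print(posterior_membership([-12.4, -10.1, -15.8]))
```

The result is a set of probabilities across categories rather than a test: a family with posteriors close to 1/k is simply not well assigned to any one category.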

As to your other questions:

1. Answered above.
2. If you just want to know which families have contracted or expanded, you don't need any fancy models (though you might gain a little bit of accuracy with either branch-specific or gamma-rates). Also, there is no such thing as "significantly" expanded or contracted--these are just inferences about changes in size.
3. I'm not sure how to answer this, actually. I guess it just depends on how many families have failures.
4. The maximum possible lambda itself isn't anything to worry about, but if you are hitting this limit your tree might be too deep. You could try analyzing a smaller set of branches.

cheers,
Matt
