Add maxfailures option to limit test failures before stopping by VanitasCodes · Pull Request #61560 · JuliaLang/julia

VanitasCodes · 2026-04-11T02:07:01Z

This came out of the triage discussion on #61483, where @oscardssmith suggested that instead of changing the default behavior of @testset, we should add a maxfailures option that test runners can use to control how many failures are tolerated before stopping execution. This PR implements the Test stdlib side of that.

There's a global atomic counter that tracks failures and errors across testsets, and a configurable limit. When the count hits the limit, a MaxFailuresError is thrown and execution stops. The default limit is typemax(Int), so existing behavior is completely unchanged unless you explicitly call set_max_failures.
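To make the mechanism concrete, here is a minimal standalone sketch of the counter-and-limit idea described above. It is not the PR's code: `failure_count`, `failure_limit`, `record_failure!`, and `simulate` are names assumed for illustration, and `MaxFailuresError` is redefined locally so the snippet runs without the patched Test stdlib.

```julia
# Standalone sketch; the names mirror the PR description but are assumptions.
struct MaxFailuresError <: Exception
    limit::Int
    count::Int
end

const failure_count = Threads.Atomic{Int}(0)
const failure_limit = Threads.Atomic{Int}(typemax(Int))  # default: effectively unlimited

# Record one failure and throw once the running total reaches the limit.
function record_failure!()
    # atomic_add! returns the value *before* the increment, hence the +1
    count = Threads.atomic_add!(failure_count, 1) + 1
    limit = failure_limit[]
    count >= limit && throw(MaxFailuresError(limit, count))
    return count
end

# Simulate up to `n` failures and report how many were recorded before stopping.
function simulate(n::Integer)
    failure_count[] = 0
    try
        for _ in 1:n
            record_failure!()
        end
    catch err
        err isa MaxFailuresError || rethrow()
    end
    return failure_count[]
end

failure_limit[] = 3
stopped_at = simulate(10)   # stops once the third failure is recorded

failure_limit[] = typemax(Int)
unlimited = simulate(10)    # all ten failures are recorded
```

With the limit left at `typemax(Int)` the check never fires in practice, which is how the description's "existing behavior is unchanged" claim would hold in this sketch.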

Four new functions are exported from Test:

set_max_failures(n) - set the limit
get_max_failures() - read the current limit
get_failure_count() - read the current failure count
reset_failure_count() - reset the counter to zero

The MaxFailuresError integrates with the existing is_failfast_error machinery, so it propagates correctly through nested testsets the same way FailFastError does. If both failfast and maxfailures are set, failfast takes precedence since it's checked first in record.

The intended usage is for something like Pkg.test(; maxfailures=10) to call Test.set_max_failures(10) before running tests, which would be a follow-up PR.

VanitasCodes · 2026-04-11T02:08:08Z

@oscardssmith Is this along the lines of what you had in mind?

Copilot

Pull request overview

Note

Copilot was unable to run its full agentic suite in this review.

Adds a maxfailures mechanism to Julia’s Test stdlib to stop a test run after a configurable number of failures/errors, without changing default @testset behavior.

Changes:

Introduces global atomic failure counter + limit, plus exported setters/getters/reset helpers.
Adds MaxFailuresError and integrates it with existing failfast propagation/printing paths.
Adds stdlib tests validating default behavior, stopping at N failures, counting errors, and invalid input.

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 8 comments.

File	Description
stdlib/Test/src/Test.jl	Implements global max-failures tracking, exports API, adds `MaxFailuresError`, and wires it into `record` and top-level printing.
stdlib/Test/test/runtests.jl	Adds integration-style tests that spawn a Julia process to verify max-failures behavior and messaging.


Copilot · 2026-04-11T02:15:30Z

stdlib/Test/src/Test.jl

+    set_max_failures(n::Integer)
+
+Set the maximum number of test failures (fails + errors) allowed before stopping.
+Default is `typemax(Int)` (no limit). Set to `0` to stop on first failure.


The docstring states “Set to 0 to stop on first failure”, but with the current record logic (count >= limit) set_max_failures(1) also stops on the first failure (and the added tests use 1 for “stop after 1 failure”). Please clarify the semantics in the docstring to match the implemented/tested behavior (e.g., document that n=1 stops after the first failure, or adjust the comparison logic if n is intended to mean “failures tolerated before stopping”).

Suggested change

-Default is `typemax(Int)` (no limit). Set to `0` to stop on first failure.
+Default is `typemax(Int)` (no limit). Set to `1` to stop after the first failure.
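The two readings of `n` mentioned in this comment can be compared with a small sketch. The helper `stopping_failure` is hypothetical and only models the comparison; it assumes the check runs once per recorded failure, as in the quoted `record` logic.

```julia
# Hypothetical helper: returns the 1-based index of the failure that triggers
# a stop, given a comparison operator and a limit, or `nothing` if the run
# of `nfailures` failures never hits the limit.
function stopping_failure(cmp, limit::Int, nfailures::Int)
    for count in 1:nfailures
        cmp(count, limit) && return count
    end
    return nothing
end

# `count >= limit` (the PR's comparison): 0 and 1 both stop on the first failure.
ge_zero = stopping_failure(>=, 0, 5)
ge_one  = stopping_failure(>=, 1, 5)

# `count > limit` ("failures tolerated before stopping"): 0 stops on the
# first failure, 1 stops on the second.
gt_zero = stopping_failure(>, 0, 5)
gt_one  = stopping_failure(>, 1, 5)
```

Under `>=`, the values `0` and `1` are indistinguishable, which is why the docstring wording and the tests can both appear consistent with the code while disagreeing with each other.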

Copilot · 2026-04-11T02:15:30Z

stdlib/Test/src/Test.jl

+!!! compat "Julia 1.14"
+    This function requires at least Julia 1.14.
+"""
+get_max_failures() = Threads.atomic_add!(global_failure_limit, 0)


Using atomic_add!(x, 0) to read an atomic performs a read-modify-write (RMW) operation, which is heavier than an atomic load and can increase contention. Prefer an atomic load (e.g., global_failure_limit[] / global_failure_count[], or a dedicated atomic_load if used elsewhere in the file) for pure reads.
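As a small illustration of the two read styles, the sketch below uses a local `Threads.Atomic{Int}` named `limit` (an illustrative stand-in, not the PR's global). Both forms yield the same value; the relative cost is the reviewer's claim and is not measured here.

```julia
limit = Threads.Atomic{Int}(10)

# Read-modify-write that adds zero; atomic_add! returns the previous value.
via_rmw = Threads.atomic_add!(limit, 0)

# Plain atomic load through the [] accessor.
via_load = limit[]
```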

Copilot · 2026-04-11T02:15:31Z

stdlib/Test/src/Test.jl

+!!! compat "Julia 1.14"
+    This function requires at least Julia 1.14.
+"""
+get_failure_count() = Threads.atomic_add!(global_failure_count, 0)


Using atomic_add!(x, 0) to read an atomic performs a read-modify-write (RMW) operation, which is heavier than an atomic load and can increase contention. Prefer an atomic load (e.g., global_failure_limit[] / global_failure_count[], or a dedicated atomic_load if used elsewhere in the file) for pure reads.

Copilot · 2026-04-11T02:15:31Z

stdlib/Test/src/Test.jl

+    Threads.atomic_xchg!(global_failure_limit, Int(n))
+    return n


set_max_failures stores Int(n) but returns n (which may be a different type, e.g. BigInt). Consider returning the stored value (Int(n)) or nothing to avoid surprising type/behavior mismatches for callers.

Suggested change

-    Threads.atomic_xchg!(global_failure_limit, Int(n))
-    return n
+    limit = Int(n)
+    Threads.atomic_xchg!(global_failure_limit, limit)
+    return limit
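The type mismatch is easy to reproduce in isolation. In the sketch below, `stored_limit` and the two setter names are hypothetical stand-ins for the PR's global and `set_max_failures`.

```julia
const stored_limit = Threads.Atomic{Int}(typemax(Int))

# Mirrors the original shape: stores Int(n) but hands back the caller's n.
function set_limit_returning_input(n::Integer)
    Threads.atomic_xchg!(stored_limit, Int(n))
    return n
end

# Mirrors the suggested shape: returns exactly what was stored.
function set_limit_returning_stored(n::Integer)
    limit = Int(n)
    Threads.atomic_xchg!(stored_limit, limit)
    return limit
end

a = set_limit_returning_input(big(5))    # a is a BigInt
b = set_limit_returning_stored(big(5))   # b is an Int
```

In both versions, a value that does not fit in `Int` (for example `big(2)^70`) raises an `InexactError` at the `Int(n)` conversion, so the difference is limited to the type of the returned value.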

Copilot · 2026-04-11T02:15:31Z

stdlib/Test/src/Test.jl

    ts.failfast && throw(FailFastError())
+    # check maxfailures limit; +1 because atomic_add! returns the old value
+    count = Threads.atomic_add!(global_failure_count, 1) + 1
+    count >= global_failure_limit[] && throw(MaxFailuresError(global_failure_limit[], count))


global_failure_limit[] is read twice; if another task changes the limit between the check and constructing MaxFailuresError, the thrown error can report a different limit than the one that triggered the stop. Load the limit once into a local limit and use it for both the comparison and the exception payload (this also reduces atomic-load traffic).

Suggested change

-    count >= global_failure_limit[] && throw(MaxFailuresError(global_failure_limit[], count))
+    limit = global_failure_limit[]
+    count >= limit && throw(MaxFailuresError(limit, count))

Copilot · 2026-04-11T02:15:31Z

stdlib/Test/test/runtests.jl

+            result = read(pipeline(ignorestatus(cmd), stderr=devnull), String)
+            @test occursin("Max failures reached: 1", result)
+            @test occursin("First", result)
+            @test !occursin(r"Test Summary:.*\n.*Second", result)


These regexes assume \n line endings. To make the tests more robust across platforms/environments, consider matching \r?\n (or otherwise avoiding hard-coding newline style) so the assertion doesn’t become Windows-sensitive.

Copilot · 2026-04-11T02:15:32Z

stdlib/Test/test/runtests.jl

+            result = read(pipeline(ignorestatus(cmd), stderr=devnull), String)
+            @test occursin("Max failures reached: 2", result)
+            @test occursin("First", result)
+            @test !occursin(r"Test Summary:.*\n.*Second", result)


These regexes assume \n line endings. To make the tests more robust across platforms/environments, consider matching \r?\n (or otherwise avoiding hard-coding newline style) so the assertion doesn’t become Windows-sensitive.

Copilot · 2026-04-11T02:15:32Z

stdlib/Test/test/runtests.jl

+            result = read(pipeline(ignorestatus(cmd), stderr=devnull), String)
+            @test occursin("Max failures reached: 1", result)
+            @test occursin("First", result)
+            @test !occursin(r"Test Summary:.*\n.*Second", result)


These regexes assume \n line endings. To make the tests more robust across platforms/environments, consider matching \r?\n (or otherwise avoiding hard-coding newline style) so the assertion doesn’t become Windows-sensitive.

oscardssmith · 2026-04-11T02:37:29Z

stdlib/Test/src/Test.jl

 export GenericString, GenericSet, GenericDict, GenericArray, GenericOrder
 export TestSetException
 export TestLogger, LogRecord
+export set_max_failures, get_max_failures, get_failure_count, reset_failure_count


These probably shouldn't be exported.

oscardssmith · 2026-04-11T02:37:50Z

stdlib/Test/src/Test.jl


+# Global state for tracking test failures across testsets
+const global_failure_count = Threads.Atomic{Int}(0)
+const global_failure_limit = Threads.Atomic{Int}(typemax(Int))


This probably doesn't need to be atomic.

It should also default to 0 if we want to stay backward compatible.

oscardssmith · 2026-04-11T02:39:53Z

stdlib/Test/src/Test.jl

+"""
+    set_max_failures(n::Integer)
+
+Set the maximum number of test failures (fails + errors) allowed before stopping.
+Default is `typemax(Int)` (no limit). Set to `0` to stop on first failure.
+
+!!! compat "Julia 1.14"
+    This function requires at least Julia 1.14.
+"""
+function set_max_failures(n::Integer)
+    n >= 0 || throw(ArgumentError("maxfailures must be non-negative, got $n"))
+    Threads.atomic_xchg!(global_failure_limit, Int(n))
+    return n
+end
+
+"""
+    get_max_failures()
+
+Get the current failure limit. Returns `typemax(Int)` if no limit is set.
+
+!!! compat "Julia 1.14"
+    This function requires at least Julia 1.14.
+"""
+get_max_failures() = Threads.atomic_add!(global_failure_limit, 0)


By making the failure limit non-atomic, you could delete both of these functions.

oscardssmith · 2026-04-11T02:41:29Z

stdlib/Test/src/Test.jl

+"""
+    get_failure_count()
+
+Get the current count of test failures (fails + errors).
+
+!!! compat "Julia 1.14"
+    This function requires at least Julia 1.14.
+"""
+get_failure_count() = Threads.atomic_add!(global_failure_count, 0)


This can just be `global_failure_count[]`, so we likely don't need a function for it.

oscardssmith · 2026-04-11T02:41:51Z

stdlib/Test/src/Test.jl

+"""
+    reset_failure_count()
+
+Reset the failure counter to zero. Called at the start of test runs.
+
+!!! compat "Julia 1.14"
+    This function requires at least Julia 1.14.
+"""
+function reset_failure_count()
+    Threads.atomic_xchg!(global_failure_count, 0)
+    return nothing
+end


This can just be `global_failure_count[] = 0`, so we likely don't need a function for it.

oscardssmith · 2026-04-11T02:44:17Z

Overall, I think this is the right direction, although seeing the corresponding changes in Pkg would help show how this gets used.

DilumAluthge · 2026-04-11T02:46:45Z

Thanks for putting this together!

> Four new functions are exported from Test:

Do those functions need to be exported (or public)?

VanitasCodes · 2026-04-11T13:09:56Z

@oscardssmith I have a few questions before I revise.

You mentioned the default should be 0 for backward compatibility. I originally went with typemax(Int) to mean "unlimited" so existing behavior stays the same, but I think you're suggesting 0 should mean "disabled/not counting" instead of "stop immediately"? That makes more sense. So the semantics would be 0 means the feature is off, and you'd call set_max_failures(10) to enable it with a limit of 10. Is that what you had in mind?

You're completely right about the atomics. I was overthinking the concurrency case, but tests run sequentially in a single process anyway. I'll switch to a simple Ref{Int} or just a module global.

For the API, if we make the limit nonatomic and just use direct access like global_failure_limit = n, do we still want helper functions at all, or should Pkg just set Test.global_failure_limit directly? I can see it going either way. The functions feel a bit over-engineered for what they do.

Should I draft a companion PR for Pkg showing how Pkg.test(; maxfailures=10) would use this, or wait until this settles?
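For reference, the revision floated in this comment (a plain `Ref{Int}` instead of atomics, with 0 meaning "feature off") could look roughly like the sketch below. This is one possible reading of the review feedback, not settled design: `LimitReached`, `plain_count`, `plain_limit`, and `record_plain_failure!` are all hypothetical names.

```julia
# Hypothetical sketch of the "0 means disabled" reading with non-atomic state.
struct LimitReached <: Exception
    limit::Int
    count::Int
end

const plain_count = Ref(0)
const plain_limit = Ref(0)   # assumption: 0 disables the check entirely

function record_plain_failure!()
    plain_count[] += 1
    limit = plain_limit[]
    # Only enforce the limit when the feature is switched on (limit > 0).
    limit > 0 && plain_count[] >= limit && throw(LimitReached(limit, plain_count[]))
    return plain_count[]
end

# Disabled (default): failures accumulate without stopping.
plain_count[] = 0
foreach(_ -> record_plain_failure!(), 1:5)
disabled_total = plain_count[]

# Enabled with a limit of 2: the second failure throws.
plain_count[] = 0
plain_limit[] = 2
threw = try
    foreach(_ -> record_plain_failure!(), 1:5)
    false
catch err
    err isa LimitReached
end
enabled_total = plain_count[]
```

Under this reading, a caller such as Pkg would only need to assign the limit and reset the counter, which is the trade-off the question about dropping the helper functions is weighing.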

VanitasCodes · 2026-04-11T13:11:30Z

@DilumAluthge I originally exported them thinking they might be useful for custom test runners, but @oscardssmith pointed out they probably shouldn't be. I think you're right that they should either be public (documented but not exported) or just kept internal and accessed as Test.set_max_failures(). What do you think makes more sense? I'm leaning toward not exporting them and letting Pkg access them via the Test. prefix.

DilumAluthge · 2026-04-12T00:16:43Z

> just kept internal and accessed as Test.set_max_failures()

Keeping them internal (non-public) sounds good to me!

Add maxfailures option to limit test failures before stopping

fe95410

Copilot AI review requested due to automatic review settings April 11, 2026 02:07




4 participants