[Fix-18131] Workflow instance stuck in RUNNING state forever when using CONTINUE failure strategy with a failed upstream task #18146

Open
SbloodyS wants to merge 8 commits into apache:dev from SbloodyS:fix_18131

Member

SbloodyS commented Apr 8, 2026

Was this PR generated or assisted by AI?

Yes. It is generated by AI.

Purpose of the pull request

fix #18131

Brief change log

Verify this pull request

This pull request is code cleanup without any test coverage.

(or)

This pull request is already covered by existing tests, such as (please describe tests).

(or)

This change added tests and can be verified as follows:

(or)

Pull Request Notice

If your pull request contains an incompatible change, you should also add it to docs/docs/en/guide/upgrade/incompatible.md

SbloodyS self-assigned this Apr 8, 2026
SbloodyS added this to the 3.4.2 milestone Apr 8, 2026
SbloodyS added the bug and backend labels Apr 8, 2026
github-actions bot added the test label Apr 8, 2026
SbloodyS requested a review from zhongjiajie April 9, 2026 01:25
zhongjiajie previously approved these changes Apr 9, 2026
Member

ruanwenjun left a comment

Maybe we only need to add a new check in AbstractWorkflowStateAction at line 90: if all predecessors of the triggerCandidateTasks are inactive and readyToTriggerTasks is empty, then call emitWorkflowFinishedEventIfApplicable.

protected void triggerTasks(final IWorkflowExecutionRunnable workflowExecutionRunnable,
                            final List<ITaskExecutionRunnable> triggerCandidateTasks) {
    final IWorkflowExecutionGraph workflowExecutionGraph = workflowExecutionRunnable.getWorkflowExecutionGraph();
    // Candidates whose trigger condition is already met, sorted by name for a stable order.
    final List<ITaskExecutionRunnable> readyToTriggerTasks = triggerCandidateTasks
            .stream()
            .filter(workflowExecutionGraph::isTriggerConditionMet)
            .sorted(Comparator.comparing(ITaskExecutionRunnable::getName))
            .collect(Collectors.toList());
    // True when every predecessor of every candidate task is inactive.
    final boolean isAllCandidateTaskPredecessorFinished = triggerCandidateTasks.stream()
            .flatMap(t -> workflowExecutionGraph.getPredecessors(t.getName()).stream())
            .allMatch(workflowExecutionGraph::isTaskExecutionRunnableInActive);

    // Nothing is ready to trigger: if no predecessor is still active, check whether
    // the workflow can be finished rather than leaving it waiting.
    if (CollectionUtils.isEmpty(readyToTriggerTasks)) {
        if (isAllCandidateTaskPredecessorFinished) {
            emitWorkflowFinishedEventIfApplicable(workflowExecutionRunnable);
        }
        return;
    }
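
For readers who want to try the suggested check in isolation, the following is a minimal, self-contained sketch of the same decision. It is not DolphinScheduler code: `TriggerCheckSketch`, `Graph`, `State`, and `shouldEmitFinished` are hypothetical stand-ins, and the trigger condition is simplified to "every predecessor succeeded", which is an assumption made only for illustration.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

public class TriggerCheckSketch {

    // Hypothetical, simplified task states (not the project's real status model).
    enum State { PENDING, RUNNING, SUCCESS, FAILURE }

    // Hypothetical stand-in for the workflow execution graph.
    static final class Graph {
        private final Map<String, List<String>> predecessors = new HashMap<>();
        private final Map<String, State> states = new HashMap<>();

        void addTask(String name, State state, String... preds) {
            states.put(name, state);
            predecessors.put(name, Arrays.asList(preds));
        }

        List<String> getPredecessors(String name) {
            return predecessors.get(name);
        }

        // A task counts as inactive here once it has reached a terminal state.
        boolean isInactive(String name) {
            State s = states.get(name);
            return s == State.SUCCESS || s == State.FAILURE;
        }

        // Assumption for this sketch: a task may start only if all predecessors succeeded.
        boolean isTriggerConditionMet(String name) {
            return predecessors.get(name).stream()
                    .allMatch(p -> states.get(p) == State.SUCCESS);
        }
    }

    // Returns true when no candidate can be triggered and no predecessor is still
    // active, i.e. the case in which the workflow-finished check should run.
    static boolean shouldEmitFinished(Graph graph, List<String> candidates) {
        List<String> ready = candidates.stream()
                .filter(graph::isTriggerConditionMet)
                .sorted()
                .collect(Collectors.toList());
        boolean allPredecessorsInactive = candidates.stream()
                .flatMap(t -> graph.getPredecessors(t).stream())
                .allMatch(graph::isInactive);
        return ready.isEmpty() && allPredecessorsInactive;
    }

    public static void main(String[] args) {
        // Upstream task A failed, so downstream task B can never be triggered.
        Graph failedUpstream = new Graph();
        failedUpstream.addTask("A", State.FAILURE);
        failedUpstream.addTask("B", State.PENDING, "A");
        System.out.println(shouldEmitFinished(failedUpstream, Arrays.asList("B")));

        // Upstream task A is still running, so B may yet become ready.
        Graph runningUpstream = new Graph();
        runningUpstream.addTask("A", State.RUNNING);
        runningUpstream.addTask("B", State.PENDING, "A");
        System.out.println(shouldEmitFinished(runningUpstream, Arrays.asList("B")));
    }
}
```

In this sketch the first case models the reported hang: the only candidate has a failed predecessor, nothing is ready, and nothing is still running, so the finished check is the only way forward. Note that `allMatch` on an empty stream returns true, so an empty candidate list also leads to the finished check, as it would in the snippet above.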

sonarqubecloud bot commented Apr 9, 2026

Quality Gate failed

Failed conditions
0.0% Coverage on New Code (required ≥ 60%)

See analysis details on SonarQube Cloud

  .assertThat(repository.queryWorkflowInstance(workflowInstanceId))
  .matches(
- workflowInstance -> workflowInstance.getState() == WorkflowExecutionStatus.SUCCESS);
+ workflowInstance -> workflowInstance.getState() == WorkflowExecutionStatus.FAILURE);
Member

We may not need to change this case?

Member Author

Yes. You are right. I reverted this.

taskType: LogicFakeTask
taskParams: '{"localParams":null,"varPool":[],"shellScript":"echo success"}'
workerGroup: default
environmentCode: 144873539254368
Member

We may not need to change this?

Member Author

It will cause local execution exceptions...

Development

Successfully merging this pull request may close these issues.

[Bug] Workflow instance stuck in RUNNING state forever when using CONTINUE failure strategy with a failed upstream task

3 participants