1. Introduction: The Scalability Wall of Cluster-Wide Locking
Currently, Quartz relies on a pessimistic cluster-wide locking strategy via the QRTZ_LOCKS table (specifically the TRIGGER_ACCESS lock). While this ensures high data integrity, it creates a significant scalability bottleneck in clustered environments or high-throughput instances processing thousands of jobs per minute.
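For context, the default semaphore implementation (StdRowLockSemaphore) serializes every node on a single row; its acquisition is roughly the following (a simplified sketch, not the exact delegate SQL):
SQL
-- All nodes queue here until the current holder commits
SELECT * FROM QRTZ_LOCKS
WHERE SCHED_NAME = 'MyScheduler'
AND LOCK_NAME = 'TRIGGER_ACCESS'
FOR UPDATE;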
The official Quartz documentation acknowledges this architectural "speed limit" in the Quartz FAQ:
"...The clustering feature works best for scaling out long-running and/or cpu-intensive jobs (distributing the work-load over multiple nodes). If you need to scale out to support thousands of short-running (e.g 1 second) jobs, consider partitioning the set of jobs by using multiple distinct schedulers. Using one scheduler forces the use of a cluster-wide lock, a pattern that degrades performance as you add more clients."
The Problem:
- Sequential Acquisition: Even with multiple nodes, the cluster-wide lock acts as a global mutex. Only one node can "acquire" jobs at a time; the others sit idle in a database "wait" state.
- Completion Bottleneck: Because the completion phase is thread-per-job, hundreds of worker threads across the cluster compete for the same single row in QRTZ_LOCKS to update trigger states.
- Payload Tax: Jobs with large JobDataMaps (property bags) lengthen the transaction hold time, which in turn lengthens the wait queue for the cluster-wide lock.
2. Conceptual Proposal: Row-Based Concurrency
We propose an optional high-performance locking mode that moves synchronization from a Cluster-Wide Mutex to Fine-Grained Row-Level Locking on the QRTZ_TRIGGERS table.
By utilizing modern SQL features like FOR UPDATE SKIP LOCKED (available in PostgreSQL 9.5+, Oracle 11g+, and MySQL 8.0+; MS SQL Server offers the same semantics via its READPAST hint, as sketched after the list below), Quartz can achieve true horizontal scalability:
- Parallel Acquisition: Multiple nodes can query the triggers table simultaneously. Node A grabs a batch; Node B skips those locked rows and immediately grabs the next available batch without blocking.
- Parallel Completion: Completion logic would lock only the specific row being updated, allowing hundreds of threads to commit state changes simultaneously.
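MS SQL Server has no literal SKIP LOCKED clause, but the same non-blocking batch acquisition can be approximated with its locking hints. A hypothetical T-SQL sketch, reusing the table and columns above:
SQL
-- UPDLOCK + READPAST: lock what you read, skip rows other sessions hold
SELECT TOP (@batchSize) TRIGGER_NAME, TRIGGER_GROUP, NEXT_FIRE_TIME, PRIORITY
FROM QRTZ_TRIGGERS WITH (UPDLOCK, READPAST, ROWLOCK)
WHERE SCHED_NAME = 'MyScheduler'
AND TRIGGER_STATE = 'WAITING'
AND NEXT_FIRE_TIME <= @now
ORDER BY NEXT_FIRE_TIME ASC, PRIORITY DESC;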
3. Technical Example: Proposed SQL Implementation
The DriverDelegate would be updated to support a "Row-Locking" mode. Below is the conceptual SQL logic:
A. High-Concurrency Acquisition
Instead of locking a global semaphore, the acquisition query targets rows directly:
SQL
-- Acquisition with Skip Locked
SELECT TRIGGER_NAME, TRIGGER_GROUP, NEXT_FIRE_TIME, PRIORITY
FROM QRTZ_TRIGGERS
WHERE SCHED_NAME = 'MyScheduler'
AND TRIGGER_STATE = 'WAITING'
AND NEXT_FIRE_TIME <= :now
ORDER BY NEXT_FIRE_TIME ASC, PRIORITY DESC
LIMIT :batchSize  -- PostgreSQL/MySQL syntax: LIMIT precedes the locking clause
FOR UPDATE SKIP LOCKED;
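The SELECT only locks the rows it returns; within the same transaction, the acquiring node would then flip those triggers to the ACQUIRED state before committing. A conceptual sketch, reusing the naming above:
SQL
-- Still inside the acquisition transaction: claim each locked trigger
UPDATE QRTZ_TRIGGERS
SET TRIGGER_STATE = 'ACQUIRED'
WHERE SCHED_NAME = 'MyScheduler'
AND TRIGGER_NAME = :name
AND TRIGGER_GROUP = :group;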
B. Row-Level State Transition (Completion)
During completion, we avoid the cluster-wide lock and target the specific trigger row to ensure atomicity.
Step 1: Lock the specific trigger row
SQL
-- Explicitly lock only the specific row being finalized
SELECT TRIGGER_STATE
FROM QRTZ_TRIGGERS
WHERE SCHED_NAME = 'MyScheduler'
AND TRIGGER_NAME = :name
AND TRIGGER_GROUP = :group
FOR UPDATE;
Step 2: Update the state
SQL
-- Completion update using the held row-level lock
UPDATE QRTZ_TRIGGERS
SET TRIGGER_STATE = 'WAITING',
NEXT_FIRE_TIME = :nextTime,
PREV_FIRE_TIME = :lastTime
WHERE SCHED_NAME = 'MyScheduler'
AND TRIGGER_NAME = :name
AND TRIGGER_GROUP = :group;
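Taken together, the two steps would run in one short transaction so the row lock spans both the read and the write; conceptually (a PostgreSQL-flavored sketch):
SQL
BEGIN;
-- Step 1: lock only the trigger row being finalized
SELECT TRIGGER_STATE
FROM QRTZ_TRIGGERS
WHERE SCHED_NAME = 'MyScheduler'
AND TRIGGER_NAME = :name
AND TRIGGER_GROUP = :group
FOR UPDATE;
-- Step 2: apply the state transition under the held lock
UPDATE QRTZ_TRIGGERS
SET TRIGGER_STATE = 'WAITING',
NEXT_FIRE_TIME = :nextTime,
PREV_FIRE_TIME = :lastTime
WHERE SCHED_NAME = 'MyScheduler'
AND TRIGGER_NAME = :name
AND TRIGGER_GROUP = :group;
COMMIT;  -- releases the row lock; no other trigger row was ever blocked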
4. Suggested Configuration
We suggest a new property to enable this behavior: org.quartz.jobStore.useRowLevelLocking = true
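In quartz.properties, the flag might sit alongside the existing JDBC job store settings (useRowLevelLocking is the proposed name, not an existing Quartz option):
Properties
# Hypothetical configuration for the proposed row-locking mode
org.quartz.jobStore.class = org.quartz.impl.jdbcjobstore.JobStoreTX
org.quartz.jobStore.driverDelegateClass = org.quartz.impl.jdbcjobstore.PostgreSQLDelegate
org.quartz.jobStore.isClustered = true
org.quartz.jobStore.useRowLevelLocking = true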
This feature would allow a single Quartz scheduler instance to handle the "thousands of short-running jobs" use case without requiring the complex partitioning currently recommended in the documentation.