Skip to content

zebra: Use zebra dplane for RTM route message#248

Open
yutanaofficial wants to merge 2 commits intomasterfrom
aitest-pr-19536
Open

zebra: Use zebra dplane for RTM route message#248
yutanaofficial wants to merge 2 commits intomasterfrom
aitest-pr-19536

Conversation

@yutanaofficial
Copy link
Copy Markdown

Similar to FRRouting#19296, this PR offloads the parsing and handling of RTM route messages to the dplane.

Canzovo added 2 commits May 1, 2026 13:34
Introduce getter and setter functions for route-related attributes in
the zebra dplane context, including startup flag, nexthop ifindex,
multipath status, nexthop group, and single nexthop.

Signed-off-by: Canzovo <1037306097@qq.com>
a) Move the processing of netlink RTM_NEWROUTE and RTM_DELROUTE messages into
   the dataplane.
b) Introduced `rib_route_dplane_update()` to process the queued dplane
   context and perform the actual RIB update using the stored data.
c) Defer VRF lookup and assignment of vrf_id for each nexthop to the
   main thread, as `Map to VRF` should be done here.

Signed-off-by: Canzovo <1037306097@qq.com>
@greptile-apps
Copy link
Copy Markdown

greptile-apps Bot commented May 1, 2026

Greptile Summary

This PR offloads kernel RTM_NEWROUTE/RTM_DELROUTE netlink message processing from the main pthread to the dplane thread, mirroring the existing neighbour and LSP notification paths. The approach is sound architecturally, but the implementation has several correctness bugs that must be resolved before merging.

  • P0: dplane_ctx_route_set_nexthop_group(ctx, NULL) (when nhe_id != 0 or multipath yields zero nexthops) calls nexthop_group_copy which immediately dereferences the NULL from pointer — instant zebra crash on any route using kernel nexthop IDs.
  • P0: rib_route_dplane_update calls dplane_ctx_fini(&ctx) on early return, but the outer rib_process_dplane_results loop unconditionally calls dplane_ctx_fini after every dispatch — double-free / heap corruption on invalid-table routes.
  • P1: rib_update_nexthop_vrf_id applies a single ifindex-based VRF lookup (from the first rtnexthop entry only) to all ECMP nexthops, regressing the per-nexthop VRF resolution that the original parse_multipath_nexthops_unicast loop performed correctly.

Confidence Score: 1/5

Not safe to merge — two independent P0 crashes affect any deployment using kernel nexthop IDs or routes in non-default tables.

Two P0 bugs (NULL dereference in nexthop_group_copy and double-free in rib_route_dplane_update early return) would crash zebra in well-exercised production paths. A P1 regression silently misassigns VRF IDs to ECMP nexthops. The architectural intent is correct but the implementation needs significant rework before it is safe.

zebra/rt_netlink.c and zebra/zebra_rib.c require the most attention; the new nexthop_group_copy call site and the rib_route_dplane_update lifecycle/VRF logic both have critical defects.

Important Files Changed

Filename Overview
zebra/rt_netlink.c Core change: RTM_NEWROUTE/DELROUTE parsing now enqueues a dplane ctx instead of calling rib_add/rib_delete directly. Introduces P0 NULL dereference in nexthop_group_copy when ng==NULL (nhe_id path), and incorrectly stores only first rtnexthop ifindex for all ECMP nexthops.
zebra/zebra_rib.c New rib_route_dplane_update function reconstructs rib_add/rib_delete from dplane ctx. Contains P0 double-free on early return, P1 NULL dereference on ng->nexthop when empty, and dead error-reporting code due to always-true ng pointer check.
zebra/zebra_dplane.c Adds new ctx accessors for startup, rtnh_ifindex, is_multipath, route_nhg, and nexthop fields. Cleanup in dplane_ctx_free_internal looks correct. dplane_ctx_route_get_nexthop returns a duplicated nexthop (caller must free), which is handled correctly in rib_route_dplane_update.
zebra/zebra_dplane.h Header declarations for new dplane ctx accessors; straightforward, no issues.
zebra/kernel_netlink.c Moves RTM_NEWROUTE/DELROUTE from main netlink socket (groups) to dplane socket (dplane_groups) and routes them to dplane_netlink_information_fetch; approach is architecturally correct.
zebra/rib.h Adds declaration for rib_route_dplane_update; no issues.
zebra/rt_netlink.h Minor: removes a blank line before a block comment; no functional change.

Sequence Diagram

sequenceDiagram
    participant KNL as kernel_netlink (dplane thread)
    participant RTN as rt_netlink
    participant DPLANE as zebra_dplane queue
    participant RIB as zebra_rib (main thread)

    KNL->>RTN: dplane_netlink_information_fetch(RTM_NEWROUTE/DELROUTE)
    RTN->>RTN: netlink_route_change_read_unicast_internal()
    Note over RTN: Parse nlmsghdr into ctx (VRF lookup deferred)
    RTN->>DPLANE: dplane_provider_enqueue_to_zebra(ctx) notif_provider=NOTIF_PROVIDER_KERNEL
    DPLANE-->>RIB: rib_process_dplane_results() wakeup
    RIB->>RIB: notif_provider != 0 → rib_route_dplane_update(ctx)
    Note over RIB: VRF lookup (main pthread only), rib_update_nexthop_vrf_id(), rib_add_multipath() / rib_delete()
    RIB->>DPLANE: dplane_ctx_fini(ctx) [outer loop]
Loading
Prompt To Fix All With AI
Fix the following 5 code review issues. Work through them one at a time, proposing concise fixes.

---

### Issue 1 of 5
zebra/rt_netlink.c:1205
**NULL dereference in `nexthop_group_copy` when `ng == NULL`**

`dplane_ctx_route_set_nexthop_group` calls `nexthop_group_copy(&ctx->u.rinfo.route_nhg, ng)`. Looking at the implementation of `nexthop_group_copy` in `lib/nexthop_group.c`, the very first statement is `to->nhgr = from->nhgr`, with **no NULL check on `from`**. `ng` is NULL in two cases: (1) `nhe_id != 0` (kernel nexthop IDs are in use) — ng is never allocated; (2) `nhe_id == 0` AND multipath AND `nhop_num == 0` (lines 1198-1201 set `ng = NULL`). Both hit this unconditional NULL dereference and crash zebra.

The guard should be added to `dplane_ctx_route_set_nexthop_group` or the call site:

```c
if (ng)
    dplane_ctx_route_set_nexthop_group(ctx, ng);
```

### Issue 2 of 5
zebra/zebra_rib.c:1738-1742
**Double-free when early-returning from `rib_route_dplane_update`**

`rib_process_dplane_results` **always** calls `dplane_ctx_fini(&ctx)` at line 5461 after the dispatch switch, regardless of which branch ran. However, `rib_route_dplane_update` also calls `dplane_ctx_fini(&ctx)` here on the early-return path. Since `ctx` is passed by value (not pointer-to-pointer), the caller's `ctx` variable is not nullified by the callee's `dplane_ctx_fini` call. When `rib_route_dplane_update` returns, the outer loop's unconditional `dplane_ctx_fini` fires a second time on already-freed memory — heap corruption / use-after-free.

Compare with `rib_process_result(ctx)`: it does **not** call `dplane_ctx_fini` internally and relies on the outer loop to clean up. `rib_route_dplane_update` must follow the same contract: never free the ctx, only return early. Remove the `dplane_ctx_fini` call here and simply `return`.

```c
if (IS_ZEBRA_DEBUG_KERNEL)
    zlog_debug("Route opcode: (%s) unable to parse table received %u, ignoring",
               dplane_op2str(op), table);
return;
```

### Issue 3 of 5
zebra/zebra_rib.c:1663-1701
**Multipath VRF ID regression: single ifindex applied to all ECMP nexthops**

The original `parse_multipath_nexthops_unicast` iterated every `rtnexthop` entry, performed an `if_lookup_by_index_per_ns` per nexthop, and created each nexthop with its own resolved `nh_vrf_id`. The new code captures only `rtnh->rtnh_ifindex` of the **first** entry in the multipath list (line 1184 in `rt_netlink.c`) and then stamps that single resolved VRF onto **every** nexthop in `for (ALL_NEXTHOPS_PTR(ng, nh)) nh->vrf_id = nh_vrf_id`. For ECMP routes where different nexthops reach different VRFs through different interfaces this assigns wrong VRF IDs to every nexthop beyond the first, resulting in silently incorrect routing-table entries.

The per-nexthop VRF resolution must be preserved. The fix requires either passing the full `rtnexthop` list to `rib_update_nexthop_vrf_id` and iterating in lockstep, or resolving VRF ID per-nexthop inside a context accessor loop instead of using a single stored `rtnh_ifindex`.

### Issue 4 of 5
zebra/zebra_rib.c:1676
**NULL dereference on `ng->nexthop` when nexthop group is empty**

`rib_update_nexthop_vrf_id` is called unconditionally before the `if (nhe_id || ng)` guard. When `nhe_id != 0` and `!is_multipath`, `ng = dplane_ctx_route_get_nexthop_group(ctx)` returns a pointer to the embedded zero-initialised `route_nhg` field (nexthop is NULL). The function then executes `ng->nexthop->vrf_id = nh_vrf_id` without checking whether `ng->nexthop` is non-NULL, causing a NULL pointer dereference. A guard is needed:

```c
if (ng->nexthop)
    ng->nexthop->vrf_id = nh_vrf_id;
```

### Issue 5 of 5
zebra/zebra_rib.c:1789-1803
**Error-reporting `else` branch is dead code**

`dplane_ctx_route_get_nexthop_group` returns `&ctx->u.rinfo.route_nhg` — always a non-NULL pointer to an embedded struct field. Therefore `if (nhe_id || ng)` is always true and the `else` branch that logs `"multipath RTM_NEWROUTE has a invalid nexthop group"` and frees `re` can never be reached. The sentinel condition should instead test whether the nexthop group actually has members:

```c
if (nhe_id || ng->nexthop) {
    rib_add_multipath(...);
} else {
    zlog_err(...);
    zebra_rib_route_entry_free(re);
}
```

Reviews (1): Last reviewed commit: "zebra: Use zebra dplane for RTM route me..." | Re-trigger Greptile

Comment thread zebra/rt_netlink.c
__func__, &p);
zebra_rib_route_entry_free(re);
dplane_ctx_route_set_is_multipath(ctx, is_multipath);
dplane_ctx_route_set_nexthop_group(ctx, ng);
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P0 NULL dereference in nexthop_group_copy when ng == NULL

dplane_ctx_route_set_nexthop_group calls nexthop_group_copy(&ctx->u.rinfo.route_nhg, ng). Looking at the implementation of nexthop_group_copy in lib/nexthop_group.c, the very first statement is to->nhgr = from->nhgr, with no NULL check on from. ng is NULL in two cases: (1) nhe_id != 0 (kernel nexthop IDs are in use) — ng is never allocated; (2) nhe_id == 0 AND multipath AND nhop_num == 0 (lines 1198-1201 set ng = NULL). Both hit this unconditional NULL dereference and crash zebra.

The guard should be added to dplane_ctx_route_set_nexthop_group or the call site:

if (ng)
    dplane_ctx_route_set_nexthop_group(ctx, ng);
Prompt To Fix With AI
This is a comment left during a code review.
Path: zebra/rt_netlink.c
Line: 1205

Comment:
**NULL dereference in `nexthop_group_copy` when `ng == NULL`**

`dplane_ctx_route_set_nexthop_group` calls `nexthop_group_copy(&ctx->u.rinfo.route_nhg, ng)`. Looking at the implementation of `nexthop_group_copy` in `lib/nexthop_group.c`, the very first statement is `to->nhgr = from->nhgr`, with **no NULL check on `from`**. `ng` is NULL in two cases: (1) `nhe_id != 0` (kernel nexthop IDs are in use) — ng is never allocated; (2) `nhe_id == 0` AND multipath AND `nhop_num == 0` (lines 1198-1201 set `ng = NULL`). Both hit this unconditional NULL dereference and crash zebra.

The guard should be added to `dplane_ctx_route_set_nexthop_group` or the call site:

```c
if (ng)
    dplane_ctx_route_set_nexthop_group(ctx, ng);
```

How can I resolve this? If you propose a fix, please make it concise.

Comment thread zebra/zebra_rib.c
Comment on lines +1738 to +1742
zlog_debug("Route opcode: (%s) unable to parse table received %u, ignoring",
dplane_op2str(op), table);
if (ctx)
dplane_ctx_fini(&ctx);
return;
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P0 Double-free when early-returning from rib_route_dplane_update

rib_process_dplane_results always calls dplane_ctx_fini(&ctx) at line 5461 after the dispatch switch, regardless of which branch ran. However, rib_route_dplane_update also calls dplane_ctx_fini(&ctx) here on the early-return path. Since ctx is passed by value (not pointer-to-pointer), the caller's ctx variable is not nullified by the callee's dplane_ctx_fini call. When rib_route_dplane_update returns, the outer loop's unconditional dplane_ctx_fini fires a second time on already-freed memory — heap corruption / use-after-free.

Compare with rib_process_result(ctx): it does not call dplane_ctx_fini internally and relies on the outer loop to clean up. rib_route_dplane_update must follow the same contract: never free the ctx, only return early. Remove the dplane_ctx_fini call here and simply return.

if (IS_ZEBRA_DEBUG_KERNEL)
    zlog_debug("Route opcode: (%s) unable to parse table received %u, ignoring",
               dplane_op2str(op), table);
return;
Prompt To Fix With AI
This is a comment left during a code review.
Path: zebra/zebra_rib.c
Line: 1738-1742

Comment:
**Double-free when early-returning from `rib_route_dplane_update`**

`rib_process_dplane_results` **always** calls `dplane_ctx_fini(&ctx)` at line 5461 after the dispatch switch, regardless of which branch ran. However, `rib_route_dplane_update` also calls `dplane_ctx_fini(&ctx)` here on the early-return path. Since `ctx` is passed by value (not pointer-to-pointer), the caller's `ctx` variable is not nullified by the callee's `dplane_ctx_fini` call. When `rib_route_dplane_update` returns, the outer loop's unconditional `dplane_ctx_fini` fires a second time on already-freed memory — heap corruption / use-after-free.

Compare with `rib_process_result(ctx)`: it does **not** call `dplane_ctx_fini` internally and relies on the outer loop to clean up. `rib_route_dplane_update` must follow the same contract: never free the ctx, only return early. Remove the `dplane_ctx_fini` call here and simply `return`.

```c
if (IS_ZEBRA_DEBUG_KERNEL)
    zlog_debug("Route opcode: (%s) unable to parse table received %u, ignoring",
               dplane_op2str(op), table);
return;
```

How can I resolve this? If you propose a fix, please make it concise.

Comment thread zebra/zebra_rib.c
Comment on lines +1663 to +1701
/* Update the VRF ID for the nexthop group */
static void rib_update_nexthop_vrf_id(bool is_multipath, ns_id_t ns_id, int index, int ifindex,
struct nexthop_group *ng, vrf_id_t vrf_id)
{
struct interface *ifp = NULL;
vrf_id_t nh_vrf_id = vrf_id;

if (!is_multipath) {
if (index) {
ifp = if_lookup_by_index_per_ns(zebra_ns_lookup(ns_id), index);
if (ifp)
nh_vrf_id = ifp->vrf->vrf_id;
}
ng->nexthop->vrf_id = nh_vrf_id;
} else {
struct nexthop *nh = NULL;

if (ifindex) {
/*
* Yes we are looking this up
* for every nexthop and just
* using the last one looked
* up right now
*/
ifp = if_lookup_by_index_per_ns(zebra_ns_lookup(ns_id), ifindex);
if (ifp)
nh_vrf_id = ifp->vrf->vrf_id;
else {
flog_warn(EC_ZEBRA_UNKNOWN_INTERFACE,
"%s: Unknown interface %u specified, defaulting to VRF_DEFAULT",
__func__, ifindex);
nh_vrf_id = VRF_DEFAULT;
}
} else
nh_vrf_id = vrf_id;

for (ALL_NEXTHOPS_PTR(ng, nh))
nh->vrf_id = nh_vrf_id;
}
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 Multipath VRF ID regression: single ifindex applied to all ECMP nexthops

The original parse_multipath_nexthops_unicast iterated every rtnexthop entry, performed an if_lookup_by_index_per_ns per nexthop, and created each nexthop with its own resolved nh_vrf_id. The new code captures only rtnh->rtnh_ifindex of the first entry in the multipath list (line 1184 in rt_netlink.c) and then stamps that single resolved VRF onto every nexthop in for (ALL_NEXTHOPS_PTR(ng, nh)) nh->vrf_id = nh_vrf_id. For ECMP routes where different nexthops reach different VRFs through different interfaces this assigns wrong VRF IDs to every nexthop beyond the first, resulting in silently incorrect routing-table entries.

The per-nexthop VRF resolution must be preserved. The fix requires either passing the full rtnexthop list to rib_update_nexthop_vrf_id and iterating in lockstep, or resolving VRF ID per-nexthop inside a context accessor loop instead of using a single stored rtnh_ifindex.

Prompt To Fix With AI
This is a comment left during a code review.
Path: zebra/zebra_rib.c
Line: 1663-1701

Comment:
**Multipath VRF ID regression: single ifindex applied to all ECMP nexthops**

The original `parse_multipath_nexthops_unicast` iterated every `rtnexthop` entry, performed an `if_lookup_by_index_per_ns` per nexthop, and created each nexthop with its own resolved `nh_vrf_id`. The new code captures only `rtnh->rtnh_ifindex` of the **first** entry in the multipath list (line 1184 in `rt_netlink.c`) and then stamps that single resolved VRF onto **every** nexthop in `for (ALL_NEXTHOPS_PTR(ng, nh)) nh->vrf_id = nh_vrf_id`. For ECMP routes where different nexthops reach different VRFs through different interfaces this assigns wrong VRF IDs to every nexthop beyond the first, resulting in silently incorrect routing-table entries.

The per-nexthop VRF resolution must be preserved. The fix requires either passing the full `rtnexthop` list to `rib_update_nexthop_vrf_id` and iterating in lockstep, or resolving VRF ID per-nexthop inside a context accessor loop instead of using a single stored `rtnh_ifindex`.

How can I resolve this? If you propose a fix, please make it concise.

Comment thread zebra/zebra_rib.c
if (ifp)
nh_vrf_id = ifp->vrf->vrf_id;
}
ng->nexthop->vrf_id = nh_vrf_id;
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 NULL dereference on ng->nexthop when nexthop group is empty

rib_update_nexthop_vrf_id is called unconditionally before the if (nhe_id || ng) guard. When nhe_id != 0 and !is_multipath, ng = dplane_ctx_route_get_nexthop_group(ctx) returns a pointer to the embedded zero-initialised route_nhg field (nexthop is NULL). The function then executes ng->nexthop->vrf_id = nh_vrf_id without checking whether ng->nexthop is non-NULL, causing a NULL pointer dereference. A guard is needed:

if (ng->nexthop)
    ng->nexthop->vrf_id = nh_vrf_id;
Prompt To Fix With AI
This is a comment left during a code review.
Path: zebra/zebra_rib.c
Line: 1676

Comment:
**NULL dereference on `ng->nexthop` when nexthop group is empty**

`rib_update_nexthop_vrf_id` is called unconditionally before the `if (nhe_id || ng)` guard. When `nhe_id != 0` and `!is_multipath`, `ng = dplane_ctx_route_get_nexthop_group(ctx)` returns a pointer to the embedded zero-initialised `route_nhg` field (nexthop is NULL). The function then executes `ng->nexthop->vrf_id = nh_vrf_id` without checking whether `ng->nexthop` is non-NULL, causing a NULL pointer dereference. A guard is needed:

```c
if (ng->nexthop)
    ng->nexthop->vrf_id = nh_vrf_id;
```

How can I resolve this? If you propose a fix, please make it concise.

Comment thread zebra/zebra_rib.c
Comment on lines +1789 to +1803
if (nhe_id || ng) {
rib_add_multipath(afi, SAFI_UNICAST, &p, (struct prefix_ipv6 *)&src_p, re,
ng, startup);
} else {
/*
* I really don't see how this is possible
* but since we are testing for it let's
* let the end user know why the route
* that was just received was swallowed
* up and forgotten
*/
zlog_err("%s: %pFX multipath RTM_NEWROUTE has a invalid nexthop group from the kernel",
__func__, &p);
zebra_rib_route_entry_free(re);
}
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 Error-reporting else branch is dead code

dplane_ctx_route_get_nexthop_group returns &ctx->u.rinfo.route_nhg — always a non-NULL pointer to an embedded struct field. Therefore if (nhe_id || ng) is always true and the else branch that logs "multipath RTM_NEWROUTE has a invalid nexthop group" and frees re can never be reached. The sentinel condition should instead test whether the nexthop group actually has members:

if (nhe_id || ng->nexthop) {
    rib_add_multipath(...);
} else {
    zlog_err(...);
    zebra_rib_route_entry_free(re);
}
Prompt To Fix With AI
This is a comment left during a code review.
Path: zebra/zebra_rib.c
Line: 1789-1803

Comment:
**Error-reporting `else` branch is dead code**

`dplane_ctx_route_get_nexthop_group` returns `&ctx->u.rinfo.route_nhg` — always a non-NULL pointer to an embedded struct field. Therefore `if (nhe_id || ng)` is always true and the `else` branch that logs `"multipath RTM_NEWROUTE has a invalid nexthop group"` and frees `re` can never be reached. The sentinel condition should instead test whether the nexthop group actually has members:

```c
if (nhe_id || ng->nexthop) {
    rib_add_multipath(...);
} else {
    zlog_err(...);
    zebra_rib_route_entry_free(re);
}
```

How can I resolve this? If you propose a fix, please make it concise.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants