Skip to content

bgpd: initialise nh_flag attribute#225

Open
mwinter-osr wants to merge 1 commit intoopensourcerouting:masterfrom
mwinter-osr:PR21498
Open

bgpd: initialise nh_flag attribute#225
mwinter-osr wants to merge 1 commit intoopensourcerouting:masterfrom
mwinter-osr:PR21498

Conversation

@mwinter-osr
Copy link
Copy Markdown
Member

in "leak_update" function
nh_flag is not initialized when nexthop is valid at route creation.

Original PR21498 by fdumontet6WIND

in "leak_update" function
nh_flag is not initialized when nexthop is valid at route creation.

Signed-off-by: Francois Dumontet <francois.dumontet@6wind.com>
@greptile-apps
Copy link
Copy Markdown

greptile-apps Bot commented Apr 14, 2026

Greptile Summary

This single-line fix initialises BGP_ATTR_NH_VALID in nh_flags when a new leaked route is installed via leak_update, addressing a symmetry gap with the existing update path. The intent is correct, but the implementation applies SET_FLAG on an attribute that is already interned (new_attr is explicitly documented as "already interned" in the function signature comment). nh_flags is included in attrhash_cmp (bgp_attr.c:1100) but not in attrhash_key_make (bgp_attr.c:1048), meaning in-place mutation after interning creates a hash-state divergence.

  • P1: SET_FLAG(new->attr->nh_flags, BGP_ATTR_NH_VALID) mutates a hashed, ref-counted attribute object. The real root cause is in vpn_leak_to_vrf_update at line 2485 where if (bpi && leak_update_nexthop_valid(...)) silently skips BGP_ATTR_NH_VALID for new routes (bpi == NULL). The fix belongs on static_attr before bgp_attr_intern() at line 2615, not inside leak_update on the already-interned pointer.
  • P0: No topotest is included for this VPN route-leaking regression, which is required for merge.

Confidence Score: 4/5

  • P1 architectural issue (interned-attr mutation) and missing P0 topotest block merge as-is.
  • The bug being fixed is real and the symptom (BGP_PATH_VALID set without BGP_ATTR_NH_VALID) is genuine, but the implementation mutates an already-interned attribute, which is architecturally wrong and can cause attrhash inconsistencies on subsequent intern lookups. The correct fix location is upstream in vpn_leak_to_vrf_update. Additionally, no topotest was added, which is a mandatory P0 requirement per repo rules.
  • bgpd/bgp_mplsvpn.c — specifically the root-cause location in vpn_leak_to_vrf_update (line 2485) and the new fix at line 1445.

Important Files Changed

Filename Overview
bgpd/bgp_mplsvpn.c Adds SET_FLAG(new->attr->nh_flags, BGP_ATTR_NH_VALID) on route creation, but does so on an already-interned attribute (comment on line 1225 says "already interned"), which is architecturally incorrect. The root cause is in vpn_leak_to_vrf_update where bpi==NULL short-circuits the nexthop validity check before bgp_attr_intern. Also missing the symmetric UNSET_FLAG in the else branch.

Sequence Diagram

sequenceDiagram
    participant VPN as vpn_leak_to_vrf_update
    participant ATTR as bgp_attr_intern
    participant HASH as attrhash
    participant LU as leak_update

    VPN->>VPN: "bpi==NULL → else: UNSET_FLAG(static_attr.nh_flags, BGP_ATTR_NH_VALID)"
    VPN->>ATTR: "bgp_attr_intern(&static_attr) [nh_flags=0]"
    ATTR->>HASH: "hash_get → allocate entry {nh_flags=0, refcnt=1}"
    ATTR-->>VPN: "new_attr* [interned, nh_flags=0]"
    VPN->>LU: leak_update(..., new_attr [already interned])
    LU->>LU: "info_make → new->attr = new_attr"
    LU->>LU: leak_update_nexthop_valid → true
    LU->>LU: bgp_path_info_set_flag(BGP_PATH_VALID)
    Note over LU,HASH: ⚠️ PR fix: SET_FLAG on interned attr<br/>(hash entry mutated in-place)
    LU->>HASH: "SET_FLAG(new->attr->nh_flags, BGP_ATTR_NH_VALID) → entry now {nh_flags=1}"

    Note over VPN,HASH: Correct approach: set nh_flags on static_attr BEFORE bgp_attr_intern
Loading
Prompt To Fix All With AI
This is a comment left during a code review.
Path: bgpd/bgp_mplsvpn.c
Line: 1445-1447

Comment:
**Mutating an already-interned attribute in-place**

`new->attr` points to the interned attribute returned by `bgp_attr_intern()` in the caller (`comment: "already interned"` on line 1225). `nh_flags` participates in `attrhash_cmp()` (see `bgp_attr.c:1100`), so modifying it post-intern creates a hash table inconsistency: any subsequent `bgp_attr_intern` for an attribute with the same other fields but `nh_flags = 0` will not find this (now-modified) entry and will allocate a duplicate hash entry.

The root cause is upstream: in `vpn_leak_to_vrf_update` the guard `if (bpi && leak_update_nexthop_valid(...))` short-circuits to the `else` branch (unconditionally calling `UNSET_FLAG`) when `bpi == NULL` — i.e., on new-route creation, `BGP_ATTR_NH_VALID` is never set on `static_attr` before `bgp_attr_intern` at line 2615. The fix must be applied there, setting `nh_flags` on `static_attr` **before** interning, following the same pattern used at lines 2487-2489 but without the `bpi &&` short-circuit.

Additionally, the `else` branch (nexthop invalid) does not `UNSET_FLAG`, unlike the symmetric pattern elsewhere:

```suggestion
		bgp_path_info_set_flag(bn, new, BGP_PATH_VALID);
		SET_FLAG(new->attr->nh_flags, BGP_ATTR_NH_VALID);
	} else {
		bgp_path_info_unset_flag(bn, new, BGP_PATH_VALID);
		UNSET_FLAG(new->attr->nh_flags, BGP_ATTR_NH_VALID);
	}
```

But the real fix must be made in the caller so the flag is baked in before `bgp_attr_intern()`, not patched onto the hash entry afterward.

How can I resolve this? If you propose a fix, please make it concise.

Reviews (1): Last reviewed commit: "bgpd: initialise nh_flag attribute" | Re-trigger Greptile

Comment thread bgpd/bgp_mplsvpn.c
Comment on lines +1445 to 1447
SET_FLAG(new->attr->nh_flags, BGP_ATTR_NH_VALID);
} else
bgp_path_info_unset_flag(bn, new, BGP_PATH_VALID);
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 Mutating an already-interned attribute in-place

new->attr points to the interned attribute returned by bgp_attr_intern() in the caller (comment: "already interned" on line 1225). nh_flags participates in attrhash_cmp() (see bgp_attr.c:1100), so modifying it post-intern creates a hash table inconsistency: any subsequent bgp_attr_intern for an attribute with the same other fields but nh_flags = 0 will not find this (now-modified) entry and will allocate a duplicate hash entry.

The root cause is upstream: in vpn_leak_to_vrf_update the guard if (bpi && leak_update_nexthop_valid(...)) short-circuits to the else branch (unconditionally calling UNSET_FLAG) when bpi == NULL — i.e., on new-route creation, BGP_ATTR_NH_VALID is never set on static_attr before bgp_attr_intern at line 2615. The fix must be applied there, setting nh_flags on static_attr before interning, following the same pattern used at lines 2487-2489 but without the bpi && short-circuit.

Additionally, the else branch (nexthop invalid) does not UNSET_FLAG, unlike the symmetric pattern elsewhere:

Suggested change
SET_FLAG(new->attr->nh_flags, BGP_ATTR_NH_VALID);
} else
bgp_path_info_unset_flag(bn, new, BGP_PATH_VALID);
bgp_path_info_set_flag(bn, new, BGP_PATH_VALID);
SET_FLAG(new->attr->nh_flags, BGP_ATTR_NH_VALID);
} else {
bgp_path_info_unset_flag(bn, new, BGP_PATH_VALID);
UNSET_FLAG(new->attr->nh_flags, BGP_ATTR_NH_VALID);
}

But the real fix must be made in the caller so the flag is baked in before bgp_attr_intern(), not patched onto the hash entry afterward.

Prompt To Fix With AI
This is a comment left during a code review.
Path: bgpd/bgp_mplsvpn.c
Line: 1445-1447

Comment:
**Mutating an already-interned attribute in-place**

`new->attr` points to the interned attribute returned by `bgp_attr_intern()` in the caller (`comment: "already interned"` on line 1225). `nh_flags` participates in `attrhash_cmp()` (see `bgp_attr.c:1100`), so modifying it post-intern creates a hash table inconsistency: any subsequent `bgp_attr_intern` for an attribute with the same other fields but `nh_flags = 0` will not find this (now-modified) entry and will allocate a duplicate hash entry.

The root cause is upstream: in `vpn_leak_to_vrf_update` the guard `if (bpi && leak_update_nexthop_valid(...))` short-circuits to the `else` branch (unconditionally calling `UNSET_FLAG`) when `bpi == NULL` — i.e., on new-route creation, `BGP_ATTR_NH_VALID` is never set on `static_attr` before `bgp_attr_intern` at line 2615. The fix must be applied there, setting `nh_flags` on `static_attr` **before** interning, following the same pattern used at lines 2487-2489 but without the `bpi &&` short-circuit.

Additionally, the `else` branch (nexthop invalid) does not `UNSET_FLAG`, unlike the symmetric pattern elsewhere:

```suggestion
		bgp_path_info_set_flag(bn, new, BGP_PATH_VALID);
		SET_FLAG(new->attr->nh_flags, BGP_ATTR_NH_VALID);
	} else {
		bgp_path_info_unset_flag(bn, new, BGP_PATH_VALID);
		UNSET_FLAG(new->attr->nh_flags, BGP_ATTR_NH_VALID);
	}
```

But the real fix must be made in the caller so the flag is baked in before `bgp_attr_intern()`, not patched onto the hash entry afterward.

How can I resolve this? If you propose a fix, please make it concise.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants