Skip to content

Conversation

@TomasTomecek
Copy link
Member

@TomasTomecek TomasTomecek commented Jan 14, 2026

This first version of this PR was completely implemented by Claude Code and Ambient Code.

After Nikola's amazing review, I fixed the problems myself, especially tinkering with padding fixups and updating the regex that parses day of month to properly account for double-digit days (and don't stop at the first digit - thanks Claude for that one).

Fixes #216

RELEASE NOTES BEGIN

Changelog entries that have incorrect padding set for a day of a month are now fixed if possible or ignored.

RELEASE NOTES END

@gemini-code-assist
Copy link
Contributor

Summary of Changes

Hello @TomasTomecek, I'm Gemini Code Assist1! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request addresses an issue where malformed changelog entries with incorrect day-of-month padding could propagate incorrect formatting. It introduces robust validation for RPM changelog day padding and modifies the entry addition logic to ignore invalid existing entries, ensuring that newly added changelog entries adhere to correct formatting standards. This significantly improves the reliability of changelog processing within the system.

Highlights

  • New has_valid_day_padding property: Introduced a property in ChangelogEntry to validate day-of-month padding in RPM changelog headers. It ensures double-digit days (10-31) have no space padding, while single-digit days (1-9) can have any padding, preventing malformed entries.
  • Improved add_changelog_entry method: The add_changelog_entry method now filters out changelog entries with invalid day padding before determining the preferred padding style. If all existing entries are invalid, it defaults to zero-padding, ensuring new entries are always correctly formatted.
  • Comprehensive Testing: Added both unit and integration tests to cover the new validation logic. The unit test verifies has_valid_day_padding with various valid and invalid headers, and the integration test simulates a real-world scenario (issue Ignore clearly incorrect entries when guessing day of month padding when adding changelog entry #216) to confirm invalid entries are ignored.

🧠 New Feature in Public Preview: You can now enable Memory to help Gemini Code Assist learn from your team's feedback. This makes future code reviews more consistent and personalized to your project's style. Click here to enable Memory in your admin console.

Using Gemini Code Assist

The full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips.

Invoking Gemini

You can request assistance from Gemini at any point by creating a comment using either /gemini <command> or @gemini-code-assist <command>. Below is a summary of the supported commands on the current page.

Feature Command Description
Code Review /gemini review Performs a code review for the current pull request in its current state.
Pull Request Summary /gemini summary Provides a summary of the current pull request in its current state.
Comment @gemini-code-assist Responds in comments when explicitly tagged, both in pull request comments and review comments.
Help /gemini help Displays a list of available commands.

Customization

To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a .gemini/ folder in the base of the repository. Detailed instructions can be found here.

Limitations & Feedback

Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for Github and other Google products, sign up here.

You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension.

Footnotes

  1. Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution.

Copy link
Contributor

@gemini-code-assist gemini-code-assist bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request effectively addresses an issue with incorrect day padding in changelog entries by introducing a validation property has_valid_day_padding and using it to filter out malformed entries when determining padding for new entries. The changes are logical, and the addition of both unit and integration tests ensures the fix is robust. I have one suggestion to make the regular expression in the new validation logic more precise for better long-term maintainability.

Comment on lines 211 to 195
m = re.search(
rf"""
({weekdays}) # weekday
[ ]
({months}) # month
[ ]
(?P<wsp>[ ]*) # optional whitespace padding
(?P<day>(?P<zp>0)?[1-9]\d*) # day of month (1-31)
[ ] # space after day
""",
self.header,
re.VERBOSE,
)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

The regular expression for parsing the day of the month is a bit too loose. [1-9]\\d* can match numbers greater than 31 (e.g., 99). While the subsequent integer conversion handles this for the purpose of checking padding, it's better practice to use a more precise regex to be stricter with the input format and avoid potential issues with malformed changelog entries.\n\nAdditionally, this regex is very similar to those in day_of_month_padding and extended_timestamp. There's an opportunity to reduce code duplication by extracting a common regex pattern or a helper method for date parsing in a future refactoring.

Suggested change
m = re.search(
rf"""
({weekdays}) # weekday
[ ]
({months}) # month
[ ]
(?P<wsp>[ ]*) # optional whitespace padding
(?P<day>(?P<zp>0)?[1-9]\d*) # day of month (1-31)
[ ] # space after day
""",
self.header,
re.VERBOSE,
)
m = re.search(
rf"""
({weekdays}) # weekday
[ ]
({months}) # month
[ ]
(?P<wsp>[ ]*) # optional whitespace padding
(?P<day>(?P<zp>0)?[1-9]|[12]\\d|3[01]) # day of month (1-31)
[ ] # space after day
""",
self.header,
re.VERBOSE,
)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

And even better would be to make use of existing day_of_month_padding() without reimplementing everything from scratch. The biggest question here is how to get the info about nonsensical whitespace padding in a Pythonic way from day_of_month_padding() alongside its current return value, the rest is extremely trivial to implement.

@centosinfra-prod-github-app
Copy link
Contributor

@centosinfra-prod-github-app
Copy link
Contributor

@TomasTomecek TomasTomecek marked this pull request as draft January 15, 2026 09:31
@TomasTomecek
Copy link
Member Author

Will update this based on Nikola's feedback. (Funny that Gemini didn't complain about that :)

@centosinfra-prod-github-app
Copy link
Contributor


# Reject invalid padding: double-digit days with whitespace
if day_num >= 10 and wsp:
return ""
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This doesn't seem very smart. Why not rather fix nonsensical padding and return it corrected? day_of_month_padding should be renamed to sanitized_day_of_month_padding or similar in such case, to match the implementation, and nothing else has to change.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

great point, will do and will cover it by a test case

[ ]
(?P<wsp>[ ]*) # optional whitespace padding
((?P<zp>0)?\d|[12]\d|3[01]) # possibly zero-padded day of month
(?P<day>(?P<zp>0)?[1-9]\d*) # day of month (1-31)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What's wrong with a stricter regex? Are we assuming months will have more than 31 days in the future?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm honestly surprised as well, because I saw the agent prefer the stricter regex but disbanded it in the second iteration; naughty claude

(?P<wsp>[ ]*) # optional whitespace padding
((?P<zp>0)?\d|[12]\d|3[01]) # possibly zero-padded day of month
(?P<day>(?P<zp>0)?[1-9]\d*) # day of month (1-31)
[ ] # space after day
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This doesn't have to be here, does it?

I had to fix the regex to correctly account for double-digit days (only
first digit was parsed in those cases).

Signed-off-by: Tomas Tomecek <[email protected]>
Assisted-by: Claude
@TomasTomecek TomasTomecek marked this pull request as ready for review January 15, 2026 15:42
@TomasTomecek
Copy link
Member Author

/gemini review

@centosinfra-prod-github-app
Copy link
Contributor

Copy link
Contributor

@gemini-code-assist gemini-code-assist bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request effectively resolves the issue of malformed changelog entries with incorrect day padding. The changes in specfile/changelog.py enhance the robustness of the day_of_month_padding property by refining the regex for day parsing and introducing explicit error handling for invalid day formats and incorrect space padding for double-digit days. The addition of a default value to the max() function in specfile/specfile.py prevents potential ValueError exceptions when processing empty changelogs, ensuring a sensible default padding. The accompanying unit and integration tests have been thoughtfully updated and expanded to cover these new behaviors and edge cases, providing strong validation for the implemented solution.

padding = max(
(e.day_of_month_padding for e in reversed(changelog)),
key=len,
default="0",
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

critical

Adding default="0" to the max() function is a critical improvement. Without it, if changelog were empty, max() would raise a ValueError, leading to a crash. This ensures that padding is correctly initialized to "0" in such scenarios, preventing a runtime error and providing a sensible default.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What's critical about this? Shouldn't this be info then? I mean thanks for the feedback but it's not like I should change something here.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Without it, if changelog were empty

He? See the if changelog: above?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The improvement is so critical it is useless 🙂

[ ]
(?P<wsp>[ ]*) # optional whitespace padding
((?P<zp>0)?\d|[12]\d|3[01]) # possibly zero-padded day of month
((?P<zp>0)?(?P<day>[12]\d|3[01]|[1-9])) # possibly zero-padded day of month
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

high

The updated regex ((?P<zp>0)?(?P<day>[12]\d|3[01]|[1-9])) is a significant improvement. It explicitly captures the day number and correctly distinguishes between single and double-digit days, making the parsing more robust and less prone to misinterpretation compared to the previous generic \d match. This directly addresses the core parsing issue for day padding.

Comment on lines 208 to 209
if day_num >= 10 and wsp:
return zp
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

high

This conditional logic if day_num >= 10 and wsp: correctly identifies and handles malformed entries where a double-digit day is incorrectly preceded by whitespace padding. By returning zp (which would be an empty string or '0'), it ensures that such invalid padding is ignored, aligning with the problem description and the new docstring.

Comment on lines +229 to +253
def test_add_changelog_entry_ignores_invalid_padding(specfile_factory, spec_minimal):
"""Test that add_changelog_entry ignores entries with invalid day padding (issue #216)."""
spec = specfile_factory(spec_minimal)

# Simulate scipy scenario: malformed entry with space before double-digit day
with spec.sections() as sections:
sections.changelog.data = [
"* Mon Dec 11 2006 Author <email> - 0.5.1-5", # INVALID
"- Initial version",
"",
"* Tue May 04 2021 Another <email> - 0.1-0", # VALID (zero-padded)
"- First version",
]

# Add new entry with single-digit day - should use zero-padding from valid entry
flexmock(specfile.specfile).should_receive("guess_packager").and_return(
"Test User <[email protected]>"
).once()
spec.add_changelog_entry("test entry", timestamp=datetime.date(2024, 1, 5))

with spec.sections() as sections:
new_entry = sections.changelog[0]
# Should use zero-padding style "Jan 05", not space-padding "Jan 5"
assert "Jan 05" in new_entry
assert "Jan 5" not in new_entry
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

high

The new integration test test_add_changelog_entry_ignores_invalid_padding is well-designed. It accurately simulates the problematic scenario (malformed space-padded double-digit day) and verifies that the system correctly ignores invalid padding and applies a valid padding style to new entries. This provides strong confidence in the fix for issue #216.

Comment on lines +97 to +101
(
"* Mon Dec 11 2006 Author <email> - 1.0",
"",
), # Invalid: space before double-digit
("* Invalid header", ""), # Invalid: unparsable
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

high

These new test cases are excellent additions. They specifically target the new logic in day_of_month_padding for handling malformed entries (space before double-digit day) and unparsable headers, ensuring that the function behaves as expected by returning an empty string in these invalid scenarios. This significantly improves unit test coverage for the bug fix.

Comment on lines 203 to 206
try:
day_num = int(day_str)
except (ValueError, TypeError):
return ""
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

Adding a try-except block for int(day_str) is a good practice. It gracefully handles cases where day_str might not be a valid integer, preventing potential crashes and ensuring that invalid entries result in an empty padding string as intended by the docstring.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is also useless, because the regex guarantees day_str is a number.

(
"* Mon Oct 18 12:34:45 CEST 2021 Nikola Forró <[email protected]> - 0.2-1",
" ",
"", # Invalid: double-digit day with space padding, returns ""
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

Updating the expected padding from " " to "" for this test case correctly reflects the new behavior of day_of_month_padding, which now treats space padding for double-digit days as invalid and returns an empty string.

Comment on lines +645 to +646
# day_of_month_padding returns "" for invalid entries,
# so max() naturally filters them in favor of valid padding
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not the case anymore.

Comment on lines +171 to +180
"""
Padding of day of month in the entry header timestamp.
Returns empty string for invalid entries (e.g., when day is not a number
or is not between 1 and 31).
Valid formats:
- Single-digit days: any padding ("", "0", " ")
- Double-digit days: no padding ("")
"""
Copy link
Member

@nforro nforro Jan 15, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't like this docstring. If it really has to be more verbose, how about something like:

Suggested change
"""
Padding of day of month in the entry header timestamp.
Returns empty string for invalid entries (e.g., when day is not a number
or is not between 1 and 31).
Valid formats:
- Single-digit days: any padding ("", "0", " ")
- Double-digit days: no padding ("")
"""
"""
Padding of day of month in the entry header timestamp. Nonsensical padding
(i.e. multiple spaces before double-digit day of month) is corrected.
"""

return m is not None

@property
def day_of_month_padding(self) -> str:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The name hasn't been updated. I mean, it's not strictly necessary, but it's better to be explicit.

Suggested change
def day_of_month_padding(self) -> str:
def sanitized_day_of_month_padding(self) -> str:

@centosinfra-prod-github-app
Copy link
Contributor

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

Development

Successfully merging this pull request may close these issues.

Ignore clearly incorrect entries when guessing day of month padding when adding changelog entry

2 participants