-
Notifications
You must be signed in to change notification settings - Fork 20
changelog: ignore entries with wrong day padding #508
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
Summary of ChangesHello @TomasTomecek, I'm Gemini Code Assist1! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed! This pull request addresses an issue where malformed changelog entries with incorrect day-of-month padding could propagate incorrect formatting. It introduces robust validation for RPM changelog day padding and modifies the entry addition logic to ignore invalid existing entries, ensuring that newly added changelog entries adhere to correct formatting standards. This significantly improves the reliability of changelog processing within the system. Highlights
🧠 New Feature in Public Preview: You can now enable Memory to help Gemini Code Assist learn from your team's feedback. This makes future code reviews more consistent and personalized to your project's style. Click here to enable Memory in your admin console. Using Gemini Code AssistThe full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips. Invoking Gemini You can request assistance from Gemini at any point by creating a comment using either
Customization To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a Limitations & Feedback Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for Github and other Google products, sign up here. You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension. Footnotes
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Code Review
This pull request effectively addresses an issue with incorrect day padding in changelog entries by introducing a validation property has_valid_day_padding and using it to filter out malformed entries when determining padding for new entries. The changes are logical, and the addition of both unit and integration tests ensures the fix is robust. I have one suggestion to make the regular expression in the new validation logic more precise for better long-term maintainability.
| m = re.search( | ||
| rf""" | ||
| ({weekdays}) # weekday | ||
| [ ] | ||
| ({months}) # month | ||
| [ ] | ||
| (?P<wsp>[ ]*) # optional whitespace padding | ||
| (?P<day>(?P<zp>0)?[1-9]\d*) # day of month (1-31) | ||
| [ ] # space after day | ||
| """, | ||
| self.header, | ||
| re.VERBOSE, | ||
| ) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The regular expression for parsing the day of the month is a bit too loose. [1-9]\\d* can match numbers greater than 31 (e.g., 99). While the subsequent integer conversion handles this for the purpose of checking padding, it's better practice to use a more precise regex to be stricter with the input format and avoid potential issues with malformed changelog entries.\n\nAdditionally, this regex is very similar to those in day_of_month_padding and extended_timestamp. There's an opportunity to reduce code duplication by extracting a common regex pattern or a helper method for date parsing in a future refactoring.
| m = re.search( | |
| rf""" | |
| ({weekdays}) # weekday | |
| [ ] | |
| ({months}) # month | |
| [ ] | |
| (?P<wsp>[ ]*) # optional whitespace padding | |
| (?P<day>(?P<zp>0)?[1-9]\d*) # day of month (1-31) | |
| [ ] # space after day | |
| """, | |
| self.header, | |
| re.VERBOSE, | |
| ) | |
| m = re.search( | |
| rf""" | |
| ({weekdays}) # weekday | |
| [ ] | |
| ({months}) # month | |
| [ ] | |
| (?P<wsp>[ ]*) # optional whitespace padding | |
| (?P<day>(?P<zp>0)?[1-9]|[12]\\d|3[01]) # day of month (1-31) | |
| [ ] # space after day | |
| """, | |
| self.header, | |
| re.VERBOSE, | |
| ) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
And even better would be to make use of existing day_of_month_padding() without reimplementing everything from scratch. The biggest question here is how to get the info about nonsensical whitespace padding in a Pythonic way from day_of_month_padding() alongside its current return value, the rest is extremely trivial to implement.
|
❌ pre-commit FAILURE in 2m 00s |
|
Build succeeded. ✔️ pre-commit SUCCESS in 1m 58s |
|
Will update this based on Nikola's feedback. (Funny that Gemini didn't complain about that :) |
|
❌ pre-commit FAILURE in 1m 57s |
specfile/changelog.py
Outdated
|
|
||
| # Reject invalid padding: double-digit days with whitespace | ||
| if day_num >= 10 and wsp: | ||
| return "" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This doesn't seem very smart. Why not rather fix nonsensical padding and return it corrected? day_of_month_padding should be renamed to sanitized_day_of_month_padding or similar in such case, to match the implementation, and nothing else has to change.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
great point, will do and will cover it by a test case
specfile/changelog.py
Outdated
| [ ] | ||
| (?P<wsp>[ ]*) # optional whitespace padding | ||
| ((?P<zp>0)?\d|[12]\d|3[01]) # possibly zero-padded day of month | ||
| (?P<day>(?P<zp>0)?[1-9]\d*) # day of month (1-31) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What's wrong with a stricter regex? Are we assuming months will have more than 31 days in the future?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm honestly surprised as well, because I saw the agent prefer the stricter regex but disbanded it in the second iteration; naughty claude
specfile/changelog.py
Outdated
| (?P<wsp>[ ]*) # optional whitespace padding | ||
| ((?P<zp>0)?\d|[12]\d|3[01]) # possibly zero-padded day of month | ||
| (?P<day>(?P<zp>0)?[1-9]\d*) # day of month (1-31) | ||
| [ ] # space after day |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This doesn't have to be here, does it?
I had to fix the regex to correctly account for double-digit days (only first digit was parsed in those cases). Signed-off-by: Tomas Tomecek <[email protected]> Assisted-by: Claude
|
/gemini review |
|
Build succeeded. ✔️ pre-commit SUCCESS in 1m 57s |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Code Review
This pull request effectively resolves the issue of malformed changelog entries with incorrect day padding. The changes in specfile/changelog.py enhance the robustness of the day_of_month_padding property by refining the regex for day parsing and introducing explicit error handling for invalid day formats and incorrect space padding for double-digit days. The addition of a default value to the max() function in specfile/specfile.py prevents potential ValueError exceptions when processing empty changelogs, ensuring a sensible default padding. The accompanying unit and integration tests have been thoughtfully updated and expanded to cover these new behaviors and edge cases, providing strong validation for the implemented solution.
| padding = max( | ||
| (e.day_of_month_padding for e in reversed(changelog)), | ||
| key=len, | ||
| default="0", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What's critical about this? Shouldn't this be info then? I mean thanks for the feedback but it's not like I should change something here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Without it, if
changelogwere empty
He? See the if changelog: above?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The improvement is so critical it is useless 🙂
| [ ] | ||
| (?P<wsp>[ ]*) # optional whitespace padding | ||
| ((?P<zp>0)?\d|[12]\d|3[01]) # possibly zero-padded day of month | ||
| ((?P<zp>0)?(?P<day>[12]\d|3[01]|[1-9])) # possibly zero-padded day of month |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The updated regex ((?P<zp>0)?(?P<day>[12]\d|3[01]|[1-9])) is a significant improvement. It explicitly captures the day number and correctly distinguishes between single and double-digit days, making the parsing more robust and less prone to misinterpretation compared to the previous generic \d match. This directly addresses the core parsing issue for day padding.
specfile/changelog.py
Outdated
| if day_num >= 10 and wsp: | ||
| return zp |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This conditional logic if day_num >= 10 and wsp: correctly identifies and handles malformed entries where a double-digit day is incorrectly preceded by whitespace padding. By returning zp (which would be an empty string or '0'), it ensures that such invalid padding is ignored, aligning with the problem description and the new docstring.
| def test_add_changelog_entry_ignores_invalid_padding(specfile_factory, spec_minimal): | ||
| """Test that add_changelog_entry ignores entries with invalid day padding (issue #216).""" | ||
| spec = specfile_factory(spec_minimal) | ||
|
|
||
| # Simulate scipy scenario: malformed entry with space before double-digit day | ||
| with spec.sections() as sections: | ||
| sections.changelog.data = [ | ||
| "* Mon Dec 11 2006 Author <email> - 0.5.1-5", # INVALID | ||
| "- Initial version", | ||
| "", | ||
| "* Tue May 04 2021 Another <email> - 0.1-0", # VALID (zero-padded) | ||
| "- First version", | ||
| ] | ||
|
|
||
| # Add new entry with single-digit day - should use zero-padding from valid entry | ||
| flexmock(specfile.specfile).should_receive("guess_packager").and_return( | ||
| "Test User <[email protected]>" | ||
| ).once() | ||
| spec.add_changelog_entry("test entry", timestamp=datetime.date(2024, 1, 5)) | ||
|
|
||
| with spec.sections() as sections: | ||
| new_entry = sections.changelog[0] | ||
| # Should use zero-padding style "Jan 05", not space-padding "Jan 5" | ||
| assert "Jan 05" in new_entry | ||
| assert "Jan 5" not in new_entry |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The new integration test test_add_changelog_entry_ignores_invalid_padding is well-designed. It accurately simulates the problematic scenario (malformed space-padded double-digit day) and verifies that the system correctly ignores invalid padding and applies a valid padding style to new entries. This provides strong confidence in the fix for issue #216.
| ( | ||
| "* Mon Dec 11 2006 Author <email> - 1.0", | ||
| "", | ||
| ), # Invalid: space before double-digit | ||
| ("* Invalid header", ""), # Invalid: unparsable |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These new test cases are excellent additions. They specifically target the new logic in day_of_month_padding for handling malformed entries (space before double-digit day) and unparsable headers, ensuring that the function behaves as expected by returning an empty string in these invalid scenarios. This significantly improves unit test coverage for the bug fix.
specfile/changelog.py
Outdated
| try: | ||
| day_num = int(day_str) | ||
| except (ValueError, TypeError): | ||
| return "" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It is also useless, because the regex guarantees day_str is a number.
| ( | ||
| "* Mon Oct 18 12:34:45 CEST 2021 Nikola Forró <[email protected]> - 0.2-1", | ||
| " ", | ||
| "", # Invalid: double-digit day with space padding, returns "" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| # day_of_month_padding returns "" for invalid entries, | ||
| # so max() naturally filters them in favor of valid padding |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not the case anymore.
| """ | ||
| Padding of day of month in the entry header timestamp. | ||
| Returns empty string for invalid entries (e.g., when day is not a number | ||
| or is not between 1 and 31). | ||
| Valid formats: | ||
| - Single-digit days: any padding ("", "0", " ") | ||
| - Double-digit days: no padding ("") | ||
| """ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't like this docstring. If it really has to be more verbose, how about something like:
| """ | |
| Padding of day of month in the entry header timestamp. | |
| Returns empty string for invalid entries (e.g., when day is not a number | |
| or is not between 1 and 31). | |
| Valid formats: | |
| - Single-digit days: any padding ("", "0", " ") | |
| - Double-digit days: no padding ("") | |
| """ | |
| """ | |
| Padding of day of month in the entry header timestamp. Nonsensical padding | |
| (i.e. multiple spaces before double-digit day of month) is corrected. | |
| """ |
| return m is not None | ||
|
|
||
| @property | ||
| def day_of_month_padding(self) -> str: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The name hasn't been updated. I mean, it's not strictly necessary, but it's better to be explicit.
| def day_of_month_padding(self) -> str: | |
| def sanitized_day_of_month_padding(self) -> str: |
Co-authored-by: Nikola Forró <[email protected]>
for more information, see https://pre-commit.ci
|
✔️ pre-commit SUCCESS in 1m 58s |
This first version of this PR was completely implemented by Claude Code and Ambient Code.
After Nikola's amazing review, I fixed the problems myself, especially tinkering with padding fixups and updating the regex that parses day of month to properly account for double-digit days (and don't stop at the first digit - thanks Claude for that one).
Fixes #216
RELEASE NOTES BEGIN
Changelog entries that have incorrect padding set for a day of a month are now fixed if possible or ignored.
RELEASE NOTES END