Remove unreferenced arguments from S3 endpoint ruleset#3529
Remove unreferenced arguments from S3 endpoint ruleset#3529akx wants to merge 1 commit intoboto:developfrom
Conversation
None of these arguments are referenced by the ruleset, but set of arguments is used for caching endpoint resolution results. This means that the entire endpoint ruleset is evaluated (only to end up with the same result) for every S3-related request with a different Key. Refs boto#3528
|
Codecov ReportAll modified and coverable lines are covered by tests ✅
❗ Your organization needs to install the Codecov GitHub app to enable full functionality. Additional details and impacted files@@ Coverage Diff @@
## develop #3529 +/- ##
===========================================
+ Coverage 92.94% 93.13% +0.18%
===========================================
Files 66 68 +2
Lines 14956 15324 +368
===========================================
+ Hits 13901 14272 +371
+ Misses 1055 1052 -3 ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
|
Hi @akx, Thank you for the PR. Unfortunately those models are owned upstream by the service team and can be updated at any time. Even if we were to merge this today, if the service team makes any unrelated changes to their endpoints tomorrow, this file would just be overwritten and the keys would come back. While these aren't used in ruleset evaluation for botocore, my current understanding is that they are used by other AWS products and cannot be removed from the upstream ruleset. I am still working on gathering more specifics. Thank you for calling this out - it doesn't make sense for us to slow down evaluation for botocore users based on parameters we don't use. We should either come up with an alternate solution here that causes unused keys to be ignored for caching purposes or we should find a way to remove these upstream. |
|
@SamRemis Hi, thanks for looking into this! Yeah, I figured this isn't the correct way to fix this given these lines had been added by an automated bot commit anyway, but I wanted to get the ball rolling.
As mentioned above, the JavaScript SDK has a better set of cache keys ( It looks like the One somewhat simple option could be to add some tribal knowledge to the resolver in botocore for this particular use: iff the ruleset is for S3, and the original set of parameters in the ruleset is this known one, then remove the three unused args before passing to caching/evaluation? Also probably a test that would fail if the S3 ruleset's parameters change, so maintainers would know to take action to adjust the code. Another more labor-intensive option I can think of that would probably be for the best in the longer run is to also codegen ruleset evaluation into Python, instead of having some Python code evaluate the JSON at runtime. |
|
I see the endpoint rule set got updated recently, but AISI, no changes to the unused arguments. |
None of these arguments are referenced by the ruleset, but the set of arguments is used for caching endpoint resolution results via
botocore/botocore/endpoint_provider.py
Lines 707 to 708 in 7ddc232
This means that the entire endpoint ruleset is evaluated (only to end up with the same result) for every S3-related request with a different Key, IOW for every file stored in S3. Since evaluating the rule tree is pretty slow, this causes significant slowdowns in mass operations, not to mention memory bloat for the (useless) cache.
The three removed parameters are not in the equivalent JS SDK code either.
Refs #3528.