Define polyfilled T_* constants from Tokenizer as int#1292
Define polyfilled T_* constants from Tokenizer as int#1292fredden wants to merge 20 commits intoPHPCSStandards:4.xfrom
Conversation
|
@jrfnl the 'validate' step for "Find use of Tokens properties" check is failing on this code. Would you like me to rewrite the check or the code to avoid this? |
What gets flagged is this: I think updating the check to exclude this particular use would be most appropriate, though I'm not sure how straight-forward that would be. |
9e01bd1 to
f9e71ea
Compare
|
I have worked out a way to use a static method after all. This means we don't need to change the CI/CD check for external use of deprecated properties. |
jrfnl
left a comment
There was a problem hiding this comment.
Hi @fredden, thanks for that update and sorry for my slow response.
Creative solution and I appreciate the inline comments about the timing aspect of the static method and the fragility of the setup. That should hopefully prevent breakage when this method is touched in the future.
The fragility is a little concerning, but if it works for normal PHPCS CLI use and the PHPCS native test setup (which it does), I'm okay with it (though I wonder which user of an external dependency will complain first about something breaking because of this change...)
Conceptually, I have two questions:
-
The assignment to
$polyfillMappingTable[constant($tokenName)]does not contain any protection against overwriting an existing array index.
While token constants should be unique, if another external tool does this incorrectly and that code would run before the PHPCS polyfill code, it will cause us problems: https://3v4l.org/OCE9X#veol (see the output for PHP 7.2 - 8.0)
We may need to throw an Exception and just block the run if this happens. What do you think ? -
Should the method include protection against being called twice ? (as it is a
public staticmethod defining crucial information) ?
I've not been able to come up with a scenario in which this becomes problematic, but it still feels risky.
Also see https://3v4l.org/bdlTc - when called the second time, each polyfilled token would overwrite its previously created own entry in theTokens::$polyfillMappingTable.
Nitpicky things:
polyfillTokenizerConstants()- there is the PHPTokenizerand the PHPCSTokenizer, purely based on the method name, this could confuse people.
polyfillPHPNativeTokenizerConstants()may be a bit wordy, but might be clearer ?
You might also want to update the method doc block ? (add "PHP native" to the summary, remove "PHP native" from the@internalnote)
| /** | ||
| * Mapping table for polyfilled constants | ||
| * | ||
| * @var array<int, string> |
There was a problem hiding this comment.
| * @var array<int, string> | |
| * @var array<int, string> |
No need to change anything, but just pointing out that we can't actually be sure/guarantee that the key will always be an int as - just like PHPCS did - other tooling could have polyfilled the tokens with some other type of value.
It is because an external project decided to "validate" the value of these polyfilled constants (and complain about our historic choice of values) that we are changing the values we assign.
Collisions are a problem, yes. I have spent some time today and I think I've solved the concerns raised. I made our collision detection more robust and added a check for any collisions before we ran our definitions. These two steps should mean that any other libraries which define collisions will be detected and reported before our code starts getting confused.
No. Calling the method more than once should not cause any problems. I don't see value in adding protection against this.
This name was specifically chosen to allow for us to move the PHP_CodeSniffer tokeniser constants into this method in future. |
4c0104b to
74e2bd4
Compare
jrfnl
left a comment
There was a problem hiding this comment.
We discussed this PR in a call today and what's missing - but also very difficult to add - is tests.
We've looked into adding phpt based tests for this and @fredden will continue working on that.
The principle of this PR is validated and agreed upon though, so this will go into the PHPCS 4.1.0 release.
The value of these constants are not stable, and therefore already cannot be relied upon. This is because the specific values that PHP assigns can change with different versions of PHP. PHPCS does not use the values of these constants (other than to look up their name using the Tokens::tokenName() method). There are other tools which also polyfill these constants. Some of those tools also perform validation on the value for these constants. In order to play nicely with the arbitrary validation that other tools perform on these constants, we are switching from string values to integer values. All PHPCS 'native' tokens currently have reliable values. In line with PHP T_* constants, the values of these tokens should never be relied upon. In a future version of PHPCS, the values for these tokens will switch from strings to integers. Existing tests already cover the use of these constants and do not require adjustment for the code being changed here.
74e2bd4 to
7b972cb
Compare
jrfnl
left a comment
There was a problem hiding this comment.
Awesome work on the tests @fredden !
I've just left some nitpicks comments inline to make the test code largely comply with the coding standards. We're not automatically checking against the standards for phpt files currently, nor would that work out-of-the-box.
Might be worth opening an issue to consider scanning phpt files as a feature enhancement ?
In my opinion, it would be a low priority enhancement as the phpt file format isn't widely used, but, if nothing else, it would help this project to keep the phpt files in line with the rest of the code.
tests/EndToEndPhpt/Util/Tokens/polyfillTokenizerConstants-collision-php.phpt
Outdated
Show resolved
Hide resolved
tests/EndToEndPhpt/Util/Tokens/polyfillTokenizerConstants-collision-php.phpt
Outdated
Show resolved
Hide resolved
tests/EndToEndPhpt/Util/Tokens/polyfillTokenizerConstants-collision-user.phpt
Outdated
Show resolved
Hide resolved
tests/EndToEndPhpt/Util/Tokens/polyfillTokenizerConstants-collision-user.phpt
Outdated
Show resolved
Hide resolved
tests/EndToEndPhpt/Util/Tokens/polyfillTokenizerConstants-number.phpt
Outdated
Show resolved
Hide resolved
tests/EndToEndPhpt/Util/Tokens/polyfillTokenizerConstants-number.phpt
Outdated
Show resolved
Hide resolved
tests/EndToEndPhpt/Util/Tokens/polyfillTokenizerConstants-skip-names.phpt
Outdated
Show resolved
Hide resolved
tests/EndToEndPhpt/Util/Tokens/polyfillTokenizerConstants-skip-names.phpt
Outdated
Show resolved
Hide resolved
tests/EndToEndPhpt/Util/Tokens/polyfillTokenizerConstants-skip-names.phpt
Outdated
Show resolved
Hide resolved
jrfnl
left a comment
There was a problem hiding this comment.
One last question - the get_defined_constants(true) function call is repeated numerous times, including within loops, while the result won't change between calls.
Would it make sense to call the function once, store to a local variable and reference that throughout the function ?
|
Oh... where's my brain... I keep thinking of more things... - might be good to add a section about these EndToEnd tests in the |
tests/EndToEndPhpt/Util/Tokens/polyfillTokenizerConstants-collision-user.phpt
Outdated
Show resolved
Hide resolved
tests/EndToEndPhpt/Util/Tokens/polyfillTokenizerConstants-collision-php.phpt
Outdated
Show resolved
Hide resolved
…ision-php.phpt Co-authored-by: Juliette <[email protected]>
…ision-php.phpt Co-authored-by: Juliette <[email protected]>
Co-authored-by: Juliette <[email protected]>
…-names.phpt Co-authored-by: Juliette <[email protected]>
…-names.phpt Co-authored-by: Juliette <[email protected]>
…-names.phpt Co-authored-by: Juliette <[email protected]>
…ision-user.phpt Co-authored-by: Juliette <[email protected]>
…ision-user.phpt Co-authored-by: Juliette <[email protected]>
…er.phpt Co-authored-by: Juliette <[email protected]>
…er.phpt Co-authored-by: Juliette <[email protected]>
…ision-php.phpt Co-authored-by: Juliette <[email protected]>
…ision-user.phpt Co-authored-by: Juliette <[email protected]>
There was a problem hiding this comment.
Status update: we discussed this PR in a call today.
The tests fails when running with Xdebug in coverage mode.
This is most likely due to PHPUnit preloading files when running in coverage mode.
Then again, it may also have something to do with the tests/bootstrap.php file being preloaded, though if that's the case, the tests without Xdebug should also fail.
Tests in phpt mode already always run in a separate process, so there is nothing we can toggle on that end.
So, considering all this, we are leaning towards moving the phpt test suite config to its own configuration file so it will be more straight-forward to run all the "unit/integration" tests in the normal test workflow, while running the phpt end to end tests separately (always).
In part, the reason for this is that PHPUnit does not have a --exclude-testsuite CLI option, which means that if everything would stay in one PHPUnit config file, we would need to run every test suite separately, making things more complicated when merging code coverage reports.
Action plan:
- Remove the
testsuiteconfig from the pre-existing PHPUnit config files and move it to aendtoend.xml.distfile, preferably with only<phpunit>top-level attributes which are PHPUnit cross-version compatible, so we can avoid having to have separate config files for PHPUnit 8-9 vs 10 and higher. - Add a separate PHPUnit test step to the
end-to-end-tests.ymlworkflow which runs just the PHPUnit end to end tests.
We may need to check in the "normal" PHPUnittest.ymlworkflow if any other steps/config needs to be copied into theend-to-end-tests.ymlfor this to work properly. - Mention the new tests and how to run them in the
CONTRIBUTINGguide. This doesn't need to be a full write-up. It can be expanded later on, but it should have an initial mention in the PR which introduces these new tests.
Description
The value of these constants are not stable, and therefore already cannot be relied upon. This is because the specific values that PHP assigns can change with different versions of PHP. PHPCS does not use the values of these constants (other than to look up their name using the Tokens::tokenName() method).
There are other tools which also poly-fill these constants. Some of those tools also perform validation on the value for these constants. In order to play nicely with the arbitrary validation that other tools perform on these constants, we are switching from string values to integer values.
The PHP manual suggests "using big numbers like 10000" for poly-filled T_* constants. We have arbitrarily chosen to start our numbering scheme from 135_000.
All PHPCS 'native' tokens currently have reliable values. In line with PHP T_* constants, the values of these tokens should never be relied upon. In a future version of PHPCS, the values for these tokens will switch from strings to integers.
Existing tests already cover the use of these constants and do not require adjustment for the code being changed here.
Suggested changelog entry
Related issues/external references
Fixes #1286
Types of changes
PR checklist