Fix #1118: Update stumpy to >=1.11.1 to fix Python 3.11+ uv resolutio… by divyansh0-ai · Pull Request #1120 · blue-yonder/tsfresh

divyansh0-ai · 2026-04-05T19:30:56Z

This PR resolves the dependency resolution issues on Python 3.11+ (Issue #1118) and fixes a few initial Pandas 3.0 regression test failures.
Changes Made:
1.setup.cfg: Bumped stumpy minimum version from >=1.7.2 to >=1.11.1 to fix uv dependency resolution failures on Python 3.11+ (resolves #1118).

2.setup.py & pyproject.toml: Added a standard pyproject.toml exposing pyscaffold and setuptools<70 build requirements, and removed a legacy pkg_resources check in setup.py that caused isolated build managers (like uv or modern pip) to crash with ModuleNotFoundError.

3.tsfresh/utilities/dataframe_functions.py: Fixed a widespread KeyError: 'id' crash in the test suite caused by Pandas 3.0 groupby().apply() no longer including grouping keys in its evaluated chunks. Added fallback logic to dynamically restore missing grouping keys from the DataFrame's name tuple.

4.tests/units/utilities/test_dataframe_functions.py: Updated deprecated Pandas dataframe frequency string aliases (changed freq="H" to freq="h") to fix immediate Pandas 3.0 deprecation failures.

…v resolution and resolve Pandas 3.0 test failures

Copilot

Pull request overview

This PR addresses Python 3.11+ dependency resolution/build isolation issues (Issue #1118) and updates code/tests for Pandas 3.0 behavior changes impacting grouping/apply semantics and deprecated frequency aliases.

Changes:

Bump stumpy minimum version to avoid older numba/llvmlite constraints on newer Python versions.
Add pyproject.toml build-system requirements and simplify setup.py to support isolated build frontends (e.g., uv / modern pip).
Fix Pandas 3.0 groupby().apply() behavior changes by restoring missing grouping key columns, and update tests to use non-deprecated pandas frequency aliases.

Reviewed changes

Copilot reviewed 5 out of 5 changed files in this pull request and generated 4 comments.

Show a summary per file

File	Description
`setup.cfg`	Raises `stumpy` minimum version to improve dependency resolution on Python 3.11+.
`pyproject.toml`	Declares PEP 517 build-system requirements for isolated builds.
`setup.py`	Removes legacy `pkg_resources`/setuptools version check that can break isolated build managers.
`tsfresh/utilities/dataframe_functions.py`	Restores missing grouping columns in groupby/apply chunks to avoid Pandas 3.0 KeyErrors.
`tests/units/utilities/test_dataframe_functions.py`	Updates deprecated pandas datetime frequency aliases (`"H"` → `"h"`).

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

tsfresh/utilities/dataframe_functions.py

Copilot · 2026-04-05T19:34:21Z

tsfresh/utilities/dataframe_functions.py

+        if grouper is not None:
+            for col in grouper:
+                if col not in df_temp.columns:
+                    if len(grouper) > 1:
+                        df_temp[col] = x.name[grouper.index(col)]
+                    else:
+                        df_temp[col] = x.name


The grouping-key restoration logic inside _roll_out_time_series._f duplicates similar logic added later in add_sub_time_series_index and also does an O(n^2) lookup via grouper.index(col) inside a loop. Consider factoring this into a small helper (or iterating with for i, col in enumerate(grouper): ...) so the mapping from grouper position to name tuple index is computed once and shared.

The O(n2) is not so important here, but I think the code is also cleaner. Having a dedicated helper function also makes sense 👍

Copilot · 2026-04-05T19:34:21Z

pyproject.toml

@@ -0,0 +1,3 @@
+[build-system]


pyproject.toml pins setuptools<70 as a build requirement. A hard upper bound can unnecessarily block builds on environments that have already moved to newer setuptools. If this cap is required due to a known incompatibility (e.g., PyScaffold <4), please add a brief rationale/comment or link; otherwise consider relaxing/removing the upper bound.

Suggested change

[build-system]

[build-system]

# Keep setuptools below 70 while using PyScaffold < 4, which has known

# compatibility constraints with newer setuptools releases.

I think mentioning that this is a requirement from pyscaffold is indeed a good idea

Copilot · 2026-04-05T19:34:22Z

tsfresh/utilities/dataframe_functions.py

+        for col in grouper:
+            if col not in df_chunk.columns:
+                if len(grouper) > 1:
+                    df_chunk[col] = df_chunk.name[grouper.index(col)]
+                else:
+                    df_chunk[col] = df_chunk.name


This grouping-key restoration block duplicates the one added in _roll_out_time_series and uses repeated grouper.index(col) lookups inside a loop. Consider extracting to a helper / enumerating grouper once to avoid duplication and reduce repeated list scans.

Co-authored-by: Copilot <[email protected]>

…ndex lookup - Add module-level _restore_grouping_keys(df_chunk, grouper) helper that uses enumerate(grouper) to map each column to its name-tuple index in a single pass, eliminating repeated grouper.index(col) list scans. - Replace the duplicated inline restoration blocks in both _roll_out_time_series._f and add_sub_time_series_index._add_id_column with a single call to the new helper. - Add rationale comment to pyproject.toml explaining the setuptools<70 upper bound (PyScaffold <4 incompatibility). Addresses Copilot review comments on PR blue-yonder#1120.

nils-braun

Thank you very much @divyansh0-ai for your time. Copilots suggestions were already quite reasonable. Can you implement those as well?

nils-braun · 2026-04-08T19:40:00Z

tsfresh/utilities/dataframe_functions.py

+        if grouper is not None:
+            for col in grouper:
+                if col not in df_temp.columns:
+                    if len(grouper) > 1:
+                        df_temp[col] = x.name[grouper.index(col)]
+                    else:
+                        df_temp[col] = x.name


The O(n2) is not so important here, but I think the code is also cleaner. Having a dedicated helper function also makes sense 👍

nils-braun · 2026-04-08T19:41:50Z

pyproject.toml

@@ -0,0 +1,3 @@
+[build-system]


I think mentioning that this is a requirement from pyscaffold is indeed a good idea

1. Pass x.name (the group's name) rather than df_temp.name (the slice which has no .name) to _restore_grouping_keys in _roll_out_time_series. This fixes AttributeError on all roll_time_series tests. 2. Guard the _restore_grouping_keys call in _add_id_column with if grouper: to avoid calling df_chunk.name on a plain DataFrame (the no-grouper / else branch). Fixes test_no_parameters. 3. Move df['sort'] = range(df.shape[0]) to before df.groupby() so the synthetic sort column is visible inside every group chunk. Fixes KeyError: 'sort' in test_dict_rolling / test_dict_rolling_maxshift_1. 4. Update test_no_finite_values_yields_0 to use assertIn instead of assertEqual for the warning message, making it robust to the Pandas 3.0 Arrow backend ArrowStringArray repr of column name arrays. All 35 tests in test_dataframe_functions now pass on Python 3.11.

- feature_extraction/data.py: Fix KeyError in Dask groupby apply() by capturing x.name BEFORE sort_values() (which drops .name on the returned DataFrame), recovering id/kind from the group tuple for both with-kind and without-kind (melt+groupby) paths. - feature_selection/relevance.py: Fix infer_ml_task() to recognize pd.StringDtype (Pandas 3.0 Arrow-backed strings) as classification, not regression. - scripts/run_tsfresh.py: Replace deprecated delim_whitespace=True with sep=r'\s+' (removed in Pandas 3.0). - examples/har_dataset.py: Same delim_whitespace -> sep=r'\s+' fix. - tests/units/feature_selection/test_relevance.py: - Use .iloc[0] instead of [0] for Series with named index (Pandas 3.0 no longer allows integer fallback on non-integer indexes). - Fix ChainedAssignmentError in test fixtures with .loc[cond, col] - tests/units/transformers/test_feature_selector.py: - Fix ChainedAssignmentError with .loc[cond, col] - tests/units/transformers/test_per_column_imputer.py: - Use assertIn() instead of assertEqual() for warning messages to handle ArrowStringArray repr differences in Pandas 3.0. - tests/integrations/examples/test_robot_execution_failures.py: - Allow pd.StringDtype in addition to object dtype for string Series.

divyansh0-ai · 2026-04-09T17:10:27Z

@nils-braun Thanks for the review and the feedback! I've gone ahead and implemented the suggestions, along with the remaining Pandas 3.0 compatibility fixes. The PR is ready for another look. Could you please approve the workflow runs when you have a moment.

nils-braun

Well done! Thanks for fixing all this!
I think there are just a few linting issues left.

Update: apart from the missing pd import there are also other two test failures. Do you want to look into them as well @divyansh0-ai or shall I take a look? One looks like a grouping key issue.

nils-braun · 2026-04-10T19:21:43Z

tests/integrations/examples/test_robot_execution_failures.py


        assert len(y.unique()) > 2
-        assert y.dtype == object
+        assert y.dtype == object or isinstance(y.dtype, pd.StringDtype)


you probably want to import pd here :)

@nils-braun Thanks for the approval! I will gladly take care of these last few items. I’ll fix the missing pd import, clear up the linting, and dig into those two 3.11/3.12 test failures today. I'll push an update shortly!

…r#1120 - tests/integrations/examples/test_robot_execution_failures.py: Add missing import pandas as pd needed for pd.StringDtype usage on line 57 (flagged by nils-braun in review). - tsfresh/feature_selection/relevance.py: Reformat line in infer_ml_task to comply with black 22.12.0 line length limit (pre-commit failure). Wrap lambda ternary in calculate_relevance_table in extra parens to satisfy black's updated style rules. - tsfresh/utilities/dataframe_functions.py: In �dd_sub_time_series_index, replace set_index(get_level_values(-1)) with reset_index(drop=True) and pass include_groups=False (with a Pandas<2.2 fallback) to groupby().apply(). This prevents a grouping- key KeyError (KeyError: 'id') on Python 3.11+ with Pandas 3.0, where groupby columns are no longer present in the applied chunk and group_keys defaults to False, eliminating the MultiIndex that get_level_values(-1) previously relied on.

…r#1120 - tests/integrations/examples/test_robot_execution_failures.py: Add missing import pandas as pd needed for pd.StringDtype usage on line 57 (flagged by nils-braun in review). - tsfresh/feature_selection/relevance.py: Reformat line in infer_ml_task to comply with black 22.12.0 line length limit (pre-commit failure). Wrap lambda ternary in calculate_relevance_table in extra parens to satisfy black's updated style rules. - tsfresh/utilities/dataframe_functions.py: In add_sub_time_series_index, replace set_index(get_level_values(-1)) with reset_index(drop=True) and pass include_groups=False (with a Pandas<2.2 fallback) to groupby().apply(). This prevents a grouping- key KeyError on Python 3.11+ with Pandas 3.0, where groupby columns are no longer present in the applied chunk and group_keys defaults to False, eliminating the MultiIndex that get_level_values(-1) relied on.

In add_sub_time_series_index, groupby().apply() now passes include_groups=False (Pandas 2.2+) so grouping columns are explicitly excluded from each chunk, matching Pandas 3.0 default behaviour. _restore_grouping_keys() re-adds them via df_chunk.name. Also replaced set_index(get_level_values(-1)) with reset_index(drop=True) since group_keys now defaults to False in Pandas 3.0, meaning the groupby result no longer has a MultiIndex. Fixes KeyError on Python 3.11+ / Pandas 3.0 in Test (3.11) and Test (3.12).

Fix blue-yonder#1118: Update stumpy to >=1.11.1 to fix Python 3.11+ u…

d18254e

…v resolution and resolve Pandas 3.0 test failures

Copilot AI review requested due to automatic review settings April 5, 2026 19:30

Copilot started reviewing on behalf of divyansh0-ai April 5, 2026 19:31 View session

Copilot AI reviewed Apr 5, 2026

View reviewed changes

nils-braun and others added 2 commits April 8, 2026 21:36

Apply suggestion from @Copilot

6318312

Co-authored-by: Copilot <[email protected]>

nils-braun reviewed Apr 8, 2026

View reviewed changes

divyansh0-ai added 3 commits April 9, 2026 01:42

style: apply pre-commit formatting (black/isort/end-of-file)

64e2423

nils-braun approved these changes Apr 10, 2026

View reviewed changes

nils-braun reviewed Apr 10, 2026

View reviewed changes

divyansh0-ai force-pushed the fix-python311-uv-install-1118 branch 2 times, most recently from 96ebf50 to 421fe2b Compare April 10, 2026 19:50

divyansh0-ai added 2 commits April 11, 2026 01:28

fix: add missing pd import and fix black formatting

2c632db

Conversation

divyansh0-ai commented Apr 5, 2026

Uh oh!

Copilot AI left a comment

Choose a reason for hiding this comment

Pull request overview

Reviewed changes

Uh oh!

Uh oh!

Copilot AI Apr 5, 2026

Choose a reason for hiding this comment

Uh oh!

nils-braun Apr 8, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI Apr 5, 2026

Choose a reason for hiding this comment

Uh oh!

nils-braun Apr 8, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI Apr 5, 2026

Choose a reason for hiding this comment

Uh oh!

nils-braun left a comment

Choose a reason for hiding this comment

Uh oh!

nils-braun Apr 8, 2026

Choose a reason for hiding this comment

Uh oh!

nils-braun Apr 8, 2026

Choose a reason for hiding this comment

Uh oh!

divyansh0-ai commented Apr 9, 2026

Uh oh!

nils-braun left a comment • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

nils-braun Apr 10, 2026

Choose a reason for hiding this comment

Uh oh!

divyansh0-ai Apr 10, 2026

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

nils-braun left a comment •

edited

Loading