PERF: Optimize BlockManager metadata operations and dtype inference #65035
loryzeta33 wants to merge 3 commits into pandas-dev:main from
Conversation
- Optimized BlockManager._consolidate_check with early exit and set-based duplicate detection, avoiding O(N) list allocations.
- Optimized BlockManager.get_dtypes to use np.fromiter with a generator, avoiding intermediate list creation.
- Updated interleaved_dtype to accept iterables and modified callers to use generator expressions.
- Removed redundant list() call in find_common_type.

These changes significantly reduce Python-level overhead and memory pressure in core internal paths.
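The early-exit idea described for `_consolidate_check` can be sketched in plain Python. This is only an illustration under stated assumptions: `FakeBlock`, `can_consolidate`, and both function names are hypothetical stand-ins, not the pandas internals or the code in this PR.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass(frozen=True)
class FakeBlock:
    """Hypothetical stand-in for a pandas Block; not the real class."""

    dtype: str
    can_consolidate: bool = True


def is_consolidated_list(blocks: Iterable[FakeBlock]) -> bool:
    # List-then-set approach: always visits every block and builds a full list.
    dtypes = [blk.dtype for blk in blocks if blk.can_consolidate]
    return len(dtypes) == len(set(dtypes))


def is_consolidated_early_exit(blocks: Iterable[FakeBlock]) -> bool:
    # Short-circuiting approach: return as soon as a dtype repeats.
    seen = set()
    for blk in blocks:
        if not blk.can_consolidate:
            continue
        if blk.dtype in seen:
            return False
        seen.add(blk.dtype)
    return True


fragmented = [FakeBlock("float64"), FakeBlock("int64"), FakeBlock("float64")]
consolidated = [FakeBlock("float64"), FakeBlock("int64")]
```

Note that the early exit only saves work when a duplicate dtype exists. For an already consolidated manager both versions still visit every block.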
How much of this overlaps with #64574?
Thanks for the feedback, @jbrockmendel. I have removed the report and benchmark files as requested. Regarding the overlap with #64574: While that PR optimizes the consolidation process itself (grouping/merging), this PR focuses on reducing Python-level overhead in metadata accessors like
 def get_dtypes(self) -> npt.NDArray[np.object_]:
-    dtypes = np.array([blk.dtype for blk in self.blocks], dtype=object)
+    dtypes = np.fromiter(
Does this make a difference?
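One way to answer this empirically is a small `timeit` comparison of the two constructions. The sketch below is an assumption-laden stand-in: the `dtypes` list substitutes for `blk.dtype for blk in self.blocks`, the helper names are hypothetical, and no timing result is claimed here.

```python
import timeit

import numpy as np

# Hypothetical stand-in for the dtypes of 1,000 blocks.
dtypes = [np.dtype("float64"), np.dtype("int64")] * 500


def via_list() -> np.ndarray:
    # Original form: build a list, then convert it to an object array.
    return np.array([dt for dt in dtypes], dtype=object)


def via_fromiter() -> np.ndarray:
    # Proposed form: feed a generator to np.fromiter.
    # dtype=object in np.fromiter requires NumPy >= 1.23.
    return np.fromiter((dt for dt in dtypes), dtype=object, count=len(dtypes))


def compare(number: int = 200) -> dict:
    # Returns total seconds for each variant; results depend on the machine.
    return {
        "np.array(list)": timeit.timeit(via_list, number=number),
        "np.fromiter(generator)": timeit.timeit(via_fromiter, number=number),
    }
```

Calling `compare()` on the target machine and NumPy version gives the numbers needed to judge whether the change is measurable.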
 if dtype is None:
     dtype = interleaved_dtype(  # type: ignore[assignment]
-        [blk.dtype for blk in self.blocks]
+        blk.dtype for blk in self.blocks
You just moved this list conversion from here to inside the function. The only real effect is making the annotation weaker
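The reviewer's point can be illustrated with a toy helper. Everything below is hypothetical (the function names and the simplified "common dtype" rule are assumptions, not pandas code): a helper that needs `len()` must materialize a generator anyway, so accepting `Iterable` only relocates the `list()` call while loosening the signature.

```python
from __future__ import annotations

from collections.abc import Iterable


def common_dtype_strict(dtypes: list[str]) -> str | None:
    # Toy rule standing in for real dtype resolution: needs len(), so it wants a list.
    if not len(dtypes):
        return None
    return "object" if len(set(dtypes)) > 1 else dtypes[0]


def common_dtype_loose(dtypes: Iterable[str]) -> str | None:
    # A generator has no len(), so the list is built here instead of at the call site.
    materialized = list(dtypes)
    return common_dtype_strict(materialized)


block_dtypes = ["float64", "int64", "float64"]

# The caller builds the list.
strict = common_dtype_strict([dt for dt in block_dtypes])

# The caller passes a generator; the same list is built inside the helper.
loose = common_dtype_loose(dt for dt in block_dtypes)
```

Both calls allocate one list of the same length; the only difference is where the allocation happens and how precise the parameter annotation is.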
pre-commit run --all-files

What changed?
Optimized several `BlockManager` internal metadata operations that previously allocated intermediate lists, causing unnecessary overhead. This is a pure performance improvement without behavioral changes.

- `_consolidate_check`: Now uses a short-circuiting loop with a `set` rather than a list comprehension followed by a `set()` cast. This reduces an
- `get_dtypes`: Now uses `np.fromiter` with a generator rather than converting a list comprehension to an array.
- `interleaved_dtype`: Updated to accept iterables, avoiding intermediate list allocations in critical callers (like `fast_xs` and `to_numpy` conversions).
- `find_common_type`: Removed a redundant `list()` cast in the fast path.

Benchmark
Tested on a severely fragmented `BlockManager` with 1,000 columns (500 float, 500 int) to measure `is_consolidated()` worst-case performance before consolidation: