fix(du): use getdents64 on Linux to avoid EOVERFLOW on 32-bit architectures #11902
mattsu2020 wants to merge 1 commit into uutils:main
Merging this PR will improve performance by 40.59%
| | Mode | Benchmark | BASE | HEAD | Efficiency |
|---|---|---|---|---|---|
| ⚡ | Simulation | du_max_depth_balanced_tree[(6, 4, 10)] | 25.1 ms | 20.4 ms | +23.02% |
| ⚡ | Simulation | du_wide_tree[(5000, 500)] | 9.2 ms | 8.2 ms | +12.14% |
| ⚡ | Simulation | du_deep_tree[(100, 3)] | 1,089.6 µs | 998.3 µs | +9.15% |
| ⚡ | Simulation | du_summarize_balanced_tree[(5, 4, 10)] | 6.6 ms | 5.3 ms | +24.25% |
| ⚡ | Simulation | du_all_wide_tree[(5000, 500)] | 16.2 ms | 15 ms | +7.68% |
| ⚡ | Simulation | rm_recursive_tree | 11.9 ms | 8.5 ms | +40.59% |

Comparing mattsu2020:du_fix (4c574da) with main (30fd234)
Footnotes
- 46 benchmarks were skipped, so the baseline results were used instead.

GNU testsuite comparison:

What happened?

Would you add a comment explaining why we cannot use rustix on other Unix systems?
@oech3 The benchmark contains 46897 system calls in HEAD but 70335 system calls in BASE. Since system calls are not instrumented in CodSpeed, be careful interpreting this result.

Perf is close to the mimalloc result. Does mimalloc also increase the number of syscalls, for the same reason as this?
70334 system calls with mimalloc (#11866) so no change |
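
As a quick check on the counts quoted in this thread, the drop from 70335 system calls (BASE) to 46897 (HEAD) works out to roughly a one-third reduction. The sketch below only does that arithmetic; `percent_reduction` is a hypothetical helper name, not part of CodSpeed or uutils.

```rust
/// Percentage reduction when going from `base` to `head`.
/// Hypothetical helper for illustration only.
fn percent_reduction(base: u64, head: u64) -> f64 {
    (base as f64 - head as f64) / base as f64 * 100.0
}
```

Calling `percent_reduction(70335, 46897)` gives a value of about 33.3. This says nothing about wall-clock time, since system calls are not instrumented in the simulation.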
@oech3 Could you run benchmarks with hyperfine?
// Helper function to read directory entries.
// On Linux, use rustix::fs::RawDir which calls getdents64 directly,
// avoiding EOVERFLOW on 32-bit architectures where libc readdir() uses
// 32-bit d_ino (Issue #11848).
In general please use full URL instead of (Issue #11848). But since this PR will close that issue I think you could remove the reference.
@mattsu2020 please clean your commit history |
…ctures

On 32-bit Linux (i686), du fails with "Value too large for defined data type" (EOVERFLOW) when reading directories. The root cause is nix::dir::Dir calling libc::readdir(), which uses 32-bit d_ino on 32-bit glibc. Modern filesystems (XFS, Btrfs, ext4+inode64) can return inode numbers exceeding 32 bits.

Replace nix::dir::Dir with rustix::fs::RawDir on Linux/Android, which calls the getdents64 syscall directly with 64-bit d_ino/d_off. This fixes du and all other utilities using DirFd::read_dir() (rm, chmod, chown, install).

On non-Linux Unix (macOS, BSDs), the existing nix implementation is retained.

Closes uutils#11848

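
The narrowing failure described in the commit message can be illustrated with a minimal Rust sketch. It does not call glibc or any directory API; `Ino32` and `narrow_inode` are hypothetical names that only model what happens when a 64-bit inode number has to be stored in a 32-bit `d_ino` field.

```rust
use std::convert::TryFrom;

/// Hypothetical stand-in for a 32-bit `d_ino` field (not the real glibc struct).
type Ino32 = u32;

/// Model the narrowing step: succeed only if the 64-bit inode number
/// reported by the filesystem fits into 32 bits.
fn narrow_inode(ino64: u64) -> Result<Ino32, &'static str> {
    Ino32::try_from(ino64)
        .map_err(|_| "EOVERFLOW: Value too large for defined data type")
}
```

Under this model, `narrow_inode(42)` succeeds while `narrow_inode((1 << 32) + 5)` returns the error, which mirrors why an interface with 64-bit fields sidesteps the problem.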
Summary
- Fix `du` producing "Value too large for defined data type" (EOVERFLOW) on 32-bit Linux (i686) when reading directories
- Replace `nix::dir::Dir` (which calls `libc::readdir()` with 32-bit `d_ino`) with `rustix::fs::RawDir` (which calls the `getdents64` syscall with 64-bit `d_ino`/`d_off`) on Linux/Android
- Also fixes the other utilities using `DirFd::read_dir()`: `rm`, `chmod`, `chown`, `install`

Root Cause
On 32-bit glibc, `libc::readdir()` uses `struct dirent` with 32-bit `d_ino`. Modern filesystems (XFS, Btrfs, ext4+inode64) can return inode numbers exceeding 2^32, causing `EOVERFLOW`.

Fix
Use `rustix::fs::RawDir` on Linux, which calls the `getdents64` syscall directly, always using 64-bit directory entry fields. `rustix` with the `fs` feature is already an unconditional dependency of `uucore`, so no new dependencies are needed.

On non-Linux Unix (macOS, BSDs), the existing `nix::dir::Dir` implementation is retained.

Test
- `cargo check -p uucore` passes
- `cargo check -p uu_du` passes
- `cargo test -p uu_du`: all tests pass
- `cargo test -p uu_rm`: all tests pass
- `cargo test -p uu_chmod`: all tests pass
- `cargo test -p coreutils -- du`: all 112 du integration tests pass

Closes #11848
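
To make the "64-bit directory entry fields" point concrete, here is a minimal, std-only sketch that decodes records laid out like the kernel's `struct linux_dirent64` (8-byte `d_ino`, 8-byte `d_off`, 2-byte `d_reclen`, 1-byte `d_type`, then a NUL-terminated name). It is not the code in this PR and not how `rustix::fs::RawDir` is implemented; `Dirent64`, `parse_dirents`, and `make_record` are hypothetical names, and the buffer is built by hand rather than filled by a real `getdents64` call.

```rust
use std::convert::TryInto;

/// One decoded directory record; field names follow `struct linux_dirent64`.
#[derive(Debug, PartialEq)]
struct Dirent64 {
    d_ino: u64,
    d_off: i64,
    d_type: u8,
    name: Vec<u8>,
}

/// Offset of `d_name`: 8 (d_ino) + 8 (d_off) + 2 (d_reclen) + 1 (d_type).
const NAME_OFFSET: usize = 19;

/// Decode every record in a buffer shaped like `getdents64` output.
/// Returns `None` if a record is truncated or has an impossible length.
fn parse_dirents(buf: &[u8]) -> Option<Vec<Dirent64>> {
    let mut entries = Vec::new();
    let mut pos = 0;
    while pos < buf.len() {
        let rec = &buf[pos..];
        if rec.len() < NAME_OFFSET {
            return None;
        }
        // Fixed-size header fields, in native byte order.
        let d_ino = u64::from_ne_bytes(rec[0..8].try_into().ok()?);
        let d_off = i64::from_ne_bytes(rec[8..16].try_into().ok()?);
        let d_reclen = u16::from_ne_bytes(rec[16..18].try_into().ok()?) as usize;
        let d_type = rec[18];
        if d_reclen < NAME_OFFSET || d_reclen > rec.len() {
            return None;
        }
        // The name runs up to the first NUL inside this record.
        let name_area = &rec[NAME_OFFSET..d_reclen];
        let end = name_area.iter().position(|&b| b == 0).unwrap_or(name_area.len());
        entries.push(Dirent64 { d_ino, d_off, d_type, name: name_area[..end].to_vec() });
        pos += d_reclen;
    }
    Some(entries)
}

/// Hypothetical helper: build one record, zero-padded to a multiple of 8 bytes.
fn make_record(d_ino: u64, d_off: i64, d_type: u8, name: &[u8]) -> Vec<u8> {
    let reclen = (NAME_OFFSET + name.len() + 1 + 7) / 8 * 8;
    let mut rec = Vec::with_capacity(reclen);
    rec.extend_from_slice(&d_ino.to_ne_bytes());
    rec.extend_from_slice(&d_off.to_ne_bytes());
    rec.extend_from_slice(&(reclen as u16).to_ne_bytes());
    rec.push(d_type);
    rec.extend_from_slice(name);
    rec.resize(reclen, 0);
    rec
}
```

Feeding `parse_dirents` a record built with an inode number above 2^32 returns that number unchanged, because nothing in the record format forces it through a 32-bit field.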