-
-
Notifications
You must be signed in to change notification settings - Fork 650
Description
π bug report
Affected Rule
The issue is caused by the rule: py_console_script_binary (and likely py_binary in general) when used as a tool invoked by another process that has inherited RUNFILES_DIR from a different py_binary.
The specific file affected is _bazel_site_init.py generated by the bootstrap, specifically the _find_runfiles_root() function.
Is this a regression?
Yes, the previous version in which this bug was not present was: 1.6.3
This appears to be a variant of the issue that was fixed in 1.7.0 ("The stage1 bootstrap script now correctly handles nested RUNFILES_DIR environments, fixing issues where a py_binary calls another py_binary"), but the fix doesn't cover the case where a non-Python process sits between the two Python binaries.
Description
When using multiple py_binary tools in a Bazel genrule, where one tool (e.g., grpcio-tools) invokes a non-Python binary (e.g., protoc) that then invokes another py_binary as a plugin (e.g., protoc-gen-mypy), the second Python binary fails with ModuleNotFoundError because it inherits the wrong RUNFILES_DIR from the first Python binary.
The chain of execution is:
grpcio-tools(py_binary) β setsRUNFILES_DIRto its own runfilesgrpcio-toolsinvokesprotoc(C++ binary) β inheritsRUNFILES_DIRunchangedprotocinvokesprotoc-gen-mypy(py_console_script_binary) as a plugin β inherits wrongRUNFILES_DIRprotoc-gen-mypybootstrap seesRUNFILES_DIRpointing togrpcio-tools.runfiles, correctly determines it doesn't contain its own files- Fallback to
__file__-based path calculation fails because__file__resolves to the symlink target (outside.runfiles/) rather than the symlink path (inside.runfiles/) sys.pathgets populated with non-existent paths βModuleNotFoundError
π¬ Minimal Reproduction
# BUILD.bazel
load("@rules_python//python/entry_points:py_console_script_binary.bzl", "py_console_script_binary")
py_console_script_binary(
name = "protoc-gen-mypy",
pkg = "@pypi//:mypy-protobuf",
)
py_binary(
name = "grpcio-tools",
srcs = ["grpcio_tools.py"],
deps = ["@pypi//:grpcio-tools"],
)
genrule(
name = "generate_protos",
srcs = ["example.proto"],
outs = ["example_pb2.py", "example_pb2.pyi"],
cmd = """
$(location :grpcio-tools) \
--plugin=protoc-gen-mypy=$(location :protoc-gen-mypy) \
--mypy_out=$(GENDIR) \
$(location example.proto)
""",
tools = [
":grpcio-tools",
":protoc-gen-mypy",
],
)The key aspect is that grpcio-tools invokes protoc (C++ binary), which then invokes protoc-gen-mypy as a subprocess plugin. The non-Python protoc binary passes through the environment unchanged.
π₯ Exception or Error
ModuleNotFoundError: No module named 'mypy_protobuf'
--mypy_out: protoc-gen-mypy: Plugin failed with status code 1.
With RULES_PYTHON_BOOTSTRAP_VERBOSE=1, the root cause is visible:
bootstrap: stage 2: initial environ: RUNFILES_DIR='.../grpcio-tools.runfiles'
bazel_site_init: runfiles_root: .../bin/tool/src/protos
# ^^^ WRONG! Should be .../protoc-gen-mypy.runfiles
bazel_site_init: append sys.path: .../bin/tool/src/protos/sfc_rules_python++pip+pypi/.../[email protected]/site-packages
# ^^^ This path doesn't exist - the actual files are in protoc-gen-mypy.runfiles/...
π Your Environment
Operating System:
Linux 5.10.238 (aarch64)
Output of bazel version:
Build label: 8.2.1-sf5
Rules_python version:
1.7.0
Anything else relevant?
Root Cause: The _find_runfiles_root() function in _bazel_site_init.py has a fallback that calculates the runfiles root from __file__:
num_dirs_to_runfiles_root = _SELF_RUNFILES_RELATIVE_PATH.count("/") + 1
runfiles_root = os.path.dirname(__file__)
for _ in range(num_dirs_to_runfiles_root):
runfiles_root = os.path.dirname(runfiles_root)However, when Python imports a module through a symlink, __file__ is set to the symlink target, not the symlink path. The symlink target is:
.../bin/tool/src/protos/_protoc-gen-mypy.venv/.../site-packages/_bazel_site_init.py
But the symlink path (where the file is actually accessed from) is:
.../protoc-gen-mypy.runfiles/_main/tool/src/protos/_protoc-gen-mypy.venv/.../site-packages/_bazel_site_init.py
The symlink target doesn't have .runfiles/_main/ in its path, so the directory traversal lands in the wrong location.
Workaround: Create wrapper scripts that unset RUNFILES_DIR before invoking the plugins:
cat > wrapper/protoc-gen-mypy << 'EOF'
#!/bin/bash
unset RUNFILES_DIR
exec "$ACTUAL_BINARY" "$@"
EOFThis forces the bootstrap to use the __file__-based fallback without the incorrect RUNFILES_DIR validation failing first... but then the symlink issue means we'd still have a problem. Actually, unsetting RUNFILES_DIR works because when RUNFILES_DIR is unset, the bootstrap uses a different code path that correctly finds the runfiles via RUNFILES_MANIFEST_FILE or other means.
I want to disclaim that I've used AI to find this issue and the possible solution, as I'm not too familiar with Bazel yet. However; its solution seems to have fixed our issue locally and the explanation seems to match the other linked issue and its accompanying PR.