Implement venv/site-packages based binaries #2156

@groodt

Context

This is a tracking issue to recognise that the lack of a site-packages layout causes friction when making use of third-party distribution packages (wheels and sdists) from indexes such as PyPI.

Outside Bazel and rules_python, distribution packages commonly assume that they will be installed into a single site-packages folder, either in a "virtual environment" or directly into a Python user or global site installation.

Notable examples are the libraries in the AI / ML ecosystem that use the NVIDIA CUDA shared libraries. These shared libraries contain relative rpath entries in the ELF/Mach-O/DLL, and those entries fail to resolve when the libraries are not installed as siblings in a site-packages layout.
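To illustrate why a relative rpath breaks, here is a minimal sketch of how an `$ORIGIN`-relative entry expands. The rpath value, the library file name, and the repository names such as `_pypi_torch` are hypothetical and chosen only for illustration; the real dynamic loader resolves paths against the filesystem, whereas this sketch normalizes them lexically.

```python
import posixpath


def resolve_rpath(library_path: str, rpath: str) -> str:
    """Expand $ORIGIN to the directory holding the library, then normalize lexically."""
    origin = posixpath.dirname(library_path)
    return posixpath.normpath(rpath.replace("$ORIGIN", origin))


# Hypothetical rpath entry pointing at a sibling distribution's lib directory.
RPATH = "$ORIGIN/../../nvidia/cudnn/lib"

# Single site-packages: the sibling directory is where the rpath expects it.
single = resolve_rpath("venv/site-packages/torch/lib/libexample.so", RPATH)

# One repository per distribution: the rpath stays inside the torch repository,
# while the dependency would live in a different (hypothetical) repository.
split = resolve_rpath(
    "runfiles/_pypi_torch/site-packages/torch/lib/libexample.so", RPATH
)

print(single)
print(split)
```

In the split case the resolved directory sits under the torch repository, so a library shipped by a separate distribution would not be found there.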

The lack of a single site-packages folder also complicates the rules themselves. Namespace packages in rules_python are all converted into pkgutil-style namespace packages. This seems to work, but it would not be necessary if site-packages were used.
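For readers unfamiliar with the pkgutil style, the following sketch shows the mechanism in isolation: each portion of the namespace ships an `__init__.py` that calls `pkgutil.extend_path`, which lets one package span several `sys.path` entries. The namespace `demo_ns` and the directory names are hypothetical, and this is not the code rules_python generates.

```python
import importlib
import os
import sys
import tempfile

# The conventional one-line __init__.py for a pkgutil-style namespace package.
PKGUTIL_INIT = "__path__ = __import__('pkgutil').extend_path(__path__, __name__)\n"


def make_portion(root, namespace, module, value):
    """Create one portion of a namespace package under its own root directory."""
    pkg_dir = os.path.join(root, namespace)
    os.makedirs(pkg_dir)
    with open(os.path.join(pkg_dir, "__init__.py"), "w") as f:
        f.write(PKGUTIL_INIT)
    with open(os.path.join(pkg_dir, module + ".py"), "w") as f:
        f.write(f"VALUE = {value!r}\n")


base = tempfile.mkdtemp()
root_a = os.path.join(base, "dist_a")
root_b = os.path.join(base, "dist_b")
make_portion(root_a, "demo_ns", "alpha", "from dist_a")
make_portion(root_b, "demo_ns", "beta", "from dist_b")

# Each distribution gets its own sys.path entry, as with separate repositories.
sys.path[:0] = [root_a, root_b]

alpha = importlib.import_module("demo_ns.alpha")
beta = importlib.import_module("demo_ns.beta")
print(alpha.VALUE, "|", beta.VALUE)
```

With a single site-packages directory, both portions would sit in the same `demo_ns` folder and no path extension would be needed.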

Another, rarer issue is the failure to load *.pth files. Python provides site-specific configuration hooks that can customize sys.path at startup. rules_python could perhaps work around this issue, but if a site-packages layout were used and discovered by the interpreter at startup, no workarounds would be necessary.
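The difference between a plain `sys.path` entry and a real site directory can be sketched with the standard `site` module. The file name `demo.pth` and the directory names are hypothetical; `site.addsitedir` stands in here for what the interpreter does at startup for a discovered site-packages directory.

```python
import os
import site
import sys
import tempfile

base = os.path.realpath(tempfile.mkdtemp())
site_dir = os.path.join(base, "site-packages")
extra_dir = os.path.join(base, "extra_src")
os.makedirs(site_dir)
os.makedirs(extra_dir)

# A .pth file lists extra directories (one per line) to add to sys.path.
with open(os.path.join(site_dir, "demo.pth"), "w") as f:
    f.write(extra_dir + "\n")

# Added as a plain sys.path entry, the directory's .pth files are never read.
sys.path.append(site_dir)
seen_as_plain_path = extra_dir in sys.path

# Registered as a site directory, its .pth files are processed.
sys.path.remove(site_dir)
site.addsitedir(site_dir)
seen_as_site_dir = extra_dir in sys.path

print(seen_as_plain_path, seen_as_site_dir)
```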

Distribution packages on PyPI known to have issues:

  • torch
  • onnxruntime-gpu
  • rerun-sdk

Known workarounds

  1. Patch the third-party dependencies using rules_python patching support
  2. Use an alternative set of rules such as rules_py
  3. Patch the third-party dependencies outside rules_python and push the patched dependencies to a private index

Proposed design to solve

The basic proposed solution is to create a per-binary virtual env whose site-packages contains symlinks to other locations in runfiles, e.g. `$runfiles/mybin.venv/site-packages/foo` would be a symlink to `$runfiles/_pypi_foo/site-packages/foo`.
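As a rough illustration of the resulting layout (not of how rules_python would implement it at build time), the sketch below builds such a directory in a temporary runfiles tree. The function `build_venv_site_packages` and its mapping argument are hypothetical, and the sketch assumes a platform where symlinks can be created.

```python
import os
import tempfile


def build_venv_site_packages(runfiles_root, venv_name, symlinks):
    """Create <venv_name>/site-packages with relative symlinks into runfiles.

    symlinks maps a site-packages-relative path to a runfiles-relative target.
    """
    site_packages = os.path.join(runfiles_root, venv_name, "site-packages")
    os.makedirs(site_packages, exist_ok=True)
    for rel_path, target in symlinks.items():
        link = os.path.join(site_packages, rel_path)
        link_dir = os.path.dirname(link)
        os.makedirs(link_dir, exist_ok=True)
        target_abs = os.path.join(runfiles_root, target)
        # Relative links keep the tree relocatable.
        os.symlink(os.path.relpath(target_abs, link_dir), link)
    return site_packages


runfiles = tempfile.mkdtemp()

# Stand-in for an extracted distribution in its own repository.
foo_dir = os.path.join(runfiles, "_pypi_foo", "site-packages", "foo")
os.makedirs(foo_dir)
with open(os.path.join(foo_dir, "__init__.py"), "w") as f:
    f.write("NAME = 'foo'\n")

site_packages = build_venv_site_packages(
    runfiles, "mybin.venv", {"foo": "_pypi_foo/site-packages/foo"}
)
print(os.readlink(os.path.join(site_packages, "foo")))
```

Adding the returned directory to `sys.path` (or having the interpreter discover it as a site directory) would then make `import foo` resolve through the symlink.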

TODO list

  • Add PyInfo.site_packages_symlinks: a depset of site-packages-relative paths and the runfiles paths to symlink them to.
  • Make pypi-generated targets use this site-packages solution by default
    • Disable pkgutil-style __init__.py generation in pypi repo phase
    • Maybe refactor the pypi generation to use a custom rule instead of plain py_library.
  • Add a flag to allow experimentation and testing
  • Edge cases
    • Handling two distributions that install into the same directory and/or have overlapping files
    • Handling pkgutil-style packages
    • Interaction of bootstrap=script vs bootstrap=system with this new layout
    • Handle platforms/cases where symlinks can't be created at build time (Windows, using rules_pkg)
    • Handling if multiple versions of a distribution are in the deps and ensuring only one is used, while still respecting merge/conflict logic.
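Several of the edge cases above come down to merging (site-packages path, runfiles path) pairs from many dependencies and noticing when they collide. The following is a minimal sketch under stated assumptions: the "first entry wins, later ones are reported" policy, the function names, and the example repository names are all hypothetical and are not the merge/conflict logic of rules_python.

```python
def merge_symlinks(entries):
    """Merge (site_packages_rel_path, runfiles_path) pairs.

    Assumed policy: the first target seen for a path wins, and any later,
    different target for the same path is recorded as a conflict.
    """
    chosen = {}
    conflicts = []
    for rel_path, target in entries:
        existing = chosen.get(rel_path)
        if existing is None:
            chosen[rel_path] = target
        elif existing != target:
            conflicts.append((rel_path, existing, target))
    return chosen, conflicts


def nested_paths(chosen):
    """Return (parent, child) pairs where one link path lies inside another."""
    paths = sorted(chosen)
    return [(a, b) for a in paths for b in paths if b.startswith(a + "/")]


entries = [
    ("foo", "_pypi_foo/site-packages/foo"),
    ("foo", "_pypi_foo_other/site-packages/foo"),  # second version of foo
    ("ns", "_pypi_ns_core/site-packages/ns"),
    ("ns/plugin", "_pypi_ns_plugin/site-packages/ns/plugin"),  # overlaps "ns"
]

chosen, conflicts = merge_symlinks(entries)
overlaps = nested_paths(chosen)
print(conflicts)
print(overlaps)
```

A nested pair such as `ns` and `ns/plugin` cannot both be plain directory symlinks, which is one reason overlapping distributions need special handling.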

Labels

core-rules: Issues concerning core bin/test/lib rules