Introduction
Hi P4A Maintainers and Contributors,
First, I want to acknowledge the immense complexity of the task python-for-android undertakes: enabling Python on Android, especially with compiled extensions, is a significant challenge. P4A has been a vital tool for many developers. This feedback aims to consolidate observations from various GitHub issues, recipe analyses, and code reviews into a constructive discussion about potential areas for improvement, focusing on enhancing the toolchain's robustness, maintainability, caching, packaging efficiency, and overall developer experience.
1. Challenges with Pip Fallback Mechanism
When p4a encounters a requirement without a dedicated recipe, the current pip fallback mechanism faces several recurring challenges, often leading to build failures or runtime errors:
- Incompatible Binaries from Host Wheels: Issues #3110 and #2755 (charset_normalizer), #2964 (curl-cffi), #2662 (pyodbc), and #2628 (fitz/PyMuPDF) consistently show runtime `dlopen` errors due to architecture mismatches (e.g., an `EM_X86_64` binary in an `EM_AARCH64` environment, or 64-bit vs 32-bit). This strongly suggests that `pip` (running on the host build machine) downloads pre-built wheels for the host architecture, and a subsequent step in p4a inadvertently allows these incompatible `.so` files into the final Android package.
- Source Build Failures: When pip attempts a source build for packages like `pycairo` (#2851), `readline` (#2787), or `pyaudio` (#2265), failures often occur during cross-compilation, even though p4a correctly sets NDK environment variables (`CC`, `CFLAGS`, etc.). Common causes appear to be:
  - Host Path Contamination: Build scripts referencing host system headers/libraries (`/usr/include`, `/usr/lib`) instead of relying solely on the NDK/target paths (e.g., `pycairo`, #2851).
  - Autotools Misconfiguration: `./configure` scripts failing because they aren't invoked with the necessary `--host=<target_triplet>` flag (e.g., `readline`, #2787).
  - Missing C Dependencies: Required C libraries lacking corresponding p4a recipes (e.g., `portaudio` for `pyaudio`, #2265).
- Transitive Dependency Exclusion: The command used in `build.py:run_pymodules_install` explicitly includes `--no-deps`. This prevents pip from installing transitive Python dependencies, forcing users to manually list the entire dependency tree for non-recipe packages to avoid runtime `ImportError`s.
These points indicate the pip fallback needs careful handling for compiled dependencies and Python dependency resolution, often requiring a dedicated recipe for reliability.
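To make the architecture mismatch concrete, here is a minimal sketch of how a `.so` file's ELF header could be compared against the target ABI. The offsets and `e_machine` values come from the ELF specification; the names `read_elf_arch`, `matches_abi`, and `ABI_TO_ELF` are hypothetical and are not part of p4a.

```python
import struct

# e_machine values as defined by the ELF specification.
EM_386, EM_ARM, EM_X86_64, EM_AARCH64 = 3, 40, 62, 183

# Hypothetical mapping (not p4a API): Android ABI -> (EI_CLASS, e_machine).
# EI_CLASS is 1 for 32-bit objects and 2 for 64-bit objects.
ABI_TO_ELF = {
    "armeabi-v7a": (1, EM_ARM),
    "arm64-v8a": (2, EM_AARCH64),
    "x86": (1, EM_386),
    "x86_64": (2, EM_X86_64),
}


def read_elf_arch(header: bytes):
    """Return (ei_class, e_machine) from the first 20 bytes of an ELF file."""
    if len(header) < 20 or header[:4] != b"\x7fELF":
        raise ValueError("not an ELF file")
    ei_class = header[4]                      # 1 = 32-bit, 2 = 64-bit
    endian = "<" if header[5] == 1 else ">"   # EI_DATA: 1 = little-endian
    # e_ident is 16 bytes, e_type is 2 bytes, so e_machine sits at offset 18.
    (e_machine,) = struct.unpack_from(endian + "H", header, 18)
    return ei_class, e_machine


def matches_abi(header: bytes, abi: str) -> bool:
    """True if the ELF header describes a binary built for the given ABI."""
    return read_elf_arch(header) == ABI_TO_ELF[abi]
```

A packaging step could read the first 20 bytes of every `.so` it is about to bundle (for example with `open(path, "rb").read(20)`) and fail the build when `matches_abi` returns `False`, instead of letting the mismatch surface as a `dlopen` error on the device.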
2. C Extension Filename Tagging (`.so` Naming)
The runtime errors involving incompatible binaries appear linked to `.so` filename handling during cross-compilation:
- `setuptools` Naming Behavior: Build backends like `setuptools`, when run under `hostpython`, often seem to name the output `.so` file using host platform tags, even when the code inside is compiled for the target.
- The `reduce_object_file_names` Workaround: This function in `TargetPythonRecipe` strips these incorrect host tags, allowing the target-compiled `.so` to be loaded generically.

  ```python
  # In TargetPythonRecipe
  def reduce_object_file_names(self, dirn):
      # Strips tags like cpython-XYZ-x86_64-linux-gnu
      # ... (excerpt: the code that computes filen, file_dirname and parts is omitted)
      move(filen, join(file_dirname, parts[0] + '.so'))
  ```
- Problematic Side Effect: This tag stripping also applies to incompatible host-architecture `.so` files from pip wheels, allowing them into the package and causing the runtime crashes mentioned above.
- Deviation from Standards: Stripping standard platform tags removes metadata Python typically uses. Addressing the root cause (ensuring correct target tag generation by the build backend) would be ideal.
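The loss of information can be illustrated with a small standalone sketch. It is not the p4a implementation. It only mimics the "keep everything before the first dot" behaviour from the excerpt, and `split_extension_tag` and `stripped_name` are hypothetical names.

```python
def split_extension_tag(filename: str):
    """Split 'mod.cpython-311-x86_64-linux-gnu.so' into (stem, tag).

    Returns (stem, None) for an untagged name such as 'mod.so'.
    """
    if not filename.endswith(".so"):
        raise ValueError(f"not a shared object name: {filename!r}")
    parts = filename[: -len(".so")].split(".")
    stem = parts[0]
    tag = ".".join(parts[1:]) or None
    return stem, tag


def stripped_name(filename: str) -> str:
    """Name the file would have after the tag is removed."""
    stem, _tag = split_extension_tag(filename)
    return stem + ".so"


# The tag is the only place the filename records a platform, and it is discarded:
host_wheel_file = "_fitz.cpython-311-x86_64-linux-gnu.so"
print(split_extension_tag(host_wheel_file))
print(stripped_name(host_wheel_file))
```

After stripping, a file that came from a host wheel and a file that was genuinely cross-compiled have the same name, so nothing in the filename distinguishes them any more and only the binary's own header can.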
3. Build Caching and Invalidation Challenges
The current caching requires frequent manual cleaning, indicating limitations:
- Recipe Changes: Modifying recipe files or patches often doesn't trigger a rebuild, as invalidation seems to be based on the existence of output artifacts rather than on input sources. This forces manual deletion of `build/other_builds/<recipe>`.
- Distribution Reuse: Stale distributions are reused even if the underlying recipes have changed. This forces manual deletion of `dists/`.
- Site-packages Updates: Updating a Python recipe version doesn't trigger a clean reinstall into `build/python-installs/`, because the check only tests whether the package already exists in site-packages. This forces manual deletion there as well.
Implementing more robust invalidation (e.g., hashing inputs, tracking dependencies accurately, ensuring clean installs) would improve developer workflow.
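As one possible shape for input-based invalidation, the sketch below hashes a recipe directory together with the version and architecture, and stores the result in a stamp file inside the build directory. All names here (`recipe_fingerprint`, `needs_rebuild`, `record_build`, and the `.p4a_input_hash` stamp file) are hypothetical; p4a does not currently provide them.

```python
import hashlib
from pathlib import Path

STAMP_NAME = ".p4a_input_hash"  # hypothetical stamp file name


def recipe_fingerprint(recipe_dir, version: str, arch: str) -> str:
    """Hash every file under recipe_dir plus the version and arch strings."""
    digest = hashlib.sha256()
    digest.update(f"{version}\0{arch}\0".encode())
    root = Path(recipe_dir)
    # Sort so the result does not depend on filesystem enumeration order.
    for path in sorted(p for p in root.rglob("*") if p.is_file()):
        digest.update(path.relative_to(root).as_posix().encode())
        digest.update(b"\0")
        digest.update(path.read_bytes())
        digest.update(b"\0")
    return digest.hexdigest()


def needs_rebuild(build_dir, fingerprint: str) -> bool:
    """True when no stamp exists or the recorded inputs differ."""
    stamp = Path(build_dir) / STAMP_NAME
    return not stamp.is_file() or stamp.read_text().strip() != fingerprint


def record_build(build_dir, fingerprint: str) -> None:
    """Write the stamp after a successful build."""
    build_path = Path(build_dir)
    build_path.mkdir(parents=True, exist_ok=True)
    (build_path / STAMP_NAME).write_text(fingerprint)
```

With something like this, editing a patch file in the recipe directory changes the fingerprint, so `needs_rebuild` returns `True` without anyone deleting `build/other_builds/<recipe>` by hand. A fuller version would probably also fold in the fingerprints of the recipe's dependencies and the NDK/API settings.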
4. Build and Packaging Inefficiency
Several areas contribute to larger build directories and final package sizes:
- Source Code Duplication: The build process unpacks or copies the entire source code for each recipe into separate, architecture-specific directories within `build/other_builds/`. This duplicates the (usually architecture-independent) source code for every target architecture, increasing disk usage during builds.
- Redundant Pure-Python Installs: Pure-Python packages appear to be installed redundantly into each architecture's staging directory (`build/python-installs/<dist>/<arch>`), repeating the install process.
- `libpybundle.so` Bloat: Bundling stdlib, pure-Python site-packages (`*.pyc`), and all extensions (`*.so`) into a per-architecture gzipped archive named `libpybundle.so` (to leverage OS extraction) duplicates all architecture-independent bytecode.
- Bundling Unused Standard Library: The entire standard library (minus a small blacklist) is included in `stdlib.zip`, rather than performing import analysis to include only necessary modules, further increasing size compared to tools like PyInstaller.
5. Linker Workarounds (`LibPthread`/`LibRt`)
- Recipes like `LibPthread` and `LibRt` exist solely to create fake `libpthread.so`/`librt.so` symlinks pointing to `libc.so`.
- This works around build systems (like `uvloop`'s) that incorrectly try to link `-lpthread` or `-lrt` on Android, where these symbols are part of `libc`.
- This is a hack that pollutes the global linker path. The ideal fix is patching the dependent recipes to remove the unnecessary linker flags when targeting Android.
6. Recipe Maintenance and Ecosystem Support
- Outdated Core Recipes: Many foundational recipes (NumPy, OpenSSL, SciPy, flask, OpenCV, Pandas, Cython, ICU, etc.) are significantly behind current stable/secure versions, limiting usability and posing risks.
Summary & Potential Path Forward
A potential path forward could involve a focused effort on:
- Fix Cross-Compile Naming: Resolve the `.so` filename tagging issue at the build backend level.
- Implement Robust Caching: Improve build/dist invalidation based on actual input changes.
- Optimize Build/Packaging: Avoid source/bytecode duplication; perform import analysis for stdlib.
- Fix Pip Fallback: Ensure correct transitive dependency bundling and improve build environment isolation.
- Add Build-Time Architecture Validation: Check that `.so` files match the target arch before final packaging.
- Remove Hacks: Eliminate `reduce_object_file_names`, `LibPthread`/`LibRt`, etc., as underlying issues are fixed.
- Prioritize Core Recipe Updates: Focus on updating critical outdated recipes once the build system is more robust. Communicate challenges transparently.
Addressing these foundational areas seems key to making p4a more robust, maintainable, efficient, and capable of supporting the modern Python ecosystem on Android.
Thank you again for maintaining this important project and for considering this comprehensive feedback.