Move the relocation of binary text in its own class
Drop threaded text replacement, since the current bottleneck
is decompression. It would be better to parallellize over packages,
instead of over files per package.
A small improvement with separate classes for text replacement is that we
now compile the regex in the constructor; previously it was compiled per
binary to be relocated.
Compilers and linker optimize string constants for space by aliasing
them when one is a suffix of another. For gcc / binutils this happens
already at -O1, due to -fmerge-constants. This means that we have
to take care during relocation to always preserve a certain length
of the suffix of those prefixes that are C-strings.
In this commit we pick length 7 as a safe suffix length, assuming the
suffix is typically the 7 characters from the hash (i.e. random), so
it's unlikely to alias with any string constant used in the sources.
In general we now pad shortened strings from the left with leading
dir seperators, but in the case of C-strings that are much shorter
and don't share a common suffix (due to projections), we do allow
shrinking the C-string, appending a null, and retaining the old part
of the prefix.
Also when rewiring, we ensure that the new hash preserves the last
7 bytes of the old hash.
Co-authored-by: Harmen Stoppels <harmenstoppels@gmail.com>
Whenever the rpath string actually _grows_, it falls back to patchelf,
when it stays the same length or gets shorter, we update it in-place,
padded with null bytes.
This PR only deals with absolute -> absolute rpath replacement. We don't
use `_build_tarball(relative=True)` in our CI. If `relative` then it falls
back to the old replacement code.
With this PR, relocation time goes down significantly, likely because patchelf
does some odd things with mmap, causing lots of overhead. Example:
- `binutils`: 700MB installed, goes from `1.91s` to `0.57s`, or `3.4x` faster.
Relocation time: 27% -> 10% of total install time
- `llvm`: 6.8GB installed, goes from `28.56s` to `5.38`, or `5.3x` faster.
Relocation time: 44% -> 13% of total install time
The bottleneck is now decompression.
Note: I'm somewhat confused about the "relative rpath" code paths. Right
now this PR only deals with absolute -> absolute replacement. As far as
I understand, if you embrace relative rpaths when uploading to the
buildcache, the whole point is you _don't_ want to patch rpaths on
install? So it seems fine to not expand `$ORIGIN` again imho.
- single pass over the binary data matching all prefixes
- collect offsets and replacement strings
- do in-place updates with `fseek` / `fwrite`, since typically our
replacement touch O(few bytes) while the file is O(many megabytes)
- be nice: leave the file untouched if some string can't be
replaced
Instead of looping over multiple regexes and the entire text file
contents, create a giant regex with all literal prefixes and do a single
pass over files to detect prefixes. Not only is a single pass faster,
it's also likely that the regex is compiled better, given that most
prefixes share a common ... prefix.
Scan the text files for relocatable prefixes *before* creating a tarball,
to reduce the amount of work to be done during install from binary
cache.
Co-authored-by: Harmen Stoppels <harmenstoppels@gmail.com>
Currently `relocate_text` and `relocate_text_bin` are unsafe in the
sense that they run in parallel, and lead to races when modifying
different items pointing to the same inode.
This leads to the issue observed in #33453.
This PR:
1. Renames those functions to `unsafe_*` so people are aware
2. Adds logic to deal with hardlinks in current binary packages
3. Adds logic to deal with hardlinks when creating new binary tarballs,
so the install side doesn't have to de-dupe hardlinks.
4. Adds a test for 3
The assumption is that all our relocation logic preserves inodes. That
is, we should never copy a file, modify it, and then move it back. I
quickly verified, and its seems like this is true for (binary) text
relocation, as well as rpath patching in patchelf (even when the file
grows) and mach-o fixes.
Fixup common tests
* Remove requirement for Python 2.6
* Skip new failing test
Windows: Update url util to handle Windows paths (#27959)
* update url util to handle windows paths
* Update tests to handle fixed url handling
* canonicalize path only when the path type matches the host platform
* Skip some url tests on Windows
Co-authored-by: Omar Padron <omar.padron@kitware.com>
Use threading.TIMEOUT_MAX when available (#24246)
This value was introduced in Python 3.2. Specifying a timeout greater than
this value will raise an OverflowError.
Co-authored-by: Lou Lawrence <lou.lawrence@kitware.com>
Co-authored-by: John Parent <john.parent@kitware.com>
Co-authored-by: Betsy McPhail <betsy.mcphail@kitware.com>
CMake - Windows Bootstrap (#25825)
Remove hardcoded cmake compiler (#26410)
Revert breaking cmake changes
Ensure no autotools on Windows
Perl on Windows (#26612)
Python source build windows (#26313)
Reconfigure sysconf for Windows
Python2.6 compatibility
Fxixup new sbang tests for windows
Ruby support (#28287)
Add NASM support (#28319)
Add mock Ninja package for testing
Remove a custom bootstrapping procedure to
use spack.bootstrap instead
Modifications:
* Reference count the bootstrap context manager
* Avoid SpackCommand to make the bootstrapping
procedure more transparent
* Put back requirement on patchelf being in PATH for unit tests
* Add an e2e test to check bootstrapping patchelf
I think this test should be removed, but when it stays, it should at
least follow the symlink, cause it fails for me if I let spack build
patchelf and have a symlink in a view.
After #26608 I got a report about missing rpaths when building a
downstream package independently using a spack-installed toolchain
(@tmdelellis). This occurred because the spack-installed libraries were
being linked into the downstream app, but the rpaths were not being
manually added. Prior to #26608 autotools-installed libs would retain
their hard-coded path and would thus propagate their link information
into the downstream library on mac.
We could solve this problem *if* the mac linker (ld) respected
`LD_RUN_PATH` like it does on GNU systems, i.e. adding `rpath` entries
to each item in the environment variable. However on mac we would have
to manually add rpaths either using spack's compiler wrapper scripts or
manually (e.g. using `CMAKE_BUILD_RPATH` and pointing to the libraries of
all the autotools-installed spack libraries).
The easier and safer thing to do for now is to simply stop changing the
dylib IDs.
* relocate: call install_name_tool less
* zstd: fix race condition
Multiple times on my mac, trying to install in parallel led to failures
from multiple tasks trying to simultaneously create `$PREFIX/lib`.
* PackageMeta: simplify callback flush
* Relocate: use spack.platforms instead of platform
* Relocate: code improvements
* fix zstd
* Automatically fix rpaths for packages on macOS
* Only change library IDs when the path is already in the rpath
This restores the hardcoded library path for GCC.
* Delete nonexistent rpaths and add more testing
* Relocate: Allow @executable_path and @loader_path
The `spack.architecture` module contains an `Arch` class that is very similar to `spack.spec.ArchSpec` but points to platform, operating system and target objects rather than "names". There's a TODO in the class since 2016:
abb0f6e27c/lib/spack/spack/architecture.py (L70-L75)
and this PR basically addresses that. Since there are just a few places where the `Arch` class was used, here we query the relevant platform objects where they are needed directly from `spack.platforms`. This permits to clean the code from vestigial logic.
Modifications:
- [x] Remove the `spack.architecture` module and replace its use by `spack.platforms`
- [x] Remove unneeded tests
* fix remaining flake8 errors
* imports: sort imports everywhere in Spack
We enabled import order checking in #23947, but fixing things manually drives
people crazy. This used `spack style --fix --all` from #24071 to automatically
sort everything in Spack so PR submitters won't have to deal with it.
This should go in after #24071, as it assumes we're using `isort`, not
`flake8-import-order` to order things. `isort` seems to be more flexible and
allows `llnl` mports to be in their own group before `spack` ones, so this
seems like a good switch.
* sbang pushed back to callers;
star moved to util.lang
* updated unit test
* sbang test moved; local tests pass
Co-authored-by: Nathan Hanford <hanford1@llnl.gov>
- [x] add `concretize.lp`, `spack.yaml`, etc. to licensed files
- [x] update all licensed files to say 2013-2021 using
`spack license update-copyright-year`
- [x] appease mypy with some additions to package.py that needed
for oneapi.py
This fixes sbang relocation when using old binary packages, and updates
code in `relocate.py`.
There are really two places where we would want to handle an `sbang`
relocation:
1. Installing an old package that uses `sbang` with shebang lines like
`#!/bin/bash $spack_prefix/sbang`
2. Installing a *new* package that uses `sbang` with shebang lines like
`#!/bin/sh $install_tree/sbang`
The second case is actually handled automatically by our text relocation;
we don't need any special relocation logic for new shebangs, as our
relocation logic already changes references to the build-time
`install_tree` to point to the `install_tree` at intall-time.
Case 1 was not properly handled -- we would not take an old binary
package and point its shebangs at the new `sbang` location. This PR fixes
that and updates the code in `relocation.py` with some notes.
There is one more case we don't currently handle: if a binary package is
created from an installation in a short prefix that does *not* need
`sbang` and is installed to a long prefix that *does* need `sbang`, we
won't do anything. We should just patch the file as we would for a normal
install. In some upcoming PR we should probably change *all* `sbang`
relocation logic to be idempotent and to apply to any sort of shebang'd
file. Then we'd only have to worry about which files to `sbang`-ify at
install time and wouldn't need to care about these special cases.
* make_link_relative: added docstring
* make_elf_binaries_relative: added docstring, unit tests
* raise_if_not_relocatable: added docstring, added unit test for exceptional case
* relocate_links: removed unused arguments, added docstring and comments
Also fixed a possible bug that was issuing spurious
warning when a file was relocated successfully
* relocate_text: added docstring and comments, renamed arguments
* relocate_text_bin: added docstring and comments, renamed arguments, unit tests
- add docstrings and make parameter names consistent in `relocate.py`
- Make `replace_prefix_*` and other functions private (as they are implementation details)
- remove unused function _replace_prefix_nullterm()
- Add unit tests for `relocate.py` functions
- add patchelf to Travis and use it during tests
- add hello_world fixture with a compiled binary, so we can test relocation
This PR introduces trivial refactoring in:
- `get_existing_elf_rpaths`
- `get_relative_elf_rpaths`
- `get_normalized_elf_rpaths`
- `set_placeholder`
mainly to be more consistent with practices used in other
parts of the code and to simplify functions locally. It also
adds or reworks unit tests for these functions and extends
their docstrings.
Co-authored-by: Patrick Gartung <gartung@fnal.gov>
Co-authored-by: Peter J. Scheibel <scheibel1@llnl.gov>
* relocate: removed import from statements
* relocate: renamed *Exception to *Error
This aims at consistency in naming with both
the standard library (ValueError, AttributeError,
etc.) and other errors in 'spack.error'.
Improved existing docstrings
* relocate: simplified search function by un-nesting conditionals
The search function that searches for patchelf has been
refactored to remove deeply nested conditionals.
Extended docstring.
* relocate: removed a condition specific to unit tests
* relocate: added test for _patchelf
Fixes#11678
`spack compiler find` was not searching `PATH` when provided with no
arguments. ea7910a updated the API for the search function and the
command logic did not update how it called this function. This also
adds a test to ensure that `spack compiler find` will collect
compilers from `PATH`.
* Added the `spack buildcache preview` sub-command
This is similar to `spack spec -I` but highlights which nodes in a DAG
are relocatable and which are not.
spec.tree has been generalized a little to accept a status function,
instead of always showing the install status
The current implementation works only for ELF, and needs to be
generalized to other platforms.
* Added a test to check if an executable is relocatable or not
This test requires a few commands to be present in the environment.
Currently it will run only under python 3.7 (which uses Xenial instead
of Trusty).
* Added tests for the 'buildcache preview' command.
* Fixed codebase after rebase
* Fixed the list of apt addons for Python 3.7 in travis.yaml
* Only check ELF executables and shared libraries. Skip checking virtual or external packages. (#229)
* Fixed flake8 issues
* Add handling for macOS mach binaries (#231)