build caches: collect files to relocate while tarballing w/o file (#48212)

A few changes to tarball creation (for build caches):
- do not run file to distinguish binary from text
- file is slow, even when running it in a batched fashion -- it usually reads all bytes and has slow logic to categorize specific types
- we don't need a highly detailed file categorization; a crude categorization of elf, mach-o, text suffices.
detecting elf and mach-o is straightforward and cheap
- detecting utf-8 (and with that ascii) is highly accurate: false positive rate decays exponentially as file size increases. Further it's not only the most common encoding, but the most common file type in package prefixes.
iso-8859-1 is cheaply (but heuristically) detected too, and sufficiently accurate after binaries and utf-8 files are classified earlier
- remove file as a dependency of Spack in general, which makes Spack itself easier to install
- detect file type and need to relocate as part of creating the tarball, which is more cache friendly and thus faster
This commit is contained in:
Harmen Stoppels
2024-12-24 18:53:13 +01:00
committed by GitHub
parent aca469b329
commit e9cdcc4af0
17 changed files with 249 additions and 411 deletions

View File

@@ -161,11 +161,7 @@ jobs:
source share/spack/setup-env.sh
spack -d gpg list
tree $HOME/.spack/bootstrap/store/
- name: Bootstrap File
run: |
source share/spack/setup-env.sh
spack -d python share/spack/qa/bootstrap-file.py
tree $HOME/.spack/bootstrap/store/
windows:
runs-on: "windows-latest"
@@ -196,9 +192,3 @@ jobs:
spack -d gpg list
./share/spack/qa/validate_last_exit.ps1
tree $env:userprofile/.spack/bootstrap/store/
- name: Bootstrap File
run: |
./share/spack/setup-env.ps1
spack -d python share/spack/qa/bootstrap-file.py
./share/spack/qa/validate_last_exit.ps1
tree $env:userprofile/.spack/bootstrap/store/