build caches: collect files to relocate while tarballing w/o file (#48212)

A few changes to tarball creation (for build caches):
- do not run file to distinguish binary from text
- file is slow, even when running it in a batched fashion -- it usually reads all bytes and has slow logic to categorize specific types
- we don't need a highly detailed file categorization; a crude categorization of elf, mach-o, text suffices.
detecting elf and mach-o is straightforward and cheap
- detecting utf-8 (and with that ascii) is highly accurate: false positive rate decays exponentially as file size increases. Further it's not only the most common encoding, but the most common file type in package prefixes.
iso-8859-1 is cheaply (but heuristically) detected too, and sufficiently accurate after binaries and utf-8 files are classified earlier
- remove file as a dependency of Spack in general, which makes Spack itself easier to install
- detect file type and need to relocate as part of creating the tarball, which is more cache friendly and thus faster
This commit is contained in:
Harmen Stoppels
2024-12-24 18:53:13 +01:00
committed by GitHub
parent aca469b329
commit e9cdcc4af0
17 changed files with 249 additions and 411 deletions

View File

@@ -35,7 +35,7 @@ A build matrix showing which packages are working on which systems is shown belo
.. code-block:: console
apt update
apt install bzip2 ca-certificates file g++ gcc gfortran git gzip lsb-release patch python3 tar unzip xz-utils zstd
apt install bzip2 ca-certificates g++ gcc gfortran git gzip lsb-release patch python3 tar unzip xz-utils zstd
.. tab-item:: RHEL

View File

@@ -8,7 +8,6 @@ unzip, , , Compress/Decompress archives
bzip2, , , Compress/Decompress archives
xz, , , Compress/Decompress archives
zstd, , Optional, Compress/Decompress archives
file, , , Create/Use Buildcaches
lsb-release, , , Linux: identify operating system version
gnupg2, , , Sign/Verify Buildcaches
git, , , Manage Software Repositories
1 Name Supported Versions Notes Requirement Reason
8 bzip2 Compress/Decompress archives
9 xz Compress/Decompress archives
10 zstd Optional Compress/Decompress archives
file Create/Use Buildcaches
11 lsb-release Linux: identify operating system version
12 gnupg2 Sign/Verify Buildcaches
13 git Manage Software Repositories