Don't fetch to order mirrors (#34359)

When installing binary tarballs, Spack has to download from its
binary mirrors.

Sometimes Spack has a local cache available for these mirrors.

That cache helps to order mirrors to increase the likelihood of
getting a direct hit.
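
As a rough illustration of that ordering (a sketch only, not Spack's
actual code; `order_mirrors` and its arguments are hypothetical names),
mirrors that the local cache lists for a spec are tried first, and the
rest keep their configured order:

```python
def order_mirrors(configured, cached_hits):
    """Return mirror URLs with locally cached hits first.

    Hypothetical helper: `configured` is the list of configured mirror
    URLs, `cached_hits` the URLs the local cache lists for the spec.
    """
    hits = set(cached_hits)
    # sorted() is stable, so mirrors keep their configured order within
    # each group; False (a cached hit) sorts before True (no hit).
    return sorted(configured, key=lambda url: url not in hits)
```

With an empty cache the configured order is returned unchanged.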

However, currently, when Spack can't find a spec in any local cache
of mirrors, it's very dumb:

1. A while ago it used to query each mirror to see if it had a spec,
   and use that information to order the mirrors again, only to then
   repeat part of what it had just done: fetch the spec from that
   mirror.
2. Recently, it was changed to download a full index.json, which
   can be dozens of MBs of data and may take a minute to
   process thanks to the blazing fast performance you get with
   Python.

In a typical use case of concretizing with reuse, the full index.json
is already available, and it is likely that the local cache gives a
perfect mirror ordering on install. (There's typically no need to update
any caches.)

However, in the use case of GitLab CI, the build jobs don't have a cache,
and it would be smart to just do direct fetches instead of all the
redundant work of (1) and/or (2).

Also, direct fetches from mirrors will soon be fast enough to
prefer these direct fetches over the excruciating slowness of
index.json files.
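
The intended flow can be sketched roughly as follows. This is an
illustration under assumptions, not the code in this commit:
`install_from_mirrors`, `local_index`, and `try_fetch` are hypothetical
names, and `try_fetch` stands in for one direct download attempt.

```python
def install_from_mirrors(spec_hash, mirrors, local_index, try_fetch):
    """Try mirrors for `spec_hash` without refreshing any index.

    local_index: dict mapping spec hash -> list of mirror URLs (local cache)
    try_fetch:   callable(url, spec_hash) -> bool, one direct fetch attempt
    """
    # Local lookup only: a cache miss yields an empty list, not a request.
    hits = local_index.get(spec_hash, [])
    # Cached hits go first; the remaining mirrors keep their configured order.
    ordered = hits + [url for url in mirrors if url not in hits]
    for url in ordered:
        if try_fetch(url, spec_hash):
            return url
    return None
```

With an empty local cache, as in a fresh CI job, this degrades to trying
each configured mirror in turn, with no index download in between.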
Harmen Stoppels 2022-12-13 17:07:11 +01:00 committed by GitHub
parent 8b68b4ae72
commit 333da47dc7
2 changed files with 15 additions and 11 deletions

@@ -266,10 +266,7 @@ def find_by_hash(self, find_hash, mirrors_to_check=None):
                 None, just assumes all configured mirrors.
         """
         if find_hash not in self._mirrors_for_spec:
-            # Not found in the cached index, pull the latest from the server.
-            self.update(with_cooldown=True)
-        if find_hash not in self._mirrors_for_spec:
-            return None
+            return []
         results = self._mirrors_for_spec[find_hash]
         if not mirrors_to_check:
             return results
@@ -2084,8 +2081,8 @@ def get_mirrors_for_spec(spec=None, mirrors_to_check=None, index_only=False):
         spec (spack.spec.Spec): The spec to look for in binary mirrors
         mirrors_to_check (dict): Optionally override the configured mirrors
             with the mirrors in this dictionary.
-        index_only (bool): Do not attempt direct fetching of ``spec.json``
-            files from remote mirrors, only consider the indices.
+        index_only (bool): When ``index_only`` is set to ``True``, only the local
+            cache is checked, no requests are made.
     Return:
         A list of objects, each containing a ``mirror_url`` and ``spec`` key

@@ -48,6 +48,7 @@
 import spack.compilers
 import spack.error
 import spack.hooks
+import spack.mirror
 import spack.package_base
 import spack.package_prefs as prefs
 import spack.repo
@@ -419,18 +420,24 @@ def _try_install_from_binary_cache(pkg, explicit, unsigned=False, timer=timer.NU
             otherwise, ``False``
         timer (Timer):
     """
+    # Early exit if no mirrors are configured.
+    if not spack.mirror.MirrorCollection():
+        return False
     pkg_id = package_id(pkg)
     tty.debug("Searching for binary cache of {0}".format(pkg_id))
     timer.start("search")
-    matches = binary_distribution.get_mirrors_for_spec(pkg.spec)
+    matches = binary_distribution.get_mirrors_for_spec(pkg.spec, index_only=True)
     timer.stop("search")
-    if not matches:
-        return False
     return _process_binary_cache_tarball(
-        pkg, pkg.spec, explicit, unsigned, mirrors_for_spec=matches, timer=timer
+        pkg,
+        pkg.spec,
+        explicit,
+        unsigned,
+        mirrors_for_spec=matches,
+        timer=timer,
     )