spack


Todd Gamblin 8363fbf40f Spec: use short-circuiting, stable comparison

## Background

Spec comparison on develop used a somewhat questionable optimization to
get decent spec comparison performance -- instead of comparing entire spec
DAGs, it put a `hash()` call in `_cmp_iter()` and compared specs by their
runtime hashes. This gets us good performance for abstract specs, which don't
have complex dependencies and for which hashing is cheap. But it makes
the order of specs unstable and hard to reproduce.
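
As a rough illustration of why ordering by runtime hashes is hard to reproduce: in
CPython, `hash()` of a string is salted per process unless `PYTHONHASHSEED` is fixed.
Whether this is the exact mechanism behind the unstable order described above is an
assumption; the helper name below is hypothetical.

```python
import os
import subprocess
import sys


def str_hash_in_subprocess(seed, text="zlib"):
    """Return hash(text) as computed by a fresh interpreter with a given seed."""
    env = dict(os.environ, PYTHONHASHSEED=str(seed))
    result = subprocess.run(
        [sys.executable, "-c", f"print(hash({text!r}))"],
        env=env,
        capture_output=True,
        text=True,
        check=True,
    )
    return int(result.stdout)


# The same string hashes differently under different seeds, so any sort
# keyed on hash() can change from one run to the next.
print(str_hash_in_subprocess(1), str_hash_in_subprocess(2))
```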

We really need to do a full, consistent traversal over specs to compare
and to get a stable ordering. Simply taking the hash out and yielding
dependencies recursively (i.e. yielding `dep.spec._cmp_iter()` instead
of a hash) goes exponential for concrete specs because it explores all
paths. Traversal tracks visited nodes, but it's expensive to set up
the data structures for that, and it can slow down comparison of simple
abstract specs. Abstract spec comparison performance is important for
concretization (specifically setup), so we don't want to do that.
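
A toy model of the blow-up, not Spack code: the graph shape and function names here
are made up for illustration. Each level has two nodes that both depend on both nodes
of the next level, so recursing without a visited set walks every path, while a
tracked traversal touches each node once.

```python
from collections import deque


def layered_dag(depth):
    """Build root -> {a0, b0} -> {a1, b1} -> ... as an adjacency dict."""
    graph = {"root": ["a0", "b0"]}
    for i in range(depth):
        nxt = [f"a{i + 1}", f"b{i + 1}"] if i + 1 < depth else []
        graph[f"a{i}"] = nxt
        graph[f"b{i}"] = nxt
    return graph


def count_naive_visits(graph, node="root"):
    """Recurse into every dependency without remembering visited nodes."""
    return 1 + sum(count_naive_visits(graph, dep) for dep in graph[node])


def count_tracked_visits(graph, root="root"):
    """Breadth-first traversal that visits each node exactly once."""
    seen = {root}
    queue = deque([root])
    visits = 0
    while queue:
        node = queue.popleft()
        visits += 1
        for dep in graph[node]:
            if dep not in seen:
                seen.add(dep)
                queue.append(dep)
    return visits


graph = layered_dag(10)
print(count_naive_visits(graph))    # 2**11 - 1 = 2047 visits for 21 nodes
print(count_tracked_visits(graph))  # 21 visits
```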

## New comparison algorithm

We can have (mostly) the best of both worlds -- it's just a bit more
complicated.

This changes Spec comparison to do a proper, stable graph comparison:

1. Spec comparison will now short-circuit whenever possible for concrete
   specs, when DAG hashes are known to be equal or not equal. This means
   that concrete spec `==` and `!=` comparisons will no longer have
   to traverse the whole DAG.

2. Spec comparison now traverses the graph consistently, comparing nodes
   and edges in breadth-first order. This means Spec sort order is stable,
   and it won't vary arbitrarily from run to run.

3. Traversal can be expensive, so we avoid it for simple specs. Specifically,
   if a spec has no dependencies, or if its dependencies have no dependencies,
   we avoid calling `traverse_edges()` by doing some special casing.
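
The short-circuit in item 1 can be sketched roughly as follows. This is a simplified
stand-in, not the actual `Spec` implementation: the `Node` class, its `dag_hash`
field, and `slow_key()` are hypothetical names used only for illustration.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass(eq=False)
class Node:
    name: str
    deps: List["Node"] = field(default_factory=list)
    # Assumption: only "concrete" nodes carry a precomputed DAG hash.
    dag_hash: Optional[str] = None


def slow_key(node):
    """Stand-in for a full structural comparison key."""
    return (node.name, tuple(slow_key(dep) for dep in node.deps))


def nodes_equal(a, b):
    if a is b:
        return True
    # Both hashes known: they settle equality without touching the DAG.
    if a.dag_hash is not None and b.dag_hash is not None:
        return a.dag_hash == b.dag_hash
    # Otherwise fall back to comparing structure.
    return slow_key(a) == slow_key(b)


print(nodes_equal(Node("zlib", dag_hash="abc"), Node("zlib", dag_hash="abc")))
print(nodes_equal(Node("zlib", dag_hash="abc"), Node("zlib", dag_hash="def")))
print(nodes_equal(Node("zlib"), Node("zlib")))
```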

The `_cmp_iter` method for `Spec` now iterates over the DAG and yields nodes
in BFS order. While it does that, it generates consistent ids for each node,
based on traversal order. It then outputs edges in terms of these ids, along with
their depflags and virtuals, so that all parts of the Spec DAG are included.
The resulting total ordering of specs keys on node attributes first, then
dependency nodes, then any edge differences between graphs.

Optimized cases skip the id generation and traversal, since we know the
order and therefore the ids in advance.
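
A minimal sketch of the BFS-with-ids idea described for `_cmp_iter`, under stated
assumptions: the `Node` class, the `(child, depflag, virtuals)` edge tuples, and
sorting children by name are all illustrative choices, not the real `Spec` data
model. Node records are tagged `0` and edge records `1` so that keys of different
shapes still compare cleanly.

```python
from collections import deque


class Node:
    def __init__(self, name, version=""):
        self.name = name
        self.version = version
        self.edges = []  # (child, depflag, virtuals) tuples

    def depends_on(self, child, depflag="link", virtuals=()):
        self.edges.append((child, depflag, tuple(virtuals)))
        return self


def cmp_iter(root):
    """Yield node records in BFS order, then edges in terms of BFS ids."""
    ids = {id(root): 0}
    order = [root]
    edges = []
    queue = deque([root])
    while queue:
        node = queue.popleft()
        # Visit children in a fixed order so ids do not depend on insertion order.
        for child, depflag, virtuals in sorted(node.edges, key=lambda e: e[0].name):
            if id(child) not in ids:
                ids[id(child)] = len(ids)
                order.append(child)
                queue.append(child)
            edges.append((1, ids[id(node)], ids[id(child)], depflag, virtuals))
    for node in order:
        yield (0, node.name, node.version)
    yield from edges


def compare_key(root):
    return tuple(cmp_iter(root))


def diamond(flag="link"):
    d = Node("d")
    b = Node("b").depends_on(d)
    c = Node("c").depends_on(d, depflag=flag)
    return Node("a").depends_on(c).depends_on(b)


# Two separately built but identical graphs produce the same key.
print(compare_key(diamond()) == compare_key(diamond()))
# A differing edge attribute shows up only in the edge records.
print(compare_key(diamond()) == compare_key(diamond(flag="build")))
```

Because the key is a plain tuple, `sorted(specs, key=compare_key)` gives the same
order on every run in this toy model.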

## Performance ramifications

### Abstract specs

This seems to add around 7-8% overhead to concretization setup time. It's
worth the cost, because this enables concretization caching (as input to
concretization was previously not stable) and setup will eventually be
parallelized, at which point it will no longer be a bottleneck for solving.
Together those two optimizations will cut well over 50% of the time (likely
closer to 90+%) off of most solves.

### Concrete specs

Comparison for concrete specs is faster than before, sometimes *way* faster
because comparison is now guaranteed to be linear time w.r.t. DAG size.
Times for comparing concrete Specs:

```python
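import timeit

import spack.spec

def compare(json):
    # Build two separate Spec objects from the same spec file.
    a = spack.spec.Spec(json)
    b = spack.spec.Spec(json)
    print(a == b)
    # Time a single equality comparison.
    print(timeit.timeit(lambda: a == b, number=1))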

compare("./py-black.json")
compare("./hdf5.json")
```

* `develop` (uses our prior hash optimization):
  * `py-black`: 7.013e-05s
  * `py-hdf5`: 6.445e-05s

* `develop` with full traversal and no hash:
  * `py-black`: 3.955s
  * `py-hdf5`: 0.0122s

* This branch (full traversal, stable, short-circuiting, no hash):
  * `py-black`: 2.208e-06s
  * `py-hdf5`: 3.416e-06s

Signed-off-by: Todd Gamblin <tgamblin@llnl.gov>

2025-05-19 15:59:03 -07:00

| Name | Last commit | Date |
| --- | --- | --- |
| .devcontainer | codespaces: add ubuntu22.04 (#46100) | 2024-09-12 13:40:05 +02:00 |
| .github | Sync CI config to spack/spack-packages (#50497) | 2025-05-16 11:39:11 +02:00 |
| bin | import os.path -> os (#48709) | 2025-01-28 09:45:43 +01:00 |
| etc/spack/defaults | builtin: use api v2.0 and update dir structure (#49275) | 2025-05-06 12:05:44 +02:00 |
| lib/spack | Spec: use short-circuiting, stable comparison | 2025-05-19 15:59:03 -07:00 |
| share/spack | Move builders into builtin repo (#50452) | 2025-05-18 20:31:20 -07:00 |
| var/spack | Move builders into builtin repo (#50452) | 2025-05-18 20:31:20 -07:00 |
| .codecov.yml | codecov: increase project threshold to 2% (#46828) | 2024-10-07 08:24:22 +02:00 |
| .dockerignore | Docker: ignore var/spack/cache (source caches) when creating container (#23329) | 2021-05-17 11:28:58 +02:00 |
| .flake8 | builtin: use api v2.0 and update dir structure (#49275) | 2025-05-06 12:05:44 +02:00 |
| .git-blame-ignore-revs | Ignore black reformat in git blame (#35544) | 2023-02-18 01:03:50 -08:00 |
| .gitattributes | Remove Prolog so that GitHub detects Answer Set Programming (#50077) | 2025-04-15 22:41:53 -07:00 |
| .gitignore | gitignore: remove *_archive (#49278) | 2025-03-04 18:37:18 +01:00 |
| .mailmap | … | |
| .readthedocs.yml | docs: do not promote build_systems/* at all (#47111) | 2024-10-21 13:40:29 +02:00 |
| CHANGELOG.md | update CHANGELOG.md (#46758) | 2024-10-03 18:01:46 -07:00 |
| CITATION.cff | CITATION.cff: wrap at 100 columns like the rest of Spack (#41849) | 2023-12-27 08:02:30 -08:00 |
| | Remove years from license headers (#48352) | 2025-01-02 15:40:28 +01:00 |
| LICENSE-APACHE | … | |
| LICENSE-MIT | Remove years from license headers (#48352) | 2025-01-02 15:40:28 +01:00 |
| NOTICE | … | |
| pyproject.toml | Add a prefix when we import vendored modules (#50443) | 2025-05-13 07:20:40 +02:00 |
| pytest.ini | test_builtin_repo: add a marker to skip the test if builtin is not available (#50476) | 2025-05-15 19:50:42 +02:00 |
| README.md | Improve our README to make it easier for new users (#49711) | 2025-04-15 17:08:39 -04:00 |
| SECURITY.md | security: change SECURITY.md to recommend GitHub's private reporting (#39651) | 2023-08-28 18:06:17 +00:00 |

README.md

Getting Started • Config • Community • Contributing • Packaging Guide

Spack is a multi-platform package manager that builds and installs multiple versions and configurations of software. It works on Linux, macOS, Windows, and many supercomputers. Spack is non-destructive: installing a new version of a package does not break existing installations, so many configurations of the same package can coexist.

Spack offers a simple "spec" syntax that allows users to specify versions and configuration options. Package files are written in pure Python, and specs allow package authors to write a single script for many different builds of the same package. With Spack, you can build your software all the ways you want to.

See the Feature Overview for examples and highlights.

Installation

To install spack, first make sure you have Python & Git. Then:

```shell
git clone -c feature.manyFiles=true --depth=2 https://github.com/spack/spack.git
```

What are manyFiles=true and --depth=2?

-c feature.manyFiles=true improves git's performance on repositories with 1,000+ files.

--depth=2 prunes the git history to reduce the size of the Spack installation.

```shell
# For bash/zsh/sh
. spack/share/spack/setup-env.sh

# For tcsh/csh
source spack/share/spack/setup-env.csh

# For fish
. spack/share/spack/setup-env.fish

# Now you're ready to install a package!
spack install zlib-ng
```

Documentation

Full documentation is available, or run spack help or spack help --all.

For a cheat sheet on Spack syntax, run spack help --spec.

Tutorial

We maintain a hands-on tutorial. It covers basic to advanced usage, packaging, developer features, and large HPC deployments. You can do all of the exercises on your own laptop using a Docker container.

Feel free to use these materials to teach users at your organization about Spack.

Community

Spack is an open source project. Questions, discussion, and contributions are welcome. Contributions can be anything from new packages to bugfixes, documentation, or even new core features.

Resources:

* Slack workspace: spackpm.slack.com. To get an invitation, visit slack.spack.io.
* Matrix space: #spack-space:matrix.org: bridged to Slack.
* GitHub Discussions: for Q&A and discussions. Note the pinned discussions for announcements.
* X: @spackpm. Be sure to @mention us!
* Mailing list: groups.google.com/d/forum/spack: only for announcements. Please use other venues for discussions.

Contributing

Contributing to Spack is relatively easy. Just send us a pull request. When you send your request, make develop the destination branch on the Spack repository.

Your PR must pass Spack's unit tests and documentation tests, and must be PEP 8 compliant. We enforce these guidelines with our CI process. To run these tests locally, and for helpful tips on git, see our Contribution Guide.

Spack's develop branch has the latest contributions. Pull requests should target develop, and users who want the latest package versions, features, etc. can use develop.

Releases

For multi-user site deployments or other use cases that need very stable software installations, we recommend using Spack's stable releases.

Each Spack release series also has a corresponding branch, e.g. releases/v0.14 has 0.14.x versions of Spack, and releases/v0.13 has 0.13.x versions. We backport important bug fixes to these branches but we do not advance the package versions or make other changes that would change the way Spack concretizes dependencies within a release branch. So, you can base your Spack deployment on a release branch and git pull to get fixes, without the package churn that comes with develop.

The latest release is always available with the releases/latest tag.

See the docs on releases for more details.

Code of Conduct

Please note that Spack has a Code of Conduct. By participating in the Spack community, you agree to abide by its rules.

Authors

Many thanks go to Spack's contributors.

Spack was created by Todd Gamblin, tgamblin@llnl.gov.

Citing Spack

If you are referencing Spack in a publication, please cite the following paper:

Todd Gamblin, Matthew P. LeGendre, Michael R. Collette, Gregory L. Lee, Adam Moody, Bronis R. de Supinski, and W. Scott Futral. The Spack Package Manager: Bringing Order to HPC Software Chaos. In Supercomputing 2015 (SC’15), Austin, Texas, November 15-20 2015. LLNL-CONF-669890.

On GitHub, you can copy this citation in APA or BibTeX format via the "Cite this repository" button. Or, see the comments in CITATION.cff for the raw BibTeX.

License

Spack is distributed under the terms of both the MIT license and the Apache License (Version 2.0). Users may choose either license, at their option.

All new contributions must be made under both the MIT and Apache-2.0 licenses.

See LICENSE-MIT, LICENSE-APACHE, COPYRIGHT, and NOTICE for details.

SPDX-License-Identifier: (Apache-2.0 OR MIT)

LLNL-CODE-811652

Description

A flexible package manager that supports multiple versions, configurations, platforms, and compilers.

build-tools hpc hpsf linux macos package-manager python radiuss scientific-computing spack windows
