This commit extends the DSL that can be used in packages
to allow declaring that a package uses different build-systems
under different conditions.
It requires each spec to have a `build_system` single valued
variant. The variant can be used in many context to query, manipulate
or select the build system associated with a concrete spec.
The knowledge to build a package has been moved out of the
PackageBase hierarchy, into a new Builder hierarchy. Customization
of the default behavior for a given builder can be obtained by
coding a new derived builder in package.py.
The "run_after" and "run_before" decorators are now applied to
methods on the builder. They can also incorporate a "when="
argument to specify that a method is run only when certain
conditions apply.
For packages that do not define their own builder, forwarding logic
is added between the builder and package (methods not found in one
will be retrieved from the other); this PR is expected to be fully
backwards compatible with unmodified packages that use a single
build system.
When we lose a running pod (possibly loss of spot instance) or encounter
some other infrastructure-related failure of this job, we need to retry
it. This retries the job the maximum number of times in those cases.
`reuse` and `when_possible` concretization broke the invariant that
`spec[pkg_name]` has unique keys. This invariant is relied on in tons of
places, such as when setting up the build environment.
When using `when_possible` concretization, one may end up with two or
more `perl`s or `python`s among the transitive deps of a spec, because
concretization does not consider build-only deps of reusable specs.
Until the code base is fixed not to rely on this broken property of
`__getitem__`, we should disable reuse in CI.
Currently "spack ci generate" chooses the first matching entry in
gitlab-ci:mappings to fill attributes for a generated build-job,
requiring that the entire configuration matrix is listed out
explicitly. This unfortunately causes significant problems in
environments with large configuration spaces, for example the
environment in #31598 (spack.yaml) supports 5 operating systems,
3 architectures and 130 packages with explicit size requirements,
resulting in 1300 lines of configuration YAML.
This patch adds a configuraiton option to the gitlab-ci schema called
"match_behavior"; when it is set to "merge", all matching entries
are applied in order to the final build-job, allowing a few entries
to cover an entire matrix of configurations.
The default for "match_behavior" is "first", which behaves as before
this commit (only the runner attributes of the first match are used).
In addition, match entries may now include a "remove-attributes"
configuration, which allows matches to remove tags that have been
aggregated by prior matches. This only makes sense to use with
"match_behavior:merge". You can combine "runner-attributes" with
"remove-attributes" to effectively override prior tags.
Basic stack of ML packages we would like to test and generate binaries for in CI.
Spack now has a large CI framework in GitLab for PR testing and public binary generation.
We should take advantage of this to test and distribute optimized binaries for popular ML
frameworks.
This is a pretty extensive initial set, including CPU, ROCm, and CUDA versions of a core
`x96_64_v4` stack.
### Core ML frameworks
These are all popular core ML frameworks already available in Spack.
- [x] PyTorch
- [x] TensorFlow
- [x] Scikit-learn
- [x] MXNet
- [x] CNTK
- [x] Caffe
- [x] Chainer
- [x] XGBoost
- [x] Theano
### ML extensions
These are domain libraries and wrappers that build on top of core ML libraries
- [x] Keras
- [x] TensorBoard
- [x] torchvision
- [x] torchtext
- [x] torchaudio
- [x] TorchGeo
- [x] PyTorch Lightning
- [x] torchmetrics
- [x] GPyTorch
- [x] Horovod
### ML-adjacent libraries
These are libraries that aren't specific to ML but are still core libraries used in ML pipelines
- [x] numpy
- [x] scipy
- [x] pandas
- [x] ONNX
- [x] bazel
Co-authored-by: Jonathon Anderson <17242663+blue42u@users.noreply.github.com>
PR #32615 deprecated Python versions up to 3.6.X. Since
the "build-systems" pipeline requires Python 3.6.15 to
build "tut", it will fail on the first rebuild that
involves Python.
The "tut" package is meant to perform an end-to-end
test of the "Waf" build-system, which is scarcely
used. The fix therefore is just to remove it from
the pipeline.
amazon linux 2 ships a glibc that is too old to work with cuda toolkit
for aarch64.
For example:
`libcurand.so.10.2.10.50` requires the symbol `logf@@GLIBC_2.27`, but
glibc is at 2.26.
So, these specs are removed.
Move the copying of the buildcache to a root job that runs after all the child
pipelines have finished, so that the operation can be coordinated across all
child pipelines to remove the possibility of race conditions during potentially
simlutandous copies. This lets us ensure the .spec.json.sig and .spack files
for any spec in the root mirror always come from the same child pipeline
mirror (though which pipeline is arbitrary). It also allows us to avoid copying
of duplicates, which we now do.
This support requires adding the '--tests' option to 'spack ci rebuild'.
Packages whose stand-alone tests are broken (in the CI environment) can
be configured in gitlab-ci to be skipped by adding them to
broken-tests-packages.
Highlights include:
- Restructured 'spack ci' help to provide better subcommand summaries;
- Ensured only one InstallError (i.e., installer's) rather than allowing
build_environment to have its own; and
- Refactored CI and CDash reporting to keep CDash-related properties and
behavior in a separate class.
This allows stand-alone tests from `spack ci` to run when the `--tests`
option is used. With `--tests`, stand-alone tests are run **after** a
**successful** (re)build of the package. Test results are collected
and report(able) using CDash.
This PR adds the following features:
- Adds `-t` and `--tests` to `spack ci rebuild` to run stand-alone tests;
- Adds `--fail-fast` to stop stand-alone tests after the first failure;
- Ensures a *single* `InstallError` across packages
(i.e., removes second class from build environment);
- Captures skipping tests for externals and uninstalled packages
(for CDash reporting);
- Copies test logs and outputs to the CI artifacts directory to facilitate
debugging;
- Parses stand-alone test results to report outputs from each `run_test` as
separate test parts (CDash reporting);
- Logs a test completion message to allow capture of timing of the last
`run_test` part;
- Adds the runner description to the CDash site to better distinguish entries
in CDash tables;
- Adds `gitlab-ci` `broken-tests-packages` to CI configuration to skip
stand-alone testing for packages with known issues;
- Changes `spack ci --help` so description of each subcommand is a single line;
- Changes `spack ci <subcommand> --help` to provide the full description of
each command (versus no description); and
- Ensures `junit` test log file ends in an `.xml` extension (versus default where
it does not).
Tasks:
- [x] Include the equivalent of the architecture information, or at least the host target, in the CDash output
- [x] Upload stand-alone test results files as `test` artifacts
- [x] Confirm tests are run in GitLab
- [x] Ensure CDash results are uploaded as artifacts
- [x] Resolve issues with CDash build-and test results appearing on same row of the table
- [x] Add unit tests as needed
- [x] Investigate why some (dependency) packages don't have test results (e.g., related from other pipelines)
- [x] Ensure proper parsing and reporting of skipped tests (as `not run`) .. post- #28701 merge
- [x] Restore the proper CDash URLand or mirror ONCE out-of-band testing completed
On PR pipelines we need to override the buildcache destination to
point to the "spack-binaries-prs" bucket, otherwise, those pipelines
try to push to the default mirror in a bucket for which they don't
have write permission.
We adopted the convention of putting binaries for each stack into
a dedicated mirror named after the directory in which the stack
(spack.yaml file) resides. This fixes the mirror url of the
radiuss-aws-aarch64 stack to follow that convention.
Add spack stacks targeted at Spack + AWS + ARM HPC User Group hackathon. Includes
a list of miniapps and full-apps that are ready to run on both x86_64 and aarch64.
Co-authored-by: Scott Wittenburg <scott.wittenburg@kitware.com>
Add two new stacks targeted at x86_64 and arm, representing an initial list of packages
used by current and planned AWS Workshops, and built in conjunction with the ISC22
announcement of the spack public binary cache.
Co-authored-by: Scott Wittenburg <scott.wittenburg@kitware.com>
This PR supports the creation of securely signed binaries built from spack
develop as well as release branches and tags. Specifically:
- remove internal pr mirror url generation logic in favor of buildcache destination
on command line
- with a single mirror url specified in the spack.yaml, this makes it clearer where
binaries from various pipelines are pushed
- designate some tags as reserved: ['public', 'protected', 'notary']
- these tags are stripped from all jobs by default and provisioned internally
based on pipeline type
- update gitlab ci yaml to include pipelines on more protected branches than just
develop (so include releases and tags)
- binaries from all protected pipelines are pushed into mirrors including the
branch name so releases, tags, and develop binaries are kept separate
- update rebuild jobs running on protected pipelines to run on special runners
provisioned with an intermediate signing key
- protected rebuild jobs no longer use "SPACK_SIGNING_KEY" env var to
obtain signing key (in fact, final signing key is nowhere available to rebuild jobs)
- these intermediate signatures are verified at the end of each pipeline by a new
signing job to ensure binaries were produced by a protected pipeline
- optionallly schedule a signing/notary job at the end of the pipeline to sign all
packges in the mirror
- add signing-job-attributes to gitlab-ci section of spack environment to allow
configuration
- signing job runs on special runner (separate from protected rebuild runners)
provisioned with public intermediate key and secret signing key
Currently, environments can either be concretized fully together or fully separately. This works well for users who create environments for interoperable software and can use `concretizer:unify:true`. It does not allow environments with conflicting software to be concretized for maximal interoperability.
The primary use-case for this is facilities providing system software. Facilities provide multiple MPI implementations, but all software built against a given MPI ought to be interoperable.
This PR adds a concretization option `concretizer:unify:when_possible`. When this option is used, Spack will concretize specs in the environment separately, but will optimize for minimal differences in overlapping packages.
* Add a level of indirection to root specs
This commit introduce the "literal" atom, which comes with
a few different "arities". The unary "literal" contains an
integer that id the ID of a spec literal. Other "literals"
contain information on the requests made by literal ID. For
instance zlib@1.2.11 generates the following facts:
literal(0,"root","zlib").
literal(0,"node","zlib").
literal(0,"node_version_satisfies","zlib","1.2.11").
This should help with solving large environments "together
where possible" since later literals can be now solved
together in batches.
* Add a mechanism to relax the number of literals being solved
* Modify spack solve to display the new criteria
Since the new criteria is above all the build criteria,
we need to modify the way we display the output.
Originally done by Greg in #27964 and cherry-picked
to this branch by the co-author of the commit.
Co-authored-by: Massimiliano Culpo <massimiliano.culpo@gmail.com>
* Inject reusable specs into the solve
Instead of coupling the PyclingoDriver() object with
spack.config, inject the concrete specs that can be
reused.
A method level function takes care of reading from
the store and the buildcache.
* spack solve: show output of multi-rounds
* add tests for best-effort coconcretization
* Enforce having at least a literal being solved
Co-authored-by: Greg Becker <becker33@llnl.gov>
Add two new cloud pipelines for E4S on Amazon Linux, include arm and x86 (v3 + v4) stacks.
Notes:
- Updated mpark-variant to remove conflict that no longer exists in Amazon Linux
- Which command on Amazon Linux prefixes on all results when padded_length is too high. In this case, padded_length<=503 works as expected. Chose conservative length of 384.
* Introduce concretizer:unify option to replace spack:concretization
* Deprecate concretization
* Make spack:concretization overrule concretize:unify for now
* Add environment update logic to move from spack:concretization to spack:concretizer:reuse
* Migrate spack:concretization to spack:concretize:unify in all locations
* For new environments make concretizer:unify explicit, so that defaults can be changed in 0.19
For tutorial builds, we should continue to allow deprecated builds to be installed. We
can update them as needed when we update the tutorial, but we don't need to correct them
immediately on deprecation in CI.
- [x] add `deprecated:true` to tutorial `spack.yaml` config.
Gitlab pipelines run for spack already have other S3 storage locations
configured for storage of binaries, so this PR removes the redundant
per-pipeline mirror. As a result, the "cleanup" jobs will no longer be
generated at the end of each pipeline, removing one possible point of
pipeline failure.
gitlab ci: Set resource requests explicitly
This PR sets resource requests for the Kubernetes executor, which should aid in
better workload scheduling in the cluster. The specific values were derived from
profile data taken from several full "from scratch" rebuilds in a separate worker pool.
Co-authored-by: Zack Galbreath <zack.galbreath@kitware.com>
We've previously generated CI pipelines for PRs, and they rebuild any packages that don't have
a binary in an existing build cache. The assumption we were making was that ALL prior merged
builds would be in cache, but due to the way we do security in the pipeline, they aren't. `develop`
pipelines can take a while to catch up with the latest PRs, and while it does that, there may be a
bunch of redundant builds on PRs that duplicate things being rebuilt on `develop`. Until we can
do better caching of PR builds, we'll have this problem.
We can do better in PRs, though, by *only* rebuilding things in the CI environment that are actually
touched by the PR. This change computes exactly what packages are changed by a PR branch and
*only* includes those packages' dependents and dependencies in the generated pipeline. Other
as-yet unbuilt packages are pruned from CI for the PR.
For `develop` pipelines, we still want to build everything to ensure that the stack works, and to ensure
that `develop` catches up with PRs. This is especially true since we do not do rebuilds for *every* commit
on `develop` -- just the most recent one after each `develop` pipeline finishes. Since we skip around,
we may end up missing builds unless we ensure that we rebuild everything.
We differentiate between `develop` and PR pipelines in `.gitlab-ci.yml` by setting
`SPACK_PRUNE_UNTOUCHED` for PRs. `develop` will still have the old behavior.
- [x] Add `SPACK_PRUNE_UNTOUCHED` variable to `spack ci`
- [x] Refactor `spack pkg` command by moving historical package checking logic to `spack.repo`
- [x] Implement pruning logic in `spack ci` to remove untouched packages
- [x] add tests