On PR pipelines we need to override the buildcache destination to
point to the "spack-binaries-prs" bucket; otherwise, those pipelines
try to push to the default mirror in a bucket for which they don't
have write permission.
We adopted the convention of putting binaries for each stack into
a dedicated mirror named after the directory in which the stack
(spack.yaml file) resides. This fixes the mirror url of the
radiuss-aws-aarch64 stack to follow that convention.
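For illustration, the convention looks roughly like this in the stack's own `spack.yaml` (the bucket and path shown here are illustrative, not necessarily the exact production values):
```
# hypothetical excerpt from .../cloud_pipelines/stacks/radiuss-aws-aarch64/spack.yaml
spack:
  mirrors:
    # the last path component matches the directory holding this spack.yaml
    mirror: "s3://spack-binaries/develop/radiuss-aws-aarch64"
```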
Add spack stacks targeted at Spack + AWS + ARM HPC User Group hackathon. Includes
a list of miniapps and full-apps that are ready to run on both x86_64 and aarch64.
Co-authored-by: Scott Wittenburg <scott.wittenburg@kitware.com>
Add two new stacks targeted at x86_64 and arm, representing an initial list of packages
used by current and planned AWS Workshops, and built in conjunction with the ISC22
announcement of the spack public binary cache.
Co-authored-by: Scott Wittenburg <scott.wittenburg@kitware.com>
This PR supports the creation of securely signed binaries built from spack
develop as well as release branches and tags. Specifically:
- remove internal pr mirror url generation logic in favor of buildcache destination
on command line
- with a single mirror url specified in the spack.yaml, this makes it clearer where
binaries from various pipelines are pushed
- designate some tags as reserved: ['public', 'protected', 'notary']
- these tags are stripped from all jobs by default and provisioned internally
based on pipeline type
- update gitlab ci yaml to include pipelines on more protected branches than just
develop (so include releases and tags)
- binaries from all protected pipelines are pushed into mirrors including the
branch name so releases, tags, and develop binaries are kept separate
- update rebuild jobs running on protected pipelines to run on special runners
provisioned with an intermediate signing key
- protected rebuild jobs no longer use "SPACK_SIGNING_KEY" env var to
obtain signing key (in fact, final signing key is nowhere available to rebuild jobs)
- these intermediate signatures are verified at the end of each pipeline by a new
signing job to ensure binaries were produced by a protected pipeline
- optionally schedule a signing/notary job at the end of the pipeline to sign all
packages in the mirror
- add signing-job-attributes to gitlab-ci section of spack environment to allow
configuration (see the sketch after this list)
- signing job runs on special runner (separate from protected rebuild runners)
provisioned with public intermediate key and secret signing key
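A hedged sketch of what a `signing-job-attributes` entry might look like in an environment's `gitlab-ci` section (the tag and image names here are illustrative assumptions):
```
spack:
  gitlab-ci:
    # attributes applied to the signing/notary job appended to protected pipelines
    signing-job-attributes:
      tags: ["notary"]                  # reserved tag provisioned internally
      image: "example/signing-image"    # hypothetical image with signing tools
```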
Currently, environments can either be concretized fully together or fully separately. This works well for users who create environments for interoperable software and can use `concretizer:unify:true`. It does not allow environments with conflicting software to be concretized for maximal interoperability.
The primary use-case for this is facilities providing system software. Facilities provide multiple MPI implementations, but all software built against a given MPI ought to be interoperable.
This PR adds a concretization option `concretizer:unify:when_possible`. When this option is used, Spack will concretize specs in the environment separately, but will optimize for minimal differences in overlapping packages.
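For illustration, a minimal environment using the new option might look like this (the spec list is hypothetical):
```
spack:
  specs:
  - hdf5 ^mpich
  - hdf5 ^openmpi
  concretizer:
    # concretize specs separately, but minimize differences in overlapping packages
    unify: when_possible
```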
* Add a level of indirection to root specs
This commit introduces the "literal" atom, which comes with
a few different "arities". The unary "literal" contains an
integer that is the ID of a spec literal. Other "literals"
contain information on the requests made by that literal ID. For
instance zlib@1.2.11 generates the following facts:
literal(0,"root","zlib").
literal(0,"node","zlib").
literal(0,"node_version_satisfies","zlib","1.2.11").
This should help with solving large environments "together
where possible", since later literals can now be solved
together in batches.
* Add a mechanism to relax the number of literals being solved
* Modify spack solve to display the new criteria
Since the new criterion ranks above all the build criteria,
we need to modify the way we display the output.
Originally done by Greg in #27964 and cherry-picked
to this branch by the co-author of the commit.
Co-authored-by: Massimiliano Culpo <massimiliano.culpo@gmail.com>
* Inject reusable specs into the solve
Instead of coupling the PyclingoDriver() object with
spack.config, inject the concrete specs that can be
reused.
A dedicated helper function takes care of reading from
the store and the buildcache.
* spack solve: show output of multi-rounds
* add tests for best-effort co-concretization
* Enforce having at least a literal being solved
Co-authored-by: Greg Becker <becker33@llnl.gov>
Add two new cloud pipelines for E4S on Amazon Linux, including arm and x86 (v3 + v4) stacks.
Notes:
- Updated mpark-variant to remove a conflict that no longer exists on Amazon Linux
- The `which` command on Amazon Linux misbehaves when `padded_length` is too high; `padded_length <= 503` works as expected, so a conservative length of 384 was chosen (see the config sketch below).
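For reference, the padding is controlled per stack through the environment's `config` section; a minimal sketch with the value chosen above:
```
spack:
  config:
    install_tree:
      # keep padded install prefixes below the length that trips up `which`
      # on Amazon Linux (<= 503 worked; 384 chosen to be conservative)
      padded_length: 384
```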
* Introduce concretizer:unify option to replace spack:concretization
* Deprecate concretization
* Make spack:concretization overrule concretize:unify for now
* Add environment update logic to move from spack:concretization to spack:concretizer:unify
* Migrate spack:concretization to spack:concretizer:unify in all locations (see the sketch after this list)
* For new environments make concretizer:unify explicit, so that defaults can be changed in 0.19
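A before/after sketch of the migration (surrounding keys are illustrative; the value mapping follows the deprecation described above):
```
# deprecated form
spack:
  concretization: together      # or "separately"

# new form written by the environment update logic
spack:
  concretizer:
    unify: true                 # "together" -> true, "separately" -> false
```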
For tutorial builds, we should continue to allow deprecated builds to be installed. We
can update them as needed when we update the tutorial, but we don't need to correct them
immediately on deprecation in CI.
- [x] add `deprecated:true` to tutorial `spack.yaml` config.
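A minimal sketch of the added setting, assuming it lives in the tutorial stack's `spack.yaml`:
```
spack:
  config:
    # allow installation of versions marked deprecated in their packages
    deprecated: true
```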
Gitlab pipelines run for spack already have other S3 storage locations
configured for storage of binaries, so this PR removes the redundant
per-pipeline mirror. As a result, the "cleanup" jobs will no longer be
generated at the end of each pipeline, removing one possible point of
pipeline failure.
gitlab ci: Set resource requests explicitly
This PR sets resource requests for the Kubernetes executor, which should aid in
better workload scheduling in the cluster. The specific values were derived from
profile data taken from several full "from scratch" rebuilds in a separate worker pool.
Co-authored-by: Zack Galbreath <zack.galbreath@kitware.com>
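With the GitLab Kubernetes executor, requests can be expressed through job-level variables when the runner permits overrides; a hypothetical example (the numbers are illustrative, not the profiled values):
```
variables:
  KUBERNETES_CPU_REQUEST: "4"
  KUBERNETES_MEMORY_REQUEST: "16G"
```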
We've previously generated CI pipelines for PRs, and they rebuild any packages that don't have
a binary in an existing build cache. The assumption we were making was that ALL prior merged
builds would be in cache, but due to the way we do security in the pipeline, they aren't. `develop`
pipelines can take a while to catch up with the latest PRs, and while it does that, there may be a
bunch of redundant builds on PRs that duplicate things being rebuilt on `develop`. Until we can
do better caching of PR builds, we'll have this problem.
We can do better in PRs, though, by *only* rebuilding things in the CI environment that are actually
touched by the PR. This change computes exactly what packages are changed by a PR branch and
*only* includes those packages' dependents and dependencies in the generated pipeline. Other
as-yet unbuilt packages are pruned from CI for the PR.
For `develop` pipelines, we still want to build everything to ensure that the stack works, and to ensure
that `develop` catches up with PRs. This is especially true since we do not do rebuilds for *every* commit
on `develop` -- just the most recent one after each `develop` pipeline finishes. Since we skip around,
we may end up missing builds unless we ensure that we rebuild everything.
We differentiate between `develop` and PR pipelines in `.gitlab-ci.yml` by setting
`SPACK_PRUNE_UNTOUCHED` for PRs. `develop` will still have the old behavior.
- [x] Add `SPACK_PRUNE_UNTOUCHED` variable to `spack ci`
- [x] Refactor `spack pkg` command by moving historical package checking logic to `spack.repo`
- [x] Implement pruning logic in `spack ci` to remove untouched packages
- [x] add tests
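In `.gitlab-ci.yml`, the PR/develop split described above might be wired up roughly like this (the job name and value are illustrative):
```
# illustrative: enable untouched-spec pruning only for PR pipelines
pr-generate:
  variables:
    SPACK_PRUNE_UNTOUCHED: "True"   # develop pipelines omit this and rebuild everything
```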
* hdf5: mark +fortran+shared conflict for older version
This version was only activated unintentionally by silo's conflict
statement, but `@1.8.15+shared+fortran+cxx` errors out in configure:
```
CMake Error at CMakeLists.txt:814 (message):
**** Shared FORTRAN libraries are unsupported ****
```
* silo: refine hdf5 conflicts to avoid building old version
Before this, `silo+hdf5` concretized to 1.10.7 or sometimes 1.8.15. Now
I've verified it works for the following configurations:
```
silo@4.10.2 patches=7b5a1dc,952d3c9
^ hdf5@1.10.7 api=default
silo@4.10.2 patches=7b5a1dc,952d3c9,eb2a3a0
^ hdf5@1.10.8 api=v18
silo@4.10.2 patches=7b5a1dc,952d3c9,eb2a3a0
^ hdf5@1.12.1 api=v110
silo@4.11-bsd patches=eb2a3a0
^ hdf5@1.12.1 api=v110
silo@4.11-bsd patches=eb2a3a0
^ hdf5@1.10.8 api=default
silo@4.11-bsd patches=eb2a3a0
^ hdf5@1.12.1 api=default
```
and verified that the following fail:
```
silo@4.10.2 ^hdf5@1.12.1 api=default
silo@4.11 ^hdf5 api=v18
silo@4.11-bsd ^hdf5@1.13.0 api=v12
silo@4.11-bsd ^hdf5@1.13.0 api=default
```
and have updated the constraints to match. Hdf5 no longer has to be
downgraded to work with Silo.
* silo: fix dependency conflicts
* py-h5py: shorten and add comments to py-h5py hdf5 dependencies
* e4s: remove slightly outdated hdf5 requirement
* e4s: remove excessive hdf5 variant constraints
These I think are holdovers from the old concretizer.
- `hdf5_compat` can be expressed as `+hdf5 ^hdf5@1.8`
- The extra variants on hdf5 shouldn't break conduit
- axom unnecessarily restricts hdf5 version
* conduit: restore hdf5_compat flag
* trilinos: update dependencies
Use the tribits deps to clarify some dependencies, and group some together
using `with` statements, eliminating some transitive conflict duplication.
* trilinos: Restrict cuda incompatibility
* e4s: vastly reduce number of packages in trilinos-cuda build
Not clear who the customers of cuda-enabled trilinos are, or what options
they need, or which sets of options conflict...
* e4s: remove ~wrapper from trilinos+cuda
* llvm: make targets a multivalued variant
* Fix the targets variant values
1. Make them lowercase and add a mapping to the CMake equivalent
2. auto -> all
3. Restore composability by using a multivalued variant, so that
`targets=all` and `targets=x86` are combined to `targets=all,x86`,
which is then transformed into LLVM_TARGETS_TO_BUILD=all.
* use targets=x86 in iwyu
* Default to nvptx/amdgpu/host arch targets
* default to none
* Update var/spack/repos/builtin/packages/zig/package.py
* ci: Enable more packages in the DVSDK CI pipeline
* doxygen: Add conflicts for gcc bugs
* dray: Add version constraints for api breakage with newer deps
Modifications:
- [x] Change `defaults/config.yaml`
- [x] Add a fix for bootstrapping patchelf from sources if `compilers.yaml` is empty
- [x] Make `SPACK_TEST_SOLVER=clingo` the default for unit-tests
- [x] Fix package failures in the e4s pipeline
Caveats:
1. CentOS 6 still uses the original concretizer as it can't connect to the buildcache due to issues with `ssl` (bootstrapping from sources requires a C++14 capable compiler)
1. I had to update the image tag for GitlabCI in e699f14.
1. libtool v2.4.2 has been deprecated and other packages received some updates
Gitlab truncates job trace output (even the complete raw output) at 4MB,
so this change captures it to a file under "user_data" artifacts as well,
to make sure we can debug output from the end of the rebuild job.
Modifications:
- Remove the "build tests" workflow from GitHub Actions
- Setup a similar e2e test on Gitlab
In this way we'll reduce the load on GitHub Actions workflows, and the e2e tests will
benefit from the buildcache reuse granted by pipelines.
* trilinos: rename basker variant
The Basker solver is part of amesos2 but is clearer without the extra
scoping.
* trilinos: automatically enable teuchos and remove variant
Basically everything in trilinos needs teuchos
* trilinos: group top-level dependencies
* trilinos: update dependencies, removing unused
- GLM, X11 are unused (x11 lacks dependency specs too)
- Python variant is more like a TPL so rearrange that
- Gtest internal package shouldn't be compiled or exported
- Add MPI4PY requirement for pytrilinos
* trilinos: remove package meta-options
- XSDK settings and "all opt packages" are not used anywhere
- all optional packages are dangerous
* trilinos: Use hwloc iff kokkos
See #19119; also, the HWLOC TPL name was misspelled, so this was being ignored before.
* Flake
* Fix trilinos +netcdf~mpi
* trilinos: default to disabling external dependencies
* Remove teuchos from downstream dependencies
* fixup! trilinos: Use hwloc iff kokkos
* Add netcdf requirements to packages with ^trilinos+exodus
* trilinos: disable exodus by default
* fixup! Add netcdf requirements to packages with ^trilinos+exodus
* trilinos: only enable hwloc when @13: +kokkos
* xyce: propagate trilinos dependencies more simply
* dtk: fix missing boost dependency
* trilinos: remove explicit metis dependency
* trilinos: require metis/parmetis for zoltan
Disable zoltan by default to minimize default dependencies
* trilinos: mark mesquite disabled and fix kokkos arch
* xsdk: fix trilinos to also list zoltan [with zoltan2]
* ci: remove nonexistent variant from trilinos
* trilinos: add missing boost dependency
Co-authored-by: Satish Balay <balay@mcs.anl.gov>
Spack pipelines need to take specific actions internally that depend
on whether the pipeline is being run on a PR to spack or a merge to
the develop branch. Pipelines can also run in other repositories,
which represents other possible use cases than just the two mentioned
above. This PR creates a "SPACK_PIPELINE_TYPE" gitlab variable which
is propagated to rebuild jobs, and is also used internally to determine
which pipeline-specific tasks to run.
One goal of the PR is to fix an issue where rebuild jobs which failed on
develop pipelines did not properly report the broken full hash to the
"broken-specs-url".
* remove blueos check on cuda variant, fix typo
* restore necessary compiler guard
* remove axom+cuda from testing because it only partially works outside ppc systems
Building magma has been failing consistently and is currently
blocking PRs from being merged. Disable that spec while we
investigate the failure and work on a fix.
* Update of Flecsi Spackage
Update of flecsi spackage to reconcile differences between flecsi@1:1.9
and flecsi@2: for future support purposes
* Removing Unnecessary Conditional
Removing unused conditional. Initially the plan was to switch based on
version in `cmake_args` but this was not necessary as build system
variable names remained mostly the same and conflicts prevent the rest.
For the most part, if a variant is there it does not need to check
against what version of the code is being built.
* Updated CI To Reconcile Flecsi Changes
Updated CI to target flecsi@1.4.2 which best matches the previous
release version and reconciled change in variant name
* e4s ci: enable full e4s
* add llvm-amdgpu to list of specs needing an xlarge tagged runner
* comment out qt and qwt because of intermittent build failures
* remove +rocm specs because rocblas job consistently fails due to infrastructure
### Overview
The goal of this PR is to make gitlab pipeline builds (especially build failures) more reproducible outside of the pipeline environment. The two key changes here which aim to improve reproducibility are:
1. Produce a `spack.lock` during pipeline generation which is passed to child jobs via artifacts. This concretized environment is used both by generated child jobs as well as uploaded as an artifact to be used when reproducing the build locally.
2. In the `spack ci rebuild` command, if a spec needs to be rebuilt from source, do this by generating and running an `install.sh` shell script which is then also uploaded as a job artifact to be run during local reproduction.
To make it easier to take advantage of improved build reproducibility, this PR also adds a new subcommand, `spack ci reproduce-build`, which, given a url to job artifacts:
- fetches and unzips the job artifacts to a local directory
- looks for the generated pipeline yaml and parses it to find details about the job to reproduce
- attempts to provide a copy of the same version of spack used in the ci build
- if the ci build used a docker image, the command prints a `docker run` command you can run to get an interactive shell for reproducing the build
#### Some highlights
One consequence of this change will be much smaller pipeline yaml files. By encoding the concrete environment in a `spack.lock` and passing to child jobs via artifacts, we will no longer need to encode the concrete root of each spec and write it into the job variables, greatly reducing the size of the generated pipeline yaml.
Additionally `spack ci rebuild` output (stdout/stderr) is no longer internally redirected to a log file, so job output will appear directly in the gitlab job trace. With debug logging turned on, this often results in log files getting truncated because they exceed the maximum amount of log output gitlab allows. If this is a problem, you still have the option to `tee` command output to a file within the artifacts directory, as now each generated job exposes a `user_data` directory as an artifact, which you can fill with whatever you want in your custom job scripts.
There are some changes to be aware of in how pipelines should be set up after this PR:
#### Pipeline generation
Because the pipeline generation job now writes a `spack.lock` artifact to be consumed by generated downstream jobs, `spack ci generate` takes a new option `--artifacts-root`, inside which it creates a `concrete_env` directory to place the lockfile. This artifacts root directory is also where the `user_data` directory will live, in case you want to generate any custom artifacts. If you do not provide `--artifacts-root`, the default is for it to create a `jobs_scratch_dir` within your `CI_PROJECT_DIR` (a gitlab predefined environment variable) or whatever is your current working directory if that variable isn't set. Here's the diff of the PR testing `.gitlab-ci.yml` taking advantage of the new option:
```
$ git diff develop..pipelines-reproducible-builds share/spack/gitlab/cloud_pipelines/.gitlab-ci.yml
diff --git a/share/spack/gitlab/cloud_pipelines/.gitlab-ci.yml b/share/spack/gitlab/cloud_pipelines/.gitlab-ci.yml
index 579d7b56f3..0247803a30 100644
--- a/share/spack/gitlab/cloud_pipelines/.gitlab-ci.yml
+++ b/share/spack/gitlab/cloud_pipelines/.gitlab-ci.yml
@@ -28,10 +28,11 @@ default:
- cd share/spack/gitlab/cloud_pipelines/stacks/${SPACK_CI_STACK_NAME}
- spack env activate --without-view .
- spack ci generate --check-index-only
+ --artifacts-root "${CI_PROJECT_DIR}/jobs_scratch_dir"
--output-file "${CI_PROJECT_DIR}/jobs_scratch_dir/cloud-ci-pipeline.yml"
artifacts:
paths:
- - "${CI_PROJECT_DIR}/jobs_scratch_dir/cloud-ci-pipeline.yml"
+ - "${CI_PROJECT_DIR}/jobs_scratch_dir"
tags: ["spack", "public", "medium", "x86_64"]
interruptible: true
```
Notice how we replaced the specific pointer to the generated pipeline file with its containing folder, the same folder we passed as `--artifacts-root`. This way anything in that directory (the generated pipeline yaml, as well as the concrete environment directory containing the `spack.lock`) will be uploaded as an artifact and available to the downstream jobs.
#### Rebuild jobs
Rebuild jobs now must activate the concrete environment created by `spack ci generate` and provided via artifacts. When the pipeline is generated, a directory called `concrete_environment` is created within the artifacts root directory, and this is where the `spack.lock` file is written to be passed to the generated rebuild jobs. The artifacts root directory can be specified using the `--artifacts-root` option to `spack ci generate`, otherwise, it is assumed to be `$CI_PROJECT_DIR`. The directory containing the concrete environment files (`spack.yaml` and `spack.lock`) is then passed to generated child jobs via the `SPACK_CONCRETE_ENV_DIR` variable in the generated pipeline yaml file.
When you don't provide custom `script` sections in your `mappings` within the `gitlab-ci` section of your `spack.yaml`, the default behavior of rebuild jobs is now to change into `SPACK_CONCRETE_ENV_DIR` and activate that environment. If you do provide custom rebuild scripts in your `spack.yaml`, be aware those scripts should do the same thing: assume `SPACK_CONCRETE_ENV_DIR` contains the concretized environment to activate. No other changes to existing custom rebuild scripts should be required as a result of this PR.
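A hedged sketch of a custom rebuild script inside the `gitlab-ci` `mappings` section that honors this contract (the match criteria and tags are illustrative):
```
gitlab-ci:
  mappings:
    - match: ["os=ubuntu18.04"]            # illustrative match criteria
      runner-attributes:
        tags: ["spack", "public"]
        script:
          - cd "${SPACK_CONCRETE_ENV_DIR}"        # concrete env provided via artifacts
          - spack env activate --without-view .   # activate the concretized environment
          - spack -d ci rebuild
```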
As mentioned above, one key change made in this PR is the generation of the `install.sh` script by the rebuild jobs, as that same script is both run by the CI rebuild job as well as exported as an artifact to aid in subsequent attempts to reproduce the build outside of CI. The generated `install.sh` script contains only a single `spack install` command with arguments computed by `spack ci rebuild`. If the install fails, the job trace in gitlab will contain instructions on how to reproduce the build locally:
```
To reproduce this build locally, run:
spack ci reproduce-build https://gitlab.next.spack.io/api/v4/projects/7/jobs/240607/artifacts [--working-dir <dir>]
If this project does not have public pipelines, you will need to first:
export GITLAB_PRIVATE_TOKEN=<generated_token>
... then follow the printed instructions.
```
When run locally, the `spack ci reproduce-build` command shown above will download and process the job artifacts from gitlab, then print out instructions you can copy-paste to run a local reproducer of the CI job.
This PR includes a few other changes to the way pipelines work, see the documentation on pipelines for more details.
This PR relies on
~- [ ] #23194 to be able to refer to uninstalled specs by DAG hash~
EDIT: that is going to take longer to come to fruition, so for now, we will continue to install specs represented by a concrete `spec.yaml` file on disk.
- [x] #22657 to support installing a single spec already present in the active, concrete environment
Fixes for gitlab pipelines
* Remove accidentally retained testing branch name
* Generate pipeline w/out debug mode
* Make jobs interruptible for auto-cancel pending
* Work around concretization conflicts
This change makes improvements to the `spack ci rebuild` command
which supports running gitlab pipelines on PRs from forks. Much
of this has to do with making sure we can run without the secrets
previously required for running gitlab pipelines (e.g. signing key,
AWS credentials, etc.). Specific improvements in this PR:
Check if spack has precisely one signing key, and use that information
as an additional constraint on whether or not we should attempt to sign
the binary package we create.
Also, if spack does not have at least one public key, add the install
option "--no-check-signature"
If we are running a pipeline without any profile or environment
variables allowing us to push to S3, the pipeline could still
successfully create a buildcache in the artifacts and move on. So
just print a message and move on if pushing either the buildcache
entry or cdash id file to the remote mirror fails.
When we attempt to generate a package or gpg key index on an S3
mirror, and there is nothing to index, just print a warning and
exit gracefully rather than throw an exception.
Support the use of PR-specific mirrors for temporary binary pkg
storage. This will be a quality-of-life improvement for developers,
providing a place to store binaries over the lifetime of a PR, so
that they only have to wait for packages to rebuild from source when
they push a new commit that makes it necessary.
Replace two-pass install with a single pass and the new option:
--require-full-hash-match. Doing this also removes the need to
save a copy of the spack.yaml to be copied over the one spack
rewrites in between the two spack install passes.
Work around a mirror configuration issue caused by using
spack.util.executable to do the package installation.
* Update pipeline trigger jobs for PRs from forks
Moving to PRs from forks relies on external synchronization script
pushing special branch names. Also secrets will only live on the
spack mirror project, and must be propagated to the E4S project via
variables on the trigger jobs.
When this change is merged, pipelines will not run until we update
the "Custom CI configuration path" in the Gitlab CI Settings, as the
name of the file has changed to better reflect its purpose.
* Arg to MirrorCollection is used exclusively, so add main remote mirror to it
* Compute full hash less frequently
* Add tests covering index generation error handling code
* trigger ascent e4s pipeline on merge to spack develop
* change pipeline name: ecpcitest/e4s is the pipeline that will be triggered on merge to develop; it is the E4S use-case.
This change also adds a code path through the spack ci pipelines
infrastructure which supports PR testing on the Spack repository.
Gitlab pipelines run as a result of a PR (either creation or pushing
to a PR branch) will only verify that the packages in the environment
build without error. When the PR branch is merged to develop,
another pipeline will run which results in the generated binaries
getting pushed to the binary mirror.