Refactor Spack's URL parsing commands (#2938)
* Replace `spack urls` and `spack url-parse` with `spack url`
* Allow spack url list to only list incorrect parsings
* Add spack url test reporting
* Add unit tests for new URL commands
Committed by Todd Gamblin
Parent: 2e81fe4fb3
Commit: 123f057089
@@ -300,6 +300,42 @@ Stage objects
Writing commands
----------------

Adding a new command to Spack is easy. Simply add a ``<name>.py`` file to
``lib/spack/spack/cmd/``, where ``<name>`` is the name of the subcommand.
At the bare minimum, two functions are required in this file:

^^^^^^^^^^^^^^^^^^
``setup_parser()``
^^^^^^^^^^^^^^^^^^

Unless your command doesn't accept any arguments, a ``setup_parser()``
function is required to define what arguments and flags your command takes.
See the `Argparse documentation <https://docs.python.org/2.7/library/argparse.html>`_
for more details on how to add arguments.

Some commands have a set of subcommands, like ``spack compiler find`` or
``spack module refresh``. You can add subparsers to your parser to handle
this. Check out ``spack edit --command compiler`` for an example of this.
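
As a rough illustration of the subparser idea, here is a minimal sketch that uses only the standard ``argparse`` module. The ``widget`` command, its ``find`` and ``list`` subcommands, and the ``widget_command`` destination name are all hypothetical, and the parser built at the bottom merely stands in for the one Spack would pass to ``setup_parser()``.

```python
import argparse


def setup_parser(subparser):
    """Attach two hypothetical subcommands, ``find`` and ``list``."""
    # ``dest`` records which subcommand the user picked.
    sp = subparser.add_subparsers(metavar="SUBCOMMAND", dest="widget_command")

    find_parser = sp.add_parser("find", help="search paths for widgets")
    find_parser.add_argument("paths", nargs="*", help="directories to search")

    sp.add_parser("list", help="list known widgets")


# Stand-in for the parser object that would normally be handed to us.
parser = argparse.ArgumentParser(prog="spack widget")
setup_parser(parser)

args = parser.parse_args(["find", "/opt", "/usr/local"])
print(args.widget_command, args.paths)
```

The main function of such a command would then typically dispatch on ``args.widget_command`` to a helper for each subcommand.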

A lot of commands take the same arguments and flags. These arguments should
be defined in ``lib/spack/spack/cmd/common/arguments.py`` so that they don't
need to be redefined in multiple commands.

^^^^^^^^^^^^
``<name>()``
^^^^^^^^^^^^

In order to run your command, Spack searches for a function with the same
name as your command in ``<name>.py``. This is the main method for your
command, and can call other helper methods to handle common tasks.
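
Putting the two functions together, a command file might look roughly like the sketch below. The ``hello`` command, its arguments, and the ``format_greeting`` helper are hypothetical. The two-argument ``hello(parser, args)`` signature and the module-level ``description`` string are assumptions modeled on how existing commands appear to be written. The last three lines are only a stand-in driver so the sketch runs outside of Spack.

```python
import argparse

# Hypothetical contents of lib/spack/spack/cmd/hello.py
description = "print a greeting"


def setup_parser(subparser):
    """Declare the arguments and flags that the command accepts."""
    subparser.add_argument("name", nargs="?", default="world",
                           help="who to greet")
    subparser.add_argument("--shout", action="store_true",
                           help="print the greeting in upper case")


def format_greeting(name, shout=False):
    """Helper kept separate from printing so it is easy to unit test."""
    message = "Hello, {0}!".format(name)
    return message.upper() if shout else message


def hello(parser, args):
    """Main function: it has the same name as the command."""
    print(format_greeting(args.name, args.shout))


# Stand-in driver; inside Spack the framework would do this part.
parser = argparse.ArgumentParser(prog="spack hello", description=description)
setup_parser(parser)
hello(parser, parser.parse_args(["--shout", "spack"]))
```

Keeping the string formatting in a helper that returns a value, rather than printing directly, makes the unit tests recommended in this section easier to write.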

Remember, before adding a new command, think to yourself whether or not this
new command is actually necessary. Sometimes, the functionality you desire
can be added to an existing command. Also remember to add unit tests for
your command. If it isn't used very frequently, changes to the rest of
Spack can cause your command to break without sufficient unit tests to
prevent this from happening.

----------
Unit tests
----------
@@ -312,14 +348,80 @@ Unit testing
Developer commands
------------------

.. _cmd-spack-doc:

^^^^^^^^^^^^^
``spack doc``
^^^^^^^^^^^^^

.. _cmd-spack-test:

^^^^^^^^^^^^^^
``spack test``
^^^^^^^^^^^^^^

.. _cmd-spack-url:

^^^^^^^^^^^^^
``spack url``
^^^^^^^^^^^^^

A package containing a single URL can be used to download several different
versions of the package. If you've ever wondered how this works, all of the
magic is in :mod:`spack.url`. This module contains methods for extracting
the name and version of a package from its URL. The name is used by
``spack create`` to guess the name of the package. By determining the version
from the URL, Spack can replace it with other versions to determine where to
download them from.

The regular expressions in ``parse_name_offset`` and ``parse_version_offset``
are used to extract the name and version, but they aren't perfect. In order
to debug Spack's URL parsing support, the ``spack url`` command can be used.

"""""""""""""""""""
``spack url parse``
"""""""""""""""""""

If you need to debug a single URL, you can use the following command:

.. command-output:: spack url parse http://cache.ruby-lang.org/pub/ruby/2.2/ruby-2.2.0.tar.gz

You'll notice that the name and version of this URL are correctly detected,
and you can even see which regular expressions it was matched to. However,
you'll notice that when it substitutes the version number in, it doesn't
replace the ``2.2`` with ``9.9`` where we would expect ``9.9.9b`` to live.
This particular package may require a ``list_url`` or ``url_for_version``
function.
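
To make the substitution problem concrete, here is a deliberately simplified sketch. It is not the code in :mod:`spack.url`; the single regular expression and the ``guess_version`` and ``substitute_version`` helpers are assumptions made for illustration. It detects the version in the file name and replaces every exact occurrence of that string in the URL.

```python
import re

# Toy pattern (an assumption): a dotted number between "-" and ".tar.gz".
VERSION_RE = re.compile(r"-(\d+(?:\.\d+)*)\.tar\.gz$")


def guess_version(url):
    """Return the version string found at the end of the URL."""
    match = VERSION_RE.search(url)
    if match is None:
        raise ValueError("no version found in {0}".format(url))
    return match.group(1)


def substitute_version(url, new_version):
    """Replace every exact occurrence of the detected version."""
    return url.replace(guess_version(url), new_version)


ruby = "http://cache.ruby-lang.org/pub/ruby/2.2/ruby-2.2.0.tar.gz"
mpich = "http://www.mpich.org/static/downloads/3.0.4/mpich-3.0.4.tar.gz"

print(substitute_version(ruby, "9.9.9b"))
print(substitute_version(mpich, "3.1"))
```

In the ``mpich`` URL the full version appears twice, so both occurrences are replaced. In the ``ruby`` URL the directory holds only the ``2.2`` prefix of ``2.2.0``, so a plain substitution leaves it untouched, which is the situation described above.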

This command also accepts a ``--spider`` flag. If provided, Spack searches
for other versions of the package and prints the matching URLs.

""""""""""""""""""
``spack url list``
""""""""""""""""""

This command lists every URL in every package in Spack. If given the
``--color`` and ``--extrapolation`` flags, it also colors the part of
the string that it detected to be the name and version. The
``--incorrect-name`` and ``--incorrect-version`` flags can be used to
print URLs that were not being parsed correctly.

""""""""""""""""""
``spack url test``
""""""""""""""""""

This command attempts to parse every URL for every package in Spack
and prints a summary of how many of them are being correctly parsed.
It also prints a histogram showing which regular expressions are being
matched and how frequently:

.. command-output:: spack url test

This command is essential for anyone adding or changing the regular
expressions that parse names and versions. By running this command
before and after the change, you can make sure that your regular
expression fixes more packages than it breaks.
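
The idea behind such a histogram can be sketched in a few lines. The two patterns below are toy stand-ins rather than Spack's actual regular expressions, the ``first_match`` and ``histogram`` helpers are hypothetical, and the ``foo.zip`` URL is made up to show an unparsed case.

```python
import re
from collections import Counter

# Toy version patterns, tried in order (assumptions, not Spack's list).
VERSION_PATTERNS = [
    r"-(\d+(?:\.\d+)+)\.tar\.gz$",  # e.g. name-1.2.3.tar.gz
    r"-(\d{8})\.tar\.gz$",          # e.g. name-20130729.tar.gz
]


def first_match(url):
    """Return the index of the first pattern that matches, or None."""
    for index, pattern in enumerate(VERSION_PATTERNS):
        if re.search(pattern, url):
            return index
    return None


def histogram(urls):
    """Count how many URLs were matched by each pattern index."""
    return Counter(first_match(url) for url in urls)


urls = [
    "http://www.mr511.de/software/libelf-0.8.13.tar.gz",
    "http://www.mpich.org/static/downloads/3.0.4/mpich-3.0.4.tar.gz",
    "http://www.prevanders.net/libdwarf-20130729.tar.gz",
    "http://example.com/downloads/foo.zip",  # made up; matches nothing
]
counts = histogram(urls)
for index, count in sorted(counts.items(), key=lambda item: str(item[0])):
    print(index, count)
```

Comparing such counts before and after editing a pattern gives a quick sense of whether the change helped or hurt overall.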

---------
Profiling
---------
@@ -712,8 +712,8 @@ is at ``http://example.com/downloads/foo-1.0.tar.gz``, Spack will look
in ``http://example.com/downloads/`` for links to additional versions.
If you need to search another path for download links, you can supply
some extra attributes that control how your package finds new
versions. See the documentation on `attribute_list_url`_ and
`attribute_list_depth`_.
versions. See the documentation on :ref:`attribute_list_url` and
:ref:`attribute_list_depth`.

.. note::
@@ -728,6 +728,102 @@ versions. See the documentation on `attribute_list_url`_ and
   syntax errors, or the ``import`` will fail. Use this once you've
   got your package in working order.

--------------------
Finding new versions
--------------------

You've already seen the ``homepage`` and ``url`` package attributes:

.. code-block:: python
   :linenos:

   from spack import *


   class Mpich(Package):
       """MPICH is a high performance and widely portable implementation of
       the Message Passing Interface (MPI) standard."""
       homepage = "http://www.mpich.org"
       url = "http://www.mpich.org/static/downloads/3.0.4/mpich-3.0.4.tar.gz"

These are class-level attributes used by Spack to show users
information about the package, and to determine where to download its
source code.

Spack uses the tarball URL to extrapolate where to find other tarballs
of the same package (e.g. in :ref:`cmd-spack-checksum`), but
this does not always work. This section covers ways you can tell
Spack to find tarballs elsewhere.

.. _attribute_list_url:

^^^^^^^^^^^^
``list_url``
^^^^^^^^^^^^

When Spack tries to find available versions of packages (e.g. with
:ref:`cmd-spack-checksum`), it spiders the parent directory
of the tarball in the ``url`` attribute. For example, for libelf, the
url is:

.. code-block:: python

   url = "http://www.mr511.de/software/libelf-0.8.13.tar.gz"

Here, Spack spiders ``http://www.mr511.de/software/`` to find similar
tarball links and ultimately to make a list of available versions of
``libelf``.
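
Deriving that parent directory from a tarball URL is simple to sketch with the standard library. The ``default_list_url`` helper below is hypothetical and only illustrates the idea; it is not the function Spack uses.

```python
from urllib.parse import urlsplit, urlunsplit


def default_list_url(url):
    """Return the URL of the directory that contains the tarball."""
    parts = urlsplit(url)
    # Drop the last path component and keep the trailing slash.
    parent = parts.path.rsplit("/", 1)[0] + "/"
    return urlunsplit((parts.scheme, parts.netloc, parent, "", ""))


listing = default_list_url("http://www.mr511.de/software/libelf-0.8.13.tar.gz")
print(listing)
```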

For many packages, the tarball's parent directory may be unlistable,
or it may not contain any links to source code archives. In fact,
many times additional package downloads aren't even available in the
same directory as the download URL.

For these, you can specify a separate ``list_url`` indicating the page
to search for tarballs. For example, ``libdwarf`` has the homepage as
the ``list_url``, because that is where links to old versions are:

.. code-block:: python
   :linenos:

   class Libdwarf(Package):
       homepage = "http://www.prevanders.net/dwarf.html"
       url = "http://www.prevanders.net/libdwarf-20130729.tar.gz"
       list_url = homepage

.. _attribute_list_depth:

^^^^^^^^^^^^^^
``list_depth``
^^^^^^^^^^^^^^

``libdwarf`` and many other packages have a listing of available
versions on a single webpage, but not all do. For example, ``mpich``
has a tarball URL that looks like this:

.. code-block:: python

   url = "http://www.mpich.org/static/downloads/3.0.4/mpich-3.0.4.tar.gz"

But its downloads are in many different subdirectories of
``http://www.mpich.org/static/downloads/``. So, we need to add a
``list_url`` *and* a ``list_depth`` attribute:

.. code-block:: python
   :linenos:

   class Mpich(Package):
       homepage = "http://www.mpich.org"
       url = "http://www.mpich.org/static/downloads/3.0.4/mpich-3.0.4.tar.gz"
       list_url = "http://www.mpich.org/static/downloads/"
       list_depth = 2

By default, Spack only looks at the top-level page available at
``list_url``. ``list_depth`` tells it to follow up to 2 levels of
links from the top-level page. Note that here, this implies two
levels of subdirectories, as the ``mpich`` website is structured much
like a filesystem. But ``list_depth`` really refers to link depth
when spidering the page.
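
The notion of link depth can be illustrated with a small breadth-first traversal over an in-memory stand-in for a website. Everything here is hypothetical: the ``SITE`` mapping, the ``spider`` and ``tarballs`` helpers, and in particular the counting convention, under which a depth of 0 reads only the starting page. Spack's own convention for ``list_depth`` may be offset from this one.

```python
from collections import deque

# Hypothetical site: page URL -> links found on that page.
SITE = {
    "http://example.com/downloads/": [
        "http://example.com/downloads/1.0/",
        "http://example.com/downloads/2.0/",
    ],
    "http://example.com/downloads/1.0/": [
        "http://example.com/downloads/1.0/foo-1.0.tar.gz",
    ],
    "http://example.com/downloads/2.0/": [
        "http://example.com/downloads/2.0/beta/",
    ],
    "http://example.com/downloads/2.0/beta/": [
        "http://example.com/downloads/2.0/beta/foo-2.0b.tar.gz",
    ],
}


def spider(root, depth):
    """Collect links, following at most ``depth`` levels below ``root``."""
    seen = set()
    visited = {root}
    queue = deque([(root, 0)])
    while queue:
        page, level = queue.popleft()
        for link in SITE.get(page, []):
            seen.add(link)
            # Only descend further while we are above the depth limit.
            if level < depth and link not in visited:
                visited.add(link)
                queue.append((link, level + 1))
    return seen


def tarballs(root, depth):
    """Return the sorted tarball links reachable within ``depth`` levels."""
    return sorted(l for l in spider(root, depth) if l.endswith(".tar.gz"))


root = "http://example.com/downloads/"
for depth in range(3):
    print(depth, tarballs(root, depth))
```

In this toy site, no tarball is visible from the top-level page alone; each additional level of depth exposes the tarballs one directory further down.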

.. _vcs-fetch:

@@ -1241,103 +1337,6 @@ RPATHs in Spack are handled in one of three ways:
links. You can see how this is used in the :ref:`PySide
example <pyside-patch>` above.

--------------------
Finding new versions
--------------------

You've already seen the ``homepage`` and ``url`` package attributes:

.. code-block:: python
   :linenos:

   from spack import *


   class Mpich(Package):
       """MPICH is a high performance and widely portable implementation of
       the Message Passing Interface (MPI) standard."""
       homepage = "http://www.mpich.org"
       url = "http://www.mpich.org/static/downloads/3.0.4/mpich-3.0.4.tar.gz"

These are class-level attributes used by Spack to show users
information about the package, and to determine where to download its
source code.

Spack uses the tarball URL to extrapolate where to find other tarballs
of the same package (e.g. in :ref:`cmd-spack-checksum`), but
this does not always work. This section covers ways you can tell
Spack to find tarballs elsewhere.

.. _attribute_list_url:

^^^^^^^^^^^^
``list_url``
^^^^^^^^^^^^

When Spack tries to find available versions of packages (e.g. with
:ref:`cmd-spack-checksum`), it spiders the parent directory
of the tarball in the ``url`` attribute. For example, for libelf, the
url is:

.. code-block:: python

   url = "http://www.mr511.de/software/libelf-0.8.13.tar.gz"

Here, Spack spiders ``http://www.mr511.de/software/`` to find similar
tarball links and ultimately to make a list of available versions of
``libelf``.

For many packages, the tarball's parent directory may be unlistable,
or it may not contain any links to source code archives. In fact,
many times additional package downloads aren't even available in the
same directory as the download URL.

For these, you can specify a separate ``list_url`` indicating the page
to search for tarballs. For example, ``libdwarf`` has the homepage as
the ``list_url``, because that is where links to old versions are:

.. code-block:: python
   :linenos:

   class Libdwarf(Package):
       homepage = "http://www.prevanders.net/dwarf.html"
       url = "http://www.prevanders.net/libdwarf-20130729.tar.gz"
       list_url = homepage

.. _attribute_list_depth:

^^^^^^^^^^^^^^
``list_depth``
^^^^^^^^^^^^^^

``libdwarf`` and many other packages have a listing of available
versions on a single webpage, but not all do. For example, ``mpich``
has a tarball URL that looks like this:

.. code-block:: python

   url = "http://www.mpich.org/static/downloads/3.0.4/mpich-3.0.4.tar.gz"

But its downloads are in many different subdirectories of
``http://www.mpich.org/static/downloads/``. So, we need to add a
``list_url`` *and* a ``list_depth`` attribute:

.. code-block:: python
   :linenos:

   class Mpich(Package):
       homepage = "http://www.mpich.org"
       url = "http://www.mpich.org/static/downloads/3.0.4/mpich-3.0.4.tar.gz"
       list_url = "http://www.mpich.org/static/downloads/"
       list_depth = 2

By default, Spack only looks at the top-level page available at
``list_url``. ``list_depth`` tells it to follow up to 2 levels of
links from the top-level page. Note that here, this implies two
levels of subdirectories, as the ``mpich`` website is structured much
like a filesystem. But ``list_depth`` really refers to link depth
when spidering the page.

.. _attribute_parallel:

---------------