Python command, libraries, and headers (#3367)

## Motivation

Python installations are both important and unfortunately inconsistent. Depending on the Python version, OS, and the strength of the Earth's magnetic field when it was installed, the name of the Python executable, directory containing its libraries, library names, and the directory containing its headers can vary drastically. 

I originally got into this mess with #3274, where I discovered that Boost could not be built with Python 3 because the executable is called `python3` and we were telling it to use `python`. I got deeper into this mess when I started hacking on #3140, where I discovered just how difficult it is to find the location and name of the Python libraries and headers.

Currently, half of the packages that depend on Python and need to know this information jump through hoops to determine the correct information. The other half are hard-coded to use `python`, `spec['python'].prefix.lib`, and `spec['python'].prefix.include`. Obviously, none of these packages would work for Python 3, and there's no reason to duplicate the effort. The Python package itself should contain all of the information necessary to use it properly. This is in line with the recent work by @alalazo and @davydden with respect to `spec['blas'].libs` and friends.

## Prefix

For most packages in Spack, we assume that the installation directory is `spec['python'].prefix`. This generally works for anything installed with Spack, but gets complicated when we include external packages. Python is a commonly used external package (it needs to be installed just to run Spack). If it was installed with Homebrew, `which python` would return `/usr/local/bin/python`, and most users would erroneously assume that `/usr/local` is the installation directory. If you peruse #2173, you'll immediately see why this is not the case. Homebrew actually installs Python in `/usr/local/Cellar/python/2.7.12_2` and symlinks the executable to `/usr/local/bin/python`. `PYTHONHOME` (and presumably most things that need to know where Python is installed) needs to be set to the actual installation directory, not `/usr/local`.

Normally I would say, "sounds like user error, make sure to use the real installation directory in your `packages.yaml`". But I think we can make a special case for Python. That's what we decided in #2173 anyway. If we change our minds, I would be more than happy to simplify things.

To solve this problem, I created a `spec['python'].home` attribute that works the same way as `spec['python'].prefix` but queries Python to figure out where it was actually installed. @tgamblin Is there any way to overwrite `spec['python'].prefix`? I think it's currently immutable.
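
As a rough illustration of what "queries Python" can mean, here is a minimal sketch that asks an interpreter for its own `sys.prefix`. The function name `query_python_home` is hypothetical and this is not necessarily how `spec['python'].home` is implemented in this commit; it only assumes that the interpreter can be run as a subprocess.

```python
import subprocess
import sys


def query_python_home(python_exe):
    """Ask the interpreter at ``python_exe`` where it thinks it is installed.

    Hypothetical helper for illustration only.
    """
    # sys.prefix is the installation directory the interpreter located for
    # itself at startup, which need not be the directory the command was
    # found in.
    output = subprocess.check_output(
        [python_exe, '-c', 'import sys; print(sys.prefix)'])
    return output.decode('utf-8').strip()


if __name__ == '__main__':
    # Query the interpreter that is running this script.
    print(query_python_home(sys.executable))
```

The point of the design is that the interpreter, not the directory containing the command, is treated as the source of truth.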

## Command

In general, Python 2 comes with both `python` and `python2` commands, while Python 3 only comes with a `python3` command. But this is up to the OS developers. For example, `/usr/bin/python` on Gentoo is actually Python 3. Worse yet, if someone is using an externally installed Python, all 3 commands may exist in the same directory! Here's what I'm thinking:

1. If the spec is for Python 3, try searching for the `python3` command.
2. If the spec is for Python 2, try searching for the `python2` command.
3. If neither is found, try searching for the `python` command.
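
The search order above can be sketched as follows. This is only an illustration under stated assumptions: `find_python_command`, `bin_dir`, and `major_version` are hypothetical names, and the sketch simply checks for executable files in a single directory rather than using any Spack machinery.

```python
import os


def find_python_command(bin_dir, major_version):
    """Return the preferred Python command in ``bin_dir``, or None.

    Hypothetical helper for illustration only.
    """
    # Try the version-specific name first (python3 or python2),
    # then fall back to the generic name.
    candidates = ['python{0}'.format(major_version), 'python']
    for name in candidates:
        path = os.path.join(bin_dir, name)
        if os.path.isfile(path) and os.access(path, os.X_OK):
            return path
    return None
```

Checking the version-specific name first matters when all three commands live in the same directory, since the generic `python` may belong to either major version.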

## Libraries

Spack installs Python libraries in `spec['python'].prefix.lib`, except on openSUSE 13, where it installs them to `spec['python'].prefix.lib64` (see #2295 and #2253). On my CentOS 6 machine, the Python libraries are installed in `/usr/lib64`. Both need to work.

The libraries themselves change name depending on OS and Python version. For Python 2.7 on macOS, I'm seeing:
```
lib/libpython2.7.dylib
```
For Python 3.6 on CentOS 6, I'm seeing:
```
lib/libpython3.so
lib/libpython3.6m.so.1.0
lib/libpython3.6m.so -> lib/libpython3.6m.so.1.0
```
Notice the `m` after the version number. Yeah, that's a thing.
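
One way to cope with the optional `m` is a glob pattern. The sketch below is an assumption about how such matching could look, not the code in this commit: `matching_python_libraries` is a hypothetical helper, and the file names are the ones listed above.

```python
import fnmatch


def matching_python_libraries(filenames, version, suffix):
    """Select libpython files for ``version`` with the given suffix.

    Hypothetical helper for illustration only.
    """
    # The '*' absorbs an optional tag after the version number,
    # such as the 'm' in libpython3.6m.so.
    pattern = 'libpython{0}*.{1}'.format(version, suffix)
    return [f for f in filenames if fnmatch.fnmatchcase(f, pattern)]


centos_files = ['libpython3.so', 'libpython3.6m.so.1.0', 'libpython3.6m.so']
macos_files = ['libpython2.7.dylib']

print(matching_python_libraries(centos_files, '3.6', 'so'))
print(matching_python_libraries(macos_files, '2.7', 'dylib'))
```

The same pattern works whether or not the tag is present, because `*` also matches the empty string.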

## Headers

In Python 2.7, I'm seeing:
```
include/python2.7/pyconfig.h
```
In Python 3.6, I'm seeing:
```
include/python3.6m/pyconfig.h
```
It looks like all Python 3 installations have this `m`. Tested with Python 3.2 and 3.6 on macOS and CentOS 6.

Spack has really nice support for libraries (`find_libraries` and `LibraryList`), but nothing for headers. Fixed.
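
To show the idea behind header support without relying on Spack itself, here is a standard-library-only sketch. The names `find_header_files` and `include_flags` are stand-ins chosen for this illustration; they mimic the general shape of searching for headers and turning their directories into `-I` flags, and should not be read as the API added in the diff.

```python
import fnmatch
import os


def find_header_files(root, pattern):
    """Recursively collect files under ``root`` whose name matches ``pattern``.

    Hypothetical helper for illustration only.
    """
    found = []
    for path, _, filenames in os.walk(root):
        for filename in sorted(filenames):
            if fnmatch.fnmatch(filename, pattern):
                found.append(os.path.join(path, filename))
    return found


def include_flags(headers):
    """Build ``-I`` flags from the de-duplicated header directories."""
    directories = []
    for header in headers:
        directory = os.path.dirname(header)
        # Keep the first occurrence of each directory, preserving order.
        if directory not in directories:
            directories.append(directory)
    return ' '.join('-I' + d for d in directories)
```

Searching recursively from the `include` directory sidesteps the question of whether the subdirectory is called `python2.7` or `python3.6m`.
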
This commit is contained in:
Adam J. Stewart
2017-04-29 19:24:13 -05:00
committed by Todd Gamblin
parent a32a0eacba
commit ce3ab503de
44 changed files with 883 additions and 378 deletions


@@ -25,27 +25,32 @@
import collections
import errno
import fileinput
import fnmatch
import glob
import numbers
import os
import re
import shutil
import six
import stat
import subprocess
import sys
from contextlib import contextmanager
import llnl.util.tty as tty
from llnl.util import tty
from llnl.util.lang import dedupe
__all__ = [
'FileFilter',
'HeaderList',
'LibraryList',
'ancestor',
'can_access',
'change_sed_delimiter',
'copy_mode',
'filter_file',
'find',
'find_headers',
'find_libraries',
'find_system_libraries',
'fix_darwin_install_name',
@@ -66,25 +71,32 @@
'touchp',
'traverse_tree',
'unset_executable_mode',
'working_dir']
'working_dir'
]
def filter_file(regex, repl, *filenames, **kwargs):
"""Like sed, but uses python regular expressions.
r"""Like sed, but uses python regular expressions.
Filters every line of each file through regex and replaces the file
with a filtered version. Preserves mode of filtered files.
Filters every line of each file through regex and replaces the file
with a filtered version. Preserves mode of filtered files.
As with re.sub, ``repl`` can be either a string or a callable.
If it is a callable, it is passed the match object and should
return a suitable replacement string. If it is a string, it
can contain ``\1``, ``\2``, etc. to represent back-substitution
as sed would allow.
As with re.sub, ``repl`` can be either a string or a callable.
If it is a callable, it is passed the match object and should
return a suitable replacement string. If it is a string, it
can contain ``\1``, ``\2``, etc. to represent back-substitution
as sed would allow.
Keyword Options:
string[=False] If True, treat regex as a plain string.
backup[=True] Make backup file(s) suffixed with ~
ignore_absent[=False] Ignore any files that don't exist.
Parameters:
regex (str): The regular expression to search for
repl (str): The string to replace matches with
*filenames: One or more files to search and replace
Keyword Arguments:
string (bool): Treat regex as a plain string. Default is False
backup (bool): Make backup file(s) suffixed with ``~``. Default is True
ignore_absent (bool): Ignore any files that don't exist.
Default is False
"""
string = kwargs.get('string', False)
backup = kwargs.get('backup', True)
@@ -128,7 +140,7 @@ def groupid_to_group(x):
class FileFilter(object):
"""Convenience class for calling filter_file a lot."""
"""Convenience class for calling ``filter_file`` a lot."""
def __init__(self, *filenames):
self.filenames = filenames
@@ -139,12 +151,18 @@ def filter(self, regex, repl, **kwargs):
def change_sed_delimiter(old_delim, new_delim, *filenames):
"""Find all sed search/replace commands and change the delimiter.
e.g., if the file contains seds that look like 's///', you can
call change_sed_delimiter('/', '@', file) to change the
delimiter to '@'.
NOTE that this routine will fail if the delimiter is ' or ".
Handling those is left for future work.
e.g., if the file contains seds that look like ``'s///'``, you can
call ``change_sed_delimiter('/', '@', file)`` to change the
delimiter to ``'@'``.
Note that this routine will fail if the delimiter is ``'`` or ``"``.
Handling those is left for future work.
Parameters:
old_delim (str): The delimiter to search for
new_delim (str): The delimiter to replace with
*filenames: One or more files to search and replace
"""
assert(len(old_delim) == 1)
assert(len(new_delim) == 1)
@@ -239,7 +257,7 @@ def mkdirp(*paths):
def force_remove(*paths):
"""Remove files without printing errors. Like rm -f, does NOT
"""Remove files without printing errors. Like ``rm -f``, does NOT
remove directories."""
for path in paths:
try:
@@ -278,7 +296,8 @@ def touch(path):
def touchp(path):
"""Like touch, but creates any parent directories needed for the file."""
"""Like ``touch``, but creates any parent directories needed for the file.
"""
mkdirp(os.path.dirname(path))
touch(path)
@@ -335,17 +354,13 @@ def traverse_tree(source_root, dest_root, rel_path='', **kwargs):
('root/b', 'dest/b')
('root/b/file3', 'dest/b/file3')
Optional args:
order=[pre|post] -- Whether to do pre- or post-order traversal.
ignore=<predicate> -- Predicate indicating which files to ignore.
follow_nonexisting -- Whether to descend into directories in
src that do not exit in dest. Default True.
follow_links -- Whether to descend into symlinks in src.
Keyword Arguments:
order (str): Whether to do pre- or post-order traversal. Accepted
values are 'pre' and 'post'
ignore (str): Predicate indicating which files to ignore
follow_nonexisting (bool): Whether to descend into directories in
``src`` that do not exist in ``dest``. Default is True
follow_links (bool): Whether to descend into symlinks in ``src``
"""
follow_nonexisting = kwargs.get('follow_nonexisting', True)
follow_links = kwargs.get('follow_link', False)
@@ -406,12 +421,10 @@ def set_executable(path):
def remove_dead_links(root):
"""
Removes any dead link that is present in root
Args:
root: path where to search for dead links
"""Removes any dead link that is present in root.
Parameters:
root (str): path where to search for dead links
"""
for file in os.listdir(root):
path = join_path(root, file)
@@ -419,11 +432,10 @@ def remove_dead_links(root):
def remove_if_dead_link(path):
"""
Removes the argument if it is a dead link, does nothing otherwise
"""Removes the argument if it is a dead link.
Args:
path: the potential dead link
Parameters:
path (str): The potential dead link
"""
if os.path.islink(path):
real_path = os.path.realpath(path)
@@ -432,14 +444,13 @@ def remove_if_dead_link(path):
def remove_linked_tree(path):
"""
Removes a directory and its contents. If the directory is a
symlink, follows the link and removes the real directory before
removing the link.
"""Removes a directory and its contents.
Args:
path: directory to be removed
If the directory is a symlink, follows the link and removes the real
directory before removing the link.
Parameters:
path (str): Directory to be removed
"""
if os.path.exists(path):
if os.path.islink(path):
@@ -450,17 +461,17 @@ def remove_linked_tree(path):
def fix_darwin_install_name(path):
"""
Fix install name of dynamic libraries on Darwin to have full path.
"""Fix install name of dynamic libraries on Darwin to have full path.
There are two parts of this task:
(i) use install_name('-id',...) to change install name of a single lib;
(ii) use install_name('-change',...) to change the cross linking between
libs. The function assumes that all libraries are in one folder and
currently won't follow subfolders.
Args:
path: directory in which .dylib files are located
1. Use ``install_name('-id', ...)`` to change install name of a single lib
2. Use ``install_name('-change', ...)`` to change the cross linking between
libs. The function assumes that all libraries are in one folder and
currently won't follow subfolders.
Parameters:
path (str): directory in which .dylib files are located
"""
libs = glob.glob(join_path(path, "*.dylib"))
for lib in libs:
@@ -486,29 +497,108 @@ def fix_darwin_install_name(path):
stdout=subprocess.PIPE).communicate()[0]
break
# Utilities for libraries
def find(root, files, recurse=True):
"""Search for ``files`` starting from the ``root`` directory.
Like GNU/BSD find but written entirely in Python.
Examples:
.. code-block:: console
$ find /usr -name python
is equivalent to:
>>> find('/usr', 'python')
.. code-block:: console
$ find /usr/local/bin -maxdepth 1 -name python
is equivalent to:
>>> find('/usr/local/bin', 'python', recurse=False)
Accepts any glob characters accepted by fnmatch:
======= ====================================
Pattern Meaning
======= ====================================
* matches everything
? matches any single character
[seq] matches any character in ``seq``
[!seq] matches any character not in ``seq``
======= ====================================
Parameters:
root (str): The root directory to start searching from
files (str or collections.Sequence): File name(s) to search for
recurse (bool, optional): if False search only root folder,
if True descends top-down from the root. Defaults to True.
Returns:
:func:`list`: The files that have been found
"""
if isinstance(files, six.string_types):
files = [files]
if recurse:
return _find_recursive(root, files)
else:
return _find_non_recursive(root, files)
class LibraryList(collections.Sequence):
"""Sequence of absolute paths to libraries
def _find_recursive(root, search_files):
found_files = []
Provides a few convenience methods to manipulate library paths and get
commonly used compiler flags or names
for path, _, list_files in os.walk(root):
for search_file in search_files:
for list_file in list_files:
if fnmatch.fnmatch(list_file, search_file):
found_files.append(join_path(path, list_file))
return found_files
def _find_non_recursive(root, search_files):
found_files = []
for list_file in os.listdir(root):
for search_file in search_files:
if fnmatch.fnmatch(list_file, search_file):
found_files.append(join_path(root, list_file))
return found_files
# Utilities for libraries and headers
class FileList(collections.Sequence):
"""Sequence of absolute paths to files.
Provides a few convenience methods to manipulate file paths.
"""
def __init__(self, libraries):
self.libraries = list(libraries)
def __init__(self, files):
if isinstance(files, six.string_types):
files = [files]
self.files = list(dedupe(files))
@property
def directories(self):
"""Stable de-duplication of the directories where the libraries
reside
"""Stable de-duplication of the directories where the files reside.
>>> l = LibraryList(['/dir1/liba.a', '/dir2/libb.a', '/dir1/libc.a'])
>>> assert l.directories == ['/dir1', '/dir2']
>>> h = HeaderList(['/dir1/a.h', '/dir1/b.h', '/dir2/c.h'])
>>> assert h.directories == ['/dir1', '/dir2']
"""
return list(dedupe(
os.path.dirname(x) for x in self.libraries if os.path.dirname(x)
os.path.dirname(x) for x in self.files if os.path.dirname(x)
))
@property
@@ -517,8 +607,150 @@ def basenames(self):
>>> l = LibraryList(['/dir1/liba.a', '/dir2/libb.a', '/dir3/liba.a'])
>>> assert l.basenames == ['liba.a', 'libb.a']
>>> h = HeaderList(['/dir1/a.h', '/dir2/b.h', '/dir3/a.h'])
>>> assert h.basenames == ['a.h', 'b.h']
"""
return list(dedupe(os.path.basename(x) for x in self.libraries))
return list(dedupe(os.path.basename(x) for x in self.files))
@property
def names(self):
"""Stable de-duplication of file names in the list
>>> h = HeaderList(['/dir1/a.h', '/dir2/b.h', '/dir3/a.h'])
>>> assert h.names == ['a', 'b']
"""
return list(dedupe(x.split('.')[0] for x in self.basenames))
def __getitem__(self, item):
cls = type(self)
if isinstance(item, numbers.Integral):
return self.files[item]
return cls(self.files[item])
def __add__(self, other):
return self.__class__(dedupe(self.files + list(other)))
def __radd__(self, other):
return self.__add__(other)
def __eq__(self, other):
return self.files == other.files
def __len__(self):
return len(self.files)
def joined(self, separator=' '):
return separator.join(self.files)
def __repr__(self):
return self.__class__.__name__ + '(' + repr(self.files) + ')'
def __str__(self):
return self.joined()
class HeaderList(FileList):
"""Sequence of absolute paths to headers.
Provides a few convenience methods to manipulate header paths and get
commonly used compiler flags or names.
"""
def __init__(self, files):
super(HeaderList, self).__init__(files)
self._macro_definitions = []
@property
def headers(self):
return self.files
@property
def include_flags(self):
"""Include flags
>>> h = HeaderList(['/dir1/a.h', '/dir1/b.h', '/dir2/c.h'])
>>> assert h.include_flags == '-I/dir1 -I/dir2'
"""
return ' '.join(['-I' + x for x in self.directories])
@property
def macro_definitions(self):
"""Macro definitions
>>> h = HeaderList(['/dir1/a.h', '/dir1/b.h', '/dir2/c.h'])
>>> h.add_macro('-DBOOST_LIB_NAME=boost_regex')
>>> h.add_macro('-DBOOST_DYN_LINK')
>>> assert h.macro_definitions == '-DBOOST_LIB_NAME=boost_regex -DBOOST_DYN_LINK' # noqa
"""
return ' '.join(self._macro_definitions)
@property
def cpp_flags(self):
"""Include flags + macro definitions
>>> h = HeaderList(['/dir1/a.h', '/dir1/b.h', '/dir2/c.h'])
>>> h.add_macro('-DBOOST_DYN_LINK')
>>> assert h.cpp_flags == '-I/dir1 -I/dir2 -DBOOST_DYN_LINK'
"""
return self.include_flags + ' ' + self.macro_definitions
def add_macro(self, macro):
"""Add a macro definition"""
self._macro_definitions.append(macro)
def find_headers(headers, root, recurse=False):
"""Returns an iterable object containing a list of full paths to
headers if found.
Accepts any glob characters accepted by fnmatch:
======= ====================================
Pattern Meaning
======= ====================================
* matches everything
? matches any single character
[seq] matches any character in ``seq``
[!seq] matches any character not in ``seq``
======= ====================================
Parameters:
headers (str or list of str): Header name(s) to search for
root (str): The root directory to start searching from
recurse (bool, optional): if False search only root folder,
if True descends top-down from the root. Defaults to False.
Returns:
HeaderList: The headers that have been found
"""
if isinstance(headers, six.string_types):
headers = [headers]
elif not isinstance(headers, collections.Sequence):
message = '{0} expects a string or sequence of strings as the '
message += 'first argument [got {1} instead]'
message = message.format(find_headers.__name__, type(headers))
raise TypeError(message)
# Construct the right suffix for the headers
suffix = 'h'
# List of headers we are searching with suffixes
headers = ['{0}.{1}'.format(header, suffix) for header in headers]
return HeaderList(find(root, headers, recurse))
class LibraryList(FileList):
"""Sequence of absolute paths to libraries
Provides a few convenience methods to manipulate library paths and get
commonly used compiler flags or names
"""
@property
def libraries(self):
return self.files
@property
def names(self):
@@ -556,36 +788,9 @@ def ld_flags(self):
"""
return self.search_flags + ' ' + self.link_flags
def __getitem__(self, item):
cls = type(self)
if isinstance(item, numbers.Integral):
return self.libraries[item]
return cls(self.libraries[item])
def __add__(self, other):
return LibraryList(dedupe(self.libraries + list(other)))
def __radd__(self, other):
return self.__add__(other)
def __eq__(self, other):
return self.libraries == other.libraries
def __len__(self):
return len(self.libraries)
def joined(self, separator=' '):
return separator.join(self.libraries)
def __repr__(self):
return self.__class__.__name__ + '(' + repr(self.libraries) + ')'
def __str__(self):
return self.joined()
def find_system_libraries(library_names, shared=True):
"""Searches the usual system library locations for ``library_names``.
def find_system_libraries(libraries, shared=True):
"""Searches the usual system library locations for ``libraries``.
Search order is as follows:
@@ -596,20 +801,32 @@ def find_system_libraries(library_names, shared=True):
5. ``/usr/local/lib64``
6. ``/usr/local/lib``
Args:
library_names (str or list of str): Library name(s) to search for
shared (bool): searches for shared libraries if True
Accepts any glob characters accepted by fnmatch:
======= ====================================
Pattern Meaning
======= ====================================
* matches everything
? matches any single character
[seq] matches any character in ``seq``
[!seq] matches any character not in ``seq``
======= ====================================
Parameters:
libraries (str or list of str): Library name(s) to search for
shared (bool, optional): if True searches for shared libraries,
otherwise for static. Defaults to True.
Returns:
LibraryList: The libraries that have been found
"""
if isinstance(library_names, str):
library_names = [library_names]
elif not isinstance(library_names, collections.Sequence):
if isinstance(libraries, six.string_types):
libraries = [libraries]
elif not isinstance(libraries, collections.Sequence):
message = '{0} expects a string or sequence of strings as the '
message += 'first argument [got {1} instead]'
message = message.format(
find_system_libraries.__name__, type(library_names))
message = message.format(find_system_libraries.__name__,
type(libraries))
raise TypeError(message)
libraries_found = []
@@ -622,7 +839,7 @@ def find_system_libraries(library_names, shared=True):
'/usr/local/lib',
]
for library in library_names:
for library in libraries:
for root in search_locations:
result = find_libraries(library, root, shared, recurse=True)
if result:
@@ -632,26 +849,38 @@ def find_system_libraries(library_names, shared=True):
return libraries_found
def find_libraries(library_names, root, shared=True, recurse=False):
def find_libraries(libraries, root, shared=True, recurse=False):
"""Returns an iterable of full paths to libraries found in a root dir.
Args:
library_names (str or list of str): Library names to search for
Accepts any glob characters accepted by fnmatch:
======= ====================================
Pattern Meaning
======= ====================================
* matches everything
? matches any single character
[seq] matches any character in ``seq``
[!seq] matches any character not in ``seq``
======= ====================================
Parameters:
libraries (str or list of str): Library name(s) to search for
root (str): The root directory to start searching from
shared (bool): if True searches for shared libraries, otherwise static.
recurse (bool): if False search only root folder,
if True descends top-down from the root
shared (bool, optional): if True searches for shared libraries,
otherwise for static. Defaults to True.
recurse (bool, optional): if False search only root folder,
if True descends top-down from the root. Defaults to False.
Returns:
LibraryList: The libraries that have been found
"""
if isinstance(library_names, str):
library_names = [library_names]
elif not isinstance(library_names, collections.Sequence):
if isinstance(libraries, six.string_types):
libraries = [libraries]
elif not isinstance(libraries, collections.Sequence):
message = '{0} expects a string or sequence of strings as the '
message += 'first argument [got {1} instead]'
raise TypeError(message.format(
find_libraries.__name__, type(library_names)))
message = message.format(find_libraries.__name__, type(libraries))
raise TypeError(message)
# Construct the right suffix for the library
if shared is True:
@@ -659,38 +888,6 @@ def find_libraries(library_names, root, shared=True, recurse=False):
else:
suffix = 'a'
# List of libraries we are searching with suffixes
libraries = ['{0}.{1}'.format(lib, suffix) for lib in library_names]
# Search method
if recurse is False:
search_method = _find_libraries_non_recursive
else:
search_method = _find_libraries_recursive
libraries = ['{0}.{1}'.format(lib, suffix) for lib in libraries]
return search_method(libraries, root)
def _find_libraries_recursive(libraries, root):
library_dict = collections.defaultdict(list)
for path, _, files in os.walk(root):
for lib in libraries:
if lib in files:
library_dict[lib].append(
join_path(path, lib)
)
answer = []
for lib in libraries:
answer.extend(library_dict[lib])
return LibraryList(answer)
def _find_libraries_non_recursive(libraries, root):
def lib_or_none(lib):
library = join_path(root, lib)
if not os.path.exists(library):
return None
return library
return LibraryList(
[lib_or_none(lib) for lib in libraries if lib_or_none(lib) is not None]
)
return LibraryList(find(root, libraries, recurse))