targets: Spack targets can now be fine-grained microarchitectures

Spack can now:

- label ppc64, ppc64le, x86_64, etc. builds with
  microarchitecture-specific names, like 'haswell', 'skylake' or
  'icelake'.

- detect the host architecture of a machine from /proc/cpuinfo or similar
  tools.

- understand which microarchitectures are compatible with which (for
  binary reuse).

- understand which compiler flags are needed (for GCC, so far) to build
  binaries for particular microarchitectures.

All of this is managed through a JSON file (microarchitectures.json) that
contains detailed auto-detection, compiler flag, and compatibility
information for specific microarchitecture targets.  The `llnl.util.cpu`
module implements a library that allows detection and comparison of
microarchitectures based on the data in this file.
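
As a rough illustration of the data layout, here is a minimal sketch that
loads a three-entry excerpt in the shape of microarchitectures.json and
follows the "from" links to collect ancestors. The helper names
(`parent_names`, `ancestors`) are hypothetical and are not part of
`llnl.util.cpu`; the excerpt omits the "compilers" field for brevity.

```python
import json

# Excerpt in the shape of microarchitectures.json (compilers omitted).
EXCERPT = """
{
  "microarchitectures": {
    "x86_64": {"from": null, "vendor": "generic", "features": []},
    "nocona": {"from": "x86_64", "vendor": "GenuineIntel",
               "features": ["mmx", "sse", "sse2", "sse3"]},
    "core2": {"from": "nocona", "vendor": "GenuineIntel",
              "features": ["mmx", "sse", "sse2", "ssse3"]}
  }
}
"""


def parent_names(entry):
    """Normalize the "from" field (null, a string or a list) to a list."""
    parents = entry["from"]
    if parents is None:
        return []
    if isinstance(parents, str):
        return [parents]
    return list(parents)


def ancestors(name, data):
    """Return every target reachable from ``name`` through "from" links."""
    found = []
    for parent in parent_names(data[name]):
        for candidate in [parent] + ancestors(parent, data):
            if candidate not in found:
                found.append(candidate)
    return found


data = json.loads(EXCERPT)["microarchitectures"]
print(ancestors("core2", data))
```

Parents come before grandparents in the result, so the list reads from the
closest ancestor to the root of the family.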

The `target` part of Spack specs is now essentially a Microarchitecture
object, and Specs' targets can be compared for compatibility as well.
This allows us to label optimized binary packages at a granularity that
enables them to be reused on compatible machines.  Previously, we only
knew that a package was built for x86_64, NOT which x86_64 machines it
was usable on.
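
The reuse rule can be sketched as follows, assuming a binary built for a
target is usable on any host whose detected microarchitecture is that
target or one of its descendants. The `PARENTS` table is a simplified,
hypothetical chain (intermediate targets are omitted), and `lineage` and
`can_run` are illustrative names, not Spack functions.

```python
# Simplified parent links; the real JSON file has more intermediate targets.
PARENTS = {
    "x86_64": [],
    "haswell": ["x86_64"],
    "broadwell": ["haswell"],
    "skylake": ["broadwell"],
    "zen": ["x86_64"],
}


def lineage(name):
    """Return the target itself followed by all of its ancestors."""
    seen = [name]
    for parent in PARENTS[name]:
        for item in lineage(parent):
            if item not in seen:
                seen.append(item)
    return seen


def can_run(binary_target, host_target):
    """True if a binary built for ``binary_target`` should run on a host
    detected as ``host_target``."""
    return binary_target in lineage(host_target)


print(can_run("haswell", "skylake"))  # haswell binary on a skylake host
print(can_run("skylake", "haswell"))  # skylake binary on a haswell host
print(can_run("haswell", "zen"))      # different vendor branch
```

A plain x86_64 label carries no such information, which is why the generic
family name alone could not say where an optimized binary was usable.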

Currently this feature supports Intel, Power, and AMD chips. Support for
ARM is forthcoming.

Specifics:

- Add microarchitectures.json with descriptions of architectures

- Relax the semantics of the compiler's "target" attribute.  Before this
  change, a compiler was viable for a given target only on an exact
  match. That made sense while the finest granularity of targets was the
  architecture family.  Now that we can target micro-architectures, the
  compiler's "target" attribute is interpreted as an architecture
  family: a compiler is a viable choice if the target being concretized
  belongs to the same family. Similarly, when a new compiler is detected,
  the architecture family is stored in its "target" attribute.

- Make Spack's `cc` compiler wrapper inject target-specific flags on the
  command line

- Update architecture concretization to use the same algorithm as
  compiler concretization

- Micro-architecture features, vendor, generation etc. are included in
  the package hash.  Generic architectures, such as x86_64 or ppc64, are
  still dumped using the name only.

- If the compiler does not support a target, exit with an intelligible
  error message. If compiler support is unknown, don't try to use
  optimization flags.

- Support feature aliases (e.g., sse3 -> ssse3) on Microarchitecture
  objects. Feature aliases are defined in microarchitectures.json and map
  a name (the "alias") to a list of rules that must all be met for the
  test to be successful. The rules that are available can be extended
  later using a decorator.

- Implement subset semantics for comparing microarchitectures (treat
  microarchitectures as a partial order, i.e. (a < b), (a == b) and (b <
  a) can all be false).

- Implement logic to automatically demote the default target if the
  compiler being used is too old to optimize for it. Updated docs to make
  this behavior explicit.  This avoids surprising the user if the default
  compiler is older than the host architecture.
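
The subset semantics mentioned in the list above can be sketched with
plain Python sets. This is an illustration only: `PARENTS` is a simplified
table in which "skylake" is treated as a root, and `as_set` is a
hypothetical helper that mirrors the idea of comparing a target together
with its ancestors.

```python
# Simplified parent links; "skylake" is treated as a root for brevity.
PARENTS = {
    "skylake": [],
    "skylake_avx512": ["skylake"],
    "cannonlake": ["skylake"],
    "cascadelake": ["skylake_avx512"],
    "icelake": ["cascadelake", "cannonlake"],
}


def as_set(name):
    """Return the set made of a target and all of its ancestors."""
    nodes = {name}
    for parent in PARENTS[name]:
        nodes |= as_set(parent)
    return nodes


a = as_set("cannonlake")
b = as_set("cascadelake")

# Neither target descends from the other, so all three comparisons fail.
print(a < b, a == b, b < a)

# A target is strictly smaller than any of its descendants.
print(as_set("skylake") < as_set("icelake"))
```

Using set inclusion means two targets on different branches are simply
incomparable, rather than being forced into an arbitrary total order.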

This commit adds unit tests to verify the semantics of target ranges and
target lists in constraints. The implementation to allow target ranges
and lists is minimal and doesn't add any new type.  A more careful
refactor that takes into account the type system might be due later.

Co-authored-by: Gregory Becker <becker33.llnl.gov>
This commit is contained in:
Massimiliano Culpo
2019-06-19 15:47:07 +02:00
committed by Todd Gamblin
parent dfabf5d6b1
commit 3c4322bf1a
50 changed files with 3039 additions and 539 deletions


@@ -0,0 +1,16 @@
# Copyright 2013-2019 Lawrence Livermore National Security, LLC and other
# Spack Project Developers. See the top-level COPYRIGHT file for details.
#
# SPDX-License-Identifier: (Apache-2.0 OR MIT)
from .microarchitecture import Microarchitecture, UnsupportedMicroarchitecture
from .microarchitecture import targets, generic_microarchitecture
from .detect import host

__all__ = [
    'Microarchitecture',
    'UnsupportedMicroarchitecture',
    'targets',
    'generic_microarchitecture',
    'host'
]


@@ -0,0 +1,102 @@
# Copyright 2013-2019 Lawrence Livermore National Security, LLC and other
# Spack Project Developers. See the top-level COPYRIGHT file for details.
#
# SPDX-License-Identifier: (Apache-2.0 OR MIT)
from .schema import targets_json, LazyDictionary, properties

#: Known predicates that can be used to construct feature aliases
_feature_alias_predicate = {}


class FeatureAliasTest(object):
    """A test that must be passed for a feature alias to succeed.

    Args:
        rules (dict): dictionary of rules to be met. Each key must be a
            valid alias predicate
    """
    def __init__(self, rules):
        self.rules = rules
        self.predicates = []
        for name, args in rules.items():
            self.predicates.append(_feature_alias_predicate[name](args))

    def __call__(self, microarchitecture):
        return all(
            feature_test(microarchitecture) for feature_test in self.predicates
        )


def _feature_aliases():
    """Returns the dictionary of all defined feature aliases."""
    json_data = targets_json['feature_aliases']
    aliases = {}
    for alias, rules in json_data.items():
        aliases[alias] = FeatureAliasTest(rules)
    return aliases


feature_aliases = LazyDictionary(_feature_aliases)


def alias_predicate(predicate_schema):
    """Decorator to register a predicate that can be used to define
    feature aliases.

    Args:
        predicate_schema (dict): schema to be enforced in
            microarchitectures.json for the predicate
    """
    def decorator(func):
        name = func.__name__

        # Check we didn't register anything else with the same name
        if name in _feature_alias_predicate:
            msg = 'the alias predicate "{0}" already exists'.format(name)
            raise KeyError(msg)

        # Update the overall schema
        alias_schema = properties['feature_aliases']['patternProperties']
        alias_schema[r'([\w]*)']['properties'].update(
            {name: predicate_schema}
        )
        # Register the predicate
        _feature_alias_predicate[name] = func

        return func
    return decorator


@alias_predicate(predicate_schema={'type': 'string'})
def reason(motivation_for_the_alias):
    """This predicate always returns True. It is there to allow writing
    a documentation string in the JSON file to explain why an alias is needed.
    """
    return lambda x: True


@alias_predicate(predicate_schema={
    'type': 'array',
    'items': {'type': 'string'}
})
def any_of(list_of_features):
    """Returns a predicate that is True if any of the features in the
    list is in the microarchitecture being tested, False otherwise.
    """
    def _impl(microarchitecture):
        return any(x in microarchitecture for x in list_of_features)
    return _impl


@alias_predicate(predicate_schema={
    'type': 'array',
    'items': {'type': 'string'}
})
def families(list_of_families):
    """Returns a predicate that is True if the architecture family of
    the microarchitecture being tested is in the list, False otherwise.
    """
    def _impl(microarchitecture):
        return str(microarchitecture.family) in list_of_families
    return _impl
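
The way alias rules combine can be sketched in isolation. The snippet
below is a simplified stand-in for the module above, not its actual code:
targets are plain dictionaries, the registry and helper names
(`PREDICATES`, `make_alias`, `has_feature`) are hypothetical, and schema
handling is left out. It assumes, as in the module, that every rule of an
alias must hold and that raw features are checked before aliases.

```python
PREDICATES = {}


def predicate(func):
    """Register a predicate factory under its function name."""
    PREDICATES[func.__name__] = func
    return func


@predicate
def any_of(features):
    return lambda target: any(f in target["features"] for f in features)


@predicate
def families(names):
    return lambda target: target["family"] in names


def make_alias(rules):
    """Build a test that passes only if every rule passes."""
    tests = [PREDICATES[name](args) for name, args in rules.items()]
    return lambda target: all(test(target) for test in tests)


ALIASES = {
    "sse3": make_alias({"any_of": ["ssse3"]}),
    "altivec": make_alias({"families": ["ppc64le", "ppc64"]}),
}


def has_feature(target, feature):
    """Look in the raw features first, then fall back to aliases."""
    if feature in target["features"]:
        return True
    return ALIASES.get(feature, lambda t: False)(target)


core2 = {"family": "x86_64", "features": {"mmx", "sse", "sse2", "ssse3"}}
power9 = {"family": "ppc64le", "features": set()}

print(has_feature(core2, "sse3"))     # satisfied through the ssse3 alias
print(has_feature(power9, "altivec"))  # satisfied through the family rule
```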


@@ -0,0 +1,216 @@
# Copyright 2013-2019 Lawrence Livermore National Security, LLC and other
# Spack Project Developers. See the top-level COPYRIGHT file for details.
#
# SPDX-License-Identifier: (Apache-2.0 OR MIT)
import collections
import functools
import platform
import re
import subprocess
import sys
import six
from .microarchitecture import generic_microarchitecture, targets

#: Mapping from operating systems to chain of commands
#: to obtain a dictionary of raw info on the current cpu
info_factory = collections.defaultdict(list)

#: Mapping from micro-architecture families (x86_64, ppc64le, etc.) to
#: functions checking the compatibility of the host with a given target
compatibility_checks = {}


def info_dict(operating_system):
    """Decorator to mark functions that are meant to return raw info on
    the current cpu.

    Args:
        operating_system (str or tuple): operating system for which the marked
            function is a viable factory of raw info dictionaries.
    """
    def decorator(factory):
        info_factory[operating_system].append(factory)

        @functools.wraps(factory)
        def _impl():
            info = factory()

            # Check that info contains a few mandatory fields
            msg = 'field "{0}" is missing from raw info dictionary'
            assert 'vendor_id' in info, msg.format('vendor_id')
            assert 'flags' in info, msg.format('flags')
            assert 'model' in info, msg.format('model')
            # The factories in this module store this field as 'model name'
            assert 'model name' in info, msg.format('model name')

            return info

        return _impl

    return decorator


@info_dict(operating_system='Linux')
def proc_cpuinfo():
    """Returns a raw info dictionary by parsing the first entry of
    ``/proc/cpuinfo``
    """
    info = {}
    with open('/proc/cpuinfo') as file:
        for line in file:
            key, separator, value = line.partition(':')

            # If there's no separator and info was already populated
            # according to what's written here:
            #
            # http://www.linfo.org/proc_cpuinfo.html
            #
            # we are on a blank line separating two cpus. Exit early as
            # we want to read just the first entry in /proc/cpuinfo
            if separator != ':' and info:
                break

            info[key.strip()] = value.strip()
    return info
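

def check_output(args):
    # subprocess.check_output requires Python 2.7+ and subprocess.run
    # requires Python 3.5+, so neither is usable on Python 2.6. Popen is
    # available everywhere; decode so callers always get text, not bytes.
    process = subprocess.Popen(args, stdout=subprocess.PIPE)
    output, _ = process.communicate()
    if process.returncode != 0:
        raise subprocess.CalledProcessError(process.returncode, args)
    return output.decode('utf-8')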


@info_dict(operating_system='Darwin')
def sysctl():
    """Returns a raw info dictionary parsing the output of sysctl."""

    info = {}
    info['vendor_id'] = check_output(
        ['sysctl', '-n', 'machdep.cpu.vendor']
    ).strip()
    info['flags'] = check_output(
        ['sysctl', '-n', 'machdep.cpu.features']
    ).strip().lower()
    info['flags'] += ' ' + check_output(
        ['sysctl', '-n', 'machdep.cpu.leaf7_features']
    ).strip().lower()
    info['model'] = check_output(
        ['sysctl', '-n', 'machdep.cpu.model']
    ).strip()
    info['model name'] = check_output(
        ['sysctl', '-n', 'machdep.cpu.brand_string']
    ).strip()

    # Super hacky way to deal with slight representation differences
    # Would be better to somehow consider these "identical"
    if 'sse4.1' in info['flags']:
        info['flags'] += ' sse4_1'
    if 'sse4.2' in info['flags']:
        info['flags'] += ' sse4_2'
    if 'avx1.0' in info['flags']:
        info['flags'] += ' avx'

    return info


def raw_info_dictionary():
    """Returns a dictionary with information on the cpu of the current host.

    This function calls all the viable factories one after the other until
    there's one that is able to produce the requested information.
    """
    info = {}
    for factory in info_factory[platform.system()]:
        try:
            info = factory()
        except Exception:
            pass

        if info:
            break

    return info


def compatible_microarchitectures(info):
    """Returns an unordered list of known micro-architectures that are
    compatible with the info dictionary passed as argument.

    Args:
        info (dict): dictionary containing information on the host cpu
    """
    architecture_family = platform.machine()
    # If a tester is not registered, be conservative and assume no known
    # target is compatible with the host
    tester = compatibility_checks.get(architecture_family, lambda x, y: False)
    return [x for x in targets.values() if tester(info, x)] or \
        [generic_microarchitecture(architecture_family)]


def host():
    """Detects the host micro-architecture and returns it."""
    # Retrieve a dictionary with raw information on the host's cpu
    info = raw_info_dictionary()

    # Get a list of possible candidates for this micro-architecture
    candidates = compatible_microarchitectures(info)

    # Reverse sort of the depth for the inheritance tree among only targets we
    # can use. This gets the newest target we satisfy.
    return sorted(candidates, key=lambda t: len(t.ancestors), reverse=True)[0]


def compatibility_check(architecture_family):
    """Decorator to register a function as a proper compatibility check.

    A compatibility check function takes the raw info dictionary as a first
    argument and an arbitrary target as the second argument. It returns True
    if the target is compatible with the info dictionary, False otherwise.

    Args:
        architecture_family (str or tuple): architecture family for which
            this test can be used, e.g. x86_64 or ppc64le etc.
    """
    # Turn the argument into something iterable
    if isinstance(architecture_family, six.string_types):
        architecture_family = (architecture_family,)

    def decorator(func):
        # TODO: on removal of Python 2.6 support this can be re-written as
        # TODO: an update + a dict comprehension
        for arch_family in architecture_family:
            compatibility_checks[arch_family] = func

        return func

    return decorator


@compatibility_check(architecture_family=('ppc64le', 'ppc64'))
def compatibility_check_for_power(info, target):
    basename = platform.machine()
    generation_match = re.search(r'POWER(\d+)', info.get('cpu', ''))
    # Be conservative if the generation cannot be read from the raw info
    if generation_match is None:
        return False
    generation = int(generation_match.group(1))

    # We can use a target if it descends from our machine type and our
    # generation (9 for POWER9, etc) is at least its generation.
    arch_root = targets[basename]
    return (target == arch_root or arch_root in target.ancestors) \
        and target.generation <= generation


@compatibility_check(architecture_family='x86_64')
def compatibility_check_for_x86_64(info, target):
    basename = 'x86_64'
    vendor = info.get('vendor_id', 'generic')
    features = set(info.get('flags', '').split())

    # We can use a target if it descends from our machine type, is from our
    # vendor, and we have all of its features
    arch_root = targets[basename]
    return (target == arch_root or arch_root in target.ancestors) \
        and (target.vendor == vendor or target.vendor == 'generic') \
        and target.features.issubset(features)
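
The detection flow in the module above can be sketched without touching
the real machine. The snippet below is an illustration under stated
assumptions: `SAMPLE` is a made-up /proc/cpuinfo excerpt, `TARGETS` is a
hand-copied subset of the x86_64 data with the number of ancestors
precomputed, and `first_cpu_entry` and `detect` are hypothetical names.

```python
# Made-up excerpt; a real /proc/cpuinfo lists many more fields per cpu.
SAMPLE = (
    "processor\t: 0\n"
    "vendor_id\t: GenuineIntel\n"
    "model name\t: Example CPU\n"
    "flags\t\t: mmx sse sse2 ssse3 sse4_1 sse4_2 popcnt\n"
    "\n"
    "processor\t: 1\n"
    "vendor_id\t: GenuineIntel\n"
)

NEHALEM = {"mmx", "sse", "sse2", "ssse3", "sse4_1", "sse4_2", "popcnt"}

# name -> (number of ancestors, vendor, features)
TARGETS = {
    "x86_64": (0, "generic", set()),
    "core2": (2, "GenuineIntel", {"mmx", "sse", "sse2", "ssse3"}),
    "nehalem": (3, "GenuineIntel", NEHALEM),
    "sandybridge": (5, "GenuineIntel", NEHALEM | {"aes", "pclmulqdq", "avx"}),
}


def first_cpu_entry(text):
    """Parse only the first cpu block, stopping at the first blank line."""
    info = {}
    for line in text.splitlines():
        key, separator, value = line.partition(":")
        if separator != ":" and info:
            break
        info[key.strip()] = value.strip()
    return info


def detect(text):
    """Pick the compatible target with the deepest ancestry."""
    info = first_cpu_entry(text)
    vendor = info.get("vendor_id", "generic")
    flags = set(info.get("flags", "").split())
    compatible = [
        name for name, (_, target_vendor, features) in TARGETS.items()
        if target_vendor in (vendor, "generic") and features <= flags
    ]
    return max(compatible, key=lambda name: TARGETS[name][0])


print(detect(SAMPLE))
```

Because the generic family root has no features, it is always a candidate,
so the selection never runs on an empty list in this sketch.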


@@ -0,0 +1,343 @@
# Copyright 2013-2019 Lawrence Livermore National Security, LLC and other
# Spack Project Developers. See the top-level COPYRIGHT file for details.
#
# SPDX-License-Identifier: (Apache-2.0 OR MIT)
import functools
import platform
import warnings

try:
    from collections.abc import Sequence
except ImportError:
    from collections import Sequence

import six

import llnl.util
import llnl.util.cpu.alias
import llnl.util.cpu.schema
from .schema import LazyDictionary
from .alias import feature_aliases


def coerce_target_names(func):
    """Decorator that automatically converts a known target name to a proper
    Microarchitecture object.
    """
    @functools.wraps(func)
    def _impl(self, other):
        if isinstance(other, six.string_types):
            if other not in targets:
                msg = '"{0}" is not a valid target name'
                raise ValueError(msg.format(other))
            other = targets[other]

        return func(self, other)
    return _impl


class Microarchitecture(object):
    #: Aliases for micro-architecture's features
    feature_aliases = feature_aliases

    def __init__(
            self, name, parents, vendor, features, compilers, generation=0
    ):
        """Represents a specific CPU micro-architecture.

        Args:
            name (str): name of the micro-architecture (e.g. skylake).
            parents (list): list of parent micro-architectures, if any.
                Parenthood is considered by cpu features and not
                chronologically. As such each micro-architecture is
                compatible with its ancestors. For example "skylake",
                which has "broadwell" as a parent, supports running binaries
                optimized for "broadwell".
            vendor (str): vendor of the micro-architecture
            features (list of str): supported CPU flags. Note that the
                semantics of the flags in this field might vary among
                architectures, if at all present. For instance x86_64
                processors will list all the flags supported by a given CPU
                while Arm processors will list instead only the flags that
                have been added on top of the base model for the current
                micro-architecture.
            compilers (dict): compiler support to generate tuned code for this
                micro-architecture. This dictionary has as keys names of
                supported compilers, while values are lists of dictionaries
                with fields:

                * name: name of the micro-architecture according to the
                    compiler. This is the name passed to the ``-march`` option
                    or similar. Not needed if the name is the same as that
                    passed in as argument above.
                * versions: versions that support this micro-architecture.

            generation (int): generation of the micro-architecture, if
                relevant.
        """
        self.name = name
        self.parents = parents
        self.vendor = vendor
        self.features = features
        self.compilers = compilers
        self.generation = generation

    @property
    def ancestors(self):
        value = self.parents[:]
        for parent in self.parents:
            value.extend(a for a in parent.ancestors if a not in value)
        return value

    def _to_set(self):
        """Returns a set of the nodes in this microarchitecture DAG."""
        # This function is used to implement subset semantics with
        # comparison operators
        return set([str(self)] + [str(x) for x in self.ancestors])

    @coerce_target_names
    def __eq__(self, other):
        if not isinstance(other, Microarchitecture):
            return NotImplemented

        return (self.name == other.name and
                self.vendor == other.vendor and
                self.features == other.features and
                self.ancestors == other.ancestors and
                self.compilers == other.compilers and
                self.generation == other.generation)

    @coerce_target_names
    def __ne__(self, other):
        return not self == other

    @coerce_target_names
    def __lt__(self, other):
        if not isinstance(other, Microarchitecture):
            return NotImplemented

        return self._to_set() < other._to_set()

    @coerce_target_names
    def __le__(self, other):
        return (self == other) or (self < other)

    @coerce_target_names
    def __gt__(self, other):
        if not isinstance(other, Microarchitecture):
            return NotImplemented

        return self._to_set() > other._to_set()

    @coerce_target_names
    def __ge__(self, other):
        return (self == other) or (self > other)

    def __repr__(self):
        cls_name = self.__class__.__name__
        fmt = cls_name + '({0.name!r}, {0.parents!r}, {0.vendor!r}, ' \
                         '{0.features!r}, {0.compilers!r}, {0.generation!r})'
        return fmt.format(self)

    def __str__(self):
        return self.name

    def __contains__(self, feature):
        # Feature must be of a string type, so be defensive about that
        if not isinstance(feature, six.string_types):
            msg = 'only objects of string types are accepted [got {0}]'
            raise TypeError(msg.format(str(type(feature))))

        # Here we look first in the raw features, and fall back to
        # feature aliases if no match was found
        if feature in self.features:
            return True

        # Check if the alias is defined, if not it will return False
        match_alias = Microarchitecture.feature_aliases.get(
            feature, lambda x: False
        )
        return match_alias(self)

    @property
    def family(self):
        """Returns the architecture family a given target belongs to"""
        roots = [x for x in [self] + self.ancestors if not x.ancestors]
        msg = "a target is expected to belong to just one architecture family"
        msg += " [found {0}]".format(', '.join(str(x) for x in roots))
        assert len(roots) == 1, msg

        return roots.pop()

    def to_dict(self, return_list_of_items=False):
        """Returns a dictionary representation of this object.

        Args:
            return_list_of_items (bool): if True returns an ordered list of
                items instead of the dictionary
        """
        list_of_items = [
            ('name', str(self.name)),
            ('vendor', str(self.vendor)),
            ('features', sorted(
                str(x) for x in self.features
            )),
            ('generation', self.generation),
            ('parents', [str(x) for x in self.parents])
        ]
        if return_list_of_items:
            return list_of_items

        return dict(list_of_items)

    def optimization_flags(self, compiler, version):
        """Returns a string containing the optimization flags that need
        to be used to produce code optimized for this micro-architecture.

        If there is no information on the compiler passed as argument the
        function returns an empty string. If it is known that the compiler
        version we want to use does not support this architecture the function
        raises an exception.

        Args:
            compiler (str): name of the compiler to be used
            version (str): version of the compiler to be used
        """
        # If we don't have information on compiler return an empty string
        if compiler not in self.compilers:
            return ''

        # If we have information on this compiler we need to check the
        # version being used
        compiler_info = self.compilers[compiler]

        # Normalize the entries to have a uniform treatment in the code below
        if not isinstance(compiler_info, Sequence):
            compiler_info = [compiler_info]

        def satisfies_constraint(entry, version):
            min_version, max_version = entry['versions'].split(':')

            # Check version suffixes
            min_version, _, min_suffix = min_version.partition('-')
            max_version, _, max_suffix = max_version.partition('-')
            version, _, suffix = version.partition('-')

            # If the suffixes are not all equal there's no match
            if suffix != min_suffix or suffix != max_suffix:
                return False

            # Assume compiler versions fit into semver
            tuplify = lambda x: tuple(int(y) for y in x.split('.'))

            version = tuplify(version)
            if min_version:
                min_version = tuplify(min_version)
                if min_version > version:
                    return False

            if max_version:
                max_version = tuplify(max_version)
                if max_version < version:
                    return False

            return True

        for compiler_entry in compiler_info:
            if satisfies_constraint(compiler_entry, version):
                flags_fmt = compiler_entry['flags']
                # If there's no field name, use the name of the
                # micro-architecture
                compiler_entry.setdefault('name', self.name)

                # Check if we need to emit a warning
                warning_message = compiler_entry.get('warnings', None)
                if warning_message:
                    warnings.warn(warning_message)

                flags = flags_fmt.format(**compiler_entry)
                return flags

        msg = ("cannot produce optimized binary for micro-architecture '{0}'"
               " with {1}@{2} [supported compiler versions are {3}]")
        msg = msg.format(self.name, compiler, version,
                         ', '.join([x['versions'] for x in compiler_info]))
        raise UnsupportedMicroarchitecture(msg)


def generic_microarchitecture(name):
    """Returns a generic micro-architecture with no vendor and no features.

    Args:
        name (str): name of the micro-architecture
    """
    return Microarchitecture(
        name, parents=[], vendor='generic', features=[], compilers={}
    )


def _known_microarchitectures():
    """Returns a dictionary of the known micro-architectures. If the
    current host platform is unknown adds it too as a generic target.
    """

    # TODO: Simplify this logic using object_pairs_hook to OrderedDict
    # TODO: when we stop supporting python2.6

    def fill_target_from_dict(name, data, targets):
        """Recursively fills targets by adding the micro-architecture
        passed as argument and all its ancestors.

        Args:
            name (str): micro-architecture to be added to targets.
            data (dict): raw data loaded from JSON.
            targets (dict): dictionary that maps micro-architecture names
                to ``Microarchitecture`` objects
        """
        values = data[name]

        # Get direct parents of target
        parent_names = values['from']
        if isinstance(parent_names, six.string_types):
            parent_names = [parent_names]
        if parent_names is None:
            parent_names = []
        for p in parent_names:
            # Recursively fill parents so they exist before we add them
            if p in targets:
                continue
            fill_target_from_dict(p, data, targets)
        parents = [targets.get(p) for p in parent_names]

        vendor = values['vendor']
        features = set(values['features'])
        compilers = values.get('compilers', {})
        generation = values.get('generation', 0)

        targets[name] = Microarchitecture(
            name, parents, vendor, features, compilers, generation
        )

    targets = {}
    data = llnl.util.cpu.schema.targets_json['microarchitectures']
    for name in data:
        if name in targets:
            # name was already brought in as ancestor to a target
            continue
        fill_target_from_dict(name, data, targets)

    # Add the host platform if not present
    host_platform = platform.machine()
    targets.setdefault(host_platform, generic_microarchitecture(host_platform))

    return targets


#: Dictionary of known micro-architectures
targets = LazyDictionary(_known_microarchitectures)


class UnsupportedMicroarchitecture(ValueError):
    """Raised if a compiler version does not support optimization for a given
    micro-architecture.
    """

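
The version matching behind `optimization_flags` can be sketched on its
own. This is a simplified illustration, not the method itself: it ignores
version suffixes and warnings, raises a plain `ValueError`, and the helper
names (`parse`, `in_range`, `flags_for`) are hypothetical. The two entries
mirror the GCC data listed for "nehalem" in the JSON file.

```python
def parse(version):
    """Turn a dotted version string into a comparable tuple of ints."""
    return tuple(int(part) for part in version.split("."))


def in_range(version, version_range):
    """Check ``version`` against a "min:max" range with optional ends."""
    low, _, high = version_range.partition(":")
    current = parse(version)
    if low and parse(low) > current:
        return False
    if high and parse(high) < current:
        return False
    return True


NEHALEM_GCC = [
    {"versions": "4.9:", "flags": "-march={name} -mtune={name}"},
    {"versions": "4.6:4.8.5", "name": "corei7",
     "flags": "-march={name} -mtune={name}"},
]


def flags_for(target, entries, version):
    """Return the flags of the first entry whose range matches."""
    for entry in entries:
        if in_range(version, entry["versions"]):
            # Fall back to the target name if the compiler uses the same one
            return entry["flags"].format(name=entry.get("name", target))
    raise ValueError("no supported compiler version for " + target)


print(flags_for("nehalem", NEHALEM_GCC, "9.1.0"))
print(flags_for("nehalem", NEHALEM_GCC, "4.8.5"))
```

Older GCC releases know the same microarchitecture under a different
name, which is why an entry may override "name" for a range of versions.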

@@ -0,0 +1,832 @@
{
"microarchitectures": {
"x86": {
"from": null,
"vendor": "generic",
"features": []
},
"i686": {
"from": "x86",
"vendor": "GenuineIntel",
"features": []
},
"pentium2": {
"from": "i686",
"vendor": "GenuineIntel",
"features": [
"mmx"
]
},
"pentium3": {
"from": "pentium2",
"vendor": "GenuineIntel",
"features": [
"mmx",
"sse"
]
},
"pentium4": {
"from": "pentium3",
"vendor": "GenuineIntel",
"features": [
"mmx",
"sse",
"sse2"
]
},
"prescott": {
"from": "pentium4",
"vendor": "GenuineIntel",
"features": [
"mmx",
"sse",
"sse2",
"sse3"
]
},
"x86_64": {
"from": null,
"vendor": "generic",
"features": [],
"compilers": {
"gcc": {
"versions": "4:",
"name": "x86-64",
"flags": "-march={name} -mtune={name}"
}
}
},
"nocona": {
"from": "x86_64",
"vendor": "GenuineIntel",
"features": [
"mmx",
"sse",
"sse2",
"sse3"
],
"compilers": {
"gcc": {
"versions": "4:",
"flags": "-march={name} -mtune={name}"
}
}
},
"core2": {
"from": "nocona",
"vendor": "GenuineIntel",
"features": [
"mmx",
"sse",
"sse2",
"ssse3"
],
"compilers": {
"gcc": {
"versions": "4:",
"flags": "-march={name} -mtune={name}"
}
}
},
"nehalem": {
"from": "core2",
"vendor": "GenuineIntel",
"features": [
"mmx",
"sse",
"sse2",
"ssse3",
"sse4_1",
"sse4_2",
"popcnt"
],
"compilers": {
"gcc": [
{
"versions": "4.9:",
"flags": "-march={name} -mtune={name}"
},
{
"versions": "4.6:4.8.5",
"name": "corei7",
"flags": "-march={name} -mtune={name}"
}
]
}
},
"westmere": {
"from": "nehalem",
"vendor": "GenuineIntel",
"features": [
"mmx",
"sse",
"sse2",
"ssse3",
"sse4_1",
"sse4_2",
"popcnt",
"aes",
"pclmulqdq"
],
"compilers": {
"gcc": {
"versions": "4.9:",
"flags": "-march={name} -mtune={name}"
}
}
},
"sandybridge": {
"from": "westmere",
"vendor": "GenuineIntel",
"features": [
"mmx",
"sse",
"sse2",
"ssse3",
"sse4_1",
"sse4_2",
"popcnt",
"aes",
"pclmulqdq",
"avx"
],
"compilers": {
"gcc": [
{
"versions": "4.9:",
"flags": "-march={name} -mtune={name}"
},
{
"versions": "4.6:4.8.5",
"name": "corei7-avx",
"flags": "-march={name} -mtune={name}"
}
]
}
},
"ivybridge": {
"from": "sandybridge",
"vendor": "GenuineIntel",
"features": [
"mmx",
"sse",
"sse2",
"ssse3",
"sse4_1",
"sse4_2",
"popcnt",
"aes",
"pclmulqdq",
"avx",
"rdrand",
"f16c"
],
"compilers": {
"gcc": [
{
"versions": "4.9:",
"flags": "-march={name} -mtune={name}"
},
{
"versions": ":4.8.5",
"name": "core-avx-i",
"flags": "-march={name} -mtune={name}"
}
]
}
},
"haswell": {
"from": "ivybridge",
"vendor": "GenuineIntel",
"features": [
"mmx",
"sse",
"sse2",
"ssse3",
"sse4_1",
"sse4_2",
"popcnt",
"aes",
"pclmulqdq",
"avx",
"rdrand",
"f16c",
"movbe",
"fma",
"avx2",
"bmi1",
"bmi2"
],
"compilers": {
"gcc": [
{
"versions": "4.9:",
"flags": "-march={name} -mtune={name}"
},
{
"versions": ":4.8.5",
"name": "core-avx2",
"flags": "-march={name} -mtune={name}"
}
]
}
},
"broadwell": {
"from": "haswell",
"vendor": "GenuineIntel",
"features": [
"mmx",
"sse",
"sse2",
"ssse3",
"sse4_1",
"sse4_2",
"popcnt",
"aes",
"pclmulqdq",
"avx",
"rdrand",
"f16c",
"movbe",
"fma",
"avx2",
"bmi1",
"bmi2",
"rdseed",
"adx"
],
"compilers": {
"gcc": {
"versions": "4.9:",
"flags": "-march={name} -mtune={name}"
}
}
},
"skylake": {
"from": "broadwell",
"vendor": "GenuineIntel",
"features": [
"mmx",
"sse",
"sse2",
"ssse3",
"sse4_1",
"sse4_2",
"popcnt",
"aes",
"pclmulqdq",
"avx",
"rdrand",
"f16c",
"movbe",
"fma",
"avx2",
"bmi1",
"bmi2",
"rdseed",
"adx",
"clflushopt",
"xsavec",
"xsaveopt"
],
"compilers": {
"gcc": {
"versions": "5.3:",
"flags": "-march={name} -mtune={name}"
}
}
},
"skylake_avx512": {
"from": "skylake",
"vendor": "GenuineIntel",
"features": [
"mmx",
"sse",
"sse2",
"ssse3",
"sse4_1",
"sse4_2",
"popcnt",
"aes",
"pclmulqdq",
"avx",
"rdrand",
"f16c",
"movbe",
"fma",
"avx2",
"bmi1",
"bmi2",
"rdseed",
"adx",
"clflushopt",
"xsavec",
"xsaveopt",
"avx512f",
"clwb",
"avx512vl",
"avx512bw",
"avx512dq",
"avx512cd"
],
"compilers": {
"gcc": {
"name": "skylake-avx512",
"versions": "5.3:",
"flags": "-march={name} -mtune={name}"
}
}
},
"cannonlake": {
"from": "skylake",
"vendor": "GenuineIntel",
"features": [
"mmx",
"sse",
"sse2",
"ssse3",
"sse4_1",
"sse4_2",
"popcnt",
"aes",
"pclmulqdq",
"avx",
"rdrand",
"f16c",
"movbe",
"fma",
"avx2",
"bmi1",
"bmi2",
"rdseed",
"adx",
"clflushopt",
"xsavec",
"xsaveopt",
"avx512f",
"avx512vl",
"avx512bw",
"avx512dq",
"avx512cd",
"avx512vbmi",
"avx512ifma",
"sha",
"umip"
],
"compilers": {
"gcc": {
"versions": "8:",
"flags": "-march={name} -mtune={name}"
}
}
},
"cascadelake": {
"from": "skylake_avx512",
"vendor": "GenuineIntel",
"features": [
"mmx",
"sse",
"sse2",
"ssse3",
"sse4_1",
"sse4_2",
"popcnt",
"aes",
"pclmulqdq",
"avx",
"rdrand",
"f16c",
"movbe",
"fma",
"avx2",
"bmi1",
"bmi2",
"rdseed",
"adx",
"clflushopt",
"xsavec",
"xsaveopt",
"avx512f",
"clwb",
"avx512vl",
"avx512bw",
"avx512dq",
"avx512cd",
"avx512vnni"
],
"compilers": {
"gcc": {
"versions": "9:",
"flags": "-march={name} -mtune={name}"
}
}
},
"icelake": {
"from": [
"cascadelake",
"cannonlake"
],
"vendor": "GenuineIntel",
"features": [
"mmx",
"sse",
"sse2",
"ssse3",
"sse4_1",
"sse4_2",
"popcnt",
"aes",
"pclmulqdq",
"avx",
"rdrand",
"f16c",
"movbe",
"fma",
"avx2",
"bmi1",
"bmi2",
"rdseed",
"adx",
"clflushopt",
"xsavec",
"xsaveopt",
"avx512f",
"avx512vl",
"avx512bw",
"avx512dq",
"avx512cd",
"avx512vbmi",
"avx512ifma",
"sha",
"umip",
"clwb",
"rdpid",
"gfni",
"avx512vbmi2",
"avx512vpopcntdq",
"avx512bitalg",
"avx512vnni",
"vpclmulqdq",
"vaes"
],
"compilers": {
"gcc": {
"name": "icelake-client",
"versions": "8:",
"flags": "-march={name} -mtune={name}"
}
}
},
"barcelona": {
"from": "x86_64",
"vendor": "AuthenticAMD",
"features": [
"mmx",
"sse",
"sse2",
"sse4a",
"abm"
]
},
"bulldozer": {
"from": "barcelona",
"vendor": "AuthenticAMD",
"features": [
"mmx",
"sse",
"sse2",
"sse4a",
"abm",
"avx",
"xop",
"lwp",
"aes",
"pclmulqdq",
"cx16",
"ssse3",
"sse4_1",
"sse4_2"
],
"compilers": {
"gcc": {
"name": "bdver1",
"versions": "4.6:",
"flags": "-march={name} -mtune={name}"
}
}
},
"piledriver": {
"from": "bulldozer",
"vendor": "AuthenticAMD",
"features": [
"mmx",
"sse",
"sse2",
"sse4a",
"abm",
"avx",
"aes",
"pclmulqdq",
"cx16",
"ssse3",
"sse4_1",
"sse4_2",
"bmi1",
"f16c",
"fma"
],
"compilers": {
"gcc": {
"name": "bdver2",
"versions": "4.7:",
"flags": "-march={name} -mtune={name}"
}
}
},
"steamroller": {
"from": "piledriver",
"vendor": "AuthenticAMD",
"features": [
"mmx",
"sse",
"sse2",
"sse4a",
"abm",
"avx",
"aes",
"pclmulqdq",
"cx16",
"ssse3",
"sse4_1",
"sse4_2",
"bmi1",
"f16c",
"fma",
"fsgsbase"
],
"compilers": {
"gcc": {
"name": "bdver3",
"versions": "4.8:",
"flags": "-march={name} -mtune={name}"
}
}
},
"excavator": {
"from": "steamroller",
"vendor": "AuthenticAMD",
"features": [
"mmx",
"sse",
"sse2",
"sse4a",
"abm",
"avx",
"aes",
"pclmulqdq",
"cx16",
"ssse3",
"sse4_1",
"sse4_2",
"bmi1",
"f16c",
"fma",
"fsgsbase",
"bmi2",
"avx2",
"movbe"
],
"compilers": {
"gcc": {
"name": "bdver4",
"versions": "4.9:",
"flags": "-march={name} -mtune={name}"
}
}
},
"zen": {
"from": "excavator",
"vendor": "AuthenticAMD",
"features": [
"bmi1",
"bmi2",
"f16c",
"fma",
"fsgsbase",
"avx",
"avx2",
"rdseed",
"clzero",
"aes",
"pclmulqdq",
"cx16",
"movbe",
"mmx",
"sse",
"sse2",
"sse4a",
"ssse3",
"sse4_1",
"sse4_2",
"abm",
"xsavec",
"xsaveopt",
"clflushopt",
"popcnt"
],
"compilers": {
"gcc": {
"name": "znver1",
"versions": "6:",
"flags": "-march={name} -mtune={name}"
}
}
},
"zen2": {
"from": "zen",
"vendor": "AuthenticAMD",
"features": [
"bmi1",
"bmi2",
"f16c",
"fma",
"fsgsbase",
"avx",
"avx2",
"rdseed",
"clzero",
"aes",
"pclmulqdq",
"cx16",
"movbe",
"mmx",
"sse",
"sse2",
"sse4a",
"ssse3",
"sse4_1",
"sse4_2",
"abm",
"xsavec",
"xsaveopt",
"clflushopt",
"popcnt",
"clwb"
],
"compilers": {
"gcc": {
"name": "znver2",
"versions": "9:",
"flags": "-march={name} -mtune={name}"
}
}
},
"ppc64": {
"from": null,
"vendor": "generic",
"features": [],
"compilers": {
"gcc": {
"versions": "4:",
"flags": "-mcpu={name} -mtune={name}"
}
}
},
"power7": {
"from": "ppc64",
"vendor": "IBM",
"generation": 7,
"features": [],
"compilers": {
"gcc": {
"versions": "4.5:",
"flags": "-mcpu={name} -mtune={name}"
}
}
},
"power8": {
"from": "power7",
"vendor": "IBM",
"generation": 8,
"features": [],
"compilers": {
"gcc": [
{
"versions": "4.9:",
"flags": "-mcpu={name} -mtune={name}"
},
{
"versions": "4.8:4.8.5",
"warnings": "Using GCC 4.8 to optimize for Power 8 might not work if you are not on Red Hat Enterprise Linux 7, where a custom backport of the feature has been done. Upstream support from GCC starts in version 4.9",
"flags": "-mcpu={name} -mtune={name}"
}
]
}
},
"power9": {
"from": "power8",
"vendor": "IBM",
"generation": 9,
"features": [],
"compilers": {
"gcc": {
"versions": "6:",
"flags": "-mcpu={name} -mtune={name}"
}
}
},
"ppc64le": {
"from": null,
"vendor": "generic",
"features": [],
"compilers": {
"gcc": {
"versions": "4:",
"flags": "-mcpu={name} -mtune={name}"
}
}
},
"power8le": {
"from": "ppc64le",
"vendor": "IBM",
"generation": 8,
"features": [],
"compilers": {
"gcc": [
{
"versions": "4.9:",
"name": "power8",
"flags": "-mcpu={name} -mtune={name}"
},
{
"versions": "4.8:4.8.5",
"warnings": "Using GCC 4.8 to optimize for Power 8 might not work if you are not on Red Hat Enterprise Linux 7, where a custom backport of the feature has been done. Upstream support from GCC starts in version 4.9",
"name": "power8",
"flags": "-mcpu={name} -mtune={name}"
}
]
}
},
"power9le": {
"from": "power8le",
"vendor": "IBM",
"generation": 9,
"features": [],
"compilers": {
"gcc": {
"name": "power9",
"versions": "6:",
"flags": "-mcpu={name} -mtune={name}"
}
}
},
"aarch64": {
"from": null,
"vendor": "generic",
"features": [],
"compilers": {
"gcc": {
"versions": "4:",
"flags": "-march=armv8-a -mtune=generic"
}
}
}
},
"feature_aliases": {
"sse3": {
"reason": "ssse3 is a superset of sse3 and might be the only one listed",
"any_of": [
"ssse3"
]
},
"avx512": {
"reason": "avx512 indicates generic support for any of the avx512 instruction sets",
"any_of": [
"avx512f",
"avx512vl",
"avx512bw",
"avx512dq",
"avx512cd"
]
},
"altivec": {
"reason": "altivec is supported by Power PC architectures, but might not be listed in features",
"families": [
"ppc64le",
"ppc64"
]
},
"sse4.1": {
"reason": "permits to refer to sse4_1 also as sse4.1",
"any_of": [
"sse4_1"
]
},
"sse4.2": {
"reason": "permits to refer to sse4_2 also as sse4.2",
"any_of": [
"sse4_2"
]
}
}
}
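
The way the `from` and `compilers` fields above fit together can be sketched as follows. This is only an illustration, not the `llnl.util.cpu` implementation: `TABLE` is a hand-copied excerpt of three entries, the helper names `parents`, `ancestors` and `gcc_flags` are hypothetical, version-range matching is left out (the first GCC entry is used), and falling back to the target's own name when no `name` key is present is an assumption.

```python
# Hypothetical excerpt mirroring three entries of microarchitectures.json.
TABLE = {
    "ppc64le": {
        "from": None,
        "compilers": {"gcc": {"versions": "4:",
                              "flags": "-mcpu={name} -mtune={name}"}},
    },
    "power8le": {
        "from": "ppc64le",
        "compilers": {"gcc": [
            {"versions": "4.9:", "name": "power8",
             "flags": "-mcpu={name} -mtune={name}"},
        ]},
    },
    "power9le": {
        "from": "power8le",
        "compilers": {"gcc": {"name": "power9", "versions": "6:",
                              "flags": "-mcpu={name} -mtune={name}"}},
    },
}


def parents(entry):
    """Normalize "from" (null, a string, or a list) into a list of names."""
    value = entry["from"]
    if value is None:
        return []
    if isinstance(value, str):
        return [value]
    return list(value)


def ancestors(name, table):
    """Return every ancestor of ``name``, nearest first."""
    result = []
    queue = parents(table[name])
    while queue:
        current = queue.pop(0)
        if current not in result:
            result.append(current)
            queue.extend(parents(table[current]))
    return result


def gcc_flags(name, table):
    """Render the GCC flag template for ``name`` (version checks omitted)."""
    entry = table[name]["compilers"]["gcc"]
    if isinstance(entry, list):
        # Assumption: take the first entry instead of matching "versions".
        entry = entry[0]
    # Assumption: without a "name" key, the target's own name is substituted.
    return entry["flags"].format(name=entry.get("name", name))


print(ancestors("power9le", TABLE))   # ['power8le', 'ppc64le']
print(gcc_flags("power9le", TABLE))   # -mcpu=power9 -mtune=power9
```

Under these assumptions, a binary built for `power8le` would be a reuse candidate on a `power9le` host because `power8le` appears among the host's ancestors.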


@@ -0,0 +1,133 @@
# Copyright 2013-2019 Lawrence Livermore National Security, LLC and other
# Spack Project Developers. See the top-level COPYRIGHT file for details.
#
# SPDX-License-Identifier: (Apache-2.0 OR MIT)
import json
import os.path
try:
    from collections.abc import MutableMapping
except ImportError:
    from collections import MutableMapping
compilers_schema = {
    'type': 'object',
    'properties': {
        'versions': {'type': 'string'},
        'name': {'type': 'string'},
        'flags': {'type': 'string'}
    },
    'required': ['versions', 'flags']
}
properties = {
    'microarchitectures': {
        'type': 'object',
        'patternProperties': {
            r'([\w]*)': {
                'type': 'object',
                'properties': {
                    'from': {
                        'anyOf': [
                            # More than one parent
                            {'type': 'array', 'items': {'type': 'string'}},
                            # Exactly one parent
                            {'type': 'string'},
                            # No parent
                            {'type': 'null'}
                        ]
                    },
                    'vendor': {
                        'type': 'string'
                    },
                    'features': {
                        'type': 'array',
                        'items': {'type': 'string'}
                    },
                    'compilers': {
                        'type': 'object',
                        'patternProperties': {
                            r'([\w]*)': {
                                'anyOf': [
                                    compilers_schema,
                                    {
                                        'type': 'array',
                                        'items': compilers_schema
                                    }
                                ]
                            }
                        }
                    }
                },
                'required': ['from', 'vendor', 'features']
            }
        }
    },
    'feature_aliases': {
        'type': 'object',
        'patternProperties': {
            r'([\w]*)': {
                'type': 'object',
                'properties': {},
                'additionalProperties': False
            }
        },
    }
}
schema = {
    '$schema': 'http://json-schema.org/schema#',
    'title': 'Schema for microarchitecture definitions and feature aliases',
    'type': 'object',
    'additionalProperties': False,
    'properties': properties,
}
class LazyDictionary(MutableMapping):
    """Lazy dictionary that gets constructed on first access to any object key

    Args:
        factory (callable): factory function to construct the dictionary
    """

    def __init__(self, factory, *args, **kwargs):
        self.factory = factory
        self.args = args
        self.kwargs = kwargs
        self._data = None

    @property
    def data(self):
        if self._data is None:
            self._data = self.factory(*self.args, **self.kwargs)
        return self._data

    def __getitem__(self, key):
        return self.data[key]

    def __setitem__(self, key, value):
        self.data[key] = value

    def __delitem__(self, key):
        del self.data[key]

    def __iter__(self):
        return iter(self.data)

    def __len__(self):
        return len(self.data)
def _load_targets_json():
    """Loads ``microarchitectures.json`` in memory."""
    directory_name = os.path.dirname(os.path.abspath(__file__))
    filename = os.path.join(directory_name, 'microarchitectures.json')
    with open(filename, 'r') as f:
        return json.load(f)
#: In memory representation of the data in microarchitectures.json,
#: loaded on first access
targets_json = LazyDictionary(_load_targets_json)
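
A small sketch of the lazy-loading behaviour described above: the factory should run only on first access and never again. The class body is a condensed copy of `LazyDictionary` so that the snippet runs on its own, and `fake_loader` is a hypothetical stand-in for `_load_targets_json` that records how often it is called.

```python
from collections.abc import MutableMapping


class LazyDictionary(MutableMapping):
    """Condensed copy of the class above, repeated so the sketch runs alone."""

    def __init__(self, factory, *args, **kwargs):
        self.factory, self.args, self.kwargs = factory, args, kwargs
        self._data = None

    @property
    def data(self):
        # Build the underlying dict only once, on first use.
        if self._data is None:
            self._data = self.factory(*self.args, **self.kwargs)
        return self._data

    def __getitem__(self, key):
        return self.data[key]

    def __setitem__(self, key, value):
        self.data[key] = value

    def __delitem__(self, key):
        del self.data[key]

    def __iter__(self):
        return iter(self.data)

    def __len__(self):
        return len(self.data)


calls = []


def fake_loader():
    """Hypothetical factory; stands in for reading the JSON file."""
    calls.append(1)
    return {"microarchitectures": {"zen": {"from": "excavator"}}}


targets = LazyDictionary(fake_loader)
print(len(calls))                     # 0: nothing has been loaded yet
parent = targets["microarchitectures"]["zen"]["from"]
size = len(targets)
print(parent, size, len(calls))       # excavator 1 1
```

Importing the module therefore costs nothing until some caller actually looks up a key.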


@@ -1,226 +0,0 @@
# Copyright 2013-2019 Lawrence Livermore National Security, LLC and other
# Spack Project Developers. See the top-level COPYRIGHT file for details.
#
# SPDX-License-Identifier: (Apache-2.0 OR MIT)
import platform
import re
import subprocess
import sys
# Tuple of name, flags added, flags removed (default [])
_intel_32 = [
('i686', []),
('pentium2', ['mmx']),
('pentium3', ['sse']),
('pentium4', ['sse2']),
('prescott', ['sse3']),
]
_intel_64 = [ # commenting out the ones that aren't shown through sysctl
('nocona', ['mmx', 'sse', 'sse2', 'sse3']),#lm
('core2', ['ssse3'], ['sse3']),
('nehalem', ['sse4_1', 'sse4_2', 'popcnt']),
('westmere', ['aes', 'pclmulqdq']),
('sandybridge', ['avx']),
('ivybridge', ['rdrand', 'f16c']),#fsgsbase (is it RDWRFSGS on darwin?)
('haswell', ['movbe', 'fma', 'avx2', 'bmi1', 'bmi2']),
('broadwell', ['rdseed', 'adx']),
('skylake', ['xsavec', 'xsaves'])
]
# We will need to build on these and combine with names when intel releases
# further avx512 processors.
# _intel_avx512 = ['avx512f', 'avx512cd']
_amd_10_names = [
('barcelona', ['mmx', 'sse', 'sse2', 'sse3', 'sse4a', 'abm'])
]
_amd_14_names = [
('btver1', ['mmx', 'sse', 'sse2', 'sse3', 'ssse3', 'sse4a', 'cx16',
'abm']),#lm
]
_amd_15_names = [
('bdver1', ['avx', 'aes', 'pclmulqdq', 'cx16', 'sse', 'sse2', 'sse3',
'ssse3', 'sse4a', 'sse4_1', 'sse4_2', 'abm']),#xop, lwp
('bdver2', ['bmi1', 'f16c', 'fma',]),#tba?
('bdver3', ['fsgsbase']),
('bdver4', ['bmi2', 'movbe', 'avx2'])
]
_amd_16_names = [
('btver2', ['mmx', 'sse', 'sse2', 'sse3', 'ssse3', 'sse4a', 'cx16',
'abm', 'movbe', 'f16c', 'bmi1', 'avx', 'pclmulqdq',
'aes', 'sse4_1', 'sse4_2']),#lm
]
_amd_17_names = [
('znver1', ['bmi1', 'bmi2', 'f16c', 'fma', 'fsgsbase', 'avx', 'avx2',
'rdseed', 'mwaitx', 'clzero', 'aes', 'pclmulqdq', 'cx16',
'movbe', 'mmx', 'sse', 'sse2', 'sse3', 'ssse3', 'sse4a',
'sse4_1', 'sse4_2', 'abm', 'xsavec', 'xsaves',
'clflushopt', 'popcnt', 'adcx'])
]
_amd_numbers = {
0x10: _amd_10_names,
0x14: _amd_14_names,
0x15: _amd_15_names,
0x16: _amd_16_names,
0x17: _amd_17_names
}
def supported_target_names():
    intel_names = set(t[0] for t in _intel_64)
    intel_names |= set(t[0] for t in _intel_32)
    amd_names = set()
    for family in _amd_numbers:
        amd_names |= set(t[0] for t in _amd_numbers[family])
    power_names = set('power' + str(d) for d in range(7, 10))
    return intel_names | amd_names | power_names
def create_dict_from_cpuinfo():
    # Initialize cpuinfo from file
    cpuinfo = {}
    try:
        with open('/proc/cpuinfo') as file:
            text = file.readlines()
            for line in text:
                if line.strip():
                    key, _, value = line.partition(':')
                    cpuinfo[key.strip()] = value.strip()
    except IOError:
        return None
    return cpuinfo
def check_output(args):
    if sys.version_info >= (3, 0):
        # PIPE lives in the subprocess module; it is not imported on its own
        return subprocess.run(args, check=True, stdout=subprocess.PIPE).stdout  # nopyqver
    else:
        return subprocess.check_output(args)  # nopyqver
def create_dict_from_sysctl():
    cpuinfo = {}
    try:
        cpuinfo['vendor_id'] = check_output(['sysctl', '-n',
                                             'machdep.cpu.vendor']).strip()
        cpuinfo['flags'] = check_output(['sysctl', '-n',
                                         'machdep.cpu.features']).strip().lower()
        cpuinfo['flags'] += ' ' + check_output(['sysctl', '-n',
                                                'machdep.cpu.leaf7_features']).strip().lower()
        cpuinfo['model'] = check_output(['sysctl', '-n',
                                         'machdep.cpu.model']).strip()
        cpuinfo['model name'] = check_output(['sysctl', '-n',
                                              'machdep.cpu.brand_string']).strip()
        # Super hacky way to deal with slight representation differences
        # Would be better to somehow consider these "identical"
        if 'sse4.1' in cpuinfo['flags']:
            cpuinfo['flags'] += ' sse4_1'
        if 'sse4.2' in cpuinfo['flags']:
            cpuinfo['flags'] += ' sse4_2'
        if 'avx1.0' in cpuinfo['flags']:
            cpuinfo['flags'] += ' avx'
    except Exception:
        # Best effort: return whatever was collected before the failure
        pass
    return cpuinfo
def get_cpu_name():
    name = get_cpu_name_helper(platform.system())
    return name if name else platform.machine()
def get_cpu_name_helper(system):
    # TODO: Elsewhere create dict of codenames (targets) and flag sets.
    # Return cpu name or an empty string if one cannot be determined.
    cpuinfo = {}
    if system == 'Linux':
        cpuinfo = create_dict_from_cpuinfo()
    elif system == 'Darwin':
        cpuinfo = create_dict_from_sysctl()
    if not cpuinfo:
        return ''
    if 'vendor_id' in cpuinfo and cpuinfo['vendor_id'] == 'GenuineIntel':
        if 'model name' not in cpuinfo or 'flags' not in cpuinfo:
            # We don't have the information we need to determine the
            # microarchitecture name
            return ''
        return get_intel_cpu_name(cpuinfo)
    elif 'vendor_id' in cpuinfo and cpuinfo['vendor_id'] == 'AuthenticAMD':
        if 'cpu family' not in cpuinfo or 'flags' not in cpuinfo:
            # We don't have the information we need to determine the
            # microarchitecture name
            return ''
        return get_amd_cpu_name(cpuinfo)
    elif 'cpu' in cpuinfo and 'POWER' in cpuinfo['cpu']:
        return get_ibm_cpu_name(cpuinfo['cpu'])
    else:
        return ''
def get_ibm_cpu_name(cpu):
    # Raw string so that \d is not treated as an invalid string escape
    power_pattern = re.compile(r'POWER(\d+)')
    power_match = power_pattern.search(cpu)
    if power_match:
        if 'le' in platform.machine():
            return 'power' + power_match.group(1) + 'le'
        return 'power' + power_match.group(1)
    else:
        return ''
def get_intel_cpu_name(cpuinfo):
    model_name = cpuinfo['model name']
    if 'Atom' in model_name:
        return 'atom'
    elif 'Quark' in model_name:
        return 'quark'
    elif 'Xeon' in model_name and 'Phi' in model_name:
        # This is hacky and needs to be extended for newer avx512 chips
        return 'knl'
    else:
        ret = ''
        flag_list = cpuinfo['flags'].split()
        proc_flags = []
        for _intel_processors in [_intel_32, _intel_64]:
            for entry in _intel_processors:
                try:
                    proc, flags_added, flags_removed = entry
                except ValueError:
                    proc, flags_added = entry
                    flags_removed = []
                proc_flags = list(filter(
                    lambda x: x not in flags_removed, proc_flags))
                proc_flags.extend(flags_added)
                if all(f in flag_list for f in proc_flags):
                    ret = proc
        return ret
def get_amd_cpu_name(cpuinfo):
    # TODO: Learn what the "canonical" granularity of naming
    # is for AMD processors, implement dict as for intel.
    ret = ''
    flag_list = cpuinfo['flags'].split()
    model_number = int(cpuinfo['cpu family'])
    flags_dict = _amd_numbers[model_number]
    proc_flags = []
    for proc, proc_flags_added in flags_dict:
        proc_flags.extend(proc_flags_added)
        if all(f in flag_list for f in proc_flags):
            ret = proc
        else:
            break
    return ret
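
The detection functions above walk an ordered table and accumulate the flags each generation adds (and, for Intel, drops), keeping the newest name whose accumulated flags are all present on the host. A minimal standalone sketch of that idea, using the first five rows of `_intel_64`; the helper name `newest_match` and the host flag set are hypothetical:

```python
# First five rows of the _intel_64 table: (name, flags added[, flags removed]).
INTEL_64 = [
    ('nocona', ['mmx', 'sse', 'sse2', 'sse3']),
    ('core2', ['ssse3'], ['sse3']),
    ('nehalem', ['sse4_1', 'sse4_2', 'popcnt']),
    ('westmere', ['aes', 'pclmulqdq']),
    ('sandybridge', ['avx']),
]


def newest_match(table, host_flags):
    """Return the last entry whose accumulated flags are all on the host."""
    required = []
    best = ''
    for entry in table:
        name, added = entry[0], entry[1]
        removed = entry[2] if len(entry) > 2 else []
        # Drop flags this generation no longer requires, then add the new ones.
        required = [f for f in required if f not in removed] + added
        if all(f in host_flags for f in required):
            best = name
    return best


# Hypothetical host: reports ssse3 but not sse3, and has no aes or avx.
host = {'mmx', 'sse', 'sse2', 'ssse3', 'sse4_1', 'sse4_2', 'popcnt'}
print(newest_match(INTEL_64, host))   # nehalem
```

Here `nocona` does not match because the host does not list `sse3`, but `core2` removes that requirement, so matching resumes and ends at `nehalem`. Unlike `get_amd_cpu_name`, this loop (like the Intel one) does not stop at the first failed generation.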
"""IDEA: In build_environment.setup_compiler_environment, include a
call to compiler.tuning_flags(spec.architecture.target). For gcc this
would return "-march=%s" % str(spec.architecture.target). We only call
this if the target is a valid tuning target (i.e. not
platform.machine(), but a more specific target we successfully
discovered).
Then set
SPACK_TUNING_FLAGS=compiler.tuning_flags(spec.architecture.target)
This way the compiler wrapper can just add $SPACK_TUNING_FLAGS to the
eventual command."""
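
The idea in the note above could look roughly like the following. This is a sketch only: `tuning_flags` is written as a free function instead of a compiler method, the `GENERIC_TARGETS` tuple is a hypothetical stand-in for "whatever `platform.machine()` would report", and the environment is a plain dict instead of Spack's build environment.

```python
# Hypothetical list of generic family names that should not be tuned for.
GENERIC_TARGETS = ('x86_64', 'ppc64', 'ppc64le', 'aarch64')


def tuning_flags(target):
    """Return GCC-style tuning flags, or '' when the target is generic."""
    if target in GENERIC_TARGETS:
        return ''
    return "-march=%s" % str(target)


def setup_compiler_environment(env, target):
    """Export the flags so a compiler wrapper could append them."""
    flags = tuning_flags(target)
    if flags:
        env['SPACK_TUNING_FLAGS'] = flags
    return env


env = setup_compiler_environment({}, 'haswell')
print(env)   # {'SPACK_TUNING_FLAGS': '-march=haswell'}
```

For a generic target the variable is simply left unset, so the wrapper would add nothing to the command line.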