targets: Spack targets can now be fine-grained microarchitectures
Spack can now:

- Label ppc64, ppc64le, x86_64, etc. builds with microarchitecture-specific names, like 'haswell', 'skylake' or 'icelake'.
- Detect the host architecture of a machine from /proc/cpuinfo or similar tools.
- Understand which microarchitectures are compatible with which (for binary reuse).
- Understand which compiler flags are needed (for GCC, so far) to build binaries for particular microarchitectures.

All of this is managed through a JSON file (microarchitectures.json) that contains detailed auto-detection, compiler flag, and compatibility information for specific microarchitecture targets. The `llnl.util.cpu` module implements a library that allows detection and comparison of microarchitectures based on the data in this file.

The `target` part of Spack specs is now essentially a Microarchitecture object, and Specs' targets can be compared for compatibility as well. This allows us to label optimized binary packages at a granularity that enables them to be reused on compatible machines. Previously, we only knew that a package was built for x86_64, NOT which x86_64 machines it was usable on.

Currently this feature supports Intel, Power, and AMD chips. Support for ARM is forthcoming.

Specifics:

- Add microarchitectures.json with descriptions of architectures.
- Relax the semantics of the compiler's "target" attribute. Before this change, a compiler was considered viable for a given target only on an exact match. This made sense as long as the finest granularity of targets was the architecture family. Now that we can target micro-architectures, this commit interprets what is stored in the compiler's "target" attribute as the architecture family. A compiler is then a viable choice if the target being concretized belongs to the same family. Similarly, when a new compiler is detected, the architecture family is stored in the "target" attribute.
- Make Spack's `cc` compiler wrapper inject target-specific flags on the command line.
- Update architecture concretization to use the same algorithm as compiler concretization.
- Include micro-architecture features, vendor, generation etc. in the package hash. Generic architectures, such as x86_64 or ppc64, are still dumped using the name only.
- If the compiler for a target is not supported, exit with an intelligible error message. If the compiler support is unknown, don't try to use optimization flags.
- Support and define feature aliases (e.g., sse3 -> ssse3) in microarchitectures.json and on Microarchitecture objects. Feature aliases are defined in microarchitectures.json and map a name (the "alias") to a list of rules that must be met for the test to be successful. The rules that are available can be extended later using a decorator.
- Implement subset semantics for comparing microarchitectures (treat microarchitectures as a partial order, i.e. (a < b), (a == b) and (b < a) can all be false).
- Implement logic to automatically demote the default target if the compiler being used is too old to optimize for it. Updated docs to make this behavior explicit. This avoids surprising the user if the default compiler is older than the host architecture.

This commit adds unit tests to verify the semantics of target ranges and target lists in constraints. The implementation to allow target ranges and lists is minimal and doesn't add any new type. A more careful refactor that takes into account the type system might be due later.

Co-authored-by: Gregory Becker <becker33.llnl.gov>
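The subset semantics described in the commit message can be pictured with a small, self-contained sketch. The `Node` class and the shortened toy hierarchy below are hypothetical stand-ins chosen for illustration; they are not the `Microarchitecture` class added by this commit.

```python
class Node(object):
    """Hypothetical stand-in for a microarchitecture in a DAG."""

    def __init__(self, name, parents=()):
        self.name = name
        self.parents = list(parents)

    def closure(self):
        """Names of this node and of all of its ancestors."""
        names = {self.name}
        for parent in self.parents:
            names |= parent.closure()
        return names

    def __lt__(self, other):
        # Proper-subset test: true when self is an ancestor of other
        return self.closure() < other.closure()


# Toy hierarchy, shortened on purpose
x86_64 = Node('x86_64')
haswell = Node('haswell', [x86_64])
zen = Node('zen', [x86_64])

ordered = x86_64 < haswell
# Two siblings are neither smaller, nor larger, nor equal
incomparable = not (haswell < zen) and not (zen < haswell)
print(ordered, incomparable)
```

With a total order one of `a < b`, `a == b` or `b < a` would always hold; comparing ancestor sets instead leaves unrelated targets incomparable, which is what allows binary reuse to be decided safely.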
Committed by: Todd Gamblin
Parent: dfabf5d6b1
Commit: 3c4322bf1a
16
lib/spack/llnl/util/cpu/__init__.py
Normal file
@@ -0,0 +1,16 @@
# Copyright 2013-2019 Lawrence Livermore National Security, LLC and other
# Spack Project Developers. See the top-level COPYRIGHT file for details.
#
# SPDX-License-Identifier: (Apache-2.0 OR MIT)

from .microarchitecture import Microarchitecture, UnsupportedMicroarchitecture
from .microarchitecture import targets, generic_microarchitecture
from .detect import host

__all__ = [
    'Microarchitecture',
    'UnsupportedMicroarchitecture',
    'targets',
    'generic_microarchitecture',
    'host'
]
102
lib/spack/llnl/util/cpu/alias.py
Normal file
@@ -0,0 +1,102 @@
# Copyright 2013-2019 Lawrence Livermore National Security, LLC and other
# Spack Project Developers. See the top-level COPYRIGHT file for details.
#
# SPDX-License-Identifier: (Apache-2.0 OR MIT)

from .schema import targets_json, LazyDictionary, properties

#: Known predicates that can be used to construct feature aliases
_feature_alias_predicate = {}


class FeatureAliasTest(object):
    """A test that must be passed for a feature alias to succeed.

    Args:
        rules (dict): dictionary of rules to be met. Each key must be a
            valid alias predicate
    """
    def __init__(self, rules):
        self.rules = rules
        self.predicates = []
        for name, args in rules.items():
            self.predicates.append(_feature_alias_predicate[name](args))

    def __call__(self, microarchitecture):
        return all(
            feature_test(microarchitecture) for feature_test in self.predicates
        )


def _feature_aliases():
    """Returns the dictionary of all defined feature aliases."""
    json_data = targets_json['feature_aliases']
    aliases = {}
    for alias, rules in json_data.items():
        aliases[alias] = FeatureAliasTest(rules)
    return aliases


feature_aliases = LazyDictionary(_feature_aliases)


def alias_predicate(predicate_schema):
    """Decorator to register a predicate that can be used to define
    feature aliases.

    Args:
        predicate_schema (dict): schema to be enforced in
            microarchitectures.json for the predicate
    """
    def decorator(func):
        name = func.__name__

        # Check we didn't register anything else with the same name
        if name in _feature_alias_predicate:
            msg = 'the alias predicate "{0}" already exists'.format(name)
            raise KeyError(msg)

        # Update the overall schema
        alias_schema = properties['feature_aliases']['patternProperties']
        alias_schema[r'([\w]*)']['properties'].update(
            {name: predicate_schema}
        )
        # Register the predicate
        _feature_alias_predicate[name] = func

        return func
    return decorator


@alias_predicate(predicate_schema={'type': 'string'})
def reason(motivation_for_the_alias):
    """This predicate always returns True and it's there to allow writing
    a documentation string in the JSON file to explain why an alias is needed.
    """
    return lambda x: True


@alias_predicate(predicate_schema={
    'type': 'array',
    'items': {'type': 'string'}
})
def any_of(list_of_features):
    """Returns a predicate that is True if any of the features in the
    list is in the microarchitecture being tested, False otherwise.
    """
    def _impl(microarchitecture):
        return any(x in microarchitecture for x in list_of_features)
    return _impl


@alias_predicate(predicate_schema={
    'type': 'array',
    'items': {'type': 'string'}
})
def families(list_of_families):
    """Returns a predicate that is True if the architecture family of
    the microarchitecture being tested is in the list, False otherwise.
    """
    def _impl(microarchitecture):
        return str(microarchitecture.family) in list_of_families
    return _impl
216
lib/spack/llnl/util/cpu/detect.py
Normal file
@@ -0,0 +1,216 @@
# Copyright 2013-2019 Lawrence Livermore National Security, LLC and other
# Spack Project Developers. See the top-level COPYRIGHT file for details.
#
# SPDX-License-Identifier: (Apache-2.0 OR MIT)
import collections
import functools
import platform
import re
import subprocess
import sys

import six

from .microarchitecture import generic_microarchitecture, targets

#: Mapping from operating systems to chain of commands
#: to obtain a dictionary of raw info on the current cpu
info_factory = collections.defaultdict(list)

#: Mapping from micro-architecture families (x86_64, ppc64le, etc.) to
#: functions checking the compatibility of the host with a given target
compatibility_checks = {}


def info_dict(operating_system):
    """Decorator to mark functions that are meant to return raw info on
    the current cpu.

    Args:
        operating_system (str or tuple): operating system for which the marked
            function is a viable factory of raw info dictionaries.
    """
    def decorator(factory):
        info_factory[operating_system].append(factory)

        @functools.wraps(factory)
        def _impl():
            info = factory()

            # Check that info contains a few mandatory fields
            msg = 'field "{0}" is missing from raw info dictionary'
            assert 'vendor_id' in info, msg.format('vendor_id')
            assert 'flags' in info, msg.format('flags')
            assert 'model' in info, msg.format('model')
            # The factories in this module use the key "model name"
            assert 'model name' in info, msg.format('model name')

            return info

        return _impl

    return decorator


@info_dict(operating_system='Linux')
def proc_cpuinfo():
    """Returns a raw info dictionary by parsing the first entry of
    ``/proc/cpuinfo``
    """
    info = {}
    with open('/proc/cpuinfo') as file:
        for line in file:
            key, separator, value = line.partition(':')

            # If there's no separator and info was already populated
            # according to what's written here:
            #
            # http://www.linfo.org/proc_cpuinfo.html
            #
            # we are on a blank line separating two cpus. Exit early as
            # we want to read just the first entry in /proc/cpuinfo
            if separator != ':' and info:
                break

            info[key.strip()] = value.strip()
    return info


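def check_output(args):
    if sys.version_info[:2] == (2, 6):
        # Python 2.6 has neither subprocess.run nor subprocess.check_output
        process = subprocess.Popen(args, stdout=subprocess.PIPE)
        output = process.communicate()[0]
        if process.returncode != 0:
            raise subprocess.CalledProcessError(process.returncode, args)
    else:
        output = subprocess.check_output(args)  # nopyqver
    # Decode so that callers get text on both Python 2 and Python 3
    return output.decode('utf-8')

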
@info_dict(operating_system='Darwin')
def sysctl():
    """Returns a raw info dictionary parsing the output of sysctl."""

    info = {}
    info['vendor_id'] = check_output(
        ['sysctl', '-n', 'machdep.cpu.vendor']
    ).strip()
    info['flags'] = check_output(
        ['sysctl', '-n', 'machdep.cpu.features']
    ).strip().lower()
    info['flags'] += ' ' + check_output(
        ['sysctl', '-n', 'machdep.cpu.leaf7_features']
    ).strip().lower()
    info['model'] = check_output(
        ['sysctl', '-n', 'machdep.cpu.model']
    ).strip()
    info['model name'] = check_output(
        ['sysctl', '-n', 'machdep.cpu.brand_string']
    ).strip()

    # Super hacky way to deal with slight representation differences
    # Would be better to somehow consider these "identical"
    if 'sse4.1' in info['flags']:
        info['flags'] += ' sse4_1'
    if 'sse4.2' in info['flags']:
        info['flags'] += ' sse4_2'
    if 'avx1.0' in info['flags']:
        info['flags'] += ' avx'

    return info


def raw_info_dictionary():
    """Returns a dictionary with information on the cpu of the current host.

    This function calls all the viable factories one after the other until
    there's one that is able to produce the requested information.
    """
    info = {}
    for factory in info_factory[platform.system()]:
        try:
            info = factory()
        except Exception:
            pass

        if info:
            break

    return info


def compatible_microarchitectures(info):
    """Returns an unordered list of known micro-architectures that are
    compatible with the info dictionary passed as argument.

    Args:
        info (dict): dictionary containing information on the host cpu
    """
    architecture_family = platform.machine()
    # If a tester is not registered, be conservative and assume no known
    # target is compatible with the host
    tester = compatibility_checks.get(architecture_family, lambda x, y: False)
    return [x for x in targets.values() if tester(info, x)] or \
        [generic_microarchitecture(architecture_family)]


def host():
    """Detects the host micro-architecture and returns it."""
    # Retrieve a dictionary with raw information on the host's cpu
    info = raw_info_dictionary()

    # Get a list of possible candidates for this micro-architecture
    candidates = compatible_microarchitectures(info)

    # Reverse sort of the depth for the inheritance tree among only targets we
    # can use. This gets the newest target we satisfy.
    return sorted(candidates, key=lambda t: len(t.ancestors), reverse=True)[0]


def compatibility_check(architecture_family):
    """Decorator to register a function as a proper compatibility check.

    A compatibility check function takes the raw info dictionary as a first
    argument and an arbitrary target as the second argument. It returns True
    if the target is compatible with the info dictionary, False otherwise.

    Args:
        architecture_family (str or tuple): architecture family for which
            this test can be used, e.g. x86_64 or ppc64le etc.
    """
    # Turn the argument into something iterable
    if isinstance(architecture_family, six.string_types):
        architecture_family = (architecture_family,)

    def decorator(func):
        # TODO: on removal of Python 2.6 support this can be re-written as
        # TODO: an update + a dict comprehension
        for arch_family in architecture_family:
            compatibility_checks[arch_family] = func

        return func

    return decorator


@compatibility_check(architecture_family=('ppc64le', 'ppc64'))
def compatibility_check_for_power(info, target):
    basename = platform.machine()
    generation_match = re.search(r'POWER(\d+)', info.get('cpu', ''))
    # If the generation cannot be read from the raw info, no target
    # can be matched
    if generation_match is None:
        return False
    generation = int(generation_match.group(1))

    # We can use a target if it descends from our machine type and our
    # generation (9 for POWER9, etc) is at least its generation.
    arch_root = targets[basename]
    return (target == arch_root or arch_root in target.ancestors) \
        and target.generation <= generation


@compatibility_check(architecture_family='x86_64')
def compatibility_check_for_x86_64(info, target):
    basename = 'x86_64'
    vendor = info.get('vendor_id', 'generic')
    features = set(info.get('flags', '').split())

    # We can use a target if it descends from our machine type, is from our
    # vendor, and we have all of its features
    arch_root = targets[basename]
    return (target == arch_root or arch_root in target.ancestors) \
        and (target.vendor == vendor or target.vendor == 'generic') \
        and target.features.issubset(features)
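The detection flow in detect.py (read the first processor entry, keep the targets whose features the host provides, then pick the most specific one) can be exercised without a real `/proc/cpuinfo` using the sketch below. `SAMPLE`, `CANDIDATES`, `parse_first_cpu` and `best_match` are hypothetical names, and the candidate table is a toy subset with hand-written ancestor counts rather than data loaded from microarchitectures.json.

```python
SAMPLE = """\
processor : 0
vendor_id : GenuineIntel
flags : mmx sse sse2 ssse3 sse4_1 sse4_2 popcnt

processor : 1
vendor_id : GenuineIntel
flags : mmx
"""

# Toy table: name -> (required features, number of ancestors)
CANDIDATES = {
    'x86_64': (set(), 0),
    'core2': ({'mmx', 'sse', 'sse2', 'ssse3'}, 2),
    'nehalem': ({'mmx', 'sse', 'sse2', 'ssse3',
                 'sse4_1', 'sse4_2', 'popcnt'}, 3),
    'haswell': ({'mmx', 'sse', 'sse2', 'ssse3',
                 'sse4_1', 'sse4_2', 'popcnt', 'avx2'}, 7),
}


def parse_first_cpu(text):
    """Parse the first processor entry of /proc/cpuinfo-style text."""
    info = {}
    for line in text.splitlines():
        key, separator, value = line.partition(':')
        # A line without separator after some data ends the first entry
        if separator != ':' and info:
            break
        info[key.strip()] = value.strip()
    return info


def best_match(info):
    """Name of the deepest candidate whose features the host provides."""
    flags = set(info.get('flags', '').split())
    usable = [
        name for name, (needed, _) in CANDIDATES.items() if needed <= flags
    ]
    return max(usable, key=lambda name: CANDIDATES[name][1])


info = parse_first_cpu(SAMPLE)
print(info['processor'], best_match(info))
```

Ranking by the number of ancestors is what makes the most feature-rich compatible target win over its own parents, all of which are compatible as well.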
343
lib/spack/llnl/util/cpu/microarchitecture.py
Normal file
@@ -0,0 +1,343 @@
# Copyright 2013-2019 Lawrence Livermore National Security, LLC and other
# Spack Project Developers. See the top-level COPYRIGHT file for details.
#
# SPDX-License-Identifier: (Apache-2.0 OR MIT)
import functools
import platform
import warnings

try:
    from collections.abc import Sequence
except ImportError:
    from collections import Sequence

import six

import llnl.util
import llnl.util.cpu.alias
import llnl.util.cpu.schema

from .schema import LazyDictionary
from .alias import feature_aliases


def coerce_target_names(func):
    """Decorator that automatically converts a known target name to a proper
    Microarchitecture object.
    """
    @functools.wraps(func)
    def _impl(self, other):
        if isinstance(other, six.string_types):
            if other not in targets:
                msg = '"{0}" is not a valid target name'
                raise ValueError(msg.format(other))
            other = targets[other]

        return func(self, other)
    return _impl


class Microarchitecture(object):
    #: Aliases for micro-architecture's features
    feature_aliases = feature_aliases

    def __init__(
            self, name, parents, vendor, features, compilers, generation=0
    ):
        """Represents a specific CPU micro-architecture.

        Args:
            name (str): name of the micro-architecture (e.g. skylake).
            parents (list): list of parent micro-architectures, if any.
                Parenthood is considered by cpu features and not
                chronologically. As such each micro-architecture is
                compatible with its ancestors. For example "skylake",
                which has "broadwell" as a parent, supports running binaries
                optimized for "broadwell".
            vendor (str): vendor of the micro-architecture
            features (list of str): supported CPU flags. Note that the semantic
                of the flags in this field might vary among architectures, if
                at all present. For instance x86_64 processors will list all
                the flags supported by a given CPU while Arm processors will
                list instead only the flags that have been added on top of the
                base model for the current micro-architecture.
            compilers (dict): compiler support to generate tuned code for this
                micro-architecture. This dictionary has as keys names of
                supported compilers, while values are a dictionary, or a
                list of dictionaries, with fields:

                * name: name of the micro-architecture according to the
                    compiler. This is the name passed to the ``-march`` option
                    or similar. Not needed if the name is the same as that
                    passed in as argument above.
                * versions: versions that support this micro-architecture.
                * flags: format string for the compiler flags, e.g.
                    ``-march={name} -mtune={name}``.

            generation (int): generation of the micro-architecture, if
                relevant.
        """
        self.name = name
        self.parents = parents
        self.vendor = vendor
        self.features = features
        self.compilers = compilers
        self.generation = generation

    @property
    def ancestors(self):
        value = self.parents[:]
        for parent in self.parents:
            value.extend(a for a in parent.ancestors if a not in value)
        return value

    def _to_set(self):
        """Returns a set of the nodes in this microarchitecture DAG."""
        # This function is used to implement subset semantics with
        # comparison operators
        return set([str(self)] + [str(x) for x in self.ancestors])

    @coerce_target_names
    def __eq__(self, other):
        if not isinstance(other, Microarchitecture):
            return NotImplemented

        return (self.name == other.name and
                self.vendor == other.vendor and
                self.features == other.features and
                self.ancestors == other.ancestors and
                self.compilers == other.compilers and
                self.generation == other.generation)

    @coerce_target_names
    def __ne__(self, other):
        return not self == other

    @coerce_target_names
    def __lt__(self, other):
        if not isinstance(other, Microarchitecture):
            return NotImplemented

        return self._to_set() < other._to_set()

    @coerce_target_names
    def __le__(self, other):
        return (self == other) or (self < other)

    @coerce_target_names
    def __gt__(self, other):
        if not isinstance(other, Microarchitecture):
            return NotImplemented

        return self._to_set() > other._to_set()

    @coerce_target_names
    def __ge__(self, other):
        return (self == other) or (self > other)

    def __repr__(self):
        cls_name = self.__class__.__name__
        fmt = cls_name + '({0.name!r}, {0.parents!r}, {0.vendor!r}, ' \
                         '{0.features!r}, {0.compilers!r}, {0.generation!r})'
        return fmt.format(self)

    def __str__(self):
        return self.name

    def __contains__(self, feature):
        # Feature must be of a string type, so be defensive about that
        if not isinstance(feature, six.string_types):
            msg = 'only objects of string types are accepted [got {0}]'
            raise TypeError(msg.format(str(type(feature))))

        # Here we look first in the raw features, and fall back to
        # feature aliases if no match was found
        if feature in self.features:
            return True

        # Check if the alias is defined, if not it will return False
        match_alias = Microarchitecture.feature_aliases.get(
            feature, lambda x: False
        )
        return match_alias(self)

    @property
    def family(self):
        """Returns the architecture family a given target belongs to"""
        roots = [x for x in [self] + self.ancestors if not x.ancestors]
        msg = "a target is expected to belong to just one architecture family"
        msg += " [found {0}]".format(', '.join(str(x) for x in roots))
        assert len(roots) == 1, msg

        return roots.pop()

    def to_dict(self, return_list_of_items=False):
        """Returns a dictionary representation of this object.

        Args:
            return_list_of_items (bool): if True returns an ordered list of
                items instead of the dictionary
        """
        list_of_items = [
            ('name', str(self.name)),
            ('vendor', str(self.vendor)),
            ('features', sorted(
                str(x) for x in self.features
            )),
            ('generation', self.generation),
            ('parents', [str(x) for x in self.parents])
        ]
        if return_list_of_items:
            return list_of_items

        return dict(list_of_items)

    def optimization_flags(self, compiler, version):
        """Returns a string containing the optimization flags that need
        to be used to produce code optimized for this micro-architecture.

        If there is no information on the compiler passed as argument the
        function returns an empty string. If it is known that the compiler
        version we want to use does not support this architecture the function
        raises an exception.

        Args:
            compiler (str): name of the compiler to be used
            version (str): version of the compiler to be used
        """
        # If we don't have information on compiler return an empty string
        if compiler not in self.compilers:
            return ''

        # If we have information on this compiler we need to check the
        # version being used
        compiler_info = self.compilers[compiler]

        # Normalize the entries to have a uniform treatment in the code below
        if not isinstance(compiler_info, Sequence):
            compiler_info = [compiler_info]

        def satisfies_constraint(entry, version):
            min_version, max_version = entry['versions'].split(':')

            # Check version suffixes
            min_version, _, min_suffix = min_version.partition('-')
            max_version, _, max_suffix = max_version.partition('-')
            version, _, suffix = version.partition('-')

            # If the suffixes are not all equal there's no match
            if suffix != min_suffix or suffix != max_suffix:
                return False

            # Assume compiler versions fit into semver
            tuplify = lambda x: tuple(int(y) for y in x.split('.'))

            version = tuplify(version)
            if min_version:
                min_version = tuplify(min_version)
                if min_version > version:
                    return False

            if max_version:
                max_version = tuplify(max_version)
                if max_version < version:
                    return False

            return True

        for compiler_entry in compiler_info:
            if satisfies_constraint(compiler_entry, version):
                flags_fmt = compiler_entry['flags']
                # If there's no field name, use the name of the
                # micro-architecture
                compiler_entry.setdefault('name', self.name)

                # Check if we need to emit a warning
                warning_message = compiler_entry.get('warnings', None)
                if warning_message:
                    warnings.warn(warning_message)

                flags = flags_fmt.format(**compiler_entry)
                return flags

        msg = ("cannot produce optimized binary for micro-architecture '{0}'"
               " with {1}@{2} [supported compiler versions are {3}]")
        msg = msg.format(self.name, compiler, version,
                         ', '.join([x['versions'] for x in compiler_info]))
        raise UnsupportedMicroarchitecture(msg)


def generic_microarchitecture(name):
    """Returns a generic micro-architecture with no vendor and no features.

    Args:
        name (str): name of the micro-architecture
    """
    return Microarchitecture(
        name, parents=[], vendor='generic', features=[], compilers={}
    )


def _known_microarchitectures():
    """Returns a dictionary of the known micro-architectures. If the
    current host platform is unknown adds it too as a generic target.
    """

    # TODO: Simplify this logic using object_pairs_hook to OrderedDict
    # TODO: when we stop supporting python2.6

    def fill_target_from_dict(name, data, targets):
        """Recursively fills targets by adding the micro-architecture
        passed as argument and all its ancestors.

        Args:
            name (str): micro-architecture to be added to targets.
            data (dict): raw data loaded from JSON.
            targets (dict): dictionary that maps micro-architecture names
                to ``Microarchitecture`` objects
        """
        values = data[name]

        # Get direct parents of target
        parent_names = values['from']
        if isinstance(parent_names, six.string_types):
            parent_names = [parent_names]
        if parent_names is None:
            parent_names = []
        for p in parent_names:
            # Recursively fill parents so they exist before we add them
            if p in targets:
                continue
            fill_target_from_dict(p, data, targets)
        parents = [targets.get(p) for p in parent_names]

        vendor = values['vendor']
        features = set(values['features'])
        compilers = values.get('compilers', {})
        generation = values.get('generation', 0)

        targets[name] = Microarchitecture(
            name, parents, vendor, features, compilers, generation
        )

    targets = {}
    data = llnl.util.cpu.schema.targets_json['microarchitectures']
    for name in data:
        if name in targets:
            # name was already brought in as ancestor to a target
            continue
        fill_target_from_dict(name, data, targets)

    # Add the host platform if not present
    host_platform = platform.machine()
    targets.setdefault(host_platform, generic_microarchitecture(host_platform))

    return targets


#: Dictionary of known micro-architectures
targets = LazyDictionary(_known_microarchitectures)


class UnsupportedMicroarchitecture(ValueError):
    """Raised if a compiler version does not support optimization for a given
    micro-architecture.
    """
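The way `optimization_flags` picks a compiler entry by version range can be followed in isolation with the simplified sketch below. `in_range` and `flags_for` are hypothetical helper names, and version suffixes and warnings are deliberately ignored; the two entries mirror the GCC data given for `nehalem` in microarchitectures.json.

```python
def in_range(version, constraint):
    """True if a dotted version falls inside a 'min:max' range.

    Either bound may be empty, meaning the range is open on that side.
    """
    def tuplify(text):
        return tuple(int(part) for part in text.split('.'))

    low, high = constraint.split(':')
    version = tuplify(version)
    if low and tuplify(low) > version:
        return False
    if high and tuplify(high) < version:
        return False
    return True


def flags_for(entries, fallback_name, version):
    """Flags of the first entry whose range contains the version."""
    for entry in entries:
        if in_range(version, entry['versions']):
            # Use the compiler-specific name if there is one
            name = entry.get('name', fallback_name)
            return entry['flags'].format(name=name)
    return None  # the real method raises UnsupportedMicroarchitecture


NEHALEM_GCC = [
    {'versions': '4.9:', 'flags': '-march={name} -mtune={name}'},
    {'versions': '4.6:4.8.5', 'name': 'corei7',
     'flags': '-march={name} -mtune={name}'},
]

new = flags_for(NEHALEM_GCC, 'nehalem', '9.2.0')
old = flags_for(NEHALEM_GCC, 'nehalem', '4.8.5')
too_old = flags_for(NEHALEM_GCC, 'nehalem', '4.4.7')
print(new, old, too_old)
```

Older GCC releases knew the same hardware under a different `-march` name, which is why each entry can carry its own `name` next to the version range.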
832
lib/spack/llnl/util/cpu/microarchitectures.json
Normal file
@@ -0,0 +1,832 @@
{
  "microarchitectures": {
    "x86": {
      "from": null,
      "vendor": "generic",
      "features": []
    },
    "i686": {
      "from": "x86",
      "vendor": "GenuineIntel",
      "features": []
    },
    "pentium2": {
      "from": "i686",
      "vendor": "GenuineIntel",
      "features": [
        "mmx"
      ]
    },
    "pentium3": {
      "from": "pentium2",
      "vendor": "GenuineIntel",
      "features": [
        "mmx",
        "sse"
      ]
    },
    "pentium4": {
      "from": "pentium3",
      "vendor": "GenuineIntel",
      "features": [
        "mmx",
        "sse",
        "sse2"
      ]
    },
    "prescott": {
      "from": "pentium4",
      "vendor": "GenuineIntel",
      "features": [
        "mmx",
        "sse",
        "sse2",
        "sse3"
      ]
    },
    "x86_64": {
      "from": null,
      "vendor": "generic",
      "features": [],
      "compilers": {
        "gcc": {
          "versions": "4:",
          "name": "x86-64",
          "flags": "-march={name} -mtune={name}"
        }
      }
    },
    "nocona": {
      "from": "x86_64",
      "vendor": "GenuineIntel",
      "features": [
        "mmx",
        "sse",
        "sse2",
        "sse3"
      ],
      "compilers": {
        "gcc": {
          "versions": "4:",
          "flags": "-march={name} -mtune={name}"
        }
      }
    },
    "core2": {
      "from": "nocona",
      "vendor": "GenuineIntel",
      "features": [
        "mmx",
        "sse",
        "sse2",
        "ssse3"
      ],
      "compilers": {
        "gcc": {
          "versions": "4:",
          "flags": "-march={name} -mtune={name}"
        }
      }
    },
    "nehalem": {
      "from": "core2",
      "vendor": "GenuineIntel",
      "features": [
        "mmx",
        "sse",
        "sse2",
        "ssse3",
        "sse4_1",
        "sse4_2",
        "popcnt"
      ],
      "compilers": {
        "gcc": [
          {
            "versions": "4.9:",
            "flags": "-march={name} -mtune={name}"
          },
          {
            "versions": "4.6:4.8.5",
            "name": "corei7",
            "flags": "-march={name} -mtune={name}"
          }
        ]
      }
    },
    "westmere": {
      "from": "nehalem",
      "vendor": "GenuineIntel",
      "features": [
        "mmx",
        "sse",
        "sse2",
        "ssse3",
        "sse4_1",
        "sse4_2",
        "popcnt",
        "aes",
        "pclmulqdq"
      ],
      "compilers": {
        "gcc": {
          "versions": "4.9:",
          "flags": "-march={name} -mtune={name}"
        }
      }
    },
    "sandybridge": {
      "from": "westmere",
      "vendor": "GenuineIntel",
      "features": [
        "mmx",
        "sse",
        "sse2",
        "ssse3",
        "sse4_1",
        "sse4_2",
        "popcnt",
        "aes",
        "pclmulqdq",
        "avx"
      ],
      "compilers": {
        "gcc": [
          {
            "versions": "4.9:",
            "flags": "-march={name} -mtune={name}"
          },
          {
            "versions": "4.6:4.8.5",
            "name": "corei7-avx",
            "flags": "-march={name} -mtune={name}"
          }
        ]
      }
    },
    "ivybridge": {
      "from": "sandybridge",
      "vendor": "GenuineIntel",
      "features": [
        "mmx",
        "sse",
        "sse2",
        "ssse3",
        "sse4_1",
        "sse4_2",
        "popcnt",
        "aes",
        "pclmulqdq",
        "avx",
        "rdrand",
        "f16c"
      ],
      "compilers": {
        "gcc": [
          {
            "versions": "4.9:",
            "flags": "-march={name} -mtune={name}"
          },
          {
            "versions": ":4.8.5",
            "name": "core-avx-i",
            "flags": "-march={name} -mtune={name}"
          }
        ]
      }
    },
    "haswell": {
      "from": "ivybridge",
      "vendor": "GenuineIntel",
      "features": [
        "mmx",
        "sse",
        "sse2",
        "ssse3",
        "sse4_1",
        "sse4_2",
        "popcnt",
        "aes",
        "pclmulqdq",
        "avx",
        "rdrand",
        "f16c",
        "movbe",
        "fma",
        "avx2",
        "bmi1",
        "bmi2"
      ],
      "compilers": {
        "gcc": [
          {
            "versions": "4.9:",
            "flags": "-march={name} -mtune={name}"
          },
          {
            "versions": ":4.8.5",
            "name": "core-avx2",
            "flags": "-march={name} -mtune={name}"
          }
        ]
      }
    },
|
||||
"broadwell": {
|
||||
"from": "haswell",
|
||||
"vendor": "GenuineIntel",
|
||||
"features": [
|
||||
"mmx",
|
||||
"sse",
|
||||
"sse2",
|
||||
"ssse3",
|
||||
"sse4_1",
|
||||
"sse4_2",
|
||||
"popcnt",
|
||||
"aes",
|
||||
"pclmulqdq",
|
||||
"avx",
|
||||
"rdrand",
|
||||
"f16c",
|
||||
"movbe",
|
||||
"fma",
|
||||
"avx2",
|
||||
"bmi1",
|
||||
"bmi2",
|
||||
"rdseed",
|
||||
"adx"
|
||||
],
|
||||
"compilers": {
|
||||
"gcc": {
|
||||
"versions": "4.9:",
|
||||
"flags": "-march={name} -mtune={name}"
|
||||
}
|
||||
}
|
||||
},
|
||||
"skylake": {
|
||||
"from": "broadwell",
|
||||
"vendor": "GenuineIntel",
|
||||
"features": [
|
||||
"mmx",
|
||||
"sse",
|
||||
"sse2",
|
||||
"ssse3",
|
||||
"sse4_1",
|
||||
"sse4_2",
|
||||
"popcnt",
|
||||
"aes",
|
||||
"pclmulqdq",
|
||||
"avx",
|
||||
"rdrand",
|
||||
"f16c",
|
||||
"movbe",
|
||||
"fma",
|
||||
"avx2",
|
||||
"bmi1",
|
||||
"bmi2",
|
||||
"rdseed",
|
||||
"adx",
|
||||
"clflushopt",
|
||||
"xsavec",
|
||||
"xsaveopt"
|
||||
],
|
||||
"compilers": {
|
||||
"gcc": {
|
||||
"versions": "5.3:",
|
||||
"flags": "-march={name} -mtune={name}"
|
||||
}
|
||||
}
|
||||
},
|
||||
"skylake_avx512": {
|
||||
"from": "skylake",
|
||||
"vendor": "GenuineIntel",
|
||||
"features": [
|
||||
"mmx",
|
||||
"sse",
|
||||
"sse2",
|
||||
"ssse3",
|
||||
"sse4_1",
|
||||
"sse4_2",
|
||||
"popcnt",
|
||||
"aes",
|
||||
"pclmulqdq",
|
||||
"avx",
|
||||
"rdrand",
|
||||
"f16c",
|
||||
"movbe",
|
||||
"fma",
|
||||
"avx2",
|
||||
"bmi1",
|
||||
"bmi2",
|
||||
"rdseed",
|
||||
"adx",
|
||||
"clflushopt",
|
||||
"xsavec",
|
||||
"xsaveopt",
|
||||
"avx512f",
|
||||
"clwb",
|
||||
"avx512vl",
|
||||
"avx512bw",
|
||||
"avx512dq",
|
||||
"avx512cd"
|
||||
],
|
||||
"compilers": {
|
||||
"gcc": {
|
||||
"name": "skylake-avx512",
|
||||
"versions": "5.3:",
|
||||
"flags": "-march={name} -mtune={name}"
|
||||
}
|
||||
}
|
||||
},
|
||||
"cannonlake": {
|
||||
"from": "skylake",
|
||||
"vendor": "GenuineIntel",
|
||||
"features": [
|
||||
"mmx",
|
||||
"sse",
|
||||
"sse2",
|
||||
"ssse3",
|
||||
"sse4_1",
|
||||
"sse4_2",
|
||||
"popcnt",
|
||||
"aes",
|
||||
"pclmulqdq",
|
||||
"avx",
|
||||
"rdrand",
|
||||
"f16c",
|
||||
"movbe",
|
||||
"fma",
|
||||
"avx2",
|
||||
"bmi1",
|
||||
"bmi2",
|
||||
"rdseed",
|
||||
"adx",
|
||||
"clflushopt",
|
||||
"xsavec",
|
||||
"xsaveopt",
|
||||
"avx512f",
|
||||
"avx512vl",
|
||||
"avx512bw",
|
||||
"avx512dq",
|
||||
"avx512cd",
|
||||
"avx512vbmi",
|
||||
"avx512ifma",
|
||||
"sha",
|
||||
"umip"
|
||||
],
|
||||
"compilers": {
|
||||
"gcc": {
|
||||
"versions": "8:",
|
||||
"flags": "-march={name} -mtune={name}"
|
||||
}
|
||||
}
|
||||
},
|
||||
"cascadelake": {
|
||||
"from": "skylake_avx512",
|
||||
"vendor": "GenuineIntel",
|
||||
"features": [
|
||||
"mmx",
|
||||
"sse",
|
||||
"sse2",
|
||||
"ssse3",
|
||||
"sse4_1",
|
||||
"sse4_2",
|
||||
"popcnt",
|
||||
"aes",
|
||||
"pclmulqdq",
|
||||
"avx",
|
||||
"rdrand",
|
||||
"f16c",
|
||||
"movbe",
|
||||
"fma",
|
||||
"avx2",
|
||||
"bmi1",
|
||||
"bmi2",
|
||||
"rdseed",
|
||||
"adx",
|
||||
"clflushopt",
|
||||
"xsavec",
|
||||
"xsaveopt",
|
||||
"avx512f",
|
||||
"clwb",
|
||||
"avx512vl",
|
||||
"avx512bw",
|
||||
"avx512dq",
|
||||
"avx512cd",
|
||||
"avx512vnni"
|
||||
],
|
||||
"compilers": {
|
||||
"gcc": {
|
||||
"versions": "9:",
|
||||
"flags": "-march={name} -mtune={name}"
|
||||
}
|
||||
}
|
||||
},
|
||||
"icelake": {
|
||||
"from": [
|
||||
"cascadelake",
|
||||
"cannonlake"
|
||||
],
|
||||
"vendor": "GenuineIntel",
|
||||
"features": [
|
||||
"mmx",
|
||||
"sse",
|
||||
"sse2",
|
||||
"ssse3",
|
||||
"sse4_1",
|
||||
"sse4_2",
|
||||
"popcnt",
|
||||
"aes",
|
||||
"pclmulqdq",
|
||||
"avx",
|
||||
"rdrand",
|
||||
"f16c",
|
||||
"movbe",
|
||||
"fma",
|
||||
"avx2",
|
||||
"bmi1",
|
||||
"bmi2",
|
||||
"rdseed",
|
||||
"adx",
|
||||
"clflushopt",
|
||||
"xsavec",
|
||||
"xsaveopt",
|
||||
"avx512f",
|
||||
"avx512vl",
|
||||
"avx512bw",
|
||||
"avx512dq",
|
||||
"avx512cd",
|
||||
"avx512vbmi",
|
||||
"avx512ifma",
|
||||
"sha",
|
||||
"umip",
|
||||
"clwb",
|
||||
"rdpid",
|
||||
"gfni",
|
||||
"avx512vbmi2",
|
||||
"avx512vpopcntdq",
|
||||
"avx512bitalg",
|
||||
"avx512vnni",
|
||||
"vpclmulqdq",
|
||||
"vaes"
|
||||
],
|
||||
"compilers": {
|
||||
"gcc": {
|
||||
"name": "icelake-client",
|
||||
"versions": "8:",
|
||||
"flags": "-march={name} -mtune={name}"
|
||||
}
|
||||
}
|
||||
},
    "barcelona": {
      "from": "x86_64",
      "vendor": "AuthenticAMD",
      "features": [
        "mmx",
        "sse",
        "sse2",
        "sse4a",
        "abm"
      ]
    },
    "bulldozer": {
      "from": "barcelona",
      "vendor": "AuthenticAMD",
      "features": [
        "mmx",
        "sse",
        "sse2",
        "sse4a",
        "abm",
        "avx",
        "xop",
        "lwp",
        "aes",
        "pclmulqdq",
        "cx16",
        "ssse3",
        "sse4_1",
        "sse4_2"
      ],
      "compilers": {
        "gcc": {
          "name": "bdver1",
          "versions": "4.6:",
          "flags": "-march={name} -mtune={name}"
        }
      }
    },
    "piledriver": {
      "from": "bulldozer",
      "vendor": "AuthenticAMD",
      "features": [
        "mmx",
        "sse",
        "sse2",
        "sse4a",
        "abm",
        "avx",
        "aes",
        "pclmulqdq",
        "cx16",
        "ssse3",
        "sse4_1",
        "sse4_2",
        "bmi1",
        "f16c",
        "fma"
      ],
      "compilers": {
        "gcc": {
          "name": "bdver2",
          "versions": "4.7:",
          "flags": "-march={name} -mtune={name}"
        }
      }
    },
    "steamroller": {
      "from": "piledriver",
      "vendor": "AuthenticAMD",
      "features": [
        "mmx",
        "sse",
        "sse2",
        "sse4a",
        "abm",
        "avx",
        "aes",
        "pclmulqdq",
        "cx16",
        "ssse3",
        "sse4_1",
        "sse4_2",
        "bmi1",
        "f16c",
        "fma",
        "fsgsbase"
      ],
      "compilers": {
        "gcc": {
          "name": "bdver3",
          "versions": "4.8:",
          "flags": "-march={name} -mtune={name}"
        }
      }
    },
    "excavator": {
      "from": "steamroller",
      "vendor": "AuthenticAMD",
      "features": [
        "mmx",
        "sse",
        "sse2",
        "sse4a",
        "abm",
        "avx",
        "aes",
        "pclmulqdq",
        "cx16",
        "ssse3",
        "sse4_1",
        "sse4_2",
        "bmi1",
        "f16c",
        "fma",
        "fsgsbase",
        "bmi2",
        "avx2",
        "movbe"
      ],
      "compilers": {
        "gcc": {
          "name": "bdver4",
          "versions": "4.9:",
          "flags": "-march={name} -mtune={name}"
        }
      }
    },
    "zen": {
      "from": "excavator",
      "vendor": "AuthenticAMD",
      "features": [
        "bmi1",
        "bmi2",
        "f16c",
        "fma",
        "fsgsbase",
        "avx",
        "avx2",
        "rdseed",
        "clzero",
        "aes",
        "pclmulqdq",
        "cx16",
        "movbe",
        "mmx",
        "sse",
        "sse2",
        "sse4a",
        "ssse3",
        "sse4_1",
        "sse4_2",
        "abm",
        "xsavec",
        "xsaveopt",
        "clflushopt",
        "popcnt"
      ],
      "compilers": {
        "gcc": {
          "name": "znver1",
          "versions": "6:",
          "flags": "-march={name} -mtune={name}"
        }
      }
    },
    "zen2": {
      "from": "zen",
      "vendor": "AuthenticAMD",
      "features": [
        "bmi1",
        "bmi2",
        "f16c",
        "fma",
        "fsgsbase",
        "avx",
        "avx2",
        "rdseed",
        "clzero",
        "aes",
        "pclmulqdq",
        "cx16",
        "movbe",
        "mmx",
        "sse",
        "sse2",
        "sse4a",
        "ssse3",
        "sse4_1",
        "sse4_2",
        "abm",
        "xsavec",
        "xsaveopt",
        "clflushopt",
        "popcnt",
        "clwb"
      ],
      "compilers": {
        "gcc": {
          "name": "znver2",
          "versions": "9:",
          "flags": "-march={name} -mtune={name}"
        }
      }
    },
    "ppc64": {
      "from": null,
      "vendor": "generic",
      "features": [],
      "compilers": {
        "gcc": {
          "versions": "4:",
          "flags": "-mcpu={name} -mtune={name}"
        }
      }
    },
    "power7": {
      "from": "ppc64",
      "vendor": "IBM",
      "generation": 7,
      "features": [],
      "compilers": {
        "gcc": {
          "versions": "4.5:",
          "flags": "-mcpu={name} -mtune={name}"
        }
      }
    },
    "power8": {
      "from": "power7",
      "vendor": "IBM",
      "generation": 8,
      "features": [],
      "compilers": {
        "gcc": [
          {
            "versions": "4.9:",
            "flags": "-mcpu={name} -mtune={name}"
          },
          {
            "versions": "4.8:4.8.5",
            "warnings": "Using GCC 4.8 to optimize for Power 8 might not work if you are not on Red Hat Enterprise Linux 7, where a custom backport of the feature has been done. Upstream support from GCC starts in version 4.9",
            "flags": "-mcpu={name} -mtune={name}"
          }
        ]
      }
    },
    "power9": {
      "from": "power8",
      "vendor": "IBM",
      "generation": 9,
      "features": [],
      "compilers": {
        "gcc": {
          "versions": "6:",
          "flags": "-mcpu={name} -mtune={name}"
        }
      }
    },
    "ppc64le": {
      "from": null,
      "vendor": "generic",
      "features": [],
      "compilers": {
        "gcc": {
          "versions": "4:",
          "flags": "-mcpu={name} -mtune={name}"
        }
      }
    },
    "power8le": {
      "from": "ppc64le",
      "vendor": "IBM",
      "generation": 8,
      "features": [],
      "compilers": {
        "gcc": [
          {
            "versions": "4.9:",
            "name": "power8",
            "flags": "-mcpu={name} -mtune={name}"
          },
          {
            "versions": "4.8:4.8.5",
            "warnings": "Using GCC 4.8 to optimize for Power 8 might not work if you are not on Red Hat Enterprise Linux 7, where a custom backport of the feature has been done. Upstream support from GCC starts in version 4.9",
            "name": "power8",
            "flags": "-mcpu={name} -mtune={name}"
          }
        ]
      }
    },
    "power9le": {
      "from": "power8le",
      "vendor": "IBM",
      "generation": 9,
      "features": [],
      "compilers": {
        "gcc": {
          "name": "power9",
          "versions": "6:",
          "flags": "-mcpu={name} -mtune={name}"
        }
      }
    },
    "aarch64": {
      "from": null,
      "vendor": "generic",
      "features": [],
      "compilers": {
        "gcc": {
          "versions": "4:",
          "flags": "-march=armv8-a -mtune=generic"
        }
      }
    }
  },
  "feature_aliases": {
    "sse3": {
      "reason": "ssse3 is a superset of sse3 and might be the only one listed",
      "any_of": [
        "ssse3"
      ]
    },
    "avx512": {
      "reason": "avx512 indicates generic support for any of the avx512 instruction sets",
      "any_of": [
        "avx512f",
        "avx512vl",
        "avx512bw",
        "avx512dq",
        "avx512cd"
      ]
    },
    "altivec": {
      "reason": "altivec is supported by Power PC architectures, but might not be listed in features",
      "families": [
        "ppc64le",
        "ppc64"
      ]
    },
    "sse4.1": {
      "reason": "permits to refer to sse4_1 also as sse4.1",
      "any_of": [
        "sse4_1"
      ]
    },
    "sse4.2": {
      "reason": "permits to refer to sse4_2 also as sse4.2",
      "any_of": [
        "sse4_2"
      ]
    }
  }
}
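As a rough illustration of how a `compilers` entry from the JSON above might be turned into command-line flags, the sketch below picks the first entry whose `versions` range contains the compiler version and substitutes `{name}` (falling back to the target name when the entry has no `name` key). The function names `parse_version`, `in_range`, and `optimization_flags` are hypothetical, and the inclusive tuple comparison is a simplified assumption rather than Spack's own version-range semantics or the code in `llnl.util.cpu`.

```python
def parse_version(text):
    """Turn '4.8.5' into (4, 8, 5) so versions compare as tuples."""
    return tuple(int(part) for part in text.split('.'))


def in_range(version, version_range):
    """Check a version against a 'low:high' range, both ends inclusive.

    An empty side means unbounded. Simplified assumption, not Spack's
    actual range semantics.
    """
    low, _, high = version_range.partition(':')
    current = parse_version(version)
    if low and current < parse_version(low):
        return False
    if high and current > parse_version(high):
        return False
    return True


def optimization_flags(target_name, compilers, compiler, version):
    """Return the flags for a target, or '' if compiler support is unknown."""
    entries = compilers.get(compiler)
    if entries is None:
        # Unknown compiler: do not attempt any optimization flags
        return ''
    if isinstance(entries, dict):
        # A single entry may be written without the enclosing list
        entries = [entries]
    for entry in entries:
        if in_range(version, entry['versions']):
            name = entry.get('name', target_name)
            return entry['flags'].format(name=name)
    # Known compiler, but no entry covers this version
    raise ValueError('{0} {1} cannot optimize for {2}'.format(
        compiler, version, target_name))


# The "haswell" compilers entry, copied from the JSON above
haswell_compilers = {
    'gcc': [
        {'versions': '4.9:', 'flags': '-march={name} -mtune={name}'},
        {'versions': ':4.8.5', 'name': 'core-avx2',
         'flags': '-march={name} -mtune={name}'}
    ]
}

new_gcc = optimization_flags('haswell', haswell_compilers, 'gcc', '9.2.0')
old_gcc = optimization_flags('haswell', haswell_compilers, 'gcc', '4.8.5')
print(new_gcc)  # -march=haswell -mtune=haswell
print(old_gcc)  # -march=core-avx2 -mtune=core-avx2
```

The per-entry `name` override is what lets one target carry different spellings across compiler releases, as with `haswell` versus `core-avx2` here.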
133
lib/spack/llnl/util/cpu/schema.py
Normal file
@@ -0,0 +1,133 @@
# Copyright 2013-2019 Lawrence Livermore National Security, LLC and other
# Spack Project Developers. See the top-level COPYRIGHT file for details.
#
# SPDX-License-Identifier: (Apache-2.0 OR MIT)
import json
import os.path

try:
    from collections.abc import MutableMapping
except ImportError:
    from collections import MutableMapping

compilers_schema = {
    'type': 'object',
    'properties': {
        'versions': {'type': 'string'},
        'name': {'type': 'string'},
        'flags': {'type': 'string'}
    },
    'required': ['versions', 'flags']
}

properties = {
    'microarchitectures': {
        'type': 'object',
        'patternProperties': {
            r'([\w]*)': {
                'type': 'object',
                'properties': {
                    'from': {
                        'anyOf': [
                            # More than one parent
                            {'type': 'array', 'items': {'type': 'string'}},
                            # Exactly one parent
                            {'type': 'string'},
                            # No parent
                            {'type': 'null'}
                        ]
                    },
                    'vendor': {
                        'type': 'string'
                    },
                    'features': {
                        'type': 'array',
                        'items': {'type': 'string'}
                    },
                    'compilers': {
                        'type': 'object',
                        'patternProperties': {
                            r'([\w]*)': {
                                'anyOf': [
                                    compilers_schema,
                                    {
                                        'type': 'array',
                                        'items': compilers_schema
                                    }
                                ]
                            }
                        }
                    }
                },
                'required': ['from', 'vendor', 'features']
            }
        }
    },
    'feature_aliases': {
        'type': 'object',
        'patternProperties': {
            r'([\w]*)': {
                'type': 'object',
                'properties': {},
                'additionalProperties': False
            }
        },

    }
}

schema = {
    '$schema': 'http://json-schema.org/schema#',
    'title': 'Schema for microarchitecture definitions and feature aliases',
    'type': 'object',
    'additionalProperties': False,
    'properties': properties,
}


class LazyDictionary(MutableMapping):
    """Lazy dictionary that gets constructed on first access to any object key

    Args:
        factory (callable): factory function to construct the dictionary
    """

    def __init__(self, factory, *args, **kwargs):
        self.factory = factory
        self.args = args
        self.kwargs = kwargs
        self._data = None

    @property
    def data(self):
        if self._data is None:
            self._data = self.factory(*self.args, **self.kwargs)
        return self._data

    def __getitem__(self, key):
        return self.data[key]

    def __setitem__(self, key, value):
        self.data[key] = value

    def __delitem__(self, key):
        del self.data[key]

    def __iter__(self):
        return iter(self.data)

    def __len__(self):
        return len(self.data)


def _load_targets_json():
    """Loads ``microarchitectures.json`` in memory."""
    directory_name = os.path.dirname(os.path.abspath(__file__))
    filename = os.path.join(directory_name, 'microarchitectures.json')
    with open(filename, 'r') as f:
        return json.load(f)


#: In memory representation of the data in microarchitectures.json,
#: loaded on first access
targets_json = LazyDictionary(_load_targets_json)
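The lazy-loading behaviour of `LazyDictionary` can be seen in a small standalone sketch. The class body below repeats the one from `schema.py` so the example runs on its own; the factory `load_example` and the `calls` list are hypothetical names introduced only to record when the factory actually runs.

```python
try:
    from collections.abc import MutableMapping
except ImportError:
    from collections import MutableMapping


class LazyDictionary(MutableMapping):
    """Same structure as the class in schema.py, repeated for the demo."""

    def __init__(self, factory, *args, **kwargs):
        self.factory = factory
        self.args = args
        self.kwargs = kwargs
        self._data = None

    @property
    def data(self):
        # The factory is called only once, on the first access
        if self._data is None:
            self._data = self.factory(*self.args, **self.kwargs)
        return self._data

    def __getitem__(self, key):
        return self.data[key]

    def __setitem__(self, key, value):
        self.data[key] = value

    def __delitem__(self, key):
        del self.data[key]

    def __iter__(self):
        return iter(self.data)

    def __len__(self):
        return len(self.data)


calls = []  # hypothetical log of factory invocations


def load_example():
    """Stand-in for _load_targets_json that needs no file on disk."""
    calls.append('loaded')
    return {'haswell': {'from': 'ivybridge'}}


lazy = LazyDictionary(load_example)
calls_before_access = list(calls)   # nothing has been loaded yet
parent = lazy['haswell']['from']    # first access triggers the factory
size = len(lazy)                    # later accesses reuse the cached dict
calls_after_access = list(calls)
```

Deferring the load this way means that importing the module does not, by itself, read and parse `microarchitectures.json`; the cost is paid on first use of `targets_json`.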
@@ -1,226 +0,0 @@
# Copyright 2013-2019 Lawrence Livermore National Security, LLC and other
# Spack Project Developers. See the top-level COPYRIGHT file for details.
#
# SPDX-License-Identifier: (Apache-2.0 OR MIT)

import platform
import re
import subprocess
import sys


# Tuple of name, flags added, flags removed (default [])
_intel_32 = [
    ('i686', []),
    ('pentium2', ['mmx']),
    ('pentium3', ['sse']),
    ('pentium4', ['sse2']),
    ('prescott', ['sse3']),
]

_intel_64 = [ # commenting out the ones that aren't shown through sysctl
    ('nocona', ['mmx', 'sse', 'sse2', 'sse3']),#lm
    ('core2', ['ssse3'], ['sse3']),
    ('nehalem', ['sse4_1', 'sse4_2', 'popcnt']),
    ('westmere', ['aes', 'pclmulqdq']),
    ('sandybridge', ['avx']),
    ('ivybridge', ['rdrand', 'f16c']),#fsgsbase (is it RDWRFSGS on darwin?)
    ('haswell', ['movbe', 'fma', 'avx2', 'bmi1', 'bmi2']),
    ('broadwell', ['rdseed', 'adx']),
    ('skylake', ['xsavec', 'xsaves'])
]

# We will need to build on these and combine with names when intel releases
# further avx512 processors.
# _intel_avx12 = ['avx512f', 'avx512cd']


_amd_10_names = [
    ('barcelona', ['mmx', 'sse', 'sse2', 'sse3', 'sse4a', 'abm'])
]

_amd_14_names = [
    ('btver1', ['mmx', 'sse', 'sse2', 'sse3', 'ssse3', 'sse4a', 'cx16',
                'abm']),#lm
]

_amd_15_names = [
    ('bdver1', ['avx', 'aes', 'pclmulqdq', 'cx16', 'sse', 'sse2', 'sse3',
                'ssse3', 'sse4a', 'sse4_1', 'sse4_2', 'abm']),#xop, lwp
    ('bdver2', ['bmi1', 'f16c', 'fma',]),#tba?
    ('bdver3', ['fsgsbase']),
    ('bdver4', ['bmi2', 'movbe', 'avx2'])
]

_amd_16_names = [
    ('btver2', ['mmx', 'sse', 'sse2', 'sse3', 'ssse3', 'sse4a', 'cx16',
                'abm', 'movbe', 'f16c', 'bmi1', 'avx', 'pclmulqdq',
                'aes', 'sse4_1', 'sse4_2']),#lm
]

_amd_17_names = [
    ('znver1', ['bmi1', 'bmi2', 'f16c', 'fma', 'fsgsbase', 'avx', 'avx2',
                'rdseed', 'mwaitx', 'clzero', 'aes', 'pclmulqdq', 'cx16',
                'movbe', 'mmx', 'sse', 'sse2', 'sse3', 'ssse3', 'sse4a',
                'sse4_1', 'sse4_2', 'abm', 'xsavec', 'xsaves',
                'clflushopt', 'popcnt', 'adcx'])
]

_amd_numbers = {
    0x10: _amd_10_names,
    0x14: _amd_14_names,
    0x15: _amd_15_names,
    0x16: _amd_16_names,
    0x17: _amd_17_names
}

def supported_target_names():
    intel_names = set(t[0] for t in _intel_64)
    intel_names |= set(t[0] for t in _intel_32)
    amd_names = set()
    for family in _amd_numbers:
        amd_names |= set(t[0] for t in _amd_numbers[family])
    power_names = set('power' + str(d) for d in range(7, 10))
    return intel_names | amd_names | power_names

def create_dict_from_cpuinfo():
    # Initialize cpuinfo from file
    cpuinfo = {}
    try:
        with open('/proc/cpuinfo') as file:
            text = file.readlines()
            for line in text:
                if line.strip():
                    key, _, value = line.partition(':')
                    cpuinfo[key.strip()] = value.strip()
    except IOError:
        return None
    return cpuinfo

def check_output(args):
    if sys.version_info >= (3, 0):
        return subprocess.run(args, check=True, stdout=PIPE).stdout # nopyqver
    else:
        return subprocess.check_output(args) # nopyqver

def create_dict_from_sysctl():
    cpuinfo = {}
    try:
        cpuinfo['vendor_id'] = check_output(['sysctl', '-n',
                                  'machdep.cpu.vendor']).strip()
        cpuinfo['flags'] = check_output(['sysctl', '-n',
                                  'machdep.cpu.features']).strip().lower()
        cpuinfo['flags'] += ' ' + check_output(['sysctl', '-n',
                                  'machdep.cpu.leaf7_features']).strip().lower()
        cpuinfo['model'] = check_output(['sysctl', '-n',
                                  'machdep.cpu.model']).strip()
        cpuinfo['model name'] = check_output(['sysctl', '-n',
                                  'machdep.cpu.brand_string']).strip()

        # Super hacky way to deal with slight representation differences
        # Would be better to somehow consider these "identical"
        if 'sse4.1' in cpuinfo['flags']:
            cpuinfo['flags'] += ' sse4_1'
        if 'sse4.2' in cpuinfo['flags']:
            cpuinfo['flags'] += ' sse4_2'
        if 'avx1.0' in cpuinfo['flags']:
            cpuinfo['flags'] += ' avx'
    except:
        pass
    return cpuinfo

def get_cpu_name():
    name = get_cpu_name_helper(platform.system())
    return name if name else platform.machine()

def get_cpu_name_helper(system):
    # TODO: Elsewhere create dict of codenames (targets) and flag sets.
    # Return cpu name or an empty string if one cannot be determined.
    cpuinfo = {}
    if system == 'Linux':
        cpuinfo = create_dict_from_cpuinfo()
    elif system == 'Darwin':
        cpuinfo = create_dict_from_sysctl()
    if not cpuinfo:
        return ''

    if 'vendor_id' in cpuinfo and cpuinfo['vendor_id'] == 'GenuineIntel':
        if 'model name' not in cpuinfo or 'flags' not in cpuinfo:
            # We don't have the information we need to determine the
            # microarchitecture name
            return ''
        return get_intel_cpu_name(cpuinfo)
    elif 'vendor_id' in cpuinfo and cpuinfo['vendor_id'] == 'AuthenticAMD':
        if 'cpu family' not in cpuinfo or 'flags' not in cpuinfo:
            # We don't have the information we need to determine the
            # microarchitecture name
            return ''
        return get_amd_cpu_name(cpuinfo)
    elif 'cpu' in cpuinfo and 'POWER' in cpuinfo['cpu']:
        return get_ibm_cpu_name(cpuinfo['cpu'])
    else:
        return ''

def get_ibm_cpu_name(cpu):
    power_pattern = re.compile('POWER(\d+)')
    power_match = power_pattern.search(cpu)
    if power_match:
        if 'le' in platform.machine():
            return 'power' + power_match.group(1) + 'le'
        return 'power' + power_match.group(1)
    else:
        return ''

def get_intel_cpu_name(cpuinfo):
    model_name = cpuinfo['model name']
    if 'Atom' in model_name:
        return 'atom'
    elif 'Quark' in model_name:
        return 'quark'
    elif 'Xeon' in model_name and 'Phi' in model_name:
        # This is hacky and needs to be extended for newer avx512 chips
        return 'knl'
    else:
        ret = ''
        flag_list = cpuinfo['flags'].split()
        proc_flags = []
        for _intel_processors in [_intel_32, _intel_64]:
            for entry in _intel_processors:
                try:
                    proc, flags_added, flags_removed = entry
                except ValueError:
                    proc, flags_added = entry
                    flags_removed = []
                proc_flags = list(filter(lambda x: x not in flags_removed, proc_flags))
                proc_flags.extend(flags_added)
                if all(f in flag_list for f in proc_flags):
                    ret = proc
        return ret

def get_amd_cpu_name(cpuinfo):
    #TODO: Learn what the "canonical" granularity of naming
    # is for AMD processors, implement dict as for intel.
    ret = ''
    flag_list = cpuinfo['flags'].split()
    model_number = int(cpuinfo['cpu family'])
    flags_dict = _amd_numbers[model_number]
    proc_flags = []
    for proc, proc_flags_added in flags_dict:
        proc_flags.extend(proc_flags_added)
        if all(f in flag_list for f in proc_flags):
            ret = proc
        else:
            break
    return ret

"""IDEA: In build_environment.setup_compiler_environment, include a
call to compiler.tuning_flags(spec.architecture.target). For gcc this
would return "-march=%s" % str(spec.architecture.target). We only call
this if the target is a valid tuning target (I.e. not
platform.machine(), but a more specific target we successfully
discovered.

Then set
SPACK_TUNING_FLAGS=compiler.tuning_flags(spec.architecture.target)
This way the compiler wrapper can just add $SPACK_TUNING_FLAGS to the
eventual command."""
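The idea described in the closing docstring of the removed module could be sketched roughly as follows. Everything here is an illustration: `tuning_flags`, `setup_tuning_environment`, and the `GENERIC_TARGETS` set are hypothetical names, the environment is a plain dict rather than Spack's environment object, and only the GCC-style `-march=%s` form mentioned in the docstring is covered.

```python
# Hypothetical set of generic family names that are not tuning targets
GENERIC_TARGETS = {'x86_64', 'ppc64', 'ppc64le', 'aarch64'}


def tuning_flags(target):
    """Return GCC-style tuning flags, or '' for a generic target."""
    if target in GENERIC_TARGETS:
        return ''
    return '-march=%s' % target


def setup_tuning_environment(env, target):
    """Set SPACK_TUNING_FLAGS only when a specific target was detected."""
    flags = tuning_flags(target)
    if flags:
        env['SPACK_TUNING_FLAGS'] = flags
    return env


tuned_env = setup_tuning_environment({}, 'haswell')
generic_env = setup_tuning_environment({}, 'x86_64')
print(tuned_env)    # {'SPACK_TUNING_FLAGS': '-march=haswell'}
print(generic_env)  # {}
```

A compiler wrapper reading the variable would then append its value to the command line when it is set, and leave the command untouched otherwise.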