Changelog for hwloc-2.8.0-1.3.x86_64.rpm:
* Fri Jul 29 2022 Callum Farmer
- Add libXNVCtrl support on TW
* Mon Jul 11 2022 Dirk Müller
- update to 2.8.0:
  * API
    + Add HWLOC_TOPOLOGY_FLAG_NO_DISTANCES, _NO_MEMATTRS and _NO_CPUKINDS to reduce the overhead when unneeded.
    + Add separate Read/Write Bandwidth/Latency memory attributes and implement them on Linux.
  * Backends
    + NUMA nodes may now have a subtype such as DRAM, HBM, SPM, or NVM on heterogeneous memory platforms on Linux.
      - Add DAXType and DAXParent attributes on Linux to tell where a DAX device or its corresponding NUMA node comes from (SPM for Specific-Purpose or NVM for Non-Volatile Memory).
    + Detect heterogeneous caches in hybrid CPUs on MacOS X, thanks to Paul Bone for the help.
    + Max frequencies are not ignored in Linux cpukinds anymore (they were ignored in hwloc 2.7.0), but they may be slightly adjusted to avoid reporting hybrid CPUs because of Intel Turbo Boost Max 3.0.
      - See the documentation of environment variable HWLOC_CPUKINDS_MAXFREQ.
    + Hardwire the PCI locality of HPE Cray EX235a nodes.
  * Tools
    + lstopo and other tools may now load Linux and x86 cpuid topology files from a tarball.
    + lstopo may now replace the P# and L# index prefixes with custom strings thanks to --os-index-prefix and --logical-index-prefix options.
  * Misc
    + Add --disable-readme to avoid regenerating the top-level hwloc README file from the documentation.
* Fri Apr 08 2022 Dirk Müller
- update to 2.7.1:
  * Work around crashes when virtual machines report incoherent x86 CPUID information about numbers of cores and threads. Thanks to Peter Bense for the report.
  * Use setenv() instead of putenv() when trying to force enable oneAPI L0 support, to avoid issues with applications that touch the environment, thanks to Josh Hursey for the patch.
  * Add some warnings at the end of configure when GPU libraries are missing on the system or their path is missing in the environment.
  * Backends
    + Add support for NUMA nodes and caches with more than 64 PUs across multiple processor groups on Windows 11 and Windows Server 2022.
    + Group objects are not created for Windows processor groups anymore, except if HWLOC_WINDOWS_PROCESSOR_GROUP_OBJS=1 in the environment.
    + Expose "Cluster" group objects on Linux kernel 5.16+ for CPUs that share some internal cache or bus. This can be equivalent to the L2 Cache level on some platforms (e.g. x86) or a specific level between L2 and L3 on others (e.g. ARM Kunpeng 920). Thanks to Jonathan Cameron for the help.
      - HWLOC_DONT_MERGE_CLUSTER_GROUPS=1 may be set in the environment to prevent these groups from being merged with identical caches, etc.
    + Improve the oneAPI LevelZero backend:
      - Expose subdevices such as "ze0.1" inside root OS devices ("ze0") when the hardware contains multiple subdevices.
      - Add many new attributes to describe device type, and the numbers of slices, subslices, execution units and threads.
      - Expose the memory information as LevelZeroHBM/DDR/MemorySize infos.
    + Ignore the max frequencies of cores in Linux cpukinds when the base frequencies are available (to avoid exposing hybrid CPUs when Intel Turbo Boost Max 3.0 gives slightly different max frequencies to CPU cores).
      - May be reverted by setting HWLOC_CPUKINDS_MAXFREQ=1 in the environment.
  * Tools
    + Add --grey and --palette options to switch lstopo to greyscale or white-background-only graphics, or to tune individual colors.
  * Build
    + Windows CMake builds now support non-MSVC compilers, detect several features at build time, can build/run tests, etc. Thanks to Michael Hirsch and Alexander Neumann.
* Sun Dec 05 2021 Dirk Müller
- update to 2.6.0:
  * Backends
    + Expose two cpukinds for energy-efficient cores (icestorm) and high-performance cores (firestorm) on Apple M1 on Mac OS X.
    + Use sysfs CPU "capacity" to rank hybrid cores by efficiency on Linux when available (mostly on recent ARM platforms for now).
    + Improve HWLOC_MEMBIND_BIND (without the STRICT flag) on Linux kernel >= 5.15: If more than one node is given, the kernel may now use all of them instead of only the first one before falling back to others.
    + Expose cache os_index when available on Linux; it may be needed when using resctrl to configure cache partitioning, memory bandwidth monitoring, etc.
    + Add a "XGMIHops" distances matrix in the RSMI backend for AMD GPUs interconnected through XGMI links.
    + Expose AMD GPU memory information (VRAM and GTT) in the RSMI backend.
    + Add OS devices such as "bxi0" for Atos/Bull BXI HCAs on Linux.
  * Tools
    + lstopo has a better placement algorithm with respect to I/O objects, see --children-order in the manpage for details.
    + hwloc-annotate may now change object subtypes and cache or memory sizes.
  * Build
    + Allow specifying the ROCm installation for building the RSMI backend:
      - Use a custom installation path if specified with --with-rocm=.
      - Use /opt/rocm- if specified with --with-rocm-version= or the ROCM_VERSION environment variable.
      - Try /opt/rocm if it exists.
      - See "How do I enable ROCm SMI and select which version to use?" in the FAQ for details.
    + Add a CMakeLists for Windows under contrib/windows-cmake/.
  * Documentation
    + Add FAQ entry "How do I create a custom heterogeneous and asymmetric topology?"
* Sat Jul 17 2021 Dirk Müller
- update to 2.5.0:
  + Add hwloc/windows.h to query Windows processor groups.
  + Add hwloc_get_obj_with_same_locality() to convert between objects with same locality, for instance NUMA nodes and Packages, or OS devices within a PCI device.
  + Add hwloc_distances_transform() to modify distances structures.
    - hwloc-annotate and lstopo have new distances-transform options.
  + hwloc_distances_add() is replaced with _add_create() followed by _add_values() and _add_commit(). See hwloc/distances.h for details.
  + Add topology flags to mitigate binding modifications during hwloc discovery, especially on Windows:
    - HWLOC_TOPOLOGY_FLAG_RESTRICT_TO_CPUBINDING and _MEMBINDING restrict discovery to PUs and NUMA nodes inside the binding.
    - HWLOC_TOPOLOGY_FLAG_DONT_CHANGE_BINDING prevents hwloc from ever changing the binding during discovery.
  + Add a levelzero backend for oneAPI L0 devices, exposed as OS devices of subtype "LevelZero" and name such as "ze0".
    - Add hwloc/levelzero.h for interoperability between L0 API devices and hwloc cpusets or OS devices.
  + Expose NEC Vector Engine cards on Linux as OS devices of subtype "VectorEngine" and name "ve0", etc. Thanks to Anara Kozhokanova, Tim Cramer and Erich Focht for the help.
  + Add a NVLinkBandwidth distances structure between NVIDIA GPUs (and POWER processor or NVSwitches) in the NVML backend, and a XGMIBandwidth distances structure between AMD GPUs in the RSMI backend.
    - See "Topology Attributes: Distances, Memory Attributes and CPU Kinds" in the documentation for details about these new distances.
  + Add support for NUMA node 0 being offline in Linux, thanks to Jirka Hladky.
  + Add --with-cuda-version= or look at the CUDA_VERSION environment variable to find the appropriate CUDA pkg-config files. Thanks to Stephen Herbein for the suggestion.
    - Also add --with-cuda= to specify the CUDA installation path manually (and its NVML and OpenCL components). Thanks to Andrea Bocci for the suggestion.
    - See "How do I enable CUDA and select which CUDA version to use?" in the FAQ for details.
  + lstopo now has a --windows-processor-groups option on Windows.
  + hwloc-ps now has a --short-name option to avoid long/truncated command path.
  + hwloc-ps now has a --single-ancestor option to return a single (possibly too large) object where a process is bound.
  + hwloc-ps --pid-cmd may now query environment variables, including MPI-specific variables to find out process ranks.
* Tue Mar 16 2021 Dirk Müller
- update to 2.4.1:
  * Fix AMD OpenCL device locality when PCI bus or device number >= 128. Thanks to Edgar Leon for reporting the issue.
    + Applications using any of the following inline functions must be recompiled to get the fix: hwloc_opencl_get_device_pci_busid(), hwloc_opencl_get_device_cpuset(), hwloc_opencl_get_device_osdev().
  * Fix the ranking of cpukinds on non-Windows systems, thanks to Ivan Kochin for the report.
  * Fix the insertion of custom Groups after loading the topology, thanks to Scott Hicks.
  * Add support for CPU0 being offline in Linux, thanks to Garrett Clay.
  * Fix missing x86 Package and Core objects on FreeBSD/NetBSD. Thanks to Thibault Payet and Yuri Victorovich for the report.
  * Fix the import of very large distances with heterogeneous object types.
  * Fix a memory leak in the Linux backend, thanks to Perceval Anichini.
* Sun Jan 24 2021 Dirk Müller
- update to 2.4.0:
  + Add hwloc/cpukinds.h for reporting information about hybrid CPUs.
    - Use Linux cpufreq frequencies to rank cores by efficiency.
    - Use x86 CPUID hybrid leaf and future Linux kernels sysfs CPU type files to identify Intel Atom and Core cores.
    - Use the Windows native EfficiencyClass to separate kinds.
  + Properly handle Linux kernel 5.10+ exposing ACPI HMAT information with knowledge of Generic Initiators.
  + lstopo has new --cpukinds and --no-cpukinds options for showing CPU kinds or not in textual and graphical modes respectively.
  + hwloc-calc has a new --cpukind option for filtering PUs by kind.
  + hwloc-annotate has a new cpukind command for modifying CPU kinds.
  + Fix hwloc_bitmap_nr_ulongs(), thanks to Norbert Eicker.
  + Add a documentation section about "Topology Attributes: Distances, Memory Attributes and CPU Kinds".
  + Silence some spurious warnings in the OpenCL backend and when showing process binding with lstopo --ps.
  + Add hwloc/memattrs.h for exposing latency/bandwidth information between initiators (CPU sets for now) and target NUMA nodes, typically on heterogeneous platforms.
    - When available, bandwidths and latencies are read from the ACPI HMAT table exposed by Linux kernel 5.2+.
    - Attributes may also be customized to expose user-defined performance information.
  + Add hwloc_get_local_numanode_objs() for listing NUMA nodes that are local to some locality.
  + The new topology flag HWLOC_TOPOLOGY_FLAG_IMPORT_SUPPORT causes support arrays to be loaded from XML exported with hwloc 2.3+.
    - hwloc_topology_get_support() now returns an additional "misc" array with feature "imported_support" set when support was imported.
  + Add hwloc_topology_refresh() to refresh internal caches after modifying the topology and before consulting the topology in a multithreaded context.
  + Add a ROCm SMI backend and a hwloc/rsmi.h helper file for getting the locality of AMD GPUs, now exposed as "rsmi" OS devices. Thanks to Mike Li.
  + Remove POWER device-tree-based topology on Linux (it was disabled by default since 2.1).
  + Command-line options for specifying flags now understand comma-separated lists of flag names (substrings).
  + hwloc-info and hwloc-calc have new --local-memory, --local-memory-flags and --best-memattr options for reporting local memory nodes and filtering by memory attributes.
  + hwloc-bind has a new --best-memattr option for filtering by memory attributes among the memory binding set.
  + Tools that have a --restrict option may now receive a nodeset or some custom flags for restricting the topology.
  + lstopo now has a --thickness option for changing line thickness in the graphical output.
  + Fix lstopo drawing when autoresizing on Windows 10.
  + Pressing the F5 key in lstopo X11 and Windows graphical/interactive outputs now refreshes the display according to the current topology and binding.
  + Add a tikz lstopo graphical backend to generate pictures that are easily included in LaTeX documents.
    Thanks to Clement Foyer.
  + The default installation path of the Bash completion file has changed to ${datadir}/bash-completion/completions/hwloc. Thanks to Tomasz Kłoczko.
* Sat Nov 21 2020 Thomas Blume
- move hwloc manpage to main package (bsc#1178802)
* Tue Aug 18 2020 Dirk Mueller
- update to 2.2.0:
  * API
    + Add hwloc_bitmap_singlify_by_core() to remove SMT from a given cpuset, thanks to Florian Reynier for the suggestion.
    + Add --enable-32bits-pci-domain to stop ignoring PCI devices with domain >16bits (e.g. 10000:02:03.4). Enabling this option breaks the library ABI. Thanks to Dylan Simon for the help.
  * Backends
    + Add support for Linux cgroups v2.
    + Add NUMA support for FreeBSD.
    + Add get_last_cpu_location support for FreeBSD.
    + Remove support for Intel Xeon Phi (MIC, Knights Corner) co-processors.
  * Tools
    + Add --uid to filter the hwloc-ps output by uid on Linux.
    + Add a GRAPHICAL OUTPUT section in the manpage of lstopo.
  * Misc
    + Use the native dlopen instead of libltdl, unless --disable-plugin-dlopen is passed at configure time.
- install systemd files using systemd macros and register it on install with systemd
- build against libnuma on all architectures
* Tue Oct 15 2019 Thomas Blume
- update to latest released upstream version 2.1.0 (jsc#SLE-8583)
  * API
    + Add a new "Die" object (HWLOC_OBJ_DIE) for upcoming x86 processors with multiple dies per package, in the x86 and Linux backends.
    + Add the new HWLOC_OBJ_MEMCACHE object type for memory-side caches.
    + Add HWLOC_RESTRICT_FLAG_BYNODESET and _REMOVE_MEMLESS for restricting topologies based on some memory nodes.
    + Add hwloc_topology_set_components() for blacklisting some components from being enabled in a topology.
    + Add hwloc_bitmap_nr_ulongs() and hwloc_bitmap_from/to_ulongs()
    + Improve the API for dealing with disallowed resources
    + Group objects have a new "dont_merge" attribute to prevent them from being automatically merged with identical parent or children.
    + Add more distances-related features:
      - Add hwloc_distances_get_name() to retrieve a string describing what a distances structure contains.
      - Add hwloc_distances_get_by_name() to retrieve distances structures based on their name.
      - Add hwloc_distances_release_remove()
      - Distances may now cover objects of different types with new kind HWLOC_DISTANCES_KIND_HETEROGENEOUS_TYPES.
  * Backends
    + Add support for Linux 5.3 new sysfs cpu topology files with Die information.
    + Add support for Intel v2 Extended Topology Enumeration in the x86 backend.
    + Improve memory locality on Linux by using HMAT initiators (exposed since Linux 5.2), and NUMA distances for CPU-less NUMA nodes.
    + The x86 backend now properly handles offline CPUs.
    + Detect the locality of NVIDIA GPU OpenCL devices.
    + Ignore NUMA nodes that correspond to NVIDIA GPUs by default.
    + Add support for IBM S/390 drawers.
    + Rework the heuristics for discovering KNL Cluster and Memory modes to stop assuming all CPUs are online (required for mOS support).
    + Ignore NUMA node information from AMD topoext in the x86 backend, unless HWLOC_X86_TOPOEXT_NUMANODES=1 is set in the environment.
    + Expose Linux DAX devices as hwloc Block OS devices.
    + Remove support for /proc/cpuinfo-only topology discovery in Linux kernels prior to 2.6.16.
    + Disable POWER device-tree-based topology on Linux by default.
    + Discovery components are now divided into phases that may be individually blacklisted.
  * Tools
    + lstopo
      - lstopo factorizes objects by default in the graphical output when there are more than 4 identical children.
      - Both logical and OS/physical indexes are now displayed by default for PU and NUMA nodes.
      - The X11 and Windows interactive outputs support many keyboard shortcuts to dynamically customize the attributes, legend, etc.
      - Add --linespacing and change default margins and linespacing.
      - Add --allow for changing allowed sets.
      - Add a native SVG backend.
    + Add --nodeset options to hwloc-calc for converting between cpusets and nodesets.
    + Add --no-smt to lstopo, hwloc-bind and hwloc-calc to ignore multiple PUs in SMT cores.
    + hwloc-annotate may annotate multiple locations at once.
    + Add a HTML/JS version of hwloc-ps. See contrib/hwloc-ps.www/README.
    + Add bash completions.
  * Misc
    + Add several FAQ entries in "Compatibility between hwloc versions" about API version, ABI, XML, Synthetic strings, and shmem topologies.
* Tue Aug 27 2019 Thomas Blume
- update to latest released upstream version 2.0.4 (jsc#SLE-8583)
  * Add support for Linux 5.3 new sysfs cpu topology files with Die information.
  * Add support for Intel v2 Extended Topology Enumeration in the x86 backend.
  * Tiles, Modules and Dies are exposed as Groups for now.
    + HWLOC_DONT_MERGE_DIE_GROUPS=1 may be set in the environment to prevent Die groups from being automatically merged with identical parent or children.
  * Ignore NUMA node information from AMD topoext in the x86 backend, unless HWLOC_X86_TOPOEXT_NUMANODES=1 is set in the environment.
  * Group objects have a new "dont_merge" attribute to prevent them from being automatically merged with identical parent or children.
  * Fix build on Cygwin, thanks to Marco Atzeri for the patches.
  * Fix a corner case of hwloc_topology_restrict() where children would become out-of-order.
  * Fix the return length of export_xmlbuffer() functions to always include the ending \0.
  * Fix lstopo --children-order argument parsing.
  * Add support for Hygon Dhyana processors in the x86 backend, thanks to Pu Wen for the patch.
  * Fix symbol renaming to also rename internal components, thanks to Evan Ramos for the patch.
  * Fix build on HP-UX, thanks to Richard Lloyd for reporting the issues.
  * Detect PCI link speed without being root on Linux >= 4.13.
  * Add HWLOC_VERSION* macros to the public headers, thanks to Gilles Gouaillardet for the suggestion.
  * Bump the library soname to 15:0:0 to avoid conflicts with hwloc 1.11.x releases.
    The hwloc 2.0.0 soname was buggy (12:0:0), applications will have to be recompiled.
  * Serialize pciaccess discovery to fix concurrent topology loads in multiple threads.
  * Fix hwloc-dump-hwdata to only process SMBIOS information that corresponds to the KNL and KNM configuration.
  * Add a heuristic for guessing KNL/KNM memory and cluster modes when hwloc-dump-hwdata could not run as root earlier.
  * Add --no-text lstopo option to remove text from some boxes in the graphical output. Mostly useful for removing Group labels.
  * Some minor fixes to memory binding.
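
The soname values quoted in the 2.0.4 notes (15:0:0 and the earlier 12:0:0) look like libtool version-info triples of the form current:revision:age. As a rough illustration only, the Python sketch below shows how libtool on Linux usually maps such a triple to a soname and a library file name. The helper name libtool_linux_names is hypothetical and is not part of hwloc or libtool, and other platforms use different naming rules.

```python
def libtool_linux_names(libname, version_info):
    """Map a libtool 'current:revision:age' triple to Linux-style names.

    Assumption: the usual libtool convention on Linux, where the soname
    major number is (current - age) and the installed file is
    <lib>.so.<major>.<age>.<revision>.
    """
    current, revision, age = (int(part) for part in version_info.split(":"))
    if age > current:
        raise ValueError("age must not exceed current")
    major = current - age
    soname = f"{libname}.so.{major}"
    filename = f"{soname}.{age}.{revision}"
    return soname, filename


# The two triples mentioned in the changelog entry above.
for info in ("12:0:0", "15:0:0"):
    soname, filename = libtool_linux_names("libhwloc", info)
    print(info, "->", soname, filename)
```

Under this convention, a change in the first number alters the soname itself, which is consistent with the note that applications have to be recompiled after the bump.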