Changelog for
libopenblas-pthreads-gnu-hpc-0.3.2-lp150.1.28.x86_64.rpm :
* Fri Aug 17 2018 idonmezAATTsuse.com
- Update to version 0.3.2
common:
* Fixes for regressions caused by the rewrite of the thread initialization code in 0.3.1
x86_64:
* Added autodetection of AMD Ryzen 2
* Fixed build with older versions of MSVC
power:
* Fixed cpu autodetection for the BSDs
mips64:
* Fixed utest errors in AXPY, DSDOT, ROT and SWAP
- Version 0.3.1
common:
* Rewritten thread initialization code with significantly reduced overhead
* Added CBLAS interfaces to the IxAMIN BLAS extension functions
* Fixed the lapack-test target
* CMAKE builds now create an OpenBLASConfig.cmake file
* ZAXPY now uses a single thread for small input sizes
* The LAPACK code was updated from Reference-LAPACK/lapack#253
power:
* Corrected CROT and ZROT behaviour with zero INC_X
armv7:
* Corrected xDOT behaviour with zero INC_X or INC_Y
x86_64:
* Retired some older targets of DYNAMIC_ARCH builds to a new option DYNAMIC_OLDER. This affects PENRYN, DUNNINGTON, OPTERON, OPTERON_SSE3, BOBCAT, ATOM and NANO (which will still be supported via the slower PRESCOTT kernels when this option is not set)
* Added an option DYNAMIC_LIST that (used in conjunction with DYNAMIC_ARCH) allows specifying the list of x86_64 targets to include. Any target not on the list will be supported by the Sandybridge or Nehalem kernels if available, or by Prescott.
* Improved SWITCH_RATIO on Haswell for increased GEMM throughput
* Added initial support for Intel Skylake X, including an AVX512 SGEMM kernel
* Added autodetection of Intel Cannon Lake series as Skylake X
* Added a default L2 cache size for hypervisors that return zero here (Chromebook)
* Fixed a name clash with recent Windows10 headers that broke the build with (at least) recent mingw from MSYS2
* Fixed a link error in mixed clang/gfortran builds with OpenMP
* Updated the OSX deployment target to 10.8
* Switched on parallel make for builds on MS Windows by default
x86:
* Fixed SSWAP and DSWAP behaviour with zero INC_X and INC_Y
- Version 0.3.0
common:
* Fixed some more thread race and locking bugs
* Added preliminary support for calling an OpenMP build of the library from multiple threads
* Removed performance impact of thread locks added in 0.2.20 on OpenMP code
* General code cleanup
* Optimized DSDOT implementation
* Improved thread distribution for GEMM
* Corrected IMATCOPY/OMATCOPY implementation
* Fixed out-of-bounds accesses in the multithreaded xBMV/xPMV and SYMV implementations
* Cmake build improvements
* pkgconfig file now contains build options
* openblas_get_config() now reports USE_OPENMP and NUM_THREADS settings used for the build
* Corrections and improvements for systems with more than 64 cpus
* LAPACK code updated to 3.8.0 including later fixes
* Added ReLAPACK, a recursive implementation of several LAPACK functions
* Rewrote ROTMG to handle cases that the netlib code failed to address
* Disabled (broken) multithreading code for xTRMV
* Corrected prototypes of complex CBLAS functions to make our cblas.h match the generally accepted standard
* Shared memory access failures on startup are now handled more gracefully
* Restored utests from earlier releases (and made them pass on all affected systems)
sparc:
* Several fixes for cpu autodetection
arm:
* Added support for CortexA53 and A72
* Added autodetection for ThunderX2T99
* Made most optimized kernels the default for generic ARMv8 targets
x86_64:
* Parallelized DDOT kernel for Haswell
* Changed alignment directives in assembly kernels to boost performance on OSX
* Fixed register handling in the GEMV microkernels (bug exposed by gcc7)
* Added support for building on OpenBSD and Dragonfly
* Updated compiler options to work with Intel release 2018
* Support fully optimized build with clang/flang on Microsoft Windows
* Fixed building on AIX
ibm z:
* Added optimized BLAS 1/2 functions
mips:
* Fixed cpu autodetection helper code
* Added mips32 1004K cpu (Mediatek MT7621 and similar SoC)
* Added mips64 I6500 cpu
- Remove c_xerbla_no-void-return.patch: fixed upstream.
* Tue Jan 30 2018 roAATTsuse.de
- add openblas-s390.patch to build on s390 (bsc#1079513).
* Fri Jan 05 2018 eichAATTsuse.com
- Switch from gcc6 to gcc7 as additional compiler flavor for HPC on SLES.
- Fix library package requires - use HPC macro (boo#1074890).
- Fix unexpanded rpm macro in environment module file for HPC (boo#1074897).
* Mon Nov 27 2017 normandAATTlinux.vnet.ibm.com
- Add -mvsx option for the ppc64 architecture (not required for ppc64le) to avoid ./kernel/power/sasum_microk_power8.c:41:3: error: '__vector' undeclared (first use in this function); ...
* Tue Oct 17 2017 eichAATTsuse.com
- Add magic to limit the number of flavors built in the OBS to non-HPC ones.
* Thu Oct 12 2017 eichAATTsuse.com
- Generate baselib.conf dynamically and only for the non-HPC builds: this avoids issues with the source validator.
* Fri Sep 08 2017 eichAATTsuse.com
- Convert openblas to multibuild.
- Add HPC build using environment modules. (FATE#321708).
- fix-arm64-cpuid-return.patch Fix CPUID detection on ARM (From OHPC).
* Wed Aug 09 2017 dmitry_rAATTopensuse.org
- Remove migration %post scripts for old library names
* Sat Jul 29 2017 badshah400AATTgmail.com
- Update to version 0.2.20:
* common:
  - Improved CMake support
  - Fixed several thread race and locking bugs
  - Fixed default LAPACK optimization level
  - Updated LAPACK to 3.7.0
  - Added ReLAPACK (https://github.com/HPAC/ReLAPACK), make BUILD_RELAPACK=1
* POWER:
  - Optimizations for Power9
  - Fixed several Power8 assembly bugs
* ARM:
  - New optimized Vulcan and ThunderX2T99 targets
  - Support for ARMV7 SOFT_FP ABI (make ARM_SOFTFP_ABI=1)
  - Detect all cpu cores including offline ones
  - Fix compilation with CLANG
  - Support building a shared library for Android
* MIPS:
  - Fixed several threading issues
  - Fix compilation with CLANG
* x86_64:
  - Detect Intel Bay Trail and Apollo Lake
  - Detect Intel Sky Lake and Kaby Lake
  - Detect Intel Knights Landing
  - Detect AMD A8, A10, A12 and Ryzen
  - Support 64bit builds with Visual Studio
  - Fix building with Intel and PGI compilers
  - Fix building with MINGW and TDM-GCC
  - Fix cmake builds for Haswell and related cpus
  - Fix building for Sandybridge with CLANG 3.9
  - Add support for the FLANG compiler
* IBM Z:
  - New target z13 with BLAS3 optimizations
- Drop 0001-Fix-power8-asm.patch; fixed upstream.
- Minor rebase of c_xerbla_no-void-return.patch and openblas-noexecstack.patch for updated version.
- Remove installed pkgconfig file as it is not adapted to the library names we use.
* Thu May 18 2017 meissnerAATTsuse.com
- 0001-Fix-power8-asm.patch: fixed power8 assembly (bsc#1039397)
* Wed Sep 07 2016 idonmezAATTsuse.com
- Update to version 0.2.19
POWER:
* Optimize BLAS on Power8
* Fixed Julia+OpenBLAS bugs on Power8
MIPS:
* Optimize BLAS on MIPS P5600 and I6400
ARM:
* Improved on ARM Cortex-A57
* Wed Apr 13 2016 dmitry_rAATTopensuse.org
- Update to version 0.2.18
ARM:
* Provide DGEMM 8x4 kernel for Cortex-A57
POWER:
* Optimize S and C BLAS3 on Power8
* Optimize BLAS2/1 on Power8
* Mon Mar 21 2016 dmitry_rAATTopensuse.org
- Update to version 0.2.17
* Enable BUILD_LAPACK_DEPRECATED=1 by default.
* Wed Mar 16 2016 idonmezAATTsuse.com
- Update to version 0.2.16
* Upgrade LAPACK to 3.6.0 version.
* Disable multi-threading for small size swap and ger.
* Improve small zger, zgemv, ztrmv using stack allocation.
* Let openblas_get_num_threads return the number of active threads.
* Fix LAPACK Dormbr, Dormlq bug.
* Avoid potential getenv segfault.
* Import LAPACK svn bugfix #142-#147,#150-#155
x86/x86_64:
* Optimize trsm kernels for AMD Bulldozer, Piledriver, Steamroller.
* Detect Intel Avoton.
* Detect AMD Trinity, Richland, E2-3200.
* Optimize c/zgemv for AMD Bulldozer, Piledriver, Steamroller
* Fix bug with scipy linalg test.
ARM:
* Support and optimize Cortex-A57 AArch64.
* Update ARMV6 kernels.
* Improve DGEMM for ARM Cortex-A57.
POWER:
* Fix detection of POWER architecture.
* Optimize D and Z BLAS3 functions for Power8.
- Remove openblas-libs.patch, not needed.
* Tue Oct 27 2015 dmitry_rAATTopensuse.org
- Update to version 0.2.15
* Enable MAX_STACK_ALLOC flags by default.
* Improve ger and gemv for small matrices.
* Improve gemv parallel with small m and large n case.
* Improve ?imatcopy when lda==ldb
* Add vecLib benchmarks
* Fix LAPACK lantr for row major matrices
* Fix LAPACKE lansy
* Import bug fixes for LAPACKE s/dormlq, c/zunmlq
* Raise the signal when pthread_create fails
* Drop obsolete openblas-arm64-build.patch
x86/x86-64:
* Support pure C generic kernels for x86/x86-64.
* Support Intel Broadwell and Skylake by Haswell kernels.
* Support AMD Excavator by Steamroller kernels.
* Optimize s/d/c/zdot for Intel SandyBridge and Haswell.
* Optimize s/d/c/zdot for AMD Piledriver and Steamroller.
* Optimize s/d/c/zaxpy for Intel SandyBridge and Haswell.
* Optimize s/d/c/zaxpy for AMD Piledriver and Steamroller.
* Optimize d/c/zscal for Intel Haswell, dscal for Intel SandyBridge.
* Optimize d/c/zscal for AMD Bulldozer, Piledriver and Steamroller.
* Optimize s/dger for Intel SandyBridge.
* Optimize s/dsymv for Intel SandyBridge.
* Optimize ssymv for Intel Haswell.
* Optimize dgemv for Intel Nehalem and Haswell.
* Optimize dtrmm for Intel Haswell.
ARM:
* Support Android NDK armeabi-v7a-hard ABI (-mfloat-abi=hard)
* Fix lock, rpcc bugs
POWER:
* Support ppc64le platform (ELF ABI v2)
* Support POWER7/8 by POWER6 kernels.
* Wed Jul 29 2015 dmitry_rAATTopensuse.org
- Change library name suffix
  * drop openblas-soname.patch
- Add RPM %post script for manual BLAS/LAPACK update-alternatives configuration update
- Use update-alternatives mechanism for OpenBLAS variants (serial, openmp, pthreads). pthreads variant is default for x86 and x86_64, OpenMP for other architectures.
- Fix build on ARM64
  * openblas-arm64-build.patch
- Add update-alternatives mechanism for CBLAS
- Provide cmake module
- Delete info about host cpu from openblas_config.h for dynamic arch
- Add update-alternatives to 'preup' and 'post' requires list for libraries
- Add README.SUSE
* Wed Mar 25 2015 dmitry_rAATTopensuse.org
- Update to version 0.2.14
* Improve ger and gemv for small matrices by stack allocation. e.g. make -DMAX_STACK_ALLOC=2048
* Introduce openblas_get_num_threads and openblas_get_num_procs.
* Add ATLAS-style ?geadd function.
* Fix c/zsyr bug with negative incx.
* Fix race condition during shutdown causing a crash in gotoblas_set_affinity().
x86/x86-64:
* Support AMD Steamroller.
ARM:
* Add Cortex-A9 and Cortex-A15 targets.
* Wed Dec 03 2014 dmitry_rAATTopensuse.org
- Update to version 0.2.13
* Add SYMBOLPREFIX and SYMBOLSUFFIX makefile options for adding a prefix or suffix to all exported symbol names in the shared library.
* Remove openblas-0.1.0-soname.patch
* Add openblas-soname.patch
* Rebase openblas-noexecstack.patch
x86/x86-64:
* Add generic kernel files for x86-64. make TARGET=GENERIC
* Fix a bug of sgemm kernel on Intel Sandy Bridge.
* Fix c_check bug on some amd64 systems.
ARM:
* Support APM's X-Gene 1 AArch64 processors.
* Optimize trmm and sgemm.
* Fri Oct 17 2014 dmitry_rAATTopensuse.org
- Update to version 0.2.12
* Added CBLAS interface for ?omatcopy and ?imatcopy.
* Enable ?gemm3m functions.
* Added benchmark for ?gemm3m.
* Optimized multithreading lower limits.
* Disabled SYMM3M and HEMM3M functions because of segmentation violations.
x86/x86-64:
* Improved axpy and symv performance on AMD Bulldozer.
* Improved gemv performance on modern Intel and AMD CPUs.