|
|
|
|
Changelog for highway-devel-1.2.0-1.1.x86_64.rpm :
* Sat Jun 01 2024 Jan Engelhardt - Update to release 1.2.0 * Add utility functions Add InterleaveEven/InterleaveOdd, BitShuffle, GatherIndexNOr, IsNegative, IfNegativeThenElseZero, IfNegativeThenZeroElse, PromoteInRangeTo / ConvertInRangeTo / DemoteInRangeTo * Sun Feb 18 2024 Jan Engelhardt - Update to release 1.1.0 * Add BitCastScalar, DispatchedTarget, Foreach * Add Div/Mod and MaskedDiv/ModOr, SaturatedAbs, SaturatedNeg * Add InterleaveWholeLower/Upper, Dup128VecFromValues * Add IsInteger, IsIntegerLaneType, RemoveVolatile, RemoveCvRef * Add MaskedAdd/Sub/Mul/Div/Gather/Min/Max/SatAdd/SatSubOr * Add MaskFalse, IfNegativeThenNegOrUndefIfZero, PromoteEven/OddTo * Add ReduceMin/Max, 8-bit reductions, f16 <-> f64 conversions * Add Span, AlignedArray, matrix-vector mul * Add SumsOf2/4, I8 SumsOf8, SumsOfAdjQuadAbsDiff, SumsOfShuffledQuadAbsDiff * Extend Dot to f32 *bf16, FMA to integer * Fix: RVV 8-bit overflow, UB in vqsort, big-endian bugs, PPC HTM * New targets: HWY_Z14, HWY_Z15 * Fri Sep 22 2023 Jan Engelhardt - Update to release 1.0.7 * Add LoadNOr, GatherIndexN, ScatterIndexN * Add additional float<->int conversions * Codegen improvements for 8-bit shift, PPC Compress/Expand * Fri Aug 11 2023 Jan Engelhardt - Update to release 1.0.6 * Add MaskedGatherIndex, MaskedScatterIndex, LoadN, StoreN, SatWidenMulPairwiseAdd, SumOfMulQuadAccumulate, PromoteUpperLowerTo. * Add F64 for Wasm, F64 AbsDiff * Validate all D args in x86 function signatures * Wed Jul 19 2023 Jan Engelhardt - Update to release 1.0.5 * Add Insert/ExtractBlock, BroadcastBlock/Lane, NumBlocks * Add integer Le/Ge and [Neg]MulAdd, extend DemoteTo/PromoteTo * Add Leading/TrailingZeroCount, HighestSetBitIndex, ReverseBits * Add MaskedLoadOr, tuple Get/Set/Create, ReduceSum, WidenMulPairwiseAdd * Add [ZeroExtend]ResizeBitCast, BitwiseIfThenElse, Find[Known]LastTrue * Add AESRoundInv, AESKeyGenAssist * Add contrib/math Atan2/SinCos, contrib/unroller * Add fp16/bf16 support (Armv8, SVE, RVV), HWY_DYNAMIC_POINTER * Add OrderedTruncate2To, Per4LaneBlockShuffle, TwoTablesLookupLanes * Add SlideUp/Down[Blocks/Lanes], Slide1Up/Down, ReverseLaneBytes * Add SetBeforeFirst, SetAtOrBefore/AfterFirst, SetOnlyFirst * Add 8-bit Reverse2/4/8, Shl/Shr, RotateRight, Reverse, Mul * Add 8/16-bit DupEven/Odd, TableLookupLanes * Add F64 ApproximateReciprocal[Sqrt], 32/64-bit SaturatedAdd/Sub * Wed May 24 2023 Jan Engelhardt - Update memory limiter from 900 to 1400/process. * Fri May 12 2023 Jan Engelhardt - Add memory-constraints to build * Wed May 10 2023 Jan Engelhardt - Add no-forced-inline.diff [boo#1211093] * Fri Mar 17 2023 Jan Engelhardt - Update to release 1.0.4 * Add PPC8..10, SSE2, AVX3_ZEN4, NEON_WITHOUT_AES targets * Add Expand, LoadExpand, integer AbsDiff, SumsOf8AbsDiff * Improved Half/Twice support, codegen for Shift *Same * Faster KV128 sorting * Update RISC-V V intrinsics for 1.0-draft- Remove arm-disable-runtime-dispatch.patch (appears merged) * Thu Jan 19 2023 Jan Engelhardt - Update to release 1.0.3 * Add RearrangeToOddPlusEven, Xor3, 8-bit CompressStore, HWY_ASSUME * Add contrib/bit_pack for 8/16-bit lanes * Update for new RVV intrinsics; faster WASM min/max and extmul/q15mul * Thu Dec 15 2022 Simon Vogl - Added missing baselibs.conf so that 32bit library packages become available * Wed Nov 23 2022 Bruno Pitrus - Fix the library being built for incorrect microarchitecture on armv{6,7}hl. * add arm-disable-runtime-dispatch.patch to fix compiler error- Actually run the testsuite. * Tue Nov 08 2022 Jan Engelhardt - Have armv7 build succeed again. * Tue Nov 01 2022 Jan Engelhardt - Update to release 1.0.2 * Add ExclusiveNeither, FindKnownFirstTrue, Ne128 * Add 16-bit SumOfLanes/ReorderWidenMulAccumulate/ReorderDemote2To * Faster sort for low-entropy input, improved pivot selection * Support static dispatch to SVE2_128 and SVE_256- Leap just needs a modern gcc, no need for clang * Wed Oct 26 2022 Tom Mbrt - Fix build on openSUSE Leap by using clang * Thu Sep 22 2022 Enrico Belleri - Update to 1.0.1: * Add Eq128, i64 Mul, unsigned->float ConvertTo * Faster sort for few unique keys, more robust pivot selection * Fix: floating-point generator for sort tests, Min/MaxOfLanes for i16 * Fix: avoid always_inline in debug, link atomic * GCC warnings: string.h, maybe-uninitialized, ignored-attributes * GCC warnings: preprocessor int overflow, spurious use-after-free/overflow * Doc: <=HWY_AVX3, Full32/64/128, how to use generic-inl * ABI change: 64-bit target values, more room for expansion * Add CompressBlocksNot, CompressNot, Lt128Upper, Min/Max128Upper, TruncateTo * Add HWY_SVE2_128 target * Sort speedups especially for 128-bit * Documentation clarifications * Faster NEON CountTrue/FindFirstTrue/AllFalse/AllTrue * Improved SVE codegen * Fix u16x8 ConcatEven/Odd, SSSE3 i64 Lt * MSVC 2017 workarounds * Support for runtime dispatch on Arm/GCC/Linux * Wed Jul 13 2022 Guillaume GARDET - Use GCC11 instead of default GCC12 to build on aarch64 Tumbleweed until fixed upstream - https://github.com/google/highway/issues/776 * Sat Jul 09 2022 Jan Engelhardt - Initial package (version 0.17.0) for build.opensuse.org
|
|
|