Changelog for python310-torch-2.3.1-75.1.x86_64.rpm:
* Thu Aug 29 2024 Guang Yee
- Enable sle15_python_module_pythons.
- GCC 9.3 or newer is required, regardless of whether CUDA is enabled (see https://github.com/pytorch/pytorch/blob/v2.3.1/CMakeLists.txt#L48). Therefore, for SLE15 we went with GCC 11, as it seems to be the most common one.
- Use %gcc_version macro for Tumbleweed.
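As an illustration of the GCC requirement noted in this entry, the following minimal Python sketch checks a version string against the 9.3 floor. The helper names `parse_version` and `gcc_is_new_enough` are hypothetical and are not part of this package or of PyTorch's build system; the version strings are example inputs, not output captured from a real build.

```python
def parse_version(text: str) -> tuple:
    """Turn a dotted version string such as '11.4.0' into a tuple of ints."""
    return tuple(int(part) for part in text.strip().split("."))


def gcc_is_new_enough(version: str, minimum: tuple = (9, 3)) -> bool:
    """Return True if `version` is at least `minimum` (default: GCC 9.3).

    Tuples compare element by element, so (11, 4, 0) >= (9, 3) holds and
    (9, 2, 1) >= (9, 3) does not. A bare major version such as '9' compares
    as older than (9, 3), which is the conservative choice in this sketch.
    """
    return parse_version(version) >= minimum


if __name__ == "__main__":
    for candidate in ("9.2.1", "9.3.0", "11.4.0"):
        print(candidate, gcc_is_new_enough(candidate))
```

Comparing integer tuples rather than raw strings avoids the pitfall where "11" sorts before "9" lexicographically.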
* Thu Jul 11 2024 Christian Goll
- update to 2.3.1 with the following summarized highlights:
  * from 2.0.x:
    - torch.compile is the main API for PyTorch 2.0, which wraps your model and returns a compiled model. It is a fully additive (and optional) feature, and hence 2.0 is 100% backward compatible by definition.
    - Accelerated Transformers introduce high-performance support for training and inference using a custom kernel architecture for scaled dot product attention (SDPA). The API is integrated with torch.compile(), and model developers may also use the scaled dot product attention kernels directly by calling the new scaled_dot_product_attention() operator.
  * from 2.1.x:
    - automatic dynamic shape support in torch.compile, torch.distributed.checkpoint for saving/loading distributed training jobs on multiple ranks in parallel, and torch.compile support for the NumPy API.
    - In addition, this release offers numerous performance improvements (e.g. CPU inductor improvements, AVX512 support, scaled-dot-product-attention support) as well as a prototype release of torch.export, a sound full-graph capture mechanism, and torch.export-based quantization.
  * from 2.2.x:
    - 2x performance improvements to scaled_dot_product_attention via FlashAttention-v2 integration, as well as AOTInductor, a new ahead-of-time compilation and deployment tool built for non-Python server-side deployments.
  * from 2.3.x:
    - support for user-defined Triton kernels in torch.compile, allowing users to migrate their own Triton kernels from eager mode without experiencing performance complications or graph breaks. In addition, Tensor Parallelism improves the experience for training Large Language Models using native PyTorch functions, which has been validated on training runs for 100B parameter models.
- added separate openmpi4 build
- added separate Vulkan build, although this function isn't exposed to the Python ABI
- For the OBS build, all vendored sources follow the pattern NAME-7digitcommit.tar.gz and not NAME-COMMIT.tar.gz
- added the following patches:
  * skip-third-party-check.patch
  * fix-setup.patch
- removed patches:
  * pytorch-rm-some-gitmodules.patch
  * fix-call-of-onnxInitGraph.patch
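
The vendored-source naming rule in the 2.3.1 entry (NAME-7digitcommit.tar.gz rather than NAME-COMMIT.tar.gz) can be sketched in Python as follows. The helper `vendored_tarball_name` is hypothetical and is not taken from the package's spec or _service file; the component name and commit hash are placeholders, and the sketch assumes that "7digitcommit" means the first seven hex characters of the commit hash.

```python
import re

# A git commit hash, abbreviated or full: 7 to 40 lowercase hex characters.
_HEX_COMMIT = re.compile(r"[0-9a-f]{7,40}")


def vendored_tarball_name(name: str, commit: str) -> str:
    """Build NAME-7digitcommit.tar.gz from a component name and commit hash.

    Assumption: the 7-digit form is the first seven characters of the hash.
    """
    commit = commit.strip().lower()
    if not _HEX_COMMIT.fullmatch(commit):
        raise ValueError(f"not a usable commit hash: {commit!r}")
    return f"{name}-{commit[:7]}.tar.gz"


if __name__ == "__main__":
    # Placeholder values for illustration only.
    print(vendored_tarball_name("example", "0123456789abcdef0123456789abcdef01234567"))
```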
* Thu Jul 22 2021 Guillaume GARDET
- Fix build on x86_64 by using GCC10 instead of GCC11, see: https://github.com/google/XNNPACK/issues/1550
* Thu Jul 22 2021 Guillaume GARDET
- Update to 1.9.0
- Release notes: https://github.com/pytorch/pytorch/releases/tag/v1.9.0
- Drop upstreamed patch:
  * fix-mov-operand-for-gcc.patch
- Drop unneeded patches:
  * removed-peachpy-depedency.patch
- Refresh patches:
  * skip-third-party-check.patch
  * fix-call-of-onnxInitGraph.patch
- Add new patch:
  * pytorch-rm-some-gitmodules.patch
* Thu Jul 22 2021 Guillaume GARDET
- Add _service file to ease future updates of deps
* Thu Jul 22 2021 Guillaume GARDET
- Update sleef to fix build on aarch64
* Fri Apr 23 2021 Matej Cepl
- Don't build python36-* package (missing pandas)
* Thu Jan 21 2021 Benjamin Greiner
- Fix python-rpm-macros usage
* Wed Oct 07 2020 Guillaume GARDET
- Use GCC9 to build on aarch64 Tumbleweed to work around an SVE problem with GCC10 and sleef, see: https://github.com/pytorch/pytorch/issues/45971
* Thu Aug 20 2020 Martin Liška
- Use memoryperjob constraint instead of %limit_build macro.