Changelog for python311-ocrmypdf-15.4.0-17.1.noarch.rpm:

* Fri Sep 29 2023 Frank Kunz - Update to version 15.0.2
  - Added Python 3.12 to test matrix.
  - Updated documentation for notes on Python 3.12, 32-bit support and some new features in v15.
* Wed Sep 27 2023 Frank Kunz - Update to version 15.0.1
  v15.0.1
  - Wheels Python tag changed to py39.
  - Marked a test that fails on recent Ghostscript versions as an expected failure.
  - Clarified documentation and release notes around the extent of 32-bit support.
  - Updated installation documentation to reflect changes in v15.
  v15.0.0
  - Dropped support for Python 3.8.
  - Dropped support for many older dependencies; see pyproject.toml for details. Generally speaking, Ubuntu 22.04 is our baseline system.
  - Dropped support for 32-bit Linux wheels. You must use a 64-bit operating system, and 64-bit versions of Python, Tesseract and Ghostscript to use OCRmyPDF. Many of our dependencies are dropping 32-bit builds (e.g. Pillow), and we are following suit. (Maintainers may still build 32-bit versions from source.)
  - Changed to trusted release for PyPI publishing.
  - pikepdf memory mapping is enabled again for improved performance, now that an issue with pikepdf has been fixed.
  - ocrmypdf.helpers.calculate_downsample previously had two variants, one that took a PIL.Image and one that took a tuple[int, int]. The latter was removed.
  - The snap version of ocrmypdf is now based on Ubuntu core22.
  - We now account for situations where a small portion of an image on a page reports a high DPI (resolution). Previously, the entire page would be rasterized at the highest resolution, which caused performance problems. Now, the page is rasterized at a resolution based on the average DPI of the page, weighted by the area that each feature occupies. Typically, small areas of high resolution in PDFs are errors or quirks from the repeated use of assets, and high resolution is not beneficial. :issue:`1010,1104,1004,1079,1010`
  - Ghostscript color conversion strategy is now configurable. :issue:`1143`
  v14.4.0
  - Digitally signed PDFs are now detected. If the PDF is signed, OCRmyPDF will refuse to modify it. Previously, only encrypted PDFs were detected, not those that were signed but not encrypted. :issue:`1040`
  - In addition, --invalidate-digital-signatures can be used to override the above behavior and modify the PDF anyway. :issue:`1040`
  - tqdm progress bars replaced with "rich" progress bars. The rich library is a new dependency. Certain APIs that used tqdm are now deprecated and will be removed in the next major release.
  - Improved integration with GitHub Releases. Thanks to @stumpylog.
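The area-weighted DPI rule described in the v15.0.0 notes above can be illustrated with a small sketch. This is not OCRmyPDF's actual implementation; the `ImageRegion` record, the `area_weighted_dpi` function and the 300 DPI fallback are hypothetical names and values chosen only to show the arithmetic.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ImageRegion:
    """Hypothetical record for one image placed on a page."""
    dpi: float   # effective resolution of the image as placed
    area: float  # page area it covers, in any consistent unit


def area_weighted_dpi(regions, default_dpi=300.0):
    """Average DPI of a page, weighting each region by the area it occupies.

    default_dpi is an assumed fallback for pages with no measurable images.
    """
    total_area = sum(r.area for r in regions)
    if total_area <= 0:
        return default_dpi
    return sum(r.dpi * r.area for r in regions) / total_area


# A page dominated by a 300 DPI scan, with one tiny 2400 DPI asset.
page = [ImageRegion(dpi=300, area=90.0), ImageRegion(dpi=2400, area=1.0)]

max_dpi = max(r.dpi for r in page)   # the old behavior: highest DPI wins
avg_dpi = area_weighted_dpi(page)    # the new behavior: weighted average

print(f"highest DPI on page: {max_dpi}")
print(f"area-weighted DPI:   {avg_dpi:.1f}")
```

In this example the tiny high-resolution asset barely moves the weighted result, whereas the maximum would force the whole page to be rasterized at 2400 DPI.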
* Fri Jun 23 2023 Frank Kunz - Update to version 14.3.0
  - Renamed master branch to main.
  - Improved PDF rasterization accuracy by using the -dPDFSTOPONERROR option to Ghostscript. Use --continue-on-soft-render-error if you want to render the PDF anyway. The plugin specification was adjusted to support this feature; plugin authors may want to adapt PDF rasterizing and rendering plugins. :issue:`1083`
  - The calculated deskew angle is now recorded in the logged output. :issue:`1101`
  - Metadata can now be unset by setting a metadata type such as --title to an empty string. :issue:`1117,1059`
  - Fixed random order of languages due to use of a set. This may have caused output to vary when multiple languages were set for OCR. :issue:`1113`
  - Clarified the optimization ratio reported in the log output.
  - Fixed :issue:`977`, where images inside Form XObjects were always excluded from image optimization.
  - Added --tesseract-downsample-above to downsample larger images even when they do not exceed Tesseract's internal limits. This can be used to speed up OCR, possibly sacrificing accuracy.
  - Fixed resampling AttributeError on older Pillow. :issue:`1096`
  - Removed an error about using Ghostscript on PDFs that have the /UserUnit feature in use. Previously, Ghostscript would fail to process these PDFs, but all supported versions now handle the feature, so the error is no longer needed.
  - Improved documentation around installing other language packs for Tesseract.
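The language-ordering fix in 14.3.0 concerns a general Python pitfall: iterating over a set of strings does not follow the order in which they were supplied, and the order can differ between interpreter runs. The sketch below shows one common way to remove duplicates while keeping the caller's order. It is an illustration only, not the code OCRmyPDF uses, and `join_languages` is a hypothetical helper name.

```python
def join_languages(languages):
    """Deduplicate language codes while preserving the order given.

    dict.fromkeys keeps first-seen insertion order, unlike set().
    The result uses "+" as the separator, as in Tesseract's -l option.
    """
    return "+".join(dict.fromkeys(languages))


requested = ["deu", "eng", "deu", "fra"]

# Order-preserving: the result is the same on every run.
print(join_languages(requested))  # deu+eng+fra

# By contrast, "+".join(set(requested)) may yield a different order from
# one run to the next, because string hashing is randomized per process.
```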
* Thu Apr 20 2023 Frank Kunz - Update to version 14.1.0
  - Added --tesseract-non-ocr-timeout. This allows using Tesseract's deskew and other non-OCR features while disabling OCR using --tesseract-timeout 0.
  - Added --tesseract-downsample-large-images. This downsamples large images that exceed the maximum image size Tesseract can handle. Large images may still take a long time to process, but this allows them to be processed if that is desired.
  - Fixed :issue:`1082`, an issue with building the snap package.
  - Changed linter to ruff, fixed lint errors, updated documentation.
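As a rough illustration of the 14.1.0 timeout options, the sketch below assembles a command line that disables OCR with --tesseract-timeout 0 while leaving time for non-OCR work such as deskewing. The helper name `build_deskew_only_command`, the file names and the 30-second value are assumptions made for this example. The --deskew flag is not mentioned in the changelog itself, so check `ocrmypdf --help` for your installed version before relying on any of this.

```python
import shlex


def build_deskew_only_command(input_pdf, output_pdf, non_ocr_timeout=30.0):
    """Build an argument list that deskews pages but skips OCR.

    non_ocr_timeout is an arbitrary example value, assumed to be in seconds.
    """
    return [
        "ocrmypdf",
        "--deskew",                                        # assumed flag for deskewing
        "--tesseract-timeout", "0",                        # disables OCR
        "--tesseract-non-ocr-timeout", str(non_ocr_timeout),
        input_pdf,
        output_pdf,
    ]


# Hypothetical file names, used only for display.
cmd = build_deskew_only_command("scan.pdf", "scan_deskewed.pdf")
print(shlex.join(cmd))
```

Building the arguments as a list, rather than a single string, allows them to be passed to `subprocess.run` without shell quoting concerns.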
* Thu Apr 13 2023 Frank Kunz - Update to version 14.0.4
* Tue Feb 14 2023 Frank Kunz - Update to version 14.0.2
* Tue May 24 2022 Frank Kunz - Update to version 13.4.4
* Wed Jul 29 2020 Karl Cheng - Update to version 10.3.1
* Tue Dec 05 2017 t.gruner@katodev.de - Update to version 4.5.6
* Wed Mar 30 2016 t.gruner@katodev.de - Update to version 4.0.7
 