Name : perl-XML-RSSLite
Version : 0.17
Vendor : obs://build_opensuse_org/devel:languages:perl
Release : 1.24
Date : 2024-08-05 17:21:25
Group : Unspecified
Source RPM : perl-XML-RSSLite-0.17-1.24.src.rpm
Size : 0.02 MB
Packager : (none)
Summary : Lightweight, "relaxed" RSS (and XML-ish) parser
Description :
This module attempts to extract the maximum amount of content from available documents, and is less concerned with XML compliance than alternatives. Rather than rely on XML::Parser, it uses heuristics and good old-fashioned Perl regular expressions. It stores the data in a simple hash structure, and "aliases" certain tags so that when done, you can count on having the minimal data necessary for re-constructing a valid RSS file. This means you get the basic title, description, and link for a channel and its items.
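The regular-expression approach described above can be illustrated with a minimal sketch. It is written in Python purely for illustration and is not the module's Perl implementation; the function names `first_tag` and `parse_feed`, and the dictionary key `item`, are assumptions made for this example.

```python
import re

def first_tag(text, name):
    """Return the text of the first <name>...</name> pair, or '' if absent."""
    pattern = r"<%s\b[^>]*>(.*?)</%s\s*>" % (name, name)
    match = re.search(pattern, text, re.S | re.I)
    return match.group(1).strip() if match else ""

def parse_feed(xml):
    """Collect title, description and link for the channel and each item."""
    items = re.findall(r"<item\b[^>]*>(.*?)</item\s*>", xml, re.S | re.I)
    # Drop the item bodies so channel fields are not picked up from an item.
    channel = re.sub(r"<item\b.*?</item\s*>", "", xml, flags=re.S | re.I)
    fields = ("title", "description", "link")
    feed = {name: first_tag(channel, name) for name in fields}
    feed["item"] = [{name: first_tag(body, name) for name in fields}
                    for body in items]
    return feed
```

No XML parser is involved, so a document that is not well-formed still yields whatever fields the patterns can find; fields that are missing come back as empty strings.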
This module extracts more usable links by parsing "scriptingNews" and "weblog" formats in addition to RDF & RSS. It also "sanitizes" the output for best results. The munging includes:
* Remove html tags to leave plain text
* Remove leading whitespace from URIs
* By default, strip characters except 0-9~!@#$%^&*()-+=a-zA-Z[];',.:"<>?\s
* Use <url> tags when <link> is empty
* Use misplaced urls in <title> when <link> is empty
* Extract links from <a href=...> if required
* Limit links to ftp and http(s)
* Join relative item urls (beginning with / or #) to the site base
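The munging steps listed above can be sketched roughly as follows. This is a Python illustration under stated assumptions, not the module's actual Perl code: the helper names `clean_text` and `pick_link`, the dictionary keys, and the order in which fallbacks are tried are all hypothetical, and the allowed character set is transcribed from the list as best it can be read.

```python
import re
from urllib.parse import urljoin

# Characters kept by default (assumed transcription); all others are dropped.
DISALLOWED = re.compile(r"""[^0-9~!@#$%^&*()\-+=a-zA-Z\[\];',.:"<>?\s]""")

def clean_text(value):
    """Strip HTML tags, then drop characters outside the allowed set."""
    without_tags = re.sub(r"<[^>]*>", "", value)
    return DISALLOWED.sub("", without_tags).strip()

def pick_link(item, base):
    """Choose a link for an item, trying each fallback in turn."""
    candidates = [item.get("link", ""), item.get("url", "")]
    # A URL that was misplaced in the title.
    found = re.search(r"(?:https?|ftp)://\S+", item.get("title", ""))
    candidates.append(found.group(0) if found else "")
    # A link inside an anchor tag in the description.
    href = re.search(r"""<a\s[^>]*href\s*=\s*["']?([^"'\s>]+)""",
                     item.get("description", ""), re.I)
    candidates.append(href.group(1) if href else "")
    for candidate in candidates:
        candidate = candidate.lstrip()  # remove leading whitespace
        if candidate.startswith(("/", "#")):
            candidate = urljoin(base, candidate)  # join to the site base
        if re.match(r"(?:https?|ftp)://", candidate, re.I):
            return candidate  # only ftp and http(s) links are accepted
    return ""
```

For example, `pick_link({"link": "  /a/b"}, "http://example.com/")` trims the whitespace and joins the relative path to the base, while an item whose only link is a `mailto:` address yields an empty string.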
RPM found in directory: /packages/linux-pbone/ftp5.gwdg.de/pub/opensuse/repositories/devel:/languages:/perl:/CPAN-X/openSUSE_Tumbleweed/noarch