Name : perl-XML-RSS-FromHTML-Simple
Version : 0.05
Vendor : obs://build.opensuse.org/devel:languages:perl
Release : lp155.7.1
Date : 2023-07-20 17:25:20
Group : Development/Libraries/Perl
Source RPM : perl-XML-RSS-FromHTML-Simple-0.05-lp155.7.1.src.rpm
Size : 0.02 MB
Packager : https://www.suse.com/
Summary : Create RSS feeds for sites that don't offer them
Description :
'XML::RSS::FromHTML::Simple' helps with reeling in web pages and creating RSS files from them. Typically, it is used with websites that display news content in HTML but don't provide RSS files of their own. RSS files are typically used to track the content of frequently changing news websites and to give other programs a way to figure out whether new items have arrived.
To create a new RSS generator, call 'new()':
    use XML::RSS::FromHTML::Simple;

    my $f = XML::RSS::FromHTML::Simple->new({
        title    => "My new cool RSS",
        url      => "http://perlmeister.com/art_eng.html",
        rss_file => $outfile,
    });
'url' is the URL of the site whose content you'd like to track. 'title' is an optional feed title that will show up later in the newly created RSS file. 'rss_file' is the name of the resulting RSS file; it defaults to 'out.xml'.
Instead of reeling in a document via HTTP, you can just as well use a local file:
    my $f = XML::RSS::FromHTML::Simple->new({
        html_file => "art_eng.html",
        base_url  => "http://perlmeister.com",
        rss_file  => "perlnews.xml",
    });
Note that in this case, a 'base_url' is necessary to allow the generator to put fully qualified URLs into the RSS file later.
'XML::RSS::FromHTML::Simple' creates accessor functions for all of its attributes. Therefore, you could just as well create a boilerplate object and set its properties afterwards:
    my $f = XML::RSS::FromHTML::Simple->new();
    $f->html_file("art_eng.html");
    $f->base_url("http://perlmeister.com");
    $f->rss_file("perlnews.xml");
Typically, not all links embedded in the HTML document should be copied to the resulting RSS file. The 'link_filter()' attribute takes a subroutine reference, which decides for each URL whether to process it or ignore it:
    $f->link_filter( sub {
        my($url, $text) = @_;

        if($url =~ m#linux-magazine\.com/#) {
            return 1;
        } else {
            return 0;
        }
    });
The 'link_filter' subroutine gets called with each URL and its link text, as found in the HTML content. If 'link_filter' returns 1, the link will be added to the RSS file. If 'link_filter' returns 0, the link will be ignored.
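Because the filter is an ordinary code reference, it can be built and tried out on its own before it is handed to 'link_filter()'. The following is a minimal sketch only: 'make_host_filter' is a hypothetical helper that is not part of the module, the sample URLs are made up, and it assumes the two-argument ($url, $text) calling convention described above.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of XML::RSS::FromHTML::Simple):
# builds a filter that accepts only links whose URL contains one of
# the given host names and whose link text is not empty.
sub make_host_filter {
    my @hosts = @_;

    return sub {
        my ($url, $text) = @_;

        # Skip links without visible text, e.g. image-only anchors.
        return 0 unless defined $text && $text =~ /\S/;

        for my $host (@hosts) {
            # \Q...\E quotes regex metacharacters such as the dots.
            return 1 if $url =~ m#\Q$host\E/#;
        }
        return 0;
    };
}

my $filter = make_host_filter('linux-magazine.com');

# Try the filter by hand on sample values (made up for this sketch).
for my $case (
    [ 'http://www.linux-magazine.com/issue/1', 'Perl column' ],
    [ 'http://example.com/other',              'Elsewhere'   ],
) {
    my ($url, $text) = @$case;
    printf "%-40s %s\n", $url, $filter->($url, $text) ? 'keep' : 'skip';
}
```

A filter built this way would then be registered with $f->link_filter($filter), in the same way as the inline subroutine shown earlier.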
To start the RSS generator, run
    $f->make_rss() or die $f->error();
which will generate the RSS file. If anything goes wrong, 'make_rss()' returns false and the 'error()' method will tell why it failed.
In addition to deciding whether a link is RSS-worthy, the filter may also change the URL, the corresponding link text, or any other RSS field. The third argument passed to 'link_filter' by the processor is the processor object itself, which offers an 'rss_attrs()' method to set additional values or to modify the link text or the link itself:
    $f->link_filter( sub {
        my($url, $text, $processor) = @_;

        if($url =~ m#linux-magazine\.com/#) {
            $processor->rss_attrs({
                description => "This is cool stuff",
                link        => 'http://link.here.instead.com',
                title       => 'New Link Text',
            });
            return 1;
        } else {
            return 0;
        }
    });
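A three-argument filter of this kind can be exercised without fetching any page by passing it a stand-in object. The following is a minimal sketch under stated assumptions: 'MockProcessor' is a hypothetical test double invented for this illustration, and it only mimics a call to 'rss_attrs()' with a hash reference as used in the filter; it does not reproduce what the real processor does with those values.

```perl
use strict;
use warnings;

# Hypothetical stand-in for the processor object. It only records the
# hash reference handed to rss_attrs(); the real module does more.
package MockProcessor;

sub new {
    my ($class) = @_;
    return bless { attrs => {} }, $class;
}

sub rss_attrs {
    my ($self, $attrs) = @_;
    $self->{attrs} = $attrs if defined $attrs;
    return $self->{attrs};
}

package main;

# The filter under test: same shape as one passed to link_filter().
my $filter = sub {
    my ($url, $text, $processor) = @_;

    if ($url =~ m#linux-magazine\.com/#) {
        $processor->rss_attrs({
            description => "This is cool stuff",
            title       => "New Link Text",
        });
        return 1;
    }
    return 0;
};

# Call the filter directly with a made-up URL and the mock object.
my $mock = MockProcessor->new();
my $kept = $filter->('http://www.linux-magazine.com/a', 'Old text', $mock);

print "kept: $kept\n";
print "title: $mock->{attrs}{title}\n" if $kept;
```

Checking the filter in isolation like this keeps regex mistakes apart from network or parsing problems that might show up later in 'make_rss()'.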
RPM found in directory: /packages/linux-pbone/ftp5.gwdg.de/pub/opensuse/repositories/devel:/languages:/perl:/CPAN-X/15.5/noarch