Name        : perl-WWW-RobotRules
Version     : 6.02
Vendor      : SUSE LLC <https://www.suse.com/>
Release     : 1.23
Date        : 2018-05-25 20:23:09
Group       : Development/Libraries/Perl
Source RPM  : perl-WWW-RobotRules-6.02-1.23.src.rpm
Size        : 0.02 MB
Packager    : https://www.suse.com/
Summary     : Database of robots.txt-derived permissions
Description :
This module parses _/robots.txt_ files as specified in "A Standard for Robot Exclusion", at <http://www.robotstxt.org/wc/norobots.html>. Webmasters can use the _/robots.txt_ file to forbid conforming robots from accessing parts of their web site.
The parsed files are kept in a WWW::RobotRules object, and this object provides methods to check if access to a given URL is prohibited. The same WWW::RobotRules object can be used for one or more parsed _/robots.txt_ files on any number of hosts.
The following methods are provided:
* $rules = WWW::RobotRules->new($robot_name)
This is the constructor for WWW::RobotRules objects. The first argument given to new() is the name of the robot.
* $rules->parse($robot_txt_url, $content, $fresh_until)
The parse() method takes as arguments the URL that was used to retrieve the _/robots.txt_ file, and the contents of the file.
* $rules->allowed($uri)
Returns TRUE if this robot is allowed to retrieve this URL.
* $rules->agent([$name])
Get/set the agent name. NOTE: Changing the agent name will clear the robots.txt rules and expire times out of the cache.
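Taken together, the methods above support a simple workflow: construct a rules object with the robot's name, feed it a _/robots.txt_ file via parse(), then query candidate URLs with allowed(). The sketch below parses an inline robots.txt string rather than fetching one over the network; the robot name, host, and paths are illustrative, and it assumes the WWW::RobotRules module is installed.

```perl
use strict;
use warnings;
use WWW::RobotRules;

# Hypothetical robot name for illustration
my $rules = WWW::RobotRules->new('ExampleBot/1.0');

# An inline robots.txt, as a webmaster might publish it
my $robots_txt = <<'TXT';
User-agent: *
Disallow: /private/
TXT

# First argument is the URL the file would have been retrieved from
$rules->parse('http://example.com/robots.txt', $robots_txt);

print $rules->allowed('http://example.com/index.html')
    ? "allowed\n" : "denied\n";    # prints "allowed"
print $rules->allowed('http://example.com/private/data')
    ? "allowed\n" : "denied\n";    # prints "denied"
```

Because the object caches rules per host, the same $rules object can be fed _/robots.txt_ files from any number of sites and then consulted for URLs on each of them.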
RPM found in directory: /vol/rzm3/linux-opensuse/distribution/leap/15.5/repo/oss/noarch