Name : perl-WWW-RobotRules
| |
Version : 6.02
| Vendor : obs://build_opensuse_org/devel:ALP
|
Release : 9.2
| Date : 2023-03-10 08:48:08
|
Group : Development/Libraries/Perl
| Source RPM : perl-WWW-RobotRules-6.02-9.2.src.rpm
|
Size : 0.02 MB
| |
Packager : (none)
| |
Summary : Database of robots.txt-derived permissions
|
Description :
This module parses _/robots.txt_ files as specified in \"A Standard for Robot Exclusion\", at < http://www.robotstxt.org/wc/norobots.html> Webmasters can use the _/robots.txt_ file to forbid conforming robots from accessing parts of their web site.
The parsed files are kept in a WWW::RobotRules object, and this object provides methods to check if access to a given URL is prohibited. The same WWW::RobotRules object can be used for one or more parsed _/robots.txt_ files on any number of hosts.
The following methods are provided:
* $rules = WWW::RobotRules->new($robot_name)
This is the constructor for WWW::RobotRules objects. The first argument given to new() is the name of the robot.
* $rules->parse($robot_txt_url, $content, $fresh_until)
The parse() method takes as arguments the URL that was used to retrieve the _/robots.txt_ file, and the contents of the file.
* $rules->allowed($uri)
Returns TRUE if this robot is allowed to retrieve this URL.
* $rules->agent([$name])
Get/set the agent name. NOTE: Changing the agent name will clear the robots.txt rules and expire times out of the cache.
|
RPM found in directory: /packages/linux-pbone/ftp5.gwdg.de/pub/opensuse/repositories/devel:/ALP/cross/noarch |