Name : perl-WWW-RobotRules
Version : 6.02
Vendor : openSUSE
Release : 9.30
Date : 2012-02-20 11:48:55
Group : Development/Libraries/Perl
Source RPM : perl-WWW-RobotRules-6.02-9.30.src.rpm
Size : 0.02 MB
Packager : https://bugs.opensuse.org
Summary : Database of robots.txt-derived permissions
Description :
This module parses _/robots.txt_ files as specified in "A Standard for Robot Exclusion", at <http://www.robotstxt.org/wc/norobots.html>. Webmasters can use the _/robots.txt_ file to forbid conforming robots from accessing parts of their web site.
The parsed files are kept in a WWW::RobotRules object, and this object provides methods to check if access to a given URL is prohibited. The same WWW::RobotRules object can be used for one or more parsed _/robots.txt_ files on any number of hosts.
The following methods are provided:
* $rules = WWW::RobotRules->new($robot_name)
This is the constructor for WWW::RobotRules objects. The first argument given to new() is the name of the robot.
* $rules->parse($robot_txt_url, $content, $fresh_until)
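The parse() method takes as arguments the URL that was used to retrieve the _/robots.txt_ file, and the contents of the file. The optional $fresh_until argument gives the time until which the parsed rules are considered fresh.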
* $rules->allowed($uri)
Returns TRUE if this robot is allowed to retrieve this URL.
* $rules->agent([$name])
Get/set the agent name. NOTE: Changing the agent name will clear the robots.txt rules and expire times out of the cache.
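
The methods above can be combined as in the following minimal sketch. The robot name, the example.com URLs, and the inline robots.txt content are placeholders chosen for illustration; in practice the content would be fetched over HTTP by the caller before being handed to parse().

```perl
use strict;
use warnings;
use WWW::RobotRules;

# Hypothetical robot name, used only for this example.
my $rules = WWW::RobotRules->new('ExampleBot/1.0');

# Placeholder robots.txt URL and content; a real robot would download this file.
my $robots_url = 'http://www.example.com/robots.txt';
my $content    = <<'EOT';
User-agent: *
Disallow: /private/
EOT

# Register the rules for the host named in $robots_url.
$rules->parse($robots_url, $content);

# Ask whether individual URLs on that host may be retrieved.
for my $url ('http://www.example.com/index.html',
             'http://www.example.com/private/data.html') {
    printf "%s: %s\n", $url, $rules->allowed($url) ? 'allowed' : 'disallowed';
}
```

Because one object can hold rules for several hosts, the same $rules can be passed further robots.txt files from other sites before calling allowed() on their URLs.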
RPM found in directory: /packages/linux-pbone/ftp5.gwdg.de/pub/opensuse/repositories/openSUSE:/ALP:/Experimental:/Slowroll/base.20240429/repo/oss/noarch