LWP::RobotUA - A class for Web Robots
require LWP::RobotUA;
$ua = LWP::RobotUA->new('my-robot/0.1', 'me@foo.com');
$ua->delay(10); # be very nice, go slowly
...
# use it just like a normal LWP::UserAgent
$res = $ua->request($req);
This class implements a user agent that is suitable for robot
applications. Robots should be nice to the servers they visit. They
should consult the robots.txt file to ensure that they are welcomed
and they should not make requests too frequently.
But before you consider writing a robot, take a look at
<URL:http://info.webcrawler.com/mak/projects/robots/robots.html>.
When you use an LWP::RobotUA object as your user agent, you do not
really have to think about these things yourself. Just send requests
as you do when you are using a normal LWP::UserAgent and this
special agent will make sure you are nice.
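As a rough illustration of the robots.txt check described above, the following sketch uses WWW::RobotRules directly. The robot name, host, and robots.txt content are made up for the example, and the file is parsed by hand here rather than fetched from a server.

```perl
use strict;
use warnings;
use WWW::RobotRules;

# Hypothetical robot name; the rules object matches it against User-agent lines.
my $rules = WWW::RobotRules->new('my-robot/0.1');

# Made-up robots.txt content for an example host.
my $robots_txt = <<'EOT';
User-agent: *
Disallow: /private/
EOT

# Register the rules under the URL the file would have been fetched from.
$rules->parse('http://www.example.com/robots.txt', $robots_txt);

for my $url ('http://www.example.com/index.html',
             'http://www.example.com/private/data.html') {
    # After parse(), allowed() is true for permitted URLs and 0 for excluded ones.
    my $verdict = $rules->allowed($url) ? 'allowed' : 'disallowed';
    print "$url: $verdict\n";
}
```

With these rules the first URL should be reported as allowed and the second as disallowed.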
The LWP::RobotUA is a sub-class of LWP::UserAgent and implements the
same methods. In addition the following methods are provided:
- $ua = LWP::RobotUA->new($agent_name, $from, [$rules])
Your robot's name and the mail address of the human responsible for
the robot (i.e. you) are required by the constructor.
Optionally it allows you to specify the WWW::RobotRules object to
use.
- $ua->delay([$minutes])
Get/set the minimum delay between requests to the same server, in
minutes. The default is 1 minute.
- $ua->use_sleep([$boolean])
Get/set a value indicating whether the UA should sleep() if requests
arrive too fast (before $ua->delay minutes have passed). The default is
TRUE. If this value is FALSE then an internal SERVICE_UNAVAILABLE
response will be generated. It will have a Retry-After header that
indicates when it is OK to send another request to this server.
- $ua->rules([$rules])
Set/get which WWW::RobotRules object to use.
- $ua->no_visits($netloc)
Returns the number of documents fetched from this server host. Yes I
know, this method should probably have been named num_visits() or
something like that. :-(
- $ua->host_wait($netloc)
Returns the number of seconds (from now) you must wait before you can
make a new request to this host.
- $ua->as_string
Returns a string that describes the state of the UA.
Mainly useful for debugging.
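The methods above might be combined as in the following sketch. The helper fetch_or_defer is hypothetical, the robot name and address are placeholders, and the netloc is assumed to be written as "host:port". URLs are taken from the command line, so nothing is fetched when none are given.

```perl
use strict;
use warnings;
use LWP::RobotUA;
use HTTP::Request;

# Robot name and contact address are placeholders.
my $ua = LWP::RobotUA->new('my-robot/0.1', 'me@foo.com');
$ua->delay(2);        # at least 2 minutes between requests to one server
$ua->use_sleep(0);    # return a 503 response instead of sleeping

# Hypothetical helper: fetch a URL, or report when a retry is acceptable.
sub fetch_or_defer {
    my ($ua, $url) = @_;
    my $res = $ua->request(HTTP::Request->new(GET => $url));
    if ($res->code == 503 && defined $res->header('Retry-After')) {
        # Either the agent's internal response or a real 503 from the server.
        print "$url: deferred, retry after ", $res->header('Retry-After'), "\n";
    }
    else {
        print "$url: ", $res->status_line, "\n";
    }
    return $res;
}

# URLs come from the command line; with none given, no request is made.
fetch_or_defer($ua, $_) for @ARGV;

# Assumption: the netloc is written as "host:port".
my $netloc = 'www.example.com:80';
printf "%s: %d document(s) fetched\n", $netloc, $ua->no_visits($netloc) || 0;
printf "%s: %d second(s) to wait\n",   $netloc, $ua->host_wait($netloc);
print $ua->as_string;
```

Passing the same URL twice on the command line should show the second request being deferred, since the delay has not yet passed.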
See also the LWP::UserAgent manpage and the WWW::RobotRules manpage.
Copyright 1996-2000 Gisle Aas.
This library is free software; you can redistribute it and/or
modify it under the same terms as Perl itself.