By Bryan Young
Expert Author
Article Date: 2010-10-19
I was recently working on a project where I occasionally needed to download many different files from an internet server all at the same time, then process them as a whole. Downloading them one at a time caused a serious bottleneck in my code, so I set out to find a better way. Since I was already familiar with LWP::Simple, I decided to stay within the LWP family. That is when I found LWP::Parallel.
The syntax is very similar to LWP; only a few small changes are needed. Below is a simple example using LWP::Parallel.
use strict;
use warnings;
require LWP::Parallel::UserAgent;
use HTTP::Request;

# Requests to fetch in parallel
my $requests = [
    HTTP::Request->new('GET', "http://example.com/file1"),
    HTTP::Request->new('GET', "http://example.com/file2"),
    HTTP::Request->new('GET', "http://example.com/file3"),
];

my $useragent = LWP::Parallel::UserAgent->new();
$useragent->in_order(1);
$useragent->duplicates(0);
$useragent->timeout(2);
$useragent->redirect(1);

# Queue every request with the user agent
foreach my $req (@$requests) {
    $useragent->register($req);
}

# wait() returns a hash reference of entry objects once the requests are done
my $responses = $useragent->wait();

# Loop over the keys only; a bare %$responses would also yield the values
foreach my $key (keys %$responses) {
    my $html = $responses->{$key}->response;
    print "HTML response from " . $html->request->url . "\n"
        . $html->content . "\n";
}
This code executes the HTTP requests in parallel and, once all of the responses have come back, prints the returned HTML to stdout. The four user agent settings determine how the requests are handled. in_order is a boolean that, when true, handles the requests in the order they were registered; otherwise it is first come, first served. duplicates is also a boolean: when false, as in the example, duplicate requests are ignored for efficiency's sake, and when true they are allowed. timeout sets the maximum time to wait, in seconds. redirect is a boolean that determines whether the requests follow redirects to obtain the requested data.
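The example above prints every response without checking whether it succeeded. As a minimal sketch of one way to handle that, the code below checks the return value of register (which, as far as I know, is an error response when a request cannot be queued) and then separates successful responses from failed ones. The names fetch_all and partition_entries are hypothetical helpers introduced only for illustration; they are not part of LWP::Parallel.

```perl
use strict;
use warnings;
use HTTP::Request;

# Hypothetical helper: register a list of URLs and wait for all of them.
# Returns the hash reference of entries produced by wait().
sub fetch_all {
    my (@urls) = @_;
    require LWP::Parallel::UserAgent;    # loaded only when actually fetching
    my $useragent = LWP::Parallel::UserAgent->new();
    $useragent->timeout(2);
    foreach my $url (@urls) {
        # register() hands back an error response if the request was rejected
        if (my $error = $useragent->register(HTTP::Request->new('GET', $url))) {
            warn $error->error_as_HTML;
        }
    }
    return $useragent->wait();
}

# Hypothetical helper: split entries into successful and failed responses.
# Each entry only needs a response() method returning an HTTP::Response.
sub partition_entries {
    my ($entries) = @_;
    my (@ok, @failed);
    foreach my $entry (values %$entries) {
        my $response = $entry->response;
        if ($response->is_success) {
            push @ok, $response;
        }
        else {
            push @failed, $response;
        }
    }
    return (\@ok, \@failed);
}
```

A call such as my ($ok, $failed) = partition_entries(fetch_all(@urls)); would then let the processing step work only on @$ok while the status lines of @$failed are logged.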
In my project, after running tests to determine runtime, I found that using the parallel requests shaved my execution time from over a minute and a half to less than 20 seconds. It made a huge impact on my code efficiency, nearly eradicating the bottleneck I was experiencing.
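For anyone who wants to take a similar measurement, here is a minimal sketch of a timing helper built on the core Time::HiRes module. The name elapsed_seconds is a hypothetical helper of my own choosing, and this is not necessarily how the numbers above were gathered.

```perl
use strict;
use warnings;
use Time::HiRes qw(time);

# Hypothetical helper: run a code reference and return the elapsed
# wall-clock time in (fractional) seconds.
sub elapsed_seconds {
    my ($code) = @_;
    my $start = time();
    $code->();
    return time() - $start;
}
```

Wrapping the sequential download loop and the parallel wait() call in separate elapsed_seconds(sub { ... }) calls gives two numbers that can be compared directly.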