Name : perl-Parallel-MapReduce
Version : 0.09
Vendor : obs://build_opensuse_org/devel:languages:perl
Release : 42.3
Date : 2016-06-10 21:51:41
Group : Development/Libraries/Perl
Source RPM : perl-Parallel-MapReduce-0.09-42.3.src.rpm
Size : 0.05 MB
Packager : (none)
Summary : MapReduce Infrastructure, multithreaded
Description :
In a nutshell, the MapReduce algorithm is this (in sequential form):
    sub mapreduce {
        my $mri    = shift;
        my $map    = shift;
        my $reduce = shift;
        my $h1     = shift;

        my %h3;
        while (my ($k, $v) = each %$h1) {
            my %h2 = &$map ($k => $v);
            map { push @{ $h3{$_} }, $h2{$_} } keys %h2;
        }
        my %h4;
        while (my ($k, $v) = each %h3) {
            $h4{$k} = &$reduce ($k => $v);
        }
        return \%h4;
    }
It is the task of the application programmer to determine the functions '$map' and '$reduce', which, when applied to the hash '$h1', will produce the wanted result. The infrastructure '$mri' is not used above, but it becomes relevant when the individual invocations of '$map' and '$reduce' are (a) parallelized or (b) distributed. And this is what this package does.
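As an illustration of how an application programmer might choose '$map' and '$reduce', here is a minimal sketch of a word count run through the sequential form. It does not use the package itself: 'mapreduce' is re-declared locally, '$mri' is passed as undef, and the input hash and both closures are hypothetical examples, not taken from the package documentation.

```perl
use strict;
use warnings;

# Sequential form of the algorithm; the infrastructure $mri is unused here.
sub mapreduce {
    my ($mri, $map, $reduce, $h1) = @_;

    my %h3;    # key => list of all values emitted for that key
    while (my ($k, $v) = each %$h1) {
        my %h2 = $map->($k => $v);
        push @{ $h3{$_} }, $h2{$_} for keys %h2;
    }
    my %h4;    # key => reduced value
    while (my ($k, $v) = each %h3) {
        $h4{$k} = $reduce->($k => $v);
    }
    return \%h4;
}

# Hypothetical map: count the words within one line of text.
my $map = sub {
    my ($k, $v) = @_;
    my %count;
    $count{$_}++ for split /\s+/, $v;
    return %count;
};

# Hypothetical reduce: sum the per-line counts collected for one word.
my $reduce = sub {
    my ($k, $v) = @_;
    my $sum = 0;
    $sum += $_ for @$v;
    return $sum;
};

# Assumed sample input: line number => line of text.
my $h1 = {
    1 => 'this is something',
    2 => 'this is something else',
    3 => 'something else completely',
};

my $result = mapreduce(undef, $map, $reduce, $h1);
printf "%s => %d\n", $_, $result->{$_} for sort keys %$result;
```

Because each call to '$map' sees only one key/value pair and each call to '$reduce' sees only one key with its collected values, the individual invocations are independent of each other, which is what makes them candidates for parallel or distributed execution.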
* Master
This is the host where you initiate the computation and this is where the central algorithm will be executed.
* Workers
Each worker can execute either the '$map' function or the '$reduce' function over a subslice of the overall data. Workers can run locally, simply as a subroutine (see the Parallel::MapReduce::Worker manpage), or can be a thread talking to a remote instance of a worker (see the Parallel::MapReduce::Worker::SSH manpage).
When you create your MR infrastructure you can specify which kind of workers you want to use (via a 'WorkerClass' in the constructor).
*NOTE*: Feel free to propose more workers.
* Servers
To exchange hash data between master and workers, and also between workers, this package makes use of an existing 'memcached' server pool (see http://www.danga.com/memcached/). Obviously, the more servers there are running, the merrier.
*NOTE*: The (Debian-packaged) Perl client is somewhat flaky in multi-threaded environments. I made some workarounds, but other options should be investigated.
RPM found in directory: /packages/linux-pbone/ftp5.gwdg.de/pub/opensuse/repositories/devel:/languages:/perl:/CPAN-P/openSUSE_Tumbleweed/noarch