perl-Hadoop-Streaming RPM build for openSUSE Tumbleweed.

Name : perl-Hadoop-Streaming
Version : 0.143060
Vendor : obs://build_opensuse_org/devel:languages:perl
Release : 1.58
Date : 2024-08-05 18:25:58
Group : Development/Libraries/Perl
Source RPM : perl-Hadoop-Streaming-0.143060-1.58.src.rpm
Size : 0.07 MB
Packager : (none)
Summary : Contains Mapper, Combiner and Reducer roles to simplify writing Hadoop S[cut]
Description :
Hadoop::Streaming::* provides a simple Perl interface to the Streaming
interface of Hadoop.

Hadoop is a system for "reliable, scalable, distributed computing." Hadoop was
developed at Yahoo! and is now maintained by the Apache Software
Foundation.

Hadoop provides a distributed map/reduce framework. Mappers take lines of
unstructured file data and produce key/value pairs. These key/value pairs
are merged and sorted by key and provided to Reducers. Reducers take
key/value pairs and produce higher order data. This works for data
where output key/value pairs can be determined from a single line of data
in isolation. The Reducer is provided sho

* Hadoop's Streaming Interface

The Streaming interface provides a simple API for writing Hadoop jobs in
any language. Jobs are provided input on STDIN and output is expected on
STDOUT. Within each line, the key and value are separated by a TAB character.
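
As a rough illustration of the line format described above, here is a small sketch in Python (chosen only because Streaming jobs can be written in any language; it is not part of Hadoop::Streaming). The helper names format_pair and parse_pair are hypothetical, and splitting at the first TAB is an assumption about the default Streaming configuration.

```python
def format_pair(key, value):
    """Render one key/value pair as a Streaming output line body (TAB-separated)."""
    return f"{key}\t{value}"


def parse_pair(line):
    """Split one Streaming line into (key, value) at the first TAB.

    Assumption: everything after the first TAB belongs to the value.
    """
    key, _sep, value = line.rstrip("\n").partition("\t")
    return key, value
```

A job would typically print(format_pair(key, value)) to STDOUT and apply parse_pair to each line read from STDIN.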

Streaming map jobs are provided an input of lines instead of key-value
pairs. See Hadoop::Streaming::Mapper INTERFACE DETAILS for an explanation.

Reduce jobs are provided a stream of key\tvalue lines. Multivalued keys
appear on one input line for each key/value pair. The stream is guaranteed
to be sorted by key. The reduce job must track the key/value pairs and
manually detect a key change.
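
The manual bookkeeping described above might look like the following Python sketch. It is an illustration of the raw protocol only, not code from Hadoop::Streaming; the function name count_values_per_key is hypothetical, and counting values is just a stand-in for whatever the reduce step computes.

```python
def count_values_per_key(lines):
    """Yield (key, count) from key-sorted 'key<TAB>value' lines."""
    current_key = None
    count = 0
    for line in lines:
        key, _sep, _value = line.rstrip("\n").partition("\t")
        if key != current_key:
            # Key changed: flush the finished group before starting a new one.
            if current_key is not None:
                yield current_key, count
            current_key, count = key, 0
        count += 1
    # Flush the final group, which no later key change will trigger.
    if current_key is not None:
        yield current_key, count
```

In a real job the input would be sys.stdin and each result would be printed as a key\tvalue line. Forgetting the final flush is the usual mistake with this pattern.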

* Hadoop::Streaming::Mapper interface

Hadoop::Streaming::Mapper consumes and chomps lines from STDIN and calls
map($line) once per line. This is initiated by the run() method.

example mapper input:

line1
line2
line3

Hadoop::Streaming::Mapper transforms this into three calls to map():

map(line1)
map(line2)
map(line3)
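
The read-chomp-dispatch loop described in this section can be sketched in Python as follows. This is an assumption-laden illustration of the behaviour, not the module's actual run() implementation; run_mapper and map_fn are hypothetical names.

```python
import io


def run_mapper(map_fn, stream):
    """Strip the trailing newline from each line and call map_fn once per line."""
    for line in stream:
        map_fn(line.rstrip("\n"))


# Feed the three example lines and record each call's argument.
seen = []
run_mapper(seen.append, io.StringIO("line1\nline2\nline3\n"))
# seen == ["line1", "line2", "line3"]
```

A real job would pass sys.stdin as the stream instead of the in-memory io.StringIO used here.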

* Hadoop::Streaming::Reducer interface

Hadoop::Streaming::Reducer abstracts this stream into an interface of (key,
value-iterator). reduce() is called once per key, instead of once per line.
The reduce job pulls values from the iterator and outputs key/value pairs
to STDOUT. emit() is provided as a convenience for outputting key/value
pairs.

example reducer input:

key1 value1
key2 valuea
key2 valuec
key2 valueb
key3 valuefoo
key3 valuebar

Hadoop::Streaming::Reducer transforms this input into three calls to
reduce():

reduce( key1, iterator_over(qw(value1)) );
reduce( key2, iterator_over(qw(valuea valuec valueb)) );
reduce( key3, iterator_over(qw(valuefoo valuebar)) );
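
One way to picture the (key, value-iterator) abstraction is the Python sketch below, which groups key-sorted lines with itertools.groupby and calls a reduce function once per key. It is not how Hadoop::Streaming::Reducer is implemented; run_reducer and reduce_fn are hypothetical names, and the input is assumed to be sorted by key already.

```python
from itertools import groupby


def run_reducer(reduce_fn, lines):
    """Call reduce_fn(key, value_iterator) once per key of key-sorted input."""
    # Each pair is (key, separator, value); index 0 is the key, index 2 the value.
    pairs = (line.rstrip("\n").partition("\t") for line in lines)
    for key, group in groupby(pairs, key=lambda pair: pair[0]):
        # The value iterator is lazy and only valid during this call.
        reduce_fn(key, (pair[2] for pair in group))
```

Because the value iterator is lazy, reduce_fn should consume it before returning; once the loop moves on to the next key, the previous iterator yields nothing further.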

* Hadoop::Streaming::Combiner interface

The Hadoop::Streaming::Combiner interface is analogous to the
Hadoop::Streaming::Reducer interface. combine() is called instead of
reduce() for each key. The above example would produce three calls to
combine():

combine( key1, iterator_over(qw(value1)) );
combine( key2, iterator_over(qw(valuea valuec valueb)) );
combine( key3, iterator_over(qw(valuefoo valuebar)) );

RPM found in directory: /packages/linux-pbone/ftp5.gwdg.de/pub/opensuse/repositories/devel:/languages:/perl:/CPAN-H/openSUSE_Tumbleweed/noarch

Download
ftp.icm.edu.pl  perl-Hadoop-Streaming-0.143060-1.58.noarch.rpm
     

Provides :
perl(Hadoop::Streaming)
perl(Hadoop::Streaming::Combiner)
perl(Hadoop::Streaming::Mapper)
perl(Hadoop::Streaming::Reducer)
perl(Hadoop::Streaming::Reducer::Input)
perl(Hadoop::Streaming::Reducer::Input::Iterator)
perl(Hadoop::Streaming::Reducer::Input::ValuesIterator)
perl(Hadoop::Streaming::Role::Emitter)
perl(Hadoop::Streaming::Role::Iterator)
perl-Hadoop-Streaming

Requires :
perl(:MODULE_COMPAT_5.40.0)
perl(Moo)
perl(Moo::Role)
perl(Params::Validate)
perl(Safe::Isa)
rpmlib(CompressedFileNames) <= 3.0.4-1
rpmlib(FileDigests) <= 4.6.0-1
rpmlib(PayloadFilesHavePrefix) <= 4.0-1
rpmlib(PayloadIsZstd) <= 5.4.18-1


Content of RPM :
/usr/lib/perl5/vendor_perl/5.40.0/Hadoop
/usr/lib/perl5/vendor_perl/5.40.0/Hadoop/Streaming
/usr/lib/perl5/vendor_perl/5.40.0/Hadoop/Streaming.pm
/usr/lib/perl5/vendor_perl/5.40.0/Hadoop/Streaming/Combiner.pm
/usr/lib/perl5/vendor_perl/5.40.0/Hadoop/Streaming/Mapper.pm
/usr/lib/perl5/vendor_perl/5.40.0/Hadoop/Streaming/Reducer
/usr/lib/perl5/vendor_perl/5.40.0/Hadoop/Streaming/Reducer.pm
/usr/lib/perl5/vendor_perl/5.40.0/Hadoop/Streaming/Reducer/Input
/usr/lib/perl5/vendor_perl/5.40.0/Hadoop/Streaming/Reducer/Input.pm
/usr/lib/perl5/vendor_perl/5.40.0/Hadoop/Streaming/Reducer/Input/Iterator.pm
/usr/lib/perl5/vendor_perl/5.40.0/Hadoop/Streaming/Reducer/Input/ValuesIterator.pm
/usr/lib/perl5/vendor_perl/5.40.0/Hadoop/Streaming/Role
/usr/lib/perl5/vendor_perl/5.40.0/Hadoop/Streaming/Role/Emitter.pm
/usr/lib/perl5/vendor_perl/5.40.0/Hadoop/Streaming/Role/Iterator.pm
/usr/share/doc/packages/perl-Hadoop-Streaming
/usr/share/doc/packages/perl-Hadoop-Streaming/Changes
/usr/share/doc/packages/perl-Hadoop-Streaming/README
/usr/share/doc/packages/perl-Hadoop-Streaming/examples
/usr/share/doc/packages/perl-Hadoop-Streaming/examples/analog
/usr/share/doc/packages/perl-Hadoop-Streaming/examples/analog/map.pl
/usr/share/doc/packages/perl-Hadoop-Streaming/examples/analog/reduce.pl
/usr/share/doc/packages/perl-Hadoop-Streaming/examples/wordcount
/usr/share/doc/packages/perl-Hadoop-Streaming/examples/wordcount/example_hadoop.sh
/usr/share/doc/packages/perl-Hadoop-Streaming/examples/wordcount/example_local.sh
/usr/share/doc/packages/perl-Hadoop-Streaming/examples/wordcount/input
/usr/share/doc/packages/perl-Hadoop-Streaming/examples/wordcount/input/terms.txt
/usr/share/doc/packages/perl-Hadoop-Streaming/examples/wordcount/map.pl
/usr/share/doc/packages/perl-Hadoop-Streaming/examples/wordcount/reduce.pl
/usr/share/licenses/perl-Hadoop-Streaming
/usr/share/licenses/perl-Hadoop-Streaming/LICENSE
There are 9 more files in this RPM.

 