Name : perl-Apache-Hadoop-Config
Version : 0.01
Vendor : obs://build_opensuse_org/devel:languages:perl
Release : lp154.2.1
Date : 2023-01-27 18:01:54
Group : Development/Libraries/Perl
Source RPM : perl-Apache-Hadoop-Config-0.01-lp154.2.1.src.rpm
Size : 0.02 MB
Packager : https://www_suse_com/
Summary : Perl extension for Hadoop node configuration
Description :
The Perl extension Apache::Hadoop::Config addresses Hadoop deployment and configuration practices, enabling rapid provisioning of a customized Hadoop cluster. It has two distinct capabilities: (1) generating configuration files and (2) creating namenode and datanode repositories.
To generate configuration files, the package should ideally be installed on at least one node in the cluster, on the assumption that all nodes have identical hardware. Alternatively, it can be installed on any other machine: supply the required hardware information as arguments, generate the configuration files, and copy them to the actual cluster nodes.
To create the namenode and datanode repositories, the package must be installed on ALL Hadoop cluster nodes.
Create a new Apache::Hadoop::Config object, either from the detected system configuration or by supplying values as arguments.
my $h = Apache::Hadoop::Config->new;
Basic configuration and memory settings are generated by two functions. Calling the basic configuration function is required; calling the memory configuration function is recommended.
$h->basic_config; $h->memory_config;
The package can either print the generated configuration or write it out as XML configuration files, using the print and write functions respectively. Writing requires a writable conf directory.
$h->print_config; $h->write_config (confdir=>'etc/hadoop');
Additional configuration parameters can be supplied at the time of creating the object.
my $h = Apache::Hadoop::Config->new (
    config => {
        'mapred-site.xml' => {
            'mapreduce.task.io.sort.mb' => 256,
        },
        'core-site.xml' => {
            'hadoop.tmp.dir' => '/tmp/hadoop',
        },
    },
);
These parameters override any automatically generated parameters built into this package.
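For orientation, Hadoop site files use the standard configuration/property layout. Assuming write_config emits the overrides in that layout (an assumption; the module's exact output is not shown in this description), the core-site.xml entry from the example above would look roughly like this:

```xml
<?xml version="1.0"?>
<configuration>
  <!-- Override supplied via config => { 'core-site.xml' => { ... } } -->
  <property>
    <name>hadoop.tmp.dir</name>
    <value>/tmp/hadoop</value>
  </property>
</configuration>
```

Any automatically generated properties would presumably appear alongside this entry in the same file.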
The package creates the namenode and datanode volumes and sets the permissions of the hadoop.tmp.dir and log directories. The disk information can be supplied at object construction time.
my $h = Apache::Hadoop::Config->new (
    hdfs_name_disks => [ '/hdfs/namedisk1', '/hdfs/namedisk2' ],
    hdfs_data_disks => [ '/hdfs/datadisk1', '/hdfs/datadisk2' ],
    hdfs_tmp        => '/hdfs/tmp',
    hdfs_logdir     => [ '/logs', '/logs/userlog' ],
);
Note that the name disks and data disks are passed as array references. The following calls create all the namenode and datanode volumes, as well as the tmp and log directories.
$h->create_hdfs_name_disks;
$h->create_hdfs_data_disks;
$h->create_hdfs_tmpdir;
$h->create_hadoop_logdir;
Permissions are set as appropriate. It is strongly recommended that this package and any associated script be run as the Hadoop admin user (hduser).
Some of the basic configuration can be customized through constructor arguments: the namenode, secondary namenode, and proxy node can each be specified. The default for each is localhost.
my $h = Apache::Hadoop::Config->new (
    namenode  => 'nn.myorg.com',
    secondary => 'nn2.myorg.com',
    proxynode => 'pr.myorg.com',
    proxyport => '8888',  # default, optional
);
These arguments are optional; they are needed only when the secondary namenode or proxy node differs from the primary namenode.
RPM found in directory: /packages/linux-pbone/ftp5.gwdg.de/pub/opensuse/repositories/devel:/languages:/perl:/CPAN-A/15.4/noarch