parallel.cluster.Hadoop

Create Hadoop cluster object

Syntax

hadoopCluster = parallel.cluster.Hadoop

hadoopCluster = parallel.cluster.Hadoop(Name,Value)

Description

hadoopCluster = parallel.cluster.Hadoop creates a parallel.cluster.Hadoop object representing the Hadoop® cluster.

Pass the resulting object to the mapreduce and mapreducer functions to specify the Hadoop cluster as the parallel execution environment for tall arrays and mapreduce.

hadoopCluster = parallel.cluster.Hadoop(Name,Value) uses the specified names and values to set properties on the created parallel.cluster.Hadoop object.
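
As a minimal sketch of the Name,Value syntax, the following call sets two of the properties documented under Input Arguments in one step. Both paths are placeholders chosen for illustration, not values from a real installation, and the code assumes Parallel Computing Toolbox is available.

```matlab
% Sketch only: both folder paths below are hypothetical placeholders.
hadoopCluster = parallel.cluster.Hadoop( ...
    'HadoopInstallFolder','/host/hadoop-install', ...
    'ClusterMatlabRoot','/host/matlab-install');

% Display the object to check the properties that were set.
disp(hadoopCluster)
```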

Examples

Set Hadoop Cluster as Execution Environment for mapreduce and mapreducer

This example shows how to create and use a parallel.cluster.Hadoop object to set a Hadoop cluster as the mapreduce parallel execution environment.

hadoopCluster = parallel.cluster.Hadoop('HadoopInstallFolder','/host/hadoop-install');
mr = mapreducer(hadoopCluster);
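
The mapreducer object returned above can then be passed as the fourth argument to mapreduce. The following is a rough sketch of that step, not a tested workflow: the HDFS paths, the ArrDelay variable, and the maxDelayMapper and maxDelayReducer functions are all hypothetical names introduced for illustration, and the two functions are assumed to be saved in their own files on the MATLAB path.

```matlab
% Sketch only: the paths, the variable name, and both function names are hypothetical.
hadoopCluster = parallel.cluster.Hadoop('HadoopInstallFolder','/host/hadoop-install');
mr = mapreducer(hadoopCluster);

% Datastore over a (hypothetical) CSV file stored in HDFS.
ds = datastore('hdfs:///data/flights.csv', ...
    'TreatAsMissing','NA', ...
    'SelectedVariableNames',{'ArrDelay'});

% Run the job on the Hadoop cluster and write the output to HDFS.
outds = mapreduce(ds,@maxDelayMapper,@maxDelayReducer,mr, ...
    'OutputFolder','hdfs:///results/max-delay');
result = readall(outds);

% --- maxDelayMapper.m (assumed to be a separate file) ---
function maxDelayMapper(data,~,intermKVStore)
    % Emit the maximum of the current chunk under a single key.
    add(intermKVStore,'PartialMax',max(data.ArrDelay));
end

% --- maxDelayReducer.m (assumed to be a separate file) ---
function maxDelayReducer(~,intermValIter,outKVStore)
    % Combine the per-chunk maxima into one overall maximum.
    maxVal = -Inf;
    while hasnext(intermValIter)
        maxVal = max(maxVal,getnext(intermValIter));
    end
    add(outKVStore,'MaxArrDelay',maxVal);
end
```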

Set Hadoop Cluster as Execution Environment for tall arrays

This example shows how to create and use a parallel.cluster.Hadoop object to set a Hadoop cluster as the tall array parallel execution environment.

hadoopCluster = parallel.cluster.Hadoop(...
    'HadoopInstallFolder','/host/hadoop-install', ...
    'SparkInstallFolder','/host/spark-install');
mr = mapreducer(hadoopCluster);
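
Once the cluster is set as the execution environment, tall array operations are evaluated on it when you call gather. The sketch below assumes a CSV file at a hypothetical HDFS location with a numeric ArrDelay variable. Neither the path nor the variable name comes from this page.

```matlab
% Sketch only: the install folders, HDFS path, and variable name are hypothetical.
hadoopCluster = parallel.cluster.Hadoop( ...
    'HadoopInstallFolder','/host/hadoop-install', ...
    'SparkInstallFolder','/host/spark-install');
mapreducer(hadoopCluster);   % make the cluster the current execution environment

% Datastore over a file stored in HDFS.
ds = datastore('hdfs:///data/flights.csv', ...
    'TreatAsMissing','NA', ...
    'SelectedVariableNames',{'ArrDelay'});

tt = tall(ds);                              % tall table backed by the datastore
meanDelay = mean(tt.ArrDelay,'omitnan');    % deferred, not evaluated yet
meanDelay = gather(meanDelay);              % triggers evaluation on the cluster
```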

Input Arguments

Name-Value Pair Arguments

Specify optional comma-separated pairs of Name,Value arguments. Name is the argument name and Value is the corresponding value. Name must appear inside single quotes (' '). You can specify several name and value pair arguments in any order as Name1,Value1,...,NameN,ValueN.

Example: 'HadoopInstallFolder','/share/hadoop/a1.2.1'

`'ClusterMatlabRoot'` — Path to MATLAB® for workers
character vector

Path to MATLAB for workers, specified as the comma-separated pair consisting of 'ClusterMatlabRoot' and a character vector. This path points to the installation of MATLAB Distributed Computing Server™ that the workers use, whether it is local to each machine or on a network share.

`'HadoopConfigurationFile'` — Path to Hadoop application configuration file
character vector

Path to Hadoop application configuration file, specified as the comma-separated pair consisting of 'HadoopConfigurationFile' and a character vector.

`'HadoopInstallFolder'` — Path to Hadoop installation on worker machines
character vector

Path to Hadoop installation on worker machines, specified as the comma-separated pair consisting of 'HadoopInstallFolder' and a character vector. If this property is not set, the default is the value of the environment variable HADOOP_PREFIX or, if that variable is not set, the value of HADOOP_HOME.
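
The fallback order described above can be spelled out explicitly. This is only an illustration of the documented order using getenv, not the toolbox's internal implementation, and installFolder is a name introduced here.

```matlab
% Sketch only: mirrors the documented default order, HADOOP_PREFIX then HADOOP_HOME.
installFolder = getenv('HADOOP_PREFIX');
if isempty(installFolder)
    installFolder = getenv('HADOOP_HOME');
end

if isempty(installFolder)
    error('Neither HADOOP_PREFIX nor HADOOP_HOME is set.');
end

% Passing the resolved folder explicitly is equivalent in intent to relying on the default.
hadoopCluster = parallel.cluster.Hadoop('HadoopInstallFolder',installFolder);
```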

`'SparkInstallFolder'` — Path to Spark®-enabled Hadoop installation on worker machines
character vector

Path to Spark-enabled Hadoop installation on worker machines, specified as the comma-separated pair consisting of 'SparkInstallFolder' and a character vector. If this property is not set, the default is the value of the environment variable SPARK_PREFIX or, if that variable is not set, the value of SPARK_HOME.

Output Arguments

`hadoopCluster` — Hadoop cluster
parallel.cluster.Hadoop object

Hadoop cluster, returned as a parallel.cluster.Hadoop object.

Introduced in R2014b
