Hadoop Compiler

Package MATLAB programs for deployment to Hadoop clusters as MapReduce programs

Description

The Hadoop Compiler app packages MATLAB® map and reduce functions into a deployable archive. You can incorporate the archive into a Hadoop® mapreduce job by passing it as a payload argument to a job submitted to a Hadoop cluster.

Open the Hadoop Compiler App

  • MATLAB Toolstrip: On the Apps tab, under Application Deployment, click the app icon.

  • MATLAB command prompt: Enter hadoopCompiler.

Parameters


Function for the mapper, specified as a character vector.

Function for the reducer, specified as a character vector.
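The mapper and reducer follow the standard MATLAB mapreduce function signatures. A minimal sketch is shown below; the function names, the ArrivalDelay field, and the 'MaxDelay' key are illustrative assumptions, not part of the app itself. In practice, each function goes in its own .m file whose name you enter in the corresponding parameter field.

```matlab
% Illustrative mapper: receives a chunk of data from the datastore and
% emits an intermediate key/value pair (here, the chunk's maximum delay).
function maxDelayMapper(data, info, intermKVStore)
    partMax = max(data.ArrivalDelay, [], 'omitnan');
    add(intermKVStore, 'MaxDelay', partMax);
end

% Illustrative reducer: folds all intermediate values for a key into one
% final value and writes it to the output key/value store.
function maxDelayReducer(intermKey, intermValIter, outKVStore)
    overallMax = -inf;
    while hasnext(intermValIter)
        overallMax = max(overallMax, getnext(intermValIter));
    end
    add(outKVStore, intermKey, overallMax);
end
```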

A file containing a datastore representing the data to be processed, specified as a character vector.

In most cases, you start by working on a small sample dataset, residing on your local machine, that is representative of the actual dataset on the cluster. The sample dataset has the same structure and variables as the actual dataset. Creating a datastore object that points to the sample on your local machine captures a snapshot of that structure. With access to this datastore object, a Hadoop job executing on the cluster knows how to access and process the actual dataset residing on HDFS™.
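A sketch of this workflow, assuming a local tabular text sample: the file name, variable selection, and MAT-file name below are illustrative, not requirements of the app.

```matlab
% Create a datastore over a representative local sample of the data.
ds = datastore('airlinesmall.csv', ...
    'SelectedVariableNames', {'ArrDelay'}, ...
    'TreatAsMissing', 'NA');

% Save the datastore to a MAT-file; this file is what you supply as the
% input file parameter in the Hadoop Compiler app.
save('infoAboutDataset.mat', 'ds');
```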

Format of the output from the Hadoop mapreduce job, specified as keyvalue or tabular text.

Additional parameters to configure how Hadoop executes the job, specified as a character vector. For more information, see Configuration File for Creating Deployable Archive Using the mcc Command.

Files that must be included with the generated artifacts, specified as a list of files.

Settings

Flags controlling the behavior of the compiler, specified as a character vector.

Folder where files for testing are stored, specified as a character vector.

Folder where generated artifacts are stored, specified as a character vector.

Programmatic Use

hadoopCompiler opens the Hadoop Compiler app.

See Also

Introduced in R2014b
