Configuration File for Creating Deployable Archive Using the mcc Command

When creating a deployable archive using the mcc command, you must create a text file that specifies the following parameters:

Parameter | Description

mw.ds.out.type

Output type of data from Hadoop® mapreduce job

The options are:

  • keyvalue

  • tabulartext

mw.mapper

Name of MATLAB® map function

mw.reducer

Name of MATLAB reduce function

mw.ds.in.format

Name of MAT-file containing a datastore object representing the format of the data to be processed.

In most cases, you start by working with a small sample dataset that resides on a local machine and is representative of the actual dataset on the cluster. This sample dataset has the same structure and variables as the actual dataset on the cluster. By creating a datastore object to the sample dataset on your local machine, you take a snapshot of that structure. With access to this datastore object, a Hadoop job executing on the cluster knows how to access and process the actual dataset residing on HDFS™ (see the sketch following this table).

mw.ds.in.type

Input type of data to Hadoop mapreduce job

The options are:

  • keyvalue

  • tabulartext

mw.ds.in.fullfile

Default value is false
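
For example, the datastore object referenced by mw.ds.in.format can be created from a local sample file and saved to a MAT-file as follows. This is a minimal sketch: airlinesmall.csv is a sample dataset shipped with MATLAB used here as a stand-in for your own data, and the MAT-file name infoAboutDataset.mat is a placeholder for whatever name you list in mw.ds.in.format.

    % Create a datastore to a local sample file that has the same
    % structure and variables as the actual data stored on HDFS
    ds = datastore('airlinesmall.csv', ...
        'TreatAsMissing', 'NA', ...
        'SelectedVariableNames', 'ArrDelay');

    % Save the datastore object to a MAT-file; this MAT-file is the
    % value you supply for the mw.ds.in.format parameter
    save('infoAboutDataset.mat', 'ds');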

Note

The Hadoop Compiler app automatically populates the above parameters after you select the map function, the reduce function, the input type, and the output type. You can view the contents of your settings file in the Configuration file contents section of the Hadoop Compiler app.

Example:

 config.txt
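
A config.txt file for tabular text input might look like the following sketch, assuming one parameter = value pair per line. The function and file names shown (maxArrivalDelayMapper, maxArrivalDelayReducer, infoAboutDataset.mat) are placeholders for your own map function, reduce function, and datastore MAT-file.

    mw.ds.in.type = tabulartext
    mw.ds.in.format = infoAboutDataset.mat
    mw.ds.in.fullfile = false
    mw.mapper = maxArrivalDelayMapper
    mw.reducer = maxArrivalDelayReducer
    mw.ds.out.type = keyvalue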
