Hadoop Configuration

When Using Hadoop Standalone Mode

To execute a deployed MATLAB® application or run a deployable archive as a Hadoop® job in standalone mode, first set the appropriate environment variables in the Hadoop environment shell:

  • Modify HADOOP_CLASSPATH according to your Hadoop version.

    • If you are working with Hadoop V1, use mcr_root/toolbox/mlhadoop/jar/a1.2.1/mwmapreduce.jar

    • If you are working with Hadoop V2, use mcr_root/toolbox/mlhadoop/jar/a2.2.0/mwmapreduce.jar

      where mcr_root is the root folder of the MATLAB Runtime installation.

  • Export LD_LIBRARY_PATH to include the following entries:

    • mcr_root/runtime/glnxa64:mcr_root/bin/glnxa64:mcr_root/sys/os/glnxa64:mcr_root/sys/opengl/glnxa64

      where mcr_root is the root folder of the MATLAB Runtime installation.
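The two settings above can be sketched as a short shell fragment for the Hadoop environment shell. This is a minimal sketch, not an official script: MCR_ROOT, HADOOP_MAJOR, MW_JAR, and MW_LIBS are illustrative variable names introduced here, and /path/to/mcr_root is a placeholder that you must replace with your actual MATLAB Runtime install root.

```shell
# Placeholder: replace with the actual MATLAB Runtime install root (mcr_root).
MCR_ROOT="${MCR_ROOT:-/path/to/mcr_root}"

# Illustrative variable (not a Hadoop or MATLAB setting): Hadoop major version, 1 or 2.
HADOOP_MAJOR="${HADOOP_MAJOR:-2}"

# Pick the mwmapreduce.jar that matches the Hadoop version.
case "$HADOOP_MAJOR" in
  1) MW_JAR="$MCR_ROOT/toolbox/mlhadoop/jar/a1.2.1/mwmapreduce.jar" ;;
  2) MW_JAR="$MCR_ROOT/toolbox/mlhadoop/jar/a2.2.0/mwmapreduce.jar" ;;
  *) echo "Unsupported Hadoop major version: $HADOOP_MAJOR" >&2
     return 1 2>/dev/null || exit 1 ;;
esac

# Append the jar to HADOOP_CLASSPATH, keeping any existing entries.
export HADOOP_CLASSPATH="${HADOOP_CLASSPATH:+$HADOOP_CLASSPATH:}$MW_JAR"

# Append the MATLAB Runtime library folders to LD_LIBRARY_PATH.
MW_LIBS="$MCR_ROOT/runtime/glnxa64:$MCR_ROOT/bin/glnxa64:$MCR_ROOT/sys/os/glnxa64:$MCR_ROOT/sys/opengl/glnxa64"
export LD_LIBRARY_PATH="${LD_LIBRARY_PATH:+$LD_LIBRARY_PATH:}$MW_LIBS"
```

Appending rather than overwriting preserves any entries already present in either variable. Whether your site prefers appending or prepending is a local choice.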

Hadoop Version Considerations

  • If you are working with Hadoop V1, you can improve performance by setting mapred.job.reuse.jvm.num.tasks to -1, which places no limit on the number of tasks that can reuse a JVM.

  • If you are working with Hadoop V2, the mapred.job.reuse.jvm.num.tasks property is not supported.
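As a rough illustration of the version rule above, the following shell helper produces the JVM-reuse setting only for Hadoop V1. The function name jvm_reuse_opt is hypothetical. The sketch also assumes that your job launch command accepts Hadoop generic options of the form -D property=value. If it does not, the property can instead be set in the Hadoop V1 mapred-site.xml configuration file.

```shell
# Hypothetical helper: print the JVM-reuse option for Hadoop V1, nothing otherwise.
# Argument 1 is the Hadoop major version (1 or 2).
jvm_reuse_opt() {
  if [ "$1" = "1" ]; then
    # -1 places no limit on the number of tasks per JVM.
    printf '%s\n' "-D mapred.job.reuse.jvm.num.tasks=-1"
  fi
}

# Example: show the option that would be added for each version.
echo "Hadoop V1 option: $(jvm_reuse_opt 1)"
echo "Hadoop V2 option: $(jvm_reuse_opt 2)"
```

Keeping the option in a helper makes it easy to leave out on Hadoop V2, where the property is not supported.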
