Big Data

Accelerate mapreduce and datastore programs by running on a parallel pool or Hadoop® cluster

Parallel Computing Toolbox™ extends the capabilities of MATLAB® MapReduce and Datastore, so that you can run big data applications on a parallel pool for improved performance. MATLAB Distributed Computing Server™ also supports running parallel MapReduce programs on Hadoop clusters.
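As a rough illustration of the workflow described above, the following sketch points mapreduce at the current parallel pool and computes a mean arrival delay. It assumes the sample file airlinesmall.csv that ships with MATLAB is on the path; the names sumCountMapper and sumCountReducer are made up for this example. The two functions are written as local functions in a script (R2016b or later); they can equally be saved in their own files on the path.

```matlab
% Datastore over a sample file that ships with MATLAB.
ds = datastore('airlinesmall.csv', ...
    'TreatAsMissing', 'NA', ...
    'SelectedVariableNames', 'ArrDelay');

pool = gcp;          % current pool; starts one if none is open (default preferences)
mapreducer(pool);    % later mapreduce calls run on this pool

result = mapreduce(ds, @sumCountMapper, @sumCountReducer);
out = readall(result);                               % table with Key and Value
meanArrDelay = out.Value{1}(1) / out.Value{1}(2);    % total delay / number of flights

% Hypothetical mapper: emit [sum, count] of the non-missing delays in each chunk.
function sumCountMapper(data, ~, intermKVStore)
    d = data.ArrDelay(~isnan(data.ArrDelay));
    add(intermKVStore, 'SumCount', [sum(d), numel(d)]);
end

% Hypothetical reducer: add up the partial [sum, count] pairs for the key.
function sumCountReducer(key, intermValIter, outKVStore)
    total = [0, 0];
    while hasnext(intermValIter)
        total = total + getnext(intermValIter);
    end
    add(outKVStore, key, total);
end
```

Calling mapreducer(0) afterwards switches mapreduce back to serial execution in the local MATLAB session.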

Functions

mapreduce        Programming technique for analyzing data sets that do not fit in memory
mapreducer       Define parallel execution environment for mapreduce
partition        Partition a datastore
numpartitions    Number of partitions
parpool          Create parallel pool on cluster
gcp              Get current parallel pool
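The partitioning functions listed above can also be used without mapreduce. The sketch below is one possible pattern, not the only one: it asks numpartitions for a partition count suited to the current pool, then has each parfor iteration read its own partition. It assumes the sample file airlinesmall.csv that ships with MATLAB; the variable names are illustrative.

```matlab
% Datastore over a sample file that ships with MATLAB.
ds = datastore('airlinesmall.csv', ...
    'TreatAsMissing', 'NA', ...
    'SelectedVariableNames', 'ArrDelay');

pool = gcp;                      % current pool; starts one if none is open
n = numpartitions(ds, pool);     % partition count suggested for this pool

partialSum = zeros(n, 1);
partialCount = zeros(n, 1);
parfor ii = 1:n
    subds = partition(ds, n, ii);    % the ii-th of n partitions
    s = 0;
    c = 0;
    while hasdata(subds)
        t = read(subds);
        d = t.ArrDelay(~isnan(t.ArrDelay));   % drop missing delays
        s = s + sum(d);
        c = c + numel(d);
    end
    partialSum(ii) = s;
    partialCount(ii) = c;
end

% Combine the per-partition results on the client.
meanArrDelay = sum(partialSum) / sum(partialCount);
```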

Classes

parallel.Pool              Access parallel pool
parallel.cluster.Hadoop    Hadoop cluster for mapreducer
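For a Hadoop cluster, the execution environment is set up from a parallel.cluster.Hadoop object instead of a pool. The fragment below is a minimal sketch: the install path is a placeholder, and it assumes a working Hadoop installation together with MATLAB Distributed Computing Server.

```matlab
% Placeholder path: replace with the Hadoop install folder on your system.
setenv('HADOOP_HOME', '/path/to/hadoop/install');

cluster = parallel.cluster.Hadoop;   % cluster object for the Hadoop installation
mr = mapreducer(cluster);            % execution environment backed by that cluster

% Pass mr as the fourth argument to run a job there, for example:
%   result = mapreduce(ds, @myMapper, @myReducer, mr);
% where ds, myMapper, and myReducer are your own datastore and functions.
```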
