Distribute Arrays

Distributed Versus Codistributed Arrays

When you create a distributed array in the MATLAB client, its data is stored on the workers of the open parallel pool. The array is distributed in one dimension, the last nonsingleton dimension, and is divided as evenly as possible along that dimension among the workers. You cannot control the details of distribution when creating a distributed array.
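
The default distribution can be inspected from the workers. The following is a minimal sketch, assuming a two-worker pool on the 'local' profile (as in the other examples in this topic) and that no pool is already open; the variable names are illustrative only.

```matlab
parpool('local',2);              % Assumes no pool is open yet; 2 workers
D = distributed(magic(6));       % 6-by-6 array, so the last nonsingleton dimension is 2
spmd
    codist    = getCodistributor(D);       % Inside spmd, D is a codistributed array
    distDim   = codist.Dimension;          % Dimension along which D is distributed
    partition = codist.Partition;          % Number of columns held by each worker
    localSize = size(getLocalPart(D));     % Size of the piece stored on this worker
end
% Variables assigned inside spmd are Composite objects in the client.
% Index with {} to read one worker's value before closing the pool.
distDim1   = distDim{1};
partition1 = partition{1};
localSize2 = localSize{2};
delete(gcp);                     % Stop pool
```

With two workers, distDim1 should be 2 and each worker should hold three of the six columns.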

You can create a codistributed array by executing code on the workers themselves: inside an spmd statement, in pmode, or inside a communicating job. When creating a codistributed array, you can control all aspects of distribution, including dimensions and partitions.

The relationship between distributed and codistributed arrays is one of perspective. Codistributed arrays are partitioned among the workers from which you execute code to create or manipulate them. Distributed arrays are partitioned among workers in the parallel pool. When you create a distributed array in the client, you can access it as a codistributed array inside an spmd statement. When you create a codistributed array in an spmd statement, you can access it as a distributed array in the client. Only spmd statements let you access the same array data from two different perspectives.
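
The two perspectives can be checked with the iscodistributed and isdistributed functions. This is a small sketch under the same assumptions as the other examples here (a two-worker 'local' pool, no pool already open); the variable names are illustrative.

```matlab
parpool('local',2);                      % Assumes no pool is open yet
spmd
    C = ones(4,8,'codistributed');       % Created on the workers
    seenOnWorker = iscodistributed(C);   % Worker perspective: codistributed
end
seenOnWorker1 = seenOnWorker{1};         % Read worker 1's value from the Composite
seenInClient  = isdistributed(C);        % Client perspective: same data, distributed
A = gather(C);                           % Copy the data into an ordinary client array
delete(gcp);                             % Stop pool
```

Call gather before the pool is deleted, because the array data lives on the workers.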

Create Distributed Arrays

You can create a distributed array in any of several ways:

  • Use the distributed function to distribute an existing array from the client workspace to the workers of a parallel pool.

  • Use any of the overloaded distributed object methods to directly construct a distributed array on the workers. This technique does not require that the array already exists in the client, thereby reducing client workspace memory requirements. These overloaded functions include eye(___,'distributed'), rand(___,'distributed'), etc. For a full list, see the distributed object reference page.

  • Create a codistributed array inside an spmd statement, then access it as a distributed array outside the spmd statement. This lets you use distribution schemes other than the default.

The first two of these techniques do not involve spmd in creating the array, but you can use spmd to manipulate arrays created this way. For example:

Create an array in the client workspace, then make it a distributed array:

parpool('local',2) % Create pool
W = ones(6,6);
W = distributed(W); % Distribute to the workers
spmd
    T = W*2; % Calculation performed on workers, in parallel.
             % T and W are both codistributed arrays here.
end
T            % View results in client.
whos         % T and W are both distributed arrays here.
delete(gcp)  % Stop pool
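
The second technique, constructing the array directly on the workers, might look like the following sketch. It assumes a two-worker 'local' pool with no pool already open; the array size and variable names are chosen only for illustration.

```matlab
parpool('local',2);                  % Assumes no pool is open yet
R = rand(1000,4,'distributed');      % Built on the workers; no full copy in the client
createdAsDistributed = isdistributed(R);
colMeans = mean(R,1);                % Computed on the workers
result = gather(colMeans);           % 1-by-4 ordinary array in the client
delete(gcp);                         % Stop pool
```

Only the small result is transferred to the client, which is the memory saving this technique is meant to provide.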

Create Codistributed Arrays

You can create a codistributed array in any of several ways:

  • Use the codistributed function inside an spmd statement, a communicating job, or pmode to codistribute data already existing on the workers running that job.

  • Use any of the overloaded codistributed object methods to directly construct a codistributed array on the workers. This technique does not require that the array already exists on the workers. These overloaded functions include eye(___,'codistributed'), rand(___,'codistributed'), etc. For a full list, see the codistributed object reference page.

  • Create a distributed array outside an spmd statement, then access it as a codistributed array inside the spmd statement running on the same parallel pool.

In this example, you create a codistributed array inside an spmd statement, using a nondefault distribution scheme. First, define a 1-D distribution along the third dimension, with 4 parts on worker 1 and 12 parts on worker 2. Then create a 3-by-3-by-16 array of zeros.

parpool('local',2) % Create pool
spmd
    codist = codistributor1d(3,[4,12]);
    Z = zeros(3,3,16,codist);
    Z = Z + labindex;
end
Z  % View results in client.
   % Z is a distributed array here.
delete(gcp) % Stop pool
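
The first technique in the list above, codistributing data that already exists on the workers, can be sketched as follows. It assumes a two-worker 'local' pool with no pool already open; the partition [1 3] and the variable names are illustrative choices, not defaults.

```matlab
parpool('local',2);                        % Assumes no pool is open yet
spmd
    A = magic(4);                          % Replicated: each worker holds the full array
    codist = codistributor1d(1,[1 3]);     % Distribute along rows: 1 row on worker 1, 3 on worker 2
    C = codistributed(A,codist);           % Codistribute the existing data
    localRows = size(getLocalPart(C),1);   % Number of rows stored on this worker
end
rows1 = localRows{1};                      % Read per-worker values from the Composite
rows2 = localRows{2};
gathered = gather(C);                      % C is a distributed array in the client
delete(gcp);                               % Stop pool
```

The partition entries must sum to the size of the array along the distribution dimension, here 1 + 3 = 4 rows.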

For more details on codistributed arrays, see Working with Codistributed Arrays.
