You have options for performing MATLAB calculations on the GPU:
You can transfer or create data on the GPU, and use the resulting gpuArray as input to enhanced built-in functions that support them. For more information and a list of functions that support gpuArray as inputs, see Run Built-In Functions on a GPU.
You can run your own MATLAB function of element-wise operations on a GPU.
Your decision on which solution to adopt depends on whether the functions you require are enhanced to support gpuArray, and the performance impact of transferring data to/from the GPU.
To execute your MATLAB function on a GPU, call arrayfun or bsxfun with a function handle to the MATLAB function as the first input argument:
result = arrayfun(@myFunction,arg1,arg2);
Subsequent arguments provide inputs to the MATLAB function. These input arguments can be workspace data or gpuArray data. If any of the input arguments is a gpuArray, the function executes on the GPU and returns a gpuArray. (If none of the inputs is a gpuArray, then arrayfun and bsxfun execute on the CPU.)
Note: See the arrayfun and bsxfun reference pages for descriptions of their available options.
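The dispatch rule above can be checked directly. The following is a minimal sketch, assuming Parallel Computing Toolbox and a supported GPU are available; the handle sq and the variable names are hypothetical and chosen only for illustration.

```matlab
% Sketch: the same function handle runs on the CPU or the GPU,
% depending only on the class of the inputs.
sq = @(x) x.^2 + 1;          % hypothetical element-wise function

a = 1:5;                     % ordinary workspace data
g = gpuArray(a);             % copy of the same data in GPU memory

rCpu = arrayfun(sq, a);      % no gpuArray input: executes on the CPU
rGpu = arrayfun(sq, g);      % gpuArray input: executes on the GPU

disp(class(rCpu))            % double
disp(class(rGpu))            % gpuArray
```

Because rGpu stays in GPU memory, it can be passed to further GPU computations without a transfer; call gather only when the result is needed in the workspace.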
In this example, a small function applies correction data to an array of measurement data. The function defined in the file myCal.m is:
function c = myCal(rawdata, gain, offst)
c = (rawdata .* gain) + offst;
The function performs only element-wise operations when applying a gain factor and offset to each element of the rawdata array.
Create some nominal measurement data:
meas = ones(1000)*3; % 1000-by-1000 matrix
The function allows the gain and offset to be arrays of the same size as rawdata, so that unique corrections can be applied to individual measurements. In a typical situation, you might keep the correction data on the GPU so that you do not have to transfer it for each application:
gn = rand(1000,'gpuArray')/100 + 0.995;
offs = rand(1000,'gpuArray')/50 - 0.01;
Run your calibration function on the GPU:
corrected = arrayfun(@myCal,meas,gn,offs);
This runs on the GPU because the input arguments gn and offs are already in GPU memory.
Retrieve the corrected results from the GPU to the MATLAB workspace:
results = gather(corrected);
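As a sanity check, the GPU result can be compared with the same calculation done on the CPU. This is a sketch under stated assumptions: it requires Parallel Computing Toolbox and a supported GPU, and it uses a hypothetical anonymous handle, myCalFcn, with the same body as myCal so that the snippet does not depend on a separate file.

```matlab
% Sketch: verify the GPU calibration against a CPU reference.
myCalFcn = @(rawdata, gain, offst) (rawdata .* gain) + offst;  % same body as myCal

meas = ones(1000)*3;                        % workspace data
gn   = rand(1000,'gpuArray')/100 + 0.995;   % correction data kept on the GPU
offs = rand(1000,'gpuArray')/50 - 0.01;

corrected = arrayfun(myCalFcn, meas, gn, offs);   % executes on the GPU
results   = gather(corrected);                    % back to the workspace

% CPU reference computed from gathered copies of the correction data
reference = meas .* gather(gn) + gather(offs);
diffs     = abs(results - reference);
maxErr    = max(diffs(:));                        % expected to be at rounding level
```

A small nonzero maxErr is possible because the GPU and CPU may round intermediate results differently.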
The function you pass into arrayfun or bsxfun can contain the following built-in MATLAB functions and operators:
Functions:
abs and acos acosh acot acoth acsc acsch asec asech asin asinh atan atan2 atanh beta betaln bitand bitcmp bitget bitor bitset bitshift bitxor ceil complex conj cos cosh cot coth csc csch
double eps eq erf erfc erfcinv erfcx erfinv exp expm1 false fix floor gamma gammaln ge gt hypot imag Inf int8 int16 int32 int64 intmax intmin isfinite isinf isnan ldivide le log log2
log10 log1p logical lt max min minus mod NaN ne not or pi plus pow2 power rand randi randn rdivide real reallog realmax realmin realpow realsqrt rem round sec sech sign sin single
sinh sqrt tan tanh times true uint8 uint16 uint32 uint64 xor
Operators:
+ - .* ./ .\ .^ == ~= < <= > >= & | ~ && ||
Scalar expansion versions of the following:
* / \ ^
Branching and looping keywords:
break continue else elseif for if return while
The function you pass to arrayfun or bsxfun for execution on a GPU can contain the random number generator functions rand, randi, and randn. However, the GPU does not support the complete functionality that MATLAB does.
arrayfun and bsxfun support the following functions for random matrix generation on the GPU:
rand
rand()
rand('single')
rand('double')
randn
randn()
randn('single')
randn('double')
randi
randi()
randi(IMAX, ...)
randi([IMIN IMAX], ...)
randi(..., 'single')
randi(..., 'double')
randi(..., 'int32')
randi(..., 'uint32')
You do not specify the array size for random generation. Instead, the number of generated random values is determined by the sizes of the input variables to your function. In effect, there will be enough random number elements to satisfy the needs of any input or output variables.
For example, suppose your function myfun.m contains the following code that includes generating and using the random matrix R:
function Y = myfun(X)
    R = rand();
    Y = R.*X;
end
If you use arrayfun to run this function with an input variable that is a gpuArray, the function runs on the GPU, where the number of random elements for R is determined by the size of X, so you do not need to specify it. The following code passes the gpuArray matrix G to myfun on the GPU.
G = 2*ones(4,4,'gpuArray')
H = arrayfun(@myfun, G)
Because G is a 4-by-4 gpuArray, myfun generates 16 random scalar values for R, one for each calculation with an element of G.
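The per-element behavior described above can be observed by inspecting the result. This is a minimal sketch, assuming Parallel Computing Toolbox and a supported GPU; scaleRandom is a hypothetical anonymous stand-in for myfun, used here so that the snippet needs no separate file.

```matlab
% Sketch: one random draw is made for each element of the input.
scaleRandom = @(x) rand() .* x;     % hypothetical equivalent of myfun

G = 2*ones(4,4,'gpuArray');
H = arrayfun(scaleRandom, G);       % no size argument is given to rand

Hhost = gather(H);                  % copy the 4-by-4 result to the workspace
inRange = all(Hhost(:) >= 0 & Hhost(:) <= 2);   % each value is rand()*2
```

Every element of G equals 2, so any variation between the elements of Hhost comes from the separate random value drawn for each element.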
Random number generation by arrayfun and bsxfun on the GPU uses the same global stream as gpuArray random generation, as described in Control the Random Stream for gpuArray. For more information about generating random numbers on a GPU, and a comparison between GPU and CPU generation, see Control Random Number Streams. For an example that shows performance comparisons for different random generators, see Generating Random Numbers on a GPU.
The following limitations apply to the code within the function that arrayfun or bsxfun is evaluating on a GPU.
As with arrayfun in MATLAB, the matrix power, multiplication, and division operators (^, *, /, \) perform element-wise calculations only.
Operations that change the size or shape of the input or output arrays (cat, reshape, etc.) are not supported.
When generating random matrices with rand, randi, or randn, you do not need to specify the matrix size, and each element of the matrix has its own random stream. See Generate Random Numbers on a GPU.
arrayfun and bsxfun support read-only indexing (subsref) and access to variables of the parent (outer) function workspace from within nested functions, that is, those variables that exist in the function before the arrayfun/bsxfun evaluation on the GPU. Assignment or subsasgn indexing of these variables from within the nested function is not supported. For an example of the supported usage, see Stencil Operations on a GPU.
Anonymous functions do not have access to their parent function workspace.
Overloading the supported functions is not allowed.
The code cannot call scripts.
There is no ans variable to hold unassigned computation results. Make sure to explicitly assign to variables the results of all calculations that you need to access.
The following language features are not supported: persistent or global variables, parfor, spmd, switch, and try/catch.
P-code files cannot contain a call to arrayfun or bsxfun with gpuArray data.
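The read-only access to parent workspace variables listed among the limitations above can be sketched as follows. This is an illustration under stated assumptions, not the documented Stencil example: the function scaleByLookup, the nested function lookup, and the variable lut are all hypothetical names, and the code assumes Parallel Computing Toolbox and a supported GPU. Save it as scaleByLookup.m.

```matlab
function out = scaleByLookup(idx)
% Sketch: a nested function reads, but never writes, a parent variable.
% idx is assumed to be a gpuArray of valid integer indices into lut.
lut = gpuArray([10 20 30 40]);     % exists before the arrayfun call

    function y = lookup(k)
        y = lut(k) * 2;            % read-only indexing (subsref) is supported
        % lut(k) = 0;              % assignment (subsasgn) is NOT supported
    end

% Assign the result explicitly; there is no ans variable on the GPU.
out = arrayfun(@lookup, idx);
end
```

For example, calling scaleByLookup(gpuArray([1 3 4])) looks up the first, third, and fourth entries of lut and doubles each one. A nested function is used because, as noted above, anonymous functions do not have access to their parent function workspace.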