Consider this line of code:
gpuArray(-1)^0.5;
Which results in:
ans = 0.0000 + 1.0000i
Now consider the following line of code:
gpuArray(-1).^0.5;
Which results in:
Error using .^
POWER: needs to return a complex result, but this is not supported for real input X and Y on the GPU. Use POWER(COMPLEX(X), COMPLEX(Y,0)) instead.
The problem clearly has something to do with a double -> complex double conversion on the GPU, which is not allowed. Indeed, when I apply the workaround (which is also mentioned in the docs) it solves the problem, but I don't understand why.
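For reference, here is a minimal sketch of that workaround, following the wording of the error message. The variable names x and y are only for illustration, and it assumes a supported GPU and Parallel Computing Toolbox:

```matlab
% Make the input complex up front, so the GPU never has to
% switch from a real to a complex result partway through.
x = gpuArray(-1);
y = power(complex(x), complex(0.5, 0));   % form suggested by the error message

% Bring the result back to the CPU to inspect it;
% it should be (numerically) the imaginary unit.
disp(gather(y))
```

Because the input is already stored as complex, the output is allocated as complex from the start and no error is thrown.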
Could anybody shed some light on this? Is this a limitation of VRAM? Of the specific card I'm using (a GTX 660, compute capability 3.0)? Of the MATLAB implementation (I'm using R2018b)? Of the OS?
There are a few methods of gpuArray that behave this way, and the reason is simple: performance.
It is perfectly possible to write an implementation of e.g. sqrt that behaves on the GPU the same way that MATLAB's CPU implementation works (i.e. compute a real result unless a complex result is required, in which case return a complex result). Part of the work is already performed; otherwise the gpuArray method wouldn't know when to throw an error. However, the expensive part is then re-allocating the (complex) output and performing the operation again.
There are other slight but noticeable quirks relating to gpuArray and complex numbers: on the GPU, all-zero imaginary parts are not removed when the MATLAB CPU implementation would remove them. (Remember, of course, that MATLAB's isreal function tells you about storage, not values.)
EDIT: Just realised that there's a specific doc reference for the functions of gpuArray that behave this way.
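To illustrate the point about all-zero imaginary parts, here is a minimal sketch (not the original example; the variable names a and b are just for illustration). It compares storage on the CPU and the GPU after an operation that cancels the imaginary part. The GPU behaviour noted in the comments is what the description above leads one to expect, and may differ between releases:

```matlab
% CPU: arithmetic that cancels the imaginary part gives real storage.
a = (1 + 2i) - 2i;
disp(isreal(a))        % true on the CPU

% GPU: the same arithmetic; per the description above, the all-zero
% imaginary part is expected to be kept, so storage stays complex.
b = gpuArray(1 + 2i) - 2i;
disp(isreal(b))

% Either way the values agree; only the storage differs.
disp(gather(b) == a)
```

The thing to compare is the two isreal results: the values are the same, and isreal only reports how each result is stored.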