gather — Gather gpuArray data back to host memory in MATLAB and RunMat.

gather(X) copies GPU-resident data back to host memory. In RunMat, gpuArray handles become dense MATLAB values, while CPU-resident inputs pass through unchanged.

Syntax

X = gather(X)
[X1, X2, ...] = gather(X1, X2, ...)

Inputs

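| Name | Type | Required | Default | Description |
|------|------|----------|---------|-------------|
| X    | Any  | Yes      |         | Input value to gather from GPU to host. |
| X1   | Any  | Yes      |         | First input value to gather. |
| Xn   | Any  | Variadic |         | Additional input values to gather. |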

Returns

| Name | Type | Description |
|------|------|-------------|
| X    | Any  | Host-resident value gathered from input. |
| X    | Any  | Host-resident outputs matching each gathered input. |

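What gather returns depends on how many outputs the caller requests: a single output receives the gathered value (or, when several inputs are passed, a 1×N cell array of gathered values), while multiple outputs each receive the gathered value of the corresponding input.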

Errors

| Identifier | When | Message |
|------------|------|---------|
| RunMat:gather:NotEnoughInputs | No input arguments were provided. | gather: not enough input arguments |
| RunMat:gather:TooManyOutputs | Requested outputs exceed one for single-input gather. | gather: too many output arguments |
| RunMat:gather:OutputCountMismatch | Requested output count does not match number of input arguments. | gather: number of outputs must match number of inputs |
| RunMat:gather:InternalError | Internal output container construction failed. | gather: internal error |
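
The errors listed above can be inspected with an ordinary try/catch. The following is a minimal sketch, assuming that the output-count check also applies to host-resident inputs (the source does not say whether validation happens before or after residency is examined). The variable names are illustrative only.

```matlab
% Sketch: request two outputs for three inputs and inspect the error.
% Assumption: the arity check fires even when no input is a gpuArray.
x = 1; y = 2; z = 3;
caught = false;
try
    [a, b] = gather(x, y, z);   % 3 inputs, 2 outputs: counts do not match
catch err
    caught = true;
    % Per the table above, RunMat is expected to report
    % RunMat:gather:OutputCountMismatch here.
    disp(err.identifier);
    disp(err.message);
end
```

Branching on err.identifier rather than on the message text is usually the more robust choice, since the identifier is the documented, stable part of the error.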

How gather works

  • Accepts any MATLAB value. Non-GPU inputs (numbers, logicals, structs, strings, etc.) pass through untouched, so gather is safe to call unconditionally at API boundaries.
  • Downloads gpuArray tensors via the active acceleration provider, producing dense double-precision matrices. Logical gpuArray inputs return logical arrays with MATLAB-compatible 0/1 encoding.
  • Recursively descends into cells, structs, and objects, gathering every nested gpuArray handle. This mirrors MATLAB's behaviour when you gather composite data structures.
  • Clears residency metadata so the auto-offload planner treats the gathered value as host-resident.
  • Supports multiple inputs: in a single-output context it returns a 1×N cell array preserving the original order; in a multi-output assignment the number of inputs and outputs must match, mirroring MATLAB's requirement.
  • Raises gather: no acceleration provider registered when you attempt to download gpuArray data without an active provider, and propagates provider-specific download errors verbatim.
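
The recursive behaviour described above can be illustrated with a nested container. This is a small sketch, assuming a provider is registered so that gpuArray construction succeeds. The names payload and weights are made up for the example.

```matlab
% A cell array holding a struct whose field is a gpuArray, plus a host value.
payload = {struct('weights', gpuArray([0.5 1.5])), 'meta'};

% One call gathers the whole container, including the nested field.
hostPayload = gather(payload);

% The nested field should now be an ordinary host array.
w = hostPayload{1}.weights;
disp(class(w));
disp(isequal(w, [0.5 1.5]));

% The host-only element is expected to pass through untouched.
disp(hostPayload{2});
```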

Does RunMat run gather on the GPU?

gather itself runs on the CPU. When the input contains gpuArray handles, the builtin calls the provider's download hook to retrieve a HostTensorOwned view, converts the result into MATLAB data, and clears residency via runmat_accelerate_api::clear_residency. If the provider does not implement download, the builtin surfaces the provider error so you know the backend must be extended. When the input is already on the host, no provider work is required.

GPU memory and residency

RunMat's auto-offload planner keeps tensors on the GPU until a builtin marked as a sink (such as gather, plotting functions, or I/O) requests host access. You usually call gather at API boundaries, for example to log results or hand them to CPU-only libraries. If the upstream computation never leaves the GPU, you can omit gather and keep chaining gpu-aware builtins.
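
As an illustration of gathering only at the boundary, the sketch below chains a few element-wise operations and downloads a single scalar at the end. It assumes that power, addition, and sum are GPU-aware in the active provider, which the source does not state for these specific builtins.

```matlab
% Keep intermediate results on the device and gather once at the boundary.
A = gpuArray(linspace(0, 1, 8));
B = A .^ 2;               % assumed to stay GPU-resident
C = B + 1;                % still no host access requested
total = gather(sum(C));   % one download, of a single scalar
fprintf('total = %.4f\n', total);
```

Reducing on the device before gathering keeps the transfer small: only the scalar sum crosses to the host, not the full vector.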

Examples

Converting a gpuArray back to host memory

G = gpuArray([1 2 3; 4 5 6]);
H = gather(G)

Expected output:

H =
     1     2     3
     4     5     6

Gathering data that is already on the CPU

x = [10 20 30];
y = gather(x)

Expected output:

y =
    10    20    30

Preserving logical values when gathering

mask = gpuArray(logical([1 0 1 0]));
hostMask = gather(mask)

Expected output:

hostMask =
  1×4 logical array
   1   0   1   0

Gathering gpuArray values stored inside a cell array

C = {gpuArray([1 2]), 42};
hostC = gather(C)

Expected output:

hostC =
  1×2 cell array
    {[1 2]}    {[42]}

Gathering struct fields that live on the GPU

S.data = gpuArray(magic(3));
S.label = "gpu result";
S_host = gather(S)

Expected output:

S_host =
  struct with fields:
     data: [3×3 double]
    label: "gpu result"

Gathering multiple gpuArrays into one cell result

A = gpuArray(eye(3));
B = gpuArray(ones(3));
cellOut = gather(A, B)

Expected output:

cellOut =
  1×2 cell array
    {[3×3 double]}    {[3×3 double]}

Gathering results at the end of a GPU pipeline

A = gpuArray(rand(1024, 1));
B = sin(A) .* 5;
result = gather(B);
result(1:3)

Expected output (values vary from run to run because the input is random):

ans =
    4.1377
    2.4884
    0.1003

FAQ

Does gather modify the original gpuArray?

No. gather returns a host-side copy. The original gpuArray value remains valid and continues to reside on the GPU until it goes out of scope.

What happens if the input does not live on the GPU?

Nothing changes—the value is returned as-is. This makes gather safe to sprinkle into code paths that may or may not run on the GPU.
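
A short sketch of that pattern: a reporting helper that calls gather unconditionally and therefore works for both host and device inputs. The helper name reportMean is hypothetical, and the example assumes mean is supported for gpuArray inputs.

```matlab
% Hypothetical helper: print a mean regardless of where the data lives.
reportMean = @(v) fprintf('mean = %.2f\n', gather(mean(v)));

reportMean([1 2 3 4]);             % host input: gather is a pass-through
reportMean(gpuArray([1 2 3 4]));   % device input: gather downloads the scalar
```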

How are logical gpuArray values represented after gathering?

Logical handles are tagged during gpuArray creation. gather reads that metadata and produces a MATLAB logical array with the same shape, ensuring comparisons like isa(result, 'logical') behave as expected.

Does gather recurse into cells, structs, and objects?

Yes. Every nested gpuArray handle inside a cell array, struct field, or object property is downloaded and replaced with host data.

What happens when I pass multiple inputs but capture a single output?

RunMat follows MATLAB: it gathers each input and returns a 1×N cell array so you can unpack values later. In multi-output assignments you must request the same number of outputs as inputs.
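
The two calling forms can be compared side by side. This is a sketch of the behaviour as described in this answer, and the variable names are illustrative.

```matlab
A = gpuArray(eye(2));
B = gpuArray([7 8 9]);

% Multi-output form: one output per input, in the same order.
[hostA, hostB] = gather(A, B);

% Single-output form: per the description above, a 1x2 cell array.
packed = gather(A, B);
disp(class(packed));
disp(size(packed));
```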

What if no acceleration provider is registered?

RunMat raises gather: no acceleration provider registered when you attempt to gather a gpuArray without an active provider. Register a provider (for example, via runmat-accelerate) before calling gather.

Does gather free GPU memory automatically?

No. The gpuArray remains on the device. Free the handle explicitly (by clearing the variable) if you no longer need it.
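
A minimal sketch of releasing the device handle once the host copy is all you need. Whether the provider returns the underlying memory immediately after clear is an implementation detail the source does not specify.

```matlab
G = gpuArray(ones(4));
H = gather(G);   % host copy; G is still a gpuArray handle
clear G          % drop the device handle explicitly
disp(size(H));   % H remains usable on the host
```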

Open-source implementation

Unlike proprietary runtimes, every RunMat function is open-source. Read exactly how gather is executed, line by line, in Rust.

About RunMat

RunMat is an open-source runtime that executes MATLAB-syntax code blazing fast on any GPU. It is licensed under the Apache 2.0 license.
