
trace — Sum main-diagonal elements of matrix and matrix-like inputs with MATLAB-compatible numeric behavior.

trace(A) returns the sum of values on the main diagonal of A for square or rectangular matrix-like inputs. It follows MATLAB-compatible typing and complex-value behavior across supported containers.

Syntax

t = trace(A)

Inputs

Name | Type | Required | Default | Description
A    | Any  | Yes      |         | Input matrix-like value.

Returns

Name | Type         | Description
t    | NumericArray | Diagonal-sum trace result.

Errors

Identifier | When | Message
RunMat:trace:InvalidInput | Input is unsupported or not matrix-shaped. | trace: input must be 2-D
RunMat:trace:Internal | Runtime cannot materialize or transport trace results. | trace: internal runtime failure
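
The InvalidInput case can be observed by passing an array whose third dimension is not singleton. The snippet below is a minimal sketch that assumes RunMat surfaces the identifier and message from the table through the standard err.identifier and err.message fields of a caught error; the variable names are illustrative only.

```matlab
B = zeros(2, 2, 3);          % third dimension is 3, so B is not matrix-shaped
caught = false;
try
    t = trace(B);            % expected to raise an error
catch err
    caught = true;
    % Assumption: per the Errors table, RunMat reports
    % RunMat:trace:InvalidInput with "trace: input must be 2-D".
    disp(err.identifier);
    disp(err.message);
end
```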

How trace works

  • Operates on the leading two dimensions. Higher dimensions must be singleton; otherwise an error is raised.
  • Works for non-square matrices by summing up to min(size(A, 1), size(A, 2)).
  • Scalars (real or complex) return their own value.
  • Logical inputs are promoted to double precision (true → 1.0, false → 0.0).
  • Complex inputs retain both real and imaginary parts in the result.
  • Empty matrices yield 0. Empty complex matrices yield 0 + 0i.
  • gpuArray inputs stay on the device when the provider implements diagonal extraction and sum reductions; otherwise RunMat gathers once, computes on the host, and uploads a 1×1 scalar.
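
The host-side rules above can be sketched as a small reference function. This is an illustration of the documented behavior, not RunMat's actual implementation (which is written in Rust); trace_sketch and its error identifier are hypothetical names. It returns a plain 0 for every empty input and does not reproduce the 0 + 0i result for empty complex matrices.

```matlab
% Hypothetical reference helper; save as trace_sketch.m.
function t = trace_sketch(A)
    % ndims ignores trailing singleton dimensions, so a value above 2
    % means some higher dimension is not singleton.
    if ndims(A) > 2
        error('trace_sketch:InvalidInput', 'trace: input must be 2-D');
    end
    n = min(size(A, 1), size(A, 2));   % diagonal length for rectangular inputs
    t = 0;                             % empty inputs fall through and return 0
    for k = 1:n
        % double() promotes logical values; complex values keep both parts.
        t = t + double(A(k, k));
    end
end
```

For example, trace_sketch([4 2; 1 3; 5 6]) returns 7 (4 + 3), trace_sketch(logical(eye(3))) returns the double value 3, and trace_sketch(zeros(0, 5)) returns 0.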

Does RunMat run trace on the GPU?

1. When the input already lives on the GPU and the active provider exposes both diag_extract and reduce_sum, RunMat extracts the diagonal on device and performs the reduction there, returning a 1×1 gpuArray that stays resident for downstream work.
2. If either hook is missing or the provider declines (unsupported precision, shape, or size), RunMat gathers the matrix exactly once, computes the diagonal sum on the CPU, and uploads the scalar back to the provider so subsequent GPU-friendly code keeps running on device memory.
3. Mixed-residency calls automatically upload host matrices before these steps, matching MATLAB's gpuArray behaviour while letting the auto-offload planner decide which tier benefits the most.
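
The choice between the first two paths can be summarized as a simple decision. The function below is only a sketch of that logic as described here: trace_tier_sketch and its three flags are hypothetical and do not correspond to any RunMat API.

```matlab
% Hypothetical decision sketch; save as trace_tier_sketch.m.
function tier = trace_tier_sketch(hasDiagExtract, hasReduceSum, providerDeclines)
    if hasDiagExtract && hasReduceSum && ~providerDeclines
        % Diagonal extraction and reduction both run on the device.
        tier = 'device';
    else
        % Gather once, sum on the CPU, upload the 1x1 scalar.
        tier = 'host-fallback';
    end
end
```

Under these assumptions, trace_tier_sketch(true, true, false) returns 'device', while a missing hook or a declining provider, such as trace_tier_sketch(true, false, false), returns 'host-fallback'.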

GPU memory and residency

You usually do NOT need to call gpuArray yourself in RunMat (unlike MATLAB).

The auto-offload planner keeps residency on the GPU when expressions benefit from it. When the active provider exposes both diag_extract and reduce_sum, trace executes entirely on the GPU. If either hook is missing, RunMat performs a single gather, computes the scalar on the CPU, and uploads a 1×1 result back to the device so downstream fused expressions continue to operate on GPU data.

To preserve backwards compatibility with MathWorks MATLAB—and for situations where you want to explicitly manage residency—you can wrap inputs with gpuArray. This mirrors MATLAB while still letting RunMat's planner decide whether the GPU offers an advantage for the surrounding code.
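
The two styles can be compared side by side. This is a minimal sketch that assumes a GPU provider is available; it uses only gpuArray, gather, and trace as described on this page. Whether the first call is offloaded is up to the planner.

```matlab
A = rand(512);

% Implicit: pass the host matrix and let the planner decide residency.
tAuto = trace(A);

% Explicit: manage residency yourself, as in MATLAB.
G = gpuArray(A);          % upload once
tGpu = trace(G);          % result is a 1x1 gpuArray
tHost = gather(tGpu);     % bring the scalar back to the host

% The two results should agree up to floating-point rounding.
disp(abs(tAuto - tHost));
```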

Examples

Summing the diagonal of a square matrix

A = [1 2 3; 4 5 6; 7 8 9];
t = trace(A)

Expected output:

t = 15

Computing the trace of a rectangular matrix

B = [4 2; 1 3; 5 6];
result = trace(B)

Expected output:

result = 7

Getting the trace of a triangular matrix

U = [4 1 2; 0 5 3; 0 0 6];
tri_trace = trace(U)

Expected output:

tri_trace = 15

Working with complex-valued matrices

Z = [1+2i 2; 3 4-5i];
zTrace = trace(Z)

Expected output:

zTrace = 5.0000 - 3.0000i

Tracing a gpuArray without gathering

G = gpuArray(rand(1024));
gpuResult = trace(G);     % stays on the GPU
scalarHost = gather(gpuResult)

Handling empty matrices safely

E = zeros(0, 5);
value = trace(E)

Expected output:

value = 0

Using trace with coding agents

Open a RunMat example with live inputs, then ask the agent to explain how trace changes the result.

Example prompt: "Run a small trace example, explain the result, then change one input and compare the output."

FAQ

What happens if my matrix is not square?

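trace sums along the main diagonal up to min(m, n), that is, over A(k, k) for k = 1, ..., min(m, n). Note that MathWorks MATLAB's own trace rejects non-square inputs, so code that relies on rectangular inputs is specific to RunMat.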

Does trace accept higher-dimensional arrays?

Only when trailing dimensions are singleton. Otherwise it raises RunMat:trace:InvalidInput, because trace is defined only for 2-D matrices.

How are logical inputs handled?

Logical values are promoted to double precision (0.0 or 1.0) before summing, mirroring MATLAB semantics.
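
A short check of the promotion, using only built-in functions:

```matlab
L = logical([1 0 1; 0 1 0; 1 1 1]);
t = trace(L);        % diagonal is true, true, true
disp(t);             % 3
disp(class(t));      % double
```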

What is returned for empty inputs?

Empty real matrices produce 0; empty complex matrices produce 0 + 0i, exactly like MATLAB.

Does the result stay on the GPU?

Yes. When the provider implements diag_extract and reduce_sum, the result is computed and kept on the device. Otherwise RunMat computes the sum on the host and re-uploads the scalar, so later GPU-friendly code still sees a gpuArray.

Can I call trace on complex data?

Absolutely. The result is a complex scalar containing the sum of the diagonal's real and imaginary parts.

Is there any precision loss with large matrices?

trace accumulates in double precision (f64), matching MATLAB's default numeric type, so the result is subject to the usual rounding of a double-precision floating-point sum.
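
The effect of double-precision accumulation can be seen with a hand-written running sum. This sketch illustrates floating-point rounding in general; it makes no claim about the order in which trace adds the diagonal elements.

```matlab
s = 0;
s = s + 1e16;   % 1e16 is exactly representable as a double
s = s + 1;      % doubles near 1e16 are 2 apart, so the 1 is rounded away
s = s - 1e16;
disp(s);        % 0, although the exact sum is 1

% The same values on a diagonal: the result may differ from the exact
% sum of 1 for the same reason, depending on summation order.
A = diag([1e16 1 -1e16]);
t = trace(A);
```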

Does trace modify the input matrix?

No. It reads the diagonal and returns a new scalar without altering the original matrix or its residency.

How does trace interact with sparse matrices?

Sparse support is planned. Current releases operate on dense arrays only, so inputs are treated as dense matrices.

Can I rely on trace inside fused GPU expressions?

Fused kernels treat trace as a scalar reduction boundary. The planner emits GPU kernels when the diag_extract and reduce_sum hooks are available; otherwise it falls back to gathering once, computing on the host, and uploading the 1×1 result.

Factor

chol · eig · lu · qr · svd

Solve

cond · det · inv · linsolve · norm · pinv · rank · rcond · rref

Open-source implementation

Unlike proprietary runtimes, every RunMat function is open-source. Read exactly how trace is executed, line by line, in Rust.

About RunMat

RunMat is an open-source runtime that executes MATLAB-syntax code blazing fast on any GPU. It is licensed under the Apache 2.0 license.

  • RunMat automatically optimizes your math for GPU execution on Apple, Nvidia, and AMD hardware. No code changes needed. Simulations that took hours now take minutes.
  • Start running code in seconds. RunMat runs in the browser, on the desktop, or from the CLI. No license server, no IT ticket.
