RunMat

Category: Math: Signal
Auto GPU

RunMat automatically offloads this function to the GPU when it estimates a speedup, without requiring explicit gpuArray inputs.

conv2 — Two-dimensional convolution with MATLAB-compatible padding modes.

conv2 performs two-dimensional linear convolution. By default it returns the *full* convolution (size(A) + size(B) - 1), but it can also return the *same* or *valid* regions so results match MATLAB exactly. The builtin accepts real or complex inputs, logical arrays (promoted to double), and the separable form conv2(hcol, hrow, A) that is common in image processing pipelines.

Syntax

C = conv2(A, B)
C = conv2(u, v, A)
C = conv2(___, shape)
  • A is the 2-D input matrix (real, complex, or logical; logicals are promoted to double).
  • B is the 2-D convolution kernel. It is flipped along both axes before the sum-of-products, which is what makes the operation true convolution rather than correlation.
  • conv2(u, v, A) is the separable form: the column vector u is convolved with each column of A, then the row vector v is convolved with each row of the result. The output equals conv2(u(:) * v(:).', A) but is computed as two 1-D passes, which is much faster for kernels that factor into an outer product (box filters, Gaussians, Sobel components).
  • shape selects the output region: 'full' (default) returns an array of size size(A) + size(B) - 1; 'same' slices the central portion so the output matches size(A); 'valid' returns only the fully-overlapping region of size size(A) - size(B) + 1 (empty when B is larger than A along any dimension).
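
The size rules in the list above can be checked directly. This is a minimal sketch; the 5×5 input and 2×3 kernel are arbitrary values chosen for illustration.

```matlab
A = magic(5);      % 5x5 input
B = ones(2, 3);    % 2x3 kernel

full_sz  = size(conv2(A, B));           % size(A) + size(B) - 1
same_sz  = size(conv2(A, B, 'same'));   % size(A)
valid_sz = size(conv2(A, B, 'valid'));  % size(A) - size(B) + 1

disp(isequal(full_sz,  size(A) + size(B) - 1))   % [6 7]
disp(isequal(same_sz,  size(A)))                 % [5 5]
disp(isequal(valid_sz, size(A) - size(B) + 1))   % [4 3]
```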

How conv2 works

  • conv2(A, B) returns the full 2-D convolution of A and B.
  • conv2(A, B, 'same') slices the central part of the full convolution so the output matches the shape of A.
  • For even-sized kernels with 'same', alignment follows MATLAB's top-left convention in each even dimension.
  • conv2(A, B, 'valid') returns only those points where B overlaps A completely.
  • conv2(hcol, hrow, A) produces the same result as conv2(hcol(:) * hrow(:).', A). Note the non-conjugating transpose (.'), which matters for complex hrow.
  • Scalars are treated as 1×1 matrices and preserve the orientation of the other input.
  • Empty inputs follow MATLAB’s rules: conv2([], X) and conv2(X, []) return empty matrices (or zero-sized slices for 'same').
  • Logical inputs are promoted to double precision before computation; complex inputs preserve their imaginary part throughout the convolution.
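
The relationship between the three shapes can be sketched by slicing the full result by hand. The sketch assumes MATLAB's convention that the 'same' block starts floor(size(B)/2) entries into the full result in each dimension; the variable names (F, S, V, off) are illustrative only.

```matlab
A = magic(4);
B = [1 2; 3 4];              % even-sized kernel
F = conv2(A, B);             % full result, size(A) + size(B) - 1

% 'same': central block of F, offset by floor(size(B)/2) per dimension
% (assumed MATLAB convention).
off = floor(size(B) / 2);
S = F(off(1) + (1:size(A, 1)), off(2) + (1:size(A, 2)));

% 'valid': only the entries of F where B lies entirely inside A.
V = F(size(B, 1):size(A, 1), size(B, 2):size(A, 2));

disp(isequal(S, conv2(A, B, 'same')))    % compares against the builtin
disp(isequal(V, conv2(A, B, 'valid')))
```

With a 2×2 kernel the offset is 1 in each dimension, so the 'same' output drops the first row and first column of the full result rather than the last.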

How RunMat runs conv2 on the GPU

RunMat Accelerate keeps tensors on the GPU when the active provider implements a conv2d hook (the in-process provider uses the host implementation and returns a GPU handle; the WGPU backend will adopt a native kernel). When the hook is unavailable, RunMat gathers GPU inputs to the host, performs the convolution on the CPU, and returns a host tensor. Documentation and the GPU metadata make this fallback explicit so providers can add native implementations without changing this builtin.
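
Whichever path is taken, the gathered result should agree with the host computation. The following is a small sanity check under that assumption; it uses only gpuArray and gather as shown elsewhere on this page, and the tolerance is an arbitrary choice.

```matlab
A = magic(6);
H = [1 2 1; 0 0 0; -1 -2 -1];

cpu = conv2(A, H, 'same');                               % host reference
gpu = gather(conv2(gpuArray(A), gpuArray(H), 'same'));   % native hook or fallback

% Integer-valued inputs should agree exactly; a tolerance is kept for
% providers that accumulate in a different order or precision.
max_diff = max(abs(cpu(:) - gpu(:)));
disp(max_diff < 1e-9)
```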

Examples

Smoothing an image patch with a 3×3 averaging kernel

A = [1 2 3; 4 5 6; 7 8 9];
h = ones(3) / 9;
smoothed = conv2(A, h, 'same')

Expected output:

smoothed =
    1.3333    2.3333    1.7778
    3.0000    5.0000    3.6667
    2.6667    4.3333    3.1111

Computing the full convolution of two small kernels

K1 = [1 2; 3 4];
K2 = [1 1; 1 1];
C = conv2(K1, K2)

Expected output:

C =
     1     3     2
     4    10     6
     3     7     4

Extracting the same-sized result to preserve dimensions

edge = conv2([1 2 3; 4 5 6; 7 8 9], [1 0 -1; 1 0 -1; 1 0 -1], 'same')

Expected output:

edge =
     7     4    -7
    15     6   -15
    13     4   -13

Valid convolution for sliding-window statistics

block = magic(4);
kernel = ones(2);
valid = conv2(block, kernel, 'valid')

Expected output:

valid =
    34    26    34
    32    34    36
    34    42    34

Using the separable form with column and row vectors

hcol = [1; 2; 1];
hrow = [1 0 -1];
A = [3 4 5; 6 7 8; 9 10 11];
gx = conv2(hcol, hrow, A, 'same')

Expected output:

gx =
    15     6   -15
    28     8   -28
    27     6   -27

Convolving gpuArray inputs with transparent fallbacks

G = gpuArray(rand(128, 128));
H = gpuArray([1 2 1; 0 0 0; -1 -2 -1]);
gx = conv2(G, H, 'same');
result = gather(gx)

How RunMat validates conv2

conv2 uses an in-repo implementation for both the direct (conv2(A, B)) and separable (conv2(u, v, A)) forms. The module-level tests cover 'full', 'same', and 'valid' shapes against reference outputs. The GPU path currently defers to the CPU implementation via the in-process provider and returns a GPU handle; a native WGPU conv2d kernel is tracked as follow-up.

See Correctness & Trust for the full methodology and coverage table.

FAQ

Does conv2 support the three MATLAB shape modes?

Yes. Pass 'full', 'same', or 'valid' as the final argument and RunMat will mirror MATLAB’s output sizes and edge handling precisely.

How do I use the separable form?

Call conv2(hcol, hrow, A) (optionally with a shape argument). The result is the same as convolving A with the outer-product kernel hcol(:) * hrow(:).', so it behaves exactly like MATLAB.

What happens if one input is empty?

An empty input produces an empty output (or a zero-sized slice for 'same'). This follows MATLAB’s behaviour and avoids surprising dimension growth.

Do logical inputs work?

Yes. Logical arrays are promoted to double precision before convolution so the result is numeric.
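
As a short illustration, a logical mask convolved with a 3×3 box kernel yields double-valued neighbour counts. The mask values are made up for the example.

```matlab
mask = logical([1 0 1; 0 1 0; 1 0 1]);
counts = conv2(mask, ones(3), 'same');   % number of true cells in each 3x3 neighbourhood

disp(class(counts))    % double
disp(counts(2, 2))     % 5: the centre sees all five true cells
```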

Will the result stay on the GPU?

If the active provider exposes the conv2d hook the result stays device-resident. Otherwise RunMat falls back to the CPU path and returns a host tensor; this fallback is documented so providers can add native kernels without breaking compatibility.

What does conv2 actually compute?

Two-dimensional convolution. For every output pixel, conv2 flips the kernel B along both axes and sums the element-wise product of the flipped kernel with the corresponding neighbourhood of A. If you want correlation (no flip), use filter2 instead.
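
The definition can be written out as a plain loop. This is a reference sketch for illustration, not RunMat's implementation: each input sample scatters a scaled copy of B into the output, which is the same sum as gathering with a flipped kernel.

```matlab
A = [1 2 3; 4 5 6; 7 8 9];
B = [1 0 -1; 1 0 -1; 1 0 -1];

[ma, na] = size(A);
[mb, nb] = size(B);
C = zeros(ma + mb - 1, na + nb - 1);   % 'full' output size

% Definition: C(i,j) = sum over (p,q) of A(p,q) * B(i-p+1, j-q+1).
for p = 1:ma
    for q = 1:na
        % A(p,q) contributes A(p,q) * B to the block whose corner is (p,q).
        C(p:p+mb-1, q:q+nb-1) = C(p:p+mb-1, q:q+nb-1) + A(p, q) * B;
    end
end

disp(isequal(C, conv2(A, B)))   % the loop should reproduce the builtin
```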

When is the separable form conv2(u, v, A) faster than conv2(A, B)?

Whenever the kernel is rank-1, i.e. B = u * v for a column vector u and a row vector v. The separable form runs a 1-D column pass followed by a 1-D row pass. For an input with n elements and an m-by-k kernel, that costs roughly O(n*(m+k)) operations instead of O(n*m*k) for the direct 2-D form, a substantial saving for Gaussians, box filters, and Sobel components.
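
A small check of the equivalence, together with the per-pixel operation counts, is sketched below. The Sobel-style factors are example values; the counts are simple multiply-add tallies, not measured timings.

```matlab
u = [1; 2; 1];     % column factor
v = [1 0 -1];      % row factor
A = magic(6);

K  = u * v;                       % 3x3 rank-1 kernel (outer product)
C1 = conv2(A, K, 'same');         % direct 2-D form
C2 = conv2(u, v, A, 'same');      % separable form

disp(max(abs(C1(:) - C2(:))))     % difference between the two forms

% Multiply-adds per output element for an m-by-k kernel.
m = numel(u);
k = numel(v);
direct_ops    = m * k;            % 9 for a 3x3 kernel
separable_ops = m + k;            % 6 for the two 1-D passes
```

The gap widens with kernel size: a 15×15 kernel needs 225 multiply-adds per element in the direct form and 30 in the separable form.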

Should I use conv2, filter2, or imfilter?

Use conv2 for true convolution (the kernel is flipped); use filter2 for correlation with the same kernel (no flip); use imfilter when you need the Image Processing Toolbox's extended boundary handling ('replicate', 'symmetric', 'circular'). With zero padding and the same output shape, all three produce the same result when the kernel is unchanged by a 180° rotation.

Signal

blackman · conv · deconv · filter · hamming · hann · sawtooth · sinc · square

Elementwise

abs · angle · complex · conj · double · exp · expm1 · factorial · gamma · hypot · imag · ldivide · log · log10 · log1p · log2 · minus · nextpow2 · plus · pow2 · power · rdivide · real · sign · single · sqrt · times

Trigonometry

acos · acosh · asin · asinh · atan · atan2 · atanh · cos · cosd · cosh · deg2rad · rad2deg · sin · sind · sinh · tan · tand · tanh

Reduction

all · any · cummax · cummin · cumprod · cumsum · cumtrapz · diff · gradient · max · mean · median · min · nnz · prod · std · sum · trapz · var

Rounding

ceil · fix · floor · mod · rem · round

Factor

chol · eig · lu · qr · svd

Solve

cond · det · inv · linsolve · norm · pinv · rank · rcond

Fft

fft · fft2 · fftshift · ifft · ifft2 · ifftshift

Interpolation

interp1 · interp2 · pchip · ppval · spline

Ode

ode15s · ode23 · ode45

Open-source implementation

Unlike proprietary runtimes, every RunMat function is open-source. Read exactly how conv2 works, line by line, in Rust.

About RunMat

RunMat is an open-source runtime that executes MATLAB-syntax code — faster, on any GPU, with no license required.

  • Simulations that took hours now take minutes. RunMat automatically optimizes your math for GPU execution on Apple, Nvidia, and AMD hardware. No code changes needed.
  • Start running code in seconds. Open the browser sandbox or download a single binary. No license server, no IT ticket, no setup.
  • A full development environment. GPU-accelerated 2D and 3D plotting, automatic versioning on every save, and a browser IDE you can share with a link.
