regexpi — Perform case-insensitive regular expression matching with MATLAB-compatible outputs.

regexpi(text, pattern) finds regular expression matches in text, ignoring case by default. Outputs mirror MATLAB: you can retrieve 1-based match indices, matched substrings, capture tokens, token extents, named tokens, or the text split around matches. Supported flags include 'once', 'tokens', 'match', 'split', 'tokenExtents', 'names', 'emptymatch', and 'forceCellOutput', along with the case toggles 'ignorecase' and 'matchcase' and the newline options 'dotall', 'dotExceptNewline', and 'lineanchors'.

How regexpi works in RunMat

  • Case-insensitive matching is the default; include 'matchcase' when you need case-sensitive behaviour.
  • With one output, regexpi returns a numeric row vector of 1-based match start indices.
  • With multiple outputs, the default order is match starts, match ends, matched substrings.
  • When the input is a string array or cell array of character vectors, outputs are cell arrays whose shape matches the input container.
  • 'forceCellOutput' forces cell outputs even for scalar inputs, matching MATLAB semantics.
  • 'once' limits each element to its first match, influencing every requested output.
  • 'emptymatch','allow' keeps zero-length matches; 'emptymatch','remove' is the default filter.
  • Named tokens (written as (?<name>...)) are returned as one scalar struct per match when 'names' is requested. Names that did not take part in a match resolve to empty strings for MATLAB compatibility.
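
The default outputs described above can be sketched as follows. The sample text is made up for illustration; requesting two outputs without any flags returns the match starts and the match ends.

```matlab
% Illustrative input; any character vector works.
txt = 'Foo foo FOO';

% Two outputs, no flags: 1-based start and end indices of every match.
[startIdx, endIdx] = regexpi(txt, 'foo');

% Recover each matched substring from the index pairs.
for k = 1:numel(startIdx)
    disp(txt(startIdx(k):endIdx(k)));
end
```

Because the end indices are inclusive, txt(startIdx(k):endIdx(k)) yields the matched text with its original letter case.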

How regexpi runs on the GPU

regexpi executes entirely on the CPU. If inputs or previously computed intermediates are resident on the GPU, RunMat gathers the necessary data before evaluation and returns host-side outputs. Acceleration providers do not offer specialised hooks today; results remain on the host unless explicit GPU transfers are requested later.

Examples

Finding indices regardless of case

idx = regexpi('Abracadabra', 'a')

Expected output:

idx =
     1     4     6     8    11

Returning matched substrings ignoring case

matches = regexpi('abcXYZ123', '[a-z]{3}', 'match')

Expected output:

matches =
  1×2 cell array
    {'abc'}    {'XYZ'}

Extracting capture tokens case-insensitively

tokens = regexpi('ID:AB12', '(?<prefix>[a-z]+)(?<digits>\d+)', 'tokens');
first = tokens{1}{1}
second = tokens{1}{2}

Expected output:

first =
    'AB'
second =
    '12'

Limiting regexpi to the first match

firstMatch = regexpi('aXaXaX', 'ax', 'match', 'once')

Expected output:

firstMatch =
    'aX'

Splitting a string array without worrying about letter case

parts = regexpi(["Color:Red"; "COLOR:Blue"], 'color:', 'split')
parts{2}{2}

Expected output:

parts =
  2×1 cell array
    {1×2 cell}
    {1×2 cell}

ans =
    'Blue'

Enforcing case-sensitive matches with 'matchcase'

idx = regexpi('CaseTest', 'case', 'matchcase')

Expected output:

idx =
     []

FAQ

How are the outputs ordered when I request several?

If you do not specify explicit output flags, the default order is match starts, match ends, and matched substrings—identical to MATLAB. Providing flags such as 'match' or 'tokens' returns only the requested outputs.

Can I make regexpi behave like regexp with case sensitivity?

Yes. Include the 'matchcase' flag to disable the default case-insensitive mode. You can also pass 'ignorecase' explicitly to emphasise the default.
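
A small side-by-side comparison may help here. This is a sketch with a made-up input string, and the variable names are illustrative only.

```matlab
% Illustrative text containing 'Case' and 'case'.
txt = 'CaseTest case';

% Default behaviour: both spellings match.
idxDefault = regexpi(txt, 'case');

% 'matchcase' makes the search case-sensitive, so only the lowercase word matches.
idxSensitive = regexpi(txt, 'case', 'matchcase');

% 'ignorecase' states the default explicitly and gives the same result as idxDefault.
idxExplicit = regexpi(txt, 'case', 'ignorecase');
```

Here idxDefault and idxExplicit both contain the starts 1 and 10, while idxSensitive contains only 10.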

Does regexpi support string arrays and cell arrays?

Yes. Outputs mirror the input container shape, and each element stores results for the corresponding string or character vector.
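
As a rough illustration with a cell array of character vectors (the contents are invented for the example), the result has the same 2×2 shape as the input and each cell holds the match starts for the corresponding element:

```matlab
% Hypothetical log-style inputs arranged in a 2x2 cell array.
inputs = {'Error: disk', 'ok'; 'WARNING', 'error'};

% One output, no flags: each cell holds the 1-based match starts for that element.
idx = regexpi(inputs, 'error');

% Count matches per element; elements without a match hold an empty array.
counts = cellfun(@numel, idx);
```

Counting with cellfun is one convenient way to turn the per-element results into a numeric array of the same shape.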

How do zero-length matches behave?

By default ('emptymatch','remove'), zero-length matches are omitted. Use 'emptymatch','allow' to keep them, which is helpful when inspecting optional pattern components.

Does regexpi run on the GPU?

No. All matching occurs on the CPU. RunMat gathers GPU-resident inputs before processing and leaves outputs on the host. Explicit gpuArray calls are required if you want to move the results back to the GPU.

Are named tokens supported?

Yes. Use the (?<name>...) syntax and request the 'names' output flag. Each match produces a scalar struct with fields for every named group.
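
A minimal sketch of the 'names' output, assuming a made-up input with a single match (the group names key and value are illustrative):

```matlab
% Illustrative input with one key=value pair.
txt = 'Temp=42C';

% Each named group becomes a field of the returned struct.
info = regexpi(txt, '(?<key>[a-z]+)=(?<value>\d+)', 'names');

% Field values are the captured text, so numbers need an explicit conversion.
reading = str2double(info.value);
```

The captured text keeps its original letter case, so info.key is 'Temp' even though the pattern is written in lowercase.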

What happens with 'once'?

'once' restricts each input element to the first match. All requested outputs honour that limit, returning scalars instead of per-match cells.
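
The difference in output type can be seen by running the same 'match' request with and without 'once'. This is a sketch using an illustrative input:

```matlab
txt = 'aXaXaX';

% Without 'once': a cell array with one entry per match.
allMatches = regexpi(txt, 'ax', 'match');

% With 'once': the first match only, returned as a character vector.
firstMatch = regexpi(txt, 'ax', 'match', 'once');
```

Code that expects a cell array should therefore not add 'once' without also adjusting how the result is indexed.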

Can I keep scalar outputs in cells?

Yes. Pass 'forceCellOutput' to wrap even scalar results in cells, which is useful when writing code that must treat scalar and array inputs uniformly.
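
For example, the following sketch (with made-up inputs) contrasts the plain and wrapped results for a single character vector, and shows that the wrapped form lines up with the cell-array case:

```matlab
% Single character vector: the default output is a numeric vector.
plainIdx = regexpi('abc', 'B');

% Same call with 'forceCellOutput': the result is wrapped in a 1x1 cell.
wrappedIdx = regexpi('abc', 'B', 'forceCellOutput');

% Cell-array input: outputs are already cells, one per element.
cellIdx = regexpi({'abc', 'xBz'}, 'B');
```

With the flag set, downstream code can always index results as out{k}, whether the caller passed one character vector or a whole cell array.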

These functions work well alongside regexpi. Each page has runnable examples you can try in the browser.

regexp, regexprep, contains, split, strfind

Open-source implementation

Unlike proprietary runtimes, every RunMat function is open-source. Read exactly how regexpi works, line by line, in Rust.
