Built-ins Reference
fimod injects the following functions and globals into every mold; no import is needed.
Regex functions (re_*)
Powered by fancy-regex, which supports PCRE2-style features: lookahead, lookbehind, backreferences, atomic groups.
Two functions handle replacements: re_sub uses Python syntax (\1, \g<name>); re_sub_fancy uses fancy-regex syntax ($1, ${name}).
Pattern syntax differences from Python re
Only the replacement syntax differs by mode. The pattern syntax always uses fancy-regex:
- Named groups: (?P<name>...) (same as Python) or (?<name>...)
- Advanced features: atomic groups (?>...), possessive quantifiers a++ (not in Python re before 3.11)
- Flags: inline (?i), (?m), (?s); there is no separate re.IGNORECASE etc.
Match result format
re_search and re_match return a dict (or None):
{
  "match": "full match text",
  "start": 0,  # byte offset
  "end": 5,  # byte offset
  "groups": ["group1", "group2", ...],  # numbered capture groups (1..N)
  "named": {"name": "value", ...}  # named groups, or None if no named groups
}
groups is always present (empty list if no capture groups). named is None unless the pattern uses named groups, written as (?P<name>...) or (?<name>...).
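To make the result shape concrete, here is a minimal sketch that builds the same dict with Python's standard re module. The helper name sketch_re_search is hypothetical, the code is not fimod's implementation, and it assumes byte offsets are counted over the UTF-8 encoding of the text.

```python
import re


def sketch_re_search(pattern, text):
    """Hypothetical stand-in that mimics the documented match dict."""
    m = re.search(pattern, text)
    if m is None:
        return None
    # Python reports character offsets; convert them to UTF-8 byte offsets.
    start = len(text[:m.start()].encode("utf-8"))
    end = len(text[:m.end()].encode("utf-8"))
    return {
        "match": m.group(0),
        "start": start,
        "end": end,
        "groups": list(m.groups()),       # empty list when there are no groups
        "named": m.groupdict() or None,   # None when there are no named groups
    }


result = sketch_re_search(r"(?P<user>\w+)@(\w+)", "é user@host")
no_match = sketch_re_search(r"\d+", "abc")
print(result)
```

Because "é" occupies two bytes in UTF-8, the byte offsets in result are one higher than the character offsets Python would report.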
Function reference
| Function | Signature | Returns |
|---|---|---|
| re_search | re_search(pattern, text) | Match dict (see above) or None |
| re_match | re_match(pattern, text) | Match dict or None; anchored to start of text |
| re_findall | re_findall(pattern, text) | Python-style: see below |
| re_sub | re_sub(pattern, replacement, text [, count]) | str; Python syntax: \1, \g<name> |
| re_sub_fancy | re_sub_fancy(pattern, replacement, text [, count]) | str; fancy-regex syntax: $1, ${name} |
| re_split | re_split(pattern, text) | [str, ...]; captured groups included |
re_findall: Python-style group behaviour
| Pattern groups | Returns | Example |
|---|---|---|
| No groups | ["match1", "match2", ...] | re_findall(r"\d+", "a1b2") → ["1", "2"] |
| 1 group | ["group1_val", ...] | re_findall(r"(\d+)@", "1@2@") → ["1", "2"] |
| N groups | [["g1", "g2"], ...] | re_findall(r"(\w+)=(\d+)", "a=1 b=2") → [["a","1"], ["b","2"]] |
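The three return shapes can be reproduced with Python's re.findall, which yields tuples rather than lists when a pattern has several groups. The helper below is a hypothetical sketch (sketch_re_findall is not a fimod function) that converts those tuples to lists to match the table.

```python
import re


def sketch_re_findall(pattern, text):
    """Hypothetical stand-in: re.findall with tuples turned into lists."""
    found = re.findall(pattern, text)
    return [list(item) if isinstance(item, tuple) else item for item in found]


no_groups = sketch_re_findall(r"\d+", "a1b2")
one_group = sketch_re_findall(r"(\d+)@", "1@2@")
two_groups = sketch_re_findall(r"(\w+)=(\d+)", "a=1 b=2")
print(no_groups, one_group, two_groups)
```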
re_sub / re_sub_fancy: count and syntax
Both functions replace all occurrences by default. Pass an optional count to limit substitutions.
re_sub(r"\d+", "X", "a1b2c3") # → "aXbXcX" (all)
re_sub(r"\d+", "X", "a1b2c3", 1) # → "aXb2c3" (first only)
re_sub uses Python re syntax in replacements:
re_sub(r"(\w+)@(\w+)", r"\2/\1", "user@host") # → "host/user"
re_sub(r"(?P<u>\w+)@(?P<d>\w+)", r"\g<d>/\g<u>", "a@b") # → "b/a"
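Since re_sub follows Python's replacement syntax, the same replacement strings can be tried outside fimod with the standard re module. This is only a comparison sketch in plain Python; it does not call any fimod function.

```python
import re

# Numbered backreferences swap the two captured parts.
swapped = re.sub(r"(\w+)@(\w+)", r"\2/\1", "user@host")

# Named groups are referenced with \g<name>.
named = re.sub(r"(?P<u>\w+)@(?P<d>\w+)", r"\g<d>/\g<u>", "a@b")

# count=1 limits the substitution to the first match.
first_only = re.sub(r"\d+", "X", "a1b2c3", count=1)

print(swapped, named, first_only)
```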
re_sub_fancy uses fancy-regex syntax ($1, ${name}):
re_sub_fancy(r"(\w+)@(\w+)", "$2/$1", "user@host") # → "host/user"
re_sub_fancy(r"(\w+)@(\w+)", "$2/$1", "user@host", 1) # first only
re_split: captured groups included
When the pattern has capture groups, captured text is included in the result (same as Python re.split):
re_split(r"([,;])\s*", "a, b;c") # → ["a", ",", "b", ";", "c"]
re_split(r"[,;]\s*", "a, b;c") # → ["a", "b", "c"] (no groups = no extras)
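The behaviour mirrors Python's re.split, so it can be checked in plain Python. The sketch below also shows a non-capturing group (?:...), which groups without adding the separator to the result; whether re_split treats it identically is an assumption based on the stated Python compatibility.

```python
import re

# Capturing group: the separators appear in the result.
with_groups = re.split(r"([,;])\s*", "a, b;c")

# No group: only the pieces between separators are returned.
without_groups = re.split(r"[,;]\s*", "a, b;c")

# Non-capturing group: grouping without keeping the separators.
non_capturing = re.split(r"(?:[,;])\s*", "a, b;c")

print(with_groups, without_groups, non_capturing)
```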
Dotpath functions (dp_*)
Navigate and mutate nested structures using dot-separated paths.
| Function | Signature | Returns |
|---|---|---|
| dp_get | dp_get(data, path) | Value at path, or None if not found |
| dp_get | dp_get(data, path, default) | Value at path, or default if not found |
| dp_set | dp_set(data, path, value) | New deep copy of data with value at path |
| dp_has | dp_has(data, path) | True if the path resolves, False otherwise |
| dp_delete | dp_delete(data, path) | New deep copy with the key/index at path removed |
Path syntax
| Segment | Meaning | Example |
|---|---|---|
| Text | dict key | "user.address.city" |
| Integer | array index | "items.0" (first), "items.-1" (last) |
Tip
Missing intermediate keys or out-of-range indices return None for dp_get. dp_set creates missing intermediate keys automatically. dp_delete is a silent no-op when the path is missing; it shifts array elements (no null holes) and rejects empty paths.
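The semantics described above can be sketched in plain Python. The sketch_dp_* names are hypothetical and this is not fimod's implementation; it assumes missing intermediates are created as dicts and that integer segments are only interpreted as indices when the current node is a list.

```python
import copy

_MISSING = (KeyError, IndexError, ValueError, TypeError)


def _child(node, segment):
    """Return the child named by one path segment, or raise if it is missing."""
    if isinstance(node, list):
        return node[int(segment)]  # "0" is the first element, "-1" the last
    if isinstance(node, dict):
        return node[segment]
    raise KeyError(segment)


def sketch_dp_get(data, path, default=None):
    node = data
    for segment in path.split("."):
        try:
            node = _child(node, segment)
        except _MISSING:
            return default
    return node


def sketch_dp_set(data, path, value):
    result = copy.deepcopy(data)  # the input is never mutated
    *parents, last = path.split(".")
    node = result
    for segment in parents:
        if isinstance(node, list):
            node = node[int(segment)]
        else:
            node = node.setdefault(segment, {})  # create missing intermediate keys
    if isinstance(node, list):
        node[int(last)] = value
    else:
        node[last] = value
    return result


def sketch_dp_delete(data, path):
    if not path:
        raise ValueError("path must not be empty")
    result = copy.deepcopy(data)
    *parents, last = path.split(".")
    node = result
    try:
        for segment in parents:
            node = _child(node, segment)
        if isinstance(node, list):
            del node[int(last)]  # later elements shift down, leaving no hole
        elif isinstance(node, dict):
            del node[last]
    except _MISSING:
        pass  # missing path: silent no-op
    return result


doc = {"user": {"address": {"city": "Paris"}}, "items": [1, 2, 3]}
city = sketch_dp_get(doc, "user.address.city")
last_item = sketch_dp_get(doc, "items.-1")
updated = sketch_dp_set(doc, "user.contact.email", "a@b")
trimmed = sketch_dp_delete(doc, "items.0")
print(city, last_item, updated, trimmed)
```

Returning a deep copy from the set and delete helpers keeps the original document unchanged, which matches the "New deep copy" wording in the table.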
Iteration helpers (it_*)
Convenience functions for common list/dict operations not natively supported by Monty.
| Function | Signature | Returns |
|---|---|---|
| it_keys | it_keys(dict) | List of keys |
| it_values | it_values(dict) | List of values |
| it_flatten | it_flatten(array) | Recursively flattened list |
| it_group_by | it_group_by(array, key) | Dict of lists, grouped by field name (insertion order) |
| it_sort_by | it_sort_by(array, key [, reverse]) | Sorted list by field name (stable sort); pass True for descending |
| it_unique | it_unique(array) | Deduplicated list (first occurrence kept) |
| it_unique_by | it_unique_by(array, key) | Deduplicated by field name (first occurrence kept) |
| it_count_by | it_count_by(array, key) | Dict of counts, grouped by field name (insertion order) |
| it_min_by | it_min_by(array, key) | Element with smallest field value, or None if empty |
| it_max_by | it_max_by(array, key) | Element with largest field value, or None if empty |
Field name, not lambda
it_group_by, it_sort_by, it_unique_by, it_count_by, it_min_by, and it_max_by take a field name string, not a lambda.
it_flatten is recursive
[1, [2, [3, 4]]] → [1, 2, 3, 4]
Ties in min_by / max_by
When multiple elements share the extremum, the first one is returned (stable).
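A few of these helpers are easy to picture as plain-Python equivalents. The sketch_* functions below are hypothetical illustrations, not fimod's code, and they assume every element carries the requested field with a hashable value.

```python
def sketch_flatten(items):
    """Recursively flatten nested lists."""
    flat = []
    for item in items:
        if isinstance(item, list):
            flat.extend(sketch_flatten(item))
        else:
            flat.append(item)
    return flat


def sketch_group_by(items, key):
    """Group elements by a field name, keeping first-seen key order."""
    groups = {}
    for item in items:
        groups.setdefault(item[key], []).append(item)
    return groups


def sketch_unique_by(items, key):
    """Keep the first element seen for each field value."""
    seen = set()
    unique = []
    for item in items:
        if item[key] not in seen:
            seen.add(item[key])
            unique.append(item)
    return unique


def sketch_min_by(items, key):
    """Smallest element by field; min() returns the first one on ties."""
    return min(items, key=lambda item: item[key], default=None)


users = [
    {"name": "a", "team": "x", "age": 30},
    {"name": "b", "team": "y", "age": 25},
    {"name": "c", "team": "x", "age": 25},
]
by_team = sketch_group_by(users, "team")
one_per_team = sketch_unique_by(users, "team")
youngest = sketch_min_by(users, "age")
flat = sketch_flatten([1, [2, [3, 4]]])
print(by_team, one_per_team, youngest, flat)
```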
Hash functions (hs_*)
| Function | Signature | Returns |
|---|---|---|
| hs_md5 | hs_md5(text) | MD5 hex digest (lowercase) |
| hs_sha1 | hs_sha1(text) | SHA-1 hex digest (lowercase) |
| hs_sha256 | hs_sha256(text) | SHA-256 hex digest (lowercase) |
All functions accept a single string and return a lowercase hex string.
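For cross-checking a digest outside fimod, Python's hashlib produces the same kind of lowercase hex output. The sketch_* helpers are hypothetical, and hashing the UTF-8 encoding of the string is an assumption about how fimod converts text to bytes.

```python
import hashlib


def sketch_md5(text):
    # Assumption: the string is hashed as UTF-8 bytes.
    return hashlib.md5(text.encode("utf-8")).hexdigest()


def sketch_sha256(text):
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


empty_md5 = sketch_md5("")
abc_sha256 = sketch_sha256("abc")
print(empty_md5, abc_sha256)
```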
Template functions (tpl_*)
Data→text generation using Jinja2 templates (via MiniJinja). Extends Fimod's data→data pipeline to data→text for generating configs, reports, Dockerfiles, k8s manifests, etc.
| Function | Signature | Returns |
|---|---|---|
| tpl_render_str | tpl_render_str(template, ctx, auto_escape=False) | Rendered string |
| tpl_render_from_mold | tpl_render_from_mold(path, ctx, auto_escape=False) | Rendered string |
tpl_render_str(template, ctx, auto_escape=False): Render a Jinja2 template string with a context dict. All built-in Jinja2 filters (upper, join, tojson, ...), loops, conditions, and macros are available.
def transform(data, args, env, headers):
    return tpl_render_str("""
FROM python:{{ python_version }}-slim
{% for pkg in packages %}
RUN uv pip install {{ pkg }}
{% endfor %}
""", data)
echo '{"name":"World"}' \
| fimod s -e 'tpl_render_str("Hello {{ name }}!", data)' --output-format txt
tpl_render_from_mold(path, ctx, auto_escape=False): Load a .j2 file relative to the mold's directory and render it. Works with directory molds and registry molds. Enables clean separation of logic (Python) and presentation (Jinja2).
# my_mold/my_mold.py
def transform(data, args, env, headers):
    tpl = args.get("template", "Dockerfile.j2")
    return tpl_render_from_mold(f"templates/{tpl}", data)
Note
tpl_render_from_mold requires a file-based or registry mold; it cannot be used with inline expressions (-e). Path traversal outside the mold directory is blocked for security.
Set auto_escape=True when generating HTML to automatically escape <, >, &, etc.
Message functions (msg_*)
Output diagnostic messages to stderr without affecting the data pipeline. All functions take a single string and return None.
Which functions produce output depends on the --quiet / --msg-level flags:
| Function | Stderr output | Visible by default | --quiet |
--msg-level=verbose |
--msg-level=trace |
|---|---|---|---|---|---|
msg_print |
text |
β | β | β | β |
msg_info |
[INFO] text |
β | β | β | β |
msg_warn |
[WARN] text |
β | β | β | β |
msg_error |
[ERROR] text |
β | β | β | β |
msg_verbose |
[VERBOSE] text |
β | β | β | β |
msg_trace |
[TRACE] text |
β | β | β | β |
def transform(data, args, env, headers):
    msg_verbose(f"Input has {len(data)} records")
    missing = [r for r in data if not r.get("email")]
    if missing:
        msg_warn(f"{len(missing)} records without email")
    msg_trace(f"First record: {data[0]}")
    return data
Gatekeeper functions (gk_*)
Validation helpers for asserting conditions and controlling pipeline failure. They work alongside set_exit(): gk_fail and gk_assert set the exit code to 1.
| Function | Signature | Behavior |
|---|---|---|
| gk_fail | gk_fail(msg) | Emit [ERROR] msg to stderr, set exit code to 1 |
| gk_assert | gk_assert(condition, msg) | If condition is falsy → gk_fail(msg) |
| gk_warn | gk_warn(condition, msg) | If condition is falsy → [WARN] msg to stderr (no exit) |
gk_assert and gk_warn use Python-style truthiness: None, False, 0, 0.0, "", [] are falsy.
def transform(data, args, env, headers):
    gk_assert(data.get("version"), "missing 'version' field")
    gk_warn(len(data.get("items", [])) > 0, "items list is empty")
    coverage = data.get("coverage", 0)  # default avoids a KeyError when the field is absent
    if coverage < 80:
        gk_fail(f"Coverage {coverage}% below 80% threshold")
    return data
Tip
The mold continues executing after gk_fail / gk_assert, which lets you collect multiple errors in one run. The exit code is set to 1 at process exit.
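The "record the failure now, exit later" behaviour can be pictured with a small stand-in class. GatekeeperSketch and its method names are hypothetical and exist only for illustration; the real gk_* functions are injected by fimod and need no object.

```python
import sys


class GatekeeperSketch:
    """Hypothetical stand-in for gk_fail / gk_assert / gk_warn."""

    def __init__(self):
        self.exit_code = 0
        self.messages = []

    def _emit(self, line):
        self.messages.append(line)
        print(line, file=sys.stderr)

    def fail(self, msg):
        self._emit(f"[ERROR] {msg}")
        self.exit_code = 1  # recorded now, applied when the process exits

    def check(self, condition, msg):
        if not condition:
            self.fail(msg)

    def warn(self, condition, msg):
        if not condition:
            self._emit(f"[WARN] {msg}")


gk = GatekeeperSketch()
data = {"items": [], "coverage": 72}
gk.check(data.get("version"), "missing 'version' field")
gk.warn(len(data.get("items", [])) > 0, "items list is empty")
gk.check(data.get("coverage", 0) >= 80, "coverage below 80% threshold")
```

All three checks run even though the first one fails, so a single run reports every problem while the exit code stays at 1.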
Environment substitution (env_subst)
| Function | Signature | Returns |
|---|---|---|
| env_subst | env_subst(template, dict) | str; template with ${VAR} placeholders replaced |
Unknown variables are left as-is (standard envsubst behavior). Only ${VAR} syntax is supported ($VAR without braces is not substituted).
def transform(data, args, env, headers):
    url = env_subst("https://${HOST}:${PORT}/api", env)
    return {"url": url, "data": data}
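The substitution rules (braces required, unknown names left untouched) can be sketched with a regular expression. sketch_env_subst is a hypothetical name, and the set of characters allowed in a variable name is an assumption, not something the reference specifies.

```python
import re

# Assumption: variable names look like shell identifiers.
_PLACEHOLDER = re.compile(r"\$\{([A-Za-z_][A-Za-z0-9_]*)\}")


def sketch_env_subst(template, values):
    """Replace ${VAR} placeholders; leave unknown variables as-is."""
    def replace(match):
        name = match.group(1)
        return str(values[name]) if name in values else match.group(0)

    return _PLACEHOLDER.sub(replace, template)


partial = sketch_env_subst("https://${HOST}:${PORT}/api", {"HOST": "example.org"})
unbraced = sketch_env_subst("$HOST/${HOST}", {"HOST": "example.org"})
print(partial, unbraced)
```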
Exit control
| Function | Signature | Returns |
|---|---|---|
| set_exit | set_exit(code) | None |
Sets the process exit code from inside a mold. code is an integer from 0 to 255. The mold continues executing to completion after the call.
See Exit Codes for the interaction with --check.
Format control
| Function | Signature | Returns |
|---|---|---|
| set_input_format | set_input_format(name) | None |
| cast_input_format | cast_input_format(name, value) | value |
| set_output_format | set_output_format(name) | None |
set_input_format(name): re-parses the output of the current step as the given format before feeding it as input to the next step. Useful with --input-format http to re-parse a string body as JSON, CSV, etc.
cast_input_format(name, value): same as set_input_format but returns value. Useful as a single-expression one-liner when both the format hint and the return value are needed.
set_output_format(name): overrides the final output format (like a dynamic --output-format). Also accepts "raw" for binary pass-through (HTTP downloads).
# Fetch raw HTTP response, then re-parse body as JSON
fimod s -i https://jsonplaceholder.typicode.com/todos/1 \
--input-format http \
-e 'cast_input_format("json", data["body"])' \
-e 'data["title"]' --output-format txt
Supported format names for set_input_format: json, ndjson, yaml, toml, csv, txt, lines, http.
set_output_format additionally accepts "raw" (binary pass-through, requires --input-format http).
Transform parameters
All mold scripts receive four parameters: def transform(data, args, env, headers).
args
Dict of --arg name=value pairs. Empty dict {} when no --arg is passed:
def transform(data, args, env, headers):
    limit = int(args["threshold"])
    prefix = args.get("prefix", "")  # with default
    return [u for u in data if u["name"].startswith(prefix) and u["age"] > limit]
env
Dict of filtered environment variables. Populated by --env PATTERN (glob patterns, comma-separated, repeatable). Empty dict {} when no --env is passed:
fimod s -i data.json --env 'HOME,USER' -e 'env["HOME"]'
fimod s -i data.json --env 'GITHUB_*' -e 'env'
fimod s -i data.json --env '*' -e 'env.get("CI", "false")'
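The filtering implied by these patterns can be approximated with Python's fnmatch module. This is a sketch under assumptions: sketch_filter_env is a hypothetical helper, and fimod's actual glob rules (case sensitivity, supported wildcards) may differ from fnmatchcase.

```python
from fnmatch import fnmatchcase


def sketch_filter_env(environ, patterns):
    """Keep variables whose name matches any comma-separated glob pattern."""
    globs = [g.strip() for spec in patterns for g in spec.split(",") if g.strip()]
    return {
        name: value
        for name, value in environ.items()
        if any(fnmatchcase(name, g) for g in globs)
    }


environ = {
    "HOME": "/home/u",
    "USER": "u",
    "GITHUB_SHA": "abc",
    "GITHUB_REF": "main",
    "PATH": "/bin",
}
named = sketch_filter_env(environ, ["HOME,USER"])
prefixed = sketch_filter_env(environ, ["GITHUB_*"])
nothing = sketch_filter_env(environ, [])
print(named, prefixed, nothing)
```

Passing an empty pattern list yields an empty dict, which lines up with env being {} when no --env flag is given.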
headers
List of CSV column names when the input format is CSV with a header row. None for non-CSV input or when using --csv-no-input-header: