NFQL
NFQL is the language you type into the Query page of the GUI
(or obserae-cli query on the command line) to investigate
traffic. It is a small, pipeline-oriented language designed to
read like a sentence and to integrate with your cartography.
FROM flows | WHERE ip == "production" AND protocol == TCP | LIMIT 100
Reads as: “From the flows table, keep rows whose source or destination IP is somewhere in the production group and whose protocol is TCP, then return at most 100 rows.”
This page is a hands-on guide covering the syntax, the operators, and a library of copy-able recipes.
Running a query
Web GUI
Open the Query page (sidebar). Type your pipeline, press
Ctrl+Enter (or click Run). Results render in a sortable table
below.
CLI
# Plain table
obserae-cli query 'FROM flows | LIMIT 5'
# Bind values into ? placeholders
obserae-cli query --arg 443 'FROM flows | WHERE dst_port == ? | LIMIT 5'
# JSON (one object per row)
obserae-cli query --json 'FROM flows | KEEP src_addr, bytes | LIMIT 3'
The pipeline model
Every query is a chain of stages separated by |. Each stage
transforms the rows produced by the previous one:
FROM flows # source — always required, always first
| WHERE protocol == TCP # filter
| KEEP src_addr, dst_addr # narrow the column set
| SORT bytes DESC # order rows
| LIMIT 50 # cap row count
Stages run left-to-right, like Unix pipes. A WHERE after a
KEEP only sees the kept columns.
Stages
| Stage | Purpose |
|---|---|
FROM | Source table. |
WHERE | Filter rows by a boolean predicate. |
KEEP | Project to the listed columns, in the listed order. |
DROP | Remove the listed columns, keep the rest in their original order (reciprocal of KEEP). |
RENAME | Rename columns in place: RENAME new = old, … (new name on the left). |
SORT | Order rows by one or more keys (ASC default, DESC available). |
LIMIT | Cap the number of rows. |
LAST | Time window: rows from the last N seconds. |
BETWEEN | Time window: explicit <from> / <to> bounds. |
STATS | Aggregate: STATS <alias> = <fn>(<arg>) [, …] [BY <col> …]. |
EVAL | Compute new columns: EVAL <alias> = <expr>. |
HAVING | Filter after STATS (the post-aggregation analogue of WHERE). |
PIVOT | Cross-pipeline semi-join (see Cross-pipeline lookups). |
PIVOT NOT | Anti-semi-join. |
JOIN | Inner equi-join. |
SORT and LIMIT may appear once each. WHERE, KEEP, LAST,
BETWEEN may appear multiple times; their effects intersect.
Tables
| Table | Purpose | Time column |
|---|---|---|
flows | Append-only, one row per NetFlow record. | time_received |
sessions | Bidirectional, role-aware consolidation, per exporter. Carries correlation_id. See Sessions. | opened_at |
sessions_consolidated | One row per conversation: the per-exporter sessions of the same 5-tuple merged. min/max volumes across exporters + coherence_pct. | opened_at |
session_matches | One row per (session, rule) hit. JOIN with sessions to recover endpoints. | matched_at |
enrichment_ranges | CIDR catalogue (cloud + threat). WITHIN-join a flow set against cidr. See IP Enrichment. | fetched_at |
enrichment_ips | Insert-time dimension: one row per (ip, source) resolved at ingest. Equi-join on the IP — see IP Enrichment. | resolved_at |
arp | ARP observations from supported device connectors, currently OPNsense. Use them to attach MAC/hostname context to traffic. Equi-join/PIVOT on ip. See Device Connectors. | timestamp |
dhcp | DHCP lease observations from supported device connectors, currently OPNsense. Resolves dynamic IPs to hostname/MAC context. Equi-join/PIVOT on ip. See Device Connectors. | timestamp |
LAST and BETWEEN always filter on the listed time column —
syntax stays the same regardless of which table you query.
Most-used columns
flows: src_addr, dst_addr, src_port, dst_port,
protocol, bytes, packets, tcp_flags, time_received,
time_flow_start, time_flow_end, sampler_address.
sessions: session_id, ip_a/port_a, ip_b/port_b,
protocol, state (active/half_open/closed),
opened_at, closed_at, close_reason, server_ip,
server_port, role_method, role_conf,
ab_bytes/ba_bytes, ab_pkts/ba_pkts, correlation_id
(conversation group; NULL while open).
sessions_consolidated: correlation_id, client_ip/client_port,
server_ip/server_port, protocol, session_count, sampler_count,
samplers, coherence_pct (0–100),
min_client_to_server_bytes/max_client_to_server_bytes (and _packets,
and the server_to_client_ pair), opened_at, closed_at. Endpoints are
client/server oriented (not the raw sessions’ canonical a/b); the
virtual ip/port cover both sides.
session_matches: session_id, rule_id, matched_at.
Predicates in WHERE
Comparisons
FROM flows | WHERE bytes > 1000000
FROM flows | WHERE protocol == 6 # int eq
FROM flows | WHERE src_port < 1024
FROM flows | WHERE dst_port >= 5000
Operators: ==, !=, <, <=, >, >=.
For CIDR containment on INET columns, use the explicit WITHIN
operator:
FROM sessions | WHERE server_ip WITHIN "10.0.0.0/8"
Reads “left value is contained by right value”. Both operands must
be INET. Note: == against a CIDR is strict equality (the
column must equal that exact IP/CIDR), not containment. Use
WITHIN for containment.
Logical operators
FROM flows | WHERE protocol == TCP AND dst_port == 443
FROM flows | WHERE dst_port == 80 OR dst_port == 443
FROM flows | WHERE NOT (protocol == TCP)
FROM flows | WHERE (dst_port == 80 OR dst_port == 443) AND protocol == TCP
Precedence (tightest first): NOT, comparisons, AND, OR. Use
parentheses when in doubt.
Set membership
FROM flows | WHERE dst_port IN (80, 443, 8080)
FROM flows | WHERE protocol IN (TCP, UDP)
Ergonomic features
Named protocol constants
| Constant | Value | Meaning |
|---|---|---|
TCP | 6 | TCP |
UDP | 17 | UDP |
ICMP | 1 | ICMP (IPv4) |
ICMPv6 | 58 | ICMP (IPv6) |
SCTP | 132 | SCTP |
GRE | 47 | GRE |
Case-insensitive. Resolve to their integer value:
FROM flows | WHERE protocol == tcp # same as protocol == 6
FROM flows | WHERE protocol IN (TCP, UDP)
FROM flows | WHERE protocol == "TCP" # string form also works
Virtual columns
Two columns expand to a disjunction over their src/dst pair:
| Virtual | Expands to |
|---|---|
ip | src_addr OR dst_addr |
port | src_port OR dst_port |
FROM flows | WHERE ip == "10.0.0.1"
# rewritten as: src_addr == "10.0.0.1" OR dst_addr == "10.0.0.1"
FROM flows | WHERE port == 443
# rewritten as: src_port == 443 OR dst_port == 443
!= flips the connector to AND (De Morgan): ip != X means
neither src_addr nor dst_addr equal X. IN distributes over
both sides.
On sessions, ip and port expand to ip_a/ip_b and
port_a/port_b respectively — same syntax, same semantics.
Cartography references
When you compare any INET column with a string literal, NFQL
picks the right interpretation:
| String | Meaning |
|---|---|
"192.168.1.1" | The exact IP (IPv4 or IPv6, e.g. "2001:db8::1") |
"10.0.0.0/8" | The exact CIDR (IPv4 or IPv6, e.g. "2001:db8::/64"; use WITHIN for containment) |
"any4" / "any6" | Reserved: every IP of one family (0.0.0.0/0 or ::/0) |
"internet4" / "internet6" | Public unicast of one family (excludes RFC1918 / ULA, loopback, link-local, CGNAT, multicast, reserved) |
"internal4" / "internal6" | Non-routable IPs of one family — the exact complement of "internet4" / "internet6" (RFC1918 / ULA, loopback, link-local, CGNAT, multicast, reserved, mapped) |
"network:NAME" | The named network’s CIDR |
"NAME.dhcp" | The network’s DHCP pool only (needs a DHCP range) |
"NAME.static" | The network’s CIDR minus its DHCP pool |
"host:NAME" | Every interface IP of the host |
"host:NAME:IFACE" | One specific interface IP |
"group:NAME" | Every IP of every member host (recursive) |
"NAME" (bare) | Looked up across networks/hosts/groups |
FROM flows | WHERE src_addr == "internet4" AND dst_addr == "loadbalancers"
FROM flows | WHERE ip IN ("10.0.0.0/8", "host:proxy", "group:backends")
FROM sessions | WHERE ip == "host:proxy"
FROM sessions | WHERE server_ip == "group:databases" AND server_port == 5432
# the dynamic side vs the fixed side of a DHCP segment
FROM sessions | WHERE ip == "office.dhcp"
FROM flows | WHERE src_addr == "office.static"
The lookup is dynamic — if the cartography changes between two runs of the same compiled query, the next run reflects the new state.
IPv6 addresses use the same quoted-string syntax as IPv4 — there is no special form:
FROM flows | WHERE ip == "2001:db8::1"
FROM sessions | WHERE server_ip WITHIN "2001:db8::/64"
The reserved keywords are family-specific: "internet4",
"internal4" and "any4" only match IPv4 addresses; "internet6",
"internal6" and "any6" only IPv6. To match both families in one
query, combine them — e.g. WHERE ip == "internet4" OR ip == "internet6".
"internal4" / "internal6" are the exact complement of their
internet twin: every routable v4 IP satisfies "internet4", every
non-routable v4 IP satisfies "internal4", and the two never overlap.
Use it to write “anything on the LAN” without spelling out every
RFC1918 / loopback / link-local CIDR.
Time windows
LAST N — last N seconds
FROM flows | LAST 60 # last minute
FROM flows | LAST 3600 # last hour
FROM flows | LAST 86400 # last day
BETWEEN <from> AND <to>
Both sides accept:
| Form | Meaning |
|---|---|
| Positive integer | Absolute Unix timestamp in seconds |
| Negative integer | -N = N seconds before now |
| RFC 3339 string | "2026-05-01T12:00:00Z", with TZ offsets allowed |
"YYYY-MM-DD" | Date-only, midnight UTC |
* | Open: no constraint on this side |
now | Right-side alias for * |
? | Bound at run time, sign-based dispatch |
FROM flows | BETWEEN -3600 AND -60 # 1h ago to 1min ago
FROM flows | BETWEEN -3600 AND * # last hour
FROM flows | BETWEEN "2026-05-01" AND "2026-05-02"
Time stages compose by intersection — mixing LAST with BETWEEN,
or several BETWEENs, narrows the window.
Parameters (?)
Use ? placeholders for runtime values. They are positional,
left-to-right. The CLI binds them via --arg:
obserae-cli query \
--arg 443 \
--arg "10.0.0.0/8" \
'FROM flows | WHERE dst_port == ? AND src_addr == ?'
Why use them?
- Safety — values never become part of the SQL string. User input can’t change the query’s structure.
- Reuse — the same compiled query runs with different inputs without recompilation.
Aggregation: STATS, EVAL, HAVING
STATS folds rows into groups and applies aggregate functions:
STATS <alias> = <fn>(<arg>) [, <alias> = <fn>(<arg>)]* [BY <col> [, <col>]*]
<alias>is mandatory — the output column is named after it.<fn>is one ofCOUNT,COUNT_DISTINCT,SUM,AVG,MIN,MAX, and the statistical functionsSTDDEV,MEDIAN,PERCENTILE(col, n)(n = the percentile rank, an integer1..99—PERCENTILE(out, 95)is the 95th percentile). These three return fractionalDOUBLEvalues, which you can filter with a decimal threshold inHAVING(e.g.... | HAVING sd > 20.5) — NFQL supports decimal literals like20.5or0.95alongside integers.- A
BYterm can be a column orBUCKET(time_col, seconds)for a time histogram — e.g.... BY ip_a, BUCKET(opened_at, 60)(the bucket column is namedbucket). - The output schema is the
BYcolumns followed by the aliases. Original columns are dropped — downstream sees the new shape.
# Top 20 talker pairs by volume, last hour
FROM flows
| LAST 3600
| STATS bytes_total = SUM(bytes), n = COUNT(*) BY src_addr, dst_addr
| SORT bytes_total DESC
| LIMIT 20
# Overall traffic in one row
FROM flows
| LAST 3600
| STATS total_bytes = SUM(bytes),
n_flows = COUNT(*),
uniq_srcs = COUNT_DISTINCT(src_addr)
HAVING filters the aggregated rows:
FROM flows
| LAST 3600
| STATS n = COUNT(*) BY src_addr
| HAVING n > 1000
| SORT n DESC
EVAL adds computed columns before or after STATS:
FROM flows
| LAST 3600
| STATS bytes = SUM(bytes), pkts = SUM(packets) BY dst_addr
| EVAL bytes_per_pkt = bytes / pkts
| SORT bytes_per_pkt DESC
Cross-pipeline lookups
The > operator chains two pipelines. The right pipeline can
filter (or extend) its rows against a column of the left pipeline.
Use this whenever you have a signal in one table and want to look
up the matching rows in another.
| Keyword | Result keeps | Use it for… |
|---|---|---|
PIVOT | current columns only | “rows of B that match a row of A” |
PIVOT NOT | current columns only | “rows of B that match no row of A” |
JOIN | current columns + prev_* | “rows of B paired with their A row, both visible” |
Either side of the predicate can be a virtual column (ip, port):
it matches any of its real columns. So … > FROM flows | JOIN ip == ip
pairs a flow whose source or destination equals the other pipeline’s
ip — handy to find every flow touching a threat IP regardless of
direction.
Examples
Top 10 sessions by volume → which rules they matched
FROM sessions | LAST 3600 | SORT ab_bytes DESC | LIMIT 10
> FROM session_matches | PIVOT session_id == session_id
| KEEP rule_id, matched_at
Sessions that matched NO rule — the canonical anomaly query:
FROM session_matches | LAST 3600
> FROM sessions | LAST 3600 | WHERE state == "closed"
| PIVOT NOT session_id == session_id
| KEEP ip_a, ip_b, server_ip, role_method
Drill all the way down: matches → sessions → raw flows
FROM session_matches | LAST 3600
> FROM sessions | PIVOT session_id == session_id
> FROM flows | PIVOT last_flow_id == flow_id
| KEEP src_addr, dst_addr, bytes, packets
| LIMIT 100
Note.
sessions.last_flow_idis present for this contract but always NULL under the in-memory sessionizer (a flow’s id is assigned when it is stored, after it has been folded), so the last PIVOT step currently returns no rows. To reach the raw flows behind a session today, pivot on the endpoints and the session’s time window instead.
Sessions whose server is in an AWS range (CIDR catalogue, WITHIN):
FROM enrichment_ranges | WHERE source == "aws" | KEEP cidr
> FROM sessions | PIVOT server_ip WITHIN cidr
| LAST 3600
| KEEP ip_a, ip_b, server_ip, ab_bytes
Sessions talking to a known threat IP (insert-time table, ==) —
an equi-join, so it stays fast on large result sets:
FROM enrichment_ips | WHERE nature == "threat" | KEEP ip
> FROM sessions | PIVOT ip == server_ip
| LAST 3600
| KEEP ip_a, ip_b, server_ip, ab_bytes
Recipes
The recipes below assume you have imported a typical cartography
with loadbalancers, backends, databases, bastion, proxy
entities.
Recent traffic to the load balancers on HTTPS
FROM flows
| LAST 3600
| WHERE ip == "loadbalancers" AND port == 443 AND protocol == TCP
| KEEP src_addr, dst_addr, bytes
Backends reaching the internet without the proxy
FROM flows
| WHERE src_addr == "backends"
AND (dst_addr == "internet4" OR dst_addr == "internet6")
AND dst_addr != "host:proxy:eth1"
Top 10 talkers to the database tier
FROM flows
| LAST 3600
| WHERE dst_addr == "databases"
| STATS bytes = SUM(bytes) BY src_addr
| SORT bytes DESC
| LIMIT 10
SSH attempts that bypassed the bastion
FROM flows
| WHERE port == 22
AND src_addr != "bastion"
AND dst_addr != "bastion"
AND protocol == TCP
Postgres traffic outside the data network
FROM flows
| WHERE port == 5432
AND ip != "network:data"
AND protocol == TCP
TCP scans surfaced by the half-open detector
FROM sessions
| WHERE close_reason == "no_reply" AND protocol == TCP
| KEEP ip_a, ip_b, port_b, opened_at
| SORT opened_at DESC
Recent sessions to the database tier
FROM sessions is closed-only — open sessions live in the sessionizer’s
RAM and never reach the table, so scope with LAST instead of a
state == "active" filter. (For live/open counts, watch the cockpit
gauges.)
FROM sessions
| LAST 3600
| WHERE ip == "databases" AND port == 5432
| KEEP ip_a, ip_b, ab_pkts, ba_pkts, opened_at
| SORT opened_at DESC
Closed sessions in an incident window
FROM sessions
| BETWEEN "2026-05-01T12:00:00Z" AND "2026-05-01T12:30:00Z"
| WHERE ip == "production" AND state == "closed"
| KEEP ip_a, ip_b, server_ip, server_port, role_method
| SORT ab_bytes DESC
| LIMIT 200
Conversations seen by more than one exporter
FROM sessions_consolidated
| WHERE sampler_count > 1
| KEEP client_ip, server_ip, samplers, min_client_to_server_bytes, max_client_to_server_bytes, coherence_pct
| SORT coherence_pct ASC
A low coherence_pct means the exporters disagree on volume —
expected under sampling, drops, or partial visibility on one side.
Reading error messages
NFQL errors carry a line:column position pointing at the faulty
token in your source.
| Stage | Example | Fix |
|---|---|---|
lex | lex: 1:14: unexpected character '@' | Drop the stray byte. |
parse | parse: 1:18: expected an expression, got "AND" | Missing operand before AND. |
sem | sem: 1:18: unknown column "prtocol" | Typo on a column name. |
sem | sem: 1:32: host "srv-typoo": not found | Cartography typo, fix in query or topology. |
sem | sem: 1:18: virtual column "ip" does not support < | Use src_addr / dst_addr explicitly. |
In the GUI, errors are shown inline under the offending token; in
the CLI, they print to stderr with the line:column prefix.