Why pipelines matter
A lot of analytics work starts as a good query and ends as an unreliable process.
An analyst writes the logic once, copies it into a notebook, exports a file manually, changes a filter next month, and then hopes the numbers still line up. The problem is usually not the analysis. The problem is that the workflow never became repeatable.
That is exactly what SnapQL pipelines are for.
In DataLAB, a pipeline turns a multi-step analytical process into a named asset that the team can rerun, review, and improve over time.
The SnapQL pipeline structure
The canonical structure is intentionally simple. Think about a team that produces the same reporting pack every month: the source file changes, the reporting period changes, but the workflow should not. A pipeline gives that recurring task a stable shape:
```
PIPELINE monthly_report(@period DEFAULT '2026-04'):
    LOAD "sales.csv" AS sales WITH detect_types=true

    WITH sales_period AS
        SELECT *
        FROM sales
        WHERE period = @period

    EXPORT sales_period TO BROWSER AS monthly_report_output
END PIPELINE
```

Then when you want to run it:

```
CALL monthly_report('2026-04');
```

That is the core idea. Put the logic in one place, parameterise what changes, and call it consistently.
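One small note on the `DEFAULT` clause: the natural reading is that omitting the argument runs the pipeline for `'2026-04'`. That call form is an inference from the `DEFAULT` keyword rather than something shown above, so treat it as an assumption:

```
-- Assumed behaviour: with no argument, the DEFAULT period applies.
CALL monthly_report();
```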
What can go inside a pipeline?
SnapQL pipelines can orchestrate a wide mix of work:
- `LOAD` for files and named data sources
- Inline `SELECT` steps for SQL transformations
- `TRANSFORM` steps for expression-driven column changes
- `CREATE MODEL` for supervised model training
- `PREDICT USING MODEL` for batch inference
- `VALIDATE` for dataset or model checks
- `RECONCILE` for finance workflows
- `EXPORT` for browser outputs and file outputs
- `CALL` for composing pipelines out of other saved pipelines (see the sketch after this list)
That means the same syntax surface can support operational analytics, financial close processes, and predictive workflows.
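That composition step deserves a quick illustration. The sketch below leans only on constructs shown elsewhere in this post; the two called pipeline names are hypothetical, and passing `@period` straight through as a `CALL` argument is an assumption about the grammar:

```
-- Hypothetical composition: two saved pipelines chained into one
-- month-end run. reconcile_ledgers and publish_summary are
-- illustrative names, not pipelines defined in this post.
PIPELINE month_end(@period DEFAULT '2026-03'):
    CALL reconcile_ledgers(@period)
    CALL publish_summary(@period)
END PIPELINE
```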
Example: an analytical pipeline
Here is a more realistic example, in the style we use for demonstrations.
Imagine a finance or commercial analytics team building an executive summary each month. They need one workflow that starts with the revenue register, calculates margin by product category, and produces a clean entity-level summary for management review.
```
PIPELINE executive_sales_summary(@period DEFAULT '2026-03'):
    LOAD "revenue_register.csv" AS revenue WITH detect_types=true

    WITH product_summary AS
        SELECT entity,
               product_category,
               SUM(revenue_amount) AS total_revenue,
               SUM(cost_amount) AS total_cost,
               ROUND(100.0 * SUM(revenue_amount - cost_amount)
                     / NULLIF(SUM(revenue_amount), 0), 2) AS margin_pct
        FROM revenue
        WHERE period = @period
        GROUP BY entity, product_category

    WITH executive_summary AS
        SELECT entity,
               COUNT(*) AS category_count,
               SUM(total_revenue) AS revenue,
               AVG(margin_pct) AS avg_margin
        FROM product_summary
        GROUP BY entity

    EXPORT executive_summary TO BROWSER AS executive_summary_output
END PIPELINE
```

This is the kind of workflow that often starts life as a saved SQL file and gradually grows into something more brittle. In SnapQL, it becomes an explicit pipeline with a name, a parameter, intermediate outputs, and a repeatable final export.
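When the next period closes, only the argument changes. This is the same `CALL` form shown in the first example, pointed at a hypothetical following period:

```
-- Rerun the same workflow for the next reporting period.
CALL executive_sales_summary('2026-04');
```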
Example: a finance workflow
Pipelines become even more useful when the work crosses from SQL into audit and finance operations.
This is the sort of flow a controller, audit senior, or engagement team could use at month-end: bring in the ledger and bank data, scope it to the period, validate the inputs, reconcile, and immediately expose discrepancies for review.
```
PIPELINE monthly_close_reconciliation(@period DEFAULT '2026-03'):
    LOAD "general_ledger.csv" AS gl WITH detect_types=true
    LOAD "bank_statement.csv" AS bank WITH detect_types=true

    WITH gl_period AS
        SELECT *
        FROM gl
        WHERE period = @period

    WITH bank_period AS
        SELECT *
        FROM bank
        WHERE period = @period

    VALIDATE gl_period WHERE amount IS NOT NULL
    VALIDATE bank_period WHERE amount IS NOT NULL

    RECONCILE gl_period TO bank_period
        ON account_id = account_id
        COMPARE amount
        TOLERANCE 0.01

    EXPORT discrepancies TO BROWSER AS monthly_recon_discrepancies
END PIPELINE
```

This is where the language starts to earn its keep. The workflow does not stop at data shaping: it continues through validation, comparison, and export, with the `RECONCILE` step producing the `discrepancies` output that the final `EXPORT` publishes for review.
Example: a model pipeline
Pipelines can also hold feature engineering and scoring logic around machine learning workflows.
Here the real-world scenario is usually an analytics or retention team that already has a trained model and wants a reliable scoring process for each new customer snapshot, not a one-off notebook run that someone has to remember how to reproduce next month.
```
PIPELINE churn_scoring(@snapshot DEFAULT '2026-04'):
    LOAD "customer_snapshot.csv" AS customers WITH detect_types=true

    WITH scoring_input AS
        SELECT customer_id,
               tenure,
               monthly_charges,
               total_charges,
               support_calls,
               contract_type
        FROM customers
        WHERE snapshot_month = @snapshot

    PREDICT USING MODEL churn_predictor
        ON scoring_input
        AS churn_scores

    EXPORT churn_scores TO BROWSER AS churn_scores_output
END PIPELINE
```

That lets a team separate model training from repeatable scoring operations while keeping both in the same language family.
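The training half can live in its own pipeline built around `CREATE MODEL`, which the step list above mentions but this post does not show. Everything below other than the `CREATE MODEL` step reuses constructs from the earlier examples; the `FROM`/`TARGET` clause shape, the source file name, and the `--` comment syntax are assumptions for illustration, not confirmed grammar:

```
-- A hedged training sketch. The CREATE MODEL clause shape shown here
-- (FROM ... TARGET ...) is an assumption, not documented SnapQL grammar.
PIPELINE churn_training(@snapshot DEFAULT '2026-03'):
    LOAD "customer_history.csv" AS history WITH detect_types=true

    WITH training_input AS
        SELECT tenure,
               monthly_charges,
               total_charges,
               support_calls,
               contract_type,
               churned
        FROM history
        WHERE snapshot_month <= @snapshot

    -- Hypothetical: fit churn_predictor on the labelled training set.
    CREATE MODEL churn_predictor
        FROM training_input
        TARGET churned
END PIPELINE
```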
Why this is better than ad hoc orchestration
The alternative is not "no pipeline." The alternative is usually a messy mix of:
- Saved SQL files
- Manual spreadsheet steps
- Copy-paste exports
- Notebook fragments
- Analyst memory about what changed last month
Pipelines give teams something much more durable:
- One named place for the workflow logic
- Parameters for period or entity changes
- Reusable intermediate outputs
- Easier review and handover
- Better fit for engagement and finance processes
That is especially valuable in desktop-first environments where analysts are doing serious data work locally and still need repeatability.
Where DataLAB fits today
DataLAB is strongest today as a desktop-first platform. The web application exists, but it is still earlier in maturity. For teams evaluating Snaplytics right now, SnapQL pipelines are best understood as part of a serious desktop analytics workflow rather than a cloud-only orchestration product.
Final take
If your team already knows SQL, SnapQL pipelines give you a practical path from one-off analysis to repeatable operational workflows.
That is the real value. Not more syntax for the sake of it, but a better way to turn analysis into something the whole team can run again.