
Data Streaming Framework

Easily build data APIs, LLM tooling, and Iceberg views without the data plumbing.

CREATE TABLE UserTokens (
  userid INT NOT NULL,
  tokens BIGINT NOT NULL,
  request_time TIMESTAMP_LTZ(3) METADATA FROM 'timestamp'
);

/*+query_by_all(userid) */
TotalUserTokens := SELECT userid, sum(tokens) AS tokens,
    count(tokens) AS requests FROM UserTokens GROUP BY userid;

UsageAlert := SUBSCRIBE SELECT * FROM UserTokens
    WHERE tokens > 100000;

Integrated SQL

Implement the entire data pipeline in SQL to ingest, process, analyze, store, and serve your data.


DataSQRL Compiler

DataSQRL compiles SQL to an integrated data pipeline that runs on mature open-source technologies.

Deploy with Docker, Kubernetes, or cloud-managed services.

# Run the entire pipeline locally for quick iteration
docker run -it --rm -p 8888:8888 -v $PWD:/build \
  datasqrl/cmd run usertokens.sqrl
# Run test cases locally or in CI/CD
docker run --rm -v $PWD:/build \
  datasqrl/cmd test usertokens.sqrl
# Compile deployment assets for Kubernetes or cloud deployments
docker run --rm -v $PWD:/build \
  datasqrl/cmd compile usertokens.sqrl
# Inspect the compiled plan, schemas, indexes, etc.
(cd build/deploy/plan; ls)

Developer Tooling

Local development, automated tests, CI/CD support, pipeline optimization, introspection, and debugging: DataSQRL brings software engineering best practices to data engineers.

Automate Data Plumbing

DataSQRL allows you to focus on your data by automating the busywork: data mapping, connector management, schema alignment, data serving, SQL dialect translation, API generation, and configuration management.
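
For example, once the pipeline above is running locally, the generated API can be queried directly. The snippet below is an illustrative sketch only: it assumes the API is served as GraphQL on port 8888 (the port mapped in the run command above) at a /graphql endpoint, and that the query_by_all(userid) hint exposes TotalUserTokens by userid; the exact endpoint and schema are determined by the compiled plan.

# Illustrative sketch: fetch one user's aggregated token usage from the generated API.
# The /graphql path and field names are assumptions; check the compiled schema for specifics.
curl -s http://localhost:8888/graphql \
  -H 'Content-Type: application/json' \
  -d '{"query": "{ TotalUserTokens(userid: 1) { userid tokens requests } }"}'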

Easy to Use

Implement your data pipelines with the SQL you already know. DataSQRL lets you focus on the "what" and worry less about the "how". Develop locally, iterate quickly, and deploy with confidence.

Production Grade

DataSQRL compiles efficient data pipelines that run on proven open-source technologies. Data consistency, high availability, scalability, and observability come out of the box.