Defining Data Interfaces with FlinkSQL

· 4 min read
Matthias Broecheler
CEO of DataSQRL

FlinkSQL is an amazing innovation in data processing: it packages the power of realtime stream processing within the simplicity of SQL. That means you can start with the SQL you know and introduce stream processing constructs as you need them.

FlinkSQL API Extension

FlinkSQL adds the ability to process data incrementally to the classic set-based semantics of SQL. In addition, FlinkSQL supports source and sink connectors making it easy to ingest data from and move data to other systems. That's a powerful combination which covers a lot of data processing use cases.

In fact, it only takes a few extensions to FlinkSQL to build entire data applications. Let's see how that works.

Building Data APIs with FlinkSQL

-- Mutation endpoint: no connector is configured, so events are captured through the API
CREATE TABLE UserTokens (
   userid BIGINT NOT NULL,
   tokens BIGINT NOT NULL,
   request_time TIMESTAMP_LTZ(3) NOT NULL METADATA FROM 'timestamp'
);

-- Aggregation exposed as a query endpoint; the hint defines userid as the query argument
/*+query_by_all(userid) */
TotalUserTokens := SELECT userid, sum(tokens) AS total_tokens,
                          count(tokens) AS total_requests
                   FROM UserTokens GROUP BY userid;

-- Table function defining a query endpoint with explicit arguments
UserTokensByTime(userid BIGINT NOT NULL, fromTime TIMESTAMP NOT NULL, toTime TIMESTAMP NOT NULL) :=
   SELECT * FROM UserTokens WHERE userid = :userid
     AND request_time >= :fromTime AND request_time < :toTime ORDER BY request_time DESC;

-- Subscription endpoint: matching events are pushed to clients in real-time
UsageAlert := SUBSCRIBE SELECT * FROM UserTokens WHERE tokens > 100000;

This script defines a sequence of tables. We introduce := as syntactic sugar for the verbose CREATE TEMPORARY VIEW syntax.
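
For illustration, the TotalUserTokens definition above expands to roughly the following standard FlinkSQL (a sketch; the exact view the compiler generates may differ):

CREATE TEMPORARY VIEW TotalUserTokens AS
SELECT userid, sum(tokens) AS total_tokens,
       count(tokens) AS total_requests
FROM UserTokens GROUP BY userid;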

The UserTokens table does not have a configured connector, which means we treat it as an API mutation endpoint connected to Flink via a Kafka topic that captures the events. This makes it easy to build APIs that capture user activity, transactions, or other types of events.
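
Conceptually, you can think of the mutation endpoint as a Kafka-backed table provisioned behind the scenes, roughly equivalent to hand-writing something like the following (a sketch only; the topic name, broker address, and format are hypothetical placeholders, not the generated configuration):

CREATE TABLE UserTokens (
   userid BIGINT NOT NULL,
   tokens BIGINT NOT NULL,
   request_time TIMESTAMP_LTZ(3) NOT NULL METADATA FROM 'timestamp'
) WITH (
   'connector' = 'kafka',                              -- placeholder settings for illustration
   'topic' = 'usertokens',
   'properties.bootstrap.servers' = 'localhost:9092',
   'format' = 'json'
);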

Next, we sum up the data collected through the API for each user. This is a standard FlinkSQL aggregation query, and we expose the result in our API through the query_by_all hint, which defines the arguments for the query endpoint of that table.

We can also explicitly define query endpoints with arguments through SQL table functions. FlinkSQL supports table functions natively; all we had to do was provide the syntax for defining the function signature.

And last, the SUBSCRIBE keyword in front of the query defines a subscription endpoint for requests exceeding a certain token count; matching events are pushed to clients in real time.

Voilà, we just built ourselves a complete GraphQL API with mutation, query, and subscription endpoints. Run the script above with DataSQRL to see the result:

docker run -it --rm -p 8888:8888 -v $PWD:/build datasqrl/cmd run usertokens.sqrl

Relationships for Complex Data Structures

And for extra credit, we can define relationships in FlinkSQL to represent the structure of our data explicitly and expose it in the API:

User.totalTokens := SELECT * FROM TotalUserTokens t WHERE this.userid = t.userid LIMIT 1;

The User table in this example is read from an upsert Kafka topic using a standard FlinkSQL CREATE TABLE statement.
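
Such a definition might look like the following (a sketch; the column list, topic, brokers, and formats are illustrative placeholders):

CREATE TABLE `User` (                  -- backticks escape the reserved word USER
   userid BIGINT NOT NULL,
   name   STRING,
   PRIMARY KEY (userid) NOT ENFORCED   -- upsert-kafka requires a primary key
) WITH (
   'connector' = 'upsert-kafka',
   'topic' = 'users',
   'properties.bootstrap.servers' = 'localhost:9092',
   'key.format' = 'json',
   'value.format' = 'json'
);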

Code Modularity and Connector Management

Many FlinkSQL projects break the codebase into multiple files for better readability and modularity, or to swap out sources and sinks. That requires extra infrastructure to manage the FlinkSQL files and stitch them together.

How about we do that directly in FlinkSQL?

IMPORT source-data.User;

Here, we import the User table from a separate file within the source-data directory, allowing us to separate the data processing logic from the source configurations. It also enables us to use dependency management to swap out sources for local testing vs production.
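
For illustration, one way to define the imported table is an ordinary FlinkSQL CREATE TABLE placed in the source-data package; a minimal sketch under that assumption, using a hypothetical file-backed table for local testing (connector, path, and columns are placeholders):

CREATE TABLE `User` (
   userid BIGINT NOT NULL,
   name   STRING
) WITH (
   'connector' = 'filesystem',   -- a production package might point at a Kafka topic instead
   'path' = '/data/users',
   'format' = 'json'
);

Swapping the dependency that source-data resolves to then switches between, say, file-backed test data and the production source without touching the processing logic.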

And we can do the same for sinks:

EXPORT UsageAlert TO mysinks.UsageAlert;

In addition to breaking the sink configuration out of the main script, the EXPORT statement functions as an INSERT INTO statement and implicitly creates a STATEMENT SET. That makes the code easier to read.
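
Written out in plain FlinkSQL, the export corresponds roughly to a statement set wrapping an INSERT INTO (a sketch; UsageAlertSink is a hypothetical name standing in for the sink table defined by mysinks.UsageAlert):

EXECUTE STATEMENT SET
BEGIN
   INSERT INTO UsageAlertSink
   SELECT * FROM UsageAlert;
END;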

Learn More

FlinkSQL is a phenomenal extension of the SQL ecosystem into stream processing. With DataSQRL, we are trying to make it easier to build end-to-end data pipelines and complete data applications with FlinkSQL.

Check out the complete example which also covers testing, customization, and deployment. Or read the documentation to learn more.

DataSQRL 0.6 Release: The Streaming Data Framework

· 3 min read
Matthias Broecheler
CEO of DataSQRL

The DataSQRL community is proud to announce the release of DataSQRL 0.6. This release marks a major milestone in the evolution of our open-source project, bringing enhanced alignment with Flink SQL and powerful new capabilities to the real-time serving layer.

DataSQRL 0.6.0 Release

You can find the full release notes and source code on our GitHub release page. To get started with the latest compiler, simply pull the latest Docker image:

docker pull datasqrl/cmd:0.6.0

With DataSQRL 0.6, we are embracing the Flink ecosystem more deeply than ever before. This release introduces a complete re-architecture of the DataSQRL compiler to build directly on top of Flink SQL's parser and planner. By aligning our internal model with Flink SQL semantics, we unlock a host of new capabilities and bring DataSQRL users closer to the vibrant Flink ecosystem.

This architectural shift allows DataSQRL to:

  • Use Flink SQL syntax as the foundation, enabling more intuitive query definitions and easier onboarding for users familiar with Flink.
  • Extend Flink SQL with domain-specific features, such as declarative relationship definitions and functions to define the data interface.
  • Transpile FlinkSQL to database dialects for query execution.

Serving-Layer Power: Functions & Relationships

DataSQRL 0.6 introduces first-class support for defining functions and relationships in your SQRL scripts. These constructs make it easier to model complex application logic in a modular, declarative fashion.
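
For example, a relationship and a function can be declared directly in an SQRL script using plain SQL (a minimal sketch reusing the UserTokens tables from the FlinkSQL interface post above):

-- Relationship: attach each user's aggregated totals to the User table
User.totalTokens := SELECT * FROM TotalUserTokens t WHERE this.userid = t.userid LIMIT 1;

-- Function: a parameterized query endpoint over the UserTokens stream
UserTokensByTime(userid BIGINT NOT NULL, fromTime TIMESTAMP NOT NULL, toTime TIMESTAMP NOT NULL) :=
   SELECT * FROM UserTokens
   WHERE userid = :userid AND request_time >= :fromTime AND request_time < :toTime;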

These features are purpose-built for powering LLM-ready APIs, event-driven architectures, and real-time user-facing applications.

Check out the language documentation for details.

Developer Tooling

DataSQRL 0.6 provides a Docker image for compiling, running, and testing SQRL projects, so you can iterate quickly, check results locally, and run automated tests in CI/CD.

Deployment Artifacts

DataSQRL 0.6 removes deployment profiles and instead generates all deployment artifacts in the build/deploy/plan folder. This makes it easier to integrate with Kubernetes deployment processes (e.g. via Helm) or cloud managed service deployments (e.g. via Terraform).

Breaking Changes & Migration Path

As this is a major release, DataSQRL 0.6 is not backwards compatible with version 0.5. The syntax and internal representation have been updated to align with Flink SQL and to support the new compiler architecture.

To help you transition, we’ve provided updated examples and migration guidance in the DataSQRL examples repository. We recommend starting with one of the updated use cases to get a feel for the new workflow.

Thanks to the Community

This release wouldn’t have been possible without the contributions, bug reports, and thoughtful feedback from our growing community. Whether you opened a pull request, filed an issue, or joined a discussion, thank you. Your support drives this project forward.

We’re excited to see what you build with DataSQRL 0.6. If you haven’t joined the community yet, now’s a great time to get involved: star us on GitHub, try out the latest release, and share your thoughts.

Stay tuned for more updates, and happy building.

Why Temporal Join is Stream Processing’s Superpower

· 8 min read
Matthias Broecheler
CEO of DataSQRL

Stream processing technologies like Apache Flink introduce a new type of data transformation that’s very powerful: the temporal join. Temporal joins add context to data streams while being efficient and fast to execute.

Temporal Join

This article introduces the temporal join, compares it to the traditional inner join, explains when to use it, and why it is a secret superpower.
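
In FlinkSQL, a temporal join uses the FOR SYSTEM_TIME AS OF clause to enrich each stream event with the version of a versioned table that was valid at the event's timestamp; a minimal sketch with illustrative table names:

SELECT o.order_id, o.amount * r.rate AS converted_amount
FROM Orders AS o
JOIN CurrencyRates FOR SYSTEM_TIME AS OF o.order_time AS r
   ON o.currency = r.currency;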


Let's Uplevel Our Database Game: Meet DataSQRL

· 5 min read
Matthias Broecheler
CEO of DataSQRL

We need to make it easier to build data-driven applications. Databases are great if all your application needs is storing and retrieving data. But if you want to build anything more interesting with data - like serving users recommendations based on the pages they are visiting, detecting fraudulent transactions on your site, or computing real-time features for your machine learning model - you end up building a ton of custom code and infrastructure around the database.

You need a queue like Kafka to hold your events, a stream processor like Flink to process data, a database like Postgres to store and query the result data, and an API layer to tie it all together.

And that’s just the price of admission. To get a functioning data layer, you need to make sure that all these components talk to each other and that data flows smoothly between them. Schema synchronization, data model tuning, index selection, query batching … all that fun stuff.

The point is, you need to do a ton of data plumbing if you want to build a data-driven application. All that data plumbing code is time-consuming to develop, hard to maintain, and expensive to operate.

We need to make building with data easier. That’s why we are sending out this call to action to uplevel our database game. Join us in figuring out how to simplify the data layer.

We have an idea to get us started: Meet DataSQRL.