DataSQRL Configuration (package.json file)

DataSQRL projects are configured with one or more JSON files.
Unless a file is passed explicitly to datasqrl compile -c ..., the compiler looks for a package.json in the working directory; if none is found, the built-in default (see Default Configuration below) is applied.

Multiple files can be provided; they are merged in order: later files override earlier ones, objects are deep-merged, and array values are replaced wholesale.
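
The merge rules can be illustrated with a short Python sketch. This is not DataSQRL's implementation; the function name merge_configs and the two sample configurations are assumptions made purely for illustration.

```python
import copy


def merge_configs(base: dict, override: dict) -> dict:
    """Return a new dict in which `override` is merged on top of `base`.

    Objects (dicts) are merged key by key; every other value, including
    arrays (lists), is replaced wholesale by the later file's value.
    """
    merged = copy.deepcopy(base)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = merge_configs(merged[key], value)
        else:
            merged[key] = copy.deepcopy(value)
    return merged


# Hypothetical contents of two configuration files, listed in merge order.
first = {
    "enabled-engines": ["vertx", "postgres", "kafka", "flink"],
    "compiler": {"logger": "print", "explain": {"text": True, "sql": False}},
}
second = {
    "enabled-engines": ["vertx", "postgres"],
    "compiler": {"explain": {"sql": True}},
}

merged = merge_configs(first, second)
```

In this sketch the enabled-engines array from the second file replaces the first one entirely, while compiler.explain keeps text from the first file and takes sql from the second.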


Top-Level Keys

| Key | Type | Default | Purpose |
| --- | --- | --- | --- |
| version | number | 1 | Configuration schema version; must be 1. |
| enabled-engines | string[] | ["vertx","postgres","kafka","flink"] | Ordered list of engines that form the runtime pipeline. |
| engines | object | - | Engine-specific configuration (see below). |
| connectors | object | see defaults | External system connectors configuration (see below). |
| compiler | object | see defaults | Controls compilation, logging, and generated artifacts. |
| dependencies | object | {} | Aliases for packages that can be IMPORT-ed from SQRL. |
| discovery | object | {} | Rules for automatic table discovery when importing data files. |
| script | object | - | Points to the main SQRL script and GraphQL schema. |
| package | object | - | Optional metadata (name, description, etc.) for publishing. |
| test-runner | object | see defaults | Integration test execution settings (see below). |

Engines (engines)

Each sub-key below engines must match one of the IDs in enabled-engines.

{
  "engines": {
    "<engine-id>": {
      "type": "<engine-id>", // optional; inferred from key if omitted
      "config": { /*...*/ }  // engine-specific knobs (Flink SQL options, etc.)
    }
  }
}

Flink (flink)

| Key | Type | Default | Notes |
| --- | --- | --- | --- |
| config | object | see below | Copied verbatim into the generated Flink SQL job (e.g. "table.exec.source.idle-timeout": "5 s"). |

{
  "engines": {
    "flink": {
      "config": {
        "execution.runtime-mode": "STREAMING",
        "execution.target": "local",
        "execution.attached": true,
        "rest.address": "localhost",
        "rest.port": 8081,
        "state.backend.type": "rocksdb",
        "table.exec.resource.default-parallelism": 1,
        "taskmanager.memory.network.max": "800m"
      }
    }
  }
}

Built-in connector templates: postgres, postgres_log-source, postgres_log-sink, kafka, kafka-keyed, kafka-upsert, iceberg, localfile, print.

Kafka (kafka)

The default configuration only declares the engine; topic definitions are injected at plan time.
Additional keys (e.g. bootstrap.servers) may be added under config.

Vert.x (vertx)

A GraphQL server that routes queries to the backing database/log engines.
No mandatory keys; connection pools are generated from the overall plan. For security, JWT authentication is supported and can be configured under the config section.

| Key | Type | Default | Notes |
| --- | --- | --- | --- |
| config | object | see below | Vert.x JWT configuration. |

{
  "engines": {
    "vertx": {
      "authKind": "JWT",
      "config": {
        "jwtAuth": {
          "pubSecKeys": [
            {
              "algorithm": "HS256",
              "buffer": "<signer-secret>" // Base64 encoded signer secret string
            }
          ],
          "jwtOptions": {
            "issuer": "<jwt-issuer>",
            "audience": ["<jwt-audience>"],
            "expiresInSeconds": "3600",
            "leeway": "60"
          }
        }
      }
    }
  }
}

Postgres (postgres)

No mandatory keys. Physical DDL (tables, indexes, views) is produced automatically.

Iceberg (iceberg)

Used as a table-format engine together with a query engine such as Flink, Snowflake, or DuckDB.

DuckDB (duckdb)

| Key | Type | Default | Description |
| --- | --- | --- | --- |
| url | string | "jdbc:duckdb:" | Full JDBC URL. |

Snowflake (snowflake)

| Key | Type | Default | Description |
| --- | --- | --- | --- |
| catalog-name | string | - | Glue catalog. |
| external-volume | string | - | Snowflake external volume name. |
| url | string | - | Full JDBC URL including auth params. |

Connectors (connectors)

{
  "connectors": {
    "postgres": { "connector": "jdbc-sqrl", /*...*/ },
    "kafka-mutation": { "connector": "kafka", /*...*/ },
    "kafka": { "connector": "kafka", /*...*/ },
    "localfile": { "connector": "filesystem", /*...*/ },
    "iceberg": { "connector": "iceberg", /*...*/ },
    "postgres_log-source": { "connector": "postgres-cdc", /*...*/ },
    "postgres_log-sink": { "connector": "jdbc-sqrl", /*...*/ },
    "print": { "connector": "print", /*...*/ }
  }
}

Compiler (compiler)

{
  "compiler": {
    "logger": "print",             // "print" | any configured log engine | "none"
    "extended-scalar-types": true, // expose extended scalar types in generated GraphQL
    "compile-flink-plan": true,    // compile Flink physical plans where supported
    "cost-model": "DEFAULT",       // cost model to use for DAG optimization

    "explain": {                   // artifacts in build/pipeline_*.*
      "text": true,
      "sql": false,
      "logical": true,
      "physical": false,
      "sorted": true,              // deterministic ordering (mostly for tests)
      "visual": true
    },

    "api": {
      "protocols": [               // protocols exposed by the server
        "GRAPHQL",
        "REST",
        "MCP"
      ],
      "endpoints": "FULL",         // endpoint generation strategy (FULL, GRAPHQL, OPS_ONLY)
      "add-prefix": true,          // add an operation-type prefix before function names
      "max-result-depth": 3        // maximum depth of graph traversal when generating operations from a schema
    }
  }
}

Dependencies (dependencies)

{
  "dependencies": {
    "my-alias": {
      "folder": "folder-name"
    }
  }
}

If only folder is given, the dependency key (my-alias in the example above) acts as an alias for that local folder.


Discovery (discovery)

| Key | Type | Default | Purpose |
| --- | --- | --- | --- |
| pattern | string (regex) | null | Filters which external tables are automatically exposed in IMPORT ... statements. Example: "^public\\..*" |
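
Because the pattern is written inside JSON, the backslash has to be doubled. The Python sketch below decodes the example pattern and tries it against a few table identifiers. The identifiers are hypothetical, and whether DataSQRL anchors or searches with the pattern is not stated here, so the use of re.match is an assumption of this sketch only.

```python
import json
import re

# The discovery section as it would be written in package.json.
raw = r'{"discovery": {"pattern": "^public\\..*"}}'
pattern = json.loads(raw)["discovery"]["pattern"]
# After JSON decoding the doubled backslash becomes a single one,
# so the effective regular expression is ^public\..*

regex = re.compile(pattern)

# Hypothetical table identifiers, used only to show which ones the regex accepts.
candidates = ["public.orders", "public.customers", "internal.audit", "publicXorders"]
matched = [name for name in candidates if regex.match(name)]
```

Only the identifiers that start with the literal prefix public. end up in matched; the escaped dot keeps publicXorders out.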

Script (script)

| Key | Type | Description |
| --- | --- | --- |
| main | string | Path to the main .sqrl file. |
| graphql | string | Optional GraphQL schema file (defaults to schema.graphqls). |
| operations | string[] | Optional GraphQL operation definitions. |

Package Metadata (package)

| Key | Required | Description |
| --- | --- | --- |
| name | yes | Reverse-DNS style identifier (org.project.module). |
| description | no | Short summary. |
| license | no | SPDX license id or free-text. |
| homepage | no | Web site. |
| documentation | no | Docs link. |
| topics | no | String array of tags/keywords. |

Test-Runner (test-runner)

| Key | Type | Default | Meaning |
| --- | --- | --- | --- |
| snapshot-folder | string | ./snapshots | Snapshots output directory. |
| test-folder | string | ./tests | Tests output directory. |
| delay-sec | number | 30 | Wait (in seconds) between data load and snapshot. Set to -1 to disable. |
| mutation-delay-sec | number | 0 | Pause (in seconds) between mutation queries. |
| required-checkpoints | number | 0 | Minimum completed Flink checkpoints before assertions run (requires delay-sec = -1). |
| create-topics | string[] | - | Kafka topics to create before tests start. |
| headers | object | - | HTTP headers to add during test execution, for example a JWT auth header. |
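
The dependency between required-checkpoints and delay-sec is easy to miss, so a project might check it before running tests. The following Python sketch shows one way to do that; the helper check_test_runner is hypothetical and not part of DataSQRL, and how the test runner itself reacts to an inconsistent combination is not specified here.

```python
def check_test_runner(cfg: dict) -> list:
    """Return a list of problems found in a test-runner section (empty if none)."""
    problems = []
    delay = cfg.get("delay-sec", 30)                  # documented default
    checkpoints = cfg.get("required-checkpoints", 0)  # documented default
    # required-checkpoints is documented as requiring delay-sec = -1.
    if checkpoints > 0 and delay != -1:
        problems.append(
            "required-checkpoints is %d but delay-sec is %s; set delay-sec to -1"
            % (checkpoints, delay)
        )
    return problems


consistent = check_test_runner({"delay-sec": -1, "required-checkpoints": 2})
inconsistent = check_test_runner({"required-checkpoints": 2})
```

Here consistent is an empty list, while inconsistent contains one message because delay-sec falls back to its default of 30.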

Templating & Variable Resolution

The DataSQRL launcher supports dynamic resolution of variable placeholders at runtime.

  • Environment variables: use ${VAR_NAME} as a placeholder. Example: ${POSTGRES_PASSWORD}.
  • SQRL variables: use ${sqrl:<identifier>}. These are filled in automatically by the compiler, mostly inside connector templates.
    Common identifiers include table-name, original-table-name, filename, format, and kafka-key.

Warning: Unresolved ${sqrl:*} placeholders raise a validation error.
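
To make the two placeholder forms concrete, here is a minimal Python sketch of such a substitution. It is not the DataSQRL launcher's code: the function resolve, the sample template, and the choice to leave unknown environment variables untouched are all assumptions made for illustration. Only the rule that an unresolved ${sqrl:*} placeholder is an error comes from the text above.

```python
import re

# Matches ${NAME} and ${sqrl:name}; group 1 is set only for the sqrl form.
PLACEHOLDER = re.compile(r"\$\{(sqrl:)?([A-Za-z0-9_-]+)\}")


def resolve(text: str, env: dict, sqrl_vars: dict) -> str:
    """Substitute ${VAR_NAME} from `env` and ${sqrl:<identifier>} from `sqrl_vars`."""

    def substitute(match: re.Match) -> str:
        is_sqrl, name = match.group(1), match.group(2)
        if is_sqrl:
            if name not in sqrl_vars:
                # Mirrors the documented rule: unresolved sqrl placeholders are errors.
                raise ValueError("unresolved placeholder: " + match.group(0))
            return sqrl_vars[name]
        # Assumption of this sketch: unknown environment variables stay as-is.
        return env.get(name, match.group(0))

    return PLACEHOLDER.sub(substitute, text)


# Hypothetical template string and values.
template = "user=${POSTGRES_USERNAME};table=${sqrl:table-name}"
resolved = resolve(template, {"POSTGRES_USERNAME": "app"}, {"table-name": "orders"})
```

With these inputs, resolved is the string user=app;table=orders, and calling resolve with a ${sqrl:...} identifier missing from sqrl_vars raises ValueError.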


Internal Environment Variables

For engines that may be running as standalone services inside the DataSQRL Docker container, we use the following environment variables internally:

  • Kafka
    • KAFKA_BOOTSTRAP_SERVERS
    • KAFKA_GROUP_ID
  • PostgreSQL
    • POSTGRES_VERSION
    • POSTGRES_HOST
    • POSTGRES_PORT
    • POSTGRES_DATABASE
    • POSTGRES_AUTHORITY
    • POSTGRES_JDBC_URL
    • POSTGRES_USERNAME
    • POSTGRES_PASSWORD

Default Configuration

The built-in fallback (excerpt - full version here):

{
  "version": 1,
  "enabled-engines": ["vertx", "postgres", "kafka", "flink"],
  "engines": {
    "flink": {
      "config": {
        "execution.runtime-mode": "STREAMING",
        "execution.target": "local",
        "execution.attached": true,
        "rest.address": "localhost",
        "rest.port": 8081,
        "state.backend.type": "rocksdb",
        "table.exec.resource.default-parallelism": 1,
        "taskmanager.memory.network.max": "800m"
      }
    },
    "duckdb": {
      "url": "jdbc:duckdb:"
    }
  },
  "compiler": {
    "logger": "print",
    "extended-scalar-types": true,
    "compile-flink-plan": true,
    "cost-model": "DEFAULT",
    "explain": {
      "text": true,
      "sql": false,
      "logical": true,
      "physical": false,
      "sorted": true,
      "visual": true
    },
    "api": {
      "protocols": ["GRAPHQL", "REST", "MCP"],
      "endpoints": "FULL",
      "add-prefix": true,
      "max-result-depth": 3
    }
  },
  "connectors": {
    "postgres": { "connector": "jdbc-sqrl", /*...*/ },
    "kafka-mutation": { "connector": "kafka", /*...*/ },
    "kafka": { "connector": "kafka", /*...*/ },
    "localfile": { "connector": "filesystem", /*...*/ },
    "iceberg": { "connector": "iceberg", /*...*/ },
    "postgres_log-source": { "connector": "postgres-cdc", /*...*/ },
    "postgres_log-sink": { "connector": "jdbc-sqrl", /*...*/ },
    "print": { "connector": "print", /*...*/ }
  },
  "test-runner": {
    "snapshot-folder": "./snapshots",
    "test-folder": "./tests",
    "delay-sec": 30,
    "mutation-delay-sec": 0,
    "required-checkpoints": 0
  }
}