
Quickstart

Get started creating a write-behind pipeline

This guide takes you through the creation of a write-behind pipeline.

Concepts

Write-behind is a processing pipeline used to synchronize data in a Redis database with a downstream data store. You can think of it as a pipeline that starts with change data capture (CDC) events for a Redis database and then filters, transforms, and maps the data to the data structures of the target data store.

The target is the data store to which the write-behind pipeline connects and writes data.

The write-behind pipeline is composed of one or more jobs. Each job is responsible for capturing change for one key pattern in Redis and mapping it to one or more tables in the downstream data store. Each job is defined in a YAML file.
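To make the one-job-per-key-pattern idea concrete, here is a minimal Python sketch that finds which jobs apply to a changed key. It is not how Write-behind is implemented. The job names, the second pattern, and the use of glob matching through fnmatch are assumptions made for illustration.

```python
from fnmatch import fnmatchcase

# Hypothetical job registry: job name -> Redis key pattern (glob style).
JOBS = {
    "employees": "emp:*",
    "invoices": "invoice:*",
}


def jobs_for_key(key: str) -> list[str]:
    """Return the names of the jobs whose key pattern matches the key."""
    return [name for name, pattern in JOBS.items() if fnmatchcase(key, pattern)]
```

A key that matches no pattern yields an empty list, which corresponds to a change that no job captures.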

Supported data stores

Write-behind currently supports these target data stores:

Data Store

Cassandra

MariaDB

MySQL

Oracle

PostgreSQL

Redis Enterprise

SQL Server

Prerequisites

The only prerequisite for running Write-behind is RedisGears Python >= 1.2.6, installed on the Redis Enterprise cluster and enabled for the database you want to mirror to the downstream data store. For more information, see RedisGears installation.

Preparing the write-behind pipeline

To prepare the pipeline, fill in the correct information for the target data store. Secrets can be provided using a reference to a secret (see below) or by specifying a path.

The applier section has information about the batch size and frequency used to write data to the target.
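The interplay between batch size and write frequency can be sketched in a few lines of Python. This is an illustration of the general batching technique, not Write-behind's applier. The class name, the parameter names batch_size and max_wait_seconds, and the write callback are all hypothetical and do not correspond to actual configuration attributes.

```python
import time


class Batcher:
    """Collects events and flushes when the batch is full or too old."""

    def __init__(self, batch_size, max_wait_seconds, write, clock=time.monotonic):
        self.batch_size = batch_size
        self.max_wait_seconds = max_wait_seconds
        self.write = write  # callable that receives a list of events
        self.clock = clock  # injectable so the timing can be tested
        self.pending = []
        self.first_event_at = None

    def add(self, event):
        """Queue an event, remembering when the current batch started."""
        if not self.pending:
            self.first_event_at = self.clock()
        self.pending.append(event)
        self.flush_if_due()

    def flush_if_due(self):
        """Write the batch if it reached its size limit or waited too long."""
        if not self.pending:
            return
        full = len(self.pending) >= self.batch_size
        stale = self.clock() - self.first_event_at >= self.max_wait_seconds
        if full or stale:
            self.write(self.pending)
            self.pending = []
            self.first_event_at = None
```

A larger batch size reduces the number of round trips to the target, while a shorter wait bounds how far the target can lag behind Redis.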

Some of the applier attributes, such as target_data_type, wait_enabled, and retry_on_replica_failure, are specific to the ingest pipeline and can be ignored for write-behind.

Write-behind jobs

Write-behind jobs are a mandatory part of the write-behind pipeline configuration. In the jobs directory (located alongside config.yaml), you should have a job definition in a YAML file for every key pattern you want to write to a downstream database table.

The YAML file can be named after the destination table or follow another naming convention, but its name must be unique.

A job definition has the following structure:

source:
  redis:
    key_pattern: emp:*
    trigger: write-behind
    exclude_commands: ["json.del"]
transform:
  - uses: rename_field
    with:
      from_field: after.country
      to_field: after.my_country
output:
  - uses: relational.write
    with:
      connection: my-connection
      schema: my-schema
      table: my-table
      keys:
        - first_name
        - last_name
      mapping:
        - first_name
        - last_name
        - address
        - gender
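To show what the rename_field step in the job above does to an event, here is a small Python sketch. It is a stand-in written for illustration, not the Write-behind implementation. The event shape (a dictionary with an after section) and the two-part dotted path handling are assumptions.

```python
import copy


def rename_field(event: dict, from_field: str, to_field: str) -> dict:
    """Return a copy of the event with one field moved to a new name.

    Paths are assumed to have the form "section.name", e.g. "after.country".
    """
    result = copy.deepcopy(event)
    src_section, src_name = from_field.split(".")
    dst_section, dst_name = to_field.split(".")
    if src_name in result.get(src_section, {}):
        value = result[src_section].pop(src_name)
        result.setdefault(dst_section, {})[dst_name] = value
    return result
```

With the job's settings, an event whose after section contains country would reach the output section with that value under my_country instead.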

Source section

The source section describes the source of data in the pipeline.

The redis section is common for every pipeline initiated by an event in Redis, such as applying changes to data. In the case of write-behind, it has the information required to activate a pipeline dealing with changes to data. It includes the following attributes:

Note: Write-behind does not support the expired event. Therefore, keys that expire in Redis are not deleted from the target database automatically.

Notes:

The redis attribute is a breaking change that replaces the keyspace attribute.

The key_pattern attribute replaces the pattern attribute.

The exclude_commands attribute replaces the exclude-commands attribute.

If you upgrade to version 0.105 or later, you must edit your existing jobs and redeploy them.

Output section

The output section is critical. It specifies a reference to a connection from the config.yaml connections section:

Note: The columns used in keys will be automatically included, so there's no need to repeat them in the mapping section.
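The way keys and mapping combine into the set of written columns can be sketched as follows. This is an illustrative Python model rather than the actual output logic. Placing keys first and filling missing fields with None are assumptions.

```python
def output_columns(keys: list[str], mapping: list[str]) -> list[str]:
    """Keys first, then mapped fields, in order and without duplicates."""
    return list(dict.fromkeys([*keys, *mapping]))


def build_row(after: dict, keys: list[str], mapping: list[str]) -> dict:
    """Project the event's fields onto the output columns.

    Assumption: a field missing from the event becomes None in the row.
    """
    return {column: after.get(column) for column in output_columns(keys, mapping)}
```

Because duplicates are dropped, listing a key column again under mapping produces the same column list as leaving it out.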

Apply filters and transformations to write-behind

Write-behind jobs can apply filters and transformations to the data before it is written to the target. Specify the filters and transformations under the transform section.

Filters

Use filters to skip some of the data so that it is not applied to the target. Filters can use simple or complex expressions that take as arguments the Redis entry key, its fields, and even the change op code (create, delete, update, and so on). See Filter for more information.
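Conceptually, a filter is a predicate over the key, the fields, and the op code. The Python sketch below models that idea only. It does not use the expression syntax that Write-behind filters accept, and the event shape and the particular rule are hypothetical.

```python
from typing import Callable

# Hypothetical event shape, assumed for illustration only:
# {"key": "emp:1", "opcode": "update", "after": {"country": "UK"}}
Event = dict


def keep_event(event: Event) -> bool:
    """Keep changes to emp:* keys that are not deletes and have a country."""
    fields = event.get("after") or {}
    return (
        event["key"].startswith("emp:")
        and event["opcode"] != "delete"
        and bool(fields.get("country"))
    )


def apply_filter(events: list[Event], predicate: Callable[[Event], bool]) -> list[Event]:
    """Return only the events the predicate accepts; the rest are skipped."""
    return [event for event in events if predicate(event)]
```

Events rejected by the predicate never reach the output section, so they are not written to the target.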

Transformations

Transformations manipulate the data in one of the following ways:

To learn more about transformations, see data transformation pipeline.

Provide target's secrets

The target's secrets (such as TLS certificates) can be read from a path on the Redis node's file system. This allows the consumption of secrets injected from secret stores.
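Reading a secret from a file that a secret store has placed on disk can look like the Python sketch below. It shows the general pattern only. The function name is hypothetical, and stripping surrounding whitespace is an assumption about how such files are usually written.

```python
from pathlib import Path


def read_secret(path: str) -> str:
    """Read a secret from a file, dropping surrounding whitespace."""
    return Path(path).read_text(encoding="utf-8").strip()
```

Keeping the secret in a file means it never has to appear in the pipeline configuration itself.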

Deploy the write-behind pipeline

To start the pipeline, run the deploy command:

redis-di deploy

You can check that the pipeline is running, receiving, and writing data using the status command:

redis-di status

Monitor the write-behind pipeline

The Write-behind pipeline collects the following metrics:

| Metric description | Metric in Prometheus |
| --- | --- |
| Total incoming events by stream | Calculated as a Prometheus DB query: sum(pending, rejected, filtered, inserted, updated, deleted) |
| Created incoming events by stream | rdi_metrics_incoming_entries{data_source="…",operation="inserted"} |
| Updated incoming events by stream | rdi_metrics_incoming_entries{data_source="…",operation="updated"} |
| Deleted incoming events by stream | rdi_metrics_incoming_entries{data_source="…",operation="deleted"} |
| Filtered incoming events by stream | rdi_metrics_incoming_entries{data_source="…",operation="filtered"} |
| Malformed incoming events by stream | rdi_metrics_incoming_entries{data_source="…",operation="rejected"} |
| Total events per stream (snapshot) | rdi_metrics_stream_size{data_source=""} |
| Time in stream (snapshot) | rdi_metrics_stream_last_latency_ms{data_source="…"} |
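The total incoming events figure is the sum of the per-operation counters. The Python sketch below shows that arithmetic on already-collected values. It does not query Prometheus, and the sample numbers used with it are made up.

```python
# Operation labels taken from the metrics listed above.
OPERATIONS = ("pending", "rejected", "filtered", "inserted", "updated", "deleted")


def total_incoming(counts: dict[str, int]) -> int:
    """Sum the per-operation counters; a missing operation counts as 0."""
    return sum(counts.get(operation, 0) for operation in OPERATIONS)
```

For example, counters of 5 inserted, 3 updated, 1 deleted, and 2 filtered events give a total of 11.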

To use the metrics you can either:

Upgrading

To upgrade Write-behind, use the upgrade command, which provides a zero-downtime upgrade:

redis-di upgrade ...
