Step 1: Split JSON Arrays

This step teaches the most fundamental pattern in content splitting: how to turn a single message containing a JSON array into multiple messages, one for each element in the array.

The Goal

You will transform a single input message like this:

Input Message

    {
      "device_id": "sensor-001",
      "location": "warehouse-a",
      "readings": [
        {"sensor": "temp-1", "value": 72.5},
        {"sensor": "temp-2", "value": 85.3}
      ]
    }

Into two separate output messages, each preserving the parent context:

Output Message 1

    {
      "sensor": "temp-1",
      "value": 72.5,
      "device_id": "sensor-001",
      "location": "warehouse-a"
    }

Output Message 2

    {
      "sensor": "temp-2",
      "value": 85.3,
      "device_id": "sensor-001",
      "location": "warehouse-a"
    }

The "Store -> Split -> Restore" Pattern

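The key to this transformation is a three-processor pattern that you must always follow in this exact order. The order matters: each message produced by the split contains only one array element, so the parent fields have to be saved before the split and added back after it.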

  1. Store Context: Save the parent fields (device_id, location) into metadata.
  2. Split Array: Use the unarchive processor to split the readings array.
  3. Restore Context: Add the parent fields from metadata back into each new message.
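
To make the three steps concrete, here is a minimal Python sketch that models the pattern on a plain dictionary. It is an illustration only, not how the pipeline implements it: the function name split_with_context and its parameters are hypothetical, and an ordinary dict stands in for message metadata.

```python
import copy


def split_with_context(message, array_field, context_fields):
    """Model Store -> Split -> Restore on a plain dict (illustrative only)."""
    # 1. Store: set the parent fields aside (the pipeline uses metadata for this).
    stored = {name: message[name] for name in context_fields}

    # 2. Split: one new message per element of the array.
    parts = [copy.deepcopy(element) for element in message[array_field]]

    # 3. Restore: merge the stored parent fields back into every part.
    for part in parts:
        part.update(stored)
    return parts


message = {
    "device_id": "sensor-001",
    "location": "warehouse-a",
    "readings": [
        {"sensor": "temp-1", "value": 72.5},
        {"sensor": "temp-2", "value": 85.3},
    ],
}

outputs = split_with_context(message, "readings", ["device_id", "location"])
for out in outputs:
    print(out)
```

Run against the input from The Goal, this produces the two output messages shown there. Note that in this sketch a parent field overwrites a same-named field in an array element, which mirrors a restore step that assigns the parent values last.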

Implementation

  1. Start with the Foundation: Copy the content-splitting-foundation.yaml file to a new file named array-splitter.yaml.

    cp examples/data-routing/content-splitting-foundation.yaml array-splitter.yaml
  2. Add the Splitting Logic: Open array-splitter.yaml and replace the entire pipeline section with the three-processor block below.

    pipeline:
      processors:
        # 1. STORE parent context into metadata
        - mapping: |
            meta device_id = this.device_id
            meta location = this.location
            root = this

        # 2. SPLIT the array into individual messages
        - unarchive:
            format: json_array
            field: readings

        # 3. RESTORE the context from metadata into each new message
        - mapping: |
            root = this
            root.device_id = meta("device_id")
            root.location = meta("location")

This is the only change needed. The input and output sections remain the same.

  3. Deploy and Test: With the updated pipeline running, send the test message.

```bash
# Send the test message
curl -X POST http://localhost:8080/sensors/bulk \
  -H "Content-Type: application/json" \
  -d '{
    "device_id": "sensor-001",
    "location": "warehouse-a",
    "readings": [
      {"sensor": "temp-1", "value": 72.5},
      {"sensor": "temp-2", "value": 85.3}
    ]
  }'
```

  4. Verify: Check the logs or output of your pipeline. You should see two distinct messages, each containing the fields of one element of the readings array plus the device_id and location from the original parent object.
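
If you would rather check the result programmatically than by eye, a small script along these lines may help. It is a sketch under two assumptions that are not stated in this tutorial: that you can capture the pipeline output as one JSON object per line, and that messages arrive in array order. The helper verify_split is hypothetical, and the captured text below is simply the expected output from The Goal standing in for real pipeline output.

```python
import json


def verify_split(parent, outputs, array_field="readings"):
    """Return a list of problems; an empty list means the split looks right."""
    problems = []
    # Every parent field except the array itself should appear in each output.
    context = {k: v for k, v in parent.items() if k != array_field}
    elements = parent[array_field]
    if len(outputs) != len(elements):
        problems.append(f"expected {len(elements)} messages, got {len(outputs)}")
    for i, (element, out) in enumerate(zip(elements, outputs)):
        expected = {**element, **context}
        if out != expected:
            problems.append(f"message {i}: expected {expected}, got {out}")
    return problems


parent = {
    "device_id": "sensor-001",
    "location": "warehouse-a",
    "readings": [
        {"sensor": "temp-1", "value": 72.5},
        {"sensor": "temp-2", "value": 85.3},
    ],
}

# Stand-in for captured output: replace with the lines your pipeline emitted.
captured = """\
{"sensor": "temp-1", "value": 72.5, "device_id": "sensor-001", "location": "warehouse-a"}
{"sensor": "temp-2", "value": 85.3, "device_id": "sensor-001", "location": "warehouse-a"}
"""

outputs = [json.loads(line) for line in captured.splitlines() if line.strip()]
problems = verify_split(parent, outputs)
print("OK" if not problems else "\n".join(problems))
```

An empty problems list means every output message carries one reading plus the parent context. A count mismatch usually points at the split step, while a missing device_id or location points at the store or restore step.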

You have now mastered the fundamental pattern of array splitting.