Step 1: Split JSON Arrays
This step teaches the most fundamental pattern in content splitting: how to turn a single message containing a JSON array into multiple messages, one for each element in the array.
The Goal
You will transform a single input message like this:
Input Message

    {
      "device_id": "sensor-001",
      "location": "warehouse-a",
      "readings": [
        {"sensor": "temp-1", "value": 72.5},
        {"sensor": "temp-2", "value": 85.3}
      ]
    }

Into two separate output messages, each preserving the parent context:
Output Message 1

    {
      "sensor": "temp-1",
      "value": 72.5,
      "device_id": "sensor-001",
      "location": "warehouse-a"
    }

Output Message 2

    {
      "sensor": "temp-2",
      "value": 85.3,
      "device_id": "sensor-001",
      "location": "warehouse-a"
    }

The "Store -> Split -> Restore" Pattern
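The key to this transformation is a three-processor pattern that you must always follow in this exact order. The order matters: after the split, each message body holds only one array element, so the parent fields must be saved to metadata beforehand and copied back afterwards.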
- **Store Context:** Save the parent fields (`device_id`, `location`) into metadata.
- **Split Array:** Use the `unarchive` processor to split the `readings` array.
- **Restore Context:** Add the parent fields from metadata back into each new message.
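
To make the three steps concrete outside the pipeline, here is a minimal plain-Python sketch of the same idea. It is only a model for reasoning about the data flow, not how the processors are implemented: the function name `store_split_restore` and its parameters are hypothetical, and an ordinary dict stands in for message metadata.

```python
import json
from typing import Any


def store_split_restore(message: dict[str, Any],
                        array_field: str,
                        context_fields: list[str]) -> list[dict[str, Any]]:
    """Illustrative model of the Store -> Split -> Restore pattern."""
    # 1. STORE: copy the parent fields into a side dict (stands in for metadata).
    metadata = {name: message[name] for name in context_fields}

    # 2. SPLIT: every element of the array becomes its own message body.
    parts = [dict(element) for element in message[array_field]]

    # 3. RESTORE: merge the stored parent fields back into each new message.
    return [{**part, **metadata} for part in parts]


if __name__ == "__main__":
    parent = {
        "device_id": "sensor-001",
        "location": "warehouse-a",
        "readings": [
            {"sensor": "temp-1", "value": 72.5},
            {"sensor": "temp-2", "value": 85.3},
        ],
    }
    for out in store_split_restore(parent, "readings", ["device_id", "location"]):
        print(json.dumps(out))
```

Running the script prints one JSON object per reading, each carrying `device_id` and `location`. Swapping steps 1 and 2 in this model would lose the parent fields, which is why the order is fixed.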
Implementation
1. **Start with the Foundation:** Copy the `content-splitting-foundation.yaml` file to a new file named `array-splitter.yaml`.

       cp examples/data-routing/content-splitting-foundation.yaml array-splitter.yaml

2. **Add the Splitting Logic:** Open `array-splitter.yaml` and replace the entire `pipeline` section with the three-processor block below.

       pipeline:
         processors:
           # 1. STORE parent context into metadata
           - mapping: |
               meta device_id = this.device_id
               meta location = this.location
               root = this

           # 2. SPLIT the array into individual messages
           - unarchive:
               format: json_array
               field: readings

           # 3. RESTORE the context from metadata into each new message
           - mapping: |
               root = this
               root.device_id = meta("device_id")
               root.location = meta("location")

This is the only change needed. The `input` and `output` remain the same.
3. **Deploy and Test:**

   ```bash
   # Send the test message
   curl -X POST http://localhost:8080/sensors/bulk \
     -H "Content-Type: application/json" \
     -d '{
       "device_id": "sensor-001",
       "location": "warehouse-a",
       "readings": [
         {"sensor": "temp-1", "value": 72.5},
         {"sensor": "temp-2", "value": 85.3}
       ]
     }'
   ```

4. **Verify:** Check the logs or output of your pipeline. You should see two distinct messages, each containing the fields of one element of the `readings` array plus the `device_id` and `location` from the original parent object.
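
If you prefer to check the result programmatically rather than by eye, a small script along these lines could help. It is a sketch under assumptions: it presumes you have captured the pipeline output as one JSON document per line, and the helper name `check_split_output` is hypothetical, not part of any tool used in this tutorial.

```python
import json

# Fields every split message should carry for the example in this step.
REQUIRED_KEYS = {"sensor", "value", "device_id", "location"}


def check_split_output(lines: list[str], expected_count: int) -> list[str]:
    """Return a list of problems; an empty list means the output looks right."""
    problems = []
    # Parse each non-empty line as one JSON message.
    messages = [json.loads(line) for line in lines if line.strip()]

    if len(messages) != expected_count:
        problems.append(f"expected {expected_count} messages, got {len(messages)}")

    for i, msg in enumerate(messages, start=1):
        missing = REQUIRED_KEYS - set(msg)
        if missing:
            problems.append(f"message {i} is missing {sorted(missing)}")
        if "readings" in msg:
            problems.append(f"message {i} still contains the unsplit 'readings' array")

    return problems
```

For the test message above you would call `check_split_output(captured_lines, 2)` and expect an empty list. A non-empty result points at the step that went wrong: missing parent fields suggest the restore mapping, while a leftover `readings` key suggests the split did not run.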
You have now mastered the fundamental pattern of array splitting.