Advanced Splitting Patterns

Once you've mastered the basic "Store -> Split -> Restore" pattern, you can apply it to more complex scenarios. This section covers advanced techniques for handling diverse and complex data structures.
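
As a refresher, here is the basic pattern in its minimal form (a sketch only; the order_id and items field names are illustrative):

basic-splitter.yaml
pipeline:
  processors:
    # Store: stash shared context as metadata before the split
    - mapping: 'meta order_id = this.order_id'

    # Split: turn the array into one message per element
    - unarchive:
        format: json_array
        field: items

    # Restore: copy the stored context back onto each element
    - mapping: |
        root = this
        root.order_id = meta("order_id")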

Pattern 1: Multi-Array Splitting

Some messages contain multiple arrays that each need to be split into different event types. Use the branch processor to split each array in isolation.

Use Case: An order message that contains both items and applied_discounts arrays.

multi-array-splitter.yaml
pipeline:
  processors:
    # Store the context that is common to ALL branches
    - mapping: 'meta order_id = this.order_id'

    # Process the 'items' array in the first branch
    - branch:
        request_map: 'root = {"items": this.items}'
        processors:
          - unarchive:
              format: json_array
              field: items
          - mapping: |
              root = this
              root.order_id = meta("order_id")
              root.type = "line_item"

    # Process the 'discounts' array in a second branch
    - branch:
        request_map: 'root = {"discounts": this.applied_discounts}'
        processors:
          - unarchive:
              format: json_array
              field: discounts
          - mapping: |
              root = this
              root.order_id = meta("order_id")
              root.type = "discount_application"
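
For example, given an order like this (the sku and code fields are illustrative):

{"order_id": "ord-1001", "items": [{"sku": "A-100"}, {"sku": "B-200"}], "applied_discounts": [{"code": "SAVE10"}]}

the pipeline emits three separate events:

{"sku": "A-100", "order_id": "ord-1001", "type": "line_item"}
{"sku": "B-200", "order_id": "ord-1001", "type": "line_item"}
{"code": "SAVE10", "order_id": "ord-1001", "type": "discount_application"}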

Pattern 2: Conditional Splitting

You may only want to split an array if it meets a certain condition (e.g., it contains more than one item).

Use Case: Process single-item orders as a whole, but split multi-item orders for individual fulfillment.

conditional-splitter.yaml
pipeline:
  processors:
    - switch:
        cases:
          # If 1 or fewer items, process as a single batch
          - check: this.items.length() <= 1
            processors:
              # Assigning root = this first preserves the full order document
              - mapping: |
                  root = this
                  root.processing_type = "batch"

          # Otherwise, use the standard splitting pattern
          - processors:
              - mapping: |
                  meta order_id = this.order_id
                  root = this
              - unarchive:
                  format: json_array
                  field: items
              - mapping: |
                  root = this
                  root.order_id = meta("order_id")
                  root.processing_type = "individual"
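
To illustrate the two paths (again with illustrative field names):

{"order_id": "ord-2", "items": [{"sku": "A"}]}
  -> one message, with processing_type = "batch"

{"order_id": "ord-3", "items": [{"sku": "A"}, {"sku": "B"}]}
  -> two messages, each with order_id = "ord-3" and processing_type = "individual"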

Pattern 3: Handling Commas in CSV Fields

The simple .split(",") approach fails when text fields themselves contain commas. Properly formatted CSVs quote such fields, which calls for a more robust parsing approach.

Use Case: Parsing a description field that contains commas.

Input: txn-005,2025-10-20T11:00:00Z,89.99,"Restaurant Purchase, Table 5"

A simple comma split incorrectly produces five fields instead of the intended four. Expanso doesn't have a built-in CSV parser that handles quoted fields, so for production you would typically use a small script processor (e.g., in JavaScript or Python) with a dedicated CSV library to handle these edge cases robustly.
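
To see why, here is how the naive split breaks the example line apart, cutting the quoted field in two:

1. txn-005
2. 2025-10-20T11:00:00Z
3. 89.99
4. "Restaurant Purchase
5.  Table 5"

A quote-aware parser instead returns four fields, keeping "Restaurant Purchase, Table 5" intact.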

Production Considerations

  • Validation is Key: Before splitting, always validate the input. Ensure the array field exists, is actually an array, and is not excessively large, to prevent memory exhaustion (a sketch follows this list).
  • Error Handling: Wrap your splitting logic in a try/catch block so malformed data is handled gracefully without crashing the entire pipeline (see the second sketch after this list).
  • Performance: For very large arrays or files, consider the memory implications. The unarchive and file processors are highly optimized for streaming, but complex mappings on very large messages can still consume significant memory.
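
For the validation point above, a minimal sketch using the same mapping style as the patterns in this section; the 10,000-element cap and the items field name are illustrative assumptions:

pipeline:
  processors:
    # Fail fast on input that would make the split misbehave
    - mapping: |
        root = if !this.exists("items") {
          throw("missing items field")
        } else if this.items.type() != "array" {
          throw("items is not an array")
        } else if this.items.length() > 10000 {
          throw("items array is excessively large")
        } else {
          this
        }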
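
And for the error-handling point, a sketch assuming Benthos-style try and catch processors; the log message format is illustrative:

pipeline:
  processors:
    - try:
        # The splitting logic that may fail on malformed input
        - unarchive:
            format: json_array
            field: items
    - catch:
        # Runs only for messages that failed above: log and continue
        # instead of crashing the pipeline
        - log:
            level: ERROR
            message: 'Splitting failed: ${! error() }'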