## Advanced Splitting Patterns
Once you've mastered the basic "Store -> Split -> Restore" pattern, you can apply it to more complex scenarios. This section covers advanced techniques for handling diverse and complex data structures.
### Pattern 1: Multi-Array Splitting

Some messages contain multiple arrays that each need to be split into different event types. Use the `branch` processor to split each array in isolation.

**Use Case:** An order message that contains both `items` and `applied_discounts` arrays.
```yaml
pipeline:
  processors:
    # Store the context that is common to ALL branches
    - mapping: meta order_id = this.order_id
    # Process the 'items' array in the first branch
    - branch:
        request_map: 'root = {"items": this.items}'
        processors:
          - unarchive:
              format: json_array
              field: items
          - mapping: |
              root = this
              root.order_id = meta("order_id")
              root.type = "line_item"
    # Process the 'discounts' array in a second branch
    - branch:
        request_map: 'root = {"discounts": this.applied_discounts}'
        processors:
          - unarchive:
              format: json_array
              field: discounts
          - mapping: |
              root = this
              root.order_id = meta("order_id")
              root.type = "discount_application"
```
### Pattern 2: Conditional Splitting

You may want to split an array only if it meets a certain condition (e.g., it contains more than one item).

**Use Case:** Process single-item orders as a whole, but split multi-item orders for individual fulfillment.
```yaml
pipeline:
  processors:
    - switch:
        cases:
          # If 1 or fewer items, process as a single batch
          - check: this.items.length() <= 1
            processors:
              - mapping: |
                  # Copy the message first; assigning only the new field
                  # would otherwise drop the rest of the document.
                  root = this
                  root.processing_type = "batch"
          # Otherwise, use the standard splitting pattern
          - processors:
              - mapping: |
                  meta order_id = this.order_id
                  root = this
              - unarchive:
                  format: json_array
                  field: items
              - mapping: |
                  root = this
                  root.order_id = meta("order_id")
                  root.processing_type = "individual"
```
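For instance, this hypothetical single-item order matches the first case and passes through whole, gaining `processing_type: "batch"`; an order with two or more items would instead be split into one event per item, each tagged `processing_type: "individual"`:

```json
{"order_id": "ord-007", "items": [{"sku": "C3", "qty": 1}]}
```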
### Pattern 3: Handling Commas in CSV Fields

The simple `.split(",")` method fails when text fields themselves contain commas. Properly formatted CSVs quote such fields, so parsing them requires a quote-aware approach.

**Use Case:** Parsing a `description` field that contains commas.

Input: `txn-005,2025-10-20T11:00:00Z,89.99,"Restaurant Purchase, Table 5"`

A simple split would incorrectly produce five fields, breaking the quoted description in two. While Expanso doesn't have a built-in CSV parser that handles quotes, in production you would typically use a small script processor (e.g., in JavaScript or Python) with a dedicated CSV library to handle these edge cases robustly.
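As a sketch of what such a script would do, here is the equivalent parsing logic in standalone Python using the standard-library `csv` module; the column names are assumptions inferred from the sample input above, so adapt them to your schema and to however your script processor receives the message:

```python
import csv
import io
import json

def parse_transaction_line(line: str) -> dict:
    """Parse one CSV line, respecting quoted fields that contain commas."""
    # csv.reader applies the quoting rules that a naive split(",") misses.
    fields = next(csv.reader(io.StringIO(line)))
    # Hypothetical column layout inferred from the sample input above.
    txn_id, timestamp, amount, description = fields
    return {
        "txn_id": txn_id,
        "timestamp": timestamp,
        "amount": float(amount),
        "description": description,
    }

line = 'txn-005,2025-10-20T11:00:00Z,89.99,"Restaurant Purchase, Table 5"'
print(json.dumps(parse_transaction_line(line)))
# -> {"txn_id": "txn-005", ..., "description": "Restaurant Purchase, Table 5"}
```

Note that the quoted field survives intact as a single `description` value, which is exactly what the naive split gets wrong.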
### Production Considerations
- **Validation is Key:** Before splitting, always validate the input: ensure the array field exists, is actually an array, and is not excessively large, to prevent memory exhaustion (see the sketch after this list).
- **Error Handling:** Wrap your splitting logic in a `try`/`catch` block to handle malformed data gracefully without crashing the entire pipeline.
- **Performance:** For very large arrays or files, consider the memory implications. The `unarchive` and `file` processors are highly optimized for streaming, but complex mappings on very large messages can still consume significant memory.
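As an illustrative sketch of the first two points, the mapping below rejects a message whose `items` field is missing, is not an array, or is implausibly large before any splitting runs. The 10,000-element cap is an arbitrary placeholder; tune it to your workload.

```yaml
pipeline:
  processors:
    - mapping: |
        # Fail fast on malformed input before attempting to split.
        root = if !this.exists("items") || this.items.type() != "array" {
          throw("expected an items array")
        } else if this.items.length() > 10000 {
          # Arbitrary guardrail against memory exhaustion.
          throw("items array too large to split safely")
        } else {
          this
        }
```

Assuming `throw` behaves as in Bloblang-style mappings, a message that fails either check is flagged as errored rather than aborting the pipeline, so your `try`/`catch` handling can route it to a dead-letter output.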