Merge enables the automated combination of multiple data files into a single consolidated file. This process integrates the contents—such as records, fields, or datasets—while preserving data integrity and applying rules to manage duplicates, conflicts, and ordering. The resulting merged file acts as a unified data source for subsequent processing or analysis.
Examples:
- Merging customer data from multiple regional databases into a centralized master file
- Aggregating sensor data files collected over intervals into a single dataset for analysis
- Streamlining workflows where data arrives in fragments but needs to be processed as a whole
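As a rough illustration of the idea, the sketch below combines records from two files, resolves duplicates, and applies an ordering. The function name `merge_records`, the `id` key, and the "later file wins" conflict rule are assumptions made for this example, not the product's actual behavior.

```python
from typing import Iterable


def merge_records(files: Iterable[list[dict]], key: str = "id") -> list[dict]:
    """Combine records from several files into one list, ordered by key.

    Assumed conflict rule: when two records share a key, the record
    from the later file replaces the earlier one.
    """
    merged: dict = {}
    for records in files:
        for record in records:
            merged[record[key]] = record
    return sorted(merged.values(), key=lambda r: r[key])


# Hypothetical regional files with one overlapping customer (id 2).
north = [{"id": 2, "name": "Bo"}, {"id": 1, "name": "Ada"}]
south = [{"id": 2, "name": "Bo K."}, {"id": 3, "name": "Cy"}]

master = merge_records([north, south])
print([r["id"] for r in master])  # [1, 2, 3]
```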
🌐 Global Settings
The Global settings control how auto-merging is applied across outputs. You can enable merging, define a time window for when files should be grouped, restrict merging to files from the same client, and specify the output schema used for the resulting merged file.
- Enabled – Turn on to auto-merge consumer types
- Window (minutes) – Set the time interval between received files for auto-merging (e.g., 10 minutes)
- By Client Only – When enabled, merging occurs only if files are from the same client
- Output Schema – Choose the output schema used for the merged files
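One way to picture how the Window and By Client Only settings might interact is sketched below. It assumes the window is measured as the gap between consecutive files in a group. The `ReceivedFile` type and the `group_for_merge` function are hypothetical names used only for this illustration.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class ReceivedFile:
    name: str
    client: str
    received_at: datetime


def group_for_merge(files, window_minutes=10, by_client_only=True):
    """Group files whose arrival gap is within the window (assumed semantics)."""
    window = timedelta(minutes=window_minutes)
    groups = []
    open_group = {}  # partition key -> group still accepting files
    for f in sorted(files, key=lambda item: item.received_at):
        # With By Client Only, each client gets its own partition.
        key = f.client if by_client_only else None
        group = open_group.get(key)
        if group is not None and f.received_at - group[-1].received_at <= window:
            group.append(f)
        else:
            group = [f]
            groups.append(group)
            open_group[key] = group
    return groups


t0 = datetime(2024, 1, 1, 9, 0)
files = [
    ReceivedFile("a.csv", "acme", t0),
    ReceivedFile("b.csv", "acme", t0 + timedelta(minutes=5)),
    ReceivedFile("c.csv", "globex", t0 + timedelta(minutes=6)),
    ReceivedFile("d.csv", "acme", t0 + timedelta(minutes=30)),
]

per_client = [[f.name for f in g] for g in group_for_merge(files)]
print(per_client)  # [['a.csv', 'b.csv'], ['c.csv'], ['d.csv']]

any_client = [[f.name for f in g] for g in group_for_merge(files, by_client_only=False)]
print(any_client)  # [['a.csv', 'b.csv', 'c.csv'], ['d.csv']]
```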
➕ Union
Union allows you to set the minimum batch size required before files are merged. This ensures that only sufficiently large groups of files are processed together.
Benefits:
- Improves processing efficiency by reducing overhead from small merges
- Ensures merged files contain meaningful or complete data
- Prevents premature merging when more data is still incoming
- Helps align with downstream systems that expect structured batches
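The minimum batch size check can be sketched as a simple filter over candidate groups. The function name `split_by_batch_size` and the choice to hold back smaller groups as "pending" are assumptions for this example.

```python
def split_by_batch_size(groups, min_batch_size=3):
    """Separate groups that are large enough to merge from those still waiting."""
    ready = [g for g in groups if len(g) >= min_batch_size]
    pending = [g for g in groups if len(g) < min_batch_size]
    return ready, pending


# Hypothetical candidate groups of file names.
candidate_groups = [
    ["a.csv", "b.csv", "c.csv"],
    ["d.csv"],
    ["e.csv", "f.csv", "g.csv", "h.csv"],
]

ready, pending = split_by_batch_size(candidate_groups, min_batch_size=3)
print(len(ready), len(pending))  # 2 1
```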
🔗 Join Settings
The Join settings let you define how files are matched and combined based on conditions you configure.
- Window Tolerance – Adds extra time before and after the join window to capture slightly early or late files
- Table – Use this to configure join actions:
  - Click the Add button to add a row
  - Use the Trash Can icon to delete a row
Join Actions:
- JOIN – Combine matching records based on the output schema (data and key)
- SELECT – Keep records that match, without combining
- OMIT – Exclude matched records from the output
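The three join actions can be read as different ways of treating records whose key appears in both files. The sketch below is one interpretation under that assumption. The function `apply_join_action`, the `primary` and `secondary` inputs, and the `id` key are illustrative names, not part of the product.

```python
def apply_join_action(action, primary, secondary, key="id"):
    """Apply a join action to primary records, matching on key against secondary."""
    lookup = {row[key]: row for row in secondary}
    if action == "JOIN":
        # Combine the fields of each matching pair into one record.
        return [{**row, **lookup[row[key]]} for row in primary if row[key] in lookup]
    if action == "SELECT":
        # Keep matching primary records unchanged.
        return [row for row in primary if row[key] in lookup]
    if action == "OMIT":
        # Drop primary records that have a match.
        return [row for row in primary if row[key] not in lookup]
    raise ValueError(f"Unknown join action: {action}")


primary = [{"id": 1, "qty": 5}, {"id": 2, "qty": 7}]
secondary = [{"id": 1, "price": 9.5}]

joined = apply_join_action("JOIN", primary, secondary)
selected = apply_join_action("SELECT", primary, secondary)
omitted = apply_join_action("OMIT", primary, secondary)

print(joined)    # [{'id': 1, 'qty': 5, 'price': 9.5}]
print(selected)  # [{'id': 1, 'qty': 5}]
print(omitted)   # [{'id': 2, 'qty': 7}]
```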