This utility synchronizes data files from your BettrData instance via API and SFTP, runs user-defined commands on the local host against the downloaded files, and manages processing status updates.
- Overview
- Configuration File: default.ini
- Structure
- Configuration Sections
- [bettrdata]
- [sftp]
- [list]: Query Settings
- Example Filters
- [processing]: File Handling & Commands
- Jinja2 Command Variables
- Example
- [email]: Optional Notifications
- [logging]
- Runtime Options
- Examples
- 1. Run once for testing
- 2. Run continuously every 5 minutes
- 3. Use custom config file
- Controlling Wait Time & Batch Size
- Example End-to-End Setup
- Logs & Troubleshooting
- File Status Conventions
- Email Behavior
- Typical Workflow
Overview
The BettrData Sync Utility automates data retrieval, processing, and synchronization using your BettrData API and SFTP credentials. It can:
- Fetch unprocessed or specific files from your BettrData instance.
- Download files securely via SFTP.
- Run a custom shell command or script for each file.
- Update processing status via configurable attributes.
- Optionally send success/failure email notifications.
- Run continuously (polling) or once (batch mode).
Configuration File: default.ini
The configuration file defines runtime behavior.
Example path: config/default.ini
Structure
Configuration Sections
[bettrdata]
Key | Description |
base_url | Base URL of your BettrData instance |
api_token | API token for authentication (optional if public endpoint) |
[sftp]
Key | Description |
host | SFTP hostname of your BettrData instance |
port | SFTP port (default: 22 or 222) |
username | SFTP username |
remote_password | SFTP password |
[list]: Query Settings
This section defines which files are fetched from BettrData.
Key | Description | Example |
where | MongoDB-style filter for selecting files | { "status.processed.done": true, "attributes.client_status": { "$exists": false } } |
projection | Limit fields returned (optional) | { "fileName": 1, "md5": 1 } |
sort | Sort order for results | { "$natural": -1 } |
limit | Batch size: max number of records per cycle | 50 |
skip | Skip N records from query start | 0 |
Example Filters
Use Case | where Example |
Only unprocessed files | { "attributes.client_status": { "$exists": false } } |
Re-run failed files | { "attributes.client_status": "FAILED" } |
Process specific file ID | { "_id": "67a61639faa4f800168c60bf" } |
Ignore virtual/combined files | { "virtual": { "$ne": true } } |
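The filters in the table are plain JSON objects, so they can be composed programmatically before being pasted into the where setting. The following is a minimal sketch, assuming standard MongoDB semantics where all keys in one object are ANDed implicitly. The helper names combine_and and combine_or are hypothetical and not part of the utility.

```python
import json


def combine_and(*conditions):
    """Merge condition dicts into one object; keys in one object are ANDed implicitly."""
    merged = {}
    for cond in conditions:
        for key, value in cond.items():
            if key in merged:
                # Two conditions on the same field cannot share one object key.
                raise ValueError(f"duplicate filter key: {key}")
            merged[key] = value
    return merged


def combine_or(*conditions):
    """Wrap condition dicts in a single $or clause."""
    return {"$or": list(conditions)}


unprocessed = {"attributes.client_status": {"$exists": False}}
not_virtual = {"virtual": {"$ne": True}}

where_and = combine_and(unprocessed, not_virtual)
where_or = combine_or(unprocessed, not_virtual)

# json.dumps produces the text to place after "where =" in the [list] section.
print(json.dumps(where_and))
print(json.dumps(where_or))
```

Merging into one object avoids the $and operator entirely, which matters because this section restricts its use.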
WARNING: Do not combine filters using $and. To AND conditions, list them as keys of a single object (see the AND example below); $or is allowed.
// AND example:
{
"status.processed.done": true,
"virtual": { "$ne": true },
"attributes.client_status": { "$exists": false }
}
// OR example:
{ "$or": [
{ "status.processed.done": true },
{ "virtual": { "$ne": true } },
{ "attributes.client_status": { "$exists": false } }
] }
[processing]: File Handling & Commands
Key | Description |
target_source_directory | Local path where files are downloaded |
status_attribute | Attribute name used to track file status |
error_attribute | Attribute name for error reporting |
success_dir_name | Subdirectory for successfully processed files |
command_1 | Shell command or script template (Jinja2 syntax) |
Jinja2 Command Variables
These variables are rendered per file and can be used in your command:
Variable | Description |
id | File _id |
createDate | Creation timestamp |
fileName | Original file name |
recordCount | Number of records |
byteSize | File size in bytes |
md5 | MD5 hash |
alias | Optional alias |
format | File format (e.g., csv, tab) |
length | Logical length field |
filePath | Full local path to the downloaded file |
file | Entire MongoDB record (JSON) |
Example
command_1 = c:\scripts\process_file.bat {{filePath}} {{recordCount}} {{md5}}
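To preview what a command template expands to, the substitution step can be approximated as below. This is a simplified stand-in written with the standard library only, not Jinja2 itself: it handles plain {{ name }} placeholders and nothing else (no filters, loops, or attribute access). The function render_command and the sample values are hypothetical. One known difference: Jinja2 by default renders an undefined variable as an empty string, whereas this sketch raises KeyError.

```python
import re

# Matches {{ name }} with optional whitespace inside the braces.
_PLACEHOLDER = re.compile(r"\{\{\s*(\w+)\s*\}\}")


def render_command(template, variables):
    """Replace each {{ name }} with str(variables[name]); unknown names raise KeyError."""
    def substitute(match):
        return str(variables[match.group(1)])
    return _PLACEHOLDER.sub(substitute, template)


template = r"c:\scripts\process_file.bat {{filePath}} {{recordCount}} {{md5}}"
file_vars = {
    "filePath": r"c:\data\incoming\orders.csv",  # hypothetical local path
    "recordCount": 1200,                          # hypothetical value
    "md5": "9e107d9d372bb6826bd81d3542a419d6",    # hypothetical value
}

command = render_command(template, file_vars)
print(command)
```

If file paths may contain spaces, consider quoting the placeholder in the template, for example "{{filePath}}".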
[email]: Optional Notifications
Key | Description |
smtp_host | SMTP server address (empty disables email) |
smtp_port | SMTP port (465 for SSL) |
smtp_user | Username |
smtp_password | Password |
use_ssl | true or false |
from | Sender address |
success_email | Recipient for successful runs |
alert_email | Recipient for failed runs |
Emails are only sent if smtp_host is defined.
[logging]
Key | Description |
level | Log verbosity (DEBUG, INFO, ERROR) |
log_dir | Directory for log files |
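The sections and keys described above fit the standard INI layout, so a file can be sanity-checked before a run. Below is a minimal sketch, assuming the file is readable with Python's configparser and that the [list] values are JSON text; whether the utility parses its configuration this way is an assumption. All hostnames, credentials, and paths are placeholders.

```python
import configparser
import json

# Placeholder values only; replace with your own settings.
SAMPLE_INI = """
[bettrdata]
base_url = https://example.invalid
api_token = REPLACE_ME

[sftp]
host = sftp.example.invalid
port = 22
username = sync_user
remote_password = REPLACE_ME

[list]
where = { "attributes.client_status": { "$exists": false } }
sort = { "$natural": -1 }
limit = 50
skip = 0

[processing]
target_source_directory = data/incoming
status_attribute = client_status
error_attribute = client_error
success_dir_name = done
command_1 = python load.py {{filePath}}

[email]
smtp_host =

[logging]
level = INFO
log_dir = logs/
"""

# interpolation=None keeps characters such as % in values literal.
parser = configparser.ConfigParser(interpolation=None)
parser.read_string(SAMPLE_INI)

where = json.loads(parser["list"]["where"])      # fails early on malformed JSON
limit = parser.getint("list", "limit")
sftp_port = parser.getint("sftp", "port")
email_enabled = bool(parser.get("email", "smtp_host").strip())

print(where, limit, sftp_port, email_enabled)
```

Parsing where with json.loads catches malformed filters before any request is sent.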
Runtime Options
The script accepts runtime parameters for controlling execution frequency.
python main.py [--config path/to/config.ini] [--once] [--interval N]
Option | Description | Default |
--config | Path to .ini configuration file | config/default.ini |
--once | Run once and exit (no loop) | False |
--interval | Time between iterations in seconds | 60 |
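For readers wrapping or extending the script, the options in the table map naturally onto argparse. This is a sketch of an equivalent parser built from the documented flags and defaults; the actual implementation inside main.py is not shown in this document and may differ.

```python
import argparse


def build_parser():
    """Build a parser mirroring the documented options and defaults."""
    parser = argparse.ArgumentParser(prog="main.py")
    parser.add_argument("--config", default="config/default.ini",
                        help="Path to .ini configuration file")
    parser.add_argument("--once", action="store_true",
                        help="Run once and exit (no loop)")
    parser.add_argument("--interval", type=int, default=60,
                        help="Time between iterations in seconds")
    return parser


# Equivalent to: python main.py --config config/dev.ini --once
args = build_parser().parse_args(["--config", "config/dev.ini", "--once"])
print(args.config, args.once, args.interval)
```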
Examples
1. Run once for testing
python main.py --config config/dev.ini --once
2. Run continuously every 5 minutes
python main.py --interval 300
3. Use custom config file
python main.py --config c:\client\sync\custom.ini
Controlling Wait Time & Batch Size
Purpose | Setting | Example |
Wait time between runs | --interval argument | --interval 600 (every 10 min) |
Batch size (records per cycle) | [list] limit | limit = 25 |
Example: fetch 25 records every 5 minutes:
python main.py --interval 300
and in default.ini:
[list]
limit = 25
Example End-to-End Setup
1. Configure default.ini
Set:
- API token and SFTP credentials
- Output directory
- command_1 to run your transformation or load step
- Optional email recipients
2. Test single run
python main.py --once
3. Schedule recurring sync
- Run continuously with --interval
- Or register the .exe build in Windows Task Scheduler / startup
Logs & Troubleshooting
- Logs are stored under the directory defined in [logging] log_dir.
- Default path: logs/
- Log level is controlled by level.
  - DEBUG: most verbose
  - INFO: normal operation
  - ERROR: only failures
File Status Conventions
Status Value | Meaning |
(missing) | File not processed yet |
STARTED | File picked up and currently processing |
SUCCESS | File processed successfully |
FAILED | Error occurred during processing |
These are stored in the BettrData document under:
attributes[client_status]
(or your custom status_attribute)
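The status values above suggest a simple lifecycle per file. The sketch below illustrates one way it could work, assuming that a non-zero exit code from the command means FAILED and that the error text goes to the error attribute; this document does not state how the utility decides between SUCCESS and FAILED. The function process_file and the set_status callback are hypothetical.

```python
import subprocess
import sys


def process_file(command, set_status):
    """Mark STARTED, run the command, then mark SUCCESS or FAILED."""
    set_status("STARTED", None)
    try:
        # A string is run through the shell; a list is run directly.
        result = subprocess.run(command, shell=isinstance(command, str),
                                capture_output=True, text=True)
    except OSError as exc:
        set_status("FAILED", str(exc))
        return "FAILED"
    if result.returncode == 0:
        set_status("SUCCESS", None)
        return "SUCCESS"
    # Assumption: prefer stderr as the error text, else report the exit code.
    set_status("FAILED", result.stderr.strip() or f"exit code {result.returncode}")
    return "FAILED"


history = []


def record(status, error):
    """Stand-in for the call that would update the BettrData document."""
    history.append((status, error))


ok_status = process_file([sys.executable, "-c", "import sys; sys.exit(0)"], record)
bad_status = process_file([sys.executable, "-c", "import sys; sys.exit(3)"], record)
print(ok_status, bad_status, history)
```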
Email Behavior
If email is enabled:
- On successful file processing, an email is sent to success_email
- On failure or exception, an email is sent to alert_email
SMTP credentials must be valid; otherwise, email sending is skipped and the failure is recorded in the log.
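The recipient selection and skip-on-error behavior described above can be sketched with the standard library as follows. This is an illustration under assumptions, not the utility's code: the function names, the subject format, and the handling of use_ssl are hypothetical, while the keys match the [email] section.

```python
import logging
import smtplib
from email.message import EmailMessage


def build_notification(email_cfg, succeeded, file_name, detail=""):
    """Return an EmailMessage for the outcome, or None when email is disabled."""
    if not email_cfg.get("smtp_host", "").strip():
        return None  # an empty smtp_host disables email
    recipient = email_cfg["success_email"] if succeeded else email_cfg["alert_email"]
    msg = EmailMessage()
    msg["From"] = email_cfg["from"]
    msg["To"] = recipient
    msg["Subject"] = f"{'SUCCESS' if succeeded else 'FAILED'}: {file_name}"
    msg.set_content(detail or "No further details.")
    return msg


def send_notification(email_cfg, msg):
    """Send msg; on any SMTP or network error, log it and return False."""
    use_ssl = str(email_cfg.get("use_ssl", "true")).lower() == "true"
    smtp_cls = smtplib.SMTP_SSL if use_ssl else smtplib.SMTP
    try:
        with smtp_cls(email_cfg["smtp_host"], int(email_cfg["smtp_port"])) as smtp:
            smtp.login(email_cfg["smtp_user"], email_cfg["smtp_password"])
            smtp.send_message(msg)
        return True
    except (smtplib.SMTPException, OSError) as exc:
        logging.getLogger(__name__).error("email skipped: %s", exc)
        return False


cfg = {
    "smtp_host": "smtp.example.invalid",  # placeholder values throughout
    "smtp_port": "465",
    "from": "sync@example.invalid",
    "success_email": "ops@example.invalid",
    "alert_email": "alerts@example.invalid",
}
failure_msg = build_notification(cfg, False, "orders.csv", "exit code 3")
print(failure_msg["To"], failure_msg["Subject"])
```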
Typical Workflow
- The script queries BettrData for eligible files (where filter).
- It downloads each file via SFTP.
- The command defined in command_1 runs locally.
- The file's status is updated in BettrData (STARTED, SUCCESS, FAILED).
- Logs are written and optional emails sent.
- The script waits --interval seconds before repeating (unless --once is used).
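The workflow steps can be summarized as a polling loop. The sketch below is a simplified model, not the utility's implementation: fetch_batch and handle_file are hypothetical callables standing in for the query and the download/command/status steps, and max_cycles exists only so the example terminates.

```python
import time


def run_sync(fetch_batch, handle_file, interval=60, once=False,
             sleep=time.sleep, max_cycles=None):
    """Process one batch per cycle; wait `interval` seconds between cycles."""
    cycles = 0
    while True:
        for file_record in fetch_batch():   # query for eligible files
            handle_file(file_record)        # download, run command, update status
        cycles += 1
        if once or (max_cycles is not None and cycles >= max_cycles):
            return cycles
        sleep(interval)


# Demonstration with canned batches and a fake sleep that records the waits.
batches = iter([[{"fileName": "a.csv"}], [], [{"fileName": "b.csv"}]])
seen, waits = [], []

cycles = run_sync(
    fetch_batch=lambda: next(batches),
    handle_file=lambda rec: seen.append(rec["fileName"]),
    interval=300,
    sleep=waits.append,
    max_cycles=3,
)
print(cycles, seen, waits)
```

Passing sleep as a parameter keeps the loop testable without real delays.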