What Does This Service Do?
The Job Export Service automatically watches folders on the server for data files (Parquet format), uploads them to cloud storage (AWS S3 or an SFTP server), and then moves them to an archive folder. Think of it as an automated file delivery system that runs 24/7 in the background.
Key Features
- Automatic File Detection - No manual uploads needed
- Reliable Delivery - Retries automatically if network fails
- Priority Processing - Real-time data gets uploaded before historical data
- Safe Archival - Keeps local copies after successful upload
- Runs as Windows Service - Starts automatically when server boots
Future To-dos
- The service is still in its early stages
- See potential future roadmap items here:
How It Works (Simple Explanation)
1. Data files arrive in watched folders
↓
2. Service detects new files within seconds
↓
3. Files are uploaded to AWS S3 or SFTP server
↓
4. Successfully uploaded files are moved to archive folder
↓
5. Process repeats for next file

What Folders Does It Watch?
The service can be configured to monitor three different folders on the local file system:
- Real-time Measurements - `C:\Data\Output\measurements\` - High priority, processed immediately
- Real-time Events/Alarms - `C:\Data\Output\events\` - High priority, processed immediately
- Historical Measurements - `C:\Data\Historical\measurements\` - Lower priority, waits for real-time data to finish first
Installation
Prerequisites
- Windows Server 2016 or later
- .NET 8.0 Runtime installed
- Network access to AWS S3 or SFTP server
- Administrator rights to install Windows Service
Installation Steps
1. Copy Files - Extract the service files to:
   `C:\Program Files\Bazefield\Baze Job Export Service\`
2. Install as Windows Service:
   ```powershell
   # Run PowerShell as Administrator (note: sc.exe requires a space after "binPath=" and "start=")
   sc.exe create "Job Export Service" binPath= "C:\Program Files\Bazefield\Baze Job Export Service\JobExportService.exe"
   sc.exe config "Job Export Service" start= auto
   ```
3. Configure Settings (see Configuration section below)
4. Start Service:
   ```powershell
   net start "Job Export Service"
   ```
5. Verify It's Running:
   - Open Services (`services.msc`)
   - Find "Job Export Service"
   - Status should show "Running"
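If you prefer to script the install, the steps above condense to the following sketch. This is not a shipped script: the service name, install path, and executable name are taken from the examples in this guide and may differ in your deployment.

```powershell
# Run in an elevated PowerShell session; paths and names assume the defaults above
$svcName = "Job Export Service"
$binPath = "C:\Program Files\Bazefield\Baze Job Export Service\JobExportService.exe"

sc.exe create $svcName binPath= $binPath   # register the service
sc.exe config $svcName start= auto         # start automatically at boot
net start $svcName                         # start it now
Get-Service $svcName                       # verify Status shows Running
```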
Configuration
All settings are currently in the appsettings.json file located in the service installation folder.
Basic Settings (Required)
Open appsettings.json in Notepad and configure these sections:
1. Folder Paths
"ParquetWorkerOptions":{ "OutputFolderPath": "C:\\Data\\Output", "HistoricalOutputFolderPath": "C:\\Data\\Historical", "CloudUploadRetryCount": 3, "UploadMode": "S3" }
Settings Explained:
- `OutputFolderPath` - Where real-time measurement and event files arrive
- `HistoricalOutputFolderPath` - Where historical measurement files arrive
- `CloudUploadRetryCount` - How many times to retry if an upload fails
  - Default: 3 attempts
  - Recommended: 3-5 for unreliable networks
- `UploadMode` - Choose `"S3"` or `"Sftp"`
2. AWS S3 Settings (if using S3)
"S3Options":{ "S3AccessPointArn": "arn:aws:s3:us-east-1:123456789012:accesspoint/myaccesspoint", "S3Region": "us-east-1" }
Settings Explained:
- `S3AccessPointArn` - Your AWS S3 Access Point ARN (get from AWS Console)
- `S3Region` - AWS region where your S3 bucket is located
  - Examples: `us-east-1`, `eu-west-1`, `ap-southeast-2`
Authentication Methods
This service uses IAM Role-based authentication (recommended by AWS for services running on EC2 or on-premises).
- If running on EC2: Attach an IAM role to the EC2 instance with S3 access permissions
- If running on-premises: Configure AWS credentials using one of these methods:
  - AWS CLI (`aws configure`) - stores credentials in `C:\Users\[username]\.aws\credentials`
  - Environment variables: `AWS_ACCESS_KEY_ID` and `AWS_SECRET_ACCESS_KEY` (see the sketch after this list)
  - Windows Credential Manager
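If you go the environment-variable route, keep in mind that a Windows service only sees machine-scoped variables, and only reads them when it starts. A sketch, with placeholder key values:

```powershell
# Set machine-scoped AWS credentials (values are placeholders), then restart the service
[Environment]::SetEnvironmentVariable('AWS_ACCESS_KEY_ID', 'AKIA...', 'Machine')
[Environment]::SetEnvironmentVariable('AWS_SECRET_ACCESS_KEY', '<your-secret-key>', 'Machine')
Restart-Service "Job Export Service"
```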
How to Get S3 Access Point ARN:
- Log into AWS Console
- Go to S3 → Access Points
- Click on your access point
- Copy the ARN (starts with `arn:aws:s3:...`); alternatively, list access points from the CLI as sketched below
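If the AWS CLI is already configured on the server, you can list access point ARNs without the console. The account ID below is a placeholder:

```powershell
# Lists all S3 access points (including their ARNs) for the account
aws s3control list-access-points --account-id 123456789012
```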
Setting Up IAM Permissions (for IT administrators):
The IAM role/user needs these permissions:
```json
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "s3:PutObject",
        "s3:PutObjectAcl"
      ],
      "Resource": "arn:aws:s3:us-east-1:123456789012:accesspoint/myaccesspoint/object/*"
    }
  ]
}
```
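If you manage IAM from the CLI, a policy like the one above can be attached as an inline role policy. The role and policy names here are hypothetical examples:

```powershell
# Save the JSON policy above as policy.json first; role/policy names are examples
aws iam put-role-policy `
    --role-name JobExportServiceRole `
    --policy-name JobExportS3Upload `
    --policy-document file://policy.json
```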
3. SFTP Settings (if using SFTP)
Option A: Password Authentication (Not Recommended)
"SftpOptions":{ "SftpHost": "sftp.example.com", "SftpPort": 22, "SftpUsername": "datauploader", "SftpPassword": "YourSecurePassword" }
Option B: Private Key Authentication (Recommended)
"SftpOptions":{ "SftpHost": "sftp.example.com", "SftpPort": 22, "SftpUsername": "datauploader", "SftpPrivateKeyPath": "C:\\Keys\\sftp_private_key.pem", "SftpPrivateKeyPassphrase": "KeyPassphrase123" // Optional, only if key is encrypted }
Settings Explained:
- `SftpHost` - SFTP server address (domain name or IP address)
- `SftpPort` - SFTP port (usually 22)
- `SftpUsername` - Your SFTP username
- `SftpPassword` - Your SFTP password (for password auth)
- `SftpPrivateKeyPath` - Path to SSH private key file (for key-based auth)
- `SftpPrivateKeyPassphrase` - Passphrase if private key is encrypted (optional)
⚠️ Security Note:
- Private key authentication is more secure than password authentication
- Store private keys in a secure location (e.g., `C:\Keys\`) with restricted permissions
- Never commit private keys to source control
- Use encrypted keys with passphrases for additional security (see the key-generation sketch below)
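To generate a key pair in the recommended format, the OpenSSH client that ships with recent Windows versions works well. The file path and comment below are examples; share the `.pub` file with the SFTP server administrator and point `SftpPrivateKeyPath` at the private key file:

```powershell
# Prompts for a passphrase; writes sftp_private_key and sftp_private_key.pub
ssh-keygen -t ed25519 -f C:\Keys\sftp_private_key -C "datauploader"
```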
Folder Structure
The service automatically creates these folders:
```
C:\Data\Output\
├── measurements\        ← Place real-time measurement files here
├── events\              ← Place real-time event/alarm files here
└── archive\
    ├── measurement\     ← Processed measurement files moved here
    └── event\           ← Processed event files moved here

C:\Data\Historical\
├── measurements\        ← Place historical measurement files here
└── archive\
    └── measurement\     ← Processed historical files moved here
```

What Happens to Files?
1. Before Upload: Files arrive in the `measurements\` or `events\` folder
2. During Upload: Service reads the file, extracts metadata, and uploads it to the cloud
3. After Upload: File is moved to `archive\measurement\` or `archive\event\`
4. Archive Naming: Files get a timestamp added (e.g., `data_20241211103045123.parquet`)
Common Tasks
Starting the Service
Option 1: Services Console
1. Press `Windows + R`
2. Type `services.msc` and press Enter
3. Find "Job Export Service"
4. Right-click → Start
Option 2: Command Line
```powershell
net start "Job Export Service"
```

Stopping the Service
Option 1: Services Console
1. Open `services.msc`
2. Find "Job Export Service"
3. Right-click → Stop
Option 2: Command Line
```powershell
net stop "Job Export Service"
```

Restarting After Configuration Changes
1. Stop the service
2. Edit `appsettings.json`
3. Save the file
4. Start the service
Checking If Service Is Running
Option 1: Services Console
1. Open `services.msc`
2. Look for "Job Export Service"
3. Status column shows "Running" or "Stopped"
Option 2: Command Line
```powershell
Get-Service "Job Export Service"
```

Monitoring
Where Are the Logs?
⚠️ See upcoming work notes here
Logs are written to:
- Windows Event Viewer: Application logs
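If you prefer the command line, the same log can be queried from PowerShell. This sketch assumes the event provider name matches the service name; confirm the actual source name in Event Viewer first:

```powershell
# Show the 50 most recent Application-log events from the service
Get-WinEvent -FilterHashtable @{
    LogName      = 'Application'
    ProviderName = 'Job Export Service'
} -MaxEvents 50 | Format-Table TimeCreated, LevelDisplayName, Message -AutoSize
```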
Viewing Logs in Event Viewer
1. Press `Windows + R`
2. Type `eventvwr.msc` and press Enter
3. Navigate to: Windows Logs → Application
4. Filter for Source: "Job Export Service"
What to Look For
✅ Good Signs:
"File watchers initialized and monitoring directories""Successfully processed and uploaded: [filename]""Processing X real-time measurement file(s)"
⚠️ Warning Signs:
"Parquet file upload failed""Error during periodic directory scan""Failed to move file to archive"
❌ Critical Issues:
- "Fatal error in ParquetArchiveProcessor"
- Service status shows "Stopped"
Common Log Messages
| Message | Meaning | Action Needed |
|---|---|---|
| "File watchers initialized" | Service started successfully | None - normal |
| "Processing X files" | Files are being uploaded | None - normal |
| "Successfully processed and uploaded" | File uploaded successfully | None - normal |
| "Parquet file upload failed" | Network issue or S3/SFTP problem | Check network, verify credentials |
| "No metadata found" | File is corrupted or invalid | Check file, may need to delete |
| "File already queued or in progress" | Duplicate file detection working | None - normal |
| "Error processing historical directory" | Issue with historical folder | Check folder permissions |
Troubleshooting
Files Not Being Uploaded
Check 1: Is Service Running?
1. Open `services.msc`
2. Verify "Job Export Service" status is "Running"
3. If stopped, start it
Check 2: Are Files in Correct Folder?
1. Verify files are in `.parquet` format
2. Verify they're in the `measurements\` or `events\` folder
3. Check that folder paths in config match actual folders
Check 3: Check Logs
- Open Event Viewer
- Look for errors in Application logs
- Common issues: network problems, permission errors
Check 4: Network Connectivity
- Verify server can reach S3 or SFTP server
- Test with: `Test-NetConnection sftp.example.com -Port 22` (see also the S3 check below)
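For S3, the equivalent check is HTTPS reachability to the regional endpoint. The region below is an example; use your configured `S3Region`:

```powershell
# S3 is reached over HTTPS (port 443)
Test-NetConnection s3.us-east-1.amazonaws.com -Port 443
```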
Files Getting Stuck
Symptom: Files stay in folder, never move to archive
Possible Causes:
- Upload failing (check logs)
- File is locked by another program
- Corrupted file
- Network issue
Solution:
- Check Event Viewer for error messages
- Verify credentials are correct in `appsettings.json`
- Try manually deleting one file to see if the service processes others (the stale-file check below helps find stuck files)
- Restart service
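To spot files the service has not picked up, list anything that has been sitting in a watched folder for a while. The path and the one-hour threshold are examples; adjust to your configuration:

```powershell
# Files older than 1 hour in the real-time measurements folder
Get-ChildItem 'C:\Data\Output\measurements' -Filter *.parquet |
    Where-Object { $_.LastWriteTime -lt (Get-Date).AddHours(-1) } |
    Select-Object Name, LastWriteTime, Length
```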
Service Won't Start
Possible Causes:
- Configuration file has errors (invalid JSON)
- Missing required settings
- Folder paths don't exist
- .NET Runtime not installed
Solution:
- Validate `appsettings.json` using a JSON validator (or the PowerShell check below)
- Check Windows Event Viewer for startup errors
- Ensure all folder paths exist
- Reinstall .NET 8.0 Runtime
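A quick PowerShell sanity check covers the first three causes at once. The install path assumes the default location from this guide:

```powershell
$configPath = 'C:\Program Files\Bazefield\Baze Job Export Service\appsettings.json'
# ConvertFrom-Json throws a descriptive error if the JSON is invalid
$config = Get-Content $configPath -Raw | ConvertFrom-Json
# Warn about any configured folder that does not exist
foreach ($p in @($config.ParquetWorkerOptions.OutputFolderPath,
                 $config.ParquetWorkerOptions.HistoricalOutputFolderPath)) {
    if (-not (Test-Path $p)) { Write-Warning "Missing folder: $p" }
}
```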
Upload Errors
Error: "Parquet file upload failed"
Possible Causes:
- Network disconnected
- Wrong S3 credentials or IAM permissions
- SFTP authentication failed (wrong password or private key)
- SFTP server down
- Firewall blocking connection
Solution:
- Check network connectivity
- For S3:
  - Verify IAM role/credentials are configured
  - Test with `aws s3 ls` from the same server
  - Check CloudWatch logs for permission errors
- For SFTP:
  - Verify credentials in `appsettings.json`
  - If using a private key, ensure the file exists at `SftpPrivateKeyPath`
  - Test manually: `sftp -i C:\Keys\sftp_key.pem username@host`
  - Check private key file permissions (should be readable by the service account)
- Check firewall rules (port 22 for SFTP, port 443 for S3)
Error: "SSH private key file not found"
Possible Causes:
- Private key path in config is incorrect
- Private key file was deleted or moved
- Service account doesn't have permission to read key file
Solution:
- Verify `SftpPrivateKeyPath` in `appsettings.json` is correct
- Check that the file exists at the specified path
- Grant service account Read permission on private key file (see the icacls sketch below)
- Ensure key is in OpenSSH format (not PuTTY .ppk format)
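Granting the read permission can be done with icacls. The account name below is a placeholder for whatever account the service runs under:

```powershell
# Give the service account read-only access to the private key file
icacls "C:\Keys\sftp_private_key.pem" /grant "DOMAIN\svc-jobexport:R"
```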
Performance Tips
How Many Files Can It Handle?
- Normal Load: 100-500 files per hour - no issues
- High Load: 1000+ files per hour - may need tuning
- Very High Load: 10,000+ files per hour - consider multiple instances
Optimizing for High Volume
If you're processing many files per hour:
- Reduce Retry Count (faster failure detection): `"CloudUploadRetryCount": 2`
- Monitor Archive Folder Size - old files should be cleaned up periodically
Disk Space Management
Archive folders will grow over time:
- The AvroOutputAdapter for the data engine has a setting, `MaxHistoryDays`, to purge the archive folder
- Once set, it deletes Parquet files that have been stored on disk for more than the configured number of days (a manual cleanup alternative is sketched below)
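If `MaxHistoryDays` is not available in your setup, a scheduled task along these lines achieves the same effect. The path and 30-day retention are examples; run with `-WhatIf` first to preview what would be deleted:

```powershell
# Delete archived Parquet files older than 30 days (remove -WhatIf to actually delete)
Get-ChildItem 'C:\Data\Output\archive' -Recurse -Filter *.parquet |
    Where-Object { $_.LastWriteTime -lt (Get-Date).AddDays(-30) } |
    Remove-Item -WhatIf
```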
Security Best Practices
Protecting Credentials
- SFTP Authentication:
- ✅ Prefer SSH private keys over passwords
- Encrypt private keys with passphrases
- Store keys in `C:\Keys\` with restricted NTFS permissions (Administrators only)
- Use 4096-bit RSA or Ed25519 keys
- AWS S3 Authentication:
- ✅ Prefer IAM roles (EC2 instance roles) over access keys
- If using access keys, never store them in `appsettings.json`
- Use the AWS CLI credential file or environment variables
- Rotate access keys every 90 days
- Use least-privilege IAM policies
- Configuration File Security:
- Encrypt sensitive values using Windows DPAPI (see the sketch after this list)
- Restrict `appsettings.json` permissions to Administrators only
- Never commit credentials to source control
- Service Account Security:
- Run service under dedicated service account (not SYSTEM)
- Grant minimal required permissions
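A minimal DPAPI sketch for the encryption side, assuming it is run as the account that will later decrypt the value (user-scoped DPAPI ties the ciphertext to that account). How the service decrypts and consumes the value depends on your setup and is outside this guide:

```powershell
# Encrypt a secret with DPAPI (current-user scope) and store the ciphertext on disk
$secret = Read-Host 'Enter secret' -AsSecureString
$secret | ConvertFrom-SecureString | Out-File 'C:\Keys\sftp_passphrase.enc'
```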
Network Security
- Firewall Rules: Allow outbound to S3/SFTP servers only
- VPN: Use VPN for SFTP connections if possible
- TLS/SSL: Ensure S3 uses HTTPS (enabled by default)
Folder Permissions
Recommended NTFS permissions (applied in the sketch below):
- Service account: Read/Write on all data folders
- Administrators: Full Control
- Users: No access
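These recommendations can be applied with icacls. The service account name is a placeholder; `(OI)(CI)` makes the grant inherit to files and subfolders:

```powershell
# Remove inherited permissions, then grant only Administrators and the service account
icacls "C:\Data\Output" /inheritance:r `
    /grant "Administrators:(OI)(CI)F" `
    /grant "DOMAIN\svc-jobexport:(OI)(CI)M"
```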
Appendix: Configuration Examples
Example 1: AWS S3 in US East
```json
{
"ParquetWorkerOptions": {
"OutputFolderPath": "C:\\Data\\Output",
"HistoricalOutputFolderPath": "C:\\Data\\Historical",
"PeriodFileScanIntervalSec": 30,
"CloudUploadRetryCount": 3,
"UploadMode": "S3"
},
"S3Options": {
"S3AccessPointArn": "arn:aws:s3:us-east-1:123456789012:accesspoint/data-upload",
"S3Region": "us-east-1"
}
}
```

Example 2: SFTP Server with Private Key Authentication

```json
{
"ParquetWorkerOptions": {
"OutputFolderPath": "D:\\FileUpload\\Realtime",
"HistoricalOutputFolderPath": "D:\\FileUpload\\Historical",
"PeriodFileScanIntervalSec": 60,
"CloudUploadRetryCount": 5,
"UploadMode": "Sftp"
},
"SftpOptions": {
"SftpHost": "upload.example.com",
"SftpPort": 22,
"SftpUsername": "dataservice",
"SftpPrivateKeyPath": "C:\\Keys\\sftp_production.pem",
"SftpPrivateKeyPassphrase": "MySecurePassphrase!"
}
}
```

Example 3: High-Volume Processing

```json
{
"ParquetWorkerOptions": {
"OutputFolderPath": "E:\\HighVolume\\Output",
"HistoricalOutputFolderPath": "E:\\HighVolume\\Historical",
"PeriodFileScanIntervalSec": 60,
"CloudUploadRetryCount": 2,
"UploadMode": "S3"
},
"S3Options": {
"S3AccessPointArn": "arn:aws:s3:us-west-2:987654321098:accesspoint/bulk-upload",
"S3Region": "us-west-2"
}
}
```

Quick Reference Card
Service Control Commands
```powershell
# Start
net start "Job Export Service"

# Stop
net stop "Job Export Service"

# Restart
net stop "Job Export Service" && net start "Job Export Service"

# Check Status
Get-Service "Job Export Service"
```

Configuration File Location
`C:\Program Files\Bazefield\Baze Job Export Service\appsettings.json`

Watched Folders
Real-time Measurements: [ConfiguredOutputFolderPath]\measurements\
Real-time Events: [ConfiguredOutputFolderPath]\events\
Historical: [ConfiguredHistoricalOutputFolderPath]\measurements\

Archive Folders
Real-time Archive: [ConfiguredOutputFolderPath]\archive\measurement\
Events Archive: [ConfiguredOutputFolderPath]\archive\event\
Historical Archive: [ConfiguredHistoricalOutputFolderPath]\archive\measurement\