What Does This Service Do?


The Job Export Service automatically watches folders on the server for data files (Parquet format), uploads them to cloud storage (AWS S3 or an SFTP server), and then moves them to an archive folder. Think of it as an automated file delivery system that runs 24/7 in the background.


Key Features

  • Automatic File Detection - No manual uploads needed
  • Reliable Delivery - Retries automatically if network fails
  • Priority Processing - Real-time data gets uploaded before historical data
  • Safe Archival - Keeps local copies after successful upload
  • Runs as Windows Service - Starts automatically when server boots



How It Works (Simple Explanation)

1. Data files arrive in watched folders
   ↓
2. Service detects new files within seconds
   ↓
3. Files are uploaded to AWS S3 or SFTP server
   ↓
4. Successfully uploaded files are moved to archive folder
   ↓
5. Process repeats for next file

What Folders Does It Watch?

The service can be configured to monitor three different folders on the local file system:

  • Real-time Measurements - C:\Data\Output\measurements\
    • High priority
    • Processed immediately
  • Real-time Events/Alarms - C:\Data\Output\events\
    • High priority
    • Processed immediately
  • Historical Measurements - C:\Data\Historical\measurements\
    • Lower priority
    • Waits for real-time data to finish first

Installation

Prerequisites

  • Windows Server 2016 or later
  • .NET 8.0 Runtime installed
  • Network access to AWS S3 or SFTP server
  • Administrator rights to install Windows Service
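
To confirm the .NET runtime prerequisite, run the following from any command prompt and look for a Microsoft.NETCore.App 8.0.x entry:

# List installed .NET runtimes
dotnet --list-runtimes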


Installation Steps

  1. Copy Files
    • Extract the service files to: C:\Program Files\Bazefield\Baze Job Export Service\
  2. Install as Windows Service
    # Run PowerShell as Administrator
    sc.exe create "Job Export Service" binPath= "C:\Program Files\Bazefield\Baze Job Export Service\JobExportService.exe"
    sc.exe config "Job Export Service" start= auto
  3. Configure Settings (see Configuration section below)
  4. Start Service
    net start "Job Export Service"
  5. Verify It's Running
    • Open Services (services.msc)
    • Find "Job Export Service"
    • Status should show "Running"

Configuration

All settings are currently in the appsettings.json file located in the service installation folder.


Basic Settings (Required)

Open appsettings.json in Notepad and configure these sections:


1. Folder Paths

"ParquetWorkerOptions": 
{
  "OutputFolderPath": "C:\\Data\\Output",
  "HistoricalOutputFolderPath": "C:\\Data\\Historical",
  "CloudUploadRetryCount": 3,
  "UploadMode": "S3"
}

Settings Explained:

  • OutputFolderPath - Where real-time measurement and event files arrive
  • HistoricalOutputFolderPath - Where historical measurement files arrive
  • CloudUploadRetryCount - How many times to retry if an upload fails
    • Default: 3 attempts
    • Recommended: 3-5 for unreliable networks
  • UploadMode - Choose "S3" or "Sftp"
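
The service creates its working folders automatically on startup (see Folder Structure below), but if you prefer to pre-create them, here is a PowerShell sketch assuming the default paths from the example above:

# Pre-create the watched folders (paths taken from the example config)
New-Item -ItemType Directory -Force -Path `
  "C:\Data\Output\measurements", `
  "C:\Data\Output\events", `
  "C:\Data\Historical\measurements"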


2. AWS S3 Settings (if using S3)

"S3Options": 
{
  "S3AccessPointArn": "arn:aws:s3:us-east-1:123456789012:accesspoint/myaccesspoint",
  "S3Region": "us-east-1"
}

Settings Explained:

  • S3AccessPointArn - Your AWS S3 Access Point ARN (get it from the AWS Console)
  • S3Region - AWS region where your S3 bucket is located
    • Examples: us-east-1, eu-west-1, ap-southeast-2


Authentication Methods


This service uses IAM Role-based authentication (recommended by AWS for services running on EC2 or on-premises).

  • If running on EC2: Attach an IAM role to the EC2 instance with S3 access permissions
  • If running on-premises: Configure AWS credentials using one of these methods:
    1. AWS CLI (aws configure) - stores credentials in C:\Users\[username]\.aws\credentials
    2. Environment variables: AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY
    3. Windows Credential Manager
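
If you use environment variables (method 2), set them at machine level so the service account can read them. A sketch with placeholder values:

# Run PowerShell as Administrator; the values shown are placeholders
[Environment]::SetEnvironmentVariable("AWS_ACCESS_KEY_ID", "AKIA...", "Machine")
[Environment]::SetEnvironmentVariable("AWS_SECRET_ACCESS_KEY", "<secret>", "Machine")
# Restart the service (or reboot) so it picks up the new variables
Restart-Service "Job Export Service"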

How to Get S3 Access Point ARN:

  1. Log into AWS Console
  2. Go to S3 → Access Points
  3. Click on your access point
  4. Copy the ARN (starts with arn:aws:s3:...)


Setting Up IAM Permissions (for IT administrators):


The IAM role/user needs these permissions:

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "s3:PutObject",
        "s3:PutObjectAcl"
      ],
      "Resource": "arn:aws:s3:us-east-1:123456789012:accesspoint/myaccesspoint/object/*"
    }
  ]
}
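
To spot-check these permissions from the server, note that the AWS CLI accepts an access point ARN in place of a bucket name. A test upload sketch using the placeholder ARN from the example above:

# Upload a small test object through the access point (placeholder ARN)
aws s3api put-object --bucket arn:aws:s3:us-east-1:123456789012:accesspoint/myaccesspoint --key test.txt --body test.txt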

3. SFTP Settings (if using SFTP)


Option A: Password Authentication (Not Recommended)

"SftpOptions": 
{
  "SftpHost": "sftp.example.com",
  "SftpPort": 22,
  "SftpUsername": "datauploader",
  "SftpPassword": "YourSecurePassword"
}

Option B: Private Key Authentication (Recommended)

"SftpOptions": 
{
  "SftpHost": "sftp.example.com",
  "SftpPort": 22,
  "SftpUsername": "datauploader",
  "SftpPrivateKeyPath": "C:\\Keys\\sftp_private_key.pem",
  "SftpPrivateKeyPassphrase": "KeyPassphrase123"  // Optional, only if key is encrypted
}

Settings Explained:

  • SftpHost - SFTP server address (domain name or IP address)
  • SftpPort - SFTP port (usually 22)
  • SftpUsername - Your SFTP username
  • SftpPassword - Your SFTP password (for password auth)
  • SftpPrivateKeyPath - Path to SSH private key file (for key-based auth)
  • SftpPrivateKeyPassphrase - Passphrase if private key is encrypted (optional)

⚠️ Security Note:

  • Private key authentication is more secure than password authentication
  • Store private keys in a secure location (e.g., C:\Keys\) with restricted permissions
  • Never commit private keys to source control
  • Use encrypted keys with passphrases for additional security
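
If you need to generate a key pair, the OpenSSH client bundled with recent Windows versions can do it (on Windows Server 2016, install the OpenSSH client feature first; the path and comment below are illustrative):

# Generate an Ed25519 key pair; you will be prompted for a passphrase
ssh-keygen -t ed25519 -f C:\Keys\sftp_private_key -C "datauploader"
# Give the public key (C:\Keys\sftp_private_key.pub) to the SFTP server administrator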

Folder Structure

The service automatically creates these folders:

C:\Data\Output\
├── measurements\          ← Place real-time measurement files here
├── events\               ← Place real-time event/alarm files here
└── archive\
    ├── measurement\      ← Processed measurement files moved here
    └── event\           ← Processed event files moved here

C:\Data\Historical\
├── measurements\          ← Place historical measurement files here
└── archive\
    └── measurement\      ← Processed historical files moved here

What Happens to Files?

  1. Before Upload: Files arrive in the measurements\ or events\ folder
  2. During Upload: Service reads the file, extracts metadata, uploads to cloud
  3. After Upload: File is moved to archive\measurement\ or archive\event\
  4. Archive Naming: Files get a timestamp added (e.g., data_20241211103045123.parquet)
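
The timestamp in the example name appears to follow a year-month-day-hour-minute-second-millisecond pattern; in PowerShell date-format terms:

# The apparent archive timestamp pattern (yyyyMMddHHmmssfff)
Get-Date -Format "yyyyMMddHHmmssfff"   # e.g. 20241211103045123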

Common Tasks

Starting the Service

    Option 1: Services Console

  1. Press Windows + R
  2. Type services.msc and press Enter
  3. Find "Job Export Service"
  4. Right-click → Start

    Option 2: Command Line

net start "Job Export Service"

Stopping the Service

    Option 1: Services Console

  1. Open services.msc
  2. Find "Job Export Service"
  3. Right-click → Stop

    Option 2: Command Line

net stop "Job Export Service"

Restarting After Configuration Changes

  1. Stop the service
  2. Edit appsettings.json
  3. Save the file
  4. Start the service
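
Alternatively, edit and save the file first, then bounce the service in one command from an elevated PowerShell prompt:

Restart-Service "Job Export Service"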

Checking If Service Is Running

    Option 1: Services Console

  • Open services.msc
  • Look for "Job Export Service"
  • Status column shows "Running" or "Stopped"

    Option 2: PowerShell

Get-Service "Job Export Service"

Monitoring

Where Are the Logs?


Logs are written to:

  • Windows Event Viewer: Application logs

Viewing Logs in Event Viewer

  1. Press Windows + R
  2. Type eventvwr.msc and press Enter
  3. Navigate to: Windows Logs → Application
  4. Filter for Source: "Job Export Service"
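
The same events can be read from PowerShell, which helps on Server Core or over a remote session (this assumes the event source is registered as "Job Export Service"):

# Show the 50 most recent Application-log events from the service
Get-WinEvent -FilterHashtable @{ LogName = 'Application'; ProviderName = 'Job Export Service' } -MaxEvents 50 |
  Format-Table TimeCreated, LevelDisplayName, Message -AutoSize -Wrap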


What to Look For

✅ Good Signs:

  • "File watchers initialized and monitoring directories"
  • "Successfully processed and uploaded: [filename]"
  • "Processing X real-time measurement file(s)"

⚠️ Warning Signs:

  • "Parquet file upload failed"
  • "Error during periodic directory scan"
  • "Failed to move file to archive"

❌ Critical Issues:

  • "Fatal error in ParquetArchiveProcessor"
  • Service status shows "Stopped"

Common Log Messages

Message                                   | Meaning                           | Action Needed
------------------------------------------|-----------------------------------|-----------------------------------
"File watchers initialized"               | Service started successfully      | None - normal
"Processing X files"                      | Files are being uploaded          | None - normal
"Successfully processed and uploaded"     | File uploaded successfully        | None - normal
"Parquet file upload failed"              | Network issue or S3/SFTP problem  | Check network, verify credentials
"No metadata found"                       | File is corrupted or invalid      | Check file, may need to delete
"File already queued or in progress"      | Duplicate file detection working  | None - normal
"Error processing historical directory"   | Issue with historical folder      | Check folder permissions

Troubleshooting

Files Not Being Uploaded

Check 1: Is Service Running?

  • Open services.msc
  • Verify "Job Export Service" status is "Running"
  • If stopped, start it

Check 2: Are Files in Correct Folder?

  • Verify files are .parquet format
  • Verify they're in measurements\ or events\ folder
  • Check that folder paths in config match actual folders

Check 3: Check Logs

  • Open Event Viewer
  • Look for errors in Application logs
  • Common issues: network problems, permission errors

Check 4: Network Connectivity

  • Verify server can reach S3 or SFTP server
  • Test with: Test-NetConnection sftp.example.com -Port 22
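
For S3, a comparable check against the regional endpoint (the hostname below assumes the us-east-1 example region):

Test-NetConnection s3.us-east-1.amazonaws.com -Port 443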


Files Getting Stuck

Symptom: Files stay in folder, never move to archive

Possible Causes:

  1. Upload failing (check logs)
  2. File is locked by another program
  3. Corrupted file
  4. Network issue

Solution:

  1. Check Event Viewer for error messages
  2. Verify credentials are correct in appsettings.json
  3. Try manually deleting one file to see if the service processes the others
  4. Restart service
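
To test cause 2 (a locked file), a small PowerShell probe that tries to open the file exclusively (the path is illustrative):

# If another process holds the file open, the exclusive open will fail
$path = "C:\Data\Output\measurements\example.parquet"   # illustrative path
try {
    $fs = [System.IO.File]::Open($path, 'Open', 'ReadWrite', 'None')
    $fs.Close()
    "File is not locked"
} catch {
    "File appears to be locked: $($_.Exception.Message)"
}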


Service Won't Start

Possible Causes:

  1. Configuration file has errors (invalid JSON)
  2. Missing required settings
  3. Folder paths don't exist
  4. .NET Runtime not installed

Solution:

  1. Validate appsettings.json locally (see the PowerShell one-liner after this list) rather than pasting it into an online validator
  2. Check Windows Event Viewer for startup errors
  3. Ensure all folder paths exist
  4. Reinstall .NET 8.0 Runtime
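
For step 1, the file can be validated without sending configuration (which may contain credentials) to a website. Note that ConvertFrom-Json rejects // comments, which the .NET configuration loader itself tolerates:

# Throws a descriptive parse error if the JSON is malformed
Get-Content "C:\Program Files\Bazefield\Baze Job Export Service\appsettings.json" -Raw |
  ConvertFrom-Json | Out-Null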


Upload Errors


Error: "Parquet file upload failed"


    Possible Causes:

  • Network disconnected
  • Wrong S3 credentials or IAM permissions
  • SFTP authentication failed (wrong password or private key)
  • SFTP server down
  • Firewall blocking connection

    Solution:

  1. Check network connectivity
  2. For S3:
    • Verify IAM role/credentials are configured
    • Test with: aws s3 ls from same server
    • Check CloudWatch logs for permission errors
  3. For SFTP:
    • Verify credentials in appsettings.json
    • If using private key, ensure file exists at SftpPrivateKeyPath
    • Test manually: sftp -i C:\Keys\sftp_key.pem username@host
    • Check private key file permissions (should be readable by service account)
  4. Check firewall rules (port 22 for SFTP, port 443 for S3)


Error: "SSH private key file not found"


    Possible Causes:

  • Private key path in config is incorrect
  • Private key file was deleted or moved
  • Service account doesn't have permission to read key file

    Solution:

  1. Verify SftpPrivateKeyPath in appsettings.json is correct
  2. Check that file exists at specified path
  3. Grant service account Read permission on private key file
  4. Ensure key is in OpenSSH format (not PuTTY .ppk format)
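
If the key came from PuTTY, export it to OpenSSH format with PuTTYgen (GUI: Conversions → Export OpenSSH key) or, if the puttygen command-line tool is available, something like:

# Convert a PuTTY .ppk key to OpenSSH format (paths are illustrative)
puttygen C:\Keys\sftp_key.ppk -O private-openssh -o C:\Keys\sftp_key.pem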

Performance Tips

How Many Files Can It Handle?

  • Normal Load: 100-500 files per hour - no issues
  • High Load: 1000+ files per hour - may need tuning
  • Very High Load: 10,000+ files per hour - consider multiple instances


Optimizing for High Volume

    If you're processing many files per hour:

  1. Reduce Retry Count (faster failure detection):
    "CloudUploadRetryCount": 2
  2. Monitor Archive Folder Size - Old files should be cleaned up periodically


Disk Space Management

Archive folders will grow over time:

  • The AvroOutputAdapter for the data engine has a setting to purge the archive folder:
    • MaxHistoryDays
  • Once set, Parquet files that have been on disk longer than the configured number of days are deleted automatically
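
If MaxHistoryDays is not set, cleanup can be scripted and scheduled separately. A sketch that removes archived files older than 30 days (paths and retention are illustrative):

# Delete archived Parquet files older than 30 days (adjust paths and retention to your setup)
$cutoff = (Get-Date).AddDays(-30)
Get-ChildItem "C:\Data\Output\archive", "C:\Data\Historical\archive" -Recurse -Filter *.parquet |
  Where-Object { $_.LastWriteTime -lt $cutoff } |
  Remove-Item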

Security Best Practices

Protecting Credentials

  1. SFTP Authentication:
    • ✅ Prefer SSH private keys over passwords
    • Encrypt private keys with passphrases
    • Store keys in C:\Keys\ with restricted NTFS permissions (Administrators only)
    • Use 4096-bit RSA or Ed25519 keys
  2. AWS S3 Authentication:
    • ✅ Prefer IAM roles (EC2 instance roles) over access keys
    • If using access keys, never store in appsettings.json
    • Use AWS CLI credential file or environment variables
    • Rotate access keys every 90 days
    • Use least-privilege IAM policies
  3. Configuration File Security:
    • Encrypt sensitive values using Windows DPAPI
    • Restrict appsettings.json permissions to Administrators only (see the icacls sketch after this list)
    • Never commit credentials to source control
  4. Service Account Security:
    • Run service under dedicated service account (not SYSTEM)
    • Grant minimal required permissions
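
A sketch of the configuration-file lockdown from item 3, using icacls (the service account name svc-jobexport is illustrative):

# Remove inherited permissions, then grant Administrators full control and the service account read-only
icacls "C:\Program Files\Bazefield\Baze Job Export Service\appsettings.json" /inheritance:r /grant:r "Administrators:F" "svc-jobexport:R"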


Network Security

  1. Firewall Rules: Allow outbound to S3/SFTP servers only
  2. VPN: Use VPN for SFTP connections if possible
  3. TLS/SSL: Ensure S3 uses HTTPS (enabled by default)


Folder Permissions

Recommended NTFS permissions:

  • Service account: Read/Write on all data folders
  • Administrators: Full Control
  • Users: No access
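
As a sketch, the service-account grant could look like this (the account name svc-jobexport is illustrative):

# Grant the service account modify rights on the data root, inherited by subfolders and files
icacls "C:\Data" /grant "svc-jobexport:(OI)(CI)M"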

Appendix: Configuration Examples

Example 1: AWS S3 in US East

{
  "ParquetWorkerOptions": {
    "OutputFolderPath": "C:\\Data\\Output",
    "HistoricalOutputFolderPath": "C:\\Data\\Historical",
    "PeriodFileScanIntervalSec": 30,
    "CloudUploadRetryCount": 3,
    "UploadMode": "S3"
  },
  "S3Options": {
    "S3AccessPointArn": "arn:aws:s3:us-east-1:123456789012:accesspoint/data-upload",
    "S3Region": "us-east-1"
  }
}

Example 2: SFTP Server with Private Key Authentication

{
  "ParquetWorkerOptions": {
    "OutputFolderPath": "D:\\FileUpload\\Realtime",
    "HistoricalOutputFolderPath": "D:\\FileUpload\\Historical",
    "PeriodFileScanIntervalSec": 60,
    "CloudUploadRetryCount": 5,
    "UploadMode": "Sftp"
  },
  "SftpOptions": {
    "SftpHost": "upload.example.com",
    "SftpPort": 22,
    "SftpUsername": "dataservice",
    "SftpPrivateKeyPath": "C:\\Keys\\sftp_production.pem",
    "SftpPrivateKeyPassphrase": "MySecurePassphrase!"
  }
}

Example 3: High-Volume Processing

{
  "ParquetWorkerOptions": {
    "OutputFolderPath": "E:\\HighVolume\\Output",
    "HistoricalOutputFolderPath": "E:\\HighVolume\\Historical",
    "PeriodFileScanIntervalSec": 60,
    "CloudUploadRetryCount": 2,
    "UploadMode": "S3"
  },
  "S3Options": {
    "S3AccessPointArn": "arn:aws:s3:us-west-2:987654321098:accesspoint/bulk-upload",
    "S3Region": "us-west-2"
  }
}

Quick Reference Card

Service Control Commands

# Start
net start "Job Export Service"

# Stop
net stop "Job Export Service"

# Restart
net stop "Job Export Service" && net start "Job Export Service"

# Check Status
Get-Service "Job Export Service"

Configuration File Location

C:\Program Files\Bazefield\Baze Job Export Service\appsettings.json


Watched Folders

Real-time Measurements: [ConfiguredOutputFolderPath]\measurements\
Real-time Events: [ConfiguredOutputFolderPath]\events\
Historical: [ConfiguredHistoricalOutputFolderPath]\measurements\

Archive Folders

Real-time Archive: [ConfiguredOutputFolderPath]\archive\measurement\
Events Archive: [ConfiguredOutputFolderPath]\archive\event\
Historical Archive: [ConfiguredHistoricalOutputFolderPath]\archive\measurement\