Authors: Luke Boening, Github, ChatGPT and Claude Sonnet 4.5 • Last update: December 14, 2025 03:04:31 • 2 min read

Problem Solution

Products and Problems¶

Problem 1: Selective copying of files and folders¶

Problem Statement¶

Environment:

Production Server: Windows Server 2022 hosting millions of document files (TIFF images, PDFs, etc.)
Test Server: Windows Server 2022 requiring only a small subset (~hundreds) of documents for application testing
Document Management System: IIS-hosted application that retrieves documents by metadata (document type + keywords)
Database: Oracle database with complex relational structure mapping document metadata to physical file paths
Document Structure: One logical document may consist of multiple physical files (e.g., multi-page TIFFs stored as individual files)

The Challenge: How to efficiently identify and synchronize only the specific files needed for test documents, when:

File paths are not directly known - they must be derived through complex SQL queries
One "document" maps to multiple physical files
The Oracle database schema requires joining multiple tables to resolve paths
Test server needs only 0.01% of production files but must mirror exact folder structure
Direct SMB/network access between servers may not be available
Manual copying is error-prone and doesn't scale

Solution Architecture¶

Three-Phase Approach:

Phase 1: Path Discovery (Database-Driven)¶

-- Query Oracle to identify file paths for specific documents
SELECT file_path, document_id, page_number
FROM document_files df
JOIN document_metadata dm ON df.document_id = dm.id
WHERE dm.document_type = 'Site Closure'
  AND dm.site_name = 'Test Site A'
  AND dm.status = 'ACTIVE'

Phase 2: Selective Collection¶

Export path list to JSON/CSV configuration file
PowerShell script reads configuration and builds directory structure
Compress only required files maintaining full folder hierarchy
Upload archive to Azure Blob Storage (intermediary)

Phase 3: Targeted Deployment¶

Test server downloads compressed archive from Azure
Extract files preserving complete folder structure
Verify file counts and paths match database expectations

Key Benefits:

✅ Only transfers needed files (99.99% reduction in storage)
✅ Maintains exact folder structure required by IIS application
✅ Database-driven: paths determined by actual application logic
✅ Works without direct server-to-server connectivity
✅ Repeatable: Re-run queries to refresh test data as needed
✅ Audit trail: Database records which documents/files were selected

Implementation Notes:

Use Oracle SQL to generate dynamic file path lists based on test scenarios
Leverage Azure Blob Storage as secure transfer point
PowerShell scripts handle compression, upload, download, extraction
Schedule periodic syncs to keep test environment current

Flowchart¶

flowchart TD
    A[Oracle Database] -->|SQL Query| B[Generate File Path List]
    B --> C[Export to JSON/CSV]
    C --> D[PowerShell: Collect Files]
    D --> E[Compress Archive]
    E --> F[Upload to Azure Blob]
    F --> G[Test Server Downloads]
    G --> H[Extract with Structure]
    H --> I[Verify File Counts]
    I --> J[IIS Application Ready]

Class diagram¶

classDiagram
    DOCUMENT_METADATA "1" --> "*" DOCUMENT_FILES : contains
    DOCUMENT_METADATA "*" --> "1" DOCUMENT_TYPES : has_type
    DOCUMENT_METADATA "*" --> "1" SITES : belongs_to

    class DOCUMENT_METADATA{
        +int document_id
        +string document_type
        +string site_name
        +string status
        +date created_date
    }

    class DOCUMENT_FILES{
        +int file_id
        +int document_id
        +string file_path
        +int page_number
    }

    class DOCUMENT_TYPES{
        +int type_id
        +string type_name
    }

    class SITES{
        +int site_id
        +string site_name
    }