Skip to content
Authors: Luke Boening, Github, ChatGPT and Claude Sonnet 4.5 • Last update: December 14, 2025 03:04:31 • 2 min read

Problem Solution

Products and Problems

Problem 1: Selective copying of files and folders

Problem Statement

Environment:

  • Production Server: Windows Server 2022 hosting millions of document files (TIFF images, PDFs, etc.)
  • Test Server: Windows Server 2022 requiring only a small subset (~hundreds) of documents for application testing
  • Document Management System: IIS-hosted application that retrieves documents by metadata (document type + keywords)
  • Database: Oracle database with complex relational structure mapping document metadata to physical file paths
  • Document Structure: One logical document may consist of multiple physical files (e.g., multi-page TIFFs stored as individual files)

The Challenge: How to efficiently identify and synchronize only the specific files needed for test documents, when:

  1. File paths are not directly known - they must be derived through complex SQL queries
  2. One "document" maps to multiple physical files
  3. The Oracle database schema requires joining multiple tables to resolve paths
  4. Test server needs only 0.01% of production files but must mirror exact folder structure
  5. Direct SMB/network access between servers may not be available
  6. Manual copying is error-prone and doesn't scale

Solution Architecture

Three-Phase Approach:

Phase 1: Path Discovery (Database-Driven)

-- Query Oracle to identify file paths for specific documents
SELECT file_path, document_id, page_number
FROM document_files df
JOIN document_metadata dm ON df.document_id = dm.id
WHERE dm.document_type = 'Site Closure'
  AND dm.site_name = 'Test Site A'
  AND dm.status = 'ACTIVE'

Phase 2: Selective Collection

  • Export path list to JSON/CSV configuration file
  • PowerShell script reads configuration and builds directory structure
  • Compress only required files maintaining full folder hierarchy
  • Upload archive to Azure Blob Storage (intermediary)

Phase 3: Targeted Deployment

  • Test server downloads compressed archive from Azure
  • Extract files preserving complete folder structure
  • Verify file counts and paths match database expectations

Key Benefits:

  • ✅ Only transfers needed files (99.99% reduction in storage)
  • ✅ Maintains exact folder structure required by IIS application
  • ✅ Database-driven: paths determined by actual application logic
  • ✅ Works without direct server-to-server connectivity
  • ✅ Repeatable: Re-run queries to refresh test data as needed
  • ✅ Audit trail: Database records which documents/files were selected

Implementation Notes:

  • Use Oracle SQL to generate dynamic file path lists based on test scenarios
  • Leverage Azure Blob Storage as secure transfer point
  • PowerShell scripts handle compression, upload, download, extraction
  • Schedule periodic syncs to keep test environment current

Flowchart

flowchart TD
    A[Oracle Database] -->|SQL Query| B[Generate File Path List]
    B --> C[Export to JSON/CSV]
    C --> D[PowerShell: Collect Files]
    D --> E[Compress Archive]
    E --> F[Upload to Azure Blob]
    F --> G[Test Server Downloads]
    G --> H[Extract with Structure]
    H --> I[Verify File Counts]
    I --> J[IIS Application Ready]

Class diagram

classDiagram
    DOCUMENT_METADATA "1" --> "*" DOCUMENT_FILES : contains
    DOCUMENT_METADATA "*" --> "1" DOCUMENT_TYPES : has_type
    DOCUMENT_METADATA "*" --> "1" SITES : belongs_to

    class DOCUMENT_METADATA{
        +int document_id
        +string document_type
        +string site_name
        +string status
        +date created_date
    }

    class DOCUMENT_FILES{
        +int file_id
        +int document_id
        +string file_path
        +int page_number
    }

    class DOCUMENT_TYPES{
        +int type_id
        +string type_name
    }

    class SITES{
        +int site_id
        +string site_name
    }