Batch Document Processing

Process multiple documents from cloud storage on a schedule.

Workflow Overview

graph LR
    A[Schedule Trigger] --> B[List Files]
    B --> C[Loop]
    C --> D[Download File]
    D --> E[DeepTagger]
    E --> F[Database]
    F --> G[Archive File]

Use Case

Scenario: Process new documents uploaded to a shared drive overnight

Automation: Every night at midnight, extract data from all new PDFs

Quick Setup

  1. Schedule Trigger
     • Cron: 0 0 * * * (midnight daily)

  2. Google Drive / Dropbox
     • List files in "/Invoices/New"
     • Filter: PDF files only

  3. Loop Over Items

  4. Google Drive Download
     • File ID: {{$json["id"]}}

  5. DeepTagger Node
     • Project ID: (your project)
     • Input Type: File
     • Binary Property: data

  6. Database Insert
     • Insert extracted data

  7. Move File
     • Move to "/Invoices/Processed"
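
The order of the setup steps can be sketched in plain Python to make the data flow explicit. This is only an illustration, not the workflow tool's or DeepTagger's API: run_pipeline and its five callables (list_files, download, extract, insert_row, archive) are hypothetical stand-ins for the corresponding nodes, and the file records are assumed to carry "id" and "name" keys.

```python
def run_pipeline(list_files, download, extract, insert_row, archive):
    """Run one scheduled pass: list, filter, download, extract, store, archive.

    All five arguments are hypothetical callables standing in for the
    workflow nodes; none of them is a real library function.
    """
    processed_ids = []
    for file_info in list_files():                  # List files in the source folder
        if not file_info["name"].lower().endswith(".pdf"):
            continue                                # Filter: PDF files only
        data = download(file_info["id"])            # Download by file ID
        row = extract(data)                         # Extraction step (DeepTagger in the workflow)
        insert_row(row)                             # Database insert
        archive(file_info["id"])                    # Move the file to the processed folder
        processed_ids.append(file_info["id"])
    return processed_ids
```

Archiving comes last on purpose: a file is moved out of the source folder only after its row has been stored, so a failed run leaves unprocessed files in place for the next schedule.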

Performance Tips

  • Add a 1-second delay between loop iterations
  • Enable "Continue on Fail" so one bad file does not stop the whole run
  • Process at most 100 files per run
  • Log every processing result to the database
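
The four tips combine into one loop pattern, sketched below in plain Python under stated assumptions. process_batch and handle are hypothetical names; the in-memory log list stands in for the database log, and the try/except imitates what "Continue on Fail" does in the workflow rather than reproducing its implementation.

```python
import time


def process_batch(files, handle, max_files=100, delay_s=1.0, sleep=time.sleep):
    """Process a capped batch, pausing between items and recording every outcome.

    `handle` is a hypothetical per-file callable; `sleep` is injectable so the
    delay can be skipped in tests.
    """
    log = []
    batch = files[:max_files]                       # Cap the run at max_files
    for index, item in enumerate(batch):
        try:
            handle(item)
            log.append({"file": item, "status": "ok", "error": None})
        except Exception as exc:                    # Keep going after a failure
            log.append({"file": item, "status": "failed", "error": str(exc)})
        if index < len(batch) - 1:
            sleep(delay_s)                          # Pause between iterations only
    return log
```

Files beyond the cap are not touched, so they stay in the source folder and are picked up by the next scheduled run.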

Expected Processing Time

  • 50 documents × 5 seconds each = 250 seconds (~4 minutes); with a 1-second delay between iterations, about 5 minutes total
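
The same estimate can be computed for other batch sizes. This is a rough sketch: estimate_seconds is a hypothetical helper, the 5 seconds per document is the figure quoted above, and actual times will vary with file size and service load.

```python
def estimate_seconds(n_docs, per_doc_s=5.0, delay_s=0.0):
    """Estimate total run time: processing time plus the pauses between items."""
    pauses = max(n_docs - 1, 0)                     # No pause after the last document
    return n_docs * per_doc_s + pauses * delay_s


without_delay = estimate_seconds(50)                # 250.0 seconds, a little over 4 minutes
with_delay = estimate_seconds(50, delay_s=1.0)      # 299.0 seconds, about 5 minutes
```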