Operations Reference¶
The DeepTagger node currently supports one primary operation: Extract Data. This page provides detailed documentation on how to use it.
Extract Data Operation¶
Extracts structured data from a document or text using a trained DeepTagger project.
Parameters¶
Operation¶
Type: Dropdown (fixed)
Value: Extract Data
Description: The type of operation to perform. Currently only data extraction is supported.
Project ID¶
Type: String
Required: Yes
Description: The ID of your trained DeepTagger project.
Format: fo_ followed by a timestamp (e.g., fo_1759714105892)
How to find:
- Go to https://deeptagger.com/das/fos
- Click on your project
- Copy the ID from the URL:
Example: fo_1759714105892
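For example, assuming the project page URL simply appends the project ID to the listing URL (the exact path pattern is illustrative and may differ in your account):
```
https://deeptagger.com/das/fos/fo_1759714105892
```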
Hint: The node also shows this information as hint text under the Project ID field.
Input Type¶
Type: Dropdown
Required: Yes
Options:
- File (default) - Upload a document file
- Text - Send raw text for extraction
Description: Determines whether you're sending a file (PDF, image) or raw text content.
When to use File:
- Processing PDFs, images, or scanned documents
- Working with binary data from previous nodes
- Handling documents uploaded via webhooks
When to use Text:
- Extracting from plain text, emails, or form submissions
- Processing text data from APIs or databases
- Handling Markdown, HTML, or other text formats
Binary Property (for File input)¶
Type: String
Default: data
Required: Yes (when Input Type = File)
Description: Name of the binary property containing the file data.
Common values:
- data (default for most nodes)
- file (some HTTP nodes)
- attachment (email nodes)
How it works: n8n passes binary data between nodes using named properties. This parameter tells the DeepTagger node where to find the file data from the previous node.
Example: If a previous node outputs binary data in a property called invoice, set this to invoice.
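If you're not sure what the property is called, a small sketch using n8n's standard Code node (run once for all items) can list the binary property names on each incoming item:
```javascript
// List the binary property names each incoming item carries,
// so you know what to enter in "Binary Property".
return $input.all().map((item) => ({
  json: { binaryProperties: Object.keys(item.binary ?? {}) },
}));
```
Each output item then shows, for example, ["data"] or ["attachment0"].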
Text (for Text input)¶
Type: Multi-line String
Required: Yes (when Input Type = Text)
Description: The raw text content to extract data from.
Usage:
- Can be hardcoded text (for testing)
- Or a dynamic expression such as {{$json["body"]}} (from the previous node)
Example:
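A minimal sketch, assuming the previous node outputs an item with a body field holding the raw text (the field name is illustrative):
```json
{
  "body": "Invoice INV-2025-001\nVendor: Acme Corporation\nTotal: $1,234.56"
}
```
With that shape, setting Text to {{$json["body"]}} passes the raw text to DeepTagger.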
Input Requirements¶
For File Input¶
The previous node must output binary data. Compatible source nodes include:
- HTTP Request - Download files from URLs
- Webhook - Receive uploaded files
- Google Drive - Read files from Drive
- Dropbox - Read files from Dropbox
- Email (IMAP) - Extract attachments
- Read Binary File - Load files from disk
- FTP - Download files via FTP
Binary data structure:
{
  "data": {
    "data": "base64encodeddata...",
    "mimeType": "application/pdf",
    "fileName": "invoice.pdf"
  }
}
For Text Input¶
The previous node must output JSON data containing text. Compatible source nodes include:
- HTTP Request - API responses with text
- Webhook - Form submissions
- Email - Email body text
- Google Sheets - Cell content
- Database - Query results
- Set - Manually set text value
JSON data structure:
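A sketch of the item shape, assuming the text arrives in a single field (the field name is illustrative):
```json
{
  "text": "Order #4521\nCustomer: Jane Doe\nTotal: $99.00"
}
```
Point the Text parameter at that field with an expression such as {{$json["text"]}}.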
Output¶
The node returns structured JSON data extracted from the document.
Success Output¶
{
  "invoice_number": "INV-2025-001",
  "date": "2025-01-15",
  "total": "$1,234.56",
  "vendor": "Acme Corporation",
  "line_items": [
    {
      "description": "Widget A",
      "quantity": 10,
      "price": "$10.00"
    },
    {
      "description": "Widget B",
      "quantity": 5,
      "price": "$20.00"
    }
  ]
}
The exact structure depends on your DeepTagger project configuration.
Error Output¶
If an error occurs (and "Continue on Fail" is enabled):
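The exact fields depend on the failure, but a sketch of what the output item may look like (the error field name and message text are illustrative):
```json
{
  "error": "Request failed: project fo_1759714105892 not found"
}
```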
Configuration Options¶
Continue on Fail¶
Location: Node settings (click the three-dot menu)
Description: If enabled, the workflow continues even if the DeepTagger node fails. The error is returned as JSON.
Use cases:
- Batch processing where some documents may fail
- Fault-tolerant workflows
- Logging errors without stopping the workflow
When enabled: The error is returned in the node's output JSON (see Error Output above) and the workflow continues with the next item.
When disabled: Workflow execution stops and shows the error message.
Usage Examples¶
Example 1: Extract Invoice Data from File Upload¶
Workflow: Webhook → DeepTagger → Google Sheets
DeepTagger Configuration:
- Operation: Extract Data
- Project ID: fo_1759714105892 (your invoice project)
- Input Type: File
- Binary Property: data
Webhook receives file upload via multipart/form-data.
DeepTagger extracts:
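Assuming an invoice project shaped like the Success Output shown above, the extracted item might look like:
```json
{
  "invoice_number": "INV-2025-001",
  "date": "2025-01-15",
  "total": "$1,234.56",
  "vendor": "Acme Corporation"
}
```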
Google Sheets appends row with extracted data.
Example 2: Extract Receipt Data from Email¶
Workflow: Email Trigger (IMAP) → Filter → DeepTagger → Airtable
DeepTagger Configuration:
- Operation: Extract Data
- Project ID: fo_1759722334567 (your receipt project)
- Input Type: File
- Binary Property: attachment0 (first attachment)
Filter ensures email has PDF attachment.
DeepTagger processes the attachment.
Airtable creates record with extracted data.
Example 3: Extract Data from Text (Form Submission)¶
Workflow: Webhook (form submission) → Set → DeepTagger → Database
Set Node formats the form data:
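A sketch of the Set node value, assuming the webhook delivers name, email, and message fields (all field names are illustrative):
```
Field: text
Value: {{$json["name"] + "\n" + $json["email"] + "\n" + $json["message"]}}
```
The DeepTagger Text parameter then reads the combined string via {{$json["text"]}}, as configured below.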
DeepTagger Configuration:
- Operation: Extract Data
- Project ID: fo_1759733445678 (your form project)
- Input Type: Text
- Text: {{$json["text"]}}
Database node inserts structured data.
Example 4: Batch Process Documents from Google Drive¶
Workflow: Google Drive List → Loop → Google Drive Download → DeepTagger → Spreadsheet → Move File
Google Drive List finds new PDFs in a folder.
Loop processes each file individually.
Google Drive Download gets the file binary data.
DeepTagger Configuration:
- Operation: Extract Data
- Project ID: fo_1759744556789 (your document project)
- Input Type: File
- Binary Property: data
Spreadsheet logs extracted data.
Move File archives processed documents.
Example 5: Extract Contract Terms with Error Handling¶
Workflow: Document source → DeepTagger → IF → success path / error path
DeepTagger Configuration:
- Operation: Extract Data
- Project ID: fo_1759755667890 (your contract project)
- Input Type: File
- Binary Property: data
- Settings: ✅ Continue on Fail
IF Node checks for errors:
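A sketch of the IF condition, assuming a failed extraction surfaces an error field as in the Error Output section (the field name is an assumption):
```
Value 1: {{$json["error"]}}
Operation: Is Empty
```
Items where error is empty take the true (success) branch; everything else takes the false (error) branch.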
Success Path: Send to Airtable, notify Slack.
Error Path: Log to error database, send alert email.
Expressions and Dynamic Values¶
Using Dynamic Project IDs¶
If you have multiple projects and want to select dynamically:
// Based on document type
{{$json["docType"] === "invoice" ? "fo_1759714105892" : "fo_1759722334567"}}
// From previous node
{{$json["projectId"]}}
// From workflow variables
{{$vars.INVOICE_PROJECT_ID}}
Using Dynamic Text¶
Extract from different sources:
// Email body
{{$json["body"]["text"]}}
// HTTP response
{{$json["content"]}}
// Database query result
{{$json["documentText"]}}
// Multiple fields concatenated
{{$json["title"] + "\n" + $json["description"]}}
Accessing Binary Data from Specific Nodes¶
Different nodes name their binary output differently:
// HTTP Request (binary)
Binary Property: data
// Email IMAP (first attachment)
Binary Property: attachment0
// Google Drive
Binary Property: data
// Read Binary File
Binary Property: data
Performance Considerations¶
Processing Time¶
- Text extraction: ~2-5 seconds
- PDF processing: ~5-15 seconds (depends on page count)
- Image processing: ~3-10 seconds (depends on resolution and complexity)
File Size Limits¶
- Maximum file size: Check your DeepTagger plan limits
- Recommended: Keep files under 10MB for best performance
- Large PDFs: Consider splitting into smaller chunks
Rate Limits¶
- Check your DeepTagger API plan for rate limits
- For high-volume workflows, implement:
    - Retry logic with exponential backoff (see the sketch after this list)
    - Queue management
    - Batch processing with delays
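A minimal sketch of the backoff idea in an n8n Code node (run once for all items), assuming you carry an attempt counter on each item and feed delaySeconds into a downstream Wait node; the field names are illustrative:
```javascript
// Compute an exponential backoff delay per item: 1s, 2s, 4s, ... capped at 60s.
return $input.all().map((item) => {
  const attempt = item.json.attempt ?? 0;
  return {
    json: {
      ...item.json,
      attempt: attempt + 1,
      delaySeconds: Math.min(2 ** attempt, 60),
    },
  };
});
```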
Best Practices¶
- Batch Processing: Add small delays between nodes when processing many documents
- Error Handling: Always enable "Continue on Fail" for batch workflows
- Caching: For identical documents, consider caching results
- Monitoring: Log all extractions for auditing and debugging
Next Steps¶
- Example Workflows - Detailed workflow examples
- Troubleshooting - Common issues and solutions
- API Reference - Direct API usage
Future Operations¶
Future versions may include:
- List Projects - Get all available projects
- Train Model - Add training examples via API
- Batch Extract - Process multiple documents in one call
- Get Extraction Status - Check async processing status
Stay tuned for updates!