API Documentation

Complete reference for the ISIN-AI API

Overview

The ISIN-AI API provides powerful document processing capabilities with configurable extraction methods.

Base URL:

https://isin-ai.com

Authentication

All API requests require authentication using an API key sent in the request header.

API Key Header

X-API-Key: YOUR_API_KEY

Include this header in all requests to authenticate. Requests without a valid API key will receive a 401 Unauthorized response.

Security Note: Keep your API key secure and never share it publicly. If your key is compromised, contact support to regenerate it.
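In Python, one convenient pattern is to attach the header to a requests.Session once, so every call through that session is authenticated (a sketch; the session approach is a suggestion, not a requirement of the API):

```python
import requests

# Reuse one session so the X-API-Key header is sent with every request
session = requests.Session()
session.headers.update({"X-API-Key": "YOUR_API_KEY"})  # Replace with your actual API key

# Every request made through this session now carries the authentication header,
# e.g. session.post("https://isin-ai.com/api/process", files=..., data=...)
```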

POST /api/process

Process a single document with configurable extraction settings

Request Parameters

Parameter       | Type        | Required | Description
----------------|-------------|----------|-------------------------------------------------------
X-API-Key       | Header      | Yes      | API authentication key
file            | File        | Yes      | PDF, PNG, JPG, or JPEG file
text_extractor  | String      | Yes      | high_quality, form, or llm
field_extractor | String      | Yes      | schema_free or schema_based
schema          | JSON String | No       | JSON schema (required if field_extractor=schema_based)
use_challenger  | Boolean     | No       | Enable challenger verification (default: false)

Python Examples

Example 1: High Quality OCR + Schema-Free (Auto-Discovery)

import requests

url = "https://isin-ai.com/api/process"
api_key = "YOUR_API_KEY"  # Replace with your actual API key
headers = {"X-API-Key": api_key}

with open("document.pdf", "rb") as f:
    files = {"file": f}
    data = {
        "text_extractor": "high_quality",
        "field_extractor": "schema_free",
        "use_challenger": False
    }
    
    response = requests.post(url, files=files, data=data, headers=headers)
    result = response.json()
    
    print("Extracted Fields:", result["extracted_fields"])

Example 2: Schema-Based Extraction (Business Permit)

import requests
import json

url = "https://isin-ai.com/api/process"
api_key = "YOUR_API_KEY"  # Replace with your actual API key
headers = {"X-API-Key": api_key}

# Define schema with fields array
schema = {
    "fields": [
        {
            "name": "business_name",
            "description": "Legal name of the business"
        },
        {
            "name": "owner_name",
            "description": "Full name of the business owner or proprietor"
        },
        {
            "name": "registration_date",
            "description": "Date when the business was registered"
        },
        {
            "name": "expiration_date",
            "description": "Date when the business permit expires"
        },
        {
            "name": "permit_number",
            "description": "Unique permit identification number"
        },
        {
            "name": "business_address",
            "description": "Physical address of the business"
        }
    ]
}

with open("business_permit.pdf", "rb") as f:
    files = {"file": f}
    data = {
        "text_extractor": "high_quality",
        "field_extractor": "schema_based",
        "schema": json.dumps(schema),
        "use_challenger": True
    }
    
    response = requests.post(url, files=files, data=data, headers=headers)
    result = response.json()
    
    print("Extracted Fields:", result["extracted_fields"])
    print("Verification:", result["challenger_results"])

Example 3: Form OCR

import requests

url = "https://isin-ai.com/api/process"
api_key = "YOUR_API_KEY"  # Replace with your actual API key
headers = {"X-API-Key": api_key}

with open("form.pdf", "rb") as f:
    files = {"file": f}
    data = {
        "text_extractor": "form",
        "field_extractor": "schema_free",
        "use_challenger": False
    }
    
    response = requests.post(url, files=files, data=data, headers=headers)
    result = response.json()
    
    print("Status:", "Success" if result["success"] else "Failed")
    print("Fields Found:", result["extracted_fields"])

Response Format

{
  "success": true,
  "extracted_text": "Full extracted text content...",
  "extracted_fields": {
    "field_name": "value",
    "another_field": "value"
  },
  "challenger_results": {  // Only if use_challenger=true
    "overall_accuracy": 95.5,
    "summary": "High confidence extraction",
    "verification_results": {
      "field_name": {
        "value": "verified_value",
        "is_accurate": true,
        "confidence": 98.0
      }
    }
  },
  "config_used": {
    "text_extractor": "high_quality",
    "field_extractor": "schema_free",
    "use_challenger": false
  },
  "processing_time_seconds": 12.5
}

POST /api/process-multiple

Process multiple documents in one request and check them for cross-document consistency

Request Parameters

Parameter       | Type        | Required | Description
----------------|-------------|----------|------------------------------------------------
X-API-Key       | Header      | Yes      | API authentication key
files           | File[]      | Yes      | Multiple PDF, PNG, JPG, or JPEG files
text_extractor  | String      | Yes      | high_quality, form, or llm
field_extractor | String      | Yes      | schema_free or schema_based
schema          | JSON String | No       | JSON schema for field extraction
use_challenger  | Boolean     | No       | Enable challenger verification (default: false)

Python Example

import requests

url = "https://isin-ai.com/api/process-multiple"
api_key = "YOUR_API_KEY"  # Replace with your actual API key
headers = {"X-API-Key": api_key}

files = [
    ("files", open("document1.pdf", "rb")),
    ("files", open("document2.pdf", "rb")),
    ("files", open("document3.pdf", "rb"))
]

data = {
    "text_extractor": "high_quality",
    "field_extractor": "schema_free",
    "use_challenger": False
}

try:
    response = requests.post(url, files=files, data=data, headers=headers)
finally:
    # Close the file handles once the upload has finished
    for _, f in files:
        f.close()

result = response.json()

print("Total Documents:", result["total_documents"])
print("Consistency Status:", result["consistency"]["status"])
print("Summary:", result["consistency"]["summary"])

# Check for inconsistencies
if result["consistency"]["inconsistencies"]:
    print("\nInconsistencies Found:")
    for issue in result["consistency"]["inconsistencies"]:
        print(f"  - {issue['severity'].upper()}: {issue['description']}")
else:
    print("\nNo inconsistencies found - all documents match!")

# Access individual document results
for doc in result["results"]:
    print(f"\nDocument: {doc['filename']}")
    print(f"Fields: {doc['extracted_fields']}")

Response Format

{
  "success": true,
  "results": [
    {
      "filename": "document1.pdf",
      "extracted_fields": {...},
      "challenger_results": {...}  // If use_challenger=true
    },
    {
      "filename": "document2.pdf",
      "extracted_fields": {...},
      "challenger_results": {...}
    }
  ],
  "consistency": {
    "status": "pass" | "fail" | "warning",
    "summary": "Overall consistency assessment",
    "inconsistencies": [
      {
        "type": "name" | "date" | "amount" | "business_name" | "other",
        "severity": "critical" | "high" | "medium" | "low",
        "description": "Detailed description",
        "affected_documents": ["document1.pdf", "document2.pdf"],
        "details": "Specific values: Document 1: X, Document 2: Y"
      }
    ]
  },
  "total_documents": 2,
  "config_used": {
    "text_extractor": "high_quality",
    "field_extractor": "schema_free",
    "use_challenger": false
  }
}

Configuration Guide

Text Extraction Methods

  • High Quality OCR (high_quality): Best quality with layout preservation, optimal for complex documents
  • Form OCR (form): Optimized for structured form documents with fields and tables
  • LLM Vision (llm): AI-powered vision model for complex or handwritten documents

Field Extraction Modes

  • Schema-Free (schema_free): Automatically discovers all key-value pairs in the document
  • Schema-Based (schema_based): Extracts only specific fields defined in the schema

Challenger Verification

Enable use_challenger=true to get:

  • Secondary LLM validation of extracted fields
  • Confidence scores for each field
  • Accuracy flags and a verification summary

Challenger verification is recommended for mission-critical applications.
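Based on the challenger_results structure in the response format above, a client might flag fields for manual review when the challenger marks them inaccurate or scores them below a threshold. A minimal sketch (the sample values and the 90.0 threshold are illustrative, not part of the API):

```python
# `result` stands in for the parsed JSON response of POST /api/process
# when use_challenger=true; the field values below are made up for illustration.
result = {
    "challenger_results": {
        "overall_accuracy": 95.5,
        "verification_results": {
            "permit_number": {"value": "BP-2024-001", "is_accurate": True, "confidence": 98.0},
            "owner_name": {"value": "J. Smith", "is_accurate": False, "confidence": 61.0},
        },
    }
}

# Collect fields that failed verification or fall below a confidence threshold
flagged = [
    name
    for name, check in result["challenger_results"]["verification_results"].items()
    if not check["is_accurate"] or check["confidence"] < 90.0
]
print("Fields needing review:", flagged)
```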

POST /api/forgery-detect

Detect image/PDF forgery using classic forensic methods or LLM vision analysis

Request Parameters (multipart/form-data)

Parameter       | Type       | Required | Description
----------------|------------|----------|---------------------------------------------------------------------------
file            | File       | Yes      | Image (PNG, JPG, JPEG, BMP, TIFF, WEBP) or single-page PDF
method          | String     | No       | Detection method: classic (default), llm, ela_llm, or both
classic_methods | JSON Array | No       | Which classic methods to run: ["ela", "noise", "clone", "exif"] (all by default)

Detection Methods

Error Level Analysis (ELA)

Re-saves the image at a known quality level and compares compression differences. Edited regions appear brighter in the ELA visualization.

Noise Analysis

Examines noise patterns across the image. Edited areas often have different noise characteristics than the original.

Clone Detection

Scans for duplicate regions using block-matching algorithms to detect copy-paste edits.

EXIF/Metadata Inspection

Checks for stripped metadata, editing software tags (e.g., Adobe Photoshop), and mismatches between camera model and image properties.

ELA + LLM Vision (ela_llm)

Runs ELA first to generate a heatmap, then sends both the original image and the ELA heatmap to an AI vision model. The LLM focuses specifically on the highlighted ELA regions to provide targeted manipulation analysis, with built-in awareness of common false positives like sharp text, holograms, and compression artifacts.

LLM Vision Analysis

AI-powered visual inspection that analyzes the image for manipulation artifacts, AI generation signs, and editing inconsistencies.

Example Request

import requests
import json

url = "https://isin-ai.com/api/forgery-detect"
api_key = "YOUR_API_KEY"
headers = {"X-API-Key": api_key}

with open("suspicious_image.jpg", "rb") as f:
    files = {"file": f}
    data = {
        "method": "classic",
        "classic_methods": json.dumps(["ela", "noise", "clone", "exif"])
    }

    response = requests.post(url, files=files, data=data, headers=headers)
    result = response.json()

    print("Success:", result["success"])
    print("Verdict:", result["results"]["verdict"])
    print("Severity:", result["results"]["overall_severity"])

Example Response

{
  "success": true,
  "method": "classic",
  "filename": "suspicious_image.jpg",
  "image_size": {"width": 1920, "height": 1080},
  "results": {
    "overall_severity": "medium",
    "verdict": "Some signs of potential manipulation found",
    "analyses": {
      "ela": {
        "method": "Error Level Analysis (ELA)",
        "severity": "medium",
        "assessment": "Moderate signs of potential editing",
        "suspicious_pixel_ratio": 0.0823,
        "ela_image_base64": "..."
      },
      "noise": {
        "method": "Noise Analysis",
        "severity": "low",
        "assessment": "Noise pattern appears consistent"
      },
      "clone": {
        "method": "Clone Detection",
        "severity": "low",
        "assessment": "No significant cloned regions detected"
      },
      "exif": {
        "method": "EXIF/Metadata Inspection",
        "severity": "high",
        "assessment": "Significant metadata issues found",
        "editing_software_detected": ["Adobe Photoshop CC 2024"]
      }
    }
  },
  "processing_time_seconds": 2.34
}

Error Handling

All endpoints return standard HTTP status codes:

  • 200 OK: Request successful
  • 400 Bad Request: Invalid parameters or file format
  • 401 Unauthorized: Missing or invalid API key
  • 500 Internal Server Error: Processing error

Error responses include a descriptive message:

{
  "detail": "Error description"
}
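One way to centralize this handling in Python is a small helper that maps a status code and parsed body to either the result or a raised error carrying the API's "detail" message (the helper name and exception choices are illustrative, not part of the API):

```python
def parse_api_response(status_code, body):
    """Return the parsed body on success, or raise with the API's detail message."""
    if status_code == 200:
        return body
    if status_code == 401:
        raise PermissionError("Invalid or missing API key")
    # 400 and 500 responses include a descriptive "detail" field
    detail = body.get("detail", "Unknown error")
    raise RuntimeError(f"API error {status_code}: {detail}")

# Usage with requests:
#   response = requests.post(url, files=files, data=data, headers=headers)
#   result = parse_api_response(response.status_code, response.json())
```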