Quick Start

1. Get your API token from the Admin Panel → Open API tab

2. Send a POST request with your PDF file:

curl -X POST https://your-domain.com/api/v1/clean \
  -H "Authorization: Bearer YOUR_API_TOKEN" \
  -F "file=@document.pdf"

3. Poll the task status until it is completed or failed:

curl https://your-domain.com/api/v1/task/{task_id} \
  -H "Authorization: Bearer YOUR_API_TOKEN"

4. Download the cleaned PDF:

curl -o cleaned.pdf https://your-domain.com/api/v1/task/{task_id}/download \
  -H "Authorization: Bearer YOUR_API_TOKEN"

Authentication

All API requests require a valid API token. Include it in one of these headers:

Method	Header	Example
Bearer Token	Authorization	Bearer pk_live_abc123...
API Key	X-API-Key	pk_live_abc123...
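
The two header styles in the table can be sketched as one small helper. This is only an illustration: the function name auth_headers and its method argument are hypothetical and not part of the API.

```python
def auth_headers(token: str, method: str = "bearer") -> dict[str, str]:
    """Return the request header that carries the API token.

    method="bearer"  -> Authorization: Bearer <token>
    method="api_key" -> X-API-Key: <token>
    """
    if not token:
        raise ValueError("token must not be empty")
    if method == "bearer":
        return {"Authorization": f"Bearer {token}"}
    if method == "api_key":
        return {"X-API-Key": token}
    raise ValueError(f"unknown auth method: {method!r}")


print(auth_headers("pk_live_abc123"))
print(auth_headers("pk_live_abc123", method="api_key"))
```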

API Endpoints

POST /api/v1/clean — Upload a PDF file for cleaning

Content-Type: multipart/form-data

Parameter	Type	Default	Description
file	File	required	PDF file to process
remove_password	bool	true	Remove password protection
remove_watermark	bool	true	Remove watermarks
remove_ad_text	bool	true	Remove advertising text
remove_ad_images	bool	true	Remove advertising images
remove_header_footer	bool	true	Remove ad headers/footers
remove_background	bool	true	Remove background watermarks
remove_first_last_ad_pages	bool	true	Remove first/last ad pages
use_ocr	bool	true	Enable OCR recognition
use_ai_detection	bool	false	Enable AI-powered detection
process_mode	string	generate_new	generate_new or edit_original
extra_ad_keywords	string	null	Comma-separated extra ad keywords (appended to global config)
extra_watermark_patterns	string	null	Comma-separated extra watermark patterns
custom_rules	string	null	Comma-separated rule names to apply (empty = all active)
callback_url	string	null	Webhook URL for completion notification
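
Because the request is multipart/form-data, every option travels as a text field. The sketch below shows one way to turn Python values into such fields. The helper build_clean_fields is hypothetical, and the "false" spelling for booleans is an assumption; only "true" appears in the cURL example of this guide.

```python
from typing import Iterable, Union

FormValue = Union[bool, str, Iterable[str], None]


def build_clean_fields(**options: FormValue) -> dict[str, str]:
    """Convert Python values into form fields for POST /api/v1/clean."""
    fields: dict[str, str] = {}
    for name, value in options.items():
        if value is None:
            continue  # omit the field so the server default applies
        if isinstance(value, bool):
            # Assumption: the server accepts the strings "true" and "false".
            fields[name] = "true" if value else "false"
        elif isinstance(value, str):
            fields[name] = value
        else:
            fields[name] = ",".join(value)  # lists become comma-separated strings
    return fields


fields = build_clean_fields(
    remove_watermark=True,
    use_ocr=False,
    extra_ad_keywords=["sponsor", "advertisement"],
    callback_url=None,
)
print(fields)
```

With the requests library, these fields could be sent next to the upload as requests.post(url, headers=..., files={"file": f}, data=fields).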

POST /api/v1/clean-url — Clean a PDF from URL

Accepts the same parameters as /clean, but takes a url field instead of file.

curl -X POST https://your-domain.com/api/v1/clean-url \
  -H "Authorization: Bearer YOUR_API_TOKEN" \
  -F "url=https://example.com/document.pdf"
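
A client can catch obviously malformed URLs before spending a request on them. The check below is a minimal sketch: looks_like_http_url is a hypothetical helper, and the server's own validation rules are not documented here, so passing this check does not guarantee the URL will be accepted.

```python
from urllib.parse import urlparse


def looks_like_http_url(url: str) -> bool:
    """Return True if url has an http(s) scheme and a host part."""
    parsed = urlparse(url)
    return parsed.scheme in ("http", "https") and bool(parsed.netloc)


for candidate in ("https://example.com/document.pdf", "document.pdf", "ftp://example.com/a.pdf"):
    print(candidate, looks_like_http_url(candidate))
```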

GET /api/v1/task/{task_id} — Get task status

Response includes download_url when status is completed. The status field is one of pending, analyzing, processing, completed, or failed.

{
  "task_id": "abc123...",
  "status": "completed",
  "progress": 100,
  "file_name": "document.pdf",
  "file_size": 1024000,
  "download_url": "https://your-domain.com/api/v1/task/abc123/download",
  "error_message": null
}
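
Polling loops are easier to reason about with a deadline. The following sketch waits for a terminal status (completed or failed) and gives up after a timeout. The function wait_for_task, its parameters, and the default interval and timeout values are assumptions for illustration; fetch_status stands in for whatever performs the GET request and returns the decoded JSON.

```python
import time
from typing import Callable

TERMINAL_STATES = {"completed", "failed"}


def wait_for_task(fetch_status: Callable[[], dict], interval: float = 2.0,
                  timeout: float = 300.0,
                  sleep: Callable[[float], None] = time.sleep) -> dict:
    """Call fetch_status() until the task is completed or failed, or the timeout elapses."""
    deadline = time.monotonic() + timeout
    while True:
        task = fetch_status()
        if task["status"] in TERMINAL_STATES:
            return task
        if time.monotonic() >= deadline:
            raise TimeoutError(f"task still {task['status']!r} after {timeout}s")
        sleep(interval)


# Demo with canned responses instead of real HTTP calls.
responses = iter([
    {"status": "pending", "progress": 0},
    {"status": "processing", "progress": 60},
    {"status": "completed", "progress": 100},
])
result = wait_for_task(lambda: next(responses), sleep=lambda _: None)
print(result["status"], result["progress"])
```

Injecting fetch_status and sleep keeps the loop testable without a network connection or real delays.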

GET /api/v1/task/{task_id}/download — Download cleaned PDF

Returns the cleaned PDF file as a binary download.
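
When saving the download, it can help to confirm that the body really is a PDF rather than, say, a JSON error message. The sketch below writes chunks to disk only after seeing the %PDF- signature that PDF files normally start with. The helper save_pdf is hypothetical, and the check is a client-side precaution, not documented API behavior.

```python
import os
import tempfile
from typing import Iterable

PDF_MAGIC = b"%PDF-"


def save_pdf(chunks: Iterable[bytes], path: str) -> int:
    """Write chunks to path and return the byte count; reject non-PDF data."""
    it = iter(chunks)
    head = b""
    for chunk in it:  # collect just enough bytes to inspect the signature
        head += chunk
        if len(head) >= len(PDF_MAGIC):
            break
    if not head.startswith(PDF_MAGIC):
        raise ValueError("response body does not look like a PDF")
    written = len(head)
    with open(path, "wb") as out:  # the file is created only after the check passes
        out.write(head)
        for chunk in it:  # continues with the chunks not yet consumed
            out.write(chunk)
            written += len(chunk)
    return written


# Demo with in-memory chunks instead of a real HTTP response.
path = os.path.join(tempfile.mkdtemp(), "cleaned.pdf")
size = save_pdf([b"%P", b"DF-1.7\n", b"...rest of file..."], path)
print(size == os.path.getsize(path))
```

With the requests library, the chunks could come from a response opened with stream=True, for example save_pdf(resp.iter_content(chunk_size=65536), "cleaned.pdf").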

Code Examples

Python

import time

import requests

API_URL = "https://your-domain.com"
TOKEN = "pk_live_your_token_here"
HEADERS = {"Authorization": f"Bearer {TOKEN}"}

# Upload and clean a PDF file
with open("document.pdf", "rb") as f:
    resp = requests.post(f"{API_URL}/api/v1/clean", headers=HEADERS, files={"file": f})
resp.raise_for_status()  # stop on 4xx/5xx instead of failing later on a missing key
task = resp.json()
task_id = task["task_id"]
print(f"Task created: {task_id}")

# Poll until the task reaches a terminal state
while task["status"] not in ("completed", "failed"):
    time.sleep(2)
    resp = requests.get(f"{API_URL}/api/v1/task/{task_id}", headers=HEADERS)
    resp.raise_for_status()
    task = resp.json()
    print(f"  Status: {task['status']} Progress: {task['progress']}%")

# Download result
if task["status"] == "completed":
    pdf = requests.get(task["download_url"], headers=HEADERS)
    pdf.raise_for_status()
    with open("cleaned.pdf", "wb") as f:
        f.write(pdf.content)
    print("Cleaned PDF saved!")
else:
    print(f"Cleaning failed: {task['error_message']}")

# --- Or clean from URL ---
resp = requests.post(f"{API_URL}/api/v1/clean-url", headers=HEADERS,
                     data={"url": "https://example.com/document.pdf"})

JavaScript (Node.js)

const fs = require('fs');
const FormData = require('form-data');
const axios = require('axios');

const API_URL = 'https://your-domain.com';
const TOKEN = 'pk_live_your_token_here';
const headers = { 'Authorization': `Bearer ${TOKEN}` };

async function cleanPDF(filePath) {
  // Upload
  const form = new FormData();
  form.append('file', fs.createReadStream(filePath));
  const { data: task } = await axios.post(`${API_URL}/api/v1/clean`, form,
    { headers: { ...headers, ...form.getHeaders() } });
  console.log('Task:', task.task_id);

  // Poll
  let status = task;
  while (!['completed', 'failed'].includes(status.status)) {
    await new Promise(r => setTimeout(r, 2000));
    status = (await axios.get(`${API_URL}/api/v1/task/${task.task_id}`, { headers })).data;
  }

  // Download
  if (status.status === 'completed') {
    const pdf = await axios.get(status.download_url, { headers, responseType: 'arraybuffer' });
    fs.writeFileSync('cleaned.pdf', pdf.data);
  }
}
cleanPDF('document.pdf').catch(console.error);

cURL

# Upload file
curl -X POST https://your-domain.com/api/v1/clean \
  -H "Authorization: Bearer pk_live_your_token" \
  -F "file=@document.pdf" \
  -F "remove_watermark=true" \
  -F "extra_ad_keywords=sponsor,advertisement"

# Check status
curl https://your-domain.com/api/v1/task/TASK_ID \
  -H "Authorization: Bearer pk_live_your_token"

# Download result
curl -o cleaned.pdf https://your-domain.com/api/v1/task/TASK_ID/download \
  -H "Authorization: Bearer pk_live_your_token"

# Clean from URL
curl -X POST https://your-domain.com/api/v1/clean-url \
  -H "Authorization: Bearer pk_live_your_token" \
  -F "url=https://example.com/document.pdf"

PHP

<?php
$apiUrl = 'https://your-domain.com';
$token = 'pk_live_your_token_here';

// Upload
$ch = curl_init("$apiUrl/api/v1/clean");
curl_setopt_array($ch, [
    CURLOPT_POST => true,
    CURLOPT_RETURNTRANSFER => true,
    CURLOPT_HTTPHEADER => ["Authorization: Bearer $token"],
    CURLOPT_POSTFIELDS => ['file' => new CURLFile('document.pdf')],
]);
$task = json_decode(curl_exec($ch), true);
curl_close($ch);

// Poll until done
do {
    sleep(2);
    $ch = curl_init("$apiUrl/api/v1/task/{$task['task_id']}");
    curl_setopt_array($ch, [CURLOPT_RETURNTRANSFER => true,
        CURLOPT_HTTPHEADER => ["Authorization: Bearer $token"]]);
    $task = json_decode(curl_exec($ch), true);
    curl_close($ch);
} while (!in_array($task['status'], ['completed', 'failed']));

// Download
if ($task['status'] === 'completed') {
    file_put_contents('cleaned.pdf',
        file_get_contents($task['download_url'], false,
            stream_context_create(['http' => ['header' => "Authorization: Bearer $token"]])));
}

Java

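import java.io.ByteArrayOutputStream;
import java.net.URI;
import java.net.http.*;
import java.nio.charset.StandardCharsets;
import java.nio.file.*;

// The statements below are written JShell-style; to compile them as a class,
// place them in a main method declared with "throws Exception".
var client = HttpClient.newHttpClient();
var token = "pk_live_your_token_here";
var apiUrl = "https://your-domain.com";

// Upload file: build the multipart/form-data body by hand
var boundary = "----Boundary" + System.currentTimeMillis();
var head = "--" + boundary + "\r\n"
    + "Content-Disposition: form-data; name=\"file\"; filename=\"doc.pdf\"\r\n"
    + "Content-Type: application/pdf\r\n\r\n";
var tail = "\r\n--" + boundary + "--\r\n";  // closing boundary

var out = new ByteArrayOutputStream();
out.write(head.getBytes(StandardCharsets.UTF_8));
out.write(Files.readAllBytes(Path.of("doc.pdf")));
out.write(tail.getBytes(StandardCharsets.UTF_8));
var fullBody = out.toByteArray();

var req = HttpRequest.newBuilder()
    .uri(URI.create(apiUrl + "/api/v1/clean"))
    .header("Authorization", "Bearer " + token)
    .header("Content-Type", "multipart/form-data; boundary=" + boundary)
    .POST(HttpRequest.BodyPublishers.ofByteArray(fullBody))
    .build();
var resp = client.send(req, HttpResponse.BodyHandlers.ofString());
// Parse JSON response for task_id, poll status, download result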

Go

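package main

import (
    "bytes"
    "encoding/json"
    "fmt"
    "io"
    "log"
    "mime/multipart"
    "net/http"
    "os"
    "time"
)

// send adds the auth header, performs the request, and exits on transport or HTTP errors.
func send(req *http.Request, token string) *http.Response {
    req.Header.Set("Authorization", "Bearer "+token)
    resp, err := http.DefaultClient.Do(req)
    if err != nil {
        log.Fatal(err)
    }
    if resp.StatusCode >= 400 {
        log.Fatalf("%s %s: HTTP %d", req.Method, req.URL, resp.StatusCode)
    }
    return resp
}

func main() {
    token := "pk_live_your_token_here"
    apiURL := "https://your-domain.com"

    // Upload: build a multipart body containing the PDF
    file, err := os.Open("document.pdf")
    if err != nil {
        log.Fatal(err)
    }
    defer file.Close()
    body := &bytes.Buffer{}
    writer := multipart.NewWriter(body)
    part, _ := writer.CreateFormFile("file", "document.pdf")
    io.Copy(part, file)
    writer.Close() // writes the closing boundary
    req, _ := http.NewRequest("POST", apiURL+"/api/v1/clean", body)
    req.Header.Set("Content-Type", writer.FormDataContentType())
    resp := send(req, token)
    var task map[string]interface{}
    json.NewDecoder(resp.Body).Decode(&task)
    resp.Body.Close()
    fmt.Println("Task:", task["task_id"])

    // Poll until the task reaches a terminal state
    for task["status"] != "completed" && task["status"] != "failed" {
        time.Sleep(2 * time.Second)
        req, _ = http.NewRequest("GET", fmt.Sprintf("%s/api/v1/task/%v", apiURL, task["task_id"]), nil)
        resp = send(req, token)
        json.NewDecoder(resp.Body).Decode(&task)
        resp.Body.Close()
    }
    if task["status"] == "failed" {
        log.Fatalf("cleaning failed: %v", task["error_message"])
    }
    // Download the cleaned PDF
    req, _ = http.NewRequest("GET", fmt.Sprint(task["download_url"]), nil)
    resp = send(req, token)
    defer resp.Body.Close()
    out, err := os.Create("cleaned.pdf")
    if err != nil {
        log.Fatal(err)
    }
    defer out.Close()
    io.Copy(out, resp.Body)
}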

Error Codes

HTTP Code	Error	Description
400	invalid_file	Only PDF files accepted
400	file_too_large	File exceeds size limit
400	invalid_url	Invalid URL provided
400	download_failed	Failed to download PDF from URL
401	missing_token	No API token provided
401	token_error	Invalid API token
403	token_error	Token disabled or expired
404	not_found	Task not found
429	token_error	Daily rate limit exceeded
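
Note that token_error appears under three different HTTP codes, so a client needs both the status code and the error string to tell the cases apart. The lookup below is a minimal sketch of that idea. The descriptions are copied from the table, while describe_error is a hypothetical helper, and how the error string is extracted from the response body is left to the caller because the error payload format is not shown in this guide.

```python
# (HTTP status, error code) -> description, copied from the table above.
ERRORS = {
    (400, "invalid_file"): "Only PDF files accepted",
    (400, "file_too_large"): "File exceeds size limit",
    (400, "invalid_url"): "Invalid URL provided",
    (400, "download_failed"): "Failed to download PDF from URL",
    (401, "missing_token"): "No API token provided",
    (401, "token_error"): "Invalid API token",
    (403, "token_error"): "Token disabled or expired",
    (404, "not_found"): "Task not found",
    (429, "token_error"): "Daily rate limit exceeded",
}


def describe_error(http_status: int, error_code: str) -> str:
    """Look up the documented meaning of an error, with a generic fallback."""
    return ERRORS.get((http_status, error_code),
                      f"Undocumented error {error_code!r} (HTTP {http_status})")


for status in (401, 403, 429):
    print(status, describe_error(status, "token_error"))
```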

Rate Limits & Token Management

Daily Limits

Each token can have a daily request limit (0 = unlimited). Once the limit is exceeded, further requests return HTTP 429.

Token Expiration

Tokens can have an expiration date. Expired tokens return HTTP 403.

Origin Restrictions

Tokens can be restricted to specific domains for additional security.

Key Regeneration

If a token is compromised, regenerate its key from the admin panel. The old key is invalidated immediately.

MCP Server (AI Integration)

PDF Cleaner Pro provides a Model Context Protocol (MCP) server that lets AI assistants call PDF cleaning tools directly.

Configuration

Add the following to your MCP client config (for example, Claude Desktop's claude_desktop_config.json):

{
  "mcpServers": {
    "pdf-cleaner": {
      "command": "python",
      "args": ["-m", "app.mcp_server"],
      "env": {
        "PDF_CLEANER_API_URL": "https://your-domain.com",
        "PDF_CLEANER_API_TOKEN": "pk_live_your_token_here"
      }
    }
  }
}
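
If the client config already lists other servers, the pdf-cleaner entry has to be merged in rather than pasted over the file. The sketch below shows that merge on plain dictionaries. The helper add_pdf_cleaner_server and the "other-server" entry are hypothetical and used only for illustration; reading and writing the actual config file is left out.

```python
import copy
import json


def add_pdf_cleaner_server(config: dict, api_url: str, token: str) -> dict:
    """Return a copy of config with the pdf-cleaner entry added under mcpServers."""
    updated = copy.deepcopy(config)  # leave the caller's dictionary untouched
    updated.setdefault("mcpServers", {})["pdf-cleaner"] = {
        "command": "python",
        "args": ["-m", "app.mcp_server"],
        "env": {
            "PDF_CLEANER_API_URL": api_url,
            "PDF_CLEANER_API_TOKEN": token,
        },
    }
    return updated


# "other-server" is a made-up pre-existing entry.
existing = {"mcpServers": {"other-server": {"command": "node", "args": ["server.js"]}}}
merged = add_pdf_cleaner_server(existing, "https://your-domain.com", "pk_live_your_token_here")
print(json.dumps(merged, indent=2))
```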

Available Tools

Tool	Description
clean_pdf_file	Clean a local PDF file
clean_pdf_url	Clean a PDF from URL
check_task_status	Check processing status
download_cleaned_pdf	Download cleaned result
