
Finance Doc Parser

Verified

by Dryade


Description

Extract structured data from financial documents: invoices, balance sheets, income statements, and bank statements

Details

Finance Document Parser

Tier: Starter | Type: Tool | Category: Finance | Version: 1.0.0

Extract structured, machine-readable data from common financial documents, including invoices, balance sheets, income statements, and bank statements. The aim is to cut manual data entry from hours to seconds.


1. Overview

  • Plugin Name: Finance Document Parser
  • Slug: finance-doc-parser
  • Required Tier: starter
  • Plugin Type: tool (REST API endpoints)
  • Category: Finance
  • Author: Dryade
  • License: DSUL

What It Does

Parses financial documents and extracts all key fields into structured JSON. Supports four document types commonly processed by finance teams: invoices, balance sheets, income statements, and bank statements.

Key Capabilities

  • Invoice parsing: vendor details, line items, amounts, VAT, payment terms
  • Balance sheet extraction: assets, liabilities, equity with full breakdown
  • Income statement parsing: revenue, expenses, margins, period data
  • Bank statement processing: transaction list, categorization, running balances
  • Batch processing for multiple documents at once

2. User Stories

Primary User Stories

US-1: Automate Invoice Data Entry

As a bookkeeper, I want to extract invoice data automatically so that I can eliminate manual data entry and reduce errors.

Acceptance Criteria:

  • [ ] Upload invoice content and receive structured JSON with all fields
  • [ ] Vendor name, amounts, line items, and dates correctly extracted
  • [ ] VAT/tax amounts identified separately

US-2: Build Financial Models from Statements

As a financial analyst, I want to parse balance sheets and income statements so that I can feed clean data into my models without manual transcription.

Acceptance Criteria:

  • [ ] Balance sheet returns assets/liabilities/equity breakdown
  • [ ] Income statement returns revenue/expenses/margins
  • [ ] Data is consistent and machine-readable

Edge Cases

  • Unsupported document type: Returns a clear error with the list of supported types
  • Malformed document content: Returns success=false with a descriptive error message
  • Empty content: Handled gracefully; in the default (mock) mode, mock data is returned
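
The edge cases above could be handled along the following lines. This is a minimal sketch, not the plugin's actual code: the function name `parse_document`, the `error` and `supported_types` response keys, and every type identifier other than `invoice` are assumptions made for illustration, and "malformed" is interpreted here as JSON-looking content that fails to decode.

```python
import json

# Assumed identifiers: only "invoice" appears in the documented request example;
# the other three are inferred from the demo file names.
SUPPORTED_TYPES = ("invoice", "balance_sheet", "income_statement", "bank_statement")


def parse_document(document_type: str, content: str) -> dict:
    """Hypothetical handler illustrating the three documented edge cases."""
    if document_type not in SUPPORTED_TYPES:
        # Unsupported type: clear error plus the list of supported types.
        return {
            "success": False,
            "error": f"Unsupported document type: {document_type!r}",
            "supported_types": list(SUPPORTED_TYPES),
        }
    if content.strip() and content.lstrip().startswith("{"):
        # Content that looks like JSON must decode; otherwise report it as malformed.
        try:
            json.loads(content)
        except json.JSONDecodeError as exc:
            return {"success": False, "error": f"Malformed document content: {exc.msg}"}
    # Empty or well-formed content: mock mode answers with demo data either way
    # (an empty dict stands in for the demo data in this sketch).
    return {"success": True, "document_type": document_type, "extracted_fields": {}}
```

Returning an error payload instead of raising keeps the response shape uniform, so a caller only needs to check `success` before reading other fields.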

3. Architecture

Component Diagram

+------------------+     +------------------+     +------------------+
|   Plugin Router  | --> |  Parse Logic     | --> |  Data Provider   |
|  /finance-doc-   |     |  routes.py       |     |  (mock / real)   |
|  parser/*        |     +------------------+     +------------------+
+------------------+            |
                          +-----v-------+
                          |  Demo Data  |
                          | data/*.json |
                          +-------------+

Components

| Component | File | Responsibility |
|-----------|------|----------------|
| Router | routes.py | API endpoints, request validation |
| Plugin | plugin.py | Plugin lifecycle, config management |
| Data | data/ | Demo datasets (5 JSON files) |

Dependencies

  • Internal: core.plugins.PluginProtocol, core.plugin_config_store.PluginConfigStore
  • External: None (standard library only in mock mode)
  • Plugin: None

4. API Spec

REST Endpoints

| Method | Path | Description | Auth |
|--------|------|-------------|------|
| POST | /api/plugins/finance-doc-parser/parse | Parse a single document | Yes |
| POST | /api/plugins/finance-doc-parser/batch | Parse multiple documents | Yes |
| GET | /api/plugins/finance-doc-parser/supported-types | List supported document types | No |
| GET | /api/plugins/finance-doc-parser/status | Health check | No |

Request/Response Examples

Parse Document

// Request
{
    "document_type": "invoice",
    "content": "<invoice text or JSON>"
}

// Response
{
    "success": true,
    "document_type": "invoice",
    "extracted_fields": {
        "vendor": {"name": "Dupont Technologies SAS"},
        "total_amount": 30216.00,
        "currency": "EUR"
    },
    "summary": "Invoice from Dupont Technologies SAS for EUR 30,216.00",
    "confidence": 0.95
}
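
On the consuming side, a response like the one above can be loaded into a small typed container before it is passed downstream. This is a sketch under assumptions: the `ParseResult` class and the `REVIEW_THRESHOLD` value are hypothetical and not part of the plugin; only the field names come from the example response.

```python
import json
from dataclasses import dataclass


@dataclass
class ParseResult:
    """Hypothetical client-side container mirroring the example response fields."""
    success: bool
    document_type: str
    extracted_fields: dict
    summary: str
    confidence: float


# The example response body, as it would arrive over the wire.
raw = """
{
    "success": true,
    "document_type": "invoice",
    "extracted_fields": {
        "vendor": {"name": "Dupont Technologies SAS"},
        "total_amount": 30216.00,
        "currency": "EUR"
    },
    "summary": "Invoice from Dupont Technologies SAS for EUR 30,216.00",
    "confidence": 0.95
}
"""

result = ParseResult(**json.loads(raw))

# Hypothetical policy: route failed or low-confidence extractions to manual review.
REVIEW_THRESHOLD = 0.90
needs_review = (not result.success) or result.confidence < REVIEW_THRESHOLD

vendor_name = result.extracted_fields["vendor"]["name"]
```

Unpacking with `**` fails fast if the response gains or loses a top-level field, which is convenient for catching schema drift early but may be too strict for a production client.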


5. Data Flow

Processing Pipeline

1. User submits document via POST /parse or /batch
2. Router validates request against Pydantic models
3. Mock mode loads pre-parsed data from data/ directory
4. Summarizer generates human-readable summary
5. Response returned with extracted fields and confidence score
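
The five steps above can be sketched end to end as follows. This is an illustrative outline, not the code in routes.py: a plain dataclass stands in for the Pydantic request model, an in-memory dictionary stands in for the files in data/, and the names `ParseRequest`, `summarize`, and `handle_parse` are hypothetical. Handling of unsupported types is omitted for brevity.

```python
from dataclasses import dataclass

# Stand-in for the pre-parsed files in data/; values copied from the response example.
MOCK_DATA = {
    "invoice": {
        "vendor": {"name": "Dupont Technologies SAS"},
        "total_amount": 30216.00,
        "currency": "EUR",
    },
}


@dataclass
class ParseRequest:
    """Plain-dataclass stand-in for the plugin's Pydantic request model."""
    document_type: str
    content: str

    def __post_init__(self) -> None:
        if not isinstance(self.document_type, str) or not isinstance(self.content, str):
            raise TypeError("document_type and content must be strings")


def summarize(document_type: str, fields: dict) -> str:
    """Step 4: build a human-readable summary (only invoices are handled here)."""
    if document_type == "invoice":
        return (
            f"Invoice from {fields['vendor']['name']} "
            f"for {fields['currency']} {fields['total_amount']:,.2f}"
        )
    return f"Parsed {document_type}"


def handle_parse(payload: dict) -> dict:
    request = ParseRequest(**payload)                    # step 2: validate the request
    fields = MOCK_DATA[request.document_type]            # step 3: load pre-parsed mock data
    summary = summarize(request.document_type, fields)   # step 4: summarize
    return {                                             # step 5: build the response
        "success": True,
        "document_type": request.document_type,
        "extracted_fields": fields,
        "summary": summary,
        "confidence": 0.95,  # fixed value copied from the example, not computed
    }


response = handle_parse({"document_type": "invoice", "content": "<invoice text or JSON>"})
```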

Demo Data Description

The data/ directory contains:

  • sample_invoice.json: French vendor invoice with 3 line items (EUR 30,216)
  • sample_balance_sheet.json: Q3 2025 balance sheet (EUR 5.4M total assets)
  • sample_income_statement.json: Q3 2025 P&L (EUR 2.5M revenue, 9.8% net margin)
  • sample_bank_statement.json: October 2025 bank statement (15 transactions)
  • sample_invoice_batch.json: 3 invoices for batch processing demo

Total: 5 demo files covering all supported document types.
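
A quick way to sanity-check the demo data is to load all five files and confirm each one is valid JSON. The sketch below assumes the files sit directly in a data/ directory whose path you supply; the function name `load_demo_data` is hypothetical, and no assumption is made about the structure inside each file.

```python
import json
from pathlib import Path

# File names as listed in the demo data description.
DEMO_FILES = (
    "sample_invoice.json",
    "sample_balance_sheet.json",
    "sample_income_statement.json",
    "sample_bank_statement.json",
    "sample_invoice_batch.json",
)


def load_demo_data(data_dir: Path) -> dict[str, object]:
    """Load every demo file, failing loudly if one is missing or not valid JSON."""
    loaded: dict[str, object] = {}
    for name in DEMO_FILES:
        with (data_dir / name).open(encoding="utf-8") as handle:
            loaded[name] = json.load(handle)
    return loaded
```

For example, `load_demo_data(Path("data"))` run from the plugin directory would raise `FileNotFoundError` or `json.JSONDecodeError` on the first problematic file.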


6. Security Considerations

Data Handling

  • PII: No - demo data uses fictional companies only
  • Encryption: N/A in mock mode; real mode should use HTTPS for API calls
  • Data Retention: No data persisted; stateless request/response

External API Keys

No external API keys required in mock mode.

Isolation

  • Plugin runs in sandboxed context via core plugin loader
  • No direct database access -- uses core API only
  • No file writes outside plugin directory

7. Test Plan

Test Classes

| Class | Tests | Coverage Target |
|-------|-------|-----------------|
| TestPluginAttributes | 6 | 100% manifest fields |
| TestPluginRouter | 5 | All routes |
| TestPluginConfig | 2 | Config validation |
| TestDemoData | 8 | All data files |

Running Tests

cd dryade-plugins
python -m pytest starter/finance_doc_parser/tests/ -x -v --tb=short

8. Deployment Notes

Requirements

No additional packages required beyond Dryade core.

Configuration

Default plugin configuration (set via plugin settings UI or API):

{
    "data_source": "mock"
}

Compatibility

  • Min Dryade Version: 1.0.0
  • Python: >=3.11

9. User Guide

Getting Started

  1. Ensure your Dryade instance has a starter tier license or higher
  2. Install the plugin via the marketplace or dryade-pm push
  3. Use the /parse endpoint to extract data from a financial document
  4. Use /batch for processing multiple documents at once

Common Workflows

Workflow 1: Parse a Single Invoice

  1. POST to /parse with document_type: "invoice" and invoice content
  2. Receive structured JSON with vendor, line items, amounts
  3. Feed extracted data into your accounting system

Workflow 2: Batch Process Bank Statements

  1. POST to /batch with array of bank statement documents
  2. Receive per-document results with transaction lists
  3. Reconcile against internal records
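
The batch workflow might be scripted as in the sketch below. The batch request and response schemas are not documented on this page, so the `documents` and `results` keys, the `bank_statement` type identifier, and both helper functions are assumptions for illustration only; check the plugin's actual schema before relying on them.

```python
def build_batch_payload(contents: list[str], document_type: str = "bank_statement") -> dict:
    """Wrap several documents in one batch request body (assumed 'documents' key)."""
    return {
        "documents": [
            {"document_type": document_type, "content": content} for content in contents
        ]
    }


def split_results(batch_response: dict) -> tuple[list[dict], list[dict]]:
    """Separate successful per-document results from failed ones (assumed 'results' key)."""
    results = batch_response.get("results", [])
    succeeded = [item for item in results if item.get("success")]
    failed = [item for item in results if not item.get("success")]
    return succeeded, failed


payload = build_batch_payload(["<statement 1>", "<statement 2>"])

# Hand-written stand-in for a batch response, used only to exercise split_results.
example_response = {
    "results": [
        {"success": True, "document_type": "bank_statement"},
        {"success": False, "error": "Malformed document content"},
    ]
}
succeeded, failed = split_results(example_response)
```

Keeping failed documents in a separate list makes it straightforward to retry or escalate them while reconciliation proceeds on the successful ones.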

10. Screenshots

Screenshots will be added after UI integration.


11. Changelog

1.0.0 (2026-03-05)

  • Initial release
  • Invoice, balance sheet, income statement, and bank statement parsers
  • Mock data with 5 sample documents
  • Batch processing endpoint
  • Supported types discovery endpoint

Future Roadmap

  • [ ] Real-mode parsing with LLM extraction
  • [ ] PDF document support
  • [ ] Multi-currency normalization
  • [ ] Historical document comparison

Plugin Info

Version 1.0.0
Author Dryade
Tier starter
Category industry-verticals
Type backend
Downloads 0
Updated Mar 15, 2026

Tags

starterfinancedocparser