BlobStore
Contents
- 1 GnuCash Paperless-ngx Integration
- 1.1 Overview
- 1.2 Configuration
- 1.3 API Layer
- 1.4 Storage Model
- 1.5 Attachment Workflow
- 1.6 KVP Structure
- 1.7 Detachment & Lifecycle
- 1.8 UI Components
- 1.9 Error Handling
- 1.10 Implementation Roadmap
- 1.11 Paperless API Reference (Minimal)
- 1.12 Configuration File Format (Optional)
- 1.13 Security Considerations
- 1.14 Testing Strategy
- 1.15 Open Questions
- 1.16 Migration (If Existing Blobs)
- 1.17 Appendix: Sample KVP After Attach
- 1.18 Conclusion
GnuCash Paperless-ngx Integration
Overview
Integrate GnuCash with Paperless-ngx for document management. PDFs and images attached to transactions and invoices are uploaded to Paperless-ngx, and only the resulting document ID is stored in KVP. Launching an attachment retrieves the list of attached documents from Paperless, with options to view an individual document or open the Paperless document management UI.
Key properties:
- Upload via Paperless API at document attach time
- Store Paperless `document_id` in transaction/invoice KVP
- Launchable: fetch attached doc list, view doc or open Paperless UI
- Configurable per-book or global: hostname, API port, authentication
- No local blob storage; full outsourcing to Paperless
Configuration
Global Preferences (GConf/gsettings)
Path: org.gnucash.general.paperless
paperless_enabled   : boolean = false
paperless_hostname  : string  = "localhost"
paperless_port      : integer = 8000
paperless_api_token : string  = ""    [encrypted in gconf]
paperless_use_https : boolean = false
Example:
paperless_hostname  = "paperless.example.com"
paperless_port      = 8000
paperless_api_token = "abc123def456..."
paperless_use_https = true
Per-Book Override (Optional)
Book KVP at /kvp/paperless-config/:
/kvp/paperless-config/hostname  → "paperless.example.com"
/kvp/paperless-config/port      → "8000"
/kvp/paperless-config/api_token → "abc123def456..."
/kvp/paperless-config/use_https → "true"
Priority: Per-book config overrides global prefs (allows multi-Paperless setups).
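The priority rule can be sketched as a small merge function. This is only an illustration: `PaperlessConfig` and `resolve_config` are hypothetical names, the book KVP is modeled as a plain string map keyed by the last path component, and the field-by-field fallback (a book may override just the hostname, say) is an assumption rather than something this design specifies.

```cpp
#include <map>
#include <string>

// Hypothetical value type mirroring the global preference keys.
struct PaperlessConfig {
    std::string hostname = "localhost";
    int port = 8000;
    std::string api_token;
    bool use_https = false;
};

// Merges per-book overrides (stored as strings, as in the KVP listing) over
// the global preferences. Keys missing from the book keep the global value.
// Note: std::stoi throws if the stored port is not a number.
PaperlessConfig resolve_config(
        const PaperlessConfig& global,
        const std::map<std::string, std::string>& book_kvp) {
    PaperlessConfig out = global;
    auto it = book_kvp.find("hostname");
    if (it != book_kvp.end()) out.hostname = it->second;
    it = book_kvp.find("port");
    if (it != book_kvp.end()) out.port = std::stoi(it->second);
    it = book_kvp.find("api_token");
    if (it != book_kvp.end()) out.api_token = it->second;
    it = book_kvp.find("use_https");
    if (it != book_kvp.end()) out.use_https = (it->second == "true");
    return out;
}
```

With a book that sets only `hostname` and `use_https`, the port and token would still come from the global preferences.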
Configuration UI
New tab in Edit → Preferences → Paperless:
┌─ Preferences: Paperless Integration ─────────────┐
│                                                  │
│ ☑ Enable Paperless integration                   │
│                                                  │
│ Hostname:  [ localhost          ]                │
│ Port:      [ 8000               ]                │
│ Use HTTPS: ☐                                     │
│                                                  │
│ API Token: [ •••••••••••••••••• ]                │
│            [ Show ] [ Generate... ]              │
│                                                  │
│ [ Test Connection ]                              │
│                                                  │
│ ☐ Override per-book (advanced)                   │
│                                                  │
│ [ OK ] [ Cancel ] [ Help ]                       │
└──────────────────────────────────────────────────┘
Test Connection button: GET /api/documents/ with no filter and the token in the Authorization header; show "✓ Connected" on 200 OK, otherwise the error message.
---
API Layer
PaperlessClient Class
C++ wrapper around Paperless REST API.
#include <glib.h>   // gboolean, gint, GList
#include "qof.h"    // QofBook

class PaperlessClient {
public:
    // Lifecycle
    explicit PaperlessClient(QofBook* book);
    ~PaperlessClient();

    // Connection & auth
    gboolean test_connection(char** out_error);
    // -> GET /api/documents/ with token in Authorization header
    // Returns TRUE if 200 OK, FALSE + error message otherwise

    // Upload a document
    gint upload_document(const char* filepath, const char* title,
                         const char* filename, char** out_error);
    // -> POST /api/documents/upload/ with file in multipart/form-data
    // Optional: query params ?title=... &filename=...
    // Returns document_id (int) on success, -1 on error + out_error
    // Blocks until upload completes (may be slow for large PDFs)

    // Retrieve document info
    gboolean get_document_info(gint doc_id,
                               char** out_title, char** out_filename,
                               char** out_error);
    // -> GET /api/documents/{id}/
    // Returns title and original_filename for UI display

    // Build URLs for user
    char* get_document_url(gint doc_id);
    // -> "https://paperless.example.com:8000/documents/{id}/"
    char* get_management_url();
    // -> "https://paperless.example.com:8000/documents/"

    // List documents (optional, for future search/filter UI)
    GList* list_documents(const char* query, char** out_error);
    // -> GET /api/documents/?query=...
    // Returns list of (doc_id, title, filename) tuples
};
Implementation notes:
- Use libcurl for HTTP requests
- Parse JSON responses with json-glib or similar
- Store API token in memory (gconf stores encrypted); clear on shutdown
- Calls are synchronous and block the main thread; wrap them in a progress dialog with a timeout if that becomes a concern
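The URL shapes mentioned in the class outline and in the workflow sections can be sketched with plain string helpers. The function names below are hypothetical stand-ins for the corresponding methods, and the paths are simply the ones used elsewhere in this document; nothing here talks to a server.

```cpp
#include <string>

// Builds "http(s)://host:port" from the configured connection settings.
std::string base_url(const std::string& hostname, int port, bool use_https) {
    return std::string(use_https ? "https" : "http") + "://" + hostname + ":" +
           std::to_string(port);
}

// Web UI page for a single document, e.g. ".../documents/12345/".
std::string document_url(const std::string& base, int doc_id) {
    return base + "/documents/" + std::to_string(doc_id) + "/";
}

// Web UI document list, ".../documents/".
std::string management_url(const std::string& base) {
    return base + "/documents/";
}

// Download endpoint, e.g. ".../api/documents/12345/download/".
std::string download_url(const std::string& base, int doc_id) {
    return base + "/api/documents/" + std::to_string(doc_id) + "/download/";
}

// Authorization header line sent with every API request in this design.
std::string auth_header(const std::string& api_token) {
    return "Authorization: Token " + api_token;
}
```

Keeping these as pure functions makes them easy to unit-test without libcurl or a running Paperless instance.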
---
Storage Model
Transaction-Level Attachment
Paperless document ID stored in transaction KVP:
/kvp/attachments/<index> → "doc_id:12345"
Or (simpler):
/kvp/paperless_docs → [12345, 12346, ...] (JSON array of doc IDs)
Design decision: Store only the document_id integer in KVP. Title, filename, and URL are fetched from Paperless on demand (live data; user can rename in Paperless and GnuCash reflects it).
Invoice-Level Attachment
Same pattern for invoices (if GnuCash models invoices as objects with KVP):
/kvp/paperless_docs → [12345]
Or per-line-item:
/kvp/paperless_line_items/<item-id>/doc_id → "12346"
---
Attachment Workflow
User Attaches PDF to Transaction
- User clicks Attach Document in transaction editor
- File picker dialog opens
- User selects `invoice_2024_q1.pdf`
- Dialog: optional title override for Paperless (default: filename)
Title in Paperless: [ invoice_2024_q1.pdf ]
- User clicks "Attach"
- Blocking upload:
  * Progress dialog: "Uploading to Paperless..." with cancel button
  * Call `paperless_client->upload_document()`
  * Paperless returns `doc_id = 12345`
  * Store in transaction KVP: `/kvp/paperless_docs` → `[12345]`
  * Close dialog
- UI shows attachment icon next to transaction (same as before)
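A minimal sketch of the attach steps, with the upload injected as a callback so the logic can be exercised without a network. `UploadFn`, `default_title`, and `attach_file` are hypothetical names; the callback mirrors the return convention of `upload_document()` (document ID, or -1 plus an error message). The rule that the ID list is only modified after a successful upload is an assumption consistent with the steps above.

```cpp
#include <functional>
#include <string>
#include <vector>

// Hypothetical upload callback: returns the Paperless document ID, or -1 and
// fills `error` on failure.
using UploadFn = std::function<int(const std::string& filepath,
                                   const std::string& title,
                                   std::string& error)>;

// Default title: the file name without its directory part.
std::string default_title(const std::string& filepath) {
    std::string::size_type slash = filepath.find_last_of("/\\");
    return slash == std::string::npos ? filepath : filepath.substr(slash + 1);
}

// Uploads the file and, only on success, appends the returned ID to the
// transaction's list. An empty `title` means "use the file name".
// Returns the new document ID, or -1 if the upload failed.
int attach_file(const UploadFn& upload, const std::string& filepath,
                const std::string& title, std::vector<int>& doc_ids,
                std::string& error) {
    const std::string effective =
        title.empty() ? default_title(filepath) : title;
    int doc_id = upload(filepath, effective, error);
    if (doc_id < 0) return -1;  // leave the stored list untouched on failure
    doc_ids.push_back(doc_id);
    return doc_id;
}
```

In the real dialog the callback would wrap the blocking client call, and the caller would write `doc_ids` back to `/kvp/paperless_docs` before closing the progress dialog.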
User Views/Launches Attachment
- User clicks attachment icon (📎) in transaction view
- Blob Viewer Dialog opens:
┌─ Transaction Attachments ──────────────────┐
│ TX: 2024-01-15 Invoice from Acme Corp      │
│                                            │
│ Attached Documents:                        │
│ ─────────────────────────────────────────  │
│ ☑ [12345] invoice_2024_q1.pdf              │
│   [View] [Manage in Paperless]             │
│                                            │
│ ☑ [12346] receipt_support.jpg              │
│   [View] [Manage in Paperless]             │
│                                            │
│ [ + Attach New ] [ - Remove ] [Close]      │
└────────────────────────────────────────────┘
- User clicks [View] → opens Paperless download URL in browser/PDF viewer
https://paperless.example.com:8000/api/documents/12345/download/
- User clicks [Manage in Paperless] → opens document edit UI in browser
https://paperless.example.com:8000/documents/12345/
- User clicks [+ Attach New] → repeats attach workflow above
- User clicks [- Remove] for a doc → detaches it from the transaction (removes the doc ID from the transaction's KVP; does NOT delete the document from Paperless)
---
KVP Structure
Minimal Design (Recommended)
Transaction KVP:
/kvp/paperless_docs → JSON array: "[12345, 12346, ...]"
Rationale:
- Simple, flat structure
- No need to track per-doc metadata (fetch from Paperless)
- No orphan cleanup logic (docs live in Paperless independently)
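To illustrate how little code the flat layout needs, here is a dependency-free sketch of reading and writing the stored value. A real implementation would more likely use json-glib as suggested in the implementation notes; `serialize_doc_ids` and `parse_doc_ids` are hypothetical helpers, and the parser is intentionally lenient rather than a JSON validator.

```cpp
#include <cctype>
#include <cstddef>
#include <string>
#include <vector>

// Serializes doc IDs as a JSON array string, e.g. "[12345, 12346]".
std::string serialize_doc_ids(const std::vector<int>& ids) {
    std::string out = "[";
    for (std::size_t i = 0; i < ids.size(); ++i) {
        if (i > 0) out += ", ";
        out += std::to_string(ids[i]);
    }
    return out + "]";
}

// Collects every run of decimal digits in the stored string. This accepts
// both [12345] and ["12345"], but does not validate JSON syntax and does not
// guard against integer overflow on absurdly long digit runs.
std::vector<int> parse_doc_ids(const std::string& text) {
    std::vector<int> ids;
    std::size_t i = 0;
    while (i < text.size()) {
        if (std::isdigit(static_cast<unsigned char>(text[i]))) {
            int value = 0;
            while (i < text.size() &&
                   std::isdigit(static_cast<unsigned char>(text[i]))) {
                value = value * 10 + (text[i] - '0');
                ++i;
            }
            ids.push_back(value);
        } else {
            ++i;
        }
    }
    return ids;
}
```

Round-tripping a list through these two functions should return the same IDs in the same order.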
Alternative: Verbose Design
/kvp/paperless/<doc_id>/uploaded_at → "2024-01-15T10:30:00Z"
/kvp/paperless/<doc_id>/local_title → "Original filename" (optional)
(More complex; unlikely needed for MVP.)
Detachment & Lifecycle
User Detaches Document
- User clicks [- Remove] in Blob Viewer dialog
- Local KVP entry removed: `/kvp/paperless_docs` updates to remove `doc_id`
- Document stays in Paperless (user must delete manually there if desired)
- Transaction saved to disk
Rationale: Paperless is the source of truth. GnuCash only tracks which Paperless docs are relevant to a transaction. User can manage doc lifecycle in Paperless separately (e.g., re-use a doc across multiple transactions).
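Detaching is therefore a purely local list operation. A minimal sketch, assuming the doc IDs have already been parsed from KVP into a vector (`detach_document` is a hypothetical helper):

```cpp
#include <algorithm>
#include <vector>

// Removes doc_id from the transaction's list of attached documents.
// Returns true if the list changed, so the caller knows whether the
// transaction needs to be saved. Nothing is sent to Paperless.
bool detach_document(std::vector<int>& doc_ids, int doc_id) {
    auto new_end = std::remove(doc_ids.begin(), doc_ids.end(), doc_id);
    bool changed = (new_end != doc_ids.end());
    doc_ids.erase(new_end, doc_ids.end());
    return changed;
}
```

Removing an ID that is not in the list is a no-op and returns false.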
Book Close
- No special cleanup needed
- KVP with doc IDs persists
Paperless Unreachable
- At attach time: if Paperless unreachable, error dialog; user retries or cancels
- At view time: if Paperless unreachable, error dialog; doc URL shown but not accessible
- KVP reference remains; will work again when Paperless comes back up
UI Components
Transaction Editor
Add Attachments section below date/description/amount:
┌─ Transaction Editor ────────────────────┐
│ Date:    [ 2024-01-15 ]                 │
│ Account: [ Assets:Bank ]                │
│ Memo:    [ Invoice from Acme ]          │
│ Amount:  [ 1000.00 ]                    │
│                                         │
│ Attachments:                            │
│   📎 [1] invoice_2024_q1.pdf  [✕]        │
│   [ + Add Document ]                    │
│                                         │
│ [ OK ] [ Cancel ]                       │
└─────────────────────────────────────────┘
Clicking on the PDF → opens Blob Viewer dialog. Clicking [✕] → removes from KVP. Clicking [+ Add Document] → file picker + upload.
Blob Viewer Dialog
Standalone modal (reusable for transactions, invoices, splits):
┌─ Attachments ──────────────────────────────┐
│ Parent: TX 2024-01-15 Acme Invoice         │
│                                            │
│ Documents:                                 │
│ ┌────────────────────────────────────────┐ │
│ │ [12345] invoice_2024_q1.pdf            │ │
│ │ ┌────────────────────────────────────┐ │ │
│ │ │ [View] [Open in Paperless] [Remove]│ │ │
│ │ └────────────────────────────────────┘ │ │
│ │                                        │ │
│ │ [12346] receipt_support.jpg            │ │
│ │ ┌────────────────────────────────────┐ │ │
│ │ │ [View] [Open in Paperless] [Remove]│ │ │
│ │ └────────────────────────────────────┘ │ │
│ └────────────────────────────────────────┘ │
│                                            │
│ [ + Attach New ]                   [Close] │
└────────────────────────────────────────────┘
[View] → PaperlessClient::get_document_url() → open in browser/viewer
[Open in Paperless] → PaperlessClient::get_management_url() → doc edit page
[Remove] → delete from KVP array
[+ Attach New] → file picker + upload workflow
Preferences Dialog
(See Configuration section above.)
Error Handling
Upload Failure
┌─ Upload Error ────────────────────┐
│ Failed to upload to Paperless:    │
│                                   │
│ [Connection refused]              │
│ (Is Paperless running?)           │
│                                   │
│ Hostname: paperless.example.com   │
│ Port: 8000                        │
│                                   │
│ [ Retry ] [ Cancel ]              │
└───────────────────────────────────┘
Common errors:
- Connection refused → Paperless not running
- 401 Unauthorized → invalid API token
- 413 Payload Too Large → file too big
- Timeout → slow network / large file
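The mapping from failure to dialog hint could look roughly like this. `Transport` and `describe_upload_error` are hypothetical; a libcurl-based client would translate its own result codes into something equivalent before calling it.

```cpp
#include <string>

// Hypothetical transport-level outcome, independent of any HTTP library.
enum class Transport { Ok, ConnectionRefused, Timeout };

// Maps a failed request to the hint shown in the error dialog.
// `http_status` is only consulted when the transport itself succeeded.
std::string describe_upload_error(Transport transport, int http_status) {
    if (transport == Transport::ConnectionRefused)
        return "Connection refused (is Paperless running?)";
    if (transport == Transport::Timeout)
        return "Timed out (slow network or large file)";
    switch (http_status) {
        case 401: return "Unauthorized (invalid API token)";
        case 413: return "Payload too large (file too big)";
        default:  return "Unexpected HTTP status " + std::to_string(http_status);
    }
}
```

Statuses not listed above fall through to a generic message that still shows the numeric code, which keeps bug reports useful.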
Missing Configuration
- User clicks "Attach Document" but Paperless is disabled
- Dialog: "Paperless integration not enabled. Configure in Preferences."
- Offer quick link to Preferences
Stale Document References
- User opens transaction with doc_id that no longer exists in Paperless
- Blob Viewer shows: "[12345] (document not found)"
- [View] and [Open in Paperless] buttons disabled
- User can still [Remove] the KVP entry locally
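One way to express the stale-reference behaviour is a small view model computed per row. `AttachmentRow` and `make_row` are hypothetical, and `found` stands for whatever the client reports when it looks the document up.

```cpp
#include <string>

// Hypothetical view model for one row of the Blob Viewer.
struct AttachmentRow {
    std::string label;
    bool view_enabled;
    bool open_in_paperless_enabled;
    bool remove_enabled;
};

// Builds the row for a document ID. `found` is false when Paperless reports
// that the document no longer exists; `filename` is ignored in that case.
AttachmentRow make_row(int doc_id, bool found, const std::string& filename) {
    AttachmentRow row;
    const std::string id = "[" + std::to_string(doc_id) + "] ";
    row.label = id + (found ? filename : std::string("(document not found)"));
    row.view_enabled = found;
    row.open_in_paperless_enabled = found;
    row.remove_enabled = true;  // removal only touches the local KVP entry
    return row;
}
```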
---
Implementation Roadmap
Phase 1: Core
- PaperlessClient: `test_connection()`, `upload_document()`, `get_document_url()`
- Preferences UI + gconf storage
- Transaction editor: [+ Attach] button + file picker
- Minimal Blob Viewer: list docs with [View] and [Remove] buttons
- KVP: `/kvp/paperless_docs` → JSON array
Phase 2: Polish
- [Open in Paperless] button (management UI link)
- Invoice attachment support
- Upload progress dialog with cancel
- Error messages + retry logic
- Paperless doc title caching in KVP (optional metadata)
Phase 3: Future
- Search/filter attached docs by title
- Drag-and-drop attachment to transactions
- Paperless tag integration (tag tx based on Paperless tags)
- Bulk upload from file picker
- Scheduled sync: periodic check for orphaned docs
---
Paperless API Reference (Minimal)
Test Connection
```
GET /api/documents/?page_size=1

Headers:
  Authorization: Token <api_token>

Response (200 OK):
{
  "count": 42,
  "results": [...]
}
```
Upload Document
```
POST /api/documents/upload/

Headers:
  Authorization: Token <api_token>

Body (multipart/form-data):
  document=<binary file>
  title=<optional string>
  filename=<original filename>

Response (200 OK):
{
  "id": 12345,
  "title": "invoice_2024_q1",
  "original_file_name": "invoice_2024_q1.pdf",
  ...
}
```
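For readers unfamiliar with multipart/form-data, the request body described above can be assembled by hand as sketched below. The field names `document`, `title`, and `filename` are taken from the request description above; `build_upload_body` is a hypothetical helper, and in practice libcurl's MIME API would normally generate this body (including a safe boundary) instead.

```cpp
#include <string>

// Builds a multipart/form-data body with two text fields and one file part.
// Assumptions: `boundary` does not occur inside `file_bytes`, and `filename`
// contains no double quotes (no escaping is done here).
std::string build_upload_body(const std::string& boundary,
                              const std::string& title,
                              const std::string& filename,
                              const std::string& file_bytes) {
    const std::string crlf = "\r\n";
    std::string body;

    // Appends one plain text form field.
    auto text_field = [&](const std::string& name, const std::string& value) {
        body += "--" + boundary + crlf;
        body += "Content-Disposition: form-data; name=\"" + name + "\"" + crlf;
        body += crlf + value + crlf;
    };
    text_field("title", title);
    text_field("filename", filename);

    // File part: raw bytes follow the blank line after the part headers.
    body += "--" + boundary + crlf;
    body += "Content-Disposition: form-data; name=\"document\"; filename=\"" +
            filename + "\"" + crlf;
    body += "Content-Type: application/octet-stream" + crlf + crlf;
    body += file_bytes + crlf;

    // Closing delimiter.
    body += "--" + boundary + "--" + crlf;
    return body;
}
```

The matching request header would be `Content-Type: multipart/form-data; boundary=<boundary>`, sent alongside the Authorization header.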
Get Document Info
```
GET /api/documents/{id}/

Headers:
  Authorization: Token <api_token>

Response (200 OK):
{
  "id": 12345,
  "title": "invoice_2024_q1",
  "original_file_name": "invoice_2024_q1.pdf",
  "created": "2024-01-15T10:30:00Z",
  "updated": "2024-01-15T10:30:00Z",
  ...
}
```
Download Document
```
GET /api/documents/{id}/download/

Response (200 OK):
  <binary PDF/image content>

Or redirect to:
  /documents/{id}/ (web UI)
```
Document Management UI
```
https://<hostname>:<port>/documents/{id}/
```
Allows user to edit title, tags, archive, delete, etc.
---
Configuration File Format (Optional)
If per-book config stored in KVP becomes unwieldy, alternative: plaintext config file.
**File:** `<book-path>.gnucash.paperless`
```ini
[paperless]
hostname = paperless.example.com
port = 8000
use_https = true
api_token = abc123def456...
```
**Loader:**
```cpp
gboolean load_paperless_config(const char* book_path, PaperlessConfig* cfg);
// Tries <book-path>.paperless first; falls back to gconf global
```
(Avoids cluttering book KVP but requires file management; recommend KVP for simplicity.)
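If the file-based option were chosen, the parsing half of the loader might look like the sketch below. It works on the file contents as a string so it stays independent of GLib; `PaperlessConfig` and `parse_paperless_ini` are hypothetical, and the supported syntax is limited to the `key = value` form shown in the sample file.

```cpp
#include <sstream>
#include <string>

// Hypothetical config struct; defaults mirror the global preferences.
struct PaperlessConfig {
    std::string hostname = "localhost";
    int port = 8000;
    std::string api_token;
    bool use_https = false;
};

// Strips leading and trailing whitespace.
static std::string trim(const std::string& s) {
    const char* ws = " \t\r\n";
    std::string::size_type b = s.find_first_not_of(ws);
    if (b == std::string::npos) return "";
    std::string::size_type e = s.find_last_not_of(ws);
    return s.substr(b, e - b + 1);
}

// Reads "key = value" lines inside the [paperless] section. Other sections,
// blank lines, comment lines (# or ;), and unknown keys are ignored.
// Returns false if no [paperless] section is present, leaving cfg untouched.
// Note: std::stoi throws if the port value is not a number.
bool parse_paperless_ini(const std::string& text, PaperlessConfig& cfg) {
    std::istringstream in(text);
    bool in_section = false;
    bool seen_section = false;
    std::string line;
    while (std::getline(in, line)) {
        line = trim(line);
        if (line.empty() || line[0] == '#' || line[0] == ';') continue;
        if (line[0] == '[') {
            in_section = (line == "[paperless]");
            seen_section = seen_section || in_section;
            continue;
        }
        std::string::size_type eq = line.find('=');
        if (!in_section || eq == std::string::npos) continue;
        const std::string key = trim(line.substr(0, eq));
        const std::string value = trim(line.substr(eq + 1));
        if (key == "hostname") cfg.hostname = value;
        else if (key == "port") cfg.port = std::stoi(value);
        else if (key == "use_https") cfg.use_https = (value == "true");
        else if (key == "api_token") cfg.api_token = value;
    }
    return seen_section;
}
```

A `false` return is the natural point at which the loader would fall back to the global preferences.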
---
Security Considerations
API Token Storage
- **In memory:** Keep decrypted token only in PaperlessClient; never log
- **In GConf:** Store encrypted (use gnome-keyring if available)
- **In KVP:** Never store; reference global config only
- **Cleanup:** Clear token on app exit (destructor)
HTTPS
- Default: `use_https = false` (local Paperless on `localhost:8000`)
- Production: enable `use_https = true`
- Validate SSL cert (libcurl default)
API Token Scoping
- Paperless token is global (no per-transaction auth)
- Assume trusted GnuCash environment (user has API token access)
- No per-user/per-doc ACLs (GnuCash talks to Paperless as single identity)
Filename Validation
- Trust Paperless-returned filenames (already sanitized by Paperless)
- No symlink or path traversal risk
---
Testing Strategy
Unit Tests
- Mock PaperlessClient: stub upload, return fake doc IDs
- KVP serialization: attach 2 docs, save/reload, verify IDs persist
- Error cases: 401, 500, timeout → verify error dialogs
Integration Tests
- Spin up real Paperless (Docker) for test suite
- Upload file → verify doc appears in Paperless UI
- Detach → verify KVP updated, doc stays in Paperless
- View → verify browser opens correct URL
Manual Testing
- Configure against live Paperless instance
- Attach PDF to transaction → verify upload succeeds
- Open Preferences → click [Test Connection] → verify ✓
- Disable Paperless → attach button disabled
- Restart GnuCash → re-open transaction → docs still listed
---
Open Questions
- **Large file handling:** Progress bar for slow uploads? Recommend max file size?
- **Batch upload:** Support drag-and-drop multiple files at once?
- **Paperless search:** Integrate Paperless search into GnuCash doc picker?
- **Tagging:** Sync Paperless tags ↔ transaction memo or custom KVP field?
- **Archiving:** When user archives doc in Paperless, should GnuCash warn?
- **Offline mode:** Graceful degradation if Paperless unavailable?
---
Migration (If Existing Blobs)
If GnuCash already has local blob storage, migration strategy:
- Read old blob storage: `/kvp/attachments/` with SHA256 refs
- For each blob:
  * Read file from disk
  * Upload to Paperless → get doc_id
  * Replace KVP entry: SHA256 ref → Paperless doc_id
- Clean up local blobs directory
CLI tool: `gnucash --migrate-blobs-to-paperless /path/to/book.gnucash`
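The core loop of such a tool might be structured as follows. All type and function names here are hypothetical, the upload is injected as a callback standing in for `PaperlessClient::upload_document()`, and reporting failures separately (so that only successfully migrated blobs are rewritten in KVP and deleted locally) is an assumption, not something the strategy above spells out.

```cpp
#include <functional>
#include <string>
#include <vector>

// Hypothetical record for one legacy attachment.
struct LegacyBlob {
    std::string sha256_ref;  // value currently stored under /kvp/attachments/
    std::string path;        // location of the blob on disk
};

// One successfully migrated entry: the old ref and the ID that replaces it.
struct MigratedEntry {
    std::string sha256_ref;
    int doc_id;
};

struct MigrationReport {
    std::vector<MigratedEntry> migrated;  // rewrite KVP, then delete local blob
    std::vector<std::string> failed;      // upload failed; keep blob and ref
};

// Uploads every blob and sorts the results into migrated and failed.
// `upload` returns the new Paperless document ID, or -1 on failure.
MigrationReport migrate_blobs(
        const std::vector<LegacyBlob>& blobs,
        const std::function<int(const std::string& path)>& upload) {
    MigrationReport report;
    for (const LegacyBlob& blob : blobs) {
        int doc_id = upload(blob.path);
        if (doc_id >= 0)
            report.migrated.push_back({blob.sha256_ref, doc_id});
        else
            report.failed.push_back(blob.sha256_ref);
    }
    return report;
}
```

Running the cleanup step only over `report.migrated` would avoid losing attachments whose upload did not go through.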
---
Appendix: Sample KVP After Attach
Transaction KVP after attaching two Paperless docs:
```
/kvp/paperless_docs = "[12345, 12346]"
/kvp/notes = "Invoice from vendor"
```
Or verbose:
```
/kvp/paperless_docs/12345 = "{}"   (empty; metadata on-demand from Paperless)
/kvp/paperless_docs/12346 = "{}"
```
When GnuCash loads the transaction:
- Fetch `/kvp/paperless_docs`
- Parse JSON array: `[12345, 12346]`
- On UI render: call `paperless_client->get_document_info(12345, ...)` → title, filename
- Display in transaction view with attachment icon
---
Conclusion
**Advantages over local blob storage:**
- ✅ No local filesystem management
- ✅ Deduplication handled by Paperless (across books)
- ✅ Full-text search in Paperless
- ✅ Tagging, archiving, deletion in Paperless UI
- ✅ Backup: Paperless is separate backup target
- ✅ Scalability: Paperless handles large collections
**Disadvantages:**
- ❌ Requires Paperless running + network access
- ❌ Attachments become inaccessible if Paperless is down or the documents are deleted there
- ❌ Latency on upload (especially large files)
**Best for:** Professional workflows with Paperless already in use; document-heavy orgs; multi-computer setup where Paperless is centralized.