Automatic invoice data extraction in Delphi apps via AI
Monday, July 28, 2025
In this short example, we show how you can add automatic invoice data extraction to a Delphi app with minimal effort.
This is done with the TMS AI Studio TTMSMCPCloudAI component, and the example can use any of 3 LLMs: OpenAI, Gemini or Claude.
How does it work?
Achieving this is actually quite simple. We use the TTMSMCPCloudAI component to send the PDF invoice together with a prompt, and in return we get a JSON object with the data we wanted to extract.
An essential part is providing a good prompt. We have seen good results from the following prompt setup:
TMSMCPCloudAI1.Context.Text := 'Extract information from the PDF invoice and return as a JSON object';
TMSMCPCloudAI1.SystemRole.Text :=
  'You collect information from the PDF invoice and return it strictly as a JSON object with this exact structure: ' +
  'JSON object has following key-value pairs: ' +
  '"InvoiceDate","InvoiceNumber","VendorName","VendorAddress","VendorVATID","CustomerName","CustomerAddress","CustomerVATID", ' +
  '"ListofItems","ListOfPrices", "NetTotal", "Tax", "Total"';
As you can see, the system role describes the expected output format: a JSON object containing the critical invoice data. Combining everything in the context also turned out to work. The main prompt is then just "Extract information from the PDF invoice and return as a JSON object", while the system role specifies exactly which information we want from the invoice and under which key-value pairs it should be returned in the JSON object.
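Since the same keys appear both in the prompt and later in the code that reads the result, it can help to keep them in one place. The following is only a minimal sketch of that idea: `BuildSystemRole` and `InvoiceKeys` are hypothetical names that are not part of TMS AI Studio, and the generated text follows the prompt above only approximately (uniform commas, no extra spaces).

```delphi
program BuildPromptSketch;

{$APPTYPE CONSOLE}

uses
  System.SysUtils;

const
  // Keys copied from the prompt in this post, with their original spelling
  InvoiceKeys: array[0..12] of string = (
    'InvoiceDate', 'InvoiceNumber', 'VendorName', 'VendorAddress',
    'VendorVATID', 'CustomerName', 'CustomerAddress', 'CustomerVATID',
    'ListofItems', 'ListOfPrices', 'NetTotal', 'Tax', 'Total');

// Hypothetical helper (not a TMS API): builds the system role text
// from a list of key names, each wrapped in double quotes.
function BuildSystemRole(const Keys: array of string): string;
var
  I: Integer;
  KeyList: string;
begin
  KeyList := '';
  for I := Low(Keys) to High(Keys) do
  begin
    if I > Low(Keys) then
      KeyList := KeyList + ',';
    KeyList := KeyList + '"' + Keys[I] + '"';
  end;
  Result := 'You collect information from the PDF invoice and return it ' +
    'strictly as a JSON object with this exact structure: JSON object has ' +
    'following key-value pairs: ' + KeyList;
end;

begin
  // Print the generated text; in an application it would be assigned
  // to the SystemRole.Text property shown in this post.
  Writeln(BuildSystemRole(InvoiceKeys));
end.
```

In an application, the returned string would be assigned to `TMSMCPCloudAI1.SystemRole.Text` instead of being printed.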
The PDF file is first uploaded with
begin
  TMSMCPCloudAI1.UploadFile(INVOICEPDF, aiftPDF);
end;
When the upload is complete, i.e. after the OnFileUpload event is triggered, we can send the prompt with
begin
  TMSMCPCloudAI1.Execute('process_invoice');
end;
Alternatively, the file can be added to the request and sent together with the prompt in a single step:
begin
  TMSMCPCloudAI1.AddFile(INVOICEPDF, aiftPDF);
  TMSMCPCloudAI1.Execute('process_invoice');
end;
Selecting the LLM to use for this operation is as simple as:
TMSMCPCloudAI1.Service := aiClaude;

{
  "InvoiceDate": "05/09/2022",
  "InvoiceNumber": "FA2022-0001",
  "VendorName": "BV CRE8",
  "VendorAddress": "Diestersteenweg 462, 3680 Maaseik, Belgium",
  "VendorVATID": "BE0631922138",
  "CustomerName": "Comanage business user",
  "CustomerAddress": "",
  "CustomerVATID": "BE0631922138",
  "ListofItems": ["comanage business pakket"],
  "ListOfPrices": [150.00],
  "NetTotal": 150.00,
  "Tax": 31.50,
  "Total": 181.50
}
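Once the JSON text has been received, it can be read with the standard System.JSON unit. The following is a minimal sketch under stated assumptions, not the code of the sample project: `TInvoiceSummary`, `ParseInvoice` and `TotalsAreConsistent` are hypothetical names, only some of the keys are read, and the shortened `SAMPLE` constant stands in for the text returned by the LLM. Because an LLM can misread a value, the sketch also checks that NetTotal plus Tax matches Total.

```delphi
program ParseInvoiceSketch;

{$APPTYPE CONSOLE}

uses
  System.SysUtils, System.Math, System.JSON;

type
  // Hypothetical record holding a few of the extracted fields
  TInvoiceSummary = record
    InvoiceNumber: string;
    VendorName: string;
    Items: TArray<string>;
    NetTotal, Tax, Total: Double;
  end;

const
  // Shortened stand-in for the JSON text returned by the LLM
  SAMPLE = '{"InvoiceNumber": "FA2022-0001", "VendorName": "BV CRE8", ' +
    '"ListofItems": ["comanage business pakket"], ' +
    '"NetTotal": 150.00, "Tax": 31.50, "Total": 181.50}';

// Parses the JSON text; raises an exception when the text is not a
// JSON object or when one of the expected keys is missing.
function ParseInvoice(const AJSON: string): TInvoiceSummary;
var
  Value: TJSONValue;
  Obj: TJSONObject;
  Arr: TJSONArray;
  I: Integer;
begin
  Value := TJSONObject.ParseJSONValue(AJSON);
  try
    if not (Value is TJSONObject) then
      raise Exception.Create('Response is not a JSON object');
    Obj := TJSONObject(Value);
    Result.InvoiceNumber := Obj.GetValue<string>('InvoiceNumber');
    Result.VendorName := Obj.GetValue<string>('VendorName');
    Result.NetTotal := Obj.GetValue<Double>('NetTotal');
    Result.Tax := Obj.GetValue<Double>('Tax');
    Result.Total := Obj.GetValue<Double>('Total');
    // Key spelling "ListofItems" matches the prompt in this post
    Arr := Obj.GetValue<TJSONArray>('ListofItems');
    SetLength(Result.Items, Arr.Count);
    for I := 0 to Arr.Count - 1 do
      Result.Items[I] := Arr.Items[I].Value;
  finally
    Value.Free;
  end;
end;

// Plausibility check: net total plus tax should equal the total,
// within half a cent to allow for floating-point rounding.
function TotalsAreConsistent(const Inv: TInvoiceSummary): Boolean;
begin
  Result := SameValue(Inv.NetTotal + Inv.Tax, Inv.Total, 0.005);
end;

var
  Inv: TInvoiceSummary;
begin
  Inv := ParseInvoice(SAMPLE);
  Writeln('Invoice: ', Inv.InvoiceNumber, ' from ', Inv.VendorName);
  Writeln('Items: ', Length(Inv.Items));
  if TotalsAreConsistent(Inv) then
    Writeln('Totals are consistent')
  else
    Writeln('Totals do not add up, manual review recommended');
end.
```

Raising on a missing key is a deliberate choice in this sketch: a response that does not follow the requested structure is reported immediately rather than silently producing empty fields.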
The sample project is available as a VCL application. Note that it uses the TAdvPDFViewer component to show a preview of the selected PDF file. The TAdvPDFViewer component is part of TMS VCL UI Pack. The sample project download also includes 3 sample PDF files for your testing.

Conclusion
Bruno Fierens
Related Blog Posts
- Add AI superpower to your Delphi & C++Builder apps part 1
- Add AI superpower to your Delphi & C++Builder apps part 2: function calling
- Add AI superpower to your Delphi & C++Builder apps part 3: multimodal LLM use
- Add AI superpower to your Delphi & C++Builder apps part 4: create MCP servers
- Add AI superpower to your Delphi & C++Builder apps part 5: create your MCP client
- Add AI superpower to your Delphi & C++Builder apps part 6: RAG
- Introducing TMS AI Studio: Your Complete AI Development Toolkit for Delphi
- Automatic invoice data extraction in Delphi apps via AI
- Creating an n8n Workflow to use a Logging MCP Server

This blog post has received 15 comments.


Bruno Fierens

Would it work on other types of documents?
What would be the time required to read 25,000 pages?
Does this require a specific OpenAI or other licensing fee?
Best regards
Julien ALBRECHT


To use OpenAI, you need an OpenAI API key. (Same applies for Gemini & Claude)
We haven’t done performance testing on so many files. In our testing, one invoice took a couple of seconds maximum.
Bruno Fierens

Suer Martin

Edwards Peter

Paul Stohr

Alexander Pastuhov

Does it matter when an invoice contains more than one page, or will the returned JSON string still contain everything needed as extracted from a multi-page invoice? Or is there a recommended alternative way to detect multi-page invoices and prompt the AI differently for the desired JSON string?
van Rensburg Robert


At this moment, OpenAI, Gemini and Claude can directly work with PDF files.
Bruno Fierens

If I buy the TMS AI to process my invoices, is there an AI charge for each PDF invoice I process?
Thank you
Josee Letarte

Will it be hard to add Copilot AI? Also, how complicated is it to create the same project but working with a local Ollama + llama vision 3.2 model?
stephane.wierzbicki

I myself am building a data recognition service by analyzing identity documents or passports.
But as with invoices, the REAL problem is: How can we manage these processes with AI in complete security?
Obviously using local or standalone (private) AI services.
But are these services as capable or powerful as the more renowned and well-known online ones? How many resources are needed?
Unfortunately, despite having excellent tools available, the real issue to address is this one involving sensitive data, before writing code.
Great job TMS!
Stefano Monterisi


An analysis of OpenAI ChatGPT security & privacy can be found here:
https://www.security.org/digital-safety/is-chatgpt-safe
For Anthropic Claude there is this reference as starting point:
https://privacy.anthropic.com/en/articles/10458704-how-does-anthropic-protect-the-personal-data-of-claude-ai-users
For Google Gemini see:
https://safety.google/gemini/
And then of course there is Ollama, which runs locally with a local model, so there shouldn't be a privacy or security concern, provided your internal network is properly secured.
Bruno Fierens

Stefano Monterisi
Mike