
Add AI superpower to your Delphi & C++Builder apps part 3: multimodal LLM use

Today

TMS Software Delphi  Components

This is part 3 of our blog series on adding AI superpower to your Delphi & C++Builder apps. The first article covered basic usage of LLMs and the second covered function calling with LLMs. Both dealt with textual information only. In this third installment, we shift to multimodal LLMs, that is, LLMs that can also work with information other than "simple prompts". In other words, we provide files containing images, video, audio, documents ... as context for the LLM.


Embracing Multimodal LLMs in Delphi: Describe, Compare, Extract, Summarize, Translate — All in One

AI has quickly moved beyond just text generation. With the rise of multimodal large language models (LLMs), Delphi developers can now leverage image understanding, OCR, file summarization, and translation — all with minimal code and maximum flexibility. And thanks to the TTMSFNCCloudAI component, switching between AI providers like OpenAI, Claude, Mistral, Gemini, DeepSeek, Ollama, Grok, or Perplexity becomes seamless.


Why Multimodal Matters

Traditional LLMs focused on text. Today’s advanced models can process both text and images, enabling workflows such as:

  • Automatically describing image content

  • Performing OCR on photos or scanned documents

  • Comparing two pictures and identifying visual differences

  • Summarizing lengthy documents

  • Translating files between languages

All of these tasks are achievable with the same API structure, just by adjusting context instructions. And best of all, you remain in control of the backend AI service—whether hosted or local.


A Unified Approach with TTMSFNCCloudAI

Here’s how you use it:

1. Describe an Image

delphi
TMSFNCCloudAI1.Files.Clear;
TMSFNCCloudAI1.AddFile(ImageFileName, aiftImage);
TMSFNCCloudAI1.Context.Clear;
TMSFNCCloudAI1.Context.Text := 'describe the picture';
TMSFNCCloudAI1.Execute;

Whether it's a scenic photo or a complex chart, supported AI models can return a natural language summary of what's in the image. In one test, the result was impressive: the model even detected a half-readable bottle label and correctly identified it as Jules Mumm champagne!

2. Compare Two Pictures

delphi
TMSFNCCloudAI1.Files.Clear;
TMSFNCCloudAI1.AddFile(ImageFileName1, aiftImage);
TMSFNCCloudAI1.AddFile(ImageFileName2, aiftImage);
TMSFNCCloudAI1.Context.Clear;
TMSFNCCloudAI1.Context.Text := 'compare the two pictures and describe the differences';
TMSFNCCloudAI1.Execute;

Ideal for visual regression tests, UI comparisons, or even spotting differences in scanned documents or maps. In our testing, the Claude LLM seemed to provide the most accurate and knowledgeable answer.


3. Perform OCR (Optical Character Recognition)

delphi
TMSFNCCloudAI1.Files.Clear;
TMSFNCCloudAI1.AddFile(ImageFileName, aiftImage);
TMSFNCCloudAI1.Context.Clear;
TMSFNCCloudAI1.Context.Text := 'extract the text from the picture';
TMSFNCCloudAI1.Execute;

Forget hard-coded OCR libraries: just describe the task and let the LLM handle everything. The test here used a picture of the back of the book "Delphi Component Design" by the late Danny Thorpe (whom I had the honor of meeting a few times back in Scotts Valley). Credit goes to OpenAI, which was not only extremely accurate but also smart enough to recognize the two-column layout and place the text of the columns one under the other. Everything up to the book's ISBN is correct.

4. Summarize a Text File

delphi
TMSFNCCloudAI1.Files.Clear;
TMSFNCCloudAI1.AddFile(TextFileName, aiftText);
TMSFNCCloudAI1.Context.Clear;
TMSFNCCloudAI1.Context.Text := 'summarize this text for me in one paragraph';
TMSFNCCloudAI1.Execute;

Perfect for making sense of long reports, log files, or any dense document.

5. Translate Text

delphi
TMSFNCCloudAI1.Files.Clear;
TMSFNCCloudAI1.AddFile(TextFileName, aiftText);
TMSFNCCloudAI1.Context.Clear;
TMSFNCCloudAI1.Context.Text := 'translate the text to german';
TMSFNCCloudAI1.Execute;

Build multilingual applications with just a few lines of Delphi code.
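
The five snippets above all follow the same steps: clear the file list, add one or more files, set the instruction and execute. As a minimal sketch, that pattern could be wrapped in two small helpers. The names RunImagePrompt and RunTextPrompt are hypothetical and not part of the component. The sketch only uses the calls shown above and assumes the unit that declares TTMSFNCCloudAI is already in your uses clause.

```delphi
// Hypothetical helpers, not part of TTMSFNCCloudAI. They only wrap the
// calls used in the snippets above. Assumes the unit that declares
// TTMSFNCCloudAI is already listed in the uses clause.

// Sends one or more image files together with an instruction.
procedure RunImagePrompt(AI: TTMSFNCCloudAI;
  const ImageFileNames: array of string; const Instruction: string);
var
  I: Integer;
begin
  AI.Files.Clear;                              // drop files from a previous request
  for I := Low(ImageFileNames) to High(ImageFileNames) do
    AI.AddFile(ImageFileNames[I], aiftImage);  // queue every image as context
  AI.Context.Clear;
  AI.Context.Text := Instruction;              // what the model should do with the files
  AI.Execute;
end;

// Sends a single text file together with an instruction.
procedure RunTextPrompt(AI: TTMSFNCCloudAI;
  const TextFileName, Instruction: string);
begin
  AI.Files.Clear;
  AI.AddFile(TextFileName, aiftText);
  AI.Context.Clear;
  AI.Context.Text := Instruction;
  AI.Execute;
end;
```

With these in place, describing, comparing and translating differ only in the arguments:

```delphi
RunImagePrompt(TMSFNCCloudAI1, [ImageFileName], 'describe the picture');
RunImagePrompt(TMSFNCCloudAI1, [ImageFileName1, ImageFileName2],
  'compare the two pictures and describe the differences');
RunTextPrompt(TMSFNCCloudAI1, TextFileName, 'translate the text to german');
```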


Abstracting the Complexity

One of the biggest strengths of TTMSFNCCloudAI is abstraction. You don't need to learn every provider's API or worry about changing your code when switching services. The interface stays the same. Just configure your model and endpoint.

This allows developers to:

  • Prototype with OpenAI, then move to Claude for privacy

  • Use local models with Ollama during development

  • Compare results from Gemini or Grok with just a config change
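
As a rough sketch of what such a config change might look like: the request code is identical to example 1, and only the provider selection on the first line changes. Note that the Service property and the aiOpenAI / aiClaude values are assumptions here, as they are not shown in this article. Check the component documentation for the exact names and for how API keys and model names are configured.

```delphi
// Assumption: the component selects its backend through a property named
// Service with values such as aiOpenAI or aiClaude. These names are not
// shown in this article; verify them against the component documentation.
TMSFNCCloudAI1.Service := aiClaude;   // was aiOpenAI while prototyping

// Everything below is unchanged from example 1.
TMSFNCCloudAI1.Files.Clear;
TMSFNCCloudAI1.AddFile(ImageFileName, aiftImage);
TMSFNCCloudAI1.Context.Clear;
TMSFNCCloudAI1.Context.Text := 'describe the picture';
TMSFNCCloudAI1.Execute;
```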

Vision Models Required

Note: Some providers require specific models that support image understanding. For example:

  • Ollama: Only models like llava or bakllava support vision

  • Grok and Mistral: Need to be paired with multimodal-capable backends

  • Claude, OpenAI (GPT-4o), and Gemini Pro Vision support image input natively

Always ensure the model you choose understands the data type you're sending.
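
One possible way to enforce this in application code is a small guard around AddFile. This is only a sketch: AddImageChecked and the ModelHasVision flag are hypothetical, and the flag has to be maintained by your own application for the model you configured, since this article does not show a way to query model capabilities from the component.

```delphi
// Hypothetical guard, not part of TTMSFNCCloudAI.
// Requires System.SysUtils in the uses clause (FileExists and the
// exception classes), plus the unit that declares TTMSFNCCloudAI.
// ModelHasVision is a flag your application sets itself for the
// currently configured model.
procedure AddImageChecked(AI: TTMSFNCCloudAI; const ImageFileName: string;
  ModelHasVision: Boolean);
begin
  // Fail early instead of sending an image to a text-only model.
  if not ModelHasVision then
    raise EArgumentException.CreateFmt(
      'The configured model does not accept images; cannot send "%s"',
      [ImageFileName]);

  // Catch a wrong path before the request goes out.
  if not FileExists(ImageFileName) then
    raise EFileNotFoundException.CreateFmt(
      'Image file "%s" not found', [ImageFileName]);

  AI.AddFile(ImageFileName, aiftImage);
end;
```

For example, an application using Ollama could set the flag to True only when one of the vision models mentioned above is selected.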


A Future-Proof Way to Integrate AI

With TTMSFNCCloudAI, you're not locked into one vendor or use case. You build once, and switch as needed. The multimodal revolution is here, and Delphi developers now have a first-class way to participate.

Start experimenting. Start integrating. Start building smarter Delphi apps today.

Explore TTMSFNCCloudAI and redefine how your applications interact with the world.

In upcoming articles, we’ll dive deeper into RAG, agents, MCP servers & clients.  
If you have an active TMS ALL-ACCESS license, you can now also get access to the first test version of TMS AI Studio, which uses the TTMSFNCCloudAI component and also has everything on board to let you build MCP servers and clients.
Register now to participate in this testing via this landing page.




Bruno Fierens




This blog post has received 2 comments.


1. Friday, May 23, 2025 at 2:33:55 PM

WoW great! .. will there be any examples with Assistant AI?


Carlomagno Antonello


2. Friday, May 23, 2025 at 2:52:37 PM

Yes, working on it

Bruno Fierens



