# Large Language Models (LLMs) Available on Our Platform

Large Language Models (LLMs) play a crucial role in powering the AI Agents and workflows within **indigo.ai**. These models enable natural language understanding, reasoning, and content generation, ensuring businesses can automate interactions and provide intelligent, context-aware responses. This article explores the LLMs available in **indigo.ai**, their capabilities, and best practices for selecting the right model for your needs.

## Understanding LLMs in indigo.ai

At indigo.ai, we integrate multiple LLMs to offer a **flexible, high-performance AI ecosystem**. Different models are optimized for **speed, power, or reasoning**, allowing users to choose the best fit for their use case.

Here’s how we categorize them:

<table><thead><tr><th>Speed ⚡</th><th>Power 🚀</th><th>Reasoning 🧠</th><th data-hidden></th></tr></thead><tbody><tr><td><strong>gpt-4.1-mini</strong></td><td><strong>gpt-4.1</strong></td><td><strong>gpt-5.1</strong></td><td><strong>Balanced</strong> ⚖️</td></tr><tr><td>gpt-4.1-nano</td><td>gemini-2.5-flash</td><td>gpt-5-mini</td><td>gpt-4o-mini</td></tr><tr><td>gemini-2.5-flash-lite</td><td>claude-4.5-sonnet</td><td>gpt-5-nano</td><td>gemini-1.5-flash</td></tr><tr><td>mistral-small-3.2</td><td>gpt-oss-120b</td><td>gemini-2.5-pro</td><td></td></tr><tr><td>claude-4.5-haiku</td><td></td><td></td><td></td></tr><tr><td>gpt-oss-20b</td><td></td><td></td><td></td></tr></tbody></table>

## **LLM Categories and Their Use Cases**

* **Speed:** Models that prioritize response time over advanced reasoning. Best for real-time interactions where immediate feedback is essential.
* **Power:** High-performance models with **strong generative capabilities**, designed for complex tasks but with longer response times.
* **Reasoning:** Models that generate a reasoning process before providing an answer. This lets them solve complex tasks that require deep reasoning, at the cost of higher latency.

Models in **bold** within each category represent the **recommended models** based on performance and reliability.

## List of LLMs in indigo.ai

### **Available Models, Providers, and Server Locations**

<table><thead><tr><th width="208.94140625">Model Name in Platform</th><th width="164.7109375">LLM Backend</th><th width="130.703125">Provider</th><th width="141.41796875">Server Location</th><th>Comment</th></tr></thead><tbody><tr><td>gpt-4.1-mini (EU)</td><td>gpt-4.1-mini-2025-04-14</td><td>Microsoft Azure</td><td>Sweden</td><td>Default</td></tr><tr><td>gpt-4.1 (EU)</td><td>gpt-4.1-2025-04-14</td><td>Microsoft Azure</td><td>Sweden</td><td></td></tr><tr><td>gpt-4.1-nano (EU)</td><td>gpt-4.1-nano-2025-04-14</td><td>Microsoft Azure</td><td>Sweden</td><td></td></tr><tr><td>gpt-5.1 (EU)</td><td>azure-se-gpt-5.1</td><td>Microsoft Azure</td><td>Sweden</td><td></td></tr><tr><td>gpt-5-mini (EU)</td><td>azure-se-gpt-5-mini</td><td>Microsoft Azure</td><td>Sweden</td><td></td></tr><tr><td>gpt-5-nano (EU)</td><td>azure-se-gpt-5-nano</td><td>Microsoft Azure</td><td>Sweden</td><td></td></tr><tr><td>gemini-2.5-pro (EU)</td><td>gemini-2.5-pro</td><td>Google</td><td>Belgium</td><td></td></tr><tr><td>gemini-2.5-flash (EU)</td><td>gemini-2.5-flash</td><td>Google</td><td>Belgium</td><td></td></tr><tr><td>gemini-2.5-flash-lite (EU)</td><td>gemini-2.5-flash-lite</td><td>Google</td><td>Belgium</td><td><br></td></tr><tr><td>claude-4.5-sonnet (EU)</td><td>claude-sonnet-4-5@20250929</td><td>GoogleVertex</td><td>Belgium</td><td></td></tr><tr><td>claude-4.5-haiku (EU)</td><td>claude-haiku-4-5@20251001</td><td>GoogleVertex</td><td>Belgium</td><td></td></tr><tr><td>mistral-small-3.2 (EU)</td><td>mistral-small-2506</td><td>Mistral</td><td>Sweden</td><td></td></tr><tr><td>gpt-oss-120b (EU)</td><td>openai/gpt-oss-120b</td><td>Groq</td><td>EU</td><td>Open-weight, high throughput</td></tr><tr><td>gpt-oss-20b (EU)</td><td>openai/gpt-oss-20b</td><td>Groq</td><td>EU</td><td>Open-weight, low latency</td></tr><tr><td>maestrale-chat (self-hosted)</td><td>hf.co/mii-llm/maestrale-chat-v0.4-beta-GGUF</td><td>indigo.ai</td><td>Germany</td><td></td></tr></tbody></table>

{% hint style="info" %}
**Groq-hosted open-weight models** (`gpt-oss-120b`, `gpt-oss-20b`) are served on Groq infrastructure with EU data residency. They offer significantly lower latency than equivalent-size proprietary models, which makes them a strong fit for real-time interactions (e.g. voice channel, high-traffic web chat).
{% endhint %}

## **Default Model in indigo.ai**

By default, we use **gpt-4.1-mini (EU)** in our AI Agents and workflows. This model is selected because:

* ✅ It offers a strong balance between **performance and response time**.
* ✅ It is hosted on **Microsoft Azure EU servers**, ensuring **compliance with European data regulations**.
* ✅ It handles **most reasoning tasks well** while maintaining a reasonable token cost and latency.

However, you can choose to use **different models** based on your specific requirements.

## **How to Choose the Right Model**

Selecting the best model depends on several factors, including response speed, accuracy, reasoning ability, and token consumption. Here are some guidelines:

**1. Prioritize Speed (Fastest Response Time)**

Use **gpt-4.1-mini** if:

✔ You need real-time responses.\
✔ Your use case involves quick user interactions.\
✔ Advanced reasoning is not the top priority.

**2. Prioritize Power or Reasoning**

Use **gpt-4.1** (Power) or **gpt-5.1** (Reasoning) if:

✔ You need deep contextual understanding.\
✔ Your use case involves complex responses (e.g., legal, medical, or technical AI agents).\
✔ You’re willing to trade speed for accuracy.
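
The guidelines above can be read as a small decision procedure. The sketch below is purely illustrative: `UseCase` and `recommend_model` are hypothetical names, not part of the indigo.ai platform or any API, and the order in which the checks are applied is an assumption. The model names are the recommended (bold) entries from the category table in this article.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class UseCase:
    """Hypothetical description of what an agent needs from its model."""
    needs_real_time: bool       # e.g. voice channel or live web chat
    needs_deep_reasoning: bool  # multi-step problems that benefit from a reasoning phase
    complex_domain: bool        # e.g. legal, medical, or technical agents


def recommend_model(use_case: UseCase) -> str:
    """Return the recommended model of the category that matches the use case.

    Assumption: real-time needs win over everything else, because a slow
    answer is the most visible failure in a live conversation.
    """
    if use_case.needs_real_time:
        return "gpt-4.1-mini"  # Speed
    if use_case.needs_deep_reasoning:
        return "gpt-5.1"       # Reasoning
    if use_case.complex_domain:
        return "gpt-4.1"       # Power
    return "gpt-4.1-mini"      # fall back to the platform default


if __name__ == "__main__":
    legal_agent = UseCase(needs_real_time=False, needs_deep_reasoning=False, complex_domain=True)
    print(recommend_model(legal_agent))
```

In practice the choice is made in the model selector rather than in code; the helper only makes the trade-offs explicit.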

### Model Selector (New)

indigo.ai provides an enhanced model selector panel that helps users choose the most appropriate LLM directly from the interface.

Instead of a simple dropdown, the selector displays key information for each model to support informed decision-making.

<figure><img src="/files/enJYQcTPYgxIYoPOPoYQ" alt=""><figcaption></figcaption></figure>

#### What you can see

For each model, the selector shows:

* Pricing → input/output cost per million tokens
* Context window → maximum supported context size
* Price tier → Light, Standard, or Premium
* Deployment region → e.g. EU or US (when available)
* Recommended label → highlights suggested models

Models are grouped by provider (e.g. OpenAI, Google, Anthropic, Mistral), and can be searched by name.
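
Because pricing is shown per million tokens, the cost of a single request can be estimated with simple arithmetic. The following is a minimal sketch: the function name `estimate_cost` is hypothetical, and the prices in the example are placeholders, not actual indigo.ai or provider pricing. Read the real values from the selector.

```python
def estimate_cost(
    input_tokens: int,
    output_tokens: int,
    input_price_per_million: float,
    output_price_per_million: float,
) -> float:
    """Estimate the cost of one request from per-million-token prices."""
    input_cost = (input_tokens / 1_000_000) * input_price_per_million
    output_cost = (output_tokens / 1_000_000) * output_price_per_million
    return input_cost + output_cost


if __name__ == "__main__":
    # Placeholder prices for illustration only (NOT real pricing).
    cost = estimate_cost(
        input_tokens=2_000,
        output_tokens=500,
        input_price_per_million=1.0,
        output_price_per_million=4.0,
    )
    print(f"Estimated cost per request: {cost:.4f}")
```

Multiplying the per-request estimate by the expected number of requests gives a rough figure for comparing models at volume.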

#### Price tiers

To simplify model selection, each model is categorized into one of three tiers:

* Light → low-cost models, ideal for simple tasks and high-volume usage
* Standard → balanced models for most use cases
* Premium → top-tier models for complex and high-performance tasks

#### Where it is available

The model selector is available in all areas where a model can be selected:

* Agent global settings
* Agent block (Agent Builder)
* Prompt block (Agent Builder)

## Best Practices for Choosing an LLM in Prompts

**Impact of Model Selection on Performance**

When configuring your AI Agent in indigo.ai, the model you choose affects:

* **Response Length**: More powerful models generate more detailed responses but consume more tokens.
* **Accuracy**: Higher-end models provide better coherence and logical reasoning.
* **Speed**: Faster models reply with lower latency but may lack depth in reasoning.

## Model Deprecation and Automatic Redirects

Model providers regularly retire older versions of their LLMs. When that happens, indigo.ai guarantees **zero-downtime migrations**: agents and workflows configured on a deprecated model are **automatically redirected to the recommended replacement** at runtime, with no manual intervention required.

**How it works:**

* Each deprecated model is mapped to a successor (typically the next recommended model in the same category — e.g. `gpt-4o-mini` → `gpt-4.1-mini`).
* When an agent invokes a deprecated model, the platform transparently serves the request with the replacement and logs the redirect for observability.
* In the model selector UI, deprecated models are marked as such and are visually distinguished so you can plan migrations proactively.
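
Conceptually, the redirect behaves like a lookup that is resolved at request time. The sketch below illustrates the idea and is not the platform's actual implementation: `DEPRECATED_MODELS`, `resolve_model`, and the logger name are hypothetical, and only the `gpt-4o-mini` → `gpt-4.1-mini` mapping comes from this article.

```python
import logging

logger = logging.getLogger("model_redirects")  # hypothetical logger name

# Hypothetical deprecation map: deprecated model -> successor.
# Only this single entry is taken from the article.
DEPRECATED_MODELS = {
    "gpt-4o-mini": "gpt-4.1-mini",
}


def resolve_model(configured: str) -> str:
    """Return the model that would serve the request, following redirects."""
    seen = set()
    model = configured
    while model in DEPRECATED_MODELS:
        if model in seen:
            # Guard against a misconfigured map that points back to itself.
            raise ValueError(f"Redirect cycle detected for model {configured!r}")
        seen.add(model)
        successor = DEPRECATED_MODELS[model]
        # Log the redirect so it stays visible for observability.
        logger.info("Model %s is deprecated; redirecting to %s", model, successor)
        model = successor
    return model


if __name__ == "__main__":
    logging.basicConfig(level=logging.INFO)
    print(resolve_model("gpt-4o-mini"))
    print(resolve_model("gpt-4.1"))
```

Following the chain in a loop means a model whose successor is itself later deprecated still resolves to a currently supported model.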

**What you should do:**

* Treat automatic redirects as a safety net, not a permanent solution. When you see a model flagged as deprecated, update the configuration to the recommended replacement to keep your prompts tuned for the actual model you are running on.
* Re-test prompts after migrating: newer models often respond to instructions differently, especially around output format and tone.

{% hint style="warning" %}
Redirects keep your agents running, but prompt behavior may shift. Plan a short QA pass whenever you migrate away from a deprecated model.
{% endhint %}

## **Conclusion**

The **indigo.ai platform** offers a **diverse selection of LLMs**, each optimized for different use cases. Whether you need **fast interactions, strong generative power, or deep reasoning**, selecting the right model is key to optimizing your AI’s performance. By default, we recommend **gpt-4.1-mini (EU)** for most workflows, but users can choose based on their specific requirements.

Understanding LLM capabilities allows businesses to build **smarter, more efficient AI Agents**, ensuring they meet customer expectations with high-quality automated interactions.

