Agent advanced settings

AI tool calling

Tool calling is only available if the response format is set to text.

Providing tools to the agent gives it the capability to perform actions. Internally, the agent decides at each step whether to produce output or to call a tool. If it calls a tool, it also generates the tool's parameters.

Learn more about how to configure tools in the section about AI Tools.

You can configure the agent to call multiple tools in the same completion step, and set the maximum number of tool calls it may perform.
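
Conceptually, each completion step works roughly like the sketch below. It is illustrative only: callModel and runTool are hypothetical stand-ins, assuming an OpenAI-style tool-calling interface rather than the platform's actual internals.

// Rough sketch of the tool-calling loop; callModel and runTool are
// hypothetical stand-ins, assuming an OpenAI-style interface.
type ChatMessage = { role: string; content: string };
type ToolCall = { name: string; arguments: string }; // arguments are JSON-encoded
type ModelStep =
    | { kind: "output"; text: string }           // the model produces output...
    | { kind: "tool_calls"; calls: ToolCall[] }; // ...or decides to call tools

declare function callModel(messages: ChatMessage[]): Promise<ModelStep>;
declare function runTool(name: string, args: unknown): Promise<unknown>;

async function agentStep(messages: ChatMessage[], maxToolCalls: number): Promise<string> {
    let used = 0;
    while (true) {
        const step = await callModel(messages);
        if (step.kind === "output") return step.text; // final answer, loop ends
        for (const call of step.calls) {
            if (++used > maxToolCalls) return "Tool call limit reached.";
            const result = await runTool(call.name, JSON.parse(call.arguments));
            // feed the tool result back so the model can continue
            messages.push({ role: "tool", content: JSON.stringify(result) });
        }
    }
}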

Whether tools are called correctly and with suitable parameters depends on the LLM used. Modern LLMs are fine-tuned to be good at calling tools.

Vector sources

LLMs are trained to generate plausible text to complete a prompt. While training these models has advanced considerably, there is no general guarantee that a completion is not only plausible but also factually correct.

To improve this, you can provide extra knowledge to the LLM within its "context window".

You can provide contextual knowledge to the agent via semantic search by selecting vector-enabled tables as a context source. If you do this, a similarity search with the user query is performed in the selected tables for every user message to the agent.

Configure the Max. number of supplied contexts and the Similarity Threshold to fit your use case.

The similarity measure is not necessarily calibrated. This means that, depending on the embedding model used, a similarity score of 0.47 could indicate a good match or a completely unrelated context.
Test your dataset with realistic queries to find a similarity threshold that suits your specific case.
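
As an illustration of how the two settings interact, the following sketch filters raw search hits by the threshold and then caps their number. The Hit shape is assumed for illustration, not the platform's actual type.

// Keep only hits above the threshold, best first, capped at maxContexts.
type Hit = { text: string; similarity: number };

function selectContexts(searchResults: Hit[], maxContexts: number, threshold: number): Hit[] {
    return searchResults
        .filter((hit) => hit.similarity >= threshold) // drop weak matches
        .sort((a, b) => b.similarity - a.similarity)  // best matches first
        .slice(0, maxContexts);                       // cap the number of contexts
}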

Model and other configurations

JSON Schema

Structured Output configuration is only available if the response format is set to json_schema.

When using the json_schema output, you can define the schema that the LLM should follow in its output. Open the editor by clicking the value help on the JSON schema field and configure the schema either as text or via the visual editor.
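
For example, a schema that constrains the output to a sentiment label plus a confidence score could look like this (the field names are illustrative):

{
    "type": "object",
    "properties": {
        "sentiment": { "type": "string", "enum": ["positive", "neutral", "negative"] },
        "confidence": { "type": "number", "minimum": 0, "maximum": 1 }
    },
    "required": ["sentiment", "confidence"],
    "additionalProperties": false
}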

Temperature

The temperature controls the "creativity" of the output. The lower the temperature, the more "expected" the output is. Use a low temperature for tasks like classification. Note, however, that a very low temperature may lead to the model repeating itself or the context.

The higher the temperature, the more "creative" and unexpected the results are. Use a higher temperature if your task is akin to creative writing.
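
Under the hood, temperature divides the model's logits before the softmax, which is why low values sharpen the output distribution and high values flatten it. A minimal sketch:

// Temperature-scaled softmax; temperature must be > 0.
function softmaxWithTemperature(logits: number[], temperature: number): number[] {
    const scaled = logits.map((l) => l / temperature);
    const max = Math.max(...scaled);                 // subtract max for numerical stability
    const exps = scaled.map((s) => Math.exp(s - max));
    const sum = exps.reduce((a, b) => a + b, 0);
    return exps.map((e) => e / sum);
}

// softmaxWithTemperature([2, 1, 0], 0.5) ≈ [0.87, 0.12, 0.02]  (peaked)
// softmaxWithTemperature([2, 1, 0], 2.0) ≈ [0.51, 0.31, 0.19]  (flatter)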

Daily user token rate

Set the daily user token rate to limit how many tokens a user can consume per day. This is useful if you want to limit the usage of your agent and guard against unexpected costs.

Hide log content

When selected, the input and output of the agent are not visible in the Agent Trace tool; only metadata is retained.

Last N-Messages

Configure how many of the previous messages in a thread are resent to the LLM when generating a new response. Use a low number to save token costs, or a higher number to give the LLM more context about the conversation.
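
In effect, the setting keeps a sliding window over the thread, roughly as in this sketch (the Message shape is assumed for illustration):

// Resend only the last N messages together with the new user query.
type Message = { role: string; content: string };

function buildPrompt(thread: Message[], lastN: number, newQuery: string): Message[] {
    const window = lastN > 0 ? thread.slice(-lastN) : []; // keep only the most recent N
    return [...window, { role: "user", content: newQuery }];
}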

Pre-processing script

You can add a custom pre-processing script that runs when the agent is invoked. The script receives the following data in the variable payload:

{
    userInfo: User, // the user object from the request
    query: string, // the request from the user
    messages: Array<Message>, // the previous messages of the chat
    variables: Record<string, any>, // the variables
    threadID: string, // the thread ID
    agentConfig: {
        model: string, // the model uuid
        temperature: number, // the temperature
        tools: ai_tool[], // the AI tools configured for the agent
    },
}

In the script, assign a string to the variable result; this result is appended to the system instruction.
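
For example, a minimal pre-script might append the current date and the user's query to the system instruction (assuming the payload shape above):

// `result` is provided by the platform and appended to the system instruction.
const today = new Date().toISOString().slice(0, 10); // e.g. "2024-05-01"
result = `Today's date is ${today}. The user's query was: "${payload.query}".`;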