Agent advanced settings
AI tool calling
| Tool calling is only available if the response format is set to text. |
Providing tools to the agent gives it the capability to perform actions. Internally, the agent decides at every step whether to produce output or to call a tool. If it calls a tool, it also generates the tool's parameters.
Learn more about how to configure tools in the section about AI Tools.
You can configure the agent to call multiple tools in the same completion step, and configure how many tool calls it may perform at most.
| Whether tools are called correctly and with appropriate parameters depends on the LLM used. Modern LLMs are fine-tuned to be good at calling tools. |
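Conceptually, the per-step decision can be sketched as follows. This is a hedged illustration with hypothetical names (`completeStep`, `runAgent`, the `add` tool); the platform's internal implementation is not exposed and may differ.

```typescript
type ToolCall = { name: string; args: Record<string, unknown> };
type StepResult =
  | { type: "output"; text: string }
  | { type: "tool"; call: ToolCall };

// Stand-in for one LLM completion step: here it "decides" to call a
// calculator tool until the scratchpad holds a result, then answers.
function completeStep(scratchpad: string[]): StepResult {
  if (scratchpad.length === 0) {
    return { type: "tool", call: { name: "add", args: { a: 2, b: 3 } } };
  }
  return { type: "output", text: `The answer is ${scratchpad[0]}` };
}

// Hypothetical tool registry mapping tool names to implementations.
const tools: Record<string, (args: any) => string> = {
  add: ({ a, b }) => String(a + b),
};

// The agent loop: call tools (up to a configured maximum) until the
// model produces final output.
function runAgent(maxToolCalls: number): string {
  const scratchpad: string[] = [];
  for (let calls = 0; calls <= maxToolCalls; calls++) {
    const step = completeStep(scratchpad);
    if (step.type === "output") return step.text;
    scratchpad.push(tools[step.call.name](step.call.args));
  }
  return "Tool call limit reached";
}
```

The maximum-tool-calls setting bounds the loop, so a misbehaving model cannot call tools indefinitely.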
Vector sources
| LLMs are trained to generate plausible text to complete a prompt. Despite many advances in training these models, there is no general guarantee that a plausible completion is also factually correct. |
To improve this, you can provide extra knowledge to the LLM within its "context window".
You can provide contextual knowledge to the agent with semantic search by selecting vector-enabled tables as a context source. If you do this, for every user message to the agent, a similarity search with the user query is performed in the selected tables.
Configure the Max. number of supplied contexts and the Similarity Threshold to fit your use case.
| The similarity measure is not necessarily calibrated. Depending on the embedding model used, a similarity score of 0.47 could be a good match or a completely unrelated context. |
| Test your dataset with queries and see which similarity threshold is suitable for your specific case. |
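The interplay of threshold and maximum count can be sketched as below. This is an illustration with made-up scores and hypothetical function names; the platform's actual retrieval logic is not exposed.

```typescript
// Cosine similarity between two embedding vectors. Scores are
// model-dependent: the same numeric threshold can mean different
// things for different embedding models.
function cosineSimilarity(a: number[], b: number[]): number {
  const dot = a.reduce((sum, x, i) => sum + x * b[i], 0);
  const norm = (v: number[]) => Math.sqrt(v.reduce((s, x) => s + x * x, 0));
  return dot / (norm(a) * norm(b));
}

// Keep only contexts above the similarity threshold, capped at the
// maximum number of supplied contexts.
function selectContexts(
  candidates: { text: string; score: number }[],
  threshold: number,
  maxContexts: number,
): string[] {
  return candidates
    .filter((c) => c.score >= threshold)
    .sort((x, y) => y.score - x.score)
    .slice(0, maxContexts)
    .map((c) => c.text);
}
```

Raising the threshold trades recall for precision: fewer, but more relevant, contexts reach the LLM's context window.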
Model and other configurations
JSON Schema
| Structured Output configuration is only available if the response format is set to json_schema. |
When using the json_schema output format, you can define the schema that the LLM should produce as output. Open the editor by clicking the value help on the JSON schema field and configure the schema either as text or via the visual editor.
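For example, a schema for a sentiment classification response might look like this. The field names are illustrative, not prescribed by the platform:

```json
{
  "type": "object",
  "properties": {
    "sentiment": {
      "type": "string",
      "enum": ["positive", "neutral", "negative"]
    },
    "confidence": { "type": "number" }
  },
  "required": ["sentiment", "confidence"]
}
```

With such a schema, the LLM's output can be parsed and processed programmatically instead of as free text.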
Temperature
The temperature controls the "creativity" of the output: the lower the temperature, the more predictable the output is. Use a low temperature for tasks like classification. Note that a very low temperature may lead to the model repeating itself or its context.
The higher the temperature, the more "creative" and unexpected the results are. Use a higher temperature if your task is akin to creative writing.
Daily user token rate
Set the daily user token rate to limit how many tokens a user can consume per day. This is useful if you want to cap the usage of your agent and guard against unexpected costs.
Hide log content
When selected, the input and output of the agent are not visible in the Agent Trace tool; only metadata is retained.
Last N-Messages
Configure how many of the "old" messages in a thread are resent to the LLM when generating a new response. Use a low number to save token costs, or a higher number to give the LLM more context about the conversation.
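The trimming behavior can be sketched as follows, assuming a simple message shape; the platform's actual message format and trimming logic may differ.

```typescript
type Message = { role: "user" | "assistant"; content: string };

// Keep only the last N messages of the thread before resending it
// to the LLM; older messages are dropped to save tokens.
function lastNMessages(messages: Message[], n: number): Message[] {
  return n <= 0 ? [] : messages.slice(-n);
}
```

Messages that are dropped this way are no longer visible to the LLM, so it may "forget" details from earlier in the conversation.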
Pre-processing script
You can add a custom pre-processing script that runs at agent invocation. When you add a pre-script, it receives the following data in the variable payload:
{
  userInfo: User, // the user object from the request
  query: string, // the request from the user
  messages: Array<Message>, // the previous messages of the chat
  variables: Record<string, any>, // the variables
  threadID: string, // the thread ID
  agentConfig: {
    model: string, // the model uuid
    temperature: number, // the temperature
    tools: ai_tool[], // the AI tools configured for the agent
  },
}
In the script, assign a string to the variable result. This result is appended to the system instruction.
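A minimal pre-script might look like the sketch below. The fields follow the payload shape documented above, but the concrete values (the user object's name field, the locale variable) are hypothetical, and payload is stubbed here so the snippet is self-contained; in the platform it is provided at runtime.

```typescript
// Stub of the payload the platform would supply at agent invocation.
const payload = {
  userInfo: { name: "Ada" }, // assumed user object shape
  query: "What is the refund policy?",
  messages: [] as { role: string; content: string }[],
  variables: { locale: "en-US" } as Record<string, any>, // assumed variable
  threadID: "thread-123",
  agentConfig: { model: "model-uuid", temperature: 0.2, tools: [] as unknown[] },
};

// Build extra system-instruction text from the request context.
// Whatever string is assigned to result is appended to the
// system instruction.
let result = "";
result += `The user's name is ${payload.userInfo.name}. `;
result += `Answer in the locale ${payload.variables.locale}.`;
```

This pattern lets you personalize the system instruction per request, for example with user details or thread variables, without editing the agent's base instruction.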