Add Memory
Add memory to your LLMs in Stack AI. Improve user interaction by enabling models to remember previous conversations and provide more context-aware responses.
LLMs do not hold an internal state, yet many applications (e.g. chatbots) require tracking previous interactions with the LLM as part of the interface. To this end, you can add memory to an LLM node in Stack AI by clicking the gear icon on the LLM node.
Some quick facts:
All the LLM memory is encrypted end-to-end in the Stack AI database.
This data can be self-hosted under the Stack AI enterprise plan.
The LLM memory is user-dependent: each user gets their own instance of the memory.
Once the flow is deployed as an API, you can specify a user_id for the LLM memory of each user (see the Deployer Guide).
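As a rough sketch of what a per-user request might look like, the snippet below builds a request body that pins the memory to one user. The endpoint URL, the input field name (in-0), and the payload shape are assumptions for illustration; check your project's deployment page for the exact URL and field names.

```python
import json

# Placeholder endpoint -- replace with the URL shown in your deployment page.
API_URL = "https://example.invalid/run/<flow_id>"

def build_payload(user_message: str, user_id: str) -> dict:
    """Build a request body that scopes the LLM memory to one user.

    The user_id field selects which memory instance the flow reads from
    and writes to, so each user keeps a separate conversation history.
    (Field names here are illustrative, not Stack AI's exact schema.)
    """
    return {
        "in-0": user_message,  # assumed input node id
        "user_id": user_id,    # memory instance selector
    }

payload = build_payload("What did I ask you earlier?", "user-42")
print(json.dumps(payload))
```

Sending this payload with a different user_id would read and write a completely separate memory instance.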
By default, the “Sliding Window with Input” memory is selected when a new LLM node is added to the flow.
We offer three types of memory modalities:
Sliding Window
Stores all LLM prompts and completions.
This strategy may consume many tokens as the LLM prompts can often occupy thousands of tokens.
Loads a window of the previous prompts and completions as part of the LLM conversation memory, up to the number of messages in the window.
In non-chat models (e.g. Davinci), the memory is added as part of the prompt as a list of messages at the end of the prompt.
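The sliding-window behavior described above can be sketched in a few lines. This is a conceptual illustration, not Stack AI's implementation: a bounded deque holds the most recent messages, and older prompt/completion pairs fall out of context automatically.

```python
from collections import deque

class SlidingWindowMemory:
    """Keep only the most recent prompt/completion messages.

    `window` is the maximum number of messages retained; once it is
    full, each new message evicts the oldest one.
    """
    def __init__(self, window: int):
        self.messages = deque(maxlen=window)

    def record(self, prompt: str, completion: str) -> None:
        # Each turn stores the full prompt and the completion,
        # which is why this strategy can consume many tokens.
        self.messages.append({"role": "user", "content": prompt})
        self.messages.append({"role": "assistant", "content": completion})

    def context(self) -> list:
        """Messages to prepend to the next LLM call."""
        return list(self.messages)

mem = SlidingWindowMemory(window=4)
mem.record("Hi, I'm Ada.", "Hello Ada!")
mem.record("What's my name?", "Your name is Ada.")
mem.record("Thanks.", "You're welcome!")
print(len(mem.context()))  # → 4: only the most recent messages remain
```

With a window of 4 messages, the first turn has already been evicted after three exchanges, which is the trade-off between context length and token cost.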
Sliding Window with Input
Stores one LLM input parameter (e.g. in-0) and all LLM completions, without storing the entire prompt from each turn.
This strategy is more token-efficient and aligned with many applications (e.g. when only the user message from the input is relevant).
Loads a window of the previous inputs and completions as part of the LLM conversation memory, up to the number of messages in the window. In non-chat models (e.g. text-davinci-003), the memory is added as part of the prompt as a list of messages at the end.
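To make the difference from the plain sliding window concrete, here is a conceptual sketch (not Stack AI's implementation): only the selected input parameter is stored per turn, while the rest of the assembled prompt is discarded.

```python
from collections import deque

class SlidingWindowInputMemory:
    """Store one chosen input parameter (e.g. 'in-0') plus the
    completion, instead of the full assembled prompt of each turn."""
    def __init__(self, window: int, input_id: str = "in-0"):
        self.input_id = input_id
        self.messages = deque(maxlen=window)

    def record(self, inputs: dict, completion: str) -> None:
        # Only the selected input is kept; other inputs (e.g. a long
        # retrieved document in 'in-1') are not stored in memory.
        self.messages.append({"role": "user", "content": inputs[self.input_id]})
        self.messages.append({"role": "assistant", "content": completion})

    def context(self) -> list:
        return list(self.messages)

mem = SlidingWindowInputMemory(window=6)
mem.record(
    {"in-0": "Summarize the doc.", "in-1": "<long document text>"},
    "Here is a summary...",
)
print(mem.context()[0]["content"])  # → "Summarize the doc."
```

The token saving comes from dropping everything except the chosen input: the long document in `in-1` never re-enters the context on later turns.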
VectorDB
Stores all of the inputs and outputs to the LLM in a Vector Database and retrieves the most relevant messages to use as LLM memory.
This is especially useful if you expect some of the information to be needed at a later time but not in a sequential manner.
Allows the LLM to access older, contextually relevant interactions without the constraint of a fixed window size.
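The retrieval idea behind VectorDB memory can be sketched as follows. This toy version uses a bag-of-words similarity instead of a learned embedding model and an actual vector database, but it shows the key property: relevance, not recency, decides which messages re-enter the context.

```python
import math
from collections import Counter

def embed(text: str) -> Counter:
    # Toy bag-of-words "embedding"; a real deployment would use a
    # learned embedding model and a vector database.
    return Counter(text.lower().replace("?", " ").replace(".", " ").split())

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[w] * b[w] for w in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

class VectorMemory:
    """Store every exchange and retrieve the most relevant ones,
    no matter how long ago they occurred."""
    def __init__(self):
        self.entries = []

    def record(self, text: str) -> None:
        self.entries.append((embed(text), text))

    def retrieve(self, query: str, k: int = 2) -> list:
        q = embed(query)
        ranked = sorted(self.entries, key=lambda e: cosine(e[0], q), reverse=True)
        return [text for _, text in ranked[:k]]

mem = VectorMemory()
mem.record("My shipping address is 12 Elm Street.")
mem.record("I prefer dark mode in the app.")
mem.record("The weather is nice today.")
print(mem.retrieve("What is my shipping address?", k=1))
# → ['My shipping address is 12 Elm Street.']
```

The shipping address surfaces even though two unrelated exchanges happened after it, which is exactly what a fixed-size window cannot do.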
Sliding Window
The sliding window lets you set the number of conversation turns included in the context.
Input Id
If you chose 'Sliding Window with Input', you can also select the ID of the input you would like held in context. All other inputs will not be stored in context.
Citations
Turn on citations to allow the AI to provide citations (references) for the information it generates, especially when it uses external sources or uploaded documents.
Response Format
Text is the default response format. You can also choose to have the AI return a response formatted as a JSON object.
Text
The default option. Best for most conversational, summary, or narrative outputs.
JSON Object
Useful when you want structured data for further processing, such as extracting specific fields, integrating with APIs, or using the output in downstream nodes that expect JSON.
JSON Object with Schema
Choose this when you want to specify an exact JSON Schema for the output, so the AI returns data in a precise format for integration, automation, or validation.
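As an illustration of why a schema helps downstream processing, the snippet below shows a small JSON Schema and a model response that conforms to it. The schema and field names are hypothetical examples, not Stack AI's configuration format.

```python
import json

# Example JSON Schema you might attach in the node settings so the
# model must return exactly these fields (illustrative only).
schema = {
    "type": "object",
    "properties": {
        "name": {"type": "string"},
        "priority": {"type": "integer"},
    },
    "required": ["name", "priority"],
}

# A response constrained to the schema parses cleanly downstream.
raw = '{"name": "Renew license", "priority": 2}'
ticket = json.loads(raw)

# Minimal conformance check: all required fields are present.
assert all(key in ticket for key in schema["required"])
print(ticket["priority"])  # → 2
```

Downstream nodes can then read `ticket["priority"]` directly instead of parsing free-form text.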