First Run Detection

Some StackAI workflows have two fundamentally different execution phases:

  1. Initialization (first run): time-consuming, setup-oriented work such as reading large documents and establishing baseline context;

  2. Steady-state execution (subsequent runs): lightweight, repeatable logic that operates on previously established context.

First-run detection is a workflow hack that allows agents to automatically distinguish between these two phases and adjust behavior accordingly. This way, instead of redoing heavy setup on every execution, the agent can perform initialization once, reuse the results on future runs, and stay fast and reliable.

When to use this hack

  • The first execution of your agent needs to read large inputs, such as documents, files, or datasets, and extract information from them. This work is only necessary on the first run and should not be repeated on every run.

  • The first run establishes important context, such as document structure or policies, and subsequent runs will use this cached “working context”.

  • The workflow relies on costly operations, such as large knowledge base retrievals, external API calls, or multi-step tool calls. These operations need to run only once, and their results can be reused later.

Why this matters

  • Prevents repeated retrieval over large sources.

  • Reduces latency and token cost.

  • Improves consistency by reusing the same baseline context and reducing context noise.

Potential use cases

  • Trial preparation: summarize all prior filings once, then draft motions from that summary.

  • Medical claim processing: summarize a patient/claim history once, then use that history to adjudicate current claims.

  • Customer support: build a “customer account brief” with all relevant account information once, then answer new tickets from this account accordingly.

  • Compliance review: review all relevant policies and regulations once, then evaluate new transactions or incidents against the established context.

How to set up first-run detection

There are three key nodes to use: an LLM node with memory enabled, a Shared Memory node, and an If/Else node.

Conceptually, the Shared Memory node tells us whether the first run of this workflow has already taken place. If it has, we route the workflow to the subsequent-run branch, which relies on the shared memory as context to complete its tasks, instead of redoing the heavy data ingestion of the initial run.
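The routing described above can be sketched in plain Python. This is only an illustration of the control flow the StackAI nodes implement, not real StackAI code; `run_initialization` and `run_steady_state` are hypothetical stand-ins for the first-run and subsequent-run branches, and the dict stands in for the Shared Memory node.

```python
def run_initialization(user_input: str) -> str:
    # Stand-in for the expensive first-run work (document ingestion, etc.).
    return f"baseline context built from: {user_input}"

def run_steady_state(context: str, user_input: str) -> str:
    # Stand-in for the lightweight logic that reuses the cached context.
    return f"answered '{user_input}' using [{context}]"

def execute(shared_memory: dict, user_input: str) -> str:
    """Route between the first-run and subsequent-run branches."""
    if not shared_memory.get("context"):
        # First run: do the heavy setup once and cache the result.
        shared_memory["context"] = run_initialization(user_input)
    # All runs: answer from the cached baseline context.
    return run_steady_state(shared_memory["context"], user_input)
```

On the first call the memory is empty, so initialization runs and its output is cached; every later call skips straight to the steady-state step.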

1. Enable memory on the LLM nodes

Turn on memory in the LLM node settings (see detailed instructions here). Use a stable user_id if you deploy as an API.

2. Add a Shared Memory node

Configure it to pass memory from your “first run LLM” node. You’ll use its output as the signal for “have we talked before?”.

3. Add an If/Else node after Shared Memory

Condition idea:

  • If Shared Memory output is empty → first run branch

  • Else → subsequent run branch

Use the variable picker in the If/Else UI. Point it at the Shared Memory node output.
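The "is the Shared Memory output empty?" check is worth making robust: depending on the run, the output may be missing entirely or contain only whitespace. A minimal sketch of that condition, assuming the If/Else node receives the Shared Memory output as a plain value (the function name is hypothetical):

```python
def is_first_run(shared_memory_output) -> bool:
    # Treat a missing value, an empty string, or whitespace-only
    # output as "no memory yet" -> route to the first-run branch.
    return shared_memory_output is None or not str(shared_memory_output).strip()
```

Anything non-empty routes to the subsequent-run branch.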

4. First run branch: build the baseline context

Add an LLM ("first run LLM") that does the expensive work once. Example outputs that help later:

  • A structured “context summary” (bullets, entities, timeline).

  • A marker like FIRST_RUN_DONE in the final line.
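If you use the FIRST_RUN_DONE marker, the subsequent-run branch can verify that initialization actually finished rather than merely that some memory exists. A sketch of that check, assuming the memory is available as plain text (the function name is hypothetical):

```python
def first_run_completed(memory_text: str) -> bool:
    # Look at the final non-empty line of the cached memory and
    # require the exact FIRST_RUN_DONE marker there.
    lines = [ln.strip() for ln in memory_text.splitlines() if ln.strip()]
    return bool(lines) and lines[-1] == "FIRST_RUN_DONE"
```

This guards against a partially completed first run, where memory exists but the expensive setup was interrupted before finishing.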

5. Subsequent run branch: answer using the cached context

Add an LLM ("subsequent LLM") that:

  • Reads the summary from memory.

  • Avoids running the expensive tool path again.
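One way to keep the subsequent LLM off the expensive tool path is to make the cached summary the only context in its prompt and to say so explicitly. A hedged sketch of such a prompt template (the wording and function name are illustrative, not a StackAI API):

```python
def build_subsequent_prompt(cached_summary: str, new_request: str) -> str:
    # Inject the cached summary and instruct the model not to
    # re-ingest the original sources.
    return (
        "You have already ingested the source material in a previous run. "
        "Answer using only the context below; do not re-read the original "
        "documents or call ingestion tools.\n\n"
        f"Context:\n{cached_summary}\n\n"
        f"Request: {new_request}"
    )
```

The same template works for any of the use cases above; only the cached summary and the incoming request change between runs.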
