Using Multiple LLMs
When using multiple LLMs in one project, there are several important points to consider to ensure they work well together.
Clear Input/Output Flow
Explicit Connections: Each LLM node should have clearly defined input and output connections. Use Input nodes (in-0, in-1, etc.) to gather user data, and connect them to the relevant LLM nodes.
Output Handling: Route the output of each LLM node to Output nodes or downstream processing nodes (like Template or Python nodes) for further formatting or logic.
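For illustration, here is a minimal sketch of explicit wiring expressed as a plain Python dictionary. The node ids (in-0, in-1, llm-0, out-0) follow the naming used in this guide, but the dict structure is a hypothetical stand-in, not this platform's actual configuration format.

```python
# Hypothetical node graph: every edge is declared on both ends, so the flow
# of data between nodes is unambiguous and easy to check before running.
workflow = {
    "in-0":  {"type": "input",  "inputs": [],               "outputs": ["llm-0"]},
    "in-1":  {"type": "input",  "inputs": [],               "outputs": ["llm-0"]},
    "llm-0": {"type": "llm",    "inputs": ["in-0", "in-1"], "outputs": ["out-0"]},
    "out-0": {"type": "output", "inputs": ["llm-0"],        "outputs": []},
}

# Catch dangling connections up front, before the workflow runs.
for node_id, node in workflow.items():
    for target in node["outputs"]:
        assert node_id in workflow[target]["inputs"], f"dangling edge {node_id} -> {target}"
```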
Sequential vs. Parallel LLMs
Sequential Orchestration: If the output of one LLM is needed as input for another, connect them in sequence (e.g., llm-0 → llm-1). This is useful for multi-step reasoning or refinement; having early LLMs emit structured outputs for downstream LLMs to consume can help.
Parallel Orchestration: If you want to compare or aggregate results from multiple LLMs, connect the same input to several LLM nodes in parallel, then merge their outputs downstream using the Combine Node or a third LLM that summarizes and logically merges the two outputs. Both styles are sketched below.
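Here is an illustrative sketch of both orchestration styles in plain Python. The call_llm helper is a hypothetical stand-in for invoking an LLM node; the model names mirror the llm-0/llm-1 naming above.

```python
import asyncio

async def call_llm(model: str, prompt: str) -> str:
    # Hypothetical stand-in for a real LLM node invocation.
    await asyncio.sleep(0.1)
    return f"[{model}] response to: {prompt}"

async def sequential(prompt: str) -> str:
    # llm-0 produces a structured draft that llm-1 then refines.
    draft = await call_llm("llm-0", prompt)
    return await call_llm("llm-1", f"Refine this draft:\n{draft}")

async def parallel(prompt: str) -> str:
    # The same input fans out to two models; a third call merges the results,
    # mirroring the Combine Node / summarizer pattern described above.
    a, b = await asyncio.gather(
        call_llm("llm-0", prompt),
        call_llm("llm-1", prompt),
    )
    return await call_llm("llm-2", f"Merge these answers:\nA: {a}\nB: {b}")

print(asyncio.run(sequential("Summarize our Q3 results")))
print(asyncio.run(parallel("Summarize our Q3 results")))
```

Note that the parallel style also reduces end-to-end latency, which is covered further under Performance and Latency.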
Memory and State
Sliding Window Memory: Use the memory feature in LLM nodes to maintain context across turns or steps, especially in multi-turn workflows; a minimal sketch follows this section.
Stateful Processing: If you need to track or update state, consider using Python nodes between LLMs to manipulate or store intermediate results.
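To make the sliding-window idea concrete, here is a minimal sketch assuming a window measured in messages; the class name and message format are illustrative, not this platform's memory API.

```python
from collections import deque

class SlidingWindowMemory:
    """Keep only the last N messages so prompts stay bounded."""

    def __init__(self, window_size: int = 6):
        # A deque with maxlen drops the oldest message automatically.
        self.messages = deque(maxlen=window_size)

    def add(self, role: str, content: str) -> None:
        self.messages.append({"role": role, "content": content})

    def context(self) -> list:
        # Pass this list as context to the next LLM call.
        return list(self.messages)

memory = SlidingWindowMemory(window_size=4)
memory.add("user", "What does node llm-0 do?")
memory.add("assistant", "It classifies the incoming request.")
print(memory.context())
```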
Error Handling and Fallbacks
On Failure Branches: Configure on_failure_branch and retry settings for each LLM node to handle errors gracefully.
Fallback LLMs: Use the fallback options to specify alternative models/providers if the primary LLM fails.
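The retry-then-fallback pattern looks roughly like this in plain Python; call_llm and the model names are hypothetical placeholders, not this platform's actual fallback configuration.

```python
import time

def call_llm(model: str, prompt: str) -> str:
    # Hypothetical stand-in for a provider call that may raise on timeouts,
    # rate limits, and so on.
    return f"[{model}] response to: {prompt}"

def call_with_fallback(prompt: str, models: list, retries: int = 2) -> str:
    last_error = None
    for model in models:                      # primary first, then fallbacks
        for attempt in range(retries):
            try:
                return call_llm(model, prompt)
            except Exception as exc:          # in practice, catch provider-specific errors
                last_error = exc
                time.sleep(2 ** attempt)      # exponential backoff before retrying
    raise RuntimeError("all models failed") from last_error

print(call_with_fallback("Classify this ticket", ["primary-model", "fallback-model"]))
```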
Data Formatting and Validation
Template Nodes: Use Template nodes to format or merge outputs from multiple LLMs before presenting to the user.
Output Validation: If LLMs are expected to return structured data (e.g., JSON), use the json_schema parameter to enforce output format and validate results.
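Here is a sketch of the validation step, assuming the LLM was asked for JSON matching a small schema. It uses the third-party jsonschema package, which is an assumption for illustration, not something this platform requires.

```python
import json

import jsonschema  # pip install jsonschema

# Example schema the LLM output is expected to satisfy.
schema = {
    "type": "object",
    "properties": {
        "sentiment": {"type": "string", "enum": ["positive", "negative", "neutral"]},
        "confidence": {"type": "number", "minimum": 0, "maximum": 1},
    },
    "required": ["sentiment", "confidence"],
}

raw_output = '{"sentiment": "positive", "confidence": 0.92}'  # sample LLM response
data = json.loads(raw_output)      # raises ValueError if the output is not valid JSON
jsonschema.validate(data, schema)  # raises ValidationError if the structure is wrong
print("validated:", data)
```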
Chaining with Other Nodes
Integration with Actions: LLM outputs can be passed to Action nodes (e.g., sending emails, updating databases) for real-world effects.
Custom Logic: Insert Python nodes between LLMs for custom logic, filtering, or aggregation.
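For example, a Python node sitting between two LLMs might filter and rank one model's candidates before the next model sees them; the field names and threshold below are assumptions for illustration.

```python
def filter_candidates(llm_output: list, min_score: float = 0.7) -> list:
    # Keep only high-confidence items and sort best-first for the next LLM.
    kept = [item for item in llm_output if item.get("score", 0) >= min_score]
    return sorted(kept, key=lambda item: item["score"], reverse=True)

print(filter_candidates([
    {"answer": "A", "score": 0.9},
    {"answer": "B", "score": 0.4},
]))  # only the high-scoring candidate survives
```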
Citations and Traceability
Citations: Enable citations in LLM nodes if you want to track sources or provide references in the output.
Auditability: Use Output nodes and logs to trace the flow of data and decisions across multiple LLMs.
Performance and Latency
Parallelization: Where possible, run LLMs in parallel to reduce overall latency.
Token and Cost Management: Set appropriate max_tokens and temperature settings to control cost and response quality.
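As a rough sketch, per-node settings might be tuned like this; the parameter names mirror common LLM APIs, and the budget check is an illustrative convention, not a feature of this platform.

```python
# Cheap, deterministic steps get tight limits; creative steps get headroom.
llm_settings = {
    "llm-0": {"max_tokens": 256,  "temperature": 0.0},  # classifier step
    "llm-1": {"max_tokens": 1024, "temperature": 0.7},  # drafting step
}

# A crude cap keeps the worst-case output cost of a run predictable.
MAX_TOTAL_TOKENS = 2000
assert sum(cfg["max_tokens"] for cfg in llm_settings.values()) <= MAX_TOTAL_TOKENS
```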
Summary Table:
Input/Output Flow: Use explicit node connections and references.
Orchestration Style: Choose sequential or parallel based on the use case.
Prompt Engineering: Customize prompts and use context passing.
Memory/State: Use memory features and Python nodes for stateful logic.
Error Handling: Configure retries, fallbacks, and failure branches.
Data Formatting: Use Template nodes and output validation.
Chaining/Integration: Connect to Action nodes and use Python for custom logic.
Citations/Traceability: Enable citations and use Output nodes for auditability.
Performance: Parallelize where possible; manage tokens and latency.
If you have a specific orchestration scenario in mind, I can provide a tailored example or workflow structure!