Using Multiple LLMs

When using multiple LLMs in one project, keep the following points in mind to ensure they work well together.


Clear Input/Output Flow

  • Explicit Connections: Each LLM node should have clearly defined input and output connections. Use Input nodes (in-0, in-1, etc.) to gather user data, and connect them to the relevant LLM nodes.

  • Output Handling: Route the output of each LLM node to Output nodes or downstream processing nodes (such as Template or Python nodes) for further formatting or logic; a wiring sketch follows this list.
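As a rough illustration of this wiring, the fragment below represents an input → LLM → output flow as a simple graph. The node IDs (in-0, llm-0, out-0) follow the naming used on this page, but the dictionary structure itself is a hypothetical sketch, not the platform's actual configuration format.

```python
# Hypothetical representation of an input -> LLM -> output flow.
# Node IDs follow the in-0 / llm-0 / out-0 naming used above; the
# structure is illustrative only, not a real configuration format.
workflow = {
    "nodes": {
        "in-0": {"type": "input", "label": "user_question"},
        "llm-0": {"type": "llm", "prompt": "Answer the question: {{in-0}}"},
        "out-0": {"type": "output"},
    },
    "edges": [
        ("in-0", "llm-0"),   # user data feeds the LLM
        ("llm-0", "out-0"),  # the LLM result is routed to the Output node
    ],
}
```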

Sequential vs. Parallel LLMs

  • Sequential Orchestration: If the output of one LLM is needed as input for another, connect them in sequence (e.g., llm-0 → llm-1). This is useful for multi-step reasoning or refinement, and having earlier LLMs produce structured output for downstream LLMs is often helpful.

  • Parallel Orchestration: If you want to compare or aggregate results from multiple LLMs, connect the same input to several LLM nodes in parallel, then merge their outputs downstream using the Combine Node or a third LLM that summarizes and logically merges the two outputs (see the sketch below).
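The two styles can be sketched in plain Python. The call_llm(model, prompt) helper below is a placeholder for whatever client call your provider exposes; it is an assumption for illustration, not part of the platform.

```python
# Minimal sketch of sequential vs. parallel orchestration, assuming a
# hypothetical call_llm(model, prompt) helper that wraps your provider's API.

def call_llm(model: str, prompt: str) -> str:
    raise NotImplementedError("replace with your provider's client call")

question = "Summarize the attached quarterly report."

# Sequential: llm-0 produces structured output that llm-1 refines.
facts = call_llm("llm-0", f"Extract the key facts as bullet points:\n{question}")
summary = call_llm("llm-1", f"Write a polished summary from these facts:\n{facts}")

# Parallel: the same input goes to two models; a third merges the results.
answer_a = call_llm("model-a", question)
answer_b = call_llm("model-b", question)
merged = call_llm(
    "merge-llm",
    f"Combine these two answers into one coherent response:\n1) {answer_a}\n2) {answer_b}",
)
```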

Memory and State

  • Sliding Window Memory: Use the memory feature in LLM nodes to maintain context across turns or steps, especially in multi-turn workflows.

  • Stateful Processing: If you need to track or update state, consider using Python nodes between LLMs to manipulate or store intermediate results, as sketched below.
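A sliding window can be approximated with a bounded queue of recent turns, and a Python node can hold similar logic for other state. The snippet below is a minimal sketch, again assuming a hypothetical call_llm helper and a window of four exchanges.

```python
from collections import deque

def call_llm(model, prompt):  # placeholder for your provider's client call
    raise NotImplementedError

WINDOW_TURNS = 4                          # assumed window size; tune per use case
memory = deque(maxlen=2 * WINDOW_TURNS)   # two lines (user + assistant) per turn

def chat_turn(user_message: str) -> str:
    """Answer one turn while keeping only the most recent exchanges as context."""
    context = "\n".join(memory)
    reply = call_llm("llm-0", f"{context}\nUser: {user_message}\nAssistant:")
    memory.append(f"User: {user_message}")
    memory.append(f"Assistant: {reply}")
    return reply
```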

Error Handling and Fallbacks

  • On Failure Branches: Configure on_failure_branch and retry settings for each LLM node to handle errors gracefully.

  • Fallback LLMs: Use the fallback options to specify alternative models or providers if the primary LLM fails (a retry-and-fallback sketch follows this list).
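Conceptually, retries and fallbacks amount to trying the primary model a few times and then moving on to an alternative. The sketch below shows the idea in plain Python; the model names, retry counts, and call_llm helper are assumptions, and the platform's own on_failure_branch and fallback settings should be preferred where available.

```python
import time

def call_llm(model, prompt):  # placeholder for your provider's client call
    raise NotImplementedError

def call_with_fallback(prompt, models=("primary-model", "fallback-model"),
                       retries=2, delay=1.0):
    """Try each model in order, retrying transient failures before falling back."""
    last_error = None
    for model in models:
        for attempt in range(retries):
            try:
                return call_llm(model, prompt)
            except Exception as exc:   # in practice, catch provider-specific errors
                last_error = exc
                time.sleep(delay * (attempt + 1))  # simple linear backoff
    raise RuntimeError("all models failed") from last_error
```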

Data Formatting and Validation

  • Template Nodes: Use Template nodes to format or merge outputs from multiple LLMs before presenting to the user.

  • Output Validation: If LLMs are expected to return structured data (e.g., JSON), use the json_schema parameter to enforce the output format and validate results, as in the validation sketch below.
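When an LLM is asked for JSON, it also helps to parse and validate the text before any downstream node consumes it. The sketch below uses the third-party jsonschema package and a made-up schema; both are assumptions for illustration, separate from the json_schema parameter mentioned above.

```python
import json
from jsonschema import validate  # pip install jsonschema

# Hypothetical schema that downstream nodes expect from the LLM.
SCHEMA = {
    "type": "object",
    "properties": {
        "summary": {"type": "string"},
        "sentiment": {"type": "string", "enum": ["positive", "neutral", "negative"]},
    },
    "required": ["summary", "sentiment"],
}

def parse_llm_output(raw_text: str) -> dict:
    """Parse and validate structured LLM output before passing it downstream."""
    data = json.loads(raw_text)             # raises ValueError on malformed JSON
    validate(instance=data, schema=SCHEMA)  # raises ValidationError on schema mismatch
    return data
```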

Chaining with Other Nodes

  • Integration with Actions: LLM outputs can be passed to Action nodes (e.g., sending emails, updating databases) for real-world effects.

  • Custom Logic: Insert Python nodes between LLMs for custom logic, filtering, or aggregation (example below).
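As a small example of the kind of logic a Python node might hold between two LLMs, the function below filters one model's candidate answers before a second model ranks them. The function name and input format are hypothetical.

```python
def filter_candidates(candidates: list[str]) -> str:
    """Drop empty or very short candidates before the next LLM ranks the rest."""
    filtered = [c.strip() for c in candidates if len(c.strip()) > 20]
    # The returned string becomes the next LLM node's input.
    return "\n".join(f"- {c}" for c in filtered)
```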

Citations and Traceability

  • Citations: Enable citations in LLM nodes if you want to track sources or provide references in the output.

  • Auditability: Use Output nodes and logs to trace the flow of data and decisions across multiple LLMs; a simple logging sketch follows this list.
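One lightweight way to get an audit trail is to log each node's input and output as structured records. The sketch below uses Python's standard logging module; the node IDs and record fields are illustrative assumptions.

```python
import json
import logging
from datetime import datetime, timezone

logging.basicConfig(level=logging.INFO)
logger = logging.getLogger("llm-trace")

def log_node_event(node_id: str, payload: dict) -> None:
    """Record what a node received and produced so a run can be audited later."""
    record = {"time": datetime.now(timezone.utc).isoformat(), "node": node_id, **payload}
    logger.info(json.dumps(record))

# Example: trace an LLM node's input and output.
log_node_event("llm-0", {"input": "user question...", "output": "model answer..."})
```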

Performance and Latency

  • Parallelization: Where possible, run LLMs in parallel to reduce overall latency (see the sketch after this list).

  • Token and Cost Management: Set appropriate max_tokens and temperature values to control cost and response quality.
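Outside the platform, fanning the same prompt out to several models at once is straightforward with a thread pool, as sketched below; the model names and call_llm helper are placeholders.

```python
from concurrent.futures import ThreadPoolExecutor

def call_llm(model, prompt):  # placeholder for your provider's client call
    raise NotImplementedError

def run_in_parallel(prompt: str, models: list[str]) -> dict[str, str]:
    """Send the same prompt to several models at once to cut wall-clock latency."""
    with ThreadPoolExecutor(max_workers=len(models)) as pool:
        futures = {m: pool.submit(call_llm, m, prompt) for m in models}
        return {m: f.result() for m, f in futures.items()}

results = run_in_parallel("Classify this support ticket.", ["model-a", "model-b"])
```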


Summary Table:

| Aspect | Best Practice |
| --- | --- |
| Input/Output Flow | Use explicit node connections and references |
| Orchestration Style | Choose sequential or parallel based on use case |
| Prompt Engineering | Customize prompts and use context passing |
| Memory/State | Use memory features and Python nodes for stateful logic |
| Error Handling | Configure retries, fallbacks, and failure branches |
| Data Formatting | Use Template nodes and output validation |
| Chaining/Integration | Connect to Action nodes and use Python for custom logic |
| Citations/Traceability | Enable citations and use Output nodes for auditability |
| Performance | Parallelize where possible, manage tokens and latency |

