How to Improve LLM Performance
This guide presents various strategies and techniques to enhance the results obtained from your LLM. You can experiment with these methods individually or in combination to find the most effective approach for your needs. Some strategies include:
Write Clear Instructions: Request brief responses if the outputs are too long. If the results are too simple, ask for expert-level writing. If you dislike the format, demonstrate the format you prefer. The less the LLM has to guess about your intentions, the more likely you are to receive the desired output. A short prompt sketch follows the tactics below.
Tactics:
Include specific details in your query for more relevant answers.
Ask the model to adopt a persona.
Use delimiters to indicate distinct parts of the input.
Specify the steps required to complete a task.
Provide examples.
Specify the desired length of the output.
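As a rough illustration, the snippet below assembles a single prompt that combines several of these tactics: a persona, explicit steps, an example of the desired format, a length limit, and delimiters around the input. The article variable is a placeholder for your own input text.

```python
# A minimal sketch of a prompt that combines several tactics:
# persona, explicit steps, an example of the format, a length limit,
# and triple-quote delimiters around the input text.
article = "..."  # placeholder: paste the document you want summarized

prompt = (
    "You are a senior technical editor.\n"              # persona
    "Follow these steps:\n"                              # explicit steps
    "1. Read the article delimited by triple quotes.\n"
    "2. Summarize it in at most 3 bullet points.\n"      # desired length
    "3. End with a one-sentence takeaway.\n\n"
    "Desired format:\n"                                   # example of the format
    "- point\n"
    "- point\n"
    "Takeaway: ...\n\n"
    f'Article:\n"""{article}"""'                          # delimiters
)
print(prompt)
```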
Provide Reference Text: LLMs can confidently generate fabricated answers, particularly when asked about obscure topics or for citations and URLs. Providing reference text can help them answer with fewer fabrications. A prompt sketch follows the tactics below.
Tactics:
Instruct the model to answer using a reference text.
Instruct the model to answer with citations from a reference text.
See Offline Data Loaders.
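As a rough sketch, the prompt below confines the model to a supplied reference text and asks it to quote the passage it relied on. The reference and question variables are placeholders for your own document and query.

```python
# A minimal sketch of a prompt that confines the model to a reference text
# and asks for citations. `reference` and `question` are placeholders.
reference = "..."  # placeholder: the trusted source document
question = "What does the document say about the retention policy?"

prompt = (
    "Answer the question using only the reference text delimited by triple quotes.\n"
    "Quote the passage you relied on. If the answer is not in the text, "
    'reply "I could not find an answer."\n\n'
    f'"""{reference}"""\n\n'
    f"Question: {question}"
)
print(prompt)
```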
Break Down Complex Tasks into Simpler Subtasks: Complex tasks have higher error rates than straightforward ones. Decomposing a complex task into simpler, more specific subtasks can improve performance, and the outputs of earlier subtasks can be used to construct the inputs to later ones. A summarization sketch follows the tactics below.
Tactics:
Use intent classification to identify the most relevant instructions for a user query.
For dialogue applications requiring long conversations, summarize or filter previous dialogue.
Summarize long documents piecewise and construct a full summary recursively.
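The sketch below illustrates the piecewise-summarization tactic. It assumes a hypothetical summarize() helper standing in for a single LLM call, and it recurses until the combined partial summaries fit within one chunk.

```python
# A minimal sketch of recursive, piecewise summarization. summarize() is a
# hypothetical helper standing in for a single LLM call.
from textwrap import wrap

def summarize(text: str) -> str:
    # placeholder: replace with a call to your LLM client
    return text[:200]

def summarize_document(document: str, chunk_size: int = 4000) -> str:
    if len(document) <= chunk_size:          # base case: small enough to summarize directly
        return summarize(document)
    chunks = wrap(document, chunk_size)      # split the long document into pieces
    partials = "\n".join(summarize(c) for c in chunks)   # summarize each piece
    return summarize_document(partials, chunk_size)      # recurse on the partial summaries
```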
Allow LLMs Time to "Think": LLMs make more reasoning errors when forced to answer immediately rather than taking time to work out an answer. Asking for a chain of reasoning before a response can help LLMs reason their way toward correct answers more reliably. An inner-monologue sketch follows the tactics below.
Tactics:
Instruct the model to work out its solution before rushing to a conclusion.
Use an inner monologue or a sequence of queries to hide the model's reasoning process.
Ask the model if it missed anything on previous passes.
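The sketch below illustrates the inner-monologue tactic: the prompt asks the model to reason inside <scratchpad> tags, and the code strips that section before the answer is shown to the user. call_llm() is a placeholder for your LLM client.

```python
# A minimal sketch of the inner-monologue tactic: reasoning is requested
# inside <scratchpad> tags and removed before the user sees the reply.
import re

def call_llm(prompt: str) -> str:
    # placeholder: replace with a real LLM call; this stub returns a canned reply
    return "<scratchpad>2 * 3 = 6, 6 + 4 = 10</scratchpad>\nAnswer: 10"

prompt = (
    "Work out your own solution step by step inside <scratchpad> tags, "
    "then give the final answer on a new line starting with 'Answer:'.\n\n"
    "Question: What is 2 * 3 + 4?"
)

raw = call_llm(prompt)
visible = re.sub(r"<scratchpad>.*?</scratchpad>", "", raw, flags=re.DOTALL).strip()
print(visible)  # only the final answer reaches the user
```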
Utilize External Tools: Compensate for LLM weaknesses by feeding them the outputs of other tools. For example, a text retrieval system can inform LLMs about relevant documents, and a code execution engine can help LLMs perform math and run code. If a task can be done more reliably or efficiently by a tool rather than an LLM, offload it to get the best results. A retrieval sketch follows the tactics below.
Tactics:
Use embeddings-based search for efficient knowledge retrieval.
Use code execution for more accurate calculations or call external APIs.
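A rough sketch of embeddings-based retrieval follows. embed() is a hypothetical stand-in for a real embeddings model; the best-matching document would then be pasted into the prompt as reference text.

```python
# A minimal sketch of embeddings-based search. embed() is a hypothetical
# stand-in for a real embeddings model; cosine similarity ranks documents.
import math

def embed(text: str) -> list[float]:
    # placeholder: replace with a real embeddings model
    return [float(len(text)), float(text.count(" ")), float(text.count("e"))]

def cosine(a: list[float], b: list[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

documents = [
    "Refunds are issued within 30 days of purchase.",
    "Standard shipping takes 5 business days.",
]
doc_vectors = [embed(d) for d in documents]

query = "How long do refunds take?"
query_vector = embed(query)

# pick the most similar document to include as reference text in the prompt
best = max(range(len(documents)), key=lambda i: cosine(query_vector, doc_vectors[i]))
print(documents[best])
```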
Test Changes Systematically: Improving performance is easier if you can measure it. In some cases, a prompt modification will improve performance on a few isolated examples but worsen overall performance on a more representative set. Defining a comprehensive test suite (also known as an "eval") may be necessary to ensure that a change is a net positive for performance. A small eval sketch follows below.
Tactic:
Evaluate model outputs against gold-standard answers.
See Evaluation.
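As a rough sketch, the eval below pairs prompts with gold-standard answers and scores the fraction of model outputs that contain the expected answer. call_llm() is again a placeholder for your LLM client.

```python
# A minimal sketch of an eval: score the fraction of model outputs that
# contain the gold-standard answer. call_llm() is a placeholder.
def call_llm(prompt: str) -> str:
    # placeholder: replace with a real LLM call; this stub returns a canned reply
    return "Paris is the capital of France."

eval_set = [
    {"prompt": "What is the capital of France?", "gold": "Paris"},
    {"prompt": "What is 12 * 12?", "gold": "144"},
]

correct = sum(
    case["gold"].lower() in call_llm(case["prompt"]).lower()
    for case in eval_set
)
print(f"accuracy: {correct}/{len(eval_set)}")
```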
Each of the strategies listed above can be instantiated with specific tactics. These tactics are meant to provide ideas for things to try. They are not fully comprehensive, and you should feel free to test creative ideas not represented here.