To reduce hallucinations from a large language model (LLM), prompt engineering is the best place to start. Key prompt engineering techniques to try include:

Beyond these prompt engineering techniques, there are two additional steps you can take to reduce hallucinations — retrieval-augmented generation (RAG) and adjusting the LLM’s temperature.

RAG enables the LLM to leverage specific data sources beyond the data it was originally trained on. This helps by grounding the model’s output in the data sources most relevant to your solution. If hallucinations persist even after implementing RAG, it's worth checking whether the retrieved chunks are too large or too small, or whether a different chunking strategy would be more effective.
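To make the mechanics concrete, here is a minimal sketch of the RAG loop described above: chunk a document, retrieve the chunks most relevant to a query, and build a grounded prompt. The function names and the word-overlap scoring are illustrative assumptions; a production system would use an embedding model and a vector store for retrieval.

```python
def chunk(text, size=40):
    """Split a document into fixed-size word chunks (a naive chunking strategy;
    chunk size is one of the knobs to tune if hallucinations persist)."""
    words = text.split()
    return [" ".join(words[i:i + size]) for i in range(0, len(words), size)]

def retrieve(query, chunks, k=2):
    """Rank chunks by word overlap with the query.
    This is a stand-in for embedding-based similarity search."""
    q = set(query.lower().split())
    scored = sorted(chunks, key=lambda c: len(q & set(c.lower().split())), reverse=True)
    return scored[:k]

def build_prompt(query, context_chunks):
    """Ground the model by instructing it to answer only from retrieved context."""
    context = "\n---\n".join(context_chunks)
    return (
        "Answer using ONLY the context below. If the answer is not in the "
        f"context, say you don't know.\n\nContext:\n{context}\n\nQuestion: {query}"
    )
```

The "answer only from the context, otherwise say you don't know" instruction in the prompt is what ties retrieval back to hallucination reduction: it gives the model an explicit alternative to inventing an answer.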

Adjusting the temperature of an LLM means updating the parameter that controls the model’s level of randomness. A low temperature produces less random, more deterministic outputs than a high temperature and can therefore reduce hallucinations. However, it comes with the tradeoff of less creative (and more repetitive) responses, which may not be desirable for your use case.
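Under the hood, temperature scales the model's next-token scores (logits) before they are turned into probabilities. This short sketch, using illustrative logits for three hypothetical candidate tokens, shows why a low temperature concentrates probability on the top-scoring token while a high temperature spreads it out:

```python
import math

def softmax_with_temperature(logits, temperature):
    """Divide logits by the temperature, then apply softmax.
    Lower temperature sharpens the distribution (more deterministic);
    higher temperature flattens it (more random)."""
    scaled = [l / temperature for l in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical logits for three candidate next tokens.
logits = [2.0, 1.0, 0.1]

low = softmax_with_temperature(logits, 0.2)   # near-deterministic
high = softmax_with_temperature(logits, 2.0)  # closer to uniform
```

At temperature 0.2 nearly all of the probability mass lands on the highest-scoring token, so sampling becomes effectively greedy; at 2.0 the lower-scoring tokens keep meaningful probability, which is where the extra creativity (and extra risk of hallucination) comes from.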

Additional Resources:

This response has been generated by an LLM based on notes from PJMF technical consultations. All responses go through human review by our PJMF Products & Services team and are anonymized to protect our consultation participants.