To reduce hallucinations from a large language model (LLM), prompt engineering is the best place to start. Key prompt engineering techniques to try include:

Beyond these prompt engineering techniques, there are two additional steps you can take to reduce hallucinations — retrieval-augmented generation (RAG) and adjusting the LLM’s temperature.

RAG enables the LLM to leverage specific data sources beyond the data it was originally trained on. This helps by grounding the model’s output in the data sources most relevant to your solution. If hallucinations persist even after implementing RAG, it's worth checking whether the retrieved chunks are too large or too small, or whether a different chunking strategy would be more effective.
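To make the mechanics concrete, here is a minimal sketch of the RAG loop described above: chunk a document, retrieve the chunks most relevant to a query, and build a grounded prompt. The function names and the word-overlap scoring are illustrative assumptions; a production system would use an embedding model and a vector store for retrieval.

```python
def chunk(text, size=40):
    """Split a document into fixed-size word chunks (a naive chunking strategy;
    chunk size is one of the knobs to tune if hallucinations persist)."""
    words = text.split()
    return [" ".join(words[i:i + size]) for i in range(0, len(words), size)]

def retrieve(query, chunks, k=2):
    """Rank chunks by word overlap with the query.
    This is a stand-in for embedding-based similarity search."""
    q = set(query.lower().split())
    scored = sorted(chunks, key=lambda c: len(q & set(c.lower().split())), reverse=True)
    return scored[:k]

def build_prompt(query, context_chunks):
    """Ground the model by instructing it to answer only from retrieved context."""
    context = "\n---\n".join(context_chunks)
    return (
        "Answer using ONLY the context below. If the answer is not in the "
        f"context, say you don't know.\n\nContext:\n{context}\n\nQuestion: {query}"
    )
```

The "answer only from the context, otherwise say you don't know" instruction in the prompt is what ties retrieval back to hallucination reduction: it gives the model an explicit alternative to inventing an answer.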

Adjusting the temperature of an LLM means updating the parameter that controls the model’s level of randomness. A low temperature produces less random, more deterministic outputs than a high temperature and can therefore reduce hallucinations. However, it comes with the tradeoff of less creative (and more repetitive) responses, which may not be desirable for your use case.
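Under the hood, temperature scales the model's next-token scores (logits) before they are turned into probabilities. This short sketch, using illustrative logits for three hypothetical candidate tokens, shows why a low temperature concentrates probability on the top-scoring token while a high temperature spreads it out:

```python
import math

def softmax_with_temperature(logits, temperature):
    """Divide logits by the temperature, then apply softmax.
    Lower temperature sharpens the distribution (more deterministic);
    higher temperature flattens it (more random)."""
    scaled = [l / temperature for l in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical logits for three candidate next tokens.
logits = [2.0, 1.0, 0.1]

low = softmax_with_temperature(logits, 0.2)   # near-deterministic
high = softmax_with_temperature(logits, 2.0)  # closer to uniform
```

At temperature 0.2 nearly all of the probability mass lands on the highest-scoring token, so sampling becomes effectively greedy; at 2.0 the lower-scoring tokens keep meaningful probability, which is where the extra creativity (and extra risk of hallucination) comes from.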

Additional Resources:

This response has been generated by an LLM based on notes from PJMF technical consultations. All responses go through human review by our PJMF Products & Services team and are anonymized to protect our consultation participants.