LLM Explainability

Exploring methods for explaining factuality and attributions of LLM outputs

LLMs can generate factually incorrect information that appears plausible, a phenomenon known as “hallucination”. Hallucinations pose risks not only for companies that deploy LLMs, but also for end users. For example, Google’s market value dropped after its AI-powered product produced a factual error during a public demonstration; Air Canada was sued over false information provided by its AI-powered chatbot; and a lawyer was reprimanded in court for citing hallucinated case law.

There have been technical advances in calculating factuality estimates, i.e., assessments of how factual an AI-generated response is. Presenting an AI-generated response together with its factuality estimates allows users to recognize potentially incorrect information, prompting them to cross-check it against verified sources and make better-informed decisions. We conducted a series of survey-based experiments to identify the most effective strategy for communicating the factuality estimates of an LLM’s response: one that helps users comprehend the accuracy of the response and calibrate their trust, while aligning with their preferences.
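
Our studies assume phrase-level factuality estimates are available as input. As a rough illustration of where such scores might come from, the sketch below uses one simple proxy from the literature, mean token probability per phrase; this is not the estimator used in our papers, and all phrases and log-probabilities are made up.

```python
# One simple factuality proxy from the literature: mean token probability
# per phrase. Real systems use stronger estimators (e.g., self-consistency
# or retrieval-based fact checking); this sketch only illustrates the kind
# of per-phrase score the UI designs below consume. All numbers are made up.
import math

def phrase_factuality(token_logprobs: list[float]) -> float:
    """Mean token probability of a phrase, as a crude factuality estimate."""
    probs = [math.exp(lp) for lp in token_logprobs]
    return sum(probs) / len(probs)

# Hypothetical per-token log-probabilities for two phrases of a response.
phrases = {
    "The Eiffel Tower is in Paris": [-0.02, -0.05, -0.01, -0.03, -0.04],
    "as a gift from the United States.": [-1.9, -2.3, -1.4, -2.0, -1.7],
}
for text, logprobs in phrases.items():
    print(f"{phrase_factuality(logprobs):.2f}  {text}")
```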

In (Do et al., 2025), we found that highlighting every phrase in a response with a color scale keyed to its factuality estimate was the most preferred and trusted design.
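
As a rough sketch of this highlight-all design, the code below maps hypothetical per-phrase factuality scores onto a red-to-green color scale and renders the response as HTML. The phrases, scores, and color mapping are illustrative assumptions, not the exact stimuli or rendering used in the study.

```python
# Minimal sketch of the "highlight all phrases" design: each phrase in a
# response is wrapped in a colored span whose hue reflects its factuality
# estimate (red = likely incorrect, green = likely correct).
# Phrases and scores are illustrative, not data from the study.

def score_to_color(score: float) -> str:
    """Map a factuality score in [0, 1] to a red-to-green RGB color."""
    red = int(255 * (1.0 - score))
    green = int(255 * score)
    return f"rgb({red},{green},0)"

def render_highlighted(phrases: list[tuple[str, float]]) -> str:
    """Render (phrase, score) pairs as HTML with per-phrase highlighting."""
    spans = [
        f'<span style="background-color:{score_to_color(s)}" '
        f'title="factuality: {s:.2f}">{p}</span>'
        for p, s in phrases
    ]
    return " ".join(spans)

if __name__ == "__main__":
    response = [
        ("The Eiffel Tower is in Paris", 0.97),
        ("and was completed in 1889", 0.90),
        ("as a gift from the United States.", 0.12),  # low-factuality phrase
    ]
    print(render_highlighted(response))
```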

In a follow-up study (Do & Geyer, 2025), we found a promising alternative: hiding content estimated to be less factual, either by removing it or replacing it with ambiguous statements, can enhance user trust while maintaining perceived quality and transparency.
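
A corresponding sketch of the hide designs, under the same assumptions: phrases whose factuality estimate falls below a threshold are either dropped or replaced with a deliberately ambiguous statement. The threshold and hedge wording here are hypothetical, not the study’s materials.

```python
# Minimal sketch of the "hide" designs: low-factuality phrases are either
# removed outright or replaced with a deliberately ambiguous statement.
# The 0.5 threshold and the hedge wording are illustrative assumptions.

HEDGE = "some details here could not be verified"

def hide_low_factuality(
    phrases: list[tuple[str, float]],
    threshold: float = 0.5,
    replace: bool = False,
) -> str:
    """Drop (or hedge) phrases whose factuality score falls below the threshold."""
    kept = []
    for phrase, score in phrases:
        if score >= threshold:
            kept.append(phrase)
        elif replace:
            kept.append(HEDGE)
    return " ".join(kept)

if __name__ == "__main__":
    response = [
        ("The Eiffel Tower is in Paris", 0.97),
        ("and was completed in 1889", 0.90),
        ("as a gift from the United States.", 0.12),
    ]
    print(hide_low_factuality(response))                # removal variant
    print(hide_low_factuality(response, replace=True))  # ambiguous-replacement variant
```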

References

2025

  1. Highlight All the Phrases: Enhancing LLM Transparency through Visual Factuality Indicators
    Hyo Jin Do, Rachel Ostrand, Werner Geyer, Keerthiram Murugesan, Dennis Wei, et al.
    In Proceedings of the Eighth AAAI/ACM Conference on Artificial Intelligence, Ethics, and Society (AIES 2025), 2025
  2. Hide or Highlight: Understanding the Impact of Factuality Expression on User Trust
    Hyo Jin Do and Werner Geyer
    In Proceedings of the Eighth AAAI/ACM Conference on Artificial Intelligence, Ethics, and Society (AIES 2025), 2025