AttnTrace: Attention-based Context Traceback for Long-Context LLMs
AttnTrace traces a model's generated statements back to specific parts of the context using attention-based traceback. Try it out with Meta-Llama-3.1-8B-Instruct here! See the [paper] and [code] for more!
Maintained by the AttnTrace team.
AttnTrace is an efficient context traceback method for long contexts (e.g., full papers). It is over 15× faster than the state-of-the-art context traceback method TracLLM. Compared to previous attention-based approaches, AttnTrace is more accurate, reliable, and memory-efficient.
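The core idea of attention-based traceback can be illustrated with a minimal sketch (this is an illustrative toy, not AttnTrace's actual implementation; the function name, segmentation scheme, and aggregation rule are assumptions): split the context into segments, aggregate the attention that the generated response tokens pay to each segment, and rank segments by that score.

```python
import numpy as np

def attention_traceback(attn, segment_bounds, top_k=3):
    """Rank context segments by the total attention they receive
    from the response tokens (toy sketch, not the AttnTrace codebase).

    attn: (num_response_tokens, num_context_tokens) attention weights.
    segment_bounds: list of (start, end) token-index pairs, one per segment.
    Returns the top_k segment indices, most-attended first.
    """
    scores = [attn[:, s:e].sum() for s, e in segment_bounds]
    order = np.argsort(scores)[::-1]  # descending by aggregate attention
    return order[:top_k].tolist()

# Toy example: 2 response tokens attending over 6 context tokens
# grouped into 3 segments of 2 tokens each.
attn = np.array([
    [0.05, 0.05, 0.40, 0.30, 0.10, 0.10],
    [0.10, 0.10, 0.35, 0.25, 0.10, 0.10],
])
segments = [(0, 2), (2, 4), (4, 6)]
print(attention_traceback(attn, segments))  # segment 1 dominates
```

In a real setting the attention matrix would come from the model's forward pass over the full prompt and response; AttnTrace's contributions lie in making that aggregation accurate and memory-efficient, which this toy omits.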
AttnTrace can be used in many real-world applications, such as tracing back to:
- prompt injection instructions that manipulate LLM-generated paper reviews.
- malicious comments and code hidden in a codebase that mislead an AI coding assistant.
- malicious instructions that mislead the actions of an LLM agent.
- source texts in the context that an AI-generated summary is derived from.
- evidence that supports an LLM-generated answer to a question.
- misinformation (corrupted knowledge) that manipulates the LLM's output for a question.
And a lot more...
Try These Examples!
Enter your context and instruction below to try out AttnTrace! You can also click on the example buttons above to load pre-configured examples.
Color Legend for Context Traceback (by ranking): Red = 1st (most important) | Orange = 2nd | Golden = 3rd | Yellow = 4th–5th | Light = 6th+
Click the 'Generate/Use Response' button on the left to see response text here for traceback analysis.