
Prompt injection attacks

Unproofread notes

A new paper titled "Design Patterns for Securing LLM Agents against Prompt Injections" was recently published; it discusses six design patterns for protecting LLM agents against prompt injection attacks.

I first came across it in a post from Simon Willison, who has also written a detailed blog post about the paper.

I also discovered the SAIF Risk Map from Google, which Daniel Di Bartolo shared under Simon's tweet. SAIF stands for Secure AI Framework, and the risk map is essentially a mental model for securing complex AI systems. Google has also recently published a paper titled "Google's Approach for Secure AI Agents", which seems interesting.
