For pre-release testing, the threat model should inform the methods
You should reach out to Y Combinator and pitch a startup that provides "jailbreak detection and prevention for LLMs," sort of like the Palo Alto firewalls that inspect application traffic for threats.