I’ve written previously about strategies for evaluating new Large Language Models (LLMs) to assess whether they can be used to provide instructions for tasks like building bioweapons. One key topic is the importance of automating these evaluations to build fast, repeatable, transparent, and testable processes.
Selecting Tools for LLM Pre-Release…