PROMPTFOO

Test, evaluate, and red-team LLM applications

22.9k stars, MIT license

ABOUT

Developers lack systematic tools to evaluate, test, and secure LLM prompts and applications. Promptfoo replaces trial-and-error prompt development with automated evaluations, red teaming for vulnerability detection, and side-by-side model comparisons, so that decisions about LLM quality and security can rest on data.

INTEGRATION GUIDE

1. Automated evaluation and benchmarking of LLM prompt quality across multiple models including OpenAI, Anthropic, Azure, Bedrock, and Ollama
2. Red teaming and vulnerability scanning of LLM applications to detect prompt injections, jailbreaks, data leaks, and other security risks
3. CI/CD integration for continuous LLM eval and security testing in development workflows using GitHub Actions and other pipelines
4. Side-by-side comparison of different LLM models and prompt variations to identify the best-performing configuration
5. Code scanning for LLM-related security and compliance issues in pull requests
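
As a rough illustration of the evaluation and side-by-side comparison use cases above, the sketch below builds a small configuration in the prompts / providers / tests layout that Promptfoo configs use and serializes it to JSON. The helper name build_eval_config, the provider IDs, the sample prompts, and the test case are all assumptions made for this example, not values taken from the project; check the Promptfoo documentation for the provider IDs and assertion types that apply to your setup.

```python
import json


def build_eval_config(prompts, providers, cases):
    """Hypothetical helper: assemble a dict in the prompts/providers/tests layout."""
    tests = []
    for case in cases:
        tests.append({
            # Values substituted into the {{placeholders}} of each prompt.
            "vars": case["vars"],
            # One simple substring check per case ("contains" assertion).
            "assert": [{"type": "contains", "value": case["expect_substring"]}],
        })
    return {
        "description": "Side-by-side prompt comparison (illustrative)",
        "prompts": list(prompts),
        "providers": list(providers),
        "tests": tests,
    }


config = build_eval_config(
    prompts=[
        "Summarize in one sentence: {{text}}",
        "You are a concise editor. Summarize: {{text}}",
    ],
    # Assumed provider IDs, shown only to illustrate comparing two models.
    providers=["openai:gpt-4o-mini", "ollama:chat:llama3"],
    cases=[
        {
            "vars": {"text": "Promptfoo runs automated evaluations of LLM prompts."},
            "expect_substring": "Promptfoo",
        },
    ],
)

# Serialize so the result could be written to a file such as promptfooconfig.json.
config_json = json.dumps(config, indent=2)
print(config_json)
```

Each prompt is evaluated against each provider for every test case, which is what yields the side-by-side grid. Assuming the JSON is saved as promptfooconfig.json, a command along the lines of `npx promptfoo@latest eval -c promptfooconfig.json` should run it locally, and the same command can be placed in a CI step.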
