Monitoring · Freemium

LANGSMITH

Observe, debug, and improve your LLM apps in production

ABOUT

LLM applications fail in non-obvious ways — a prompt works in testing but hallucinates in production, latency spikes are hard to trace, and regressions are invisible without baselines. LangSmith captures every LLM call as a structured trace (inputs, outputs, latency, cost), lets you replay runs for debugging, and runs automated evaluations so you know when a model or prompt change made things worse.

INSTALL
npm install langsmith   # TypeScript / JavaScript
pip install langsmith   # Python

INTEGRATION GUIDE

1. Trace every chain step in a LangChain agent to find where it hallucinates or gets stuck
2. Run automated evals against a golden dataset whenever you change your prompt
3. Monitor production latency and token costs across different LLM providers
4. Build a dataset of real user queries + expected outputs for regression testing
5. A/B test prompt variants and compare quality scores side by side

TAGS

python · typescript · observability · tracing · evaluation · llm-ops · debugging