DEV Community

Aayush kumarsingh

AI Systems Engineer. Multi-agent orchestration, RAG, LLM evals. Contributor to Mastra AI (22k★, YC-backed). Building TraceMind — open-source LLM observability.

Location: Jodhpur, India
Why comparing average scores is the wrong way to evaluate LLM prompts (and what to do instead)

5
Comments
6 min read

TraceMind v3 — I built an AI agent that diagnoses why your LLM quality dropped

Uses ReAct loops and semantic failure search

7
Comments 5
5 min read
The gap between detecting hallucinations and handling them

2
Comments
2 min read
The gap between detecting hallucinations and handling them

1
Comments
2 min read
TraceMind v2 — I added hallucination detection and A/B testing to my open-source LLM eval platform

Separate claim extraction and verification

23
Comments 9
2 min read
I built an open-source LLM eval platform with a ReAct agent that diagnoses quality regressions

1
Comments
3 min read