Aayush kumarsingh

AI Systems Engineer. Multi-agent orchestration, RAG, LLM evals. Contributor to Mastra AI (22k★, YC-backed). Building TraceMind — open-source LLM observability.

Jodhpur, India

Joined on Apr 9, 2026

Aayush kumarsingh

May 8

Why comparing average scores is the wrong way to evaluate LLM prompts (and what to do instead)

#python #llm #machinelearning #opensource

6 min read

Uses ReAct loops and semantic failure search

Aayush kumarsingh

May 5

TraceMind v3 — I built an AI agent that diagnoses why your LLM quality dropped

#python #opensource #agents #llmops

5 min read

Aayush kumarsingh

Apr 15

The gap between detecting hallucinations and handling them

#python #opensource #llm #rag

2 min read

Separate claim extraction and verification

Aayush kumarsingh

Apr 14

TraceMind v2 — I added hallucination detection and A/B testing to my open-source LLM eval platform

#python #llmops #opensource #llm

2 min read

Aayush kumarsingh

Apr 9

I built an open-source LLM eval platform with a ReAct agent that diagnoses quality regressions

#python #rag #opensource #llm

3 min read

DEV Community

Aayush kumarsingh

Badges

1 Week Community Wellness Streak

Python

Writing Debut
