Serhat Doğan
I Ran 5 AI Search Engines Through the Same Query About My SaaS. Only One Cited Me.

This week I shipped a stack of SEO changes for my side project (a privacy-first SMS verification SaaS), then ran a controlled test to see which AI search engines would actually mention the product when asked a relevant question.

Same query. Five platforms. The results were not what I expected.

The setup

I run a small SaaS at verifysms.app. It sells temporary phone numbers for SMS verification — the kind of thing people use to sign up for services without giving out their real number. The competitive landscape recently shifted: SMS-Activate (one of the biggest players) shut down in March 2026, so a lot of people are searching for alternatives right now.

My hypothesis: if I want growth from AI search (ChatGPT, Claude, Gemini, Perplexity), I need to know what those engines are actually saying when someone asks about my space.

So I picked one query: "What are the best SMS-Activate alternatives in 2026?" and ran it through five engines in a single sitting.

The results

ChatGPT (logged in)

Outcome: refusal. Not an "I can't help with that" wall, but a soft pivot. It told me the category is "often associated with risky behavior" and instead recommended legitimate eSIM options like SecondSIM, MySudo, and Numero eSIM.

My product wasn't mentioned. Neither were any of my direct competitors.

This was the most useful result of the test, and I'll come back to it.

Claude

Outcome: cited me. My blog post verifysms.app/blog/sms-activate-alternative-2026/ showed up at position #4 in the search results panel, and Claude pulled two direct quotes from it into the answer body — one a statistic about pricing, one a comparison point about Telegram-only services.

This was the only engine in the test that surfaced VerifySMS as a primary source.

Gemini

Outcome: no cite. It recommended HeroSMS, 1001SMS, BEE-SMS, and DistrictDroid. Two of those I'd never heard of. The other two I knew but hadn't seen ranked anywhere before. None of mine.

Perplexity (logged in)

Outcome: no cite. The answer leaned heavily on a multilogin.com guide and a few Reddit threads. My site appeared nowhere in the source list of fifteen results.

Google AI Overview

Outcome: no cite in the AI box, but my page did rank in the organic results below it with a clean snippet: "100% Real SIMs ... VerifySMS provides..."

So: one cite out of five engines. That ratio roughly matches what I see in GA4 (more on that below).

The follow-up test I almost forgot to run

After looking at the Claude result I had a worry: was Claude citing me because I was logged in, and the system was reading from prior conversations or my account memory? That would make the data worthless.

So I opened a fresh anonymous Playwright session — no cookies, no account, no memory — and re-ran the same prompts.

Two surprises:

  1. Perplexity anonymous returned three VerifySMS cites, not zero. My dev.to post from earlier this year, a Lithuanian-language blog page at /lt/blog/sms-activate-alternative-2026/, and my App Store listing.

  2. ChatGPT Temporary Chat (memory-free) gave me the same refusal pattern. So that wasn't account contamination — it's the model's built-in policy on this category.

The logged-in Perplexity result had been overweighted toward my own browsing context. Anonymous testing revealed the real public surface, which is more favorable than I thought.

If you're doing this kind of audit, always test both states.

What I shipped this week to influence those results

Three deployments to my Cloudflare Worker, all on the same day:

  1. FAQ schema injection on four priority pages (mega-comparison, compare hub, sms-activate-alternative, virtual-phone-apps). Twenty-two question/answer pairs total, written in a deliberately human voice — specific numbers, contractions, no "It's important to note." Schema.org FAQPage JSON-LD inlined per slug, idempotency guard so it only injects once.

  2. Author byline + trust pills + related-reading blocks on the same four pages. The byline reads "By Serhat Dogan · Founder & Engineer · Last updated April 7, 2026" with five trust pills below it (user count, country count, refund policy, payment methods). Related-reading is a curated four-link block inserted just before `</article>`.

  3. Trustpilot widget sitewide. Discovered while debugging that my homepage uses a different code path than detail pages — a cached-homepage early-return that fires before the sitewide post-process block. Had to add a mirrored inject inside the cache path. Also had to update CSP to allow `widget.trustpilot.com` and `*.trustpilot.com` across `script-src`, `style-src`, `connect-src`, and a new `frame-src` directive.
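For the curious, the guard pattern in step 1 boils down to something like this. This is a minimal sketch, not my actual Worker code — the marker attribute and helper names are made up for illustration:

```javascript
// Inject FAQPage JSON-LD before </head>, with an idempotency guard
// so repeated post-processing passes never duplicate the block.
const GUARD = 'data-faq-jsonld'; // hypothetical marker attribute

function buildFaqJsonLd(pairs) {
  return JSON.stringify({
    '@context': 'https://schema.org',
    '@type': 'FAQPage',
    mainEntity: pairs.map(({ q, a }) => ({
      '@type': 'Question',
      name: q,
      acceptedAnswer: { '@type': 'Answer', text: a },
    })),
  });
}

function injectFaqSchema(html, pairs) {
  // Idempotency guard: if the marker is already in the page, do nothing.
  if (html.includes(GUARD)) return html;
  const script = `<script type="application/ld+json" ${GUARD}>` +
    buildFaqJsonLd(pairs) + '</script>';
  return html.replace('</head>', `${script}</head>`);
}
```

The guard is why running the same deploy twice (or hitting a page that was already processed upstream) can't stack two copies of the schema block.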

Total: four hours of work, all idempotent guards in place, smoke test passing on 14 language variants of the homepage plus all detail page types.
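The CSP change in step 3 is mostly string surgery on the response header. A hedged sketch, assuming a naive split-on-semicolons parse (my real Worker handles more edge cases):

```javascript
// Extend a Content-Security-Policy header so the Trustpilot widget loads.
// The directive names are real CSP directives; the parsing approach here
// is an illustrative assumption, not production code.
const TP_SOURCES = 'https://widget.trustpilot.com https://*.trustpilot.com';

function allowTrustpilot(csp) {
  // Parse "name src src; name src ..." into a Map of directive -> sources.
  const directives = new Map(
    csp.split(';').map(d => d.trim()).filter(Boolean)
       .map(d => { const [name, ...src] = d.split(/\s+/); return [name, src.join(' ')]; })
  );
  for (const name of ['script-src', 'style-src', 'connect-src', 'frame-src']) {
    // frame-src may not exist yet; fall back to 'self' before appending.
    const existing = directives.get(name) ?? "'self'";
    if (!existing.includes('trustpilot.com')) {
      directives.set(name, `${existing} ${TP_SOURCES}`);
    }
  }
  return [...directives].map(([name, src]) => `${name} ${src}`).join('; ');
}
```

The `trustpilot.com` membership check keeps this idempotent too — running it over an already-patched header is a no-op.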

The GA4 reality check

The actually useful numbers: how much traffic, and how much revenue, did each AI engine send me over 40 days?

| Engine | Users | Revenue (£) |
| --- | --- | --- |
| chatgpt.com | 1,219 | 110.58 |
| bing | 28 | 0 |
| perplexity (+.ai) | 15 | 0 |
| claude.ai | 11 | 0 |
| openai | 3 | 0 |
| Chinese AI (doubao, yuanbao) | 4 | 0 |

ChatGPT is sending 95% of my AI traffic and 100% of my AI revenue. Everything else is a rounding error.

This contradicts the citation test on the surface — ChatGPT refused to cite me, but it's sending me a thousand users. The resolution is that ChatGPT's refusal applies to discovery intent ("what are the best..."). It still routes brand intent ("verifysms WhatsApp pricing") to me, and it still mentions specific domain names in conversational follow-ups when a user names them first.

The refusal isn't a wall. It's a filter that only catches one type of query.

Three things I'm taking from this

  1. Test five engines, not one. If I'd only checked ChatGPT, I would have concluded my SEO work had no AI impact. Claude told me it did. Anonymous Perplexity told me dev.to posts are doing real work I wasn't crediting.

  2. A refusal is a data point, not a failure. ChatGPT calling my category "risky" is the actual signal — it tells me discovery queries are off-limits and I should invest in brand-driven queries instead. Trying to "trick" the model into citing me on discovery would just shift the refusal goalposts; trying to make my brand name memorable enough to be queried directly is a real strategy.

  3. Sub-language pages matter more than they look. Perplexity anonymous cited my Lithuanian /lt/ page, not my English one. I have fourteen language variants and they're not equal in AI cite eligibility — the language-tagged pages are landing differently in retrieval. I'd been underweighting them.

I'll write up the tactical "homepage cache path vs sitewide post-process" Cloudflare Worker debugging in a separate post — that one's worth its own walkthrough.

If you're running anything that depends on AI search visibility, I'd genuinely encourage doing the five-engine + anonymous-fallback test on your own product this week. The result will probably surprise you in at least one direction.


Originally published at verifysms.app
