I write mostly about backend development. Node.js, database performance, API design, that kind of thing. Over the past two years I have published 34 posts on this blog and its DEV.to mirror.
A few weeks ago I spent an afternoon going through all 34 and checking each one against Perplexity AI. For each post I typed in the question its title implies and watched whether my post appeared as a cited source in the response.
11 of 34 were being cited. 23 were not.
My instinct going in was that the cited posts would be my best writing. More thorough research, clearer explanations, better examples. That is not what I found at all.
What the cited posts had in common
I laid out all 11 cited posts and just read through them with fresh eyes, looking for what they shared.
The first thing I noticed is that every single one of them opens with a sentence that directly answers something. Not a question, not a setup, not a framing device. A statement that contains real information.
My most-cited post, which shows up in Perplexity responses about once or twice a week, opens like this: "Connection pool exhaustion in PostgreSQL happens when your application opens connections faster than it closes them, typically caused by missing connection limits in your ORM configuration or uncaught errors that prevent proper cleanup." That sentence on its own is a citable answer. You could lift it out of the post entirely and it would satisfy someone's question.
My second most-cited post opens with a specific list of things you need to do to configure Redis pub/sub correctly. Again, first sentence, real information, no buildup.
Then I looked at the 23 that get zero citations. Without exception, they open with some version of context-setting. "Caching is one of the more misunderstood topics in backend development." "If you have ever hit rate limit errors at scale, you know how frustrating they can be." "API versioning comes up in almost every project eventually." All of that is accurate and reasonable. None of it is a citable answer to anything.
The second pattern: how specific the language was
The cited posts are full of names. Library names with version numbers. Specific configuration options. Named error types. Exact command syntax.
One of my cited posts references pg-pool version 3.6, MAX_CONNECTIONS defaults, idleTimeoutMillis, connectionTimeoutMillis, and specific numbers for what reasonable pool sizes look like in different deployment environments. A developer reading it gets something they can act on immediately.
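To show what I mean by "something they can act on," here is the shape of what that post hands the reader: the named pool options plus the checkout pattern that prevents the leak from my connection-pool post. This is a sketch with illustrative numbers, not the post's actual recommended values.

```javascript
// Illustrative pg-pool settings using the option names above.
// The numbers are placeholders, not recommendations.
const poolConfig = {
  max: 20,                        // hard cap on open connections
  idleTimeoutMillis: 30_000,      // close clients idle for more than 30s
  connectionTimeoutMillis: 2_000, // fail fast if no connection frees up
};

// Acquire a client, run the work, release in `finally` so cleanup happens
// even when the query throws -- the "uncaught errors that prevent proper
// cleanup" case from the opening sentence quoted earlier.
async function withClient(pool, work) {
  const client = await pool.connect();
  try {
    return await work(client);
  } finally {
    client.release();
  }
}
```

The `withClient` helper is my own generic wrapper, not something pg-pool ships, but it works against any pool whose clients expose `connect()` and `release()`.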
One of my non-cited posts about API rate limiting talks about "token bucket algorithms," "popular rate limiting libraries," and "standard approaches to backoff." All accurate. All vague. A reader understands the concepts but has nothing specific to do with them. An AI system has nothing specific to extract and cite with confidence.
The cited posts did not necessarily go deeper on the topic. They just named things specifically instead of describing them vaguely.
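To practice what this section preaches: here is the kind of concrete snippet the rate-limiting post could have included instead of just saying "token bucket algorithms." This is my own toy sketch, not any particular library's implementation.

```javascript
// Minimal token bucket: requests spend tokens, tokens refill over time.
// Capacity and refill rate here are illustrative.
class TokenBucket {
  constructor({ capacity = 10, refillPerSecond = 5 } = {}) {
    this.capacity = capacity;
    this.refillPerSecond = refillPerSecond;
    this.tokens = capacity;
    this.lastRefill = Date.now();
  }

  // Add tokens for the time elapsed since the last refill, capped at capacity.
  refill(now = Date.now()) {
    const elapsedSeconds = (now - this.lastRefill) / 1000;
    this.tokens = Math.min(
      this.capacity,
      this.tokens + elapsedSeconds * this.refillPerSecond
    );
    this.lastRefill = now;
  }

  // Returns true if the request may proceed, false if it should be throttled.
  tryRemove(now = Date.now()) {
    this.refill(now);
    if (this.tokens >= 1) {
      this.tokens -= 1;
      return true;
    }
    return false;
  }
}
```

Twenty lines, and a reader (or an AI system) has something specific to extract.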
The one I was most surprised by
I have a post about database indexing strategies that I considered one of my stronger pieces. It covers B-tree indexes, partial indexes, covering indexes, the tradeoffs between index size and query performance. About 2,800 words. Good traffic from Google. I was proud of it.
Zero Perplexity citations.
I ran it through GoForTool's AI SEO Analyzer to see what the automated audit said. GEO score: 31 out of 100. The top two issues on the fix list were answer position (my actual guidance did not appear until word 410) and entity density (I had referenced "modern databases" and "popular query planners" throughout without naming PostgreSQL 16, MySQL 8.0, or SQLite 3.44 even once).
I had written a thorough, accurate post about databases that did not name a single database.
That landed differently when I saw it written out.
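For what it is worth, the two checks the audit flagged are easy to approximate by hand. This is my own rough heuristic, not GoForTool's actual logic, and the vague-phrase list is made up for illustration.

```javascript
// Hypothetical vague references to flag -- a real audit would be far smarter.
const VAGUE_PHRASES = ['modern databases', 'popular query planners', 'standard approaches'];

// How many words a reader wades through before the given answer sentence starts.
function answerPositionInWords(text, answerSentence) {
  const idx = text.indexOf(answerSentence);
  if (idx === -1) return -1;
  return text.slice(0, idx).trim().split(/\s+/).filter(Boolean).length;
}

// Count occurrences of vague references (a crude stand-in for entity density).
function countVagueReferences(text) {
  const lower = text.toLowerCase();
  return VAGUE_PHRASES.reduce(
    (count, phrase) => count + lower.split(phrase).length - 1,
    0
  );
}
```

Running something like this over a draft would have told me the indexing post buried its answer 410 words deep long before any tool did.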
What I changed on three of the non-cited posts
I picked three posts that get decent Google traffic and applied fixes to them, using the GoForTool audit as a guide for each one.
For each post I did four things: moved the direct answer to the opening paragraph, replaced every vague reference with a specific named thing, added FAQPage schema with three question-answer pairs, and checked that PerplexityBot was not blocked in my robots.txt (it was not; I had already fixed that separately).
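The schema step looked roughly like this. The question and answer below are illustrative placeholders, not the actual pairs from my posts; the FAQPage shape itself is standard schema.org markup.

```html
<script type="application/ld+json">
{
  "@context": "https://schema.org",
  "@type": "FAQPage",
  "mainEntity": [
    {
      "@type": "Question",
      "name": "What causes connection pool exhaustion in PostgreSQL?",
      "acceptedAnswer": {
        "@type": "Answer",
        "text": "Opening connections faster than they are closed, usually from missing pool limits or error paths that skip cleanup."
      }
    }
  ]
}
</script>
```

The robots.txt check is just reading the file and confirming there is no Disallow rule under a User-agent: PerplexityBot block.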
Two weeks later I checked all three again.
One of the three now appears in Perplexity responses for two different query phrasings. A second one is appearing occasionally. The third has not changed yet, though its GEO score went from 28 to 74, so I expect citations to follow; in my experience there is usually a lag of one to three weeks between the score improving and citations actually appearing.
The uncomfortable part to admit
When I look at the cited posts versus the non-cited posts, the cited ones are not my best writing in the sense I had always measured it. They are not the most carefully structured arguments or the most thorough explorations of a topic. Some of them are actually pretty direct and almost blunt.
What they are is immediately useful. They hand you something specific in the first paragraph and then keep being specific throughout. There is less narrative, less buildup, less "setting the scene."
I used to think that narrative structure was what made technical writing readable as opposed to being just documentation. I still think that is true for a human audience building up context. But for AI retrieval, the narrative is friction. The answer is what matters, and the faster you get to it the better.
This does not mean I am going to write without any narrative from now on. But it does mean I am a lot more deliberate about where the actual information sits in a post, and I run everything through the AI SEO Analyzer before publishing now to make sure I am not accidentally burying the answer again.
The 23 non-cited posts are slowly getting worked through. A few per week. The pattern for fixing them is always the same: find the first sentence that contains a real, specific, actionable piece of information, and move it to the top.
Have you done a similar audit on your own content? Curious whether the pattern I found holds across different technical writing areas or whether backend content has something specific going on here.