DEV Community

TechMan09

SEO - Crawled - currently not indexed - Help

Google crawled my site about a week ago and marked only 5 pages as "valid". The majority of my pages are marked as "Crawled - currently not indexed".

Google says that means that "the page was crawled by Google, but not indexed. It may or may not be indexed in the future; no need to resubmit this URL for crawling."

Why did Google tag most of my pages this way, and how long will it take before they are added to the index? Is there anything I can do to speed up the process?

Thanks!

EDIT: Here is my robots.txt file. All the pages that are not showing up in the index are inside the “webhosting” folder, which is not mentioned in the file below.


User-agent: *
Disallow: /admin
Disallow: /menu
Disallow: /menu.html
Disallow: /login
Disallow: /login.html

Disallow: /shop/*?page=$
Disallow: /shop/*&page=$
Disallow: /shop/*?sort=
Disallow: /shop/*&sort=
Disallow: /shop/*?order=
Disallow: /shop/*&order=
Disallow: /shop/*?limit=
Disallow: /shop/*&limit=
Disallow: /shop/*?filter_name=
Disallow: /shop/*&filter_name=
Disallow: /shop/*?filter_sub_category=
Disallow: /shop/*&filter_sub_category=
Disallow: /shop/*?filter_description=
Disallow: /shop/*&filter_description=

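For anyone who wants to double-check the claim that the “webhosting” folder is not blocked, here is a minimal sketch using Python's standard `urllib.robotparser`. The domain `example.com` and the page name `plans.html` are placeholders, not taken from the actual site. The `/shop/` wildcard rules are left out on purpose: the standard-library parser does plain prefix matching and does not implement Google's `*` and `$` wildcards, so this is only a rough check of the non-wildcard rules.

```python
from urllib.robotparser import RobotFileParser

# Non-wildcard rules copied from the robots.txt above.
# The /shop/ wildcard lines are omitted because urllib.robotparser
# does not understand Google-style "*" and "$" patterns.
ROBOTS_TXT = """\
User-agent: *
Disallow: /admin
Disallow: /menu
Disallow: /menu.html
Disallow: /login
Disallow: /login.html
"""


def is_allowed(path: str, user_agent: str = "Googlebot") -> bool:
    """Return True if the rules above let `user_agent` fetch `path`."""
    parser = RobotFileParser()
    parser.parse(ROBOTS_TXT.splitlines())
    # example.com is a placeholder domain (assumption).
    return parser.can_fetch(user_agent, "https://example.com" + path)


# "plans.html" is a hypothetical page inside the webhosting folder.
print(is_allowed("/webhosting/plans.html"))  # True: no rule matches this path
print(is_allowed("/admin"))                  # False: matches "Disallow: /admin"
```

If the first call prints `True`, the rules above are not what keeps those pages out of the index, at least as far as this simplified parser can tell.
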
Top comments (12)

Alex Ivanovs

It's a common pattern for new sites. Google takes quite a while to properly understand a website and determine where it should stand as a reliable source of information. I imagine a lot of it has to do with preventing sudden manipulation, though ultimately it also affects average people like you and me.

TechMan09

About how long did it take for you, just so I can get a rough idea?

Thanks!

Alex Ivanovs

The whole process? I would say around 3 months.

I run a content blog myself and have about 12 long-form posts published, but only 4 of them are indexed. I have great links and decent exposure already, but those pages haven't been indexed yet. So, like you, I have to wait.

I speak from experience, too. I have done sites like this before, and it's always around the 3-month mark when indexing becomes more regular. Granted, you have to keep the site active and work on it regularly.

TechMan09

Thanks so much for the knowledge, it helps out a ton!

Shrikant Dhayje

Try adding a Sitemap line after the last line of your robots.txt, changing it from
Disallow: /shop/*&filter_description=
to

Disallow: /shop/*&filter_description=

Sitemap: https://yourwebsite.com/sitemap.xml

This should help avoid errors in the future.
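As a rough way to confirm that a `Sitemap:` line is actually picked up, the sketch below parses a shortened robots.txt with Python's standard `urllib.robotparser` and lists the sitemaps it declares. The domain `yourwebsite.com` is the placeholder from the comment above, and the `https://` scheme is an assumption; the sitemap location is expected to be a full URL rather than a bare domain and path. `site_maps()` needs Python 3.8 or newer.

```python
from urllib.robotparser import RobotFileParser

# Shortened robots.txt with the suggested Sitemap line appended.
# yourwebsite.com is a placeholder domain, not the real site.
ROBOTS_TXT = """\
User-agent: *
Disallow: /admin
Disallow: /login

Sitemap: https://yourwebsite.com/sitemap.xml
"""


def declared_sitemaps(robots_txt: str) -> list:
    """Return the sitemap URLs declared in a robots.txt string."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    # site_maps() returns None when no Sitemap line is present.
    return parser.site_maps() or []


print(declared_sitemaps(ROBOTS_TXT))
# ['https://yourwebsite.com/sitemap.xml']
```

An empty list here would suggest the line is missing or misspelled.
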

optimisedu • Edited

As a rule (and I am fairly experienced in this field), it takes two to six weeks to fully crawl a traditional new site that has no authoritative backlinks and is built mainly in HTML. You are clearly using Search Console, which will help. During this period, putting a link to your sitemap in your footer can be useful.

Try generating authority with both internal and external linking. If you are client-side rendering the site with a JS framework, Google may take longer; for a small website in a rare niche that uses client-side JS, it can take longer than six weeks. Links and content build authority. If you use a JS framework, let me know which one, or better, just link the site. With new sites, though, you can't just trigger a recrawl and expect Google to fetch all your content as you used to. dev.to is an authoritative backlink, and Googlebot can follow nofollow links.

Shrikant Dhayje • Edited

I also had the same issue, but there is a way to fix it:

  • Check that robots.txt is written correctly, since it determines which parts of the site Googlebot can crawl, and add the full URL of your sitemap.xml file (including the domain) to robots.txt.
  • Most of the time this will solve your issue.
 
TechMan09

Thanks for the reply! Unfortunately, all the pages marked as that are in my sitemap and are not restricted by robots.txt. The files that are restricted return an error instead.
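One way to verify a statement like this is to cross-check every sitemap URL against the robots.txt rules. The sketch below does that with Python's standard library only. The sitemap content, the domain `example.com`, and the page names are made up for illustration, and the same caveat applies as with any use of `urllib.robotparser`: it does not evaluate Google-style wildcard rules, so only plain prefix rules are checked.

```python
import xml.etree.ElementTree as ET
from urllib.robotparser import RobotFileParser

# Namespace used by the sitemap protocol.
SITEMAP_NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}

# Hypothetical sitemap; the URLs are placeholders, not the real site.
SITEMAP_XML = """\
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>https://example.com/webhosting/plans.html</loc></url>
  <url><loc>https://example.com/login.html</loc></url>
</urlset>
"""

# Plain prefix rules taken from the robots.txt in the post.
ROBOTS_TXT = """\
User-agent: *
Disallow: /admin
Disallow: /login
Disallow: /login.html
"""


def blocked_sitemap_urls(sitemap_xml, robots_txt, user_agent="Googlebot"):
    """Return the sitemap URLs that the robots.txt rules disallow."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    root = ET.fromstring(sitemap_xml)
    urls = [loc.text.strip() for loc in root.findall("sm:url/sm:loc", SITEMAP_NS)]
    return [url for url in urls if not parser.can_fetch(user_agent, url)]


print(blocked_sitemap_urls(SITEMAP_XML, ROBOTS_TXT))
# ['https://example.com/login.html']
```

An empty result would back up the claim that none of the sitemap pages are restricted by robots.txt.
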

Vincent A. Cicirello • Edited

It is definitely not a robots.txt thing. If pages are blocked by robots.txt, then they won't show up in Google Search Console at all. Google's crawler respects robots.txt, so it won't crawl pages that you direct it not to.

Anyway, there isn't really anything you can do to make them index those pages quicker. The fact that they crawled them is the first step. If they were listed as duplicates, or as pages where Google selected a different URL as canonical, or something similar, it would be a different story. You'll likely see the number of pages marked as "crawled currently not indexed" go down over time as Google adds them to the index or excludes them for other reasons.

TechMan09

Please see post edit

Vincent A. Cicirello

Even without seeing your robots.txt, it was clear that it wasn't the cause, as Google won't even crawl blocked pages. Just be patient with the ones marked as "crawled currently not indexed." Google doesn't instantly index the pages it crawls, and it doesn't necessarily index everything it crawls. It sounds like your site is new, since you mentioned a week ago. Watch for other reasons pages are excluded. The "crawled currently not indexed" status is where pages end up before either being indexed or excluded for other reasons. Some pages may stay in that status longer than others. On my domain, there are close to 400 URLs that are currently indexed and around 80 with the status "crawled currently not indexed." Some of those will likely be indexed in the future. Others may stay in that status indefinitely.

TechMan09

Thanks. I read this part wrong, mistakenly thinking you said that was the issue: “It is definitely not a robots.txt thing”.

I was worried that there was nothing I could do, although I kind of expected it. Thanks for the confirmation though.