I have just received an email from Google advising that it can't access certain JavaScript and CSS content on my site.

I've looked at the robots.txt file, which contains:

User-agent: *
Crawl-delay: 5
Disallow: /feed/
Disallow: /trackback/
Disallow: /wp-admin/
Disallow: /wp-includes/
Disallow: /xmlrpc.php
Disallow: /wp-*

It looks like the Disallow: /wp-* line is doing the damage, since it matches every path that starts with /wp-, including /wp-content/.

I am going through the process of disabling each plugin in turn to see which one (if any) is causing this line to appear in the robots.txt file, but could there be another reason for it (e.g., a core WordPress feature or setting)?

And is it fine and safe for me to just remove this Disallow: /wp-* line?

Asked Jul 28, 2015 at 16:43 by Boycott A.I.; edited Jul 29, 2015 at 1:08 by fuxia.

3 Answers

It seems to be a WP default setting, as many webmasters who never edited their robots.txt have received this warning. Removing all the Disallow lines is the easiest solution, but I assume you want some or all of those directories blocked.

Google is only concerned about the .js and .css files, so you could in theory edit the robots.txt to include:

User-Agent: Googlebot
Allow: /*.js
Allow: /*.css

However, being that specific could require future changes to the user agent, in case more search crawlers follow Google's example.

Make sure you know how robots.txt works so you don't accidentally block your entire site or important sections. Here is a good reference for more details about robots.txt:

http://www.robotstxt.org/robotstxt.html
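To see why the Disallow: /wp-* line affects CSS and JavaScript, here is a minimal Python sketch of wildcard-aware rule matching. It assumes the longest-match convention that Google documents for its crawlers (the longest matching pattern wins, and Allow wins a tie); other crawlers may behave differently. The function names and the example paths are hypothetical and only for illustration.

```python
import re

def pattern_to_regex(pattern):
    """Turn a robots.txt path pattern into a regex.

    '*' matches any run of characters; a trailing '$' anchors the end of the path.
    """
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    body = ".*".join(re.escape(part) for part in pattern.split("*"))
    return re.compile(body + ("$" if anchored else ""))

def is_allowed(rules, path):
    """Return True if `path` may be crawled under `rules`.

    `rules` is a list of (directive, pattern) pairs from one user-agent group.
    Assumption: longest matching pattern wins, Allow wins ties.
    """
    best = None  # (pattern length, is_allow) of the best match so far
    for directive, pattern in rules:
        if pattern and pattern_to_regex(pattern).match(path):
            candidate = (len(pattern), directive.lower() == "allow")
            if best is None or candidate > best:
                best = candidate
    return True if best is None else best[1]

# The rules quoted in the question.
QUESTION_RULES = [
    ("Disallow", "/feed/"),
    ("Disallow", "/trackback/"),
    ("Disallow", "/wp-admin/"),
    ("Disallow", "/wp-includes/"),
    ("Disallow", "/xmlrpc.php"),
    ("Disallow", "/wp-*"),
]

# The same rules with the /wp-* line removed.
WITHOUT_WP_STAR = [rule for rule in QUESTION_RULES if rule[1] != "/wp-*"]

if __name__ == "__main__":
    for path in ("/wp-content/themes/example/style.css",   # hypothetical path
                 "/wp-includes/js/jquery/jquery.js",
                 "/2015/07/some-post/"):                    # hypothetical path
        print(path, is_allowed(QUESTION_RULES, path), is_allowed(WITHOUT_WP_STAR, path))
```

Under these assumptions, the theme stylesheet is blocked by the original rules and allowed once /wp-* is removed, while anything under /wp-includes/ stays blocked by its own Disallow line, so scripts served from there would need separate attention.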

I have sorted this now. I'm not sure where the original robots.txt content could have come from(??), but I have now changed it on the origin server to:

User-agent: *
Crawl-delay: 5
Disallow: /feed/
Disallow: /trackback/
Disallow: /wp-admin/
Disallow: /wp-includes/
Disallow: /xmlrpc.php
Disallow: /wp-content/
Disallow: /wp-*

Also, the site uses a CDN, so I specified a separate set of rules for the CDN's robots.txt file:

User-agent: *
Allow: /wp-content/
Disallow: /
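As a quick sanity check of the CDN rules, the sketch below feeds them to Python's standard urllib.robotparser. That parser does plain prefix matching and applies rules in file order, which should be enough here because the CDN file has no wildcards and the Allow line comes first. The hostname cdn.example.com, the helper name, and the example paths are assumptions for illustration only.

```python
from urllib.robotparser import RobotFileParser

# The CDN robots.txt from the answer above.
CDN_ROBOTS_TXT = """\
User-agent: *
Allow: /wp-content/
Disallow: /
"""

def can_fetch(robots_txt, url, user_agent="Googlebot"):
    """Parse robots.txt text and report whether `user_agent` may fetch `url`."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)

CDN = "https://cdn.example.com"  # hypothetical CDN hostname

if __name__ == "__main__":
    # Static asset under /wp-content/ (hypothetical path).
    print(can_fetch(CDN_ROBOTS_TXT, CDN + "/wp-content/themes/example/style.css"))
    # Any other path on the CDN (hypothetical path).
    print(can_fetch(CDN_ROBOTS_TXT, CDN + "/2015/07/some-post/"))
```

With these rules, the asset under /wp-content/ is reported as fetchable and everything else on the CDN host as blocked, which keeps crawlers from indexing duplicate pages through the CDN while still letting them load stylesheets and scripts.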

I would recommend the following:

User-agent: *
Disallow: */trackback/
Disallow: */xmlrpc.php
Disallow: /wp-*.php
Disallow: /cgi-bin/
Disallow: /wp-admin/
Allow: */wp-content/uploads/

Tags: plugins; Disallow wp* in robots.txt