Welcome to Peter Zmijewski’s website, HOW I MADE MY FIRST MILLION, where he reveals his secrets and shares the ways to make millions.

18th
JUN

Robots.txt File and SEO

Posted by peterzmijewski | Filed under Internet Marketing

When it comes to SEO, most people understand that a website must have content, “search engine friendly” site architecture/HTML, and meta data (title tags and meta descriptions).

Another element that can trip up a website if implemented incorrectly is robots.txt. I was recently reminded of this while reviewing the website of a large company that had spent considerable money building a mobile version of its website in a sub-directory. That’s fine, but a disallow statement in the company’s robots.txt file (Disallow: /mobile/) meant that the mobile site wasn’t accessible to search engines.
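A quick way to catch this kind of mistake is to test a URL against the rules before they go live. The sketch below uses Python's standard urllib.robotparser module. The rules text is reconstructed from the anecdote above, and domain.com and the is_crawlable helper are placeholders for illustration, not anything taken from the company's actual site.

```python
from urllib.robotparser import RobotFileParser


def is_crawlable(robots_txt: str, url: str, user_agent: str = "*") -> bool:
    """Return True if the given robots.txt text lets user_agent fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# Assumed rules, modeled on the Disallow: /mobile/ line described above.
ROBOTS_TXT = """\
User-agent: *
Disallow: /mobile/
"""

if __name__ == "__main__":
    for url in ("https://domain.com/", "https://domain.com/mobile/index.html"):
        print(url, "crawlable:", is_crawlable(ROBOTS_TXT, url))
```

With these rules, the home page is reported as crawlable and anything under /mobile/ is not, which is exactly the problem the company ran into.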

Let’s review how to properly implement robots.txt to avoid search ranking problems and damage to your business, as well as how to correctly disallow search engine crawling.

What Is a Robots.txt File?

Simply put, if you go to domain.com/robots.txt, you should see a list of the website’s directories that the site owner is asking the search engines to “skip” (or “disallow”). However, if you aren’t careful when editing a robots.txt file, you could add rules that really hurt your business.

There is plenty of information about the robots.txt file available at the Web Robots Pages, including the proper usage of the disallow feature and how to block “bad bots” from crawling your website.

The general rule of thumb is to make sure a robots.txt file exists at the root of your domain (e.g., domain.com/robots.txt). To exclude all robots from crawling part of your website, your robots.txt file would look something like this:

User-agent: *
Disallow: /cgi-bin/
Disallow: /tmp/
Disallow: /junk/

The above syntax tells all robots not to crawl the /cgi-bin/, /tmp/, and /junk/ directories on your website.
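To make the structure of the file concrete, here is a minimal sketch of how a crawler might read it: find the group whose User-agent line matches, then collect that group's Disallow paths. The disallowed_paths function is a hypothetical helper written for this post, and it is deliberately simplified. It ignores Allow lines, wildcards, and other directives that real crawlers may honor.

```python
def disallowed_paths(robots_txt: str, user_agent: str = "*") -> list:
    """Collect Disallow paths from the groups that name user_agent.

    Simplified sketch: ignores Allow lines, wildcards, and other directives.
    """
    paths = []
    in_group = False        # True while inside a group matching user_agent
    reading_agents = False  # True while reading consecutive User-agent lines
    for raw in robots_txt.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and whitespace
        if ":" not in line:
            continue
        field, value = (part.strip() for part in line.split(":", 1))
        field = field.lower()
        if field == "user-agent":
            if not reading_agents:
                in_group = False  # a new group starts here
            reading_agents = True
            if value.lower() == user_agent.lower():
                in_group = True
        else:
            reading_agents = False
            if field == "disallow" and in_group and value:
                paths.append(value)
    return paths


EXAMPLE = """\
User-agent: *
Disallow: /cgi-bin/
Disallow: /tmp/
Disallow: /junk/
"""

if __name__ == "__main__":
    print(disallowed_paths(EXAMPLE))  # -> ['/cgi-bin/', '/tmp/', '/junk/']
```

Note that the asterisk belongs on the User-agent line itself: it is what makes the group apply to every robot.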

Robots.txt Dos and Don’ts:

There are many good reasons to stop the search engines from indexing certain directories on a website and allowing others for SEO purposes. Let’s look at some examples.

Do with robots.txt:
• Take a look at all of the directories in your website. Most likely, there are directories that you’d want to keep the search engines out of, including directories like /cgi-bin/, /wp-admin/, /cart/, /scripts/, and others that might include sensitive data.
• Stop the search engines from indexing certain directories of your site that might include duplicate content. For example, some websites have “print versions” of web pages and articles that allow visitors to print them easily. You should only allow the search engines to index one version of your content.
• Make sure that nothing stops the search engines from indexing the main content of your website.
• Look for certain files on your site that you might want to keep the search engines away from, such as certain scripts, or files that might contain email addresses, phone numbers, or other sensitive data.
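The checklist above can be turned into a small pre-launch check: build the robots.txt from a list of directories to disallow, then confirm that none of your main content URLs ended up blocked. This is only a sketch. The /print/ directory, the example URLs, and both helper functions are assumptions made for illustration, and the check relies on Python's standard urllib.robotparser module.

```python
from urllib.robotparser import RobotFileParser


def build_robots_txt(disallowed_dirs):
    """Build a robots.txt that disallows the given directories for all robots."""
    lines = ["User-agent: *"]
    lines += [f"Disallow: {directory}" for directory in disallowed_dirs]
    return "\n".join(lines) + "\n"


def blocked_urls(robots_txt, urls, user_agent="*"):
    """Return the subset of urls that the rules block for user_agent."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return [url for url in urls if not parser.can_fetch(user_agent, url)]


# Hypothetical site layout; /print/ stands in for duplicate "print versions".
DISALLOWED = ["/cgi-bin/", "/wp-admin/", "/cart/", "/scripts/", "/print/"]
MAIN_CONTENT = [
    "https://domain.com/",
    "https://domain.com/articles/robots-txt-and-seo",
]

if __name__ == "__main__":
    robots_txt = build_robots_txt(DISALLOWED)
    print(robots_txt)
    problems = blocked_urls(robots_txt, MAIN_CONTENT)
    if problems:
        print("Main content is blocked:", problems)
    else:
        print("All main content URLs are crawlable.")
```

Running a check like this whenever the file changes would have flagged the /mobile/ mistake described earlier before it reached the search engines.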

Source: Searchenginewatch

Peter Zmijewski is the founder and CEO at KeywordSpy. His expert knowledge on Internet Marketing practices and techniques has earned him the title “Internet Marketing Guru”. He is also an innovator, investor and entrepreneur widely recognized by the top players in the industry.
