The "robots.txt" file is a simple text file you can place on your website to tell search engines which parts of your site they may visit and index, and which parts they should avoid. With a robots.txt file you control how search engines interact with your site, including how CompanySpotter does. It is especially useful when certain parts of your site should not appear in search results, such as admin pages, private sections, or pages that are still under construction. The file must reside in the root directory of your site: if your site is www.test.com, your robots.txt file is found at www.test.com/robots.txt.
The robots.txt file is a basic building block of SEO (Search Engine Optimization): it helps ensure that the right content from your site is indexed and presented to search engine users. It can be a useful tool, but like all SEO strategies, it must be used carefully and wisely.
The important thing to remember is that although most search engines (including CompanySpotter) respect the rules in the robots.txt file, compliance is voluntary and not an absolute guarantee. Not all crawlers honor the rules, and malicious bots can willfully ignore the instructions, so robots.txt should never be used to protect sensitive content.
Below are some useful examples of how to build a robots.txt file:
Example 1: Block all search engines
If you don't want search engines to index your website, you can put the following in your robots.txt file:
User-agent: *
Disallow: /
Here User-agent: * says that the following rules apply to all search engines, and Disallow: / tells them to avoid the entire site. In effect, every search engine is asked not to index any page.
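You can check how a crawler interprets these rules with Python's standard-library robots.txt parser. This is an illustrative sketch: the www.test.com URLs are the hypothetical site from this article, and the rules are the two lines from Example 1.

```python
from urllib import robotparser

# The Example 1 rules: ask all crawlers to avoid the whole site.
rules = [
    "User-agent: *",
    "Disallow: /",
]

parser = robotparser.RobotFileParser()
parser.parse(rules)

# Any well-behaved crawler, whatever its name, is told to stay away
# from every URL on the (hypothetical) site www.test.com.
print(parser.can_fetch("Googlebot", "https://www.test.com/"))        # False
print(parser.can_fetch("AnyBot", "https://www.test.com/page.html"))  # False
```

In practice a crawler fetches www.test.com/robots.txt itself; parsing the lines directly, as above, is just a convenient way to test your rules before publishing them.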
Example 2: Blocking a specific search engine
For example, if you don't want Google to index your site, but you would like other search engines to index it:
User-agent: Googlebot
Disallow: /
Here User-agent: Googlebot says that the following rules apply to Google's crawler. Googlebot is asked not to index any pages, while all other search engines remain free to index the site.
Example 3: Blocking specific directories
If you want to prevent search engines from indexing certain directories of your site:
User-agent: *
Disallow: /private/
Disallow: /test/
In this example, all bots are instructed to avoid the "/private/" and "/test/" directories. The bottom line is that all search engines may index every page except those under "/private/" and "/test/".
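The same standard-library parser can confirm that only the listed directories are blocked. Again, the www.test.com paths below are hypothetical examples, not real pages.

```python
from urllib import robotparser

# The Example 3 rules: only /private/ and /test/ are off limits.
rules = [
    "User-agent: *",
    "Disallow: /private/",
    "Disallow: /test/",
]

parser = robotparser.RobotFileParser()
parser.parse(rules)

# Pages under a disallowed directory are blocked; everything else is allowed.
print(parser.can_fetch("*", "https://www.test.com/private/report.html"))  # False
print(parser.can_fetch("*", "https://www.test.com/about.html"))           # True
```

Note that Disallow rules are prefix matches: "Disallow: /private/" covers every URL whose path begins with /private/, including files in its subdirectories.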
Example 4: Blocking specific files
If you want to prevent search engines from indexing certain files on your site:
User-agent: *
Disallow: /directory/my-file.html
This example instructs all bots to avoid the specific file "my-file.html" in the "/directory/" directory. The bottom line is that all search engines may index every page except /directory/my-file.html.