What is a robots.txt? We tell you clearly and simply

There are standards that allow you to tell bots: "I don't want you to visit these places on my site because they are not relevant or contain sensitive information." How do we indicate that to the bots? With the guidelines that the standard establishes for creating a robots.txt file, which we will see later.

Do bots always obey? No. Certain types of bots ignore the guidelines, such as malware robots, which are used to find vulnerabilities on your site or to send spam.

But what does a robots.txt file look like, and how is it created or configured? There are several ways to create a robots.txt file. One of them is to create the file on our computer and upload it to the site.
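As a rough illustration of the "create the file on our computer" approach, the sketch below builds the text of a robots.txt file and saves it locally, ready to be uploaded. The function names and the example paths (/private/, /tmp/) are assumptions made up for this example. They are not taken from any particular site.

```python
from pathlib import Path


def build_robots_txt(disallowed_paths, user_agent="*"):
    """Return robots.txt text with one Disallow line per path.

    user_agent="*" means the rules apply to every bot.
    """
    lines = [f"User-agent: {user_agent}"]
    lines += [f"Disallow: {path}" for path in disallowed_paths]
    return "\n".join(lines) + "\n"


def write_robots_txt(directory, disallowed_paths):
    """Save the rules as <directory>/robots.txt and return the file path."""
    path = Path(directory) / "robots.txt"
    path.write_text(build_robots_txt(disallowed_paths), encoding="utf-8")
    return path


if __name__ == "__main__":
    # Hypothetical paths, used only to show what the file looks like.
    print(build_robots_txt(["/private/", "/tmp/"]), end="")
```

The resulting file is plain text: a "User-agent" line that names the bots the rules apply to, followed by one "Disallow" line per path that should not be visited. Bots look for the file at the root of the site, so that is where it needs to be uploaded.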

But what are those bots that robots.txt blocks?

However, in the case of WordPress, a robots.txt comes by default when you install the CMS. What is the dilemma? That the default file has no parameters set, so bots crawl every URL of the site.

Before continuing to explain the parameters of robots.txt, let's talk about robots, or web bots: programs that, following specific parameters, navigate the World Wide Web in order to crawl the contents of any website. In the case of search engines, each of them has its own bot:

- Googlebot: Google's crawler.
- Slurp: Yahoo's bot.
- Bingbot: the Bing crawler.
- The Alexa tracking bot.

We will continue to call them bots, although they are also known as crawlers or spiders.
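To make the difference between "no parameters" and per-bot rules concrete, here is a minimal sketch using Python's standard urllib.robotparser module, which reads robots.txt rules the way a well-behaved bot would. Both robots.txt texts, the /drafts/ path, and the example.com URLs are hypothetical. In particular, the first text only illustrates a file without restrictions and is not a copy of what WordPress actually ships.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical file with no restrictions: an empty Disallow blocks nothing.
OPEN_ROBOTS_TXT = """\
User-agent: *
Disallow:
"""

# Hypothetical file with one rule for a single bot and no limits for the rest.
PER_BOT_ROBOTS_TXT = """\
User-agent: Googlebot
Disallow: /drafts/

User-agent: *
Disallow:
"""


def can_crawl(robots_text, user_agent, url):
    """Return True if the rules let this user agent fetch the URL."""
    parser = RobotFileParser()
    parser.parse(robots_text.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    url = "https://example.com/drafts/post"
    for bot in ("Googlebot", "Bingbot", "Slurp"):
        print(bot, "open file:", can_crawl(OPEN_ROBOTS_TXT, bot, url),
              "per-bot file:", can_crawl(PER_BOT_ROBOTS_TXT, bot, url))
```

With the unrestricted file every bot may fetch the URL. With the second file only the bot named in the first group is kept out of /drafts/, while the others fall through to the general "*" group.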

Why is it important to use robots.txt?

As we said, robots.txt tells crawlers or spiders what content you do not want them to visit on your website. Bots crawl sites for purposes that include:

- Link analysis.
- HTML code validation.
- Content analysis (to detect spam).
- Searching for images, videos, audio, etc.

Although bots crawl other types of content too, with robots.txt you can tell the spiders what content you want them to avoid every time they "read and analyze" the URLs of your website.

Why tell spiders not to read certain types of content? Easy: there are certain routes or information on a website that are not relevant either to the business or to the users.
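One way to picture the effect on a real crawl is to sort a list of URLs into those a rule-abiding spider may read and those it should skip. The sketch below does that with Python's standard urllib.robotparser. The rules, the bot name "ExampleBot", and the URLs are all invented for illustration, with /cart/ and /internal-search/ standing in for routes that are not relevant to users or to the business.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical rules: the blocked routes are placeholders.
ROBOTS_TXT = """\
User-agent: *
Disallow: /cart/
Disallow: /internal-search/
"""


def split_by_access(robots_text, user_agent, urls):
    """Return (allowed, blocked) lists of URLs for the given user agent."""
    parser = RobotFileParser()
    parser.parse(robots_text.splitlines())
    allowed, blocked = [], []
    for url in urls:
        if parser.can_fetch(user_agent, url):
            allowed.append(url)
        else:
            blocked.append(url)
    return allowed, blocked


if __name__ == "__main__":
    urls = [
        "https://example.com/",
        "https://example.com/blog/what-is-robots-txt",
        "https://example.com/cart/checkout",
        "https://example.com/internal-search/?q=shoes",
    ]
    allowed, blocked = split_by_access(ROBOTS_TXT, "ExampleBot", urls)
    print("allowed:", allowed)
    print("blocked:", blocked)
```

Keep in mind that this only models bots that choose to respect the file. A bot that ignores the guidelines can still request the blocked URLs, so robots.txt is not a way to protect sensitive information.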

