What is the Robots.txt file?
A robots.txt file on your server tells search engines which web pages or links you want to appear in Google searches and which pages you want to hide. Simply put, it restricts crawler bots from crawling parts of your website. It is also where you point crawlers to your sitemap, which controls how many post links are available to show in Google search.
Note: Please read this information carefully; otherwise, it may affect how your website appears in Google searches.
Why use Robots.txt?
Remember that before Googlebot crawls your site, it checks your robots.txt file. With it, you can tell crawlers not to crawl pages such as thank-you, login, or survey pages. You can also hide particular posts and pages. This benefits both you and the crawlers, since it makes your website easier to index.
Here is how a crawler checks the robots.txt file: you can find the robots.txt file at a URL that looks like this:
http://example.com/robots.txt
Similarly, our website's robots.txt looks like this: http://whizzyshubham.blogspot.com/robots.txt
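If you want to see for yourself what a crawler sees, here is a minimal Python sketch (using only the standard library; the URL is just an example) that fetches and prints a robots.txt file:

from urllib.request import urlopen

# Example address -- replace it with your own blog's domain.
url = "http://example.blogspot.com/robots.txt"

# Fetch the file exactly as a crawler would and print its rules.
with urlopen(url) as response:
    print(response.read().decode("utf-8"))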
Robots.txt Code for Blogger
User-agent: Mediapartners-Google
Disallow:
User-agent: *
Disallow: /search
Allow: /
Sitemap: http://example.blogspot.com/feeds/posts/default?orderby=UPDATED
Note: There is no need to add robots.txt code in Blogger unless you want to customize it.
Robots.txt Customization
Before customizing robots.txt, you should understand what each part of the code means.
- User-agent: Mediapartners-Google
This line is for the Google AdSense bot, which helps serve better ads on your site. Whether you are using AdSense or not, leave it as it is.
- User-agent: *
The asterisk (*) means this section applies to all robots. In the default settings, our blog's label links are blocked from indexing, which means web crawlers will not index our label page links because of the code below.
- Disallow: /search
That means any link with the keyword search just after the domain name will be ignored. See the example below, which is the link to the label page named PC:
http://www.bloggertipstricks.com/search/label/PC
If we remove Disallow: /search from the code above, crawlers will be able to access our entire blog and index and crawl all of its content and web pages.
- Allow: /
Here Allow: / refers to the homepage, which means web crawlers can crawl and index our blog's homepage.
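To check how these rules behave, here is a small sketch using Python's standard urllib.robotparser, fed with the same three lines shown above (the example.blogspot.com URLs are placeholders):

from urllib.robotparser import RobotFileParser

# The general rules from the Blogger robots.txt above.
rules = [
    "User-agent: *",
    "Disallow: /search",
    "Allow: /",
]

parser = RobotFileParser()
parser.parse(rules)

# Label pages live under /search, so they are blocked; the homepage is not.
print(parser.can_fetch("*", "http://example.blogspot.com/search/label/PC"))  # False
print(parser.can_fetch("*", "http://example.blogspot.com/"))                 # True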
- Disallow Particular Post
If we want to exclude a particular post from indexing, we can add a line like the one below to the code:
Disallow: /yyyy/mm/post-url.html
Here yyyy and mm refer to the publishing year and month of the post, respectively. For example, for a post published in April 2020 we would use the format below.
Disallow: /2020/04/post-url.html
To make this task easy, you can simply copy the post URL and remove the blog name from the beginning.
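If you prefer not to trim the URL by hand, a short sketch like this (the post URL is only an example) keeps just the path part, which is exactly what the Disallow line expects:

from urllib.parse import urlparse

# Example post URL -- replace it with the post you want to hide.
post_url = "http://example.blogspot.com/2020/04/post-url.html"

# Keep only the path; robots.txt rules are written against paths, not full URLs.
path = urlparse(post_url).path
print("Disallow: " + path)  # Disallow: /2020/04/post-url.html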
- Disallow Particular Page
If we need to disallow a particular page, we can use the same method as above. Simply copy the page URL and remove the blog address from it, which will look something like this:
Disallow: /p/page-url.html
- Custom Sitemap
Sitemap: http://example.blogspot.com/feeds/posts/default?orderby=UPDATED
This line points to the sitemap of our blog. By adding the sitemap link here, we optimize our blog's crawl rate: whenever web crawlers scan our robots.txt file, they find a path to our sitemap, where the links to all of our published posts are listed. As a result, web crawlers find it easy to crawl all of our posts.
Note: This sitemap only tells web crawlers about the 25 most recent posts. If you want to increase the number of links in your sitemap, replace the default sitemap with the one below, which covers the 500 most recent posts.
Sitemap: http://example.blogspot.com/atom.xml?redirect=false&start-index=1&max-results=500
If you have more than 500 published posts on your blog, you can use two sitemaps like the ones below:
Sitemap: http://example.blogspot.com/atom.xml?redirect=false&start-index=1&max-results=500
Sitemap: http://example.blogspot.com/atom.xml?redirect=false&start-index=501&max-results=500
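If your blog keeps growing, you can extend the same pattern. The sketch below (example.blogspot.com and the post count are placeholders) prints one Sitemap line per block of 500 posts, matching the 500-posts-per-feed limit described above:

# Assumed values for illustration: replace them with your own blog and post count.
blog = "http://example.blogspot.com"
total_posts = 1200
page_size = 500  # the feed covers at most 500 posts per request

# Print one Sitemap line per block of 500 posts (start-index is 1-based).
for start in range(1, total_posts + 1, page_size):
    print(f"Sitemap: {blog}/atom.xml?redirect=false"
          f"&start-index={start}&max-results={page_size}")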
How to add Robots.txt in Blogger?
Now for the main part of this tutorial: how to add a custom robots.txt in Blogger. Here are the steps.
- Go to your Blogger blog.
- Navigate to Settings >> Search Preferences >> Crawlers and indexing >> Custom robots.txt >> Edit >> Yes
- Now paste your robots.txt file code in the box.
- Click on the Save Changes button.
You are done!
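Once you have saved, you can confirm that Blogger is serving your custom rules with a quick check like this sketch (the blog address is a placeholder):

from urllib.request import urlopen

# Replace with your own blog's address.
url = "http://example.blogspot.com/robots.txt"

with urlopen(url) as response:
    robots = response.read().decode("utf-8")

# The live file should contain the rules you pasted in the editor.
print("Disallow: /search" in robots)  # True if the custom rules are active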