As of June 2021, it is (finally) possible to edit your robots.txt file in Shopify. This is great because it can help you focus crawling resources on the pages that you actually want to rank in Google.
Why limit crawling?
Not all pages are good for SEO, and Google even requires you to exclude some of them, such as search result pages.
Thin pages, pages with little or no original content, and pages that simply do not work as landing pages from search should also be excluded.
The standards for excluding pages in Search Engines
There are two standards for excluding pages from the search engines:
- Robots.txt
- META-robots
Robots.txt is a standard that most search engines and other agents follow. It can tell all of them, or specific agents, which files or directories not to crawl.
However, keep in mind that Google often indexes pages that are not crawled. This can happen because lots of other sites link to them, so Google knows they exist. Because Google does not crawl the pages you exclude in your robots.txt file, you are still safe from thin or duplicate content filtering and the damage it can do to your SEO: the search engines never actually see the content on the page.
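To illustrate how rules for all agents and rules for one specific agent are read, here is a minimal sketch using Python's standard-library robots.txt parser. The robots.txt content, the agent name ExampleBot and the example.com URLs are made up for illustration and are not Shopify's defaults.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content, for illustration only.
SAMPLE_ROBOTS_TXT = """\
User-agent: *
Disallow: /search
Disallow: /cart

User-agent: ExampleBot
Disallow: /
"""


def build_parser(robots_txt):
    """Parse robots.txt text (no network access) and return the parser."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


parser = build_parser(SAMPLE_ROBOTS_TXT)

# Googlebot has no group of its own here, so the "*" group applies to it.
print(parser.can_fetch("Googlebot", "https://example.com/products/shoe"))   # True
print(parser.can_fetch("Googlebot", "https://example.com/search?q=shoe"))   # False

# ExampleBot has its own group, which blocks the whole site for that agent.
print(parser.can_fetch("ExampleBot", "https://example.com/products/shoe"))  # False
```

Note that this only answers whether a URL may be crawled. As described above, a blocked URL can still end up indexed if other sites link to it.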
META-robots is a standard that all major search engines support but do not always respect. With it you can tell them not to index a page, and whether or not to follow the links on it (NOINDEX, FOLLOW or NOFOLLOW). The standard also includes an option to allow indexing (INDEX), but it serves no real purpose because indexing is assumed even when no tag is present.
It is very important to understand that the META-robots tag is just a signal, not a directive. Sometimes other signals will overrule it, so you can never be 100% sure that pages you add NOINDEX to will stay out of the index.
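If you want to check which META-robots values a page actually carries, a small script can read them out of the HTML. The sketch below uses only the Python standard library; the sample page is invented for illustration, and the helper names (RobotsMetaParser, robots_directives) are hypothetical.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collects the values of <meta name="robots" content="..."> tags."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr_map = {name: (value or "") for name, value in attrs}
        if attr_map.get("name", "").lower() != "robots":
            return
        # The content attribute is a comma-separated list, e.g. "NOINDEX, FOLLOW".
        for part in attr_map.get("content", "").split(","):
            if part.strip():
                self.directives.add(part.strip().lower())


def robots_directives(html):
    """Return the set of lowercased META-robots values found in the HTML."""
    parser = RobotsMetaParser()
    parser.feed(html)
    return parser.directives


# Hypothetical page, for illustration only.
SAMPLE_PAGE = """<html><head>
<title>Internal search results</title>
<meta name="robots" content="NOINDEX, FOLLOW">
</head><body></body></html>"""

directives = robots_directives(SAMPLE_PAGE)
print("noindex" in directives)   # True
print("nofollow" in directives)  # False
```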
Do not NOINDEX pages excluded in your robots.txt
If you exclude pages from being crawled, it serves absolutely no purpose to also add META-robots tags to them. When a page is not crawled, the search engines do not see its content, including the META-robots tag.
Editing your Shopify Robots.txt file
You can see your current Robots.txt file here: YourDomain.com/robots.txt
Unlike in many other CMSs, where the robots.txt file is a static text file you can edit, in Shopify it is dynamically generated. So to alter it you need to use Liquid code.
First, go to your Theme Editor. Under Templates click “Add a new template”.
Select “robots.txt” and click Create template.
You have now created the Liquid robots.txt file.
The code in the current default version of the file looks like this:
# we use Shopify as our ecommerce platform
{%- comment -%}
# Caution! Please read https://help.shopify.com/en/manual/promoting-marketing/seo/editing-robots-txt before proceeding to make changes to this file.
{% endcomment %}
{% for group in robots.default_groups %}
  {{- group.user_agent -}}
  {% for rule in group.rules %}
    {{- rule -}}
  {% endfor %}
  {%- if group.sitemap != blank -%}
    {{ group.sitemap }}
  {%- endif -%}
{% endfor %}
As the comment in the file points out, it is a good idea to read the official documentation before you start editing.
You can add, edit or remove rules in the default file following the documentation above.
PLEASE KEEP IN MIND that if you mess up your robots.txt file it can have a serious negative impact on your SEO. So be careful and only make edits whose effect you KNOW. If in doubt, consult qualified SEO experts and coders before you do anything.
We also strongly recommend that you validate and test the rules in your updated robots.txt file when done. Here is a good free robots.txt testing tool.
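Alongside a testing tool, you can keep a short list of URLs that must stay crawlable and URLs that must stay blocked, and re-check it after every edit. The following is a rough sketch of that idea in Python. The rules, the URLs and the check_rules helper are hypothetical. One caveat is that Python's built-in parser only does simple prefix matching and does not implement the `*` and `$` wildcards that Google supports, so rules that rely on wildcards still need a dedicated tester.

```python
from urllib.robotparser import RobotFileParser


def check_rules(robots_txt, expectations, user_agent="Googlebot"):
    """Return the URLs whose crawlability differs from what was expected.

    `expectations` maps a URL to True (should be crawlable)
    or False (should be blocked).
    """
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return [
        url
        for url, expected in expectations.items()
        if parser.can_fetch(user_agent, url) != expected
    ]


# Hypothetical rendered robots.txt after an edit, for illustration only.
UPDATED_ROBOTS_TXT = """\
User-agent: *
Disallow: /admin
Disallow: /cart
Disallow: /internal-notes/
"""

EXPECTATIONS = {
    "https://example.com/": True,
    "https://example.com/products/shoe": True,
    "https://example.com/cart": False,
    "https://example.com/internal-notes/draft": False,
}

mismatches = check_rules(UPDATED_ROBOTS_TXT, EXPECTATIONS)
print(mismatches)  # an empty list means every URL behaved as expected
```

If a product URL ever shows up in the returned list, an edit has blocked something it should not have.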