What are canonical links?

Canonical links are also known as “preferred links”. When search engines encounter duplicate content, they have to choose which URL to favor over all the others that serve identical page content. Matt Cutts, a Google engineer, defined canonicalization on his blog as: “Canonicalization is the process of picking the best URL when there…

Continue reading

How does a Web Crawler work?

The first thing to understand is what a web crawler, or spider, is and how it works. A search engine spider (also known as a crawler, robot, SearchBot or simply a bot) is a program that most search engines use to find what’s new on the Internet. Google’s web crawler is known as Googlebot. There are many types of web spiders in use, but for now we’re only interested in the bot that actually “crawls” the web and collects documents to build a searchable index for the different search engines. The program starts at a website and follows every hyperlink on each page.
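As a rough illustration of "start at a website and follow every hyperlink", here is a minimal sketch in Python. It is not how any real search engine's bot is implemented. The `PAGES` dictionary is a made-up stand-in for the web (a real crawler would fetch pages over HTTP), and the names `LinkExtractor` and `crawl` are chosen only for this example.

```python
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urljoin

# Hypothetical stand-in for the web: URL -> HTML source.
# A real crawler would download each page over HTTP instead.
PAGES = {
    "http://example.com/": '<a href="/about">About</a> <a href="/blog">Blog</a>',
    "http://example.com/about": '<a href="/">Home</a>',
    "http://example.com/blog": '<a href="/about">About</a> <a href="/blog/post-1">Post</a>',
    "http://example.com/blog/post-1": "<p>No links here.</p>",
}


class LinkExtractor(HTMLParser):
    """Collects the href value of every <a> tag in a page."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def crawl(start_url, pages):
    """Visit start_url, then every page reachable from it by links.

    Returns the URLs in the order they were visited (breadth-first).
    """
    seen = {start_url}          # URLs already queued, so nothing is visited twice
    queue = deque([start_url])  # URLs waiting to be visited
    visited = []
    while queue:
        url = queue.popleft()
        html = pages.get(url)
        if html is None:        # page does not exist in our pretend web
            continue
        visited.append(url)
        parser = LinkExtractor()
        parser.feed(html)
        for href in parser.links:
            absolute = urljoin(url, href)  # resolve relative links
            if absolute not in seen:
                seen.add(absolute)
                queue.append(absolute)
    return visited


if __name__ == "__main__":
    for url in crawl("http://example.com/", PAGES):
        print(url)
```

The `seen` set is what keeps the crawler from looping forever when pages link back to each other, as the "About" and "Home" pages do here.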


Because the spider keeps following links from one website to another, any page that is linked from somewhere on the web can eventually be found and spidered. Search engines may run thousands of instances of their web crawling programs simultaneously, on multiple servers. When a web crawler visits one of your pages, it fetches the page’s content. Once a page has been fetched, its text is loaded into the search engine’s index, which is a massive database of words and where they occur on different web pages. All of this may sound too technical for most people, but it’s important to understand the basics of how a web crawler works.
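The "database of words and where they occur" is often called an inverted index. The sketch below shows the idea on a tiny scale, assuming the page text has already been fetched. The function names `build_index` and `search`, and the sample documents, are made up for this example and are not taken from any search engine.

```python
import re
from collections import defaultdict


def build_index(documents):
    """Map each word to the set of URLs whose text contains it.

    documents: dict of URL -> plain text of the page.
    """
    index = defaultdict(set)
    for url, text in documents.items():
        for word in re.findall(r"[a-z0-9]+", text.lower()):
            index[word].add(url)
    return index


def search(index, query):
    """Return the URLs that contain every word in the query."""
    words = re.findall(r"[a-z0-9]+", query.lower())
    if not words:
        return set()
    results = set(index.get(words[0], set()))
    for word in words[1:]:
        results &= index.get(word, set())
    return results


if __name__ == "__main__":
    docs = {
        "http://example.com/crawlers": "Web crawlers follow links",
        "http://example.com/sitemaps": "Sitemaps list links",
    }
    index = build_index(docs)
    print(search(index, "links"))
    print(search(index, "web links"))
```

Looking a word up in the index is a single dictionary access, which is why a search engine can answer a query without rereading every page it has crawled.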


Create a Sitemap for your Blog

A Sitemap is a very important part of every blog. Its function is to inform search engines about all the URLs on a website that are available for crawling. A Sitemap is an XML file that lists all the URLs for a site and is kept up to date as you publish new articles. This allows search engines to crawl the site more intelligently. To create a Sitemap for your WordPress blog, you will need the XML Sitemap Generator for WordPress.


You can download it here. This WordPress plugin will create a compliant sitemap in the format supported by most search engines, including the most popular ones: Google, Yahoo and Bing. As noted above, Sitemaps are very useful because they tell the search engines which pages exist on your site.
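To make the format more concrete, here is a small sketch of what generating such an XML file could look like in Python. It is not the plugin's code. It only assumes the standard sitemap namespace from sitemaps.org, and the function name `build_sitemap` and the example URLs and dates are invented for illustration.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(entries):
    """Build sitemap XML from (url, last_modified_date) pairs.

    Dates are expected as YYYY-MM-DD strings.
    """
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for loc, lastmod in entries:
        url = ET.SubElement(urlset, "url")
        ET.SubElement(url, "loc").text = loc          # address of the page
        ET.SubElement(url, "lastmod").text = lastmod  # when it last changed
    return ET.tostring(urlset, encoding="unicode")


if __name__ == "__main__":
    # Hypothetical blog posts, for illustration only.
    posts = [
        ("http://example.com/", "2024-01-15"),
        ("http://example.com/how-does-a-web-crawler-work", "2024-01-10"),
    ]
    print(build_sitemap(posts))
```

Each `<url>` entry holds the page address in `<loc>`, and the optional `<lastmod>` tells the crawler when the page last changed, which is the kind of information that helps it decide what to revisit.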
