How does a Web Crawler work?

The first thing you need to understand is what a Web Crawler or Spider is and how it works. A search engine spider (also known as a crawler, robot, search bot or simply a bot) is a program that most search engines use to find what’s new on the Internet. Google’s web crawler is known as Googlebot. There are many types of web spiders in use, but for now we’re only interested in the bot that actually “crawls” the web and collects documents to build a searchable index for the different search engines. The program starts at a website and follows every hyperlink on each page.
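To make the "start at a website and follow every hyperlink" idea concrete, here is a minimal sketch in Python. It is not how Googlebot is actually implemented. The `FAKE_WEB` dictionary, the `LinkExtractor` class and the `crawl` function are made-up names for illustration, and the "web" is just four hard-coded pages so the example runs without a network connection.

```python
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkExtractor(HTMLParser):
    """Collects the href value of every <a> tag in a page."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


# Hypothetical stand-in for the web: URL -> HTML body.
FAKE_WEB = {
    "http://example.com/": '<a href="/about">About</a> <a href="/blog">Blog</a>',
    "http://example.com/about": '<a href="/">Home</a>',
    "http://example.com/blog": '<a href="/blog/post-1">Post</a> <a href="/about">About</a>',
    "http://example.com/blog/post-1": "<p>No links here.</p>",
}


def crawl(start_url, fetch):
    """Breadth-first crawl: visit start_url, then every page it links to."""
    seen = {start_url}          # URLs already queued, so no page is visited twice
    queue = deque([start_url])
    order = []
    while queue:
        url = queue.popleft()
        html = fetch(url)
        if html is None:        # page could not be fetched; skip it
            continue
        order.append(url)
        parser = LinkExtractor()
        parser.feed(html)
        for href in parser.links:
            absolute = urljoin(url, href)  # resolve relative links
            if absolute not in seen:
                seen.add(absolute)
                queue.append(absolute)
    return order


visited = crawl("http://example.com/", FAKE_WEB.get)
print(visited)
```

The `seen` set is what keeps the crawler from looping forever when two pages link to each other. A real crawler would also respect robots.txt, limit its request rate and fetch pages over HTTP, all of which are left out here.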


Because the spider crawls from one website to another by following links, almost everything on the web that is linked from somewhere will eventually be found and spidered. Search engines may run thousands of instances of their web crawling programs simultaneously, on multiple servers. When a web crawler visits one of your pages, it fetches the page’s content. Once a page has been fetched, its text is loaded into the search engine’s index, which is a massive database of words and where they occur on different web pages. All of this may sound too technical for most people, but it’s important to understand the basics of how a Web Crawler works.
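The "database of words and where they occur" can be pictured as a lookup table from each word to the pages that contain it. The following is only a rough sketch of that idea in Python. The `build_index` and `search` functions and the sample `PAGES` are invented for this example, and a real search engine index also stores word positions, ranking signals and much more.

```python
import re
from collections import defaultdict


def build_index(pages):
    """Map each lowercase word to the set of URLs whose text contains it."""
    index = defaultdict(set)
    for url, text in pages.items():
        for word in re.findall(r"[a-z0-9']+", text.lower()):
            index[word].add(url)
    return index


def search(index, query):
    """Return the URLs that contain every word of the query."""
    words = re.findall(r"[a-z0-9']+", query.lower())
    if not words:
        return set()
    results = set(index.get(words[0], set()))
    for word in words[1:]:
        results &= index.get(word, set())  # keep only pages with this word too
    return results


# Hypothetical page texts, as a crawler might have fetched them.
PAGES = {
    "http://example.com/themes": "Free WordPress themes for your blog",
    "http://example.com/seo": "SEO tips for your WordPress blog",
    "http://example.com/contact": "Contact the author",
}

index = build_index(PAGES)
print(sorted(search(index, "wordpress blog")))
print(sorted(search(index, "seo")))
```

Looking up a word is a single dictionary access, no matter how many pages were indexed, which is why search engines build the index ahead of time instead of scanning pages when you search.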


Create a Sitemap for your Blog

A Sitemap is a very important part of every blog. Its function is to inform the search engines about all the URLs on a website that are available for crawling. A Sitemap is an XML file that lists all the URLs for a site and is updated every time you create a new article. This allows search engines to spider the site more intelligently. To create a Sitemap for your WordPress blog, you will need the XML Sitemap Generator for WordPress.


You can download it here. This WordPress plugin will create a compliant sitemap in the format supported by most search engines, including the most popular ones: Google, Yahoo and Bing. As I said above, Sitemaps are very useful because they tell the search engines which pages exist on your site.
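If you are curious what such a file contains, here is a small Python sketch that writes a sitemap by hand. This is not the plugin’s code. The `build_sitemap` function, the URLs and the dates are made up for illustration; the `urlset`, `url`, `loc` and `lastmod` element names and the namespace come from the sitemaps.org protocol.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(entries):
    """Return sitemap XML (as a string) for (url, last_modified) pairs."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for loc, lastmod in entries:
        url = ET.SubElement(urlset, "url")
        ET.SubElement(url, "loc").text = loc          # address of the page
        ET.SubElement(url, "lastmod").text = lastmod  # date of last change
    return ET.tostring(urlset, encoding="unicode")


# Hypothetical posts; a plugin would read these from the blog's database.
POSTS = [
    ("http://example.com/", "2009-06-01"),
    ("http://example.com/create-a-sitemap/", "2009-06-03"),
]

sitemap_xml = build_sitemap(POSTS)
print(sitemap_xml)
```

Each `url` entry gives the crawler one address to visit, and `lastmod` hints at whether the page has changed since the last visit. On a real blog you would let the plugin regenerate the file for you instead of maintaining it by hand.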


SEO tips for your WordPress Blog

I took some notes from Matt Cutts’s talk at WordCamp San Francisco 2009. Matt Cutts is the head of Google’s webspam team. He talked about how Google search works and how you can improve your WordPress blog for Google.

He says that WordPress is a great choice if you want to do better in Google, because it automatically solves a lot of SEO issues. According to Matt, WordPress takes care of 80-90% of the mechanics of Search Engine Optimization (SEO). So, by using WordPress you are already taking the first big step. However, there are many more things you can do to optimize your site. To rank well in Google, your site needs to be relevant and reputable. You have to stay on topic and write about something that you care about.

Apply Katamari Philosophy

About keywords: You should use categories and tags that are also keywords relevant to your site’s content. Don’t name a category something like “cool stuff”. Take a look at my categories: WordPress, Design, WpThemesPlanet, Patagonia Theme. All of those categories are relevant to my site’s content. I’m not going to use the name of a medication as a category, because my site’s niche is not meds. It’s WordPress themes and blogging articles. Keep this in mind when you build your category list, and make sure to use the keywords you want to rank well for in your posts and articles.
