Wednesday, July 16, 2008

Understanding Search Engines and Search Engine Spiders

What is the best way to get your site into the search engines? Search engines obtain your URL in one of two ways: you manually submit your web pages directly to them, or other websites link to your pages. Once your URL is recognized, the search engine sends out its spider (or bot) to review your website.

After landing on your site, the spider begins to read all the content in the body of your web pages, including the markup elements, all the links to other pages on your site and the outbound links to other websites, plus elements from the page head, including the title tag and, depending on the search engine, some of the meta tags.
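Just to illustrate (the store name, domain and wording below are made up), this is the kind of page head a spider reads; the title tag is picked up by virtually every engine, and some engines also read the description and keywords meta tags:

    <head>
      <!-- Read by virtually every spider -->
      <title>Handmade Leather Wallets | Example Store</title>
      <!-- Some engines also read these meta tags -->
      <meta name="description" content="Handmade leather wallets, belts and bags, shipped worldwide.">
      <meta name="keywords" content="leather wallets, handmade wallets, leather goods">
    </head>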

Once it has finished browsing and gathering the information in your content, it takes everything back to the search engine's central database for indexing, which can take several months (two to three).

Spiders follow links on pages, repeating the same procedure over and over again. Spiders are not always too bright; for lack of a better term, they are dumb. They generally follow only basic HTML coding. Encasing a link in JavaScript that spiders won’t comprehend is usually not the way to go; they will simply ignore both the JavaScript and the link. The same goes for forms: spiders are not able to fill out a form and click the “submit” button.
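As a rough sketch (the URLs are made up), here is the difference between a link a spider can follow and the kinds of links it will usually ignore:

    <!-- A plain HTML link: spiders can follow this -->
    <a href="/products/wallets.html">Browse our wallets</a>

    <!-- A JavaScript-only link: most spiders ignore both the script and the destination -->
    <a href="javascript:void(0)" onclick="window.location='/products/wallets.html'">Browse our wallets</a>

    <!-- A form: spiders cannot fill it out or click submit, so the page behind it is never reached -->
    <form action="/search.cgi" method="post">
      <input type="text" name="q">
      <input type="submit" value="Search">
    </form>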

To understand what spiders see, try accessing your web pages with the Lynx browser from a Unix server. Lynx is a non-graphical, text-only browser: it does not support JavaScript and displays only text and regular href links. That is roughly what the spider sees and indexes. Do your web pages work without graphics or JavaScript? If not, the spiders will not work either, and that calls for an immediate backtrack to redo your site.
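One simple safeguard, shown here with made-up file names, is to back every graphic with plain text or alt text, so a text-only browser (and a spider) still gets the message:

    <!-- The alt text is what Lynx and the spiders will see -->
    <img src="/images/header-wallets.gif" alt="Handmade Leather Wallets">

    <!-- A graphic button backed by a regular href the spider can follow -->
    <a href="/contact.html"><img src="/images/btn-contact.gif" alt="Contact us"></a>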

Once the search engine has all your content in its database, it runs an algorithm (a mathematical formula) against that content. Algorithms are unique to each search engine and are constantly being revised, but, in essence, all search engines look for the important words on your web pages based on word density (how often your keywords or phrases are used relative to the total amount of content) and assign a value to those words based on the code surrounding them.
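To give a rough idea of what “the code surrounding the words” means (the phrase and page below are invented), the same keyword generally carries more weight in the title and heading than it does buried in a paragraph:

    <title>Handmade Leather Wallets | Example Store</title>
    <!-- further down, in the body -->
    <h1>Handmade Leather Wallets</h1>
    <p>Every one of our handmade leather wallets is cut and stitched by hand.</p>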

They also go back to the links on your web pages, looking at which other websites, or other pages on the same site, link to each page. The more links pointing to a page, the more importance search engines assign to it. Having one-way links from other sites pointing to your website is very important; it has nothing to do with optimizing the site itself, but I will be covering that in a future article. Optimization-wise, make sure you link to your important web pages from more than just the index page (e.g., create a primary navigation that appears on all pages).
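A simple sketch of such a primary navigation block, with example page names, that you would repeat on every page so the spider can reach your important pages from anywhere on the site:

    <!-- Include this navigation on every page, not just index.html -->
    <ul id="primary-nav">
      <li><a href="/index.html">Home</a></li>
      <li><a href="/products.html">Products</a></li>
      <li><a href="/articles.html">Articles</a></li>
      <li><a href="/contact.html">Contact</a></li>
    </ul>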

Tip 1

Rule #1 of search engine optimization (SEO): do not let your website design interfere in such a way that the code keeps the spider (bot) from being able to index it. That means avoiding pages that are 100% graphics with no text (pages that contain only images) or pages built entirely in Flash. Web design and SEO should go hand in hand. When someone lands on a website and the first thing they encounter is a log-in page before they can see the site’s information (text or content), that is also all the spider will see, and it will stop right there. So think a little further ahead and let experts in the field show you, step by step, what you should do.

Have you thought of building a web page entirely in Flash? Don’t.