|
While looking for near-magical formulae for the perfect placement and
counts of keywords, many webmasters forget to allow for the effects
that web-design choices, such as the use of JavaScript, tables or
images, may have on what the spider perceives.
A search engine spider is not a browser. It does
not 'see' a page in the way that a browser does. In fact, a spider
is much more like a word processor or text editor. It looks at the
text, performs complex search-and-replace operations, and then attempts
to process the page, much as a word processor might attempt to check
grammar.
Anything that is not actual text is effectively an obstacle for the
spider: it is simply extra characters in the text it sees, and the
spider has to decide what to do with them. This page, for example,
uses tables and graphics to present information more clearly for the
human eye.
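To make the "text editor" view concrete, here is a minimal Python sketch of what a very simple text-only crawler might keep from a page. It is an illustration under stated assumptions, not a description of how any real search engine works: the class name `SpiderTextExtractor` and the function `spider_text` are hypothetical, and the choices to drop script and style content and to keep image ALT text are assumptions made for the example.

```python
from html.parser import HTMLParser


class SpiderTextExtractor(HTMLParser):
    """Collects the text a simple, text-only crawler might keep (hypothetical)."""

    # Assumption: content inside these tags is ignored entirely.
    SKIPPED = {"script", "style"}

    def __init__(self):
        super().__init__()
        self.chunks = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIPPED:
            self._skip_depth += 1
        elif tag == "img":
            # Assumption: the ALT attribute is treated as ordinary text.
            alt = dict(attrs).get("alt")
            if alt:
                self.chunks.append(alt)

    def handle_endtag(self, tag):
        if tag in self.SKIPPED and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        # Keep only non-empty text that is not inside a skipped tag.
        if self._skip_depth == 0 and data.strip():
            self.chunks.append(data.strip())


def spider_text(html):
    """Return the plain text left after tags and scripts are set aside."""
    parser = SpiderTextExtractor()
    parser.feed(html)
    parser.close()
    return " ".join(parser.chunks)


page = (
    '<table><tr><td><img src="logo.gif" alt="Rithmix"> search engine '
    '<script>var x = 1;</script>algorithms</td></tr></table>'
)
print(spider_text(page))
```

Running the extractor over a table-based page and over its plain-text equivalent is a quick way to check that both versions expose the same words, which is the property the test pages in this section rely on.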
This section of Rithmix will look at various basic web-design issues
and see whether, and how, they affect the search engine algorithms.
Remember above all that these pages are primarily a tutorial in how to
design your own search engine algorithm analysis pages and experiments.
These pages are very public and could therefore be subject to tampering
or abuse, which would affect the results.
The following
are the pages in this test:
- web-design_1.html: This is the plainest possible HTML page,
  consisting of absolutely minimal tags and with the emphasis on pure
  text content.
- web-design_2.html: This page places the text of page 1 inside a
  table layout with nested tables. This adds a lot to the layout but
  leaves the content unchanged.
- web-design_3.html: This page adds some graphics to improve the
  appearance and design of the plain page. We replace one occurrence
  of the keyword 'Rithmix' with an image which has the keyword in the
  ALT attribute.
- web-design_4.html: This page adds some JavaScript functions to the
  page while again leaving as much of the content itself untouched as
  possible.
- web-design_5.html: This page is an exact duplicate of page 1.
- web-design_6.html: This page is an exact duplicate of page 2.
- web-design_7.html: This page is an exact duplicate of page 3.
- web-design_8.html: This page is an exact duplicate of page 4.
The reason for having a duplicate of each page is to separate rankings
that are based on content from rankings driven by other factors. If
the layout is the ranking factor, then we should see identical pages
ranked together. For example, if page 1 were the highest ranked, we
would expect to see page 5 (its duplicate) right behind it if the
layout is indeed the factor.
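The duplicate-pair check can be sketched in a few lines of Python. This is only an illustration: the function name `duplicates_adjacent` is hypothetical, and the `observed` ranking below is made-up sample input, not a result from this test. The pairs themselves follow the page list above (5 duplicates 1, 6 duplicates 2, and so on).

```python
def duplicates_adjacent(ranking, pairs):
    """For each (original, duplicate) pair, report whether the two pages sit
    next to each other in the ranking. None means a page was not listed."""
    positions = {page: index for index, page in enumerate(ranking)}
    result = {}
    for original, duplicate in pairs:
        if original in positions and duplicate in positions:
            gap = abs(positions[original] - positions[duplicate])
            result[(original, duplicate)] = gap == 1
        else:
            result[(original, duplicate)] = None
    return result


# Pairs taken from the page list: page n + 4 duplicates page n.
PAIRS = [
    (f"web-design_{n}.html", f"web-design_{n + 4}.html") for n in range(1, 5)
]

# Made-up ranking, best result first, used purely as sample input.
observed = [
    "web-design_1.html",
    "web-design_5.html",
    "web-design_3.html",
    "web-design_2.html",
    "web-design_7.html",
]

for pair, adjacent in duplicates_adjacent(observed, PAIRS).items():
    print(pair, adjacent)
```

If most pairs come back adjacent, the layout is a plausible ranking factor; if duplicates are scattered, something other than the page itself is probably influencing the order.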
|