Website optimization
When you're building a new website or completely renovating an old one, it's important to create your design in a search engine friendly way. The choices you make are going to be with you for a long time and errors will be very time-consuming to repair at a later stage.
In other parts of this site, we've looked at how to make individual pages rank well. Now, let's focus on website optimization and examine your site as a whole. We'll go over the design techniques and principles that the search engines like, but we'll also take a brief glimpse at some potential pitfalls. Welcome aboard, I hope you enjoy the trip!
Use as much text as possible
When the World Wide Web was born in the early 1990s, it was mainly a text-based medium. Sounds, images and complex animations were either very rare or completely unheard of. Not surprisingly, the first major search engines that came around a couple of years later were built to classify and rank WWW pages largely based on textual content. After all, the WWW consisted of text and would continue to do so for the foreseeable future, right?
By the late 1990s, the web had started to change. Although the role of text was still very important, it was now common for web pages to contain large images, Flash animations and other bells and whistles. However, due to numerous technical difficulties, the search engines were unable to widen their reach beyond the world of text. While search engines that specifically search for images have been created, general-purpose engines still mostly ignore everything that is not text.
The moral of the story is, unless your pages are built to contain a lot of text, they're unlikely to do well in most search engines. This doesn't mean that you should drop all the images from your website, but keep in mind that as far as the search engines are concerned, images, Flash animation and sounds do not exist.
Keep non-HTML code in external files
Many of today's sites use JavaScript, CSS, or both in their designs. Some of them have quite a lot of code in these languages on each of their pages and have placed it above the HTML containing the text used on the page. In terms of website optimization, this is a bad idea.
First of all, it forces the spider to wade through something that it is not at all interested in before being able to read the text. While modern spiders are probably quite well-accustomed to such unfriendly pages, it's safe to say that filling your pages with non-HTML code is more likely to hurt than to help you.
Second, the less the search engines know about the CSS and JavaScript you use, the better. If your code is embedded in the HTML, search engine spiders can freely read and analyze it if they want to. On the other hand, if you place your code in external files and use a robots.txt file to forbid search engines from downloading them, your code is fairly safe from prying eyes. Of course the search engines could still get it if they wanted to, but then they would have to disobey your robots.txt and fetch the .css or .js file, neither of which they're likely to do.
But why would you want to keep your CSS and JavaScript away from the eyes of the search engines if you're not doing anything wrong? Well, the problem is that search engines define what is acceptable and what is not, and it often seems like they have a lot of trouble making up their minds. For example, using a JavaScript redirect is occasionally "OK, if you have a legitimate reason for doing it" and occasionally "spamming, and we'll skin you from head to toe if we catch you". The point is that it's better to be safe than sorry, because the rules change all the time.
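As a rough sketch of the setup described above, a page can reference its CSS and JavaScript from external files, and a robots.txt file can ask spiders not to fetch them. The directory names /css/ and /js/ and the file names below are hypothetical. The robots.txt lines are shown inside an HTML comment only to keep the example in one place; in practice they would live in a separate robots.txt file at the root of the site.

```html
<!--
  robots.txt (a separate plain-text file at the site root; paths are hypothetical):

  User-agent: *
  Disallow: /css/
  Disallow: /js/
-->
<html>
<head>
<title>Example page</title>
<!-- Styles and scripts are referenced, not embedded, so the HTML stays mostly text. -->
<link rel="stylesheet" type="text/css" href="/css/site.css">
<script type="text/javascript" src="/js/site.js"></script>
</head>
<body>
<h1>Page heading</h1>
<p>The text you want the spider to read starts almost immediately.</p>
</body>
</html>
```

Keep in mind that robots.txt is a request rather than an access control: spiders that honor it will skip those directories, but nothing technically prevents a download.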
Frames or tables - or CSS?
The layout of your website and the way it is created is another factor that can either boost or reduce your search engine success. Here at the APG site, I've decided to use a table-based layout, which is usually considered something both human visitors and search engines can appreciate. However, it is not the only method available, and each of them has its pros and cons.
Tables
Search engines generally don't have any trouble reading a table-based page, provided that the layout is not overly complex or incorrectly designed. The only serious problem arises if you wish to have a navigation menu on the left side of the screen, just like I do. Placing the menu on the left causes its contents to appear above the rest of the page's content in your source code. Human visitors won't mind, but because search engines read your source code rather than what you see on the screen, this kind of arrangement may hurt your ranking.
You see, most search engines consider the text at the very top of the page to be more important than the text in the middle. This sounds a bit odd, but it's actually a very reasonable assumption. Take a look at some of the pages on this site, for example: if you begin reading from the top, it won't take long before you have a general idea of what the page is about. But if you start from the middle, it will usually take substantially longer to work out what subject is being discussed.
So, if your menu pushes the actual content of your page downwards in your source code, the search engine will have more difficulty determining what your page is about, which might cause your ranking to drop. Fortunately, there is a solution to this problem that allows you to use tables, keep your menu on the left and please the search engines at the same time. If you plan to use tables, I recommend using the table trick.
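For illustration, here is one widely circulated way of getting the content ahead of a left-hand menu in the source of a table layout. It is an assumption that this matches the trick referred to above; treat it as a sketch of the general idea. The widths and file names are hypothetical.

```html
<table border="0" cellpadding="0" cellspacing="0" width="100%">
<tr>
<!-- Empty top-left cell: holds the left column open but adds no text. -->
<td width="200" height="1"></td>
<!-- Main content: second cell in the source, spans both rows on screen. -->
<td rowspan="2" valign="top">
<h1>Page heading</h1>
<p>Your well-optimized body text goes here.</p>
</td>
</tr>
<tr>
<!-- Menu: displayed on the left under the empty cell, but last in the source. -->
<td width="200" valign="top">
<a href="page1.html">Page 1</a><br>
<a href="page2.html">Page 2</a>
</td>
</tr>
</table>
```

Because the only thing preceding the content cell is an empty cell, a spider reading the source from top to bottom reaches the body text before the menu links, while a visitor still sees the menu on the left.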
Frames
Some like them, some hate them. Think of them what you will, but generally frames are not as search engine friendly as tables. That is not to say that it's impossible to build a site that uses frames and does well in the engines; it is just harder to do than with tables.
If you already have a site that uses frames, or if you are simply determined to use them, it would be a good idea to implement a few website optimization tricks to prevent some of the most common problems.
To begin with, use a <NOFRAMES> tag on your frameset page. In it, include a simplified version (fewer graphics, no Flash, no JavaScript etc.) of the content page your frameset points to, along with links to all of your other content pages. By having a good NOFRAMES tag, you'll make it easier for the search engines that can't read framesets to index your pages. As an added bonus, the NOFRAMES tag lets people whose browsers can't display frames access your site.
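A frameset page along these lines might look roughly like the sketch below. The two-column layout and all file names (menu.html, home.html and so on) are hypothetical placeholders.

```html
<html>
<head>
<title>Example site</title>
</head>
<frameset cols="200,*">
<frame src="menu.html" name="menu">
<frame src="home.html" name="content">
<noframes>
<body>
<!-- Simplified text version of the page the frameset points to. -->
<h1>Example site</h1>
<p>A short text version of the homepage content goes here.</p>
<!-- Plain links to the other content pages, so spiders can follow them. -->
<p>
<a href="home.html">Home</a> |
<a href="products.html">Products</a> |
<a href="contact.html">Contact</a>
</p>
</body>
</noframes>
</frameset>
</html>
```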
However, there's another serious problem caused by frames that can't be solved with the NOFRAMES tag. Usually, a typical design that uses frames has the site navigation in one frame and the content in another. After submitting your content pages to the search engines, they will eventually be indexed and hopefully start receiving visitors. The trouble is that when someone arrives directly at one of the content pages, the navigation frame will not load. This can deter visitors from venturing further into your site and thus reduce the usefulness of the traffic sent to you by the search engines.
While this is a difficult situation, there are things you can do to correct it. The simplest of them is to add the following JavaScript to all of your content pages:
<script type="text/javascript">
<!--
// If this page was loaded outside the frameset, load the frameset page instead.
if (top == self) location.replace("FILENAME OF YOUR FRAMESET PAGE");
// -->
</script>
As long as you remember to place the name of your frameset page into the script, you can get it to work simply by copying and pasting it between the <HEAD> and </HEAD> tags in your HTML. However, as mentioned above, it would be best to spend some extra time and place the script in an external file instead.
So, what will the script do? Quite simply, it'll check whether the page is displayed inside the frameset and if not, it will load the frameset. This will give the visitors who arrive directly at your content pages the opportunity to see your navigation menu and thus browse your site. Sounds great, right?
Unfortunately, the script is not as good as it seems. If you point it to your entry frameset page, you'll notice that while it loads the navigation, it will also load your homepage. You've given the visitor a way to navigate your site, but in return, you're redirecting him to a page that might be completely different from the one he found in the search engine. This is in my opinion better than doing nothing, but it is still a very unsatisfactory solution.
Luckily, there are some more refined ways of handling the issue with JavaScript. They'll require a bit more effort and skill, but can deliver both the navigation menu and the correct page to the user at the same time. While these scripts have their own problems, such as not being 100% valid HTML code, they're far superior to any other solutions I've seen. So, if you're using frames and want to offer a satisfying experience to those of your users who arrive through the search engines, using them instead of that simple script I showed you is really the way to go.
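To give an idea of what such a refined approach could look like, here is a minimal sketch in which the content page passes its own path to the frameset page, and the frameset page builds itself around that path. This is not necessarily how any particular published script works. The file names frameset.html, menu.html and home.html and the parameter name page are all hypothetical.

```html
<!-- Part 1: goes in the HEAD of every content page. -->
<script type="text/javascript">
// Loaded outside the frameset? Reload the frameset and pass along this page's path.
if (top == self) {
  location.replace("/frameset.html?page=" + encodeURIComponent(location.pathname));
}
</script>

<!-- Part 2: the whole of frameset.html. -->
<html>
<head>
<title>Example site</title>
</head>
<script type="text/javascript">
// Default to the homepage when no page was requested.
var page = "home.html";
var match = /[?&]page=([^&]*)/.exec(location.search);
if (match) {
  try {
    var requested = decodeURIComponent(match[1]);
    // Accept only plain same-site paths ending in .html, so the parameter
    // cannot load another site or inject markup into the frameset.
    if (/^\/?(?:[A-Za-z0-9_-]+\/)*[A-Za-z0-9_-]+\.html$/.test(requested)) {
      page = requested;
    }
  } catch (e) {
    // Malformed escape sequence in the URL: keep the default page.
  }
}
// Writing the frameset from script is what makes this not strictly valid HTML.
document.write(
  '<frameset cols="200,*">' +
  '<frame src="menu.html" name="menu">' +
  '<frame src="' + page + '" name="content">' +
  '<\/frameset>'
);
</script>
</html>
```

The path check is deliberately strict: any page whose name contains characters outside letters, digits, underscores and hyphens falls back to the homepage, so it would need loosening for sites with other naming schemes. Visitors and spiders without JavaScript get no frameset at all from this sketch, which is one of the problems such scripts have.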
To sum it up, by implementing the above suggestions, you can create frame-based sites that get along with search engines a lot better than they would normally do. They won't be perfect, but what in this world really is?
Cascading Style Sheets
Search engine-wise, using CSS to create your layout is probably the best possible solution. In addition to being more flexible than frames and tables, CSS also lets you arrange your source code in whatever order you like, independently of how the page looks on screen. This is helpful, because you can ensure that the spiders always read the most important and well-optimized content on the page first without having to change the layout itself.
Even though it has many excellent properties, a CSS layout feels a bit ahead of its time at the moment. While it is entirely possible to implement, it will cause problems with older browsers such as Netscape Navigator 4. CSS is likely to become the layout method of choice eventually, but for now it is still better to stick with tables.
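As a small sketch of the source-ordering idea, the page below puts the content first in the HTML and uses CSS positioning to display the menu on the left anyway. The element ids, widths and file names are hypothetical, and the style rules are embedded only to keep the example self-contained; following the earlier advice, they would normally go in an external file.

```html
<html>
<head>
<title>Example page</title>
<style type="text/css">
/* Leave room on the left for the menu. */
#content { margin-left: 220px; }
/* Lift the menu out of the normal flow and pin it to the top-left corner. */
#menu { position: absolute; top: 0; left: 0; width: 200px; }
</style>
</head>
<body>
<!-- Content comes first in the source, so spiders read it first. -->
<div id="content">
<h1>Page heading</h1>
<p>Your well-optimized body text goes here.</p>
</div>
<!-- Menu comes last in the source but is displayed on the left. -->
<div id="menu">
<a href="page1.html">Page 1</a><br>
<a href="page2.html">Page 2</a>
</div>
</body>
</html>
```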
Avoid non-HTML filetypes
Due to the great success of Adobe's Acrobat and Microsoft's Word and Excel, many sites now make parts of their content available in files created with these programs. While this may be the fastest and easiest way to post content on the Web, it can make getting your information listed on the search engines very difficult.
Although the search engines are continuously getting better at finding and indexing information, most of them can't read .PDF (Acrobat), .DOC (Word) or .XLS (Excel) files. Google is ahead of the rest in this area, as it supports all of these filetypes. Another major player, FAST, is able to index .PDF files, but not Word or Excel documents. If you want your content to be found on the rest of the engines, you're going to have to stick with HTML.
However, it must also be noted that even plain old HTML pages may cause trouble with search engines if they are generated dynamically, for example with a CGI script. There are several good ways of taking care of these problems without having to sacrifice the flexibility of generating HTML dynamically, but it's important to be aware that they do exist.
Conclusion
In order to get your pages listed at the search engines and get them to rank well, you'll have to do more than just add META tags and get a couple of links to point to your site. By designing and constructing your site correctly, you're building a solid foundation on which it is possible to apply various optimization techniques in the future.
Changing an existing site structure to one that works better with the search engines can feel like a large task, and it often is one. However, if you're planning to make improvements, it's better to start your website optimization project as quickly as possible. Sites tend to become larger and more complex with age, so the job is unlikely to get any smaller as time passes.