Top quality web design, professional SEO internet Marketing optimization, content management, links campaigns, specialized E-Marketing, PPC Pay Per Click campaigns, WebPro developments

Thursday, November 16, 2006

Avoid Search Engine Blacklisting

The best way to avoid being blacklisted by the search engines is to avoid using some questionable techniques that were once popular to gain high rankings. Even if your website is not blacklisted by using some of the techniques below, it may be penalized (buried in the rankings) so your traffic will suffer all the same. When a search engine blacklists a website it will throw your listing off their site and block your site from coming aboard again. This can be done by blocking the domain name, the IP address or both.

Here are a few techniques to avoid, so that your site will not be blacklisted:

Mirror Websites

Mirror websites are sites with identical content but different URLs. This was once a method used to gain high rankings in the search engines, but since search engines are smarter now, this will only get you penalized or blacklisted.

Doorway (gateway) Pages

Doorway pages are pages with little real content for your visitors that are optimized to rank highly within the search engines. These pages are designed so that visitors will move deeper into the website where the real content lies. Navigation to the doorway pages is usually hidden from the visitors (but not the SE robots) on the homepage.

Invisible Text and Graphics

Using invisible text (text the same or a very similar color to the background) was once used to spam a homepage and some inside pages with non-stop keywords and keyphrases. Also links to doorway pages and hidden site maps can be done with invisible text (or invisible graphics). Some designers will create a graphic link with a 1 pixel by 1 pixel raster image and link this to a hidden inner page such as a hidden site map.

Submitting Pages Too Often

Submitting the same pages to the search engines within a 24 hour period can get you penalized and may delay your website from being listed in the rankings. Some search engines consider resubmitting a page more often than once every 30 days to be too much. The 30 day rule is a good rule to follow when submitting to multiple search engines.

Using Irrelevant Keywords

Using irrelevant keywords in a website's meta tags and/or body copy in order to achieve high rankings will most certainly backfire. Search engines now want to see parity between these two areas, and if your site is thought to be spamming with irrelevant keywords, your site will be penalized or blacklisted.

Automated Submissions to the Major Search Engines

Using an automated service or software to submit your website to the search engines can be extremely counterproductive. Most of the major search engines and directories accept manual submissions but do not like to be spammed with the automated ones.

Cloaking

Cloaking is the practice of deceiving both the search engine and the visitor by serving up different pages for each. The visitor sees a nicely designed and formatted page and the search engine robot scans a page of highly optimized text. Any practice that is deceptive should be avoided and the downfall of cloaking is that, if caught, the website can be banned permanently.


Using a Cheap or Free Web Host

Using a cheap or free web host can hurt your search engine rankings. Frequent downtime and pages taken down for exceeding the bandwidth limit deter robots from indexing your site. If a robot cannot access your site often enough, your site will be dropped from the search engines. Hosting is cheap, so if you are serious about your website, get your own domain name and host, not one like: geocities.com/yoursite.

Sharing an IP Address

Sharing an IP address, even with a legitimate web host, can get your site in trouble. If you have cleaned up your website from all of the techniques mentioned above and your website still does not get relisted by the search engines in a couple of months, check with your host to see if you are sharing an IP address with other sites. If so, you may consider moving your website to a new host who will give you your own IP address, or at least one that is not shared with another company that has had its IP address (and yours) banned by the search engines.


Copyright 2005 Carmosa ww

Tips for Targeted Traffic

Here are 4 of the easiest things to implement on a site for a very generous boost in traffic. These can all be implemented within a very short time. Of course, depending on the search engine, you will most likely have to wait for it to spider your site, but when it does, the Targeted Traffic will come.

Ok,...

-- Tip 1 --
It's a good idea to insert a link to a sitemap into your index page at the bottom, if this is the page you paid for inclusion with.

Why?

This is so that when the search engine spiders crawl your site, they will ALSO spider your sitemap and, through it, index ALL of your other webpages. Having every page in the engines is very beneficial for you, since it gives your site more coverage.

-- Tip 2 --
Try to restrict your website to as few directories as possible.

Why?

Search engines like to see a URL (web address) like:

http://www.yoursite.com/page.html

NOT one that looks like:

http://www.yoursite.com/this_dir/that_dir/another_dir/page.html

So try to restrict your site to as few directories as possible and try to put your most important pages as close to the root directory as possible too. So the spiders don't have to "fish through" directory after directory to find your pages.
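As a rough illustration of the point above, the following Python sketch counts how many directories sit between the root and a page. The function name directory_depth is made up for this example, and the rule that a path without a trailing slash ends in a file name is an assumption, not something any search engine documents.

```python
from urllib.parse import urlparse


def directory_depth(url):
    """Return the number of directories between the site root and the page."""
    path = urlparse(url).path
    segments = [s for s in path.split("/") if s]
    # Assumption: if the path does not end with "/", the last segment is a
    # file name (page.html), not a directory, so it is not counted.
    if segments and not path.endswith("/"):
        segments = segments[:-1]
    return len(segments)


print(directory_depth("http://www.yoursite.com/page.html"))  # 0
print(directory_depth(
    "http://www.yoursite.com/this_dir/that_dir/another_dir/page.html"))  # 3
```

Running a check like this over a list of your own URLs is a quick way to spot important pages that are buried several directories deep.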

-- Tip 3 --

When creating your website, implement a "reverse pyramid style" construction.

Why?

Search engines like to see a broad to specific layout. Your index page, would be your broad or wide theme about your whole site, and your interior pages should focus in or "zero-in" to a specific interest. For example, if your site is on dogs, a more specific area or page would be show dogs and/or working dogs etc.

and finally

-- Tip 4 --
When submitting your website to the engines, do a bit of research and seek out the search engine(s) that "feeds" the other search engines and thus submit to the one or ones that feed the most to other engines.

Why?

Many search engines get their results from other search engines. For example, Google supplies (feeds) results to AOL, Netscape, and iWon. So, if you get indexed within Google, you will most likely get into the indexes of AOL, Netscape, and iWon too. You get the idea? Even though you might only be submitting to one engine, you are "indirectly" submitting to many more.

These 4 small but very powerful ideas, implemented on your site, will bring you a big boost of Targeted Traffic. Don't forget that search engines are THE largest source of Targeted Traffic online today.

This is because whoever goes searching on a search engine is looking for one thing, and usually one thing only. So if your site is present within the top pages for the niche or topic that searcher is looking for, your website is right there in front of them and is the most likely one for him/her to visit.

Anatomy Of A Top Ranking Web page

Optimizing web pages for high rankings in the search engines involves two main processes. First, there are the on-page factors, which include what keywords you place where on the page itself. The second, and more important, process is getting the off-page factors right - incoming links.

This article explores mainly the on-page factors. As the competition for a keyword phrase increases, off-page factors become more important to good rankings and these often mask the effects of on-page factors making it impossible to see what on-page factors are important. For this reason, I am going to look at a high ranking page with low levels of competition in Google.

First, let's consider what we mean by competition.

There are two ways to look at competition in Google. There is the competition a page has when you type the phrase with quotes, and the competition when you type the words without quotes. The number of results returned by Google in each case is YOUR competition.

The main differences between these two types of search are as follows:

Search with Quotes - this returns only those pages that have been "optimized" for the exact phrase.

Search without Quotes - this returns all pages that have been "optimized" for the words making up the phrase.

e.g. (in simple terms)

a) If you search Google for

alsatian dog

Google returns 41,000 competing pages.

b) If you search Google for

"alsatian dog"

Google returns 6,390 competing pages.

In (a) above, there are 41,000 pages that refer to alsatian AND dog, but not necessarily to alsatian dog.

In (b) above, there are 6,390 pages that refer to the exact phrase alsatian dog.

Now, if you want to rank well for the term "alsatian dog" on Google, you only have to compete with 6,390 other pages for this exact term.

However, there are 41,000 - 6,390 = 34,610 other pages that are related to this search, and might still beat you if Google sees them as more relevant than your page.
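The arithmetic above can be written as a tiny Python helper. The function name partial_match_pages is invented for this sketch; the two numbers are the result counts quoted in the example.

```python
def partial_match_pages(unquoted_results, quoted_results):
    """Pages that match the individual words but not the exact phrase."""
    if quoted_results > unquoted_results:
        raise ValueError("exact-phrase results cannot exceed unquoted results")
    return unquoted_results - quoted_results


# Figures from the alsatian dog example above.
print(partial_match_pages(41000, 6390))  # 34610
```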

We have discussed before the importance of link reputation and PR in ranking. It is possible for a high PR page to rank well for a term like alsatian dog, even if it does not have the exact phrase on the page.

This fact clouds the issue somewhat, and so although I recommend searching with quotes to find the real competition, I also recommend that you look at the top few results in Google (as searched without quotes) to determine how important those "partial match" pages are.

A quick search at:

http://www.prsearch.net/

for alsatian dog, shows me that the top pages for this search without quotes have a low PR (0-3) and many of those pages have 0 incoming links.

The same search at PRSearch.net using quotes around the phrase shows very similar results. The competing pages for the exact term have low PR and few incoming links.

This phrase should be easy to target and get top rankings if done properly.

A word of warning: Because the PR reported on the Google toolbar is out of date (see earlier), you cannot be 100% sure of the PR of the pages, even using a site like PRSearch. They will use the same formula that the toolbar uses, and so will be equally out of date. Only Google knows the exact PR it is using in its ranking for any one page.

A second check I often do is to look at the PR of the homepage of the site that is ranking well, as this gives me an indication of how important the site as a whole is. For the phrase alsatian dog (with or without quotes), the top page is:

http://www.castleofspirits.com/stories02/alsatian.html

The homepage

http://www.castleofspirits.com

has a PR of 6 - quite an important site.

However, there is no link to the alsatian page on the homepage, so the PR 6 homepage won't directly help towards the high ranking of the alsatian dog web page.

Doing a backward links check on Google does not help since there are no backlinks listed for this top ranking page.

OK, putting on my detective hat, I see a link at the bottom of the Alsatian page called "March 02 Ghost Stories". There is another link to "Ghost Story Page".

Clicking on the link to Ghost Story Page, I am taken to a PR 5 page:

http://www.castleofspirits.com/storypg.html

where I find a link to March 2002 Ghost Stories. Clicking that link takes me to a PR 3 page:

http://www.castleofspirits.com/stories02/mch2002.html

And on this page I find a link to Ghostly Alsatian dog.

So, the top ranking alsatian dog page has one link I know of from a PR 3 page. I might assume that this site also has a sitemap (although I cannot find one) where it contains a second link to the alsatian dog page. That means a total of 2 links, both internal.

I can assume from this that the PR of 2 reported for the alsatian page is probably correct, and that the page itself has very few incoming links. I am confident that if I targeted the phrase alsatian dog, I would easily get a top ranking.

The phrase alsatian dog is therefore an EASY phrase to target.

As a final check I went to the searchguild difficulty tool mentioned in section 6 of this newsletter and typed my phrase into that. The Search Guild rates this term as EASY.

With relatively few off-page factors contributing to the high ranking of this page, I can only assume that the on-page factors are what makes this page stand out from the rest and rank at number 1 on Google.

There are a variety of tools available for calculating density, but I use a tool I wrote for myself, which is not available for purchase.

Running this URL through my tool tells me a lot of useful information.

Density of the phrase "alsatian dog" on the page is 0.49%

The keyword is found ONCE in the title (11.11%), and TWICE in the main text on the page (a density of just 0.34%).

The keyword is not found in any header or meta tag!

As a second check I always look at what I call the partial density. That is the sum of the densities of all words that make up the phrase.

e.g. the phrase "alsatian dog" is made up of two words - alsatian AND dog. I look at the density of alsatian, and the density of dog, and combine the two densities.

This is useful because it tells me the density on the page of the words that make up the phrase (remember it is possible to rank well without the exact phrase on the page) - a kind of simplified page reputation.

The partial density of this page is 3.09%, made up of 7 occurrences of alsatian, and 12 occurrences of dog. This page is obviously about alsatians and dogs!
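The tool itself is not shown here, so the following Python sketch is only one plausible way to compute these two figures. It assumes density means occurrences divided by the total number of words on the page, times 100. That assumption at least agrees with the numbers quoted above: 19 occurrences at 3.09% and 3 occurrences at 0.49% both point to a page of roughly 615 words. The function names are made up for the example.

```python
import re


def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z0-9']+", text.lower())


def phrase_density(text, phrase):
    """Occurrences of the exact phrase per 100 words of text."""
    words = tokenize(text)
    target = tokenize(phrase)
    if not words or not target:
        return 0.0
    n = len(target)
    hits = sum(1 for i in range(len(words) - n + 1) if words[i:i + n] == target)
    return 100.0 * hits / len(words)


def partial_density(text, phrase):
    """Combined density of the individual words that make up the phrase."""
    words = tokenize(text)
    if not words:
        return 0.0
    parts = set(tokenize(phrase))
    # Exact word matches only: "dogs" is not counted as "dog" in this sketch.
    hits = sum(1 for w in words if w in parts)
    return 100.0 * hits / len(words)


sample = "Ghost alsatian dog story: the dog barked"
print(round(phrase_density(sample, "alsatian dog"), 2))   # 14.29
print(round(partial_density(sample, "alsatian dog"), 2))  # 42.86
```

The sample sentence has 7 words, one exact match of the phrase, and three matches of its individual words, which gives the two percentages shown.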

Let's look at the prominence of this phrase on the page. First an explanation of what prominence means.

Prominence is a measure of where on the page a word exists.

A prominence of 100% would mean it was the first word on the page.

A prominence of 1% would indicate it was the last word on the page.

A prominence of 50% would indicate it was the middle word on the page.

If the phrase was the first word (100% prominence) and the last word (1% prominence) on the page, the average prominence on the page would be about 50%. That means the keywords are well spread out on the page. As prominence increases, the keyword is found higher up the page, as it decreases, it is found lower down the page.

For analysis of top ranking pages, I look not only at the average prominence of ALL occurrences of the phrase on the page (i.e. how the keywords are spaced out on the page), but also at the prominence of the first occurrence (i.e. how close to the start of the document the phrase is first found).

The prominence of the first occurrence of the phrase alsatian dog is 99.67%. That means it is almost the first phrase on the page (only the word ghost comes before it).

The average prominence of the whole page for this term is 62.62%. That means that the keywords are distributed more in the upper portion of the page. Haven't I always told you that it was important to get your main keyword in the top one-third of the page?
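The exact formula behind these prominence figures is not given, so the Python sketch below uses an assumed one: a phrase starting at word position i (counting from 0) on a page of N words scores 100 * (N - i) / N. That gives 100% for the first word and a value close to 0% for the last, which roughly matches the description above but may not reproduce the tool's numbers exactly. The function names are hypothetical.

```python
import re


def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z0-9']+", text.lower())


def prominence_scores(text, phrase):
    """Return (first_occurrence, average) prominence in percent, or None."""
    words = tokenize(text)
    target = tokenize(phrase)
    n, total = len(target), len(words)
    if n == 0:
        return None
    starts = [i for i in range(total - n + 1) if words[i:i + n] == target]
    if not starts:
        return None  # phrase does not appear on the page
    # Assumed formula: earlier positions score closer to 100.
    scores = [100.0 * (total - i) / total for i in starts]
    return scores[0], sum(scores) / len(scores)


sample = "Ghost alsatian dog tale of an alsatian dog"
print(prominence_scores(sample, "alsatian dog"))  # (87.5, 56.25)
```

In the sample, the phrase starts at the second and seventh of 8 words, scoring 87.5 and 25.0, so the first-occurrence prominence is 87.5 and the average is 56.25.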

This page is a good one to study. It shows a top ranking page for a low-competition keyword phrase. Because of the low competition, incoming links and PR are less important (though if you have both, you could dominate this phrase), while on-page factors will make or break the ranking.

Even though the exact phrase is only found 3 times on the page, the fact that the phrase is in the title of the document and in the body text seems to be enough. This low density is backed up by using the words that make up the phrase several times on the page. Google will be in no doubt what this page is about.

A final help to the ranking of this page is the filename. Notice that part of the keyword phrase is found in the filename - alsatian.html

Twelve Steps to Higher Search Engine Placement

Recent studies suggest that more than 80% of new visitors to any web site get there as a result of a search engine query. If these studies are to be believed, they certainly suggest that working to get high rankings in the search engines might be the most effective thing you can do to bring traffic to your site.

The following 12 design tips will help you get started in optimizing your site's search engine placement.

1. Design for Specific Search Engines - there are hundreds of different search engines, but for best results you should design your site to take full advantage of the search criteria of the big three - Yahoo, Google and MSN. If you can get high rankings in these three, you won't need to worry about the other search engines. Knowing how these search engines rank sites (as well as why they will penalize a site) is important. The rules change often, but the tips below are the most current.

2. Know your target audience. Before you apply any of the tips below, do some research and find out what are the most likely key words and phrases your target audience will be searching for. In most cases, the key words or phrases won't be your site name, but will be something related to the solution to a specific problem or the answer to a specific question. Knowing the question that will be asked is half the battle.

3. Use Meta tags. By now just about everyone knows about Meta tags. These are tags you can place in the html of your web page to help the search engines categorize what your page is about. The two most important Meta tags are 'keywords' and 'description'. The description Meta tag should describe what is on the particular page, and the keywords Meta tag should include the important key words from the page. Avoid using 'fluff' words and phrases as these will be ignored by the search engines.

Warning: If the keywords in the Meta keyword line are not found within the text on the web page, some search engines will penalize the page or simply not list it. This is done to prevent 'meta tag' spoofing.

My advice - have a different Meta description tag on every page. And be sure that keywords in the keyword tags are used on the page.
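One way to check the warning above before submitting a page is sketched below in Python, using the standard library's html.parser module. It lists any Meta keywords that never appear in the visible text of the page. The class and function names are made up for this example, and the plain substring match is a simplification (the keyword "dog" would also match inside "dogma").

```python
from html.parser import HTMLParser


class MetaKeywordChecker(HTMLParser):
    """Collects the Meta keywords and the visible text of one page."""

    SKIPPED = ("script", "style", "title")  # not counted as page text

    def __init__(self):
        super().__init__()
        self.keywords = []
        self.text_parts = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "meta" and (attrs.get("name") or "").lower() == "keywords":
            content = attrs.get("content") or ""
            self.keywords = [k.strip().lower()
                             for k in content.split(",") if k.strip()]
        elif tag in self.SKIPPED:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIPPED and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        if not self._skip_depth:
            self.text_parts.append(data)


def missing_keywords(html):
    """Return the Meta keywords that do not appear in the page text."""
    checker = MetaKeywordChecker()
    checker.feed(html)
    text = " ".join(" ".join(checker.text_parts).lower().split())
    return [k for k in checker.keywords if k not in text]


page = ('<html><head><title>Dogs</title>'
        '<meta name="keywords" content="alsatian dog, cheap flights">'
        '</head><body><p>Our alsatian dog is friendly.</p></body></html>')
print(missing_keywords(page))  # ['cheap flights']
```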

4. Optimize your Title tag. Many search engines give considerable weight to the html title tag on the page. It is the first element the search engine will scan and weight. Not including a title instantly reduces the search engine ranking your page will receive. When you create a title tag, include keywords and write it to catch the attention of users who will be scanning lengthy lists of titles in search engine results. Check out our Webmaster Tools for an easy way to create great title tags and meta tags.

For higher ranking, make sure the title tag matches headline text on the page. And be sure to use a different title tag for each page on your site. (Pages with the same title tag will often be ignored.)
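To find pages that share a title, something like the following Python sketch could be run over your own pages. It assumes you already have the html of each page in a dictionary keyed by URL; the names TitleExtractor and duplicate_titles are invented for the example.

```python
from collections import defaultdict
from html.parser import HTMLParser


class TitleExtractor(HTMLParser):
    """Pulls the text of the title tag out of one page."""

    def __init__(self):
        super().__init__()
        self.in_title = False
        self.title = ""

    def handle_starttag(self, tag, attrs):
        if tag == "title":
            self.in_title = True

    def handle_endtag(self, tag):
        if tag == "title":
            self.in_title = False

    def handle_data(self, data):
        if self.in_title:
            self.title += data


def duplicate_titles(pages):
    """Map each title used by more than one page to the URLs that use it."""
    by_title = defaultdict(list)
    for url, html in pages.items():
        parser = TitleExtractor()
        parser.feed(html)
        # Compare titles case-insensitively and ignore extra whitespace.
        by_title[" ".join(parser.title.split()).lower()].append(url)
    return {title: urls for title, urls in by_title.items() if len(urls) > 1}


pages = {
    "/a.html": "<title>Show Dogs</title>",
    "/b.html": "<title>show dogs</title>",
    "/c.html": "<title>Working Dogs</title>",
}
print(duplicate_titles(pages))  # {'show dogs': ['/a.html', '/b.html']}
```

Pages with no title at all end up grouped under an empty title, so the same check also flags missing titles once two or more pages lack one.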

5. Use Keywords in page headlines. Page headlines are important - to your visitor as well as to search engines. Use short keyword phrases, including hot button words and phrases. Avoid 'fluff' and generic words.

My advice - use a strong headline on the page, and use the same headline in the title tag.

6. Use interesting text. Search engines actually count all the words on a webpage, then rank those words by frequency of use. The more often you use a word or phrase (up to a point), the higher you will rank with that word or phrase in the search engine. For that reason, be sure to include words or phrases that are likely to be searched for on your pages.

My advice - Keep your text short, on topic, and packed full of keywords. Avoid useless and meaningless words, and certain phrases that will place you in the penalty box.

7. Use ALT tags on all images. Search engines are starting to index sites by the images found on the site. They accomplish this by looking at all the image tags on the page, and cataloging the ALT tag accompanying each image. Obviously, if you don't use the ALT tag, then images on your site won't be properly cataloged. When using the ALT tag, be sure to use a keyword or phrase describing the content of the image.
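A quick way to find images without ALT text is sketched below in Python with the standard html.parser module. The names MissingAltFinder and images_missing_alt are hypothetical, and treating an empty or blank ALT value as missing is an assumption made for this example.

```python
from html.parser import HTMLParser


class MissingAltFinder(HTMLParser):
    """Records the src of every image that has no usable ALT text."""

    def __init__(self):
        super().__init__()
        self.missing = []

    def handle_starttag(self, tag, attrs):
        if tag != "img":
            return
        attrs = dict(attrs)
        # Assumption: a blank alt="" counts the same as no alt at all.
        if not (attrs.get("alt") or "").strip():
            self.missing.append(attrs.get("src") or "(no src)")


def images_missing_alt(html):
    """Return the src values of images lacking ALT text."""
    finder = MissingAltFinder()
    finder.feed(html)
    return finder.missing


page = ('<img src="a.gif" alt="Alsatian dog">'
        '<img src="b.gif">'
        '<img src="c.gif" alt=" ">')
print(images_missing_alt(page))  # ['b.gif', 'c.gif']
```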


8. Use the Title tag on links.
Search engines look at all text on the site, including the title tag on the links on your site. Most sites still don't use the link title tag, so when you do, you gain an advantage. The link title should be a short keyword or phrase.

My advice - Check out how the pages on my site have a left navigation menu filled with department names. I try to make these names keywords for my site, and the link to the departments all make use of the title tag. Doing it this way means that the search engine ranks the department names twice. Once as text, and again as a Link Title Tag.

9. Provide a Link Trail. Search engines coming to your site follow the links on the front page that lead into your site. These links should provide a 2 level trail to all pages on your site. If you don't provide a link trail, the search engines probably won't find all your pages. (And even if you do provide a 'link trail' - if you use the same title tag and Meta tags on your pages, the search engine may ignore all the pages beyond the first one.)

My advice - check out how every page on my site has a one click link trail to any department on the site. You are never more than two clicks away from any page. Plus every page has at least 30 different link trails (through the departments) making it easy for visitors as well as search engine spiders to move through the site.
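The "never more than two clicks away" rule can be checked with a breadth-first walk over your site's internal links, as in the Python sketch below. It assumes you have already collected, for each page, the list of pages it links to; the function names and the sample site are made up for the example.

```python
from collections import deque


def click_depths(links, start):
    """Breadth-first search: fewest clicks needed to reach each page."""
    depths = {start: 0}
    queue = deque([start])
    while queue:
        page = queue.popleft()
        for target in links.get(page, []):
            if target not in depths:
                depths[target] = depths[page] + 1
                queue.append(target)
    return depths


def pages_beyond(links, start, max_clicks=2):
    """Pages that need more than max_clicks, or cannot be reached at all."""
    depths = click_depths(links, start)
    all_pages = set(links) | {t for targets in links.values() for t in targets}
    return sorted(p for p in all_pages
                  if depths.get(p, max_clicks + 1) > max_clicks)


site = {
    "/": ["/dogs.html"],
    "/dogs.html": ["/show.html"],
    "/show.html": ["/deep.html"],
    "/orphan.html": [],
}
print(pages_beyond(site, "/"))  # ['/deep.html', '/orphan.html']
```

In the sample site, /deep.html is three clicks from the front page and /orphan.html has no link trail at all, so both are reported.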

10. Avoid the Penalty Box. Search engines are getting smarter every day, and they will penalize a site if it violates search engine rules. These rules include:

Keyword spoofing - using keywords not related to site content

Keyword spamming - pasting hundreds of copies of the keywords on the page just to get high ranking

Numerous doorway pages - using hundreds of index pages that do nothing but point to the site

Link Spamming - submitting links to the 500,000 link submission services

Page Redirects - not necessarily a major penalty, but can cause loss of ranking

Frames on Main Page - not necessarily a major penalty, but can cause loss of ranking

Flash Movie on Main Page - not necessarily a major penalty, but can cause loss of ranking

My advice: Keep in mind that search engines are intelligent software. When they visit a page they try to determine what the page is about, relying primarily on the titles, headlines, text, links, and images on the page. That's why it is important to focus on those elements, and avoid the ones that can put you in the penalty box.

11. Check for errors. Before you submit your page to the search engines, run the page through an html checker and a spelling checker. Search engines do check and take into consideration spelling and html errors, and will penalize a page that has too many of either.

My advice - take the time to do it right. If you get a poor ranking on a search engine it might be six weeks before the search engine comes back to re-rank you. Get it right before you submit to the search engines, and then keep it right so when the search engine returns, you will continue to get high rankings.

12. Manually submit the site. Don't be tempted to use an automatic site submission program. They don't work, and can get you penalized. It is better to manually submit your site to the top search engines: Yahoo, Google, MSN.

My advice - Start with Google.com, then yahoo.com, and then MSN.com. Each has a place to register your site with their search engine.

This may seem a lot of work, but if you do it right and get high rankings, it will pay off.

Link Building: To Link, or Not to Link, That is the Question

Lately, there have been a lot of heated discussions regarding link building. Is it ethical to create a link building campaign? Does Google or any other search engine penalize for “link farms” (a bunch of non-related links created for the SOLE purpose of increasing search engine ratings)? Is the “link building era” over?

Link Farms

Many webmasters claim that Google penalizes websites for link farms. If this is so, why are there link farm sites that have a PageRank of 5 on their link farm page? Let me give you an example: go to Google.com, type in any popular keyword(s) and append the word “links” to it. For example, “ABCD Links.” Notice you’ll get tons of websites that are linking to unrelated sites, but still have a great PR. I do not encourage link farming in any way or form, but building valuable partnerships is something I do encourage. If you’re a jewelry site and you’re exchanging links with an antiques site, I think this is a great thing. It’s very possible that someone searching for jewelry may have an interest in antiques. I own a few sites and the hits I receive from partially related sites are phenomenal. Most of my websites are targeted towards webmasters so I receive hits from all sorts of sites. Remember, many webmasters claim to know the truth regarding search engine algorithms, but no one knows the EXACT TRUTH except the programmers at the search engines. Webmasters may have been penalized in the past for link farms and doorway pages (pages created around certain keywords for the sole purpose of redirecting surfers to the main page), but who’s to say why some HAVEN’T been penalized?

Who Should I Create Link Partnerships With?

1. Create link partnerships with sites related to your site.

2. Create link partnerships with sites that have an audience that may be interested in your products

3. Exchange homepage links with websites that fit in your genre. You’ll receive a link on a page with a good ranking and you’ll get quality, targeted hits.

How Do You Find Link Partners?

1. Sign up with an automated links management provider that organizes your links and helps find link partners. Automated tools are the best because they help you save time. Uploading and downloading your pages for every link exchange can be a huge pain and very time consuming. Automated tools are NOT link farms because they give you complete control of who you exchange links with. If a member exchanges links with every website, he/she might be abusing the tool. If a member uses the tool responsibly, it can be very effective.

2. This is a great little exercise I would recommend to anyone.

a. Write down 20 search terms pertaining to your website

b. Go to http://www.google.com and type in a search term written in step A

c. Contact as many websites as possible listed under that search term and ask them to exchange inner page or homepage links with you. If they're NOT interested, ask whether they would be interested in selling advertising space. If your website is fairly new, you may want to ask sites for advertising prices. Free hits are the best hits but sometimes you might have to go the extra lap and pay for service. I wouldn't recommend spending over $100 / month for advertising UNLESS it's a high traffic site with your site audience. The best deals are the ones that offer $10-25 / month. There are sites out there that will sell SUPER cheap advertising, and others that charge an arm and a leg.

EVERYONE gets rejected sometime or another so keep your head up and keep moving along! The internet is comprised of Billions of websites. If you're rejected 500 times, you still have MANY business opportunities left! This is a continuous job so set aside time every week to do this. If you don’t have time for this, then you don’t have time to make money.

3. Hire a link building firm (this can be quite expensive, but if you have the money, it’s minimal work for you)

PageRank

PageRank is a great tool but don’t use it as your ONLY measurement for success. I can’t explain everything about PageRank, but a good explanation is given at http://www.google-watch.org/pagerank.html.

Conclusion

In conclusion, link building is not only for increasing link partners and search engine rankings, it’s also used to create partnerships and establish profitable relationships with other webmasters. Link building is a tedious process, but one of the best for getting free (sometimes not), targeted hits from other websites.



What is PageRank

Chances are you have been on the Internet and have been surfing in and out of websites looking for valuable information pertaining to a favorite topic or researching a subject for school or work. As you type keyword(s) for the information you are searching for into Google, you come up with 10,000 pages of information. It’s virtually impossible to go through every one, so you refine your search by adding more exclusive keywords. Voila, the number of pages drops to around 1,000. Still, this is a lot of pages, but you start looking through the information to find what you want.

As you go through the first 10 links on the page, WHAM! The information you needed to find was in the first or second result in order of PageRank. You wonder, how did they get such a high rank on Google? You may think it was very expensive to get that site to the top of the heap. The funny thing is, with a little know-how and about $75, you too can go for the top.

Search Engine Optimization, or “SEO”, has become a standard in the web design industry. Every customer of a good web designer wants to be number one for their keyword and may be willing to pay the extra money to get there. A good web designer will dress up a web site’s home page to match the requirements of their client on specific keywords. The client will also pay more for the exclusivity to remain there untouched. SEO has become a niche for a lot of web companies. They know if they can get the company to the top fast, the word of mouth will be helpful toward their business.

Through specialized META tags (a hidden group of keywords) the web designer will strategically place keywords multiple times in the title bar, in the keywords tag, and even as hidden text. Some search engines have figured these tricks of the trade out and have banned certain websites from their indexes. Google has become the engine of choice for a lot of people today. Google uses a different logic to calculate PageRank, and keywords are only a portion of it.

Google actually uses a specialized mathematical equation to place your site in a predetermined order. First things first: if your domain name is a keyword, that does not automatically give you a top spot. It will take time to move up the ranks and you should register with Google as soon as possible to drive your rank upwards. But just having the right URL (Uniform Resource Locator) doesn’t guarantee the top spot either. You must also be swapping or reciprocating links with other sites that Google has indexed. The more links you have with indexed websites, the faster and higher your site will go in the ranks.

A Google robot will visit your site frequently, so continue to modify your code and keep checking its rank and status. Eventually, your site will move up the ranks and land on top. It may take time and work, but you will get the hang of keeping it there once you employ the right mix of keywords and links. Some companies can charge up to $1,000 for the top spot. They employ the same techniques, even though they don’t want you to know this. Keep your META tags, title, keywords and content in line with your keywords and continuously look to optimize them. Under no circumstances take another person’s keywords from their code; this is potentially dangerous as you could be violating copyright laws.

Friday, April 21, 2006

Top Ten Mistakes in Web Design

1. Bad Search

Overly literal search engines reduce usability in that they're unable to handle typos, plurals, hyphens, and other variants of the query terms. Such search engines are particularly difficult for elderly users, but they hurt everybody.
A related problem is when search engines prioritize results purely on the basis of how many query terms they contain, rather than on each document's importance. Much better if your search engine calls out "best bets" at the top of the list -- especially for important queries, such as the names of your products.
Search is the user's lifeline when navigation fails. Even though advanced search can sometimes help, simple search usually works best, and search should be presented as a simple box, since that's what users are looking for.
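
As a rough illustration of the points above, here is a minimal Python sketch of a site search that forgives hyphens, plurals and small typos, and lists hand-picked "best bets" first. The page catalog, the URLs and the best-bets table are all made up for the example, and the plural handling is deliberately crude; a real site would more likely rely on a proper search library.

```python
import difflib
import re

# Hypothetical mini-catalog: page title -> URL.
PAGES = {
    "Wireless Keyboards": "/products/keyboards/",
    "E-mail Support": "/support/email/",
    "Shipping Rates": "/shipping/",
}

# Hand-picked "best bets": normalized query term -> URL to list first.
BEST_BETS = {"keyboard": "/products/keyboards/"}


def normalize(term):
    """Lowercase, drop hyphens and punctuation, strip a trailing plural 's'."""
    term = re.sub(r"[^a-z0-9]", "", term.lower())
    if len(term) > 3 and term.endswith("s"):
        term = term[:-1]
    return term


def search(query):
    """Return matching URLs, best bets first, tolerating small typos."""
    vocabulary = {}
    for title, url in PAGES.items():
        for word in title.split():
            vocabulary.setdefault(normalize(word), []).append(url)

    best, rest = [], []
    for raw in query.split():
        term = normalize(raw)
        # Snap a misspelled term to the closest known word, if there is one.
        close = difflib.get_close_matches(term, list(vocabulary), n=1, cutoff=0.8)
        if close:
            term = close[0]
        if term in BEST_BETS:
            best.append(BEST_BETS[term])
        rest.extend(vocabulary.get(term, []))

    # Drop duplicates while keeping the order.
    return list(dict.fromkeys(best + rest))
```

With this sketch, a query such as "keybords" or "Key-boards" still reaches the keyboards page, which a strictly literal engine would miss.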


2. PDF Files for Online Reading

Users hate coming across a PDF file while browsing, because it breaks their flow. Even simple things like printing or saving documents are difficult because standard browser commands don't work. Layouts are often optimized for a sheet of paper, which rarely matches the size of the user's browser window. Bye-bye smooth scrolling. Hello tiny fonts.
Worst of all, PDF is an undifferentiated blob of content that's hard to navigate.
PDF is great for printing and for distributing manuals and other big documents that need to be printed. Reserve it for this purpose and convert any information that needs to be browsed or read on the screen into real web pages.


3. Not Changing the Color of Visited Links

A good grasp of past navigation helps you understand your current location, since it's the culmination of your journey. Knowing your past and present locations in turn makes it easier to decide where to go next. Links are a key factor in this navigation process. Users can exclude links that proved fruitless in their earlier visits. Conversely, they might revisit links they found helpful in the past.
Most important, knowing which pages they've already visited frees users from unintentionally revisiting the same pages over and over again.
These benefits only accrue under one important assumption: that users can tell the difference between visited and unvisited links because the site shows them in different colors. When visited links don't change color, users exhibit more navigational disorientation in usability testing and unintentionally revisit the same pages repeatedly.


4. Non-Scannable Text

A wall of text is deadly for an interactive experience. Intimidating. Boring. Painful to read.
Write for online, not print. To draw users into the text and support scannability, use well-documented tricks:
subheads
bulleted lists
highlighted keywords
short paragraphs
the inverted pyramid
a simple writing style, and
de-fluffed language devoid of marketese.


5. Fixed Font Size

CSS style sheets unfortunately give websites the power to disable a Web browser's "change font size" button and specify a fixed font size. About 95% of the time, this fixed size is tiny, reducing readability significantly for most people over the age of 40.
Respect the user's preferences and let them resize text as needed. Also, specify font sizes in relative terms -- not as an absolute number of pixels.
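
To make "relative terms" concrete, here is a small Python sketch that turns a pixel size from a design mock-up into an em-based CSS rule. It assumes the reader's default text size is 16 pixels, which is common but not guaranteed, and the function name is only illustrative.

```python
BROWSER_DEFAULT_PX = 16.0  # assumed default text size; users may have changed it


def relative_font_rule(selector, design_px, base_px=BROWSER_DEFAULT_PX):
    """Return a CSS rule that sizes text in em units instead of pixels."""
    em = round(design_px / base_px, 3)
    return "%s { font-size: %gem; }" % (selector, em)


print(relative_font_rule("p", 12))
print(relative_font_rule("h1", 24))
```

Because the result is expressed relative to the user's own setting, the browser's text-size control keeps working.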


6. Page Titles With Low Search Engine Visibility

Search is the most important way users discover websites. Search is also one of the most important ways users find their way around individual websites. The humble page title is your main tool to attract new visitors from search listings and to help your existing users to locate the specific pages that they need.
The page title is contained within the HTML "title" tag and is almost always used as the clickable headline for listings on search engine result pages (SERP). Search engines typically show the first 66 characters or so of the title, so it's truly microcontent. Page titles are also used as the default entry in the Favorites when users bookmark a site. For your homepage, begin with the company name, followed by a brief description of the site. Don't start with words like "The" or "Welcome to" unless you want to be alphabetized under "T" or "W."
For other pages than the homepage, start the title with a few of the most salient information-carrying words that describe the specifics of what users will find on that page. Since the page title is used as the window title in the browser, it's also used as the label for that window in the taskbar under Windows, meaning that advanced users will move between multiple windows under the guidance of the first one or two words of each page title. If all your page titles start with the same words, you have severely reduced usability for your multi-windowing users.
Taglines on homepages are a related subject: they also need to be short and quickly communicate the purpose of the site.
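
A quick way to check titles against the advice above is sketched below in Python. The 66-character limit is only the approximate figure quoted in the text, and the helper names are made up for the example.

```python
SERP_TITLE_LIMIT = 66  # approximate figure quoted above; engines vary


def serp_preview(title, limit=SERP_TITLE_LIMIT):
    """Return roughly what a results page would show for this title."""
    if len(title) <= limit:
        return title
    # Cut at the last full word that fits and mark the truncation.
    return title[:limit].rsplit(" ", 1)[0] + "..."


def leading_words(title, count=2):
    """The first words of a title, which label the window in the taskbar."""
    return " ".join(title.split()[:count])


print(serp_preview("Welcome to the official home page of the Acme Tools Company of Springfield"))
print(leading_words("Acme Tools: hand tools and hardware"))
```

Running the two helpers over every page title on a site makes it easy to spot titles that get cut off or that all start with the same words.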


7. Anything That Looks Like an Advertisement

Selective attention is very powerful, and Web users have learned to stop paying attention to any ads that get in the way of their goal-driven navigation. (The main exception being text-only search-engine ads.)
Unfortunately, users also ignore legitimate design elements that look like prevalent forms of advertising. After all, when you ignore something, you don't study it in detail to find out what it is.
Therefore, it is best to avoid any designs that look like advertisements. The exact implications of this guideline will vary with new forms of ads; currently follow these rules:
banner blindness means that users never fixate their eyes on anything that looks like a banner ad due to shape or position on the page
animation avoidance makes users ignore areas with blinking or flashing text or other aggressive animations
pop-up purges mean that users close pop-up windoids before they have even fully rendered; sometimes with great viciousness (a sort of getting-back-at-GeoCities triumph).


8. Violating Design Conventions

Consistency is one of the most powerful usability principles: when things always behave the same, users don't have to worry about what will happen. Instead, they know what will happen based on earlier experience. Every time you release an apple over Sir Isaac Newton, it will drop on his head. That's good.
The more users' expectations prove right, the more they will feel in control of the system and the more they will like it. And the more the system breaks users' expectations, the more they will feel insecure. Oops, maybe if I let go of this apple, it will turn into a tomato and jump a mile into the sky.
Jakob's Law of the Web User Experience states that "users spend most of their time on other websites."
This means that they form their expectations for your site based on what's commonly done on most other sites. If you deviate, your site will be harder to use and users will leave.


9. Opening New Browser Windows

Opening up new browser windows is like a vacuum cleaner sales person who starts a visit by emptying an ash tray on the customer's carpet. Don't pollute my screen with any more windows, thanks (particularly since current operating systems have miserable window management).
Designers open new browser windows on the theory that it keeps users on their site. But even disregarding the user-hostile message implied in taking over the user's machine, the strategy is self-defeating since it disables the Back button which is the normal way users return to previous sites. Users often don't notice that a new window has opened, especially if they are using a small monitor where the windows are maximized to fill up the screen. So a user who tries to return to the origin will be confused by a grayed out Back button.
Links that don't behave as expected undermine users' understanding of their own system. A link should be a simple hypertext reference that replaces the current page with new content. Users hate unwarranted pop-up windows. When they want the destination to appear in a new page, they can use their browser's "open in new window" command -- assuming, of course, that the link is not a piece of code that interferes with the browser’s standard behavior.


10. Not Answering Users' Questions

Users are highly goal-driven on the Web. They visit sites because there's something they want to accomplish -- maybe even buy your product. The ultimate failure of a website is to fail to provide the information users are looking for.
Sometimes the answer is simply not there and you lose the sale because users have to assume that your product or service doesn't meet their needs if you don't tell them the specifics. Other times the specifics are buried under a thick layer of marketese and bland slogans. Since users don't have time to read everything, such hidden info might almost as well not be there.
The worst example of not answering users' questions is to avoid listing the price of products and services. No B2C ecommerce site would make this mistake, but it's rife in B2B, where most "enterprise solutions" are presented so that you can't tell whether they are suited for 100 people or 100,000 people. Price is the most specific piece of info customers use to understand the nature of an offering, and not providing it makes people feel lost and reduces their understanding of a product line. We have miles of videotape of users asking "Where's the price?" while tearing their hair out.
Even B2C sites often make the associated mistake of forgetting prices in product lists, such as category pages or search results. Knowing the price is key in both situations; it lets users differentiate among products and click through to the most relevant ones.

Thursday, February 23, 2006

Database of Web Robots


Database of Web Robots, Overview




  1. ABCdatos BotLink
  2. Acme.Spider
  3. Ahoy! The Homepage Finder
  4. Alkaline
  5. Anthill
  6. Walhello appie
  7. Arachnophilia
  8. Arale
  9. Araneo
  10. AraybOt
  11. ArchitextSpider
  12. Aretha
  13. ARIADNE
  14. Arks
  15. ASpider (Associative Spider)
  16. ATN Worldwide
  17. Atomz.com Search Robot
  18. AURESYS
  19. BackRub
  20. Unnamed
  21. BBot
  22. Big Brother
  23. Bjaaland
  24. BlackWidow
  25. Die Blinde Kuh
  26. Bloodhound
  27. Borg-Bot
  28. BoxSeaBot
  29. bright.net caching robot
  30. BSpider
  31. CACTVS Chemistry Spider
  32. Calif
  33. Cassandra
  34. Digimarc Marcspider/CGI
  35. Checkbot
  36. ChristCrawler.com
  37. churl
  38. cIeNcIaFiCcIoN.nEt
  39. CMC/0.01
  40. Collective
  41. Combine System
  42. ConfuzzledBot
  43. CoolBot
  44. Web Core / Roots
  45. XYLEME Robot
  46. Internet Cruiser Robot
  47. Cusco
  48. CyberSpyder Link Test
  49. CydralSpider
  50. Desert Realm Spider
  51. DeWeb(c) Katalog/Index
  52. DienstSpider
  53. Digger
  54. Digital Integrity Robot
  55. Direct Hit Grabber
  56. DNAbot
  57. DownLoad Express
  58. DragonBot
  59. DWCP (Dridus' Web Cataloging Project)
  60. e-collector
  61. EbiNess
  62. EIT Link Verifier Robot
  63. ELFINBOT
  64. Emacs-w3 Search Engine
  65. ananzi
  66. esculapio
  67. Esther
  68. Evliya Celebi
  69. nzexplorer
  70. FastCrawler
  71. Fluid Dynamics Search Engine robot
  72. Felix IDE
  73. Wild Ferret Web Hopper #1, #2, #3
  74. FetchRover
  75. fido
  76. Hämähäkki
  77. KIT-Fireball
  78. Fish search
  79. Fouineur
  80. Robot Francoroute
  81. Freecrawl
  82. FunnelWeb
  83. gammaSpider, FocusedCrawler
  84. gazz
  85. GCreep
  86. GetBot
  87. GetURL
  88. Golem
  89. Googlebot
  90. Grapnel/0.01 Experiment
  91. Griffon
  92. Gromit
  93. Northern Light Gulliver
  94. Gulper Bot
  95. HamBot
  96. Harvest
  97. havIndex
  98. HI (HTML Index) Search
  99. Hometown Spider Pro
  100. Wired Digital
  101. ht://Dig
  102. HTMLgobble
  103. Hyper-Decontextualizer
  104. iajaBot
  105. IBM_Planetwide
  106. Popular Iconoclast
  107. Ingrid
  108. Imagelock
  109. IncyWincy
  110. Informant
  111. InfoSeek Robot 1.0
  112. Infoseek Sidewinder
  113. InfoSpiders
  114. Inspector Web
  115. IntelliAgent
  116. I, Robot
  117. Iron33
  118. Israeli-search
  119. JavaBee
  120. JBot Java Web Robot
  121. JCrawler
  122. AskJeeves
  123. JoBo Java Web Robot
  124. Jobot
  125. JoeBot
  126. The Jubii Indexing Robot
  127. JumpStation
  128. image.kapsi.net
  129. Katipo
  130. KDD-Explorer
  131. Kilroy
  132. KO_Yappo_Robot
  133. LabelGrabber
  134. larbin
  135. legs
  136. Link Validator
  137. LinkScan
  138. LinkWalker
  139. Lockon
  140. logo.gif Crawler
  141. Lycos
  142. Mac WWWWorm
  143. Magpie
  144. marvin/infoseek
  145. Mattie
  146. MediaFox
  147. MerzScope
  148. NEC-MeshExplorer
  149. MindCrawler
  150. mnoGoSearch search engine software
  151. moget
  152. MOMspider
  153. Monster
  154. Motor
  155. MSNBot
  156. Muncher
  157. Muninn
  158. Muscat Ferret
  159. Mwd.Search
  160. Internet Shinchakubin
  161. NDSpider
  162. NetCarta WebMap Engine
  163. NetMechanic
  164. NetScoop
  165. newscan-online
  166. NHSE Web Forager
  167. Nomad
  168. The NorthStar Robot
  169. ObjectsSearch
  170. Occam
  171. HKU WWW Octopus
  172. OntoSpider
  173. Openfind data gatherer
  174. Orb Search
  175. Pack Rat
  176. PageBoy
  177. ParaSite
  178. Patric
  179. pegasus
  180. The Peregrinator
  181. PerlCrawler 1.0
  182. Phantom
  183. PhpDig
  184. PiltdownMan
  185. Pimptrain.com's robot
  186. Pioneer
  187. html_analyzer
  188. Portal Juice Spider
  189. PGP Key Agent
  190. PlumtreeWebAccessor
  191. Poppi
  192. PortalB Spider
  193. psbot
  194. GetterroboPlus Puu
  195. The Python Robot
  196. Raven Search
  197. RBSE Spider
  198. Resume Robot
  199. RoadHouse Crawling System
  200. RixBot
  201. Road Runner: The ImageScape Robot
  202. Robbie the Robot
  203. ComputingSite Robi/1.0
  204. RoboCrawl Spider
  205. RoboFox
  206. Robozilla
  207. Roverbot
  208. RuLeS
  209. SafetyNet Robot
  210. Scooter
  211. Search.Aus-AU.COM
  212. Sleek
  213. SearchProcess
  214. Senrigan
  215. SG-Scout
  216. ShagSeeker
  217. Shai'Hulud
  218. Sift
  219. Simmany Robot Ver1.0
  220. Site Valet
  221. SiteTech-Rover
  222. Skymob.com
  223. SLCrawler
  224. Inktomi Slurp
  225. Smart Spider
  226. Snooper
  227. Solbot
  228. Speedy Spider
  229. spider_monkey
  230. SpiderBot
  231. Spiderline Crawler
  232. SpiderMan
  233. SpiderView(tm)
  234. Spry Wizard Robot
  235. Site Searcher
  236. Suke
  237. suntek search engine
  238. Sven
  239. Sygol
  240. TACH Black Widow
  241. Tarantula
  242. tarspider
  243. Tcl W3 Robot
  244. TechBOT
  245. Templeton
  246. TitIn
  247. TITAN
  248. The TkWWW Robot
  249. TLSpider
  250. UCSD Crawl
  251. UdmSearch
  252. UptimeBot
  253. URL Check
  254. URL Spider Pro
  255. Valkyrie
  256. Verticrawl
  257. Victoria
  258. vision-search
  259. void-bot
  260. Voyager
  261. VWbot
  262. The NWI Robot
  263. W3M2
  264. WallPaper (alias crawlpaper)
  265. the World Wide Web Wanderer
  266. w@pSpider by wap4.com
  267. WebBandit Web Spider
  268. WebCatcher
  269. WebCopy
  270. webfetcher
  271. The Webfoot Robot
  272. Webinator
  273. weblayers
  274. WebLinker
  275. WebMirror
  276. The Web Moose
  277. WebQuest
  278. Digimarc MarcSpider
  279. WebReaper
  280. webs
  281. Websnarf
  282. WebSpider
  283. WebVac
  284. webwalk
  285. WebWalker
  286. WebWatch
  287. Wget
  288. whatUseek Winona
  289. WhoWhere Robot
  290. Weblog Monitor
  291. w3mir
  292. WebStolperer
  293. The Web Wombat
  294. The World Wide Web Worm
  295. WWWC Ver 0.2.5
  296. WebZinger
  297. XGET
  298. Nederland.zoek

    Nine things you can do to make your web site better

    1 - Conceive, design and organize your site to be exactly what it is: a web site.
    The Web is not print. While this may seem like an overly obvious statement, designers, programmers and users trip up on this very issue every day. It's a common misconception, which branches off to notions like "a site author can control how a site looks to the pixel" and "a well-written web page will look exactly the same on all browsers." Let go of these ideas from the very start. Accessing a web site is a client-server interaction that varies with several factors, not the least of which are connection speed and client hardware, software and configuration.

    So, as you make decisions about your site, carefully consider and exploit the medium. Make no assumptions about the user, because a dizzying array of configurable clients can access your site. Not everyone has Javascript and cookies enabled, or is sitting in front of that great 22-inch monitor that you are, or is using IE on Windows. Accept that your "pages" will not look exactly the same to everyone. Remember that search engines will run indexing software on your site and this software only understands text (for now). Don't focus on visual design; concentrate on making your site as usable as possible. Users should be able to access your site quickly and intuitively and even bookmark (a misnomer that points back to the Web=print misconception) specific documents on it; so go nuts and facilitate this.

    The rest of these recommendations aim to help achieve this end.

    2 - Validate your markup.
    There are rules which specify how to create documents renderable by web clients. Follow them. Markup is either correctly written or it's not. If you're using pure CSS and XHTML or just plain ol' HTML, make sure your markup is correctly written by using a validator: software which checks markup for mistakes. Several free online validators work just fine. If your markup is valid it has the best chance of rendering on the widest array of clients. If you have invalid markup, don't assume that just because your browser is forgiving that everyone else's is.

    3 - Avoid frames and splash pages.
    Frames on a web site are not ideal for lots of reasons. Frames prevent the user from being able to bookmark individual documents on a site. They present related information in separate documents, which keeps search engines from associating related information. They require that a browser make more than one document request per document, which increases client-server connections and eats server CPU cycles, network bandwidth and users' time. Frames are also, coincidentally, being deprecated.

    When I say "splash page", I am referring to a welcome page with one link on it to "enter" the site. Splash pages are unnecessary and meaningless. The first time a user goes to your site, it might seem like a nice effect. But every time after that, a splash page just gets in the way. For a search engine it makes the bulk of your site another needless step into the hierarchy. Don't make users, robots and your server work harder than necessary to deliver the content on your site.

    4 - Optimize your site to be as small a download as possible.
    Making a user wait for your site to download is the best way to get him or her to go elsewhere. While creating your site, remember that more than half the web surfers in the US in February of 2003 used a 56k or less dial-up connection. Entire books and web sites are dedicated to the subject of how to optimize a site, so I won't even attempt to cover the subject here.
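
To put a number on the wait, here is a back-of-the-envelope Python sketch. It computes the best-case transfer time from page size and line speed; real dial-up connections are slower than their rated speed, so treat the result as a lower bound. The 100 KB page weight is just an example figure.

```python
def download_seconds(page_bytes, link_kbps=56.0):
    """Best-case transfer time: bits to send divided by bits per second."""
    bits = page_bytes * 8
    return bits / (link_kbps * 1000)


# A hypothetical page weighing 100 KB in total (markup, images, scripts).
page_bytes = 100 * 1024
print("%.1f seconds at 56k" % download_seconds(page_bytes))
print("%.1f seconds at 28.8k" % download_seconds(page_bytes, link_kbps=28.8))
```

Even under ideal conditions, that example page takes well over ten seconds on a 56k modem, which is why every kilobyte counts.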

    5 - Make your site URLs as short, descriptive, static, technology-independent and permanent as possible.
    Remember that your site's navigation URLs can be totally independent of the physical file system on your server. What I mean is, if you have a /about_this_site/index.html file on your server, the URL to the about section does not have to be (and should not be) /about_this_site/index.html.

    Decide on your site URL structure before you begin creating the documents which will present the information. Make the URLs as short and descriptive of the content as possible, and avoid any indicators of the technology behind them. Avoid file extensions (like .php, .htm, .html, .asp) and don't expose query string parameters. Google specifically recommends using "static" (querystring-less) links to every document on your site. For example, if you have a section which describes the staff of a company, don't use /staff.html to point to the staff page. Use /staff/ instead. Then use /staff/joesmith/ for Joe Smith's page, instead of /staff.asp?firstname=joe&lastname=smith.

    Once you've determined the URLs for your site, use server-side technology to make them work.
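
One way the "server-side technology" could look is sketched below in Python: a small table that maps clean public URLs to handler functions, so the address never reveals file names or query strings. The handler names and what they return are invented for the example; most web frameworks and server rewrite rules offer the same idea in their own form.

```python
import re


# Hypothetical handlers; on a real site these would render full pages.
def staff_index():
    return "Staff directory"


def staff_member(name):
    return "Profile page for " + name


# Public URL patterns, independent of file names and query strings.
ROUTES = [
    (re.compile(r"^/staff/$"), staff_index),
    (re.compile(r"^/staff/(?P<name>[a-z0-9]+)/$"), staff_member),
]


def dispatch(path):
    """Return the page for a clean URL, or None if nothing matches."""
    for pattern, handler in ROUTES:
        match = pattern.match(path)
        if match:
            return handler(**match.groupdict())
    return None


print(dispatch("/staff/"))
print(dispatch("/staff/joesmith/"))
```

The technology behind a page can then change without the public URL changing with it.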

    Finally, once you create a URL which points to a section on your site, stick to it. If you follow these suggestions from the start and then re-organize your site, your URLs don't have to change. However, if you absolutely must change a URL, make sure the original URL redirects or points to the new section, so that cached search engine referrals and bookmarks still work.
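
If a URL really must change, the redirect can be as simple as a lookup table. The Python sketch below assumes a made-up list of retired addresses and returns the HTTP status a server would send: 301 (moved permanently) for an old URL, 200 otherwise.

```python
# Hypothetical map of retired URLs to their permanent replacements.
MOVED = {
    "/staff.html": "/staff/",
    "/about_this_site/index.html": "/about/",
}


def resolve(url):
    """Return (status, location): 301 for a moved URL, otherwise 200."""
    if url in MOVED:
        return 301, MOVED[url]
    return 200, url


print(resolve("/staff.html"))
print(resolve("/staff/"))
```

A permanent redirect tells both browsers and search engine robots to use the new address from now on, so old bookmarks and search results keep working.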

    6 - Make the information on your site textual, and offer non-Javascript-dependent navigation.
    Images on a web document, while meaningful to human eyes, are actually just a collection of 1's and 0's to search engine indexing software and non-graphical browsers. Make sure all of the information on your site exists in a text format. For example, if your site has a masthead which is an image that contains the title of your site in it, make sure you set the alt attribute to describe the content of the image. You should even ensure that the most relevant information on a page appears first in your markup, and make other elements (navigation, etc) follow.

    Short of installing a text browser like Lynx, a good test to see what your site looks like to an indexing robot or a non-graphical browser is to turn off images in your browser. If you're using Internet Explorer, to do this, in the Tools menu choose Options, and on the Advanced tab go to Multimedia, and uncheck "Show pictures." In Mozilla, go to Tools, Image Manager, and choose "Block Images from this Site." Then view your site, and make sure that without images, all information is adequately represented. This same concept applies to all objects (like Flash movies and Java applets.)

    Additionally, remember that search engine robots do not execute Javascript. If any navigation elements on your site use Javascript, set the onclick attribute of the a element to the Javascript call, and the href attribute to the destination of the link. This way Javascript-enabled browsers will execute the script, and the link will still be usable to non-Javascript-enabled clients.
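
To see roughly what a script-less robot gets out of a page, the Python sketch below collects only the links that have a real href. The sample markup and the showSection function named in it are made up; the point is that only the first link, which follows the advice above, gives the robot something to follow.

```python
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Collects link targets the way a client without Javascript might."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            # A link that exists only as a script call cannot be followed.
            if href and not href.lower().startswith("javascript:"):
                self.links.append(href)


def crawlable_links(markup):
    parser = LinkCollector()
    parser.feed(markup)
    return parser.links


PAGE = """
<a href="/staff/" onclick="showSection('staff'); return false;">Staff</a>
<a href="javascript:showSection('about')">About</a>
<a onclick="showSection('contact')">Contact</a>
"""

print(crawlable_links(PAGE))
```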

    7 - Actively direct search engine indexing robots.
    Search engine robots want instructions on how to correctly index your site, so give 'em to 'em. Read up on search engine guidelines and features (like caching site text and image search). Determine how and what areas of your site should be indexed. The use of meta tags and the robots.txt file are the most common methods of directing robots to your content. Use a robots.txt validator to ensure your robots.txt file is correct.

    For example, Google has been Scribbling.net's biggest referrer since day one, but I noticed that often users from Google would land on pages that weren't the most relevant to their search terms. So I checked out how the Googlebot indexes sites. I want robots to index only the permanent locations of posts, but not the front page (as it constantly changes to show the latest post). I don't want any of the images or text cached and presented out of context. I also have a page or two that I don't want anyone to find via a search at all. My robots.txt file lays out some of these instructions. Additionally, the robots meta tag on my front page says "noindex,follow,noarchive", which effectively tells robots to follow links but not to index or archive the front page. The same tag on any post page says "index,follow,noarchive", which tells robots to index the content on that page but not to archive it.

    This way, if the day the Googlebot indexes my site is the day I have a post on the front page about a dog, with a link to the dog post's permanent URL, the Googlebot will index only the permanent location of the dog post. Four days later, when my front page has a post on it about a cat, and someone searches for site:scribbling.net dog, the only pages returned should be the dog post (and any associated documents) and not the front page.
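
The setup described above can be tried out with Python's standard urllib.robotparser module. The rules below are only a guess at what such a robots.txt might contain (the actual file is not reproduced here), and robots_meta is a hypothetical helper that builds the meta tag values quoted in the text.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical rules; the author's actual robots.txt is not shown in the post.
ROBOTS_TXT = """\
User-agent: *
Disallow: /images/
Disallow: /private/
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())


def allowed(path, agent="Googlebot"):
    """True if the rules above let this robot fetch the path."""
    return parser.can_fetch(agent, path)


def robots_meta(index=True, follow=True, archive=False):
    """Build a robots meta tag such as "noindex,follow,noarchive"."""
    parts = [
        "index" if index else "noindex",
        "follow" if follow else "nofollow",
    ]
    if not archive:
        parts.append("noarchive")
    return '<meta name="robots" content="%s">' % ",".join(parts)


print(allowed("/2006/02/dog/"))
print(allowed("/private/notes.html"))
print(robots_meta(index=False))  # front page
print(robots_meta())             # permanent post page
```

The robots.txt file keeps robots out of whole areas, while the meta tag fine-tunes what happens on each page they are allowed to visit.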

    8 - Serve "friendly" error messages.
    The most unhelpful, dead-end message you can get from a web server is:

    404 Not Found
    The web server cannot find the file or script you asked for. Please check the URL to ensure that the path is correct. Please contact the server's administrator if this problem persists.

    A usable web site does a lot better than that. Hook up friendly error messages which include navigation to documents that do exist or don't throw an error, a search box and/or a contact email address. Get creative.
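
As a starting point, here is a Python sketch that builds the body of a friendlier error page with navigation links, a search box and a contact address. The /search/ address, the link list and the email address are placeholders, and the page should still be sent with a 404 status so robots know the document is missing.

```python
from html import escape


def not_found_page(requested_path, nav_links, contact_email):
    """Build the body of a 404 response that still helps the visitor."""
    items = "\n".join(
        '<li><a href="%s">%s</a></li>' % (escape(url, quote=True), escape(label))
        for label, url in nav_links
    )
    mail = escape(contact_email, quote=True)
    return (
        "<h1>Sorry, we could not find %s</h1>\n" % escape(requested_path)
        + "<p>These pages do exist:</p>\n<ul>\n%s\n</ul>\n" % items
        # /search/ is a placeholder for the site's own search address.
        + '<form action="/search/"><input name="q"> '
        + '<input type="submit" value="Search"></form>\n'
        + '<p>Still stuck? Write to <a href="mailto:%s">%s</a>.</p>' % (mail, mail)
    )


print(not_found_page(
    "/staf/",
    [("Home", "/"), ("Staff", "/staff/")],
    "webmaster@example.com",
))
```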

    9 - Don't "click here."
    I think I've already said enough about this.

    Happy Blogging

    Meta Tags Keywords in Different Languages

    A common use for META is to specify keywords that a search engine may use to improve the quality of search results. When several META elements provide language-dependent information about a document, search engines may filter on the lang attribute to display search results using the language preferences of the user. For example:

    <!-- For speakers of US English -->
    <meta name="keywords" lang="en-us" content="vacation, Greece, sunshine">

    <!-- For speakers of British English -->
    <meta name="keywords" lang="en" content="holiday, Greece, sunshine">

    <!-- For speakers of French -->
    <meta name="keywords" lang="fr" content="vacances, Greece, soleil">

    <!-- For speakers of Portuguese -->
    <meta name="keywords" lang="pt" content="ferias, Grecia, sol">

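How a search engine might "filter on the lang attribute" is sketched below in Python. This is only an illustration of the idea, not how any particular engine works: it reads the keywords meta tags from a page and picks the list that best matches a user's language preferences, falling back from a regional code such as "en-gb" to plain "en".

```python
from html.parser import HTMLParser


class KeywordMeta(HTMLParser):
    """Collects keywords meta tags, grouped by their lang attribute."""

    def __init__(self):
        super().__init__()
        self.by_lang = {}

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "meta" and (attrs.get("name") or "").lower() == "keywords":
            lang = (attrs.get("lang") or "").lower()
            content = attrs.get("content") or ""
            self.by_lang[lang] = [word.strip() for word in content.split(",")]


def keywords_for(markup, preferred):
    """Return the keyword list for the first preferred language found."""
    parser = KeywordMeta()
    parser.feed(markup)
    for lang in preferred:
        lang = lang.lower()
        base = lang.split("-")[0]  # "en-gb" falls back to "en"
        for candidate in (lang, base):
            if candidate in parser.by_lang:
                return parser.by_lang[candidate]
    return []


HEAD = """
<meta name="keywords" lang="en-us" content="vacation, Greece, sunshine">
<meta name="keywords" lang="en" content="holiday, Greece, sunshine">
<meta name="keywords" lang="pt" content="ferias, Grecia, sol">
"""

print(keywords_for(HEAD, ["pt", "en"]))
print(keywords_for(HEAD, ["en-gb"]))
```
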
    Happy Blogging

    Wednesday, February 22, 2006

    Blogging

    A blog is a personal journal that is kept on a website for everyone to read. Traditionally a blog contains personal writings and links to other websites or blogs. I am no expert on blogs, but I am interested in them.
    Blogging is what you do to create a blog. A blogger is what you are when you have a blog. The term blog comes from “web log”.