Recently in Search Engines Category

It's been a while since I've heard of new meta tags being introduced (or maybe I just wasn't aware of any).

Google has just introduced a new tag:

<meta name="google" value="notranslate">

You can use it to mark pages on your site that you DO NOT want the Google translator tools to translate.
They've also introduced a method of marking text within a page as not to be translated:

class="notranslate"

You can read more about this on the official blog.
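
If you want to check which of your pages actually carry these markers, here is a rough sketch using Python's standard html.parser. The NoTranslateScanner class and the sample page are my own illustration, not anything Google provides. It also accepts content="notranslate" on the meta tag, which is an assumption on my part about the other spelling you may come across.

```python
from html.parser import HTMLParser


class NoTranslateScanner(HTMLParser):
    """Hypothetical helper: records translation opt-out markers in a page."""

    def __init__(self):
        super().__init__()
        self.page_opt_out = False   # True if the page-level meta tag is present
        self.marked_elements = 0    # number of elements with class "notranslate"

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "meta" and (attrs.get("name") or "").lower() == "google":
            # "value" is the attribute shown in the post; "content" is an assumption.
            value = attrs.get("value") or attrs.get("content") or ""
            if value.lower() == "notranslate":
                self.page_opt_out = True
        if "notranslate" in (attrs.get("class") or "").split():
            self.marked_elements += 1


# Made-up sample page for illustration.
SAMPLE = (
    '<html><head><meta name="google" value="notranslate"></head>'
    '<body><p>Contact <span class="notranslate">Example Ltd</span></p></body></html>'
)

scanner = NoTranslateScanner()
scanner.feed(SAMPLE)
print(scanner.page_opt_out, scanner.marked_elements)
```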

Dealing With Bad Bots

Most search engines' spiders obey the robots.txt commands.

Basically you can instruct a search engine to not index certain parts of your site or disallow some spiders from accessing your site entirely.
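
To give an idea of what a well-behaved spider does with those instructions, here is a small sketch using Python's standard urllib.robotparser. The robots.txt content and the bot names ("BadBot", "SomeBot") are made up for illustration.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: one spider is banned outright,
# everyone else is only kept out of /private/.
ROBOTS_TXT = """\
User-agent: BadBot
Disallow: /

User-agent: *
Disallow: /private/
"""


def build_parser(robots_txt):
    """Parse robots.txt text the way a polite crawler would."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


parser = build_parser(ROBOTS_TXT)

for agent, url in [
    ("BadBot", "http://www.domain.tld/index.html"),
    ("SomeBot", "http://www.domain.tld/index.html"),
    ("SomeBot", "http://www.domain.tld/private/page.html"),
]:
    print(agent, url, parser.can_fetch(agent, url))
```

Of course this only describes what a polite spider does. Nothing in robots.txt can force a crawler to run this check.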

Unfortunately, some search engine spiders are either badly written or intentionally evil and totally ignore any commands you might try to pass them via robots.txt.

One such robot is Voila.

Voila identifies itself with the UserAgent string:

VoilaBot BETA 1.2

Depending on the type of site you have you're probably best advised to block it entirely.

If you have access to iptables then you can simply issue a series of commands similar to this one:

iptables -I INPUT -s 81.52.143.15 -j DROP

I'm trying to get a full list of the IP ranges used by Voila, but so far I've found two which you could block. They are:

193.252.148.0/23
81.52.142.0/23
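
If you'd rather generate the rules than type them by hand, here is a rough sketch using Python's standard ipaddress module. It only uses the two ranges listed above (which, as I said, may not be the full list), and the helper names are my own. It prints the iptables commands instead of running them, so you can review them first.

```python
import ipaddress

# The two ranges from this post; probably not the complete list.
VOILA_RANGES = ["193.252.148.0/23", "81.52.142.0/23"]


def is_blocked(ip, ranges=VOILA_RANGES):
    """Return True if the address falls inside any of the listed ranges."""
    addr = ipaddress.ip_address(ip)
    return any(addr in ipaddress.ip_network(r) for r in ranges)


def iptables_rules(ranges=VOILA_RANGES):
    """Build one DROP rule per range; ip_network() rejects malformed CIDR blocks."""
    return [f"iptables -I INPUT -s {ipaddress.ip_network(r)} -j DROP" for r in ranges]


for rule in iptables_rules():
    print(rule)

# The single address used in the command above sits inside the second range.
print(is_blocked("81.52.143.15"))
```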

On one server the VoilaBot had caused the sites to become completely unresponsive with the load average climbing constantly!

Google PageRank Update

It looks like the latest PageRank update is currently underway.

Based on previous updates this could take several days to complete and stabilise.

Some people will probably be very happy, while others won't be...

Such is the way

Verifying GoogleBot

I've seen a few posts from people on forums over the last few weeks complaining that GoogleBot is misbehaving and chewing up silly amounts of bandwidth.

Unfortunately it's all too easy to pretend to be GoogleBot (or any other bot for that matter) simply by spoofing the UserAgent string, i.e. telling the world that you are something that you aren't.

So how can you tell if a bot really is from Google?

One of the official Google blogs has the answer - DNS lookups.

Basically you need to check that the forward and reverse DNS entries are valid, i.e. that the PTR record for the IP address points to a Google hostname and that the A record for that hostname points back to the same IP address.

That's a bit of a pain, but will save you headaches if you're seeing strange activity in your logs...
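
Here is a rough sketch of that check in Python using the standard socket module. The function name and the pluggable lookup parameters are my own (they make it possible to try the logic without touching real DNS). I'm assuming genuine crawler hostnames end in googlebot.com or google.com, so adjust the suffixes if that turns out to be too narrow.

```python
import socket

# Assumption: genuine GoogleBot hostnames end in one of these suffixes.
GOOGLE_SUFFIXES = (".googlebot.com", ".google.com")


def is_real_googlebot(ip, reverse_lookup=None, forward_lookup=None):
    """Reverse-resolve the IP (PTR), check the domain, then forward-resolve (A)."""
    reverse_lookup = reverse_lookup or (lambda addr: socket.gethostbyaddr(addr)[0])
    forward_lookup = forward_lookup or (lambda host: socket.gethostbyname_ex(host)[2])
    try:
        host = reverse_lookup(ip)
        if not host.lower().rstrip(".").endswith(GOOGLE_SUFFIXES):
            return False
        # The hostname must resolve back to the address we started with.
        return ip in forward_lookup(host)
    except OSError:
        # Covers socket.herror and socket.gaierror: no PTR record or no A record.
        return False


# Made-up DNS data so the logic can be tried offline.
FAKE_PTR = {
    "66.249.66.1": "crawl-66-249-66-1.googlebot.com",
    "203.0.113.9": "host.example.net",
    "203.0.113.10": "crawl-66-249-66-1.googlebot.com",
}
FAKE_A = {"crawl-66-249-66-1.googlebot.com": ["66.249.66.1"]}

for address in FAKE_PTR:
    print(address, is_real_googlebot(address, FAKE_PTR.__getitem__, FAKE_A.__getitem__))
```

Called with just an IP address, the function falls back to real DNS lookups. Those can be slow, so you would probably want to cache the results rather than check every request.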

SEO Quake - SEO Browser Plugin

Both Internet Explorer 7 and Firefox support extensions which add functionality to the browser.

Personally I prefer Firefox and since I don't use Windows that much I don't have the option of using IE7 that often.

One of the more interesting extensions that I've come across recently is called SEO Quake.
The plugin is available for both Firefox and Internet Explorer and gives a webmaster, marketer or SEO access to a wealth of information about the sites they are visiting, search engine results and a whole lot more.

For a full range of features and an explanation of how they work have a look at the article on the developer's site.

Losing traffic through laziness or silly mistakes is simply unacceptable.

What am I referring to?

If your site resides at www.domain.tld and you've been marketing it successfully both online and offline, people will forget about the "www" part. It's only natural.

Back in the mid 90s most sites were ONLY available via www.domain.tld, but that's no longer the case.
(There was some odd RFC that a lot of people referred to for this reason.)

So the first thing you should do is check that both www.domain.tld and domain.tld point to your site.

There's no technical reason why your hosting provider can't set that up for you. If they tell you that they can't then you should really look elsewhere.

The one possible problem that you might face is that robots might treat the "www" version of your site and the non-www one as two separate sites. They shouldn't, but it can happen.

The simple solution to this is to force people (and spiders) to use either one or the other using a redirect.
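
As a rough illustration of the redirect logic, here is a small Python sketch. In practice you would normally do this in your web server's configuration. The function name is my own, www.domain.tld is just the placeholder used above, and the choice of the "www" version as the canonical one is arbitrary. Note that this sketch drops any port number from the URL.

```python
from urllib.parse import urlsplit, urlunsplit

CANONICAL_HOST = "www.domain.tld"  # placeholder; pick one version and stick with it


def canonical_redirect(url, canonical_host=CANONICAL_HOST):
    """Return (301, new_url) if the URL uses the other hostname variant, else None."""
    parts = urlsplit(url)
    host = (parts.hostname or "").lower()
    bare = canonical_host[4:] if canonical_host.startswith("www.") else canonical_host
    # Only redirect between the www and non-www variants of our own domain.
    if host == canonical_host or host not in (bare, "www." + bare):
        return None
    target = urlunsplit(
        (parts.scheme, canonical_host, parts.path or "/", parts.query, parts.fragment)
    )
    return 301, target


print(canonical_redirect("http://domain.tld/about?x=1"))
print(canonical_redirect("http://www.domain.tld/about"))
```

A permanent (301) redirect is used so that spiders treat the two hostnames as one site rather than indexing both.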

Richard has a case in point with regard to the NCH in Dublin that could so easily be fixed! Others have fixed theirs already.

Search Terms Reflect User Base?

While a lot of SEO experts tend to diss other search engines, such as MSN and Ask, I've always felt that it was a bad idea to ignore their users. A customer (visitor) is still a customer regardless of how they ended up on your website.

A few months ago I did a bit of not too scientific analysis of browsers etc., that were being used to access a number of sites that I manage. The results reinforced my suspicions.

But what about the search engines? What are people actually looking for? Is there a significant difference between a Google user and a Yahoo! user?

The 2006 Search Wrap-Up would seem to suggest that there are some significant differences. Google users tend to be more technical it seems, while Yahoo! users are into popular culture if the top search terms of 2006 are anything to go by.

From an SEO perspective I guess that means that you still can't ignore Yahoo! et al :)

Ask! Now With An X

Ask.com are working on a new "top secret" interface currently called Ask X. Like so many of these betas it's not a particularly well kept secret :)

Thanks for stumbling upon Ask X, our double-secret sandbox for testing Ask experiences of the future. In today's version of Ask X, you're not just getting back a list of links, but a slick, new three-panel interface (much like the new AskCity), combining great time-saving features like:

* Left: A search control panel that stays with you, complete with Zoom Related Search and Search Suggestions that update as you type.
* Middle: Results front and center to provide clutter-free information without having to scroll down the page, and Binoculars to preview results.
* Right: A preview of other types of search results, including video, news, images, blogs, shopping, encyclopedia and more.

It's not that different to some of the other interfaces its competitors have been playing with, but it's still quite nice.

Google AJAX API

Google has released an "experimental" AJAX API:
The Google AJAX Search API is an experimental API that lets you integrate a dynamic Google search module into your web pages so your users can mash up Google search results with other content on your site or add search results clippings to their own content.
If you could integrate that with AdSense it would be really tasty :)

Will .eu work for SEO?

The launch of the .eu domain for the European Union countries has led to a certain degree of controversy, as many domains may have been squatted. But how will Google and other search engines deal with this new TLD?

Up until now ccTLDs, such as .ie (Ireland), .co.uk (United Kingdom) etc., have been used, in part at least, by Google and others to provide "local" results. Will this lead to a new Google search option? Instead of choosing "pages from Ireland" will we have the option to choose "pages from Europe"?

Maybe we'll see Google.eu launching to deal with this at some point in the future. However, it won't happen immediately, as the current application is still pending with Eurid (like so many others).

Of course a lot of this is quite academic. What will be of real interest is how the .eu will affect a site's PageRank and general search result positioning. Unfortunately it is too early to tell, as the domain has only been "live" for a few weeks.
