Google’s AutoLink Does Evil

You may have read about the Google Toolbar’s new AutoLink feature, and you may be wondering what the big deal is.

I was wondering too. At first glance, they are just adding links to maps. That may annoy Yahoo a little, because Yahoo wants you to use its maps, but it is not the end of the world.

But the real problem comes with books. For instance, there are authors out on the net promoting their books via their own web pages, and they make modest amounts of money each year doing this. But if you install the Google Toolbar, Google replaces these links with links of its own! If the end user clicks those links, the commission for selling the book bypasses the author and goes to Google instead! Whoa! That is really unfair, unjust, and just not right.

Rogers Cadenhead has written a better article than this one about why it’s just not right for Google to do this.

Here are some quick screenshots (see the area circled in red for Google’s links):

[Screenshots: before Google, after Google]

Anyway, the really funny part of all this is that Microsoft tried this a few years ago with a product called SmartTags. Opposition to SmartTags was so strong that, after being blasted for a while, Microsoft pulled the feature. Lots of people are noticing this now, including Dan Gillmor, Robert Scoble, and mikel.

To Google – if you are listening – I love your company and products, and you are doing a lot of good stuff. But this feature has to be removed. If it is not, that will be my impetus to really stop using your products. (Boy, will I miss AdWords!)

Firefox security vulnerability

There’s a phishing technique in play now where crooks register international domain names containing special characters that *look* like ordinary letters but are really something else.

It’s popular for everyone to complain about Microsoft security problems. Interestingly, this bug is present in Firefox and NOT in Internet Explorer. Firefox fans would probably say, “It’s just because Microsoft doesn’t support the latest standards for international domain names.” Perhaps that is true, but at the end of the day, this is a bug in Firefox, and not in IE.

The problem lies with Unicode. In the early 90s, you couldn’t use two character sets at the same time. If you were working in both Japanese and Chinese, you had to pick one character set at a time, which made it very tricky to use applications in multiple languages. Unicode was invented to solve this problem. It defines basically all characters, in all languages, and has worked very well. It also, however, makes this particular bug possible. For instance, the letter “a” has a Unicode value (U+0061). However, it turns out that Unicode character &#1072; (U+0430, the Cyrillic small letter a) also LOOKS like “a”, but is really a different character.
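To make the lookalike problem concrete, here is a minimal Python sketch. The domain string is my own illustration (I am assuming one “a” gets swapped for the Cyrillic character); it only shows that the two strings differ and what the browser actually sends to DNS.

```python
import unicodedata

latin_a = "a"          # U+0061
cyrillic_a = "\u0430"  # U+0430, the character written as &#1072; in HTML


def describe(ch: str) -> str:
    """Return the code point and official Unicode name of one character."""
    return f"U+{ord(ch):04X} {unicodedata.name(ch)}"


real = "paypal.com"
# Illustrative spoof: the first "a" is replaced by the Cyrillic lookalike.
fake = "p" + cyrillic_a + "ypal.com"

print(describe(latin_a))
print(describe(cyrillic_a))
print(real == fake)         # False: the strings differ even though they look alike
print(fake.encode("idna"))  # the ASCII "xn--" form that is looked up in DNS
```

The last line uses Python’s built-in `idna` codec, which turns each non-ASCII label into its ASCII “xn--” form. That form is what a registrar or a browser could display to expose the trick.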

This allows the bad guys to create URLs like this one:

Paypal.com

To the user, this looks like a real link to PayPal. But it’s NOT. It’s a fake. Go ahead and test your browser. If you are using IE, you’ll see a page-not-found error. If you see a page, then your browser is vulnerable.

I hope that the domain registrars jump in to help fix this. It seems like we should be able to spot bogus domain name registrations pretty easily. There are probably a lot of possible forged combinations, but they should be detectable, and they could be prohibited at the root.
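As a rough idea of what such detection might look like, here is a sketch that flags any domain label mixing letters from more than one script. It is only a heuristic of my own: the function names are made up, and the script is guessed from the first word of each character’s Unicode name, since Python’s standard library has no direct script lookup.

```python
import unicodedata


def scripts_in_label(label: str) -> set:
    """Approximate the scripts used in one label (heuristic only).

    The first word of a character's Unicode name is used as its script,
    e.g. LATIN or CYRILLIC. Digits and hyphens are ignored.
    """
    scripts = set()
    for ch in label:
        if ch.isalpha():
            scripts.add(unicodedata.name(ch, "UNKNOWN").split()[0])
    return scripts


def looks_spoofed(domain: str) -> bool:
    """Flag a domain if any dot-separated label mixes more than one script."""
    return any(len(scripts_in_label(label)) > 1 for label in domain.split("."))


print(looks_spoofed("paypal.com"))        # all Latin letters
print(looks_spoofed("p\u0430ypal.com"))   # Latin letters plus one Cyrillic letter
```

A check like this would not catch a name written entirely in lookalike characters from a single script, so a real registrar policy would need more than this.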

Or maybe you should be using mod_deflate

OK – I know I just posted a couple of days ago about the wonders of mod_gzip. I had picked mod_gzip because I had read about it on the Yahoo blog, and also because it was easy to get going on my Apache 1.3 server. But I’ve just upgraded to Apache 2.0, and after reading around, mod_deflate is really the compressor of choice there. It’s a little more up to date, and there are some articles from folks who really think it’s better. From my perspective, though, it seems about the same as mod_gzip. Whichever one you choose, your website should be running a lot smoother!


I figure I should color this entire blog green to indicate that it’s now environmentally friendly…

You really should be using mod_gzip

What is mod_gzip? It’s an extension to the Apache Web Server that implements compression on the server side. With it installed, every page that leaves the web server gets compressed. For text pages, the size reduction is usually about 70%, so a 10KB page shrinks to about 3KB. That’s a pretty good savings!
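If you want to get a feel for the savings without touching a server, a few lines of Python will do it. This is just a sketch using Python’s own gzip module on a made-up sample page, not anything mod_gzip itself runs. Repetitive markup like this compresses better than a typical page, so don’t expect the exact percentage to carry over.

```python
import gzip

# Hypothetical sample page: repetitive markup, as real HTML tends to be.
rows = "".join(
    f"<tr><td>Item {i}</td><td>Description of item {i}</td></tr>\n"
    for i in range(200)
)
page = f"<html><body><table>\n{rows}</table></body></html>".encode("utf-8")

compressed = gzip.compress(page)
savings = 1 - len(compressed) / len(page)

print(f"{len(page)} bytes -> {len(compressed)} bytes ({savings:.0%} smaller)")
```

Try it with a saved copy of one of your own pages to see what your site would actually gain.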

It’s astonishing to me that most web servers don’t implement gzip across the board for all pages being served. It’s been about 10 years since I worked on performance optimizations for the Netscape Enterprise Server, and even back then we recognized the value of enabling gzip. But somehow, along the way, it was never valued very highly. With broadband adoption rates going through the roof, it just never seemed to matter that much. But it really does. We should all be using it.

The good news, however, is that nearly all browsers now support compressed pages. IE 5 & 6, Firefox, and Mozilla 5+ all handle compressed pages pretty well. And for browsers that don’t support compression, well, the server can simply send the page uncompressed.
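The way the server knows whether it may compress is the Accept-Encoding header that the browser sends with each request. Here is a simplified sketch of that decision in Python. The function names are mine, it is not how mod_gzip is written, and it ignores the q-values that a real server has to honor.

```python
import gzip


def accepts_gzip(accept_encoding: str) -> bool:
    """True if the Accept-Encoding header value lists gzip (q-values ignored)."""
    codings = [part.split(";")[0].strip().lower() for part in accept_encoding.split(",")]
    return "gzip" in codings


def respond(body: bytes, accept_encoding: str = ""):
    """Return (body, headers), compressing only when the client allows it."""
    headers = {"Vary": "Accept-Encoding"}
    if accepts_gzip(accept_encoding):
        body = gzip.compress(body)
        headers["Content-Encoding"] = "gzip"
    headers["Content-Length"] = str(len(body))
    return body, headers


page = b"<p>hello world</p>" * 200

# A browser that advertises gzip gets the compressed page...
small, small_headers = respond(page, "gzip, deflate")
# ...and one that sends no Accept-Encoding header gets it untouched.
plain, plain_headers = respond(page)

print(small_headers)
print(plain_headers)
```

The Vary header tells caches that the response depends on Accept-Encoding, so a proxy does not hand a compressed copy to a browser that cannot read it.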

I just installed the gzip module on my server here. Installation took about an hour. That included downloading the source, compiling it, installing it into my Apache server, and setting up the config. Most of that time went to learning the various configuration options that mod_gzip offers. They do a good job of explaining the basics in their documentation, and I’ve posted a copy of those docs here.

So far, it seems to work great. I haven’t seen any problems yet, although let me know if this site is not working for you!

Lastly, here are some real numbers showing why this is so important. Everyone wins when we use compression – end users get faster responses, and overall internet bandwidth usage goes down too.

Most pages on my site are around 10KB. Some are 20KB, and a few are as high as 35KB. Here is a chart showing the speed improvements for various connection types:

                   No Compression                         With Compression (mod_gzip)
Page size    56Kbps modem    Typical DSL (~384Kbps)    56Kbps modem    Typical DSL (~384Kbps)
10KB         1.43s           0.21s                     0.48s           0.07s
20KB         2.86s           0.42s                     0.95s           0.14s
35KB         5.00s           0.73s                     1.67s           0.24s
100KB        14.29s          2.08s                     4.76s           0.69s
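For anyone who wants to check the arithmetic, the numbers in the chart come out if you take the transfer time to be the page size in kilobits divided by the line speed, and assume a compressed page is one third of its original size. That one-third figure is my inference from the numbers, so treat it as an assumption. A small Python sketch:

```python
def transfer_seconds(size_kb: float, link_kbps: float) -> float:
    """Seconds to send size_kb kilobytes over a link_kbps kilobit-per-second line."""
    return size_kb * 8 / link_kbps


# Assumption inferred from the chart: compressed pages are 1/3 of the original size.
COMPRESSED_FRACTION = 1 / 3

MODEM_KBPS = 56
DSL_KBPS = 384

for size_kb in (10, 20, 35, 100):
    row = [
        transfer_seconds(size_kb, MODEM_KBPS),
        transfer_seconds(size_kb, DSL_KBPS),
        transfer_seconds(size_kb * COMPRESSED_FRACTION, MODEM_KBPS),
        transfer_seconds(size_kb * COMPRESSED_FRACTION, DSL_KBPS),
    ]
    print(f"{size_kb:>4}KB  " + "  ".join(f"{t:6.2f}s" for t in row))
```

This ignores latency and connection setup, so real page loads will be slower than these figures on both sides of the comparison.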

A9’s Video Yellow Pages

A9 launched a new service today that they simply call Yellow Pages. But these Yellow Pages are not your ordinary business directory lookups! They actually have photos of the storefronts of the businesses, and you can “line up” these photos into a virtual street walk! It’s really neat!

The service itself has yet to prove real value to me, but I like the idea so much that I promise to keep giving it a real shot. And I really enjoyed reading about how they built this service. They actually put together some GPS-ready cars with rooftop-mounted cameras and pounded the streets manually. That’s fantastic!

In case you haven’t heard of A9, they are a technology-focused spinoff created and owned by Amazon. In case you haven’t heard of Amazon, um, well, how the heck did you find my blog?

Google/Browser Wars again

Browser wars are coming back. Google still denies that they are going to release a browser, but they are definitely creating one. There are two clear signs:
1) They hired the lead developer from Firefox.
2) When you install Firefox, the home page is set to this Google web page. As you can see, the page is hosted at Google, and yet branded specifically for Firefox and Google.

On the whole, I believe competition is a great thing. So I’m really glad to see another browser in the market.

At the same time, however, my selfish developer side is lamenting the end of the days of only one browser. This will unfortunately trickle down to end users with watered-down web applications again.

Web applications will suffer because web developers will now have yet one more browser to be compatible with. Remember back in 1997, when there were literally about 6 different browsers that every website had to support? (3 versions of Netscape, 3 versions of IE.) Writing websites was a horrible experience. Each site ended up picking the minimal set of browser features just so that the application would work everywhere.

Well, every browser has bugs, even Firefox. As I play with it, I see them all the time (especially around JavaScript event handling and such). As the market splits, web application developers will increasingly run into potholes where code *should* work in both places, but only works in one.

And of course, the browser vendors are trying to lock users into their own applications too. They do this by providing tools and features which they know won’t work well on the other browser. For example, ASP.NET has some code which works great on IE, but deliberately doesn’t try to render on non-IE browsers (such as the JavaScript form-validation code). This makes it hard for developers to use those features without consciously dropping support for the other browsers. Google will do this type of thing too.

For end users, it may not be too bad this time – if there are just two browsers. You’ll need both – some applications will only work in IE, some only in Firefox. Many applications already deployed today really only support IE (it’s fair, since they were deployed before Firefox was born), and it’s unlikely that the sites which created them will switch to supporting Firefox for quite some time. Some of them are very popular web applications, like OWA (Outlook Web Access).

But we know that Microsoft will eventually have a new version out too. In a year or so, I wouldn’t be surprised if websites had to navigate through the bugs of Firefox 1.0, Firefox 2.0, IE 6.0, and IE 6.5!! (Don’t forget Safari!)

Anyway, don’t get me wrong; I very much think having more browser choice is a great thing! It’s just a bit sad to have development get splintered again.

American Chopper and Software Engineering

Like many people, over the last 6 months I’ve become a huge fan of American Chopper, a TV series on the Discovery Channel about Paul & Paul Teutul’s Orange County Choppers. If you haven’t seen this show, it chronicles some of the goings-on at a father-son custom motorcycle building shop. They build some very cool bikes with a lot of style, and along the way we viewers get to watch them clash and get angry with each other due to their radically different work styles.

If you know me, you probably would think this show is pretty out of character for me. I feel that way a bit myself! After all, I don’t ride bikes and I’ve never really had much interest in them. I’m just a geek that builds software for a living. But I’ve gotta tell you, this show is great. And more surprising to me is that I’ve recently found myself at work citing American Chopper as an example of how to build software! Am I mad? (Probably.) I’m sure Paul Sr would be a bit surprised to hear that there is much in common between building software and building bikes!

It’s all in the dynamics of the show. Paul Sr is always Mr. Responsible. He’s furious when the shop is not clean and organized, and even more furious when bikes aren’t getting built on schedule. Paul Jr, on the other hand, is pretty casual about schedules. He’s very creative, and always has yet-one-more-idea for how to improve the bike. But that creativity comes at a cost, and usually the cost is his father’s frustration that deadlines won’t get met. Paul Jr says his dad does nothing but gripe and sit back with his “size twelves” up on the desk. Paul Sr says his son, whom he affectionately calls “numb-nuts”, wouldn’t ever get a bike finished if it weren’t for Sr’s constant monitoring.

Well, if you can envision it, it’s a lot like software. Depending on who you ask, building software is considered part art, part science. I think those that went to “engineering” school really want it to be a science. (I’m one of those!) But in fact, it’s not really. For any given problem, there are an infinite number of ways to write code which will solve the problem. A scientific problem always has the same right answer. But there is definitely no “right” answer in software. Likewise, building a bike is part science – a lot of physics and mechanics go into engineering a bike that operates smoothly and safely. But the custom bikes have as much art to them as science, and there is no right answer there either. Because of this creative element, schedules for these types of tasks are hard to make. If it were a science (like building a car on an assembly line), you would know exactly how long it takes to build. But when you are building something artistic, something never built before, how long will it take?

This is the core give-and-take of software engineering.

So, in the course of a 60-minute American Chopper show (actually they usually build the bikes over the course of two episodes), you can watch the complete dynamics of the software lifecycle! It starts with the problem to solve – building a Spider-Man-themed bike, or an I, Robot bike for Will Smith, or a bike dedicated to the firefighters of 9/11. This is the requirements gathering phase. Then it moves into the design phase, where they figure out what to do with the bike. Then the team usually does a bit of bonding in the form of some out-of-work event before moving into the “heads-down” building phase. After building, it goes into final assembly, then a short test cycle, and then it’s done!

But what is remarkable to me is to watch the tradeoffs between Paul Sr and Paul Jr. Sr is all about schedules, business, and getting things done. He’s the one that built the shop from nothing, and really understands the value of being efficient, clean, and professional. Jr is younger and a bit sloppier in his work habits, but relentless in applying detail to making great bikes. He cares about schedules, of course, but if he comes up with a way to make the bike better late in the game, he’ll take the risk to make the change. Fortunately, the two are in constant battle with each other. Ironically, if they agreed on everything, they’d probably make bikes that weren’t half as good as they are!

Software is the same way. You’ve got folks on the team that are trying to keep things moving forward, on schedule, and consistent. There are other folks on the team whose job is to build the product and be creative, making sure that it’s high quality, innovative, and as good as it can be. Sometimes, we don’t realize what the “best” way to implement something is until we’ve already started. Unfortunately, the only way to fix it is to iterate on the design while in the implementation process. If you don’t do it, you won’t have great software. And if you do it, you’ll risk your schedule and possibly miss the date!

Anyway, the overlaps are amazing. If you are planning to be a software manager soon, watch this show. You’ll learn about how to balance creativity vs. business, about team morale, about coaching (see the youngest in their crew – Cody), and about conflict. It’s really cool.

There are some differences, though. In the virtual world of software, we sometimes crash. On the Chopper show, they have less tolerance for that.

Well, thanks for listening to my abstract thoughts on software… Maybe I just like watching TV and calling it work.

Rojo Newsreader

If you read a lot of RSS, you should definitely check out a new service from a company called Rojo. It’s “RSS Mojo”, they say.

They are still in beta, and they took a lesson from Gmail for how to gain users – you need to be part of the “in” crowd and get an invite from someone that is already a user!

But once you are in, there are several cool features. First, it’s a server-side aggregator; but more than that, it allows you to tag and comment on articles which are interesting. You can also create a network of friends and share your content with them, and they with you. Rojo’s engine then works hard to present you with the most relevant articles that others have found useful as well. It’s a cool concept – I look forward to having more buddies using it so we can really try out the network effects of sharing RSS commentary.

Rojo was co-founded by Chris Alden, who also co-founded Red Herring magazine.

MSN Search Products

In case you are wondering what the MSN team (which I am part of) is working on, here is some info.

You may or may not know that we at Microsoft launched our new MSN Search Beta this week. There has been a lot of press coverage on it, of course. Many people still like Google, but the new MSN product looks reasonably compelling too. There is a cool blog from Microsofties which you can read for the latest info on the web search.

So what’s so great about the beta search?

What’s not as good?

  • Still can’t find decent Microsoft APIs. OpenMsgStore
  • Every so often reports “temporarily unavailable”