Google’s ClickFraud Report

Google published their findings after investigating several allegations of Click Fraud.

The report is good, and Google argues that the ClickFraud detection companies are doing sloppy detection.  Naturally, hearing Google claim that ClickFraud is bogus sounds like they are being defensive.  But if you read the paper, Google clearly did some solid engineering to investigate these claims.

Google found that the ClickFraud-detection companies are making basic errors in their detection of fraud.  According to ArsTechnica, at least one of the ClickFraud claimants (ClickFacts) has agreed that Google did identify real problems in ClickFacts’ detection logic.

Google also researched the results from an AdWatcher report.  In the report, AdWatcher told the advertiser that they had been a victim of ~12,000 fraudulent clicks.  However, during that period, the customer was only billed by Google for ~6,000 clicks.  This is just impossible; obviously the fraud count can’t be larger than the total clicks billed.
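
The sanity check here is simple arithmetic. A minimal sketch in Python, using the approximate figures above (the function name is made up for illustration and does not come from the Google report):

```python
def fraud_report_is_plausible(reported_fraud_clicks: int, billed_clicks: int) -> bool:
    """A fraud count can never exceed the number of clicks actually billed."""
    return 0 <= reported_fraud_clicks <= billed_clicks

# Approximate figures from the AdWatcher example.
reported = 12_000
billed = 6_000

print(fraud_report_is_plausible(reported, billed))
print(f"Reported fraud is {reported / billed:.0%} of billed clicks")
```

Any report where that ratio is above 100% is counting something other than billed clicks.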

If you know even a little about how HTTP, referrers, and web browsers work, you should check out the Appendices in the Google report.  Frankly, it’s shocking that the ClickFraud companies are making such basic errors in their reporting.  Obviously, their business hinges on proving that ClickFraud exists, but they need to do a lot more diligence in their engineering before making their claims.

It’s refreshing to see real engineering research being done on this stuff rather than the marketing claims based on fuzzy data that we usually see. 

Thoughts on Niall leaving Microsoft

I’ve had some time to think about Niall Kennedy’s announcement that he’s leaving Microsoft.  I met him once for probably less than a minute, so I don’t really know him.  I have read his blog a fair bit and have a lot of respect for him.  But….

I’m a little disappointed with Niall too.  He started work at Microsoft only 4 months ago.  Frankly, he should have known that Microsoft could be like this.  To quit after such a short period of time, and then to declare the company to be in “general paralysis” seems unfair to Microsoft, his colleagues, and his readers. He did generate a lot of press for himself though (ZDNet, Information Week, CNet, Seattle Post, SoftPedia).

There is no doubt that Microsoft is undergoing major changes in order to compete going forward.  This type of metamorphosis is one that Microsoft has successfully done before, but it takes a long time.  The change can take a huge toll on employees while the company gets priorities sorted out.  Coming from small companies myself, I definitely sympathize with Niall’s frustration that Microsoft and Windows Live are moving too slowly!  It’s fair, but any seasoned engineer should expect this when they start with Microsoft (and then work like a dog to make it faster/cheaper/better).

Obviously, each of us faces our own career/life decisions and sometimes the best thing to do is to leave.  In making his decision, I’m sure Niall did the right thing.  I’m just a little disappointed at the damning words that he left for Microsoft on his exit.  He’s a smart guy, but when you only had the patience for 4 months of it, you didn’t earn the right to conclude the whole thing is just screwed up.

Anyway – Niall – best of luck to you with your new adventure!  I’m sure you’ll go far.

Default Search Wars and a new OS feature

I read this article today about how Google’s Toolbar takes an aggressive stand against someone changing the “default search” provider in IE. Wow – this is pretty aggressive!

[Image: googblock.jpg]

Now, I’m sure people will start yelling at Microsoft, and maybe that is deserved. For the record though, Microsoft’s software doesn’t do this (yet).

Anyway, this problem of “software wars” is not new. And the OS should protect the poor user from it.

It all starts with a Cold War
During the Cold War, two competing programs will, invariably, put an option into their installers to override the competitor’s settings. Engineers, like me, at each company will complain that this is terrible and should never be done, but we are always overridden by business folks that are smarter.

Once the cold war starts, it begins to escalate. First, one side will make it so that the software *always* overrides the competitor’s settings, without giving the user a choice. Then, the other will respond with something equally clever, like additional popup warnings or something silly like that.

Eventually, the cold war turns into an all-out war, with users as the casualties. In this phase, the product is installing executables on your system which run constantly just to monitor for the competitor’s software and do something different. Sadly, Google’s Toolbar has entered this phase.

How to Fix?
The operating system really ought to protect against this more readily. Why is it that one program can alter another program’s config without the user even knowing about it? It shouldn’t be allowed. So when any competitor changes the default IE settings, the OS ought to tell the user that this has happened, and let the user deal with it. That prevents any surreptitious altering of configurations, and educates the user at the same time. If we had this, the cold war would probably never start, because if either side instigated, the users would be able to knowingly protest immediately.
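
To make the idea concrete, here is a rough sketch in Python of the check I'd want the OS to do: compare a snapshot of the protected settings against the current values and tell the user what changed. Everything here (the setting names, the provider names, the function) is hypothetical; it is not a real Windows API.

```python
def detect_changes(before: dict, after: dict) -> list:
    """Return (setting, old, new) for every protected setting that changed."""
    changes = []
    for key in sorted(set(before) | set(after)):
        old, new = before.get(key), after.get(key)
        if old != new:
            changes.append((key, old, new))
    return changes

# Hypothetical snapshot taken before an installer runs, and the state after.
before = {"default_search": "ProviderA", "home_page": "about:blank"}
after = {"default_search": "ProviderB", "home_page": "about:blank"}

for setting, old, new in detect_changes(before, after):
    # In a real OS this would be a prompt; here we just print the question.
    print(f"'{setting}' was changed from {old} to {new}. Keep this change?")
```

The point is that the user sees the change and decides, rather than two programs silently fighting over the value.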

Your browser’s cache is full and may interfere with your Gmail experience.

A while back, I wrote an article about Gmail’s cache complaint message. In order to believe the Google claim, you’d have to believe that IE’s cache is implemented in such a way that it doesn’t know how to clear space automatically when it gets full. Every cache I know of does that, so it seems pretty hard to believe. I thought it might be voodoo. Since I never heard an answer, I finally stopped being lazy and investigated myself.

Turns out that Google’s right. IE6 really is that lame.

To test this, I first started up Fiddler, a great little utility for tracing HTTP requests. Then, I navigated to my favorite brother-in-law’s website (Don’s Drug), and loaded two pages. Here is the trace:

www.donsdrugs.com
Page1:
/ 200 1027 bytes
/styles-site.css 200 4075 bytes
/c/header-dd.jpg 200 26774 bytes
Page2:
/archives/cat_news.html 200 868 bytes
/styles-site.css 200 4075 bytes
/c/header-dd.jpg 200 26774 bytes

Ack – see that? On the second page, the two static files (styles-site.css and header-dd.jpg) got reloaded with HTTP 200 responses! Those were static and should have been cached, or at the very least we should have seen “304 Not Modified” responses from the server (which save us from having to redownload all the bytes).
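
For anyone who hasn't looked at this part of HTTP: when a browser has a copy of a file in its cache, it can ask the server "has this changed?" by sending a validator such as If-None-Match along with the request. If nothing changed, the server answers 304 with no body. Here is a rough sketch of that decision in Python. The function name and the ETag value are made up for illustration, and the comparison is simplified (real servers also handle lists of ETags, weak validators, and If-Modified-Since).

```python
from typing import Optional, Tuple

def respond(current_etag: str, body: bytes, if_none_match: Optional[str]) -> Tuple[int, bytes]:
    """Return (status, body) for a GET, honoring a simple If-None-Match check."""
    if if_none_match is not None and if_none_match == current_etag:
        # The browser's cached copy is still valid: send no body.
        return 304, b""
    # No cached copy (or it is stale): send the whole file.
    return 200, body

css = b"x" * 4075   # stand-in for styles-site.css (4075 bytes in the trace)
etag = '"abc123"'   # hypothetical validator

print(respond(etag, css, None)[0])   # first visit, nothing cached
print(respond(etag, css, etag)[0])   # revisit with a valid cached copy
```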

Then I cleared the cache and loaded another two pages (my cache was set to the max size of 32GB, so it took about 5 minutes to empty):

Page1:
/archives/cat_drugs.html 200 813 bytes
/styles-site.css 200 4075 bytes
/c/header-dd.jpg 200 26774 bytes
Page2:
/c/about.html 200 546 bytes

Aha! So this time the static content that we had previously fetched (styles-site.css and header-dd.jpg) was fetched from the cache. So, Google’s right. Web browsing will really suck if you leave your IE6 cache full for too long.

What does this mean? Well, it means that if your cache is full, and you are browsing a site like “CNN”, every page has to keep downloading the content which is common to all CNN pages. And, this can be the bulk of what you download. You could easily see 2-3x faster web browsing by clearing your cache manually.
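
To put a number on it, here is a small Python sketch that replays the first trace from above (the one taken with the full cache) and compares the bytes downloaded with and without a working cache. It assumes every file is cacheable and ignores headers and 304 round trips, so treat it as a back-of-the-envelope estimate.

```python
# (path, bytes) for each request in the full-cache trace.
trace = [
    ("/", 1027),
    ("/styles-site.css", 4075),
    ("/c/header-dd.jpg", 26774),
    ("/archives/cat_news.html", 868),
    ("/styles-site.css", 4075),
    ("/c/header-dd.jpg", 26774),
]

def bytes_downloaded(requests, cache_works: bool) -> int:
    """Total bytes fetched; with a working cache each path is fetched once."""
    seen = set()
    total = 0
    for path, size in requests:
        if cache_works and path in seen:
            continue  # served from the local cache, nothing downloaded
        seen.add(path)
        total += size
    return total

broken = bytes_downloaded(trace, cache_works=False)
working = bytes_downloaded(trace, cache_works=True)
print(broken, working, round(broken / working, 2))
```

Over just these two pages the difference is a bit under 2x. The more pages you visit on the same site, the more the shared content dominates.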

The IE7 team has been doing a lot around caching and performance, so I bet this is fixed. I’m going to hit them up to make sure.

NoSpyMail Revisited

It’s been a couple of years now since I wrote NoSpyMail, and I haven’t really done much with it. Each month I get a few nice emails from users that are using it – and I’m always pleasantly surprised to hear that they still like it. It’s a simple little utility, probably written more out of anger than anything else.

But today I got a friendly email from a guy that is upgrading to Vista, and he reported that it wouldn’t install for him. OK – so I fixed that (I think!) for him. But I asked why he used it when he’s using Outlook 2003. After all, Outlook 2003 already has HTML image filtering built in.

His answer was that he still needs it. Even though we may filter out *most* of those HTML emails, how many do we still click on because they are from our “legitimate” places, e.g. Costco, Fandango, Ticketmaster, etc.? These are emails that we want to receive, but even these “legitimate” mass-email-senders are using trackers to spy on who’s clicking.

He’s got a valid point. He also thinks he gets a lot less spam as a result of using NoSpyMail. Hard to say, but I hope he’s right.

So, after having not used NoSpyMail myself for quite a while, I brought it back into my software lineup. Works great (it ought to – I wrote it! :-), and really doesn’t tax you in any way except that it filters out nasty HTML trackers. I was a little annoyed by the default settings because you get notified *so much* about the spymail. So I quickly unchecked the box to “Notify me when Spymail is discovered” (available via the Options). I don’t need to be notified – just clean it up and let me read my mail safely.

Sorry for the blatant plug.

Patents- the only way to win is not to play

In the software industry, our employers sometimes ask us to patent stuff.  The usual claim is that it is for “defensive purposes” in case your company gets sued.  Of course, as loyal employees, we want our company to be safe from greedy lawyers seeking bogus patent infringements, so we blindly believe, agree, and patent like mad.

This is a fallacy, of course, and all of us at the rank-and-file levels of our companies should resist patenting anything.

The problem is that eventually patents are used for offensive purposes rather than defensive purposes.  It’s just a matter of when it economically makes sense to use the patent.  Eventually, your company will struggle financially, and eventually an energetic young lawyer will come to the senior management with a solution to the shareholders’ woes –  enforcement of patents.  This has happened too many times to count.  It is the inevitability of patents.

So, if you are a technologist, don’t file software patents.  Software patents are for lawyers that like to destroy other businesses for their own personal gain.  The USPTO is not capable of differentiating a worthy software patent from a mathematically impossible one. Your company may try to bribe you with incentives to get you to “help”.  Your company will claim that the patents are only for “protection”.  It’s not your company’s fault.  All successful companies need lawyers, and lawyers tell them to do this.  The company always starts out with the best of intentions.  But, mark my words, if the patent proves useful monetarily, your patent will someday be used to tear apart someone else’s hard work.

Until the law changes, the only way to win is to not play.