Obama’s Wars

I voted for Obama for one reason:  I thought he was more likely to get us out of Iraq.  I was wrong.  Obama’s stance was clear: he just needed 18 months to do the job.  That deadline is Tuesday.

There are currently still 85,000-100,000 US troops in Iraq.  All other countries have already pulled out.  Just as under the Bush administration, the objective of our presence in the region is unclear.

And although he campaigned on reducing US war efforts, Obama has also more than doubled US troop levels in Afghanistan in the past year.  If you think this is Bush’s war, you’re wrong.  Obama did this.

Obama’s failure to hold to his campaign promise is just one more reason why he will never again get a vote from me.

Chrome: Cranking Up The Clock

Over the past couple of years, several of us have dedicated a lot of time to Chrome’s timer system. Because we do things a little differently, this has raised some eyebrows. Here is why and what we did.

Goal
Our goal was to have fast, precise, and reliable timers. By “fast”, I mean that the timers should fire repeatedly with a low period. Ideally we wanted microsecond timers, but we eventually settled for millisecond timers. By “precise”, I mean we wanted the timer system to work without drift – you should be able to monitor timers over short or long periods of time and still have them be precise. And by “reliable”, I mean that timers should fire consistently at the right times; if you set a 3.67ms timer, it should be able to fire repeatedly at 3.67ms without significant variance.

Why?
It may be surprising to hear that we had to do any work to implement these types of timers. After all, timers are a fundamental service provided by all operating systems. Lots of browsers use simpler mechanisms and they seem to work just fine. Unfortunately, the default timers really are too slow.

Specifically, Windows timers by default will only fire with a period of ~15ms.  While processor speeds have increased from 500MHz to 3GHz over the past 15 years, the default timer resolution has not changed.  And at 3GHz, 15ms is an eternity.

This problem does affect web pages in a very real way. Internally, browsers schedule time-based tasks to run a short distance in the future, and if the clock can’t tick faster than 15ms, that means the application will sleep for at least that long. To demonstrate, Erik Kay wrote a nice visual sorting test. Due to how Javascript and HTML interact in a web page, applications such as this sorting test use timers to balance execution of the script with responsiveness of the webpage.

John Resig at Mozilla also wrote a great test for measuring the scalability, precision, and variance of timers.  He conducted his tests on the Mac, but here is a quick test on Windows.

In this chart, we’re looking at the performance of IE8, which is similar to what Chrome’s timers looked like prior to our timer work. As you can see, the timers are slow and highly variable. They can’t fire faster than ~15ms. 

[Chart: timer performance in IE8]

A Seemingly Simple Solution
Internally, Windows applications are often architected on top of Event Loops.  If you want to schedule a task to run later, you must queue up the task and wake your process later.  On Windows, this means you’ll eventually land in the function WaitForMultipleObjects(), which is able to wait for UI events, file events, timer events, and custom events.  (Here is a link to Chrome’s central message loop code.)  By default, the internal timer for all wait-event functions in Windows is 15ms.  Even if you set a 1ms timeout on these functions, they will only wake up once every 15ms (unless non-timer related events are pumped through them).
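
To make this concrete, here is a minimal sketch of the wait-based pattern described above.  It is illustrative only; the function name and structure are my own, not Chrome’s actual message loop code:

```cpp
#include <windows.h>

// Wait once for either a posted task (signaled via wake_event), pending UI
// input, or the expiration of delay_ms.  Even with delay_ms set to 1, the
// timeout branch will not fire any sooner than the system clock tick (~15ms)
// unless some other event wakes the call first.
void RunLoopOnce(HANDLE wake_event, DWORD delay_ms) {
  DWORD result = MsgWaitForMultipleObjectsEx(
      1, &wake_event, delay_ms, QS_ALLINPUT, MWMO_INPUTAVAILABLE);
  if (result == WAIT_OBJECT_0) {
    // wake_event was signaled: a task was posted to this loop.
  } else if (result == WAIT_OBJECT_0 + 1) {
    // UI input is available: pump the pending Windows messages.
    MSG msg;
    while (PeekMessage(&msg, NULL, 0, 0, PM_REMOVE)) {
      TranslateMessage(&msg);
      DispatchMessage(&msg);
    }
  } else if (result == WAIT_TIMEOUT) {
    // The delay expired (subject to the coarse system tick): run due timers.
  }
}
```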

To change the default timer, applications must call timeBeginPeriod(), which is part of the multimedia timers API.  This function changes the clock frequency and is close to what we want.  Its lowest granularity is still only 1ms, but that is a lot better than 15ms.  Unfortunately, it also has a couple of seriously scary side effects.  The first side effect is that it is system wide.  When you change this value, you’re impacting global thread scheduling among all processes, not just yours.  Second, this API also affects the system’s ability to get into its lowest-power sleep states.
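
For reference, here is a sketch of how an application might turn the faster clock on and off with this API.  This is illustrative only, not Chrome’s actual code:

```cpp
#include <windows.h>
#include <mmsystem.h>
#pragma comment(lib, "winmm.lib")  // timeBeginPeriod/timeEndPeriod live in winmm

void EnableHighResolutionTimers() {
  // Request a 1ms timer interrupt period -- the finest this API allows.
  // Remember: this change is machine-wide, not per-process.
  timeBeginPeriod(1);
}

void DisableHighResolutionTimers() {
  // Every timeBeginPeriod(n) must be matched by a timeEndPeriod(n) with the
  // same value, or the system stays in the higher-resolution, higher-power mode.
  timeEndPeriod(1);
}
```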

Because of these two side effects, we were reluctant to use this API within Chrome.  We didn’t want to impact any process other than a Chrome process, and all of the possible impacts of the API were nebulous.  Unfortunately, there are no other APIs which could make our message loop work quickly.  Although Windows does have a high-performance cycle counter API, that API is slow to execute[1], has bugs on some AMD hardware[2], and has no effect on the system-wide wait functions.

Justifying timeBeginPeriod
At one point during our development, we were about to give up on using the high-resolution timers, because they just seemed too scary.  But then we discovered something.  Using WinDbg to monitor Chrome, we found that every major multimedia browser plugin was already using this API, including Flash[3], Windows Media Player, and even QuickTime.  Once we discovered this, we stopped worrying about Chrome’s use of the API.  After all – what percentage of the time is Flash open when your browser is open?  I don’t have an exact number, but it’s a lot.  And since this API affects the system globally, most browsers are already running in this mode.

We decided to make this the default behavior in Chrome.  But we hit another roadblock for our timers.

Browser Throttles and Multi-Process
With the high-resolution timer in place, we were now able to set events quickly for Chrome’s internals.  Most internal delayed tasks are long timers and didn’t need this feature, but there are a half dozen or so short timers in the code, and these did materially benefit.  Nonetheless, the one that matters most, the timer behind the browser’s setTimeout and setInterval functions, did not yet benefit.  This is because our WebKit code (and other browsers do this too) was intentionally preventing any timer from sustaining a tick faster than 10ms.

There are probably several reasons for the 10ms timer in browsers.  One is simply convention.  Another is that some websites are poorly written and will set timers to run like crazy.  If the browser attempts to service the timers, this can spin the CPU, and who gets the bug report when the browser is spinning?  The browser vendor, of course.  It doesn’t matter that the real bug is in the website and not the web browser; it is still important for the browser to address the issue.

But the 3rd, and probably most critical reason is that most single-process browser architectures can become non-responsive if you allow websites to loop excessively with 0-millisecond delays in their JavaScript. Remember that browsers are generally written on top of Event Loops.  If the slow JavaScript interpreter is constantly scheduling a wakeup through a 0ms timer, this clogs the Event Loop which also processes mouse and keyboard events. The user is left with not just a spinning CPU, but a basically hung browser.  While I was able to reproduce this behavior in single-process browsers, Chrome turned out to be immune – and the reason was because of Chrome’s multi-process architecture. Chrome puts the website into a separate process (called a “renderer”) from the browser’s keyboard and mouse handling process.  Even if we spin the CPU in a renderer, the browser remains completely responsive, and unless the user is checking her Task Manager, she might not even notice.

So the multi-process architecture was the enabler. We wrote a simple test page to measure the fastest time through the setTimeout call and verified that a tight loop would not damage Chrome’s responsiveness.  Then, we modified WebKit to reduce the throttle from 10ms to 1ms and shipped the world’s peppiest beta browser: Chrome 1.0beta.

Real World Problems
Our biggest fear with shipping the product was that we would identify some website which was spinning the CPU and annoying users.  We did identify a couple of these, but they were relatively obscure sites.  Finally, we found one which mattered – a small newspaper known as the New York Times.  The NYTimes is a well-constructed site – they just ran into a little bug with a popular script called prototype.js, and this hadn’t been an issue before Chrome cranked up the clock.  We filed a bug, but we had to change Chrome too.  At this point, with a little experimentation, we found that increasing the minimum timer from 1ms to 4ms seemed to work reasonably well on most machines.  Indeed, to this day, Chrome still uses a 4ms minimum tick.
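
For what it’s worth, the clamp itself is conceptually tiny.  Here is a sketch of the idea; the names are hypothetical, not WebKit’s actual identifiers:

```cpp
// Minimum tick allowed for setTimeout/setInterval.  WebKit historically used
// 10ms; Chrome's first beta dropped it to 1ms, and it now sits at 4ms.
static const double kMinimumTimerIntervalMs = 4.0;

double ClampTimerInterval(double requested_ms) {
  return (requested_ms < kMinimumTimerIntervalMs) ? kMinimumTimerIntervalMs
                                                  : requested_ms;
}
```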

Soon, a second problem emerged as well.  Engineers at Intel pointed out that Chrome was causing laptops to consume a lot more power.  This was a far more serious problem and harder to fix.  We were not much concerned about the impact on desktops, because Flash, Windows Media Player, and QuickTime were already causing this to be true.  But for laptops, this was a big problem.  To mitigate it, we started tapping into the Windows Power APIs to monitor when the machine is running on battery power.  So before Chrome 1.0 shipped out of beta, we modified it to turn off fast timers if it detects that the system is running on batteries.  Since we implemented this fix, we haven’t heard many complaints.
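
The battery check can be done with the Win32 power-status API.  A rough sketch of the idea (again illustrative, not Chrome’s actual code):

```cpp
#include <windows.h>

// Returns true if the machine appears to be running on battery power.
bool IsRunningOnBattery() {
  SYSTEM_POWER_STATUS status;
  if (!GetSystemPowerStatus(&status))
    return false;  // Status unknown; assume AC power.
  return status.ACLineStatus == 0;  // 0 means "offline", i.e. on battery.
}

// The high-resolution clock is then only requested when plugged in, e.g.:
//   if (!IsRunningOnBattery())
//     timeBeginPeriod(1);
```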

Results
Overall, we’re pretty happy with the results.  First off, we can look at John Resig’s timer performance test. In contrast to the default implementation,  Chrome has very smooth, consistent, and fast timers: 

[Chart: timer performance in Chrome]

Finally, here is the result of the Visual Sorting Test mentioned above.  With a faster clock in hand, we see performance double.

[Chart: Visual Sorting Test results]

Future Work
We’d still like to eliminate the use of timeBeginPeriod.  It is unfortunate that it has such side effects on the system. One solution might be to create a dedicated timer thread, built atop the machine cycle counter (despite the problems with QueryPerformanceCounter), which wakens message loops based on self-calculated, sub-millisecond timers. This sounds trivial, but if we forget any operating system call which is stuck in a wait and don’t manually wake it, we’ll have janky timers. We’d also like to bring the current 4ms timer back down to 1ms. We may be able to do this if we better detect when web pages are accidentally spinning the CPU.
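
To illustrate the idea only, here is a speculative sketch of such a dedicated timer thread built on QueryPerformanceCounter.  This is not Chrome code, and it glosses over the hard part (keeping CPU usage down while waiting):

```cpp
#include <windows.h>

// Hypothetical timer thread: compute sub-millisecond deadlines from the cycle
// counter and signal the message loop's wake event each time one expires.
DWORD WINAPI HighResTimerThread(LPVOID param) {
  HANDLE wake_event = static_cast<HANDLE>(param);

  LARGE_INTEGER frequency, now, deadline;
  QueryPerformanceFrequency(&frequency);
  QueryPerformanceCounter(&deadline);

  // Example: a 250-microsecond periodic tick.
  const LONGLONG period = frequency.QuadPart / 4000;

  for (;;) {
    deadline.QuadPart += period;
    do {
      // Sleep(0) only yields the rest of the thread's quantum; a real
      // implementation would need a smarter strategy here to avoid burning CPU.
      Sleep(0);
      QueryPerformanceCounter(&now);
    } while (now.QuadPart < deadline.QuadPart);

    SetEvent(wake_event);  // Wake the message loop blocked in its wait call.
  }
  return 0;
}
```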

From the operating system side, we’d like to see sub-millisecond event waits built in by default which don’t use CPU interrupts or otherwise prevent CPU sleep states. A millisecond is a long time.

[1] Although written in 2003, the data in this article is still relatively accurate: Win32 Performance Measurement Options
[2] http://developer.amd.com/assets/TSC_Dual-Core_Utility.pdf
[3] Note: The latest versions of Flash (10) no longer use timeBeginPeriod.
NOTE: This article is my own view of events and does not reflect the views of my employer.

Democrats Agree – Democrat Policies Caused California Financial Crisis

Willie Brown, former Speaker of the State Assembly and former Mayor of San Francisco, gave a surprising interview this week, in which he acknowledged that his Democratic political agenda is largely responsible for the financial meltdown here in California.

His quotes in the interview are jaw-dropping.  Not only does he admit that the spending policies threw California into massive debt, but he also admits that he and the other legislators did this without really having analyzed the effects of the laws they were implementing.

Here is a nice quote: “I had actually participated in moving legislation to reduce the retirement age for teachers and I did it with great pride and I created it in my resume as one of my great achievements… Nobody took the time to do the analysis that would have persuaded us we needed to add money to make it work”

Well, that’s good to know.

He goes on to be fairly self-critical, “I may have been one of the key architects of many of the things that have created a challenge for my successors.”

So there you have it – even Willie Brown knows that Democrats are the architects of our problems.  But Brown couldn’t do it alone – it took all of them – much like Obama is doing now – to systematically ignore logic and common sense in favor of grabbing votes by promising to spend money on programs we know we cannot afford.

More Bandwidth Doesn’t Matter (Much)

This is a part of a report I wrote last month; the full report is available here.

Mike Belshe – [email protected] – 04/08/10

When choosing an Internet connection, it is intuitive to focus on network bandwidth – “Speeds up to 5Mbps!”  Bandwidth is important, but it isn’t the only factor in performance.  And given the way that HTTP uses short, bursty connections, it turns out that round-trip times dominate performance more than bandwidth does.

Of course, bandwidth speeds are very important for video, music, and other large content downloads.  These types of content, unlike web pages, are able to fully utilize the network because they are large, whereas web pages are made up of many short connections.  This report is only concerned with the effect of bandwidth and RTT on the loading of web pages.

Two Factors in Web-page Downloads:  Bandwidth and Round-Trip Time

If we make an analogy between plumbing and the Internet, we can consider the bandwidth of the Internet to be like the diameter of the water pipe.  A larger pipe carries a larger volume of water, and hence you can deliver more water between two points.

At the same time, no matter how big your pipe is, if the pipe is empty, and you want to get water from one point to the other, it takes time for water to travel through the pipe.  In Internet parlance, the time it takes for water to travel from one end of the pipe to the other and back again is called the round trip time, or RTT.

While it is easy to install a bigger pipe between two points to add more bandwidth, minimizing the round trip time is difficult because physics gets in our way.  If your server is located in London, for example, and you are in San Francisco, the time it takes to send a message is gated at least by the speed of light (even light would take ~30ms to travel that distance, or 60ms for a full round trip).  On the Internet today, round trip times of 150ms to 250ms are common.  And no matter how big your pipe is, the RTT will take this long.
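
As a rough back-of-the-envelope check (assuming a great-circle distance of roughly 8,600 km between San Francisco and London):

\[
t_{\text{one-way}} \approx \frac{8{,}600\ \text{km}}{299{,}792\ \text{km/s}} \approx 29\ \text{ms},
\qquad
t_{\text{round trip}} \approx 2 \times 29\ \text{ms} \approx 58\ \text{ms}
\]

Real packets take indirect routes and travel through fiber at roughly two-thirds the speed of light in a vacuum, which is why observed RTTs are considerably higher than this lower bound.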

Test #1:  Vary the Bandwidth

The first test we ran was to see the effect on web-page download times when you change the bandwidth of the network.  In this test, we’ve chosen fixed values for packet loss (0%) and round-trip-time (60ms) so that we can isolate the variable under test, bandwidth.  In this test we’re downloading a simulation of 25 of the web’s most popular web pages.

Here are the raw results, using HTTP to download the web page:

[Charts: Page Load Time via HTTP vs. Bandwidth]

Figure 1: This graph shows the decrease in latency (page load time) as the bandwidth goes up.  As you can see, there are diminishing returns as the bandwidth gets higher.   According to the Akamai Internet Report Q2’09, the average bandwidth in the US is only 3.9Mbps.

Figure 2: Here we see the incremental percentage decrease in latency as we add more bandwidth.  An increase from 5Mbps to 10Mbps amounts to a 5% improvement in Page Load Times.

Figure 3: This graph shows the effective bandwidth of the web page download as the raw bandwidth is increased.  At 1Mbps, web pages can be downloaded at ~69% of the total bandwidth capacity.  At 10Mbps, however, web pages can only be downloaded at ~16% of the total capacity; and we’re tapering off.

Test #2:  Vary the RTT

For this test, we fix the bandwidth at 5Mbps, and vary the RTT from 0ms to 240ms.  Keep in mind that the worldwide average RTT to Google is over 100ms today.  In the US, RTTs are lower, but 60-70ms is still common.  

[Charts: Page Load Time via HTTP vs. RTT]

Figure 4:  In this graph, we see the obvious effect that Page Load Time decreases as the RTT decreases.  Continued reductions in RTT always help, even when the RTT is already low.

Figure 5: This graph shows the % reduction in PLT for each 20 milliseconds of reduced RTT.  Unlike improvements to bandwidth (Figure 2), where benefits diminish, reducing the RTT always helps the overall PLT.  In these tests, with bandwidth fixed at 5Mbps, each 20ms of reduced RTT yielded a 7-15% reduction in PLT.

Figure 6: This chart shows the increase in overall effective bandwidth as RTT is reduced.  With high RTT, the effective bandwidth of a page load was as low as 550Kbps, which is a little more than 10% of the 5Mbps theoretical throughput.  With low RTT, web page downloads still only achieve ~54% of the total bandwidth available.  This is due to other factors in how web pages are loaded.

Conclusions

As you can see from the data above, if users double their bandwidth without reducing their RTT significantly, the improvement in web browsing will be minimal.  However, decreasing RTT, regardless of current bandwidth, always helps make web browsing faster.  To speed up the Internet at large, we should look for more ways to bring down RTT.  What if we could reduce trans-Atlantic RTTs from 150ms to 100ms?  This would have a larger effect on the speed of the Internet than increasing a user’s bandwidth from 3.9Mbps to 10Mbps or even 1Gbps.

Visual Studio signed/unsigned Comparison Warnings

I use Visual Studio 2005 for my Chromium builds.  Because Chromium is a cross-platform product (Windows, Mac, and Linux), we’ve built tools (robots) which can automatically build our code changes on the other platforms before we ever commit the change.  When a pending change fails on the robots for other platforms, it creates a headache for me because I have to rework the patch and retest.

Unfortunately, on Windows we use warning level 3, which treats Visual Studio warning 4018 as a problem but not Visual Studio warning 4389.  4018 treats signed and unsigned comparisons for greater-than or less-than as warnings, but not equality comparisons.  4389 treats signed and unsigned equality comparisons as warnings.  Why Microsoft split these into two different warnings, I don’t know.  They should be part of a single warning, in my opinion.
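
Here is a minimal example of the difference, as I understand it (assuming a default /W3 build in Visual Studio 2005):

```cpp
int main() {
  unsigned int u = 1;
  int i = -1;

  if (i < u)  return 1;   // C4018 (level 3): reported in our /W3 Windows build.
  if (i == u) return 2;   // C4389 (level 4): silent at /W3, but g++ reports
                          // "comparison between signed and unsigned integer
                          // expressions" for it on Linux and Mac.
  return 0;
}
```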

g++, which we use on Linux and Mac builds, does not differentiate these signed/unsigned comparisons in the same way.  And this is annoying, because it means that a signed/unsigned equality comparison will seem to work on my Windows machine, and then fail on Linux and Mac (“warning: comparison between signed and unsigned integer expressions”!).

If you have a similar problem, I recommend using Visual Studio’s “/we4389” option, which treats warning 4389 as an error and so reports it even in a level-3 build.  And voilà – your builds on all platforms will treat signed/unsigned comparisons the same way.  Phew!

Schwarzenegger’s Tax Reform

Last year, Governor Schwarzenegger issued executive order S-15-09 establishing a commission to analyze California’s tax options.  The report came back last September.  It looks very promising to me. 

Highlights:

  1. Addresses state revenue stability and broadens the tax base for California.
  2. Cuts personal income tax nearly in half for almost all Californians.
  3. Cuts sales tax in half.
  4. Implements a new tax system (BNRT) which would be ~1.5-4% of net receipts for all businesses.  This replaces the current corporate tax of up to 8.8%, which is eliminated entirely.
  5. (Optional) Establish a rainy-day fund for dealing with economic fluctuations to further reduce annual financial variances.

Obviously, everyone likes the tax cuts.  But who pays this BNRT (Business Net Receipts Tax)?  Everyone pays it.

The problem is that if you look at state revenues, we keep increasing income taxes (now the highest in the nation) and sales tax (also the highest in the nation).  But these taxes don’t cover all consumption.  And as consumption trends change over time, the revenues from these streams change.  The commission notes correctly that this is why the government has such feast-or-famine income streams each year.

[Chart: California state revenue by source over time]

First, we can look at state revenues.  You can see the rapid growth of the income tax.  You can also see that sales taxes, as a percentage of revenue, are dropping.

[Chart: percentage of California sales subject to sales tax over time]

Next, just what percentage of sales are subject to taxes?  Over time, you can see that most business in California is not subject to the tax.  This creates a disproportionately high tax on some industries while there is no tax on others.  And you can see that spending patterns among Californians change over time.  This is why the current tax system hasn’t been able to keep up with the times.

The critics are upset because the new tax sidesteps their current favorite tax breaks.  Any massive tax change is going to make someone unhappy.  But is the new tax proposal broad and fair?  Absolutely.  Nonetheless, critics still claim that the new system gives breaks to the rich.  But that isn’t true.  Who consumes the most in California?  The rich.  The rich will far-and-away pay the most under this system, just as they do today.

I do have a couple of criticisms.  First, I don’t understand the complete elimination of the corporate tax.  The corporate tax today discourages businesses from choosing California, so reducing it is commendable.  But I don’t think it needs to be eliminated entirely – reducing it to 1 or 2% would pacify critics and still leave California competitive.  Second, the exact rate of the tax is a little dodgy.  I’m worried that both the tax rate and the ability to collect are still too unknown.

By the way, lawyers hate the proposal.  Why?  Because the services they sell, which are currently untaxed, would now be taxed.  Don’t be surprised if a lawyer says he hates it.