Guesstimate - Ajax Patterns

Guesstimate

From Ajax Patterns

Evidence: 1/3

Tags: Approximate Estimate Guesstimate Prediction Probabilistic


In A Blink

Browser/computer thinks about server state. (fuzzy depiction of actual state)


Goal Story

On a browser-based floor plan, Frank is watching a radio-tagged product move through the factory. In reality, the browser is only receiving the product's location every 10 seconds, but the display depicts the motion as a smooth flow.


Problem

How can you cut down on calls to the server?


Forces

  • To comprehend system activity and predict what might be happening next, it's useful to have frequent updates from the server.
  • It's expensive to continuously update from the server.


Solution

Instead of requesting information from the server, make a reasonable guess. There are times when it's better to provide a good guess than nothing at all.

One type of Guesstimate is based on historical data, and there are several ways the browser might have access to such data:

  • The browser application can capture recent data by accumulating any significant observations into variables that last as long as the Ajaxian application is open.
  • The browser application can capture longer-term data in cookies, so it is available in subsequent sessions.
  • The server can expose historical data, for interrogation by the browser application.

Equipped with historical data, it's possible to extrapolate future events, albeit with some imprecision. Imagine a collaborative environment where multiple users can drag-and-drop objects in a common space, something like the Ajaxian Magnetic Poetry. Using a Periodic Refresh of 1 second, users might see an erratic drag motion, with the object appearing to leap across space, then stay still for a second, then leap again. A Guesstimate would exploit the fact that the motion of the next second is probably in the same direction and at the same speed as that of the previous second. Thus, the application can, for that second, animate the object as if it were being dragged in the same direction. Then, when the real position becomes apparent a second later, the object need not leap to that position; instead, a new estimate can be made of where the object is heading, and the object can be moved in that direction. Dragging motion is an example where users would likely favour a smooth flow, at the expense of some accuracy, over an erratic display that is technically flawless.
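The extrapolation described above can be sketched roughly as follows. This is only an illustration, not code from any real application: the function names estimateVelocity and guessPosition, and the {x, y, t} sample shape, are assumptions.

```javascript
// Hypothetical helpers: names and data shapes are assumptions for illustration.

// Velocity (pixels per millisecond) implied by the last two real samples.
function estimateVelocity(prev, last) {
  const dt = last.t - prev.t;
  if (dt <= 0) return { vx: 0, vy: 0 }; // no usable history: assume no motion
  return { vx: (last.x - prev.x) / dt, vy: (last.y - prev.y) / dt };
}

// Guess where the object is at time `now`, assuming it keeps its velocity.
function guessPosition(last, velocity, now) {
  const elapsed = now - last.t;
  return {
    x: last.x + velocity.vx * elapsed,
    y: last.y + velocity.vy * elapsed,
  };
}

// Two real positions, one second apart, as a Periodic Refresh might deliver them.
const prev = { x: 100, y: 50, t: 0 };
const last = { x: 140, y: 70, t: 1000 };
const velocity = estimateVelocity(prev, last);

// Half a second after the last real sample, the guess has moved half as far
// again as the object moved during the previous second.
const guess = guessPosition(last, velocity, 1500);
console.log(guess);
```

An animation loop would call guessPosition() on every frame, and recompute the velocity each time a real position arrives.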

How about longer-term historical data, stretching over weeks and months instead of seconds and minutes? Long-term data can also be used for a Guesstimate. Imagine an application showing weather on a world map, for the user's favourite locations. The technically correct approach would be to initially show no weather and gradually populate the map as weather data is received from the server. But the Guesstimate spirit would suggest relying on historical data for a first-cut world map. In the simplest case, it could be based on the previous day's results. Or it might be based on a more sophisticated statistical model involving several data points.

Historical data is not the only basis for a Guesstimate. It's also conceivable the browser performs a crude emulation of business logic normally implemented server-side. The server, for example, might take 10 seconds to perform a complex financial query. That's a problem for interactivity, where the user might like to rapidly tweak parameters. What if the browser could perform a simple approximation, perhaps based on a few assumptions and rule-of-thumb reasoning? Doing so might give the user a feel for the nature of the data, with the long server trip only required for detailed information.


Decisions

How often will real data be fetched? How often will Guesstimates be made?

Most Guesstimates are made between fetches of real data. The point of the Guesstimate is to reduce how often real data is fetched, so you need to decide on a realistic frequency. If precision is valuable, real data will need to be accessed quite frequently. If server and bandwidth resources are a higher priority, there should be fewer accesses.

Also, how often will a new Guesstimate be calculated? Guesstimates tend to be fairly mathematical in nature, and too many of them will affect application performance. On the other hand, too few Guesstimates will defeat the purpose.

How will the Guesstimate be consolidated with real data?

Each time new data arrives, the Guesstimate somehow needs to be brought into line with the new data. In the simplest case, the Guesstimate is just discarded, and the fresh data adopted until a new Guesstimate is required.

Sometimes, a more subtle transition is warranted. Imagine a Guesstimate occurs once every second, with real data arriving on the minute. The 59-second estimate might be well off the real value that arrives at one minute. If a smooth transition is important, and you want to avoid a sudden jump to the real value, then you can estimate the real value at 2 minutes, and spend the next minute making Guesstimates in that direction.
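One way to sketch this smoother transition is to pick a display rate that lands on the predicted real value at the next fetch, instead of snapping to the fresh value. The helper name convergingRate and all the figures below are assumptions chosen for illustration.

```javascript
// Hypothetical helper: returns the display rate for the next interval so that
// the shown value converges on the predicted real value instead of jumping.
function convergingRate(displayed, real, realRate, interval) {
  // Predicted real value one interval ahead, based on the recent real rate.
  const target = real + realRate * interval;
  return (target - displayed) / interval;
}

const displayed = 1180;   // value on screen when the real data arrives
const real = 1200;        // fresh value from the server
const realRate = 20 / 60; // units per second, taken from recent real data
const interval = 60;      // seconds until the next fetch

const displayRate = convergingRate(displayed, real, realRate, interval);

// If the prediction holds, the display reaches the real value at the next fetch.
const shownAtNextFetch = displayed + displayRate * interval;
console.log(displayRate, shownAtNextFetch);
```

Here the display was 20 units behind, so it runs at roughly twice the real rate for the next minute to catch up without a visible jump.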

The ITunes demo below shows another little trick. The Guesstimate for a counter is deliberately underestimated, so when the real value arrives, the jump is almost guaranteed to be upward, as the user would expect. Here, there is a deliberate effort to make the Guesstimate less accurate than it could be, with the payoff being more realistic consolidation with real data.

Will users be aware a Guesstimate is taking place?

It's conceivable that users will notice some strange things happening with a Guesstimate. Perhaps they know what the real value should be, and the application is showing something completely different. Or perhaps they notice a sudden jump as the application switches from a Guesstimate to a fresh value from the server. These experiences can erode trust in the application, especially as users may be missing the point that the Guesstimate is for improved usability. Trust is critical for public websites, where many alternatives are often present, and it would be especially unfortunate to lose trust due to a feature that's primarily motivated by user experience concerns.

For entertainment-style demos, Guesstimates are unlikely to cause many problems. But what about using a Guesstimate to populate a financial chart over time? The more important the data being estimated, and the less accurate the estimate, the more users need to be aware of what the system is doing. At the very least, consider a basic message or legal notice to that effect.

What support will the server provide?

In some cases, the server exposes information that the browser can use to make a Guesstimate. For example, historical information will allow the browser to extrapolate to the present. You need to consider what calculations are realistic for the browser to perform, and ensure it will have access to the appropriate data. A Generic Service exposing generic history details is not the only possibility. In some cases, it might be preferable for the server to provide a Specialised Service related to the algorithm itself. In the Apple example below, recent real-world values are provided by the server, and the browser must analyse them to determine the rate per second. An alternative design would be for the server to calculate the rate per second, lightening the work performed by each browser.


Real-World Examples

This pattern is largely speculative, with only a couple of similar examples available.

Apple ITunes Counter

As its ITunes Music Store neared its 500 millionth song download, Apple decorated its Apple homepage with a counter that appeared to show the number of downloads in real time. The display made for an impressive testimony to ITunes' popularity and received plenty of attention.

In practice, the counter was based on a Guesstimate, receiving server data only once a minute. The code example below explains how it's done.


GMail Storage Space

The GMail homepage shows a message like this to unregistered users:

 2446.034075 megabytes (and counting) of free storage so you'll never need to delete another message.

But there's a twist: the storage capacity continuously increases each second. Having just typed a couple of sentences, we're up to 2446.039313 megabytes. Google is providing a not-so-subtle message about its storage capability.

Andrew Parker has provided some analysis of the homepage. Essentially, the page is initially loaded with the storage capacity for the first day of the previous month and the current month (also the next month, though that's apparently not used). When the analysis occurred, 100MB was being added per month. Once you know that, you can calculate how many megabytes are added per second. So the algorithm determines how many seconds have passed since the current month began, and it can then infer how many megabytes have been added in that time. Add that to the amount at the start of the month, and you have the current storage capacity.
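The calculation described above might look something like the following. This is a sketch only, not Google's actual code: the function name guessStorageMb, its parameters, and the starting figure of 2400MB are all assumptions.

```javascript
// Sketch only: the function name, parameters, and figures are assumptions,
// not Google's actual code.
function guessStorageMb(startMb, monthlyGrowthMb, monthStart, monthEnd, now) {
  // Megabytes added per millisecond during this month.
  const mbPerMs = monthlyGrowthMb / (monthEnd - monthStart);
  // Start-of-month capacity plus whatever has been added since then.
  return startMb + mbPerMs * (now - monthStart);
}

const monthStart = Date.UTC(2005, 6, 1); // 1 July 2005 (months are zero-based)
const monthEnd = Date.UTC(2005, 7, 1);   // 1 August 2005
const midMonth = (monthStart + monthEnd) / 2;

// Halfway through a month that adds 100MB, about 50MB has been added.
const halfway = guessStorageMb(2400, 100, monthStart, monthEnd, midMonth);
console.log(halfway.toFixed(6) + " megabytes (and counting)");
```

Because the total is recomputed from the clock on every repaint, the page needs no further server calls until the month boundary figures change.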


Code Examples

Apple ITunes Counter

The ITunes counter relies on a server-based service that updates every five minutes. On each update, the service shows the current song tally and the tally five minutes prior. Knowing how many songs were sold in a five-minute period allows for a songs-per-second figure to be calculated. So the script knows the recent figure and, since it knows how many seconds have passed since then, it also has an estimate of how many songs were sold. The counter then shows the recent figure plus the estimate of songs sold since then.

There are two loops going on:

  • A quiet loop (doCountdown()) calls the server every minute to get new song stats.
  • A vigorous loop (runCountdown()) uses rate estimation data to morph the counter display every 100 milliseconds.

The key global variables are rate and curCount:

  • rate is the number of songs purchased per millisecond, according to the stats in the XML that's downloaded each minute.
  • curCount is the counter value.

So each time the server response comes in, rate is updated to reflect the most recent songs-per-millisecond figure. And curCount is continuously incremented according to that figure, with the output shown on the page.

Now, let's look at the code. The stats change once every five minutes, though doCountdown() pulls down the recent stats once a minute to catch any changes a bit more quickly:

   //get most recent values from xml and process
   ajaxRequest('http://www.apple.com/itunes/external_counter.xml',
               initializeProcessReqChange);
   //on one minute loop
   var refreshTimer = setTimeout(doCountdown,refresh);

What it fetches is the counter XML:

    <root>
      <count name="curCount" timestamp="Thu, 07 Jul 2005 14:16:00 GMT">
        484406324</count>
      <count name="preCount" timestamp="Thu, 07 Jul 2005 14:11:00 GMT">
        484402490</count>
    </root>

setCounters() is the function that performs the Guesstimate calculation based on this information. It extracts the required parameters from XML into ordinary variables, for example:

    preCount = parseInt(req.responseXML.getElementsByTagName
               ('count')[1].childNodes[0].nodeValue);
  

When a change is detected, it re-calculates the current rate as (number of new songs) / (time elapsed). Note that no assumption is made about the five-minute duration, hence the time elapsed (dateDiff) is always deduced from the XML.

    //calculate difference in values
    countDiff = initCount-preCount;
    //calculate difference in time of values
    dateDiff = parseInt(initDate.valueOf()-preDate.valueOf());
    //calculate rate of increase
    //    ((songs downloaded in previous time)/time)*incr
    rate = countDiff/dateDiff;

Next, a little adjustment is applied. As mentioned in one of the Decisions above, the adjustment ensures the guesstimated counter rises a bit slower than we'd expect a real counter to rise. An underguesstimation is a good thing, because it means when we next perform a real update using server-side data, there's not much chance the counter will drop. In general, we'd expect it to jump up a bit after this alteration, but that's okay, because the Guesstimate is not perfect, so it will always jump one way or the other anyway. The 80% adjustment just about guarantees the jump is northward, which is more in line with user expectations.

    rate = rate*0.8;
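As a worked example, the sample XML values shown earlier can be plugged into the rate calculation and the 80% adjustment. The variable adjustedRate is a name introduced here for clarity; the original script simply overwrites rate.

```javascript
// Values copied from the sample counter XML shown earlier.
const initCount = 484406324;
const preCount = 484402490;
const initDate = new Date("Thu, 07 Jul 2005 14:16:00 GMT");
const preDate = new Date("Thu, 07 Jul 2005 14:11:00 GMT");

const countDiff = initCount - preCount;                  // 3834 songs
const dateDiff = initDate.valueOf() - preDate.valueOf(); // 300000 ms (five minutes)
const rate = countDiff / dateDiff;                       // songs per millisecond

// Deliberate underestimate, so the next real update jumps upward.
const adjustedRate = rate * 0.8;

console.log((rate * 1000).toFixed(2));         // 12.78 songs per second
console.log((adjustedRate * 1000).toFixed(2)); // 10.22 songs per second
```

So with this sample data the counter would climb at a little over ten songs per second, against a measured rate of nearly thirteen.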

As well as the once-a-minute server call, there's the once-every-100-milliseconds counter repaint, which runCountdown() handles. With the rate variable re-Guesstimated once a minute, it's easy enough to determine the counter value at each repaint. incr is the pause between redisplays: 100ms. So every 100ms, it will simply calculate the new Guesstimated song quantity by adding the expected increment in that time. Note that the GMail counter example discussed above calculates the total figure each time, whereas the present algorithm gradually increments it. The present algorithm is therefore a little more efficient, although more vulnerable to rounding errors.

    //multiply rate by increment
    addCount = rate*incr;
    //add this number to counter
    curCount += addCount;
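The difference between the two update styles can be seen in a small sketch. The names rate, incr, and curCount follow the original script; startCount, ticks, and recomputed are assumptions, and the loop stands in for the timer so that the example runs instantly.

```javascript
// Sketch comparing the two update styles. startCount, ticks, and recomputed
// are names introduced for illustration only.
const rate = 0.01278 * 0.8;   // songs per millisecond, after the 80% adjustment
const incr = 100;             // milliseconds between repaints
const startCount = 484406324; // last real value from the server
const ticks = 600;            // one minute of repaints at 100ms each

// Incremental style (ITunes counter): add a small step on every repaint.
let curCount = startCount;
for (let i = 0; i < ticks; i++) {
  curCount += rate * incr;
}

// Recompute style (GMail counter): derive the total from the elapsed time.
const recomputed = startCount + rate * (ticks * incr);

// The two agree closely; any gap is accumulated floating-point rounding.
console.log(curCount, recomputed, Math.abs(curCount - recomputed));
```

Over a single minute the drift is negligible, and it is reset whenever fresh server data replaces curCount.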

And finally, the counter display is morphed to show the new guesstimate. The show was all over when the tally reached 500 million songs, so the counter will never show more than 500 million.

    c.innerHTML = (curCount<500000000) ?
      intComma(Math.floor(curCount)) : "500,000,000+";
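The intComma() helper is not shown in the excerpt. A minimal stand-in, assuming it simply inserts thousands separators into a whole number, might look like this; formatCounter is a hypothetical wrapper that returns the string instead of writing to the page.

```javascript
// Hypothetical stand-in for intComma(); the original helper is not shown,
// so this is an assumption about its behaviour.
function intComma(n) {
  // Insert a comma before every group of three digits counted from the right.
  return String(n).replace(/\B(?=(\d{3})+(?!\d))/g, ",");
}

// Same logic as the display line above, returning the text instead of
// assigning it to innerHTML.
function formatCounter(curCount) {
  return curCount < 500000000
    ? intComma(Math.floor(curCount))
    : "500,000,000+";
}

console.log(formatCounter(484406324.7)); // 484,406,324
console.log(formatCounter(500000001));   // 500,000,000+
```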


Related Patterns

Periodic Refresh

A Guesstimate can often be used to compensate for gaps between Periodic Refreshes.

Predictive Fetch

Predictive Fetch is another performance optimisation based on probabilistic assumptions. Predictive Fetch guesses what the user will do next, whereas Guesstimate involves guessing the current server state. In general, Guesstimate decreases the number of server calls, whereas Predictive Fetch actually increases the number of calls.

Fat Client

Guesstimates require some business and application logic to be calculated in the browser, a characteristic of Fat Clients.

Visual Metaphor

A marathon runner receives precise distance values each time he reaches a checkpoint. A high-tech counter can be strapped to his body to provide a good estimate of the distance since the last checkpoint, allowing the overall distance to be estimated.

Want to Know More?

Full walkthrough of the ITunes Counter on Michael Mahemoff's blog.