Approximate, Estimate, Extrapolate, Fuzzy, Guess, Guesstimate, Interpolate, Predict, Probabilistic, Sloppy, Trend
Devi's producing a taxi tracker so the head office knows where its fleet is at any time. The taxis only transmit their location every 10 seconds, which would ordinarily lead to jerky movements on the map. However, Devi wants the motion to be smooth, so she uses interpolation to guess each taxi's location between updates.
To comprehend system activity and predict what might be happening next, it's useful to have frequent updates from the server.
It's expensive to keep updating from the server.
Instead of requesting information from the server, make a reasonable guess. There are times when it's better to provide a good guess than nothing at all. Typically, this pattern relates to dynamic situations where the browser is periodically grabbing new information using “Periodic Refresh”. The aim is to help the user spot general trends, which are often more important than precise figures. It's a performance optimisation because it gives almost the same benefit as if the data were really arriving instantaneously, but without the bandwidth overhead. For this reason, it makes more sense when using “Periodic Refresh” than “HTTP Streaming”, because the latter doesn't incur as much overhead in sending frequent messages.
One type of Guesstimate is based on historical data, and there are several ways the browser might have access to such data:
The browser application can capture recent data by accumulating any significant observations into variables that last as long as the Ajax App is open.
The browser application can capture longer-term data in cookies, so it's available in subsequent sessions.
The server can expose historical data for interrogation by the browser application.
Equipped with historical data, it's possible to extrapolate future events, albeit imprecisely. Imagine a collaborative environment where multiple users can drag-and-drop objects in a common space, something like the Magnetic Poetry Ajax app. Using a “Periodic Refresh” of one second, users might see an erratic drag motion, with the object appearing to leap across space, then stay still for a second, then leap again. A Guesstimate would exploit the fact that the motion in the next second is probably in the same direction and at the same speed as that of the previous second. Thus, the application can, for that second, animate the object as if it were being dragged in the same direction the whole time. Then, when the real position becomes apparent a second later, the object need not leap to that position; instead, a new estimate can be taken as to where the object's moving, and the object can move smoothly toward the predicted location. In other words, the object is always moving toward its current predicted location. Dragging motion is an example where users would likely favour a smooth flow at the expense of some accuracy, over an erratic display that is technically correct.
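The extrapolation half of this technique can be sketched as simple dead reckoning. The sketch below is illustrative, not taken from Magnetic Poetry or any real app: on each “Periodic Refresh”, it records the object's real position and derives a velocity from the previous interval, then assumes that velocity holds until the next refresh. (The smooth correction toward a predicted location, described above, would be layered on top of this.)

```javascript
// Illustrative dead-reckoning estimator for drag positions between
// Periodic Refresh updates. All names here are hypothetical.
function makeEstimator(initialPos) {
  var lastPos = initialPos;        // position from the latest server update
  var velocity = { x: 0, y: 0 };   // pixels per millisecond, from the last interval
  var lastUpdateTime = 0;
  return {
    // Called once per Periodic Refresh with the real position.
    serverUpdate: function(pos, time) {
      var elapsed = time - lastUpdateTime;
      if (elapsed > 0) {
        // Velocity observed over the previous interval ...
        velocity = {
          x: (pos.x - lastPos.x) / elapsed,
          y: (pos.y - lastPos.y) / elapsed
        };
      }
      lastPos = pos;
      lastUpdateTime = time;
    },
    // Called on every animation frame: extrapolate from the last known
    // position, assuming the motion continues unchanged.
    guess: function(time) {
      var elapsed = time - lastUpdateTime;
      return {
        x: lastPos.x + velocity.x * elapsed,
        y: lastPos.y + velocity.y * elapsed
      };
    }
  };
}
```

Between refreshes, the animation loop calls guess() rather than waiting for the server, so the object keeps gliding instead of leaping once a second.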
How about longer-term historical data, stretching over weeks and months instead of seconds and minutes? Long-term data can also be used for a Guesstimate. Imagine showing weather on a world map. The technically correct approach would be to initially show no weather and gradually populate the map as weather data is received from the server. But the philosophy here would suggest relying on historical data for a first-cut map, at least for a few indicative icons. In the worst case, the Guesstimate could be based on the previous day's results. Or it might be based on a more sophisticated statistical model involving several data points.
Historical data is not the only basis for a Guesstimate. It's also conceivable that the browser could perform a crude emulation of business logic normally implemented server-side. The server, for example, might take 10 seconds to perform a complex financial query. That's a problem for interactivity, where the user might like to rapidly tweak parameters. What if the browser could perform a simple approximation, perhaps based on a few assumptions and rule-of-thumb reasoning? Doing so might give the user a feel for the nature of the data, with the long server trip required only for detailed information.
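As a purely hypothetical illustration of that idea: while the server runs its detailed 10-second financial query, the browser might offer an instant rule-of-thumb figure each time the user tweaks a parameter. The model below (annual compounding, nothing more) stands in for whatever crude approximation a real application would choose.

```javascript
// Hypothetical rule-of-thumb estimate, recalculated instantly in the
// browser on every parameter tweak. The real, detailed figure still
// comes from the slow server-side query.
function roughFutureValue(principal, annualRate, years) {
  var value = principal;
  for (var i = 0; i < years; i++) {
    value *= (1 + annualRate); // compound once per year - a deliberate simplification
  }
  return value;
}
```

The user drags a slider, sees roughFutureValue() update immediately, and only triggers the expensive server query once the parameters settle.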
There are a few gotchas with Guesstimate. For one, the Guesstimate might end up being an impossible result, like "-5 minutes remaining" or "12.8 users online"! If there's a risk your algorithm will lead to such situations, you probably want to create a mapping back to reality; for instance, truncate or set limits. Another gotcha is an impossible change, such as the number of all-time website visitors suddenly dropping. The iTunes example below provides one mitigation technique: always underestimate, which ensures the value goes up upon correction. With Guesstimate, you also run the risk that you won't get figures back from the server, leading to even greater deviation from reality than expected. At some point, you'll probably need to give up and be explicit about the problem.
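The "mapping back to reality" can be as simple as clamping and truncating before display. A minimal sketch, with illustrative names and bounds:

```javascript
// Clamp a raw Guesstimate into a plausible range before showing it.
function clampGuesstimate(estimate, min, max) {
  if (estimate < min) { return min; }
  if (estimate > max) { return max; }
  return estimate;
}

// A count of people can't be fractional or negative, so "12.8 users
// online" becomes 12 and "-5" becomes 0.
function plausibleUserCount(estimate) {
  return Math.floor(clampGuesstimate(estimate, 0, Infinity));
}
```

The same guard works for time remaining, download tallies, and any other figure with known hard limits.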
Most Guesstimates are made between fetches of real data. The point of the Guesstimate is to reduce the frequency of real data, so you need to decide on a realistic frequency. If precision is valuable, real data will need to be accessed quite frequently. If server and bandwidth resources are restricted, there will be fewer accesses and a greater value placed on the Guesstimate algorithm.
Also, how often will a new Guesstimate be calculated? Guesstimates tend to be fairly mathematical in nature, and too many of them will impact application performance. On the other hand, too few Guesstimates will defeat the purpose.
Each time new data arrives, the Guesstimate somehow needs to be brought into line with the new data. In the simplest case, the Guesstimate is just discarded, and the fresh data adopted until a new Guesstimate is required.
Sometimes, a more subtle transition is warranted. Imagine a Guesstimate that occurs once every second, with real data arriving on the minute. The 59-second estimate might be well off the real value that arrives at the one-minute mark. If a smooth transition is important, and you want to avoid a sudden jump to the real value, you can estimate the real value at the two-minute mark, and spend the next minute making Guesstimates in that direction.
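That transition can be sketched as follows. This is an illustrative implementation, not from any real app: when fresh data arrives, rather than jumping to it, the display aims at where the real value is projected to be at the *next* sync, and interpolates toward that target over the interval.

```javascript
// Smooth toward a projected value instead of jumping on each sync.
function makeSmoother(initialValue) {
  var from = initialValue;  // displayed value as of the last sync
  var to = initialValue;    // projected value at the next sync
  return {
    // realValue: fresh figure from the server; rate: estimated change per
    // millisecond; interval: milliseconds until the next sync.
    sync: function(realValue, rate, interval) {
      from = to; // assume the previous interval fully elapsed (a simplification)
      to = realValue + rate * interval; // aim at the *projected* future value
    },
    // fraction: how far through the current interval we are (0..1).
    display: function(fraction) {
      return from + (to - from) * fraction;
    }
  };
}
```

The displayed figure may never exactly equal the server's figure, but it never visibly leaps either, which is the trade-off this pattern accepts.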
The iTunes demo below includes another little trick. The algorithm concedes a jump will indeed occur, but the Guesstimate is deliberately underestimated. Thus, when the real value arrives, the jump is almost guaranteed to be upward, as the user would expect. Here, there is a deliberate effort to make the Guesstimate less accurate than it could be, with the payoff being more realistic consolidation with real data.
It's conceivable that users will notice some strange things happening with a Guesstimate. Perhaps they know what the real value should be, and the server is showing something completely different. Or perhaps they notice a sudden jump as the application switches from a Guesstimate to a fresh value from the server. These experiences can erode trust in the application, especially as users may be missing the point that the Guesstimate is for improved usability. Trust is critical for public websites, where many alternatives are often present, and it would be especially unfortunate to lose trust due to a feature that's primarily motivated by user experience concerns.
For entertainment-style demos, Guesstimates are unlikely to cause much of a problem. But what about using a Guesstimate to populate a financial chart over time? The more important the data being estimated, and the less accurate the estimate, the more users need to be aware of what the system is doing. At the very least, consider a basic message or legal notice to that effect.
In some cases, the server exposes information that the browser can use to make a Guesstimate. For example, historical information will allow the browser to extrapolate to the present. You need to consider what calculations are realistic for the browser to perform, and ensure it will have access to the appropriate data. A “Web Service” exposing generic history details is not the only possibility. In some cases, it might be preferable for the server to provide a service related to the algorithm itself. In the Apple example below, recent real-world values are provided by the server, and the browser must analyse them to determine the rate per second. However, an alternative design would be for the server to calculate the rate per second, reducing the work performed by each browser.
As its iTunes Music Store neared its 500 millionth song download, Apple decorated its homepage with a rapid counter that appeared to show the number of downloads in real time (Figure 1.39, “ITunes Counter”). The display made for an impressive testimony to iTunes' popularity and received plenty of attention. It only connected with the server once a minute, but courtesy of the Guesstimate algorithm described below, the display smoothly updated every 100 milliseconds.
2446.034075 megabytes (and counting) of free storage so you'll never need to delete another message.
But there's a twist: the storage capacity figure increases each second. Having just typed a couple of sentences, it's up to 2446.039313 megabytes. GMail is providing a not-so-subtle message about its hosting credentials.
Andrew Parker has provided some analysis of the homepage. The page is initially loaded with the storage capacity for the first day of the previous month and the current month (also the next month, though that's apparently not used). When the analysis occurred, 100MB was being added per month. Once you know that, you can calculate how many megabytes are added per second. So the algorithm determines how many seconds have passed since the current month began, and it can then infer how many megabytes have been added in that time. Add that to the amount at the start of the month, and you have the current storage capacity each second.
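The calculation Parker describes can be sketched as a single function. The names and signature here are illustrative, not GMail's own code: given the storage figures at the start of this month and the next, it derives a rate and extrapolates to the present moment.

```javascript
// GMail-style extrapolation: estimate current storage capacity from two
// known monthly figures. All parameter names are hypothetical.
// thisMonthMB / nextMonthMB: capacity at the start of this and next month.
// monthStartMs / monthEndMs: those two instants, in milliseconds.
// nowMs: the current time, in milliseconds.
function currentStorage(thisMonthMB, nextMonthMB, monthStartMs, monthEndMs, nowMs) {
  // Megabytes added per millisecond over the month...
  var mbPerMs = (nextMonthMB - thisMonthMB) / (monthEndMs - monthStartMs);
  // ...times the time elapsed so far, added to the month's opening figure.
  return thisMonthMB + mbPerMs * (nowMs - monthStartMs);
}
```

Re-running this once a second with `new Date().valueOf()` yields the ever-creeping figure on the homepage, with no further server contact required.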
The iTunes counter relies on a server-based service that updates every five minutes. On each update, the service shows the current song tally and the tally five minutes prior. Knowing how many songs were sold in a five-minute period allows a songs-per-second figure to be calculated. So the script knows the recent tally and, since it knows how many seconds have passed since then, it can estimate how many songs have been sold in the meantime. The counter then shows the recent tally plus that estimate.
There are two actions performed periodically:
Once a minute, doCountdown() calls the server to get new song stats.
Once every 100 milliseconds, runCountdown() uses a rate estimation to morph the counter display.
The key global variables are rate and curCount:
rate is the number of songs purchased per millisecond, according to the stats in the XML that's downloaded each minute.
curCount is the counter value.
So each time the server response comes in, rate is updated to reflect the most recent songs-per-millisecond figure. And curCount is continuously incremented according to that figure, with the output shown on the page.
The stats change on the server every five minutes, though doCountdown() pulls down the recent stats once a minute to catch any changes a bit sooner:
//get most recent values from xml and process
ajaxRequest('http://www.apple.com/itunes/external_counter.xml',
            initializeProcessReqChange);

//on one minute loop
var refreshTimer = setTimeout(doCountdown, refresh);
What it fetches is the counter XML:
<root>
  <count name="curCount" timestamp="Thu, 07 Jul 2005 14:16:00 GMT">
    484406324
  </count>
  <count name="preCount" timestamp="Thu, 07 Jul 2005 14:11:00 GMT">
    484402490
  </count>
</root>
setCounters() is the function that performs the Guesstimate calculation based on this information. It extracts the required parameters from XML into ordinary variables, for example:
preCount = parseInt(req.responseXML.getElementsByTagName
  ('count')[1].childNodes[0].nodeValue);
When a change is detected, it re-calculates the current rate as (number of new songs) / (time elapsed). Note that no assumption is made about the five-minute duration; hence the time elapsed (dateDiff) is always deduced from the XML.
//calculate difference in values
countDiff = initCount - preCount;

//calculate difference in time of values
dateDiff = parseInt(initDate.valueOf() - preDate.valueOf());

//calculate rate of increase
//  i.e. ((songs downloaded in previous time)/time)*incr
rate = countDiff / dateDiff;
This is the most accurate rate estimate possible, but the next thing the script does is reduce it by 20%. Why would the developers deliberately underestimate the rate? Presumably because they have accepted that the rate will always be off, with each browser-server sync causing the counter to adjust itself in one direction or another, and they want to ensure that direction is always upwards. Making the estimate lower than expected all but guarantees the counter will increase - rather than decrease - at every synchronisation point. True, there will still be a jump, and it will actually be bigger on average because of the adjustment, but that's better than having the counter occasionally drop in value, which would break the illusion.
rate = rate*0.8;
As well as the once-a-minute server call, there's the once-every-100-milliseconds counter repaint, handled by runCountdown(). With the rate variable re-Guesstimated once a minute, it's easy enough to determine the counter value at each repaint. incr is the pause between redisplays, 100ms. So every 100ms, the script simply calculates the new Guesstimated song quantity by adding the expected increment for that period. Note that the GMail counter example discussed above calculates the total figure each time, whereas the present algorithm gradually increments it. The present algorithm is therefore a little more efficient, although more vulnerable to rounding errors.
//multiply rate by increment
addCount = rate * incr;

//add this number to counter
curCount += addCount;
And finally, the counter display is morphed to show the new Guesstimate. The show was all over when the tally reached 500 million songs, so the counter will never show more than 500 million.
c.innerHTML = (curCount<500000000) ? intComma(Math.floor(curCount)) : "500,000,000+";
A Guesstimate can often be used to compensate for the gaps between “Periodic Refresh” updates.
“Predictive Fetch” is another performance optimisation based on probabilistic assumptions. “Predictive Fetch” guesses what the user will do next, whereas Guesstimate involves guessing the current server state. In general, Guesstimate decreases the number of server calls, whereas “Predictive Fetch” actually increases the number of calls.
If you know what time it was when you started reading this pattern, you can Guesstimate the current time by adding an estimate of your reading duration.
An expanded analysis of the iTunes Counter can be found on my blog at http://www.softwareas.com/ajaxian-guesstimate-on-download-counter. Some of the comments and spacing have been changed for this analysis.