From Worst to Best: How to Report Metrics & Measures


Bloggers and consultants sometimes take bold contrarian positions to separate themselves from the pack. I am as guilty as anyone, I admit. Sometimes we (as in the royal I) should have sympathy for our readers (as few as they are) and recognize that when we pundits take opposing views, we make analytics confusing and perplexing. From time to time we need to explain how all these different views can be reconciled and where the true nuggets of truth lie. No issue is more controversial, and yet more fundamental, than visitors vs. visits.

Sameer Khan on his blog Key Metrics and Web Analytics recently posted:

Most marketers and analyst are too concerned about the best metrics to focus on [how] to get actionable insights. Obsession for metrics does not always guarantee results. It is equally important to exclude the worst metrics from your analysis. Yes, you heard it right, these metrics can suck your time and lead you nowhere.
Top 3 Worst Web Analytics Metrics & Reports posted by Sameer, 9 April 2010

With this statement Sameer proceeds to list and discuss reasons for the top 3 worst metrics being unique visitors, bounce rate, and session duration. Just recently I discussed these exact same metrics in Visitors vs Visits and argued that unique visitors are fundamental. How do I reconcile my views with Sameer's?

I agree that we should focus on insights rather than metrics. But then the article turns to dissing specific metrics. We will return to his original statement, but first I must deal with bad metrics, or more specifically, what makes metrics bad or good.

The points he makes concerning metrics I for the most part agree with, in particular his points concerning bounce rate and session duration. How then can I reconcile this with positions I have already taken, when the top worst metric is unique visitors, or as Google Analytics refers to them, Absolute Unique Visitors? I am already on record for raking GA over the coals for having an implicit definition of visitors and only recently making unique visitors available in custom reports [here]. This is a problem endemic to most of the WAP offerings, not just GA.

I have discussed at length how visitor metrics (read unique visitor metrics) are superior to visit-based metrics for tracking and attributing true visitor value [here]. In fact, I have made identifying the unique visitor and user-agent a fundamental axiom of web analytics data collection and processing [here]. After all, the reason that web analytics is useful and necessary is that we assume we are tracking an individual, not a cookie or IP address that smudges everyone together. The closer we come to the assumption that we are tracking individuals, the more valuable the data.

So how do I reconcile this position with statements like this:

Third [alleged] benefit is your websites unique visitor number could change if you change the date range. You will have a different unique visitor value for day/week/month/yearly unique visitor. What is the point of using a metric that creates confusion instead of providing insights? … My advice is to use visits metric report with new and returning visitor segment instead. This will provide you more information than any form of unique visitor report can. [emphasis mine] Sameer, ibid.

It seems that Sameer has major support on this point.  This from Matt Belkin, the VP of Best Business Practices at Omniture:

Ironically, even though I’ve heard the most pushback from vendors who regrettably base much of their analytics on Unique Visitors, these inaccuracies are not vendor-specific; rather, they are largely manifestations of the Internet itself as I highlighted in my recent post, 15 Reasons why all Unique Visitors are not created equal.

I’ve also suggested that due to these limitations, I often recommend that web analytics professionals use Visits or Sessions in their baseline analysis, as it provides a more accurate and dependable view of conversion, persuasiveness, and ultimately Return on Investment. And to clarify some recent feedback I’ve received, this includes both macro and micro conversions. Measuring unique people with web analytics… posted by Matt Belkin, 11 May 2006 [emphasis mine]

Or more recently,

Before joining ClickTracks, I felt that Unique Visitors were a pretty important, if not the most important, stat to pay attention to. But, I soon learned the error of my thinking. First of all, that metric, depending on which type of reporting you are using (log files vs. java script) tends to be wildly inaccurate (based on IP addresses in log files and cookies in java script).

But, accuracy aside, it is an even more misleading stat, as Jakob Nielsen explains, “Chasing higher unique-visitor counts will undermine your long-term positioning because you’ll design gimmicks rather than build features that bring people back and turn them into devotees and customers.” [emphasis mine]
Unique Visitors Are Not Everything posted by Joy Brazelle – ClearSaleing, 28 January 2009

Or, even more recently, from the head guru of Web Analytics Demystified, Eric T. Peterson:

It is about time that we all agreed that “Unique Visitor” reports coming from census-based technologies [such as web analytics] frequently have no basis in reality. Further, we should all admit that cookie deletion, cookie blocking, multiple computers, multiple devices, etc. have enough potential to distort the numbers as to render the resulting numbers useless when used to quantify the number of human beings visiting a site or property.

Yes, before you grieve on me with your “but they are probably directionally correct” response I agree with you, they probably are, but fundamentally I believe that advertising buyers are at least as interested in the raw numbers as they are the direction they are moving. I say “probably are” because if you’re not taking the IAB’s advice and reconciling census-based data with data derived directly from people, well, you’re never sure if that change in direction is because your audience is changing, technology is changing, or there is a real and substantial increase or decline. [emphasis mine]
Unique Visitors ONLY Come in One Size Eric T. Peterson, Web Analytics Demystified, 3 April 2009

So with this mounting body of evidence against unique visitor metrics and in favor of visit metrics, it looks like I have got some “’splaining” to do. One cannot dispute that something is being measured, whether unique cookies or smudged IP addresses. What is being disputed here is when these measures provide any value or insight. The core of the problem lies not in the metrics themselves but in their use, their interpretation, and how they are reported. To address these challenges we need to go beyond how measures and metrics are defined to how they should be reported and when they provide actionable insights.

The Shock of Reality

First we need to deal with the hyperbole of the above quotes concerning unique visitors. “Wildly inaccurate!” “No basis in reality!” “Render the resulting numbers useless!” Then a growing consensus around visits or sessions as the preferred focus for metrics: “Visits or Sessions [provide] a more accurate and dependable view of conversion, persuasiveness, and ultimately Return on Investment.” Huh?

There is a hypocrisy in these feigned declarations of shock and dismay. To be able to form sessions or distinguish new from returning visits, one needs a method for identifying the visitor. So if visitor counts are wildly inaccurate, with no basis in reality, and otherwise useless, then visit metrics are just as inaccurate and useless.

So I am not sure if the intent was to attack the foundation of web analytics. To be fair, most of these statements were applied as arguments against specific applications of unique visitors: as a KPI [Brazelle], as a sometimes misleading and confusing metric [Khan], or as a reality-based count of actual visitors for IAB advertisement attribution metrics [Peterson]. These will be dealt with later.

Belkin's challenge and conclusions hang out there as an alternate reality that needs to be addressed. If left to stand, then all of the other points are moot. The primary basis for understanding useful behavior in web analytics stems from identifying individuals. Right?

Faults in Reality

As I have discussed before [here], the “reality” available to web analytics from log files is not the same reality that marketers and managers operate within. It is as though the two realities are separated by a great fault line marking a discontinuity between these views. The behavior of individual customers is not directly observed but inferred from the evidence in the logs. This is analogous to how trackers follow footprints and droppings to determine the path and intent of their quarry.

However, the differences between these realities are marked and fundamental enough to leave the process (and industry) open to hyperbolic attacks such as the ones above, and Belkin's in particular. The world observed via HTTP log files is an open system that includes the anomalies of the world wide web as well as the idiosyncrasies of the individuals and spiders that operate on it. On the other hand, marketers and managers operate in an assumed closed-system reality where individuals can be identified and “segmented” into set dimensions of “reality.” One can question which is more real than the other, but the point here is that they are different.

So the primary question that has to be addressed (primarily by vendors) is: how does one present data from an open system in a manner that can be understood by users who operate within closed-system views? One way is to state the fundamental axioms from which a closed-system analysis can be presented and then attempt to be faithful to these assumptions in the processing of the data. These are axioms (assumptions) and not facts because, as can easily be pointed out, tracking cookies and user agents is not entirely the same as tracking individuals.

However, there is another question that can be posed: how well do web analytics logs reflect or correlate to actual human behavior? In this case, we are willing to accept that the data comes from an open system and reflects individual behaviors. What we focus on is the signal-to-noise ratio: does the correlated behavior (the trend) rise above the noise (the confusion) of arbitrary and anomalous actions by either the technology or the individuals? Fortunately there are techniques and algorithms that can deal with this without necessarily assuming closed-system actions or linear cause and effect. There are numerous use cases in search bid optimization, landing page optimization, behavioral targeting, and user experience optimization that have more than validated the value of the data and its ability to support data-based decisions and incremental optimization. But even in these cases, there needs to be a way of bringing closed-system linear explanations to aspects of the open-system non-linear process that is being “optimized.”

Can Wildly Be Quantified?

So if a closed-system view cannot be achieved, how far off the mark is the web analytics view? Can “wildly” be quantified and “useless” given a measured limit? The answer is yes! If one is concerned with the number of browsers not allowing cookies to be set, there are techniques to test for this situation at the server and set a flag on the request indicating that cookies have been disabled. These requests can be treated as IP-address/user-agent visitors and separated from cookie visitors. The resulting ratio of the two counts indicates the degree of concern. Another useful ratio is the number of visitors that allow session cookies over permanent cookies, to verify whether session metrics might be stronger than visitor metrics. Don't be surprised if these numbers are much less than the hyperbole would imply, for if the hype were true it is unlikely that any analysis would be useful.
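To make that ratio concrete, here is a minimal sketch in Python. The record format and the `id_method` field are hypothetical, standing in for whatever flag your collection tier sets when the cookie test fails:

```python
from collections import Counter

def identification_ratios(log_records):
    """Tally how visitors were identified and compute the share that
    had to fall back to IP-address/user-agent identification.

    Assumes each parsed record carries an 'id_method' field set by the
    server: 'cookie' when a test cookie round-trips, 'ip_ua' otherwise.
    """
    counts = Counter(r["id_method"] for r in log_records)
    total = sum(counts.values())
    return {
        "cookie": counts["cookie"],
        "ip_ua": counts["ip_ua"],
        "ip_ua_share": counts["ip_ua"] / total if total else 0.0,
    }

# Three cookie visitors and one cookie-less: a 25% fallback share.
sample = [{"id_method": "cookie"}] * 3 + [{"id_method": "ip_ua"}]
print(identification_ratios(sample)["ip_ua_share"])  # 0.25
```

If that share sits in the low single digits, “wildly inaccurate” starts to look like hyperbole; if it is large, you know exactly how much to hedge.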

If you are concerned about cookie deletion and churn, then measure it. One way, if you have registered visitors to your site, is to measure cookie-id churn for these visitors; it is likely to be similar to that of overall visitors to your site. Better yet, use the registered visitor id to link together the anonymous visitor ids. Another interesting metric is the number of visitors that delete their permanent cookies during a visitor session. One would not expect a big number, but what would happen if you offer a browser-cleaning service or software? Or what happens with private browser sessions?
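A sketch of that churn measurement, assuming you can join login events to the anonymous cookie id (the pair format here is illustrative):

```python
from collections import defaultdict

def cookie_ids_per_user(events):
    """events: (registered_user_id, cookie_id) pairs observed over a
    reporting period.  Averaging distinct cookie ids per registered
    user gives a churn proxy: ~1.0 means cookies are stable; higher
    values indicate deletion, multiple browsers, or multiple devices.
    """
    seen = defaultdict(set)
    for user_id, cookie_id in events:
        seen[user_id].add(cookie_id)
    return sum(len(s) for s in seen.values()) / len(seen) if seen else 0.0

# User 'u1' shows up under two cookie ids, 'u2' under one: 1.5 average.
events = [("u1", "c1"), ("u1", "c2"), ("u2", "c3"), ("u1", "c1")]
print(cookie_ids_per_user(events))  # 1.5
```

The same join lets you stitch the anonymous ids back together, which is the “better yet” option above.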

There will always be questions that need to be asked and often will not be answered by a single measure or metric. It is much like postures in yoga: one does not immediately do a back-bend twist but first goes through several transitions that set arms and legs to stabilize and constrain the posture, so the twist is isolated and the focus of the movement. The focus of a report may be a metric, but there are other measures and metrics that must also be used to isolate and qualify the insights that can be derived from the primary metric. The analyst has to determine whether all this should be presented to a user in a report or presented as more qualitative indicators of efficacy and confidence.

Feynman Diagrams and Web Analytics

If we can quantify “wildly,” what can be done about “useless”? How do we answer Belkin's “15 Reasons why all Unique Visitors are not created equal”? Or Peterson's particularly alarming statement:

… you’re never sure if that change in direction [of unique visitor counts] is because your audience is changing, technology is changing, or there is a real and substantial increase or decline. [emphasis mine]
Unique Visitors ONLY Come in One Size Eric T. Peterson, Web Analytics Demystified, 3 April 2009

There is nothing like a good audit to put these concerns into perspective. An audit is where log files from two different sources are compared and reconciled. The exercise exposes a great deal of what audience and technology are doing together to generate the raw data upon which much of web analytics is built. In the early years, when customers were transitioning from server-side log analytics to client-side Web Analytics as a Service (WAAS) offerings (Coremetrics, Keylime, and later WSS and WebTrends), performing audits became rather routine to explain the difference between server- and client-side counts. Later, more formal audits were performed by the IAB to reconcile why DoubleClick seemed an apt name for the differences between ad-system counts and customer counts of ad clicks. I performed many of these audits, including IAB audits for our customers, and as recently as 2007 participated in the process of converting Yahoo! to client-side counting of ad impressions in compliance with the IAB standards that audits years earlier had helped to establish.

In the case of Yahoo! there were some properties that had a significant drop in ad impression counts, directly affecting near-term revenue. Longer term, however, the value of these properties increased because traffic quality, and customer perception of value, improved. Even though marketers and managers may eschew open systems, finance and business thrive on open non-linear systems, attempting to exploit fluctuations to their competitive benefit and eventually determine the true value of a lead or investment.

Audits, though at times tedious, are never boring, because you are in the middle of what is actually happening on the web. A good place to start is to separate out the log entries in each log that seem to be following the HTTP rules and protocols, where the landing-URL, referrer-URL, and user-agent fields are complete and filled in correctly. This gives you an idea of the users and robots that are playing by the rules. What are left are those that are not playing by the rules, and you must determine what rules they play by. You might find love notes in the referrer URLs; obscene messages in the user-agent field; IP addresses that change for the same cookie value; or cookie values that change for the same IP address. When comparing logs you may find IP blizzards in one log that do not appear in the other, or clicks in client logs that have no counterpart in the server log. All these variations are possible because, in an open system, anything is possible, even things you would not think of.
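That first sorting pass can be sketched as a simple partition. The field names and the definition of “playing by the rules” below are assumptions; a real audit would apply your own protocol checks:

```python
def partition_by_conformance(entries):
    """Split parsed log entries into those whose landing URL, referrer,
    and user-agent fields look complete and well formed, versus the rest.
    (Empty referrers are treated as non-conforming here purely for the
    sake of the audit pass; direct traffic would need its own bucket.)
    """
    def conforms(e):
        return (e.get("url", "").startswith(("http://", "https://"))
                and e.get("referrer", "").startswith(("http://", "https://"))
                and bool(e.get("user_agent", "").strip()))
    good = [e for e in entries if conforms(e)]
    odd = [e for e in entries if not conforms(e)]
    return good, odd

entries = [
    {"url": "http://a.com/", "referrer": "http://b.com/",
     "user_agent": "Mozilla/5.0"},
    {"url": "http://a.com/", "referrer": "i luv you", "user_agent": ""},
]
good, odd = partition_by_conformance(entries)
print(len(good), len(odd))  # 1 1
```

The interesting work, of course, is in the `odd` pile.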

Now to explain all this, one develops lists of all the possible things that could happen. This is similar to performing path integrals in quantum field theory. OK, so for many readers this is not a useful frame of reference, but the principle is not that difficult. Feynman, who developed the path-integral formulation, devised a technique for keeping track of all the possible particle paths that needed to be integrated. These became known as Feynman diagrams, which capture what happens as a particle moves along and interacts with other particles, including itself.

In diagnosing what happens between events recorded by different logs, one has to develop all the possible paths between each event. These begin to look a lot like Feynman diagrams, particularly when there are near-infinite permutations of virtual paths between events. Most go from point A to point B without any surprises, and the sequence between user agents and the appropriate servers is consistent throughout a session. Others will not make sense and require digging to find the reason: a wayward server that strips queries from URLs; a caching proxy that does not follow protocol and notify the original server of the request; a rogue scraper that has not properly introduced itself as a spider user agent; and on and on. What is gleaned from an audit is the prevalence of normal expected activity and how your system responds to unexpected activity.

In the end one has a view of what is actually happening and perspective on the likelihood of the hundreds if not thousands of ways that things can go wrong. From this exercise one can make an assessment of the value of the data and whether or not it is useless. Performing audits has value in itself, but establishing processes that integrate and perform continuing audits is even more valuable. This is not an exercise, perhaps, for a VP of Marketing, but for a Director of Marketing Research, or, maybe more so, a highly paid pundit/consultant.

If you measure it, they will come … and optimize it.

Now to Brazelle's concern:

… as Jakob Nielsen explains, “Chasing higher unique-visitor counts will undermine your long-term positioning because you’ll design gimmicks rather than build features that bring people back and turn them into devotees and customers.” [emphasis mine]
Unique Visitors Are Not Everything posted by Joy Brazelle – ClearSaleing, 28 January 2009

To many, forming business rules in analytics can be like trying to make a deal with the Devil: no matter how hard you try to get it stated to your benefit, it always backfires. Of course, any endeavor requires asking the right question. If you get the question right, then the Nobel prize is yours.

Defining the right metrics and verifying that the metrics lead to the appropriate insights and decisions is not automatic and requires diligent effort. One approach, discussed [here], is to develop a business funnel that captures the progress of your customers from first introduction through all subsequent interactions. All sales forces and most businesses define these.

In web analytics, one must not only form a funnel of the significant milestones in the customer's path to conversion, but must verify that the funnel is sensitive to changes in visitor behavior. For example, in the case of optimizing unique visitors, one might find the front of the funnel, “new visitor” sessions, increase, but with no further progression to conversion, where the money is. On the other hand, you might find, after verifying your funnel and attempting to improve user transitions throughout it, that the site does handle an increase in new visitors and indeed improves revenue in the end.
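A toy funnel illustrates the check. The stage names and counts below are invented; the point is that a rising front of the funnel means little unless the stage-to-stage transition rates hold up:

```python
# Hypothetical weekly funnel counts, front to back.
funnel = [
    ("new_visitor_sessions", 1200),
    ("product_views", 400),
    ("add_to_cart", 120),
    ("purchases", 30),
]

# Transition rate into each stage from the one before it.  If
# new-visitor sessions grow but the first rate drops, the extra
# traffic is not progressing toward conversion.
rates = [(b_name, round(b / a, 3))
         for (a_name, a), (b_name, b) in zip(funnel, funnel[1:])]
print(rates)
# [('product_views', 0.333), ('add_to_cart', 0.3), ('purchases', 0.25)]
```

Tracking these rates week over week, alongside the raw front-of-funnel count, is what makes the funnel sensitive to behavior change rather than just traffic volume.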

Now this is a rather obvious case of metric abuse, but I have seen cases where large, savvy companies become obsessed with one metric or high-level KPI and cut off any ability to improve and optimize the processes that feed that metric, because intermediate milestone returns have no value against ROI. The point here is that the metric is not evil, but how the metric is used can be. More generally, it is how the business processes are measured relative to a set of metrics, and how closely these metrics reflect the real operations of the business. The business that has the most correct model, verified by data, WINS.

Choice of Realities

Now concerning the IAB Audience Reach Guidelines that are the topic of Peterson's comment:

It is about time that we all agreed that “Unique Visitor” reports coming from census-based technologies [such as web analytics] frequently have no basis in reality. Further, we should all admit that cookie deletion, cookie blocking, multiple computers, multiple devices, etc. have enough potential to distort the numbers as to render the resulting numbers useless when used to quantify the number of human beings visiting a site or property.
Unique Visitors ONLY Come in One Size Eric T. Peterson, Web Analytics Demystified, 3 April 2009

First, my reading of the IAB guidelines does not call for us to admit that any unique visitor report “has no basis in reality.” What the guidelines suggest are minimum criteria, including client-side counting, for ensuring that audience counts are “at least partially traceable to information obtained directly from people” [section 1.2.4]. Yes, one would like a reality where only real individuals are counted, but when is such a reality ever obtainable in marketing?

Do you really believe that 3.7 million viewers watch Fox News every night at 8:00 PM? Or that Sarah Palin's book “Going Rogue” has been at the top of the best-seller list for 30 weeks? Or that the Rasmussen Daily Tracker has President Obama at 30% approval? Does anyone believe that these numbers cannot be skewed or gamed? Nielsen Ratings are measures of reality because everyone has accepted the ratings as reality, not because they reflect reality.

So it comes down to agreement on a reality, and the IAB has provided guidelines for certifying a reality. From the discussion above, there is no reason to discount unique visitor counts (even with all of the qualifications) as not representative of a meaningful reality, much less claim “no basis in reality.” The truth of the matter is that every online publisher is or will be using unique visitor counts to report audience reach, and will segment these same visitors to provide breakdowns of demographics and behavior. What the IAB guidelines advocate is that the processes for collecting the data for these counts comply as closely as possible with the second axiom of web analytics: the client source of the data is uniquely identified.

Presentation of Metrics as Insights

Now at last Sameer Khan’s original concern can be addressed:

Most marketers and analyst are too concerned about the best metrics to focus on [how] to get actionable insights. Obsession for metrics does not always guarantee results. It is equally important to exclude the worst metrics from your analysis. Yes, you heard it right, these metrics can suck your time and lead you nowhere.
Top 3 Worst Web Analytics Metrics & Reports posted by Sameer, 9 April 2010

Instead of excluding metrics, let us focus on how metrics (including these “worst” metrics) can be used to provide insights.

One problem core to all these discussions is that the reports generated out of the box by most web analytics providers (WAPs) actually provide very little in the way of insight. Almost all generate simple reports with one or two measures presented in each report. These require the user to view several reports, often attempting to compute numbers in their head to get a composite metric that is useful, if not insightful.

Since each report is derived by different methods, confusion will arise when it is difficult to get numbers to add up. One example, for reporting visits and visitors, is discussed [here]. This is a very cheap and quick way for vendors to generate reports, but it requires the user to do a lot of work, including downloading data into spreadsheets, to get actionable reports that combine a number of measures to compute useful metrics. Sameer presents an excellent example for reporting bounce rates.

Another approach generates OLAP cubes or data marts based on facts centered on unique pages, visits, or visitors. These are generated to support quick queries where the user can select a number of dimensions and measures and add rules to compute metrics, similar to what is done in spreadsheets. These work fine for generating “custom” reports within the dimensions of the data mart, but discrepancies can arise when comparing results across data marts, such as comparing session-based measures with visitor-based measures. Many times the appropriate set of measures and metrics is not available in one place.

Now, one of the points presented by Sameer concerning unique visitor measures is that the counts vary depending upon the reporting period. For example, unique visitors for each hour in an hourly report will not add up to the unique visitors in a day report. Why? Because unique visitors may return several times during the same day. We can understand how often unique visitors return by dividing the number of visits by the number of unique visitors during the day. Even with this number, it is difficult to reconcile the hourly totals with the daily total, since some returning visitors will return within the same hour. One has to accept that the numbers in each report are correct even though we cannot derive them from other reports.
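That return-rate metric is just visits over distinct visitors; a minimal sketch:

```python
def visits_per_visitor(visit_log):
    """visit_log: one visitor id per visit during the period.
    Visits divided by distinct visitors gives the average number of
    visits each unique visitor made -- how often visitors return."""
    return len(visit_log) / len(set(visit_log))

# Visitor 'a' came twice and 'b' once: 3 visits over 2 uniques.
print(visits_per_visitor(["a", "a", "b"]))  # 1.5
```

A value near 1.0 means almost nobody returns within the period; the further above 1.0, the more the hourly counts will overshoot the daily unique count.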

The alternative is a visit report where visits are divided between visits by new visitors and those by returning visitors. This does not completely solve the problem, though it is nice to have a report where two numbers do add up to the total. (Heartwarming, actually.) Since a visitor can be new only once in their lifetime, the daily new visitor counts do indeed add up to the total new visits for the week. The same holds for the returning visits, but in this case there is no indication of the number of actual visitors to the site.

Some will argue that we never know this, but that will be dealt with a little later. The point here is that even the designation of new and returning depends upon identifying unique visitors, to know that they are returning; otherwise every visit is new, and every page view a new visit. This is often the case for visitors that have disabled their cookies. Hence the motivation for identifying visitors even if they don't allow cookies to be set.

If you want to know the number of unique visitors in a month, look at the monthly unique visitor report. If you want to know unique visitors for a specific day, go to the day report for that day. However, don't expect counts from the daily reports to add up to the number in the monthly report.

Unique visitor counts do not form a group under addition! That is the mathematical, or combinatorial, nature of uniqueness. It is the same issue that separates Fermi–Dirac from Bose–Einstein statistics, or distinguishable from indistinguishable states: differences arise when dealing with distinguishable vs. indistinguishable entities. One measure is not worse than another; they are just different, with different properties.
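The non-additivity is easy to demonstrate with sets: summing hourly unique counts double-counts anyone who returned, while the daily figure is a set union.

```python
hour_09 = {"a", "b", "c"}          # unique visitor ids seen 9-10am
hour_10 = {"b", "c", "d"}          # 'b' and 'c' returned within the day

sum_of_hourly = len(hour_09) + len(hour_10)   # 6: counts 'b', 'c' twice
daily_unique = len(hour_09 | hour_10)         # 4: union, not addition
print(sum_of_hourly, daily_unique)  # 6 4
```

Visits, by contrast, are plain events: every visit counts once, so visit totals add up across any partition of time.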

From these “bad” measures, how does one get “good” insights? Given that the daily unique counts will not add up to the weekly unique counts, perhaps the reason the numbers don't add up is itself a useful piece of information. In fact, between these two measures (and similar pairings: hour to day, week to month) one gets a metric of the general behavior of returning visitors: how often and when do visitors return. Metrics are distributions more than single numbers, represented as pie charts or bar graphs. Here we are looking at the metric distributed over hours or days.

Does the hourly distribution change from day to day or remain the same? The same question applies to the daily distribution from week to week. By segmenting the unique visitors by marketing channel, how do the distributions differ? If visitors return periodically because of subscriptions, one would expect returns to be synced to new articles published on a blog, and the metric to follow the publishing rate for the site.

What would happen if the visitor returned through a search engine, a web portal like the Yahoo! home page, or buzz generated in social media? What happens if there is an unexpected flood of new visitors to the site, or you have initiated a new email campaign to current users?

Answering and understanding these different patterns becomes part of understanding normal behavior on your site. What happens when the metric suddenly jumps or drops, or the distributions are very different from last week or last month? Drilling down to find the reason will provide insight.
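Detecting such a jump can be as simple as comparing today's value against its recent history. This z-score-style check is a sketch; the three-standard-deviation threshold is an arbitrary starting point, not a recommendation:

```python
import statistics

def is_novel(history, today, threshold=3.0):
    """Flag a metric value that departs from its recent history by more
    than `threshold` standard deviations -- a cue to drill down."""
    mean = statistics.mean(history)
    sd = statistics.stdev(history)
    return sd > 0 and abs(today - mean) > threshold * sd

# A week of visits-per-visitor values hovering around 1.5.
daily_return_ratio = [1.4, 1.5, 1.45, 1.55, 1.5, 1.48, 1.52]
print(is_novel(daily_return_ratio, 1.51))  # False: within normal range
print(is_novel(daily_return_ratio, 2.4))   # True: worth drilling into
```

The flag does not explain anything by itself; it only tells you which day's distributions to pull apart by segment and channel.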

Guidelines for Good Metrics and Insightful Reports

Here are some guidelines for determining good metrics and how to construct actionable insightful reports.

  1. Raw counts or measures are seldom good or insightful unless you are making submissions to the Guinness Book of Records.
  2. Metrics are normalizations of measures that capture an enduring characteristic of the web site or visitors and are typically presented as distributions over time.
  3. A good metric is one that changes when either the site or visitor behavior changes, and otherwise does not change.
  4. A better metric is correlated to a Key Performance Indicator and can be verified to represent the normal business processes linked to the KPI.
  5. The best metric is not only correlated to KPIs but verified to be a precursor or predictor of KPI performance, such that changes in behavior affecting the metric translate directly to later changes in KPI performance.
  6. Actionable insights address a specific business question that allows the user to make informed decisions.
  7. Good insights are not generated from single metrics but by detecting novelty in sets of metrics from the normal characteristics of the site or visitors.
  8. Better insights allow the user to drill down into the data to determine the cause of novelty, and present the data relevant to the issues to be addressed in a form that allows the user to quickly understand and diagnose the cause.
  9. Best insights provide recommendations for action along with the controls to initiate or perform the necessary action.
  10. Actionable insightful reports address a specific business need with only the information relevant to understanding the need, and then provide the means and support to initiate actions based upon that information.

About Timothy Kraft

An accomplished and innovative Web Analytics Professional and Business Intelligence Strategist. Over 10 years experience in development and
This entry was posted in Fundamentals, Measures & Metrics, Methodology.

3 Responses to From Worst to Best: How to Report Metrics & Measures

  1. Uri says:

    Valid and reliable metrics monitor the progress of the project. Uri

  2. Pingback: Voice of Customer Analysis « Kraft T Thoughts

  3. Pingback: Visits are a Wet Blanket | Mind Before You Mine
