## Part 1: Basic Survival Skills

If you have not yet discovered that segments are the lifeblood of analytics, then consider the following a quick review of Web Analytics 101, just in case you slept through the whole course:

- Measures seldom provide useful information without *normalization*
- Metrics that normalize measures are meaningless without *context*
- Trends that provide context are pointless without *reasons*
- Reports that provide reasons are useless if they don't support *actions*

This summarizes, as briefly and tersely as possible, the good, bad and ugly of web analytics. These same "rules" apply to any form of analytics, so the following applies to almost any form of data analysis. These statements make perfect sense once one understands what is meant by *normalization*, *context*, *reasons* and *actions*, and how *segmentation* is a vital tool in each of these analytic processes.

## Central Concepts

*Normalization* is the act of applying a metric or scale to raw counts or measures. This is typically done by forming a ratio of two measures, taken over the same period, that may be independent but share the same proportional scale. However, there are many different ways of normalizing data that can be considered in different contexts.

*Context* is any additional information that helps determine whether a metric value is important or not. Usually this is done by comparing what is measured to what is expected, in which case the metric characterizes novelty or deviation from the norm. Technically the terms for importance are *significance* and *confidence*, but *impact* and *priority* apply as well.

If a metric deviates significantly from what is expected then one must find a *reason* for the difference and either modify the model that predicted the change (adapt) or effect changes in the environment that will bring measures in line with expectations (control). If the reason does not support either of these *actions* then it is useless from the point of aiding the business.

In each of these processes the ability to segment measures by site and visitor characteristics plays a vital and core role in converting useless, meaningless, pointless data into valuable, meaningful, actionable insights. So in this sense we either learn how this is done or pack up and go home.

## How Segments Are Used in Analysis

Before we look at how to convert useless measures and metrics into actionable insights, let me explain how segments are used in normalizing measures, providing context for metrics, understanding the reasons for trends, and highlighting insights that are actionable. To illustrate the steps we will look at an example metric that should be familiar to everyone and has been much maligned and debated lately — *bounce rate*.

### Step 1: Determine what is being measured.

Seldom is what we can measure what we are really interested in. Quantities that cannot be observed directly are called *latent parameters* and must be inferred from observed, or *manifest*, parameters.

The latent behavior behind a bounce is the use case where a visitor, seeing immediately that the site is not relevant to their task and not what they expected, leaves the site never to return. This has to be distinguished from other latent behaviors where the visitor is satisfied with the visit even though they may have viewed only one page.

This includes visitors who came to the site for specific information, found what they needed and left, likely to return later. It also includes visitors who spend longer than 30 minutes on a page, so that the session times out and the visitor is later picked up by a new session as they continue browsing the site. Finally, there is the possibility that the page did not load, so that the visitor had no choice but to back out of the site (sometimes called the *zero page view session*).

All these behaviors can be confused as the same unless we can find a way to separate out and distinguish them from the measurements. In a word – segment.

### Step 2: Derive the measures for the latent behaviors.

The measure that is fundamental to a bounce is the count of *single page view (SPV) sessions* for a given period of time called the report period. If we define *session length* as the number of page views by a visitor during a standard session, then plotting the number of sessions at each length will generate a distribution that is near zero at zero, rises to a peak at some number N > 0 page views and trails off toward zero for very long session lengths.

This histogram should very closely approximate a *Poisson distribution*, which for measures that must be non-negative is the expected random-distribution counterpart of the Gaussian normal distribution. So from a measurement perspective it is not unusual to have a large number of single page view sessions, especially if the peak, which is near the mean of the Poisson, is close to one. The distribution actually predicts a non-zero number of zero-length sessions, which we may not detect but which could be associated with page views that never complete loading, or which may be folded into the single page view session count (in the case where onload events are not measured).
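The shape of this distribution is easy to sketch. Below is a minimal Python illustration; the mean session length of 3 page views is an assumption for illustration, not a measured value:

```python
import math

def poisson_pmf(k, mean):
    """Probability of a session of exactly k page views under a Poisson model."""
    return mean ** k * math.exp(-mean) / math.factorial(k)

# Hypothetical mean session length of 3 page views.
mean_length = 3
expected = {k: poisson_pmf(k, mean_length) for k in range(10)}

# Near zero at zero, peaks near the mean, trails off for long sessions --
# including a non-zero share of zero page view sessions we may never detect.
print(f"P(0 pages) = {expected[0]:.4f}")
print(f"P(1 page)  = {expected[1]:.4f}")
```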

So how do we take this measure and apply it to the latent behavior?

First we need to separate sessions that began with internal referrals from those that had external referring domains. The reason is that the only way a session can begin with an internal referrer is if the visitor had previously visited the site from an external referrer and the session was prematurely cut short. Though these are single page view sessions, they cannot by definition be considered actual bounces.

The distribution of session length for this segment provides a *control group* indicating how truncation errors caused by arbitrarily terminating sessions affect session length, including false declarations of bounces. This is more an indication of how quickly visitors consume content during a standard session, where single page view sessions indicate points of pausing or interruption, but not bounces.

For externally initiated sessions we would like to further segment by visitor state – new and returning visitors. Again, the latter segment does not meet the criteria for a bounce: clearly, returning visitors have returned. So this gives another group similar to internally initiated sessions. In this case the prevalent behavior may be visitors frequently returning to the site to consume specific information or using the site as a reference source. Determining whether this is true would require other measures, such as return frequency or time between sessions for repeat visitors.

For the population of single page view sessions by new first-time visitors, the resulting session length distribution will change over time as new visitors either go to the next page or return for another session. So whatever is measured is an upper limit, or overestimate, of the number of actual bounces. These have been defined as *hard bounces*, whereas bounces by returning visitors are *soft bounces*, as recently discussed by Kevin Willeitner at Omniture Industry Insights.

So now we have four measures of single page view sessions, as well as the mean session length, for the general population and three different visitor segments representing more or less three different behaviors – internal browsing behavior, repeat visitor behavior, and new visitor behavior. Keep in mind that these measures and distributions are the result of observing behavior through our standard session lens, which truncates sessions after 30 minutes of inactivity. The internal browsing behavior gives an indication of the aliasing in the other measures.
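The partition described above can be sketched as a simple classification of session records. This is a minimal sketch; the field names (`referrer_type`, `visitor_state`, `page_views`) are assumptions for illustration, not any particular vendor's schema:

```python
# Hypothetical session records; real data would come from a web analytics feed.
sessions = [
    {"referrer_type": "internal", "visitor_state": "returning", "page_views": 1},
    {"referrer_type": "external", "visitor_state": "new",       "page_views": 1},
    {"referrer_type": "external", "visitor_state": "returning", "page_views": 4},
    {"referrer_type": "external", "visitor_state": "new",       "page_views": 7},
]

def segment(session):
    """Assign a session to one of the three behavioral segments."""
    if session["referrer_type"] == "internal":
        return "internal"    # truncated continuation of a prior session, not a bounce
    if session["visitor_state"] == "returning":
        return "returning"   # at most a soft-bounce candidate
    return "new"             # only new external sessions can hard bounce

# Count single page view (SPV) sessions and totals per segment.
spv_counts, totals = {}, {}
for s in sessions:
    seg = segment(s)
    totals[seg] = totals.get(seg, 0) + 1
    if s["page_views"] == 1:
        spv_counts[seg] = spv_counts.get(seg, 0) + 1
```

From `spv_counts` and `totals` one can compute a separate SPV rate and session length distribution per segment, which is exactly the separation Step 2 calls for.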

### Step 3: Normalize the measures to form a metric.

The counts by themselves are not very informative without an understanding of what is normal. For example, I could give you bounce counts of 10, 100 and 1000, where the latter seems high. But if I scale these numbers by the maximum number of bounces possible (the total number of sessions), where 1 indicates that all visits bounced and 0 that none did, then normalized bounce counts of 1, .1 and .001 respectively for the same raw counts present an entirely different story.

Bounce rate is defined precisely this way by the WAA – number of single page view sessions / total number of sessions within the same reporting period. This scales the rate to an interval between 0.0 and 1.0 as well as provides a metric that is independent of the number of sessions. In fact it is assumed that the bounce rate is intrinsic to what is being viewed (page content) or expectations set by the channel (relevance) and independent of traffic. In this sense, the rate is a likelihood that a visitor will bounce from any given content or marketing channel.
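The WAA definition is a one-line ratio; a small sketch applying it to the raw counts from the example above:

```python
def bounce_rate(single_page_sessions, total_sessions):
    """WAA definition: SPV sessions / total sessions in the same report period."""
    if total_sessions == 0:
        return 0.0
    return single_page_sessions / total_sessions

# The same raw counts (10, 100, 1000) tell very different stories
# once normalized by the total number of sessions in the period.
print(bounce_rate(10, 10))           # 1.0   -- every visit bounced
print(bounce_rate(100, 1000))        # 0.1
print(bounce_rate(1000, 1_000_000))  # 0.001
```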

There are however cases where this assumption is not true. Since I am illustrating the notion of normalization and that there are alternative methods for normalizing measures, I will take a different tack in measuring bounce rate than the accepted standard above.

### Step 4: Isolate the latent behavior from the observed metrics.

Even if we compute only the hard bounce rate – new visitors with single page view sessions / total number of new visitor sessions – we have not really captured the latent behavior. With new visitors we have only the one visit, and the visitor has the potential of continuing the session as an internal-referral session or returning later as a returning visitor.

One way to simplify the calculation is to wait a period of time to generate the report such that the likelihood of continuing or returning is very small. What remains is the bounce rate.

However, we have more information that has been ignored up to this point – the distributions of session length for internal, returning and new visitors, characterized by their mean session lengths. Conveniently, all the statistical properties – standard deviation, variance, skew – are computed directly from this mean. We must consider how different these distributions are among the segments.

One expects that the distributions will be Poisson probability density functions representing visits of random session length. In other words, visitors without any preconditions act as Poisson-distributed random number generators for session length. Put yet another way, we assume we cannot tell the difference between the session lengths of visitors and random numbers generated by a Poisson PDF.

If the distribution is not Poisson then either there are several different distributions that need to be separated out through segmentation, or the actual behavior being measured is not random but strongly correlated to a latent non random behavior.

### Step 5: Provide the complete context for the metric.

Let us consider the case where new visitors are particularly curious and go through a number of page views on their first visit to find information, whereas customers already familiar with the site have much shorter sessions. The expected distribution would be a Poisson distribution for the given mean lengths. The *Probability Density Function* is given by

P( x | mean ) = mean^x · e^(−mean) / x!

where x is a given session length conditioned by the mean. Now by subtracting the expected distribution from the observed distribution (assuming the peak is from normal visitor behavior), one finds another distribution. This characterizes a different behavior from the normal expected behavior. This distribution mutates over time into a bounce behavior. The distribution indicates visitors that consume much less content than the average visitor and leave without returning.
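The subtraction described above can be computed directly. The observed histogram and mean below are invented for illustration; a positive residual at length 1 is the candidate bounce population:

```python
import math

def poisson_pmf(k, mean):
    """Expected share of sessions of length k under a Poisson model."""
    return mean ** k * math.exp(-mean) / math.factorial(k)

# Hypothetical observed session-length histogram (counts per length),
# with an excess of single page view sessions over the Poisson model.
observed = {1: 400, 2: 270, 3: 180, 4: 90, 5: 40, 6: 15}
total = sum(observed.values())
mean_length = 3  # assumed normal-visitor mean, for illustration only

# Residual = observed share minus the share the Poisson model predicts.
residual = {k: observed[k] / total - poisson_pmf(k, mean_length)
            for k in observed}

# A positive residual at k = 1 flags more one-page sessions than
# normal browsing behavior alone would produce.
print(f"residual at length 1: {residual[1]:+.4f}")
```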

If the average session length is much shorter and near 1, then it will be difficult if not impossible to distinguish normal and bounce behaviors, even if length-one sessions are numerous. In fact, the bounce behavior is indistinguishable from a session distribution with mean 1, which itself has non-zero counts for session lengths greater than 1. This is the classic resolution problem, where signal and noise are characterized by variances equal to 1 and the mean respectively.

With the metric taken as the normalized bounce rate for hard bounces, there are significant qualifications to the metric that depend upon the mean session length. The following table gives the expected normal bounce rate and cumulative probability for different mean session lengths.

Expected rates for zero and single page view sessions for a given mean (peak) session length:

Mean | Normal session length rate for 0 | Normal session length rate for 1 | Cumulative probability < 2
---|---|---|---
1 | .3679 | .3679 | .7358
2 | .1353 | .2707 | .4060
3 | .0498 | .1494 | .1992
6 | .00248 | .01487 | .01735
9 | .00012 | .00111 | .00123
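The entries in a table like this follow directly from the Poisson PDF; a short sketch to compute the expected rates for any mean session length:

```python
import math

def poisson_pmf(k, mean):
    """P(session length = k) under a Poisson model with the given mean."""
    return mean ** k * math.exp(-mean) / math.factorial(k)

# Expected rates for zero and single page view sessions, plus the
# cumulative probability of a session shorter than 2 page views.
for mean in (1, 2, 3, 6, 9):
    p0 = poisson_pmf(0, mean)
    p1 = poisson_pmf(1, mean)
    print(f"{mean} | {p0:.5f} | {p1:.5f} | {p0 + p1:.5f}")
```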

### Step 6: Determine the reason for significant deviations from the expected.

Now that we have provided a context for the bounce rate that includes the mean session length for a given visitor segment, and can assess when a deviation is significant, we must find a reason when it is. In the absence of a predictive model for bounce rate, assume that the rate, once determined, will remain the same unless there is a direct action (such as changing content) that affects the metric.

If the metric does vary, we must use segmentation to isolate the reasons for the variation. The most likely explanation is that there is a segment of visitors and/or content that change proportions with respect to the general population. Though the bounce rate for this segment is constant, the proportion of the segment varies within the general population. So we are looking for the combination of visitor characteristics and content that “maximizes” the bounce rate and leaves the complement with normal bounce rates (or pure Poisson distributions).

Typically one starts with a rank ordering of page content by bounce rate and then segments visitors by channel characteristics. For example, a page view initiated by a search may have a high bounce rate, since the visitor expects only to find an answer and a single page may provide the needed information. This can be verified by seeing if the soft bounces from repeat visitors have a similar rate for the same content.

Through segmentation analysis one eventually has a set of page content and / or marketplace channels that have the highest propensity to bounce visitors. Comparing hard and soft bounce rates for these segments provides a means of determining if the visitor is bouncing because of expectations set within the channel (hard >> soft) or whether the content normally bounces visitors (hard ~ soft) regardless of channel.
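The hard-versus-soft comparison can be expressed as a simple decision rule. The ratio threshold here is an assumption for illustration, not a standard:

```python
def diagnose(hard_rate, soft_rate, ratio_threshold=2.0):
    """Classify why a segment bounces, per the hard/soft comparison above.

    The 2.0 ratio threshold is an arbitrary illustrative cutoff for
    "hard >> soft"; a real analysis would justify it statistically.
    """
    if soft_rate > 0 and hard_rate / soft_rate >= ratio_threshold:
        return "channel expectations"  # hard >> soft: acquisition sets wrong expectations
    return "content"                   # hard ~ soft: the content itself bounces visitors

print(diagnose(0.60, 0.15))  # channel expectations
print(diagnose(0.30, 0.28))  # content
```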

### Step 7: Report recommendations that are supported by the metrics.

The ideal report is the same as the ideal dashboard: a prioritized list of recommendations for presentation to a decision maker. When a decision maker considers a recommendation by clicking on one in the list, a report opens that includes the recommendation with supporting data as well as the actions, which in an ideal system are integrated into the business process management system so that they can be dispatched from the report. This is what I refer to as an *actionable report*.

When presenting the recommendation the focus is on insight and how the data supports the insight. This is the difference between reporting metrics and illustrating insights. For the bounce metrics we have the bounce rates and session lengths of internal, returning and new visitors as well as expected normal bounce rates and significance. This is the support data for a recommendation.

What is missing is the effect these metrics have on high level business performance indicators. For bounce it is a form of satisfaction (user engagement) indicator derived from visitor behavior – the visitor voted with their “back button”.

To form a mapping between metrics and KPI / KSI goals requires an understanding of the cause and effect relationships and if possible a model derived from deeper financial and statistical analysis. Regardless there must be at least a qualitative understanding of how a decision will affect business objectives. One way this can be done is if the analyst can develop a score or indicator that combines the metrics appropriately to account for weighted impact on top-level performance, and the decision maker comes to trust the indicator. This takes effort and in most cases goes beyond reporting simple ratios to a decider.

## Done – not just yet.

This gives the mechanics of how behaviors are measured and through segmentation isolated to metrics that quantify behavior and illuminate the reasons for behaviors deviating from what is expected. With the appropriate processing and reporting of the metrics one can eventually present recommendations supported by insightful views of the business.

Implicit in this discussion has been the assumption that the insights needed to manage a business are supported by, and attainable from, the measures and metrics collected. Once the insights necessary to manage the business have been defined, one wrangles the data to support them by instrumenting sites, defining data schemas, and integrating data sources to get the required data.

You may have noticed that I am taking a strong data *minder* approach to addressing business questions. A data *miner* would process the data to find correlations and trends that could lead to insights into questions the business did not even think of asking. Both approaches are useful, but minding data takes some additional effort to develop segments that model the business and metrics that are sensitive to business decisions. This will be covered in Part Deux, where we establish the link between behavioral and attitudinal segments and the segments developed by marketing to define the business customer and marketplace.
