Segments, Segments Everywhere

“But not a one I can use.”

So there I am, the consummate technologist, sitting across the desk from the consummate marketer. I am explaining a new feature in our recently released web analytic offering, arguing that it is unique in the burgeoning Web Analytics as a Service (WAAS) industry. I used my experience in AI expert systems to construct a working storage for each visitor to a site that could operate in real-time to track and update the visitor’s status as they were “live” on the site. I could then define rules that would continually process the visitor state and fire when a condition became true. Besides initiating events or actions, this could change the visitor state which in turn could fire new rules. These transitions (rule firings) begin to segment the visitors by their actions on-line.

The moment that I said segment, her brow began to furrow. “Those can’t be called segments!” she says. “Marketers have their own way of segmenting visitors by demographics such as age, income, locale, personal preferences. If we call them visitor segments, the primary users of the tool will become confused. We will have to call them something else,” she said, as though this was the last and final answer.

A little perplexed (note to self – figure out how marketers think someday) we tried to agree on an alternate name. “Visitor state transitions from rule firings” seemed technically correct but would cause customers to unnecessarily stop and think for a moment about what meaning that would have for them. It had to be a term they would get immediately if not understand completely. In the end we agreed on the term “conversion” as long as the tool could attribute conversions to unique visitors and report them as behavior conversion groups. The name eventually caught on, but I am not so sure about the “understand completely” part. In the end conversions are both events and visit / visitor segments.

This is likely a conversation that still takes place today between marketing and analytics. Indeed, of all the things in web analytics, segments and the different ways they are used are the most confusing aspect. Everyone uses the term in different disciplines assuming that the other understands exactly what she or he means. They don’t!

So the gut reaction of my intrepid marketer all those years ago was essentially correct. Different segments must have different names. The methodology must relate all these names to provide an understanding of the different dimensions of a person as related to the different roles the person plays – sometimes the person is a demographic to be targeted and others a behavior to be observed and tracked. Though one can argue whether or not a particular segment is useful, one can quickly become the lost mariner on the analytic sea unless we can see how they all work together. Let’s see if we can work this out now with the perspective of 12 years and having picked up in that time some marketing and business skills both as an individual and as a community.

UPDATE: 05/26/2010 – For completeness I have added a section addressing advanced segmentation by Google Analytics and classification by Omniture.

Marketing Segments

The marketing segments that were the concern of my marketer define attributes of an actual person or household. The gender, age, income, education, profession and residence are grouped under the term demographics and become available to web analytics when a visitor registers on site and provides demographic information during registration. These forms of segments must be applied retroactively to the visitor’s activity before the time of registration.

Demographic segment reports should all be adjusted, not just the reports that follow registration. Some confusion may arise when reviewing older reports, because the values reported continue to change as the demographics of anonymous visitors become known. It is always good practice in these circumstances to present the best information known at any given time and to provide ancillary metrics that in this case characterize the lag time and percentage completeness of the demographics presented in the report.
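As a rough illustration of those ancillary metrics, here is a minimal Python sketch. The record layout (a first_seen date and an optional registered date per visitor) and the function name are assumptions made for the example, not something any vendor provides.

```python
from datetime import date
from statistics import median

# Hypothetical visitor records: when each visitor was first observed and,
# if they ever registered, when the demographics became known.
visitors = [
    {"first_seen": date(2010, 5, 1), "registered": date(2010, 5, 4)},
    {"first_seen": date(2010, 5, 2), "registered": None},
    {"first_seen": date(2010, 5, 2), "registered": date(2010, 5, 3)},
    {"first_seen": date(2010, 5, 3), "registered": None},
]


def demographic_coverage(visitors):
    """Return (completeness, median lag in days) for a set of visitors."""
    lags = [
        (v["registered"] - v["first_seen"]).days
        for v in visitors
        if v["registered"] is not None
    ]
    # Share of visitors in the report whose demographics are known so far.
    completeness = len(lags) / len(visitors) if visitors else 0.0
    # Typical delay between first observation and registration.
    median_lag = median(lags) if lags else None
    return completeness, median_lag


completeness, median_lag = demographic_coverage(visitors)
```

Reporting both numbers next to a demographic breakdown tells the reader how much of the audience is still anonymous and how long the figures are likely to keep shifting.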

Another potential point of confusion – especially in this age of mobile computing – is attempting to equate residence with location determined from IP Addresses or GPS location. The location is an observed aspect of the visitor behavior. With GPS the location actually moves with the individual! In general one should be wary of equating web analytic measures with marketing demographics. Demographics deals with what is known about a person and web analytics deals with what is observed.

Marketing and User Experience Personas

Another use of segmentation in marketing is the development of personas. Through marketing or user experience research that includes surveys, in-depth interviews, and laboratory experiments, the customers are grouped into classes, and for each class a detailed representative user profile called a persona is defined. These personas have names, jobs, and hobbies along with preferences and needs. The intention is to develop complete persons that product designers and developers as well as content authors can identify with and have empathy for. By addressing the needs and preferences of these “persons”, the product will address most of the users of the product.

Personas model the market place and drive scenarios for product design, use cases for user experience, and stories for web development. They are the detailed descriptions of the audience for creative designers and content authors. They are a key aspect of developing a web site, but individuals that would be grouped within a persona are not readily identifiable when they come to the web site. So mapping personas to parameters that can be observed during visits is a major aspect of the targeting problem within analytics. This will be revisited later, but for now it is sufficient to say personas are different from segments derived from web analytics.

Marketing Channels

Another important marketing concept related to segments is channel. Here there is some congruence between marketing and web analytics. A channel is a designation of how the customer came to the site or store, or of an opportunity to reach new customers external to the site or store. The concept is similar to referrer in web analytics except one must have more than the URL of the referring source.

With channel one must be able to link back to the marketing materials that were used in the channel and to the kind of marketing the individual has been exposed to. Making these links requires both the referring URL and the landing URL. Campaign parameters are added to the landing page URL, and the referring source adds parameters such as search term or source ID, so that together they provide a complete characterization of the channel at the time of coming on site. The URL that is submitted to a marketing channel and that contains the channel parameters is called a tracking URL. The tracking URL is the primary mechanism for communicating information necessary for channel identification.
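To make the mechanics concrete, here is a minimal Python sketch of reading a channel from a landing URL and a referring URL. The parameter names cmp, src, and q and the example URLs are hypothetical; each site and referring source defines its own.

```python
from urllib.parse import urlparse, parse_qs


def identify_channel(landing_url, referrer_url):
    """Combine landing-page and referrer parameters into one channel record."""
    landing_params = parse_qs(urlparse(landing_url).query)
    referrer = urlparse(referrer_url)
    referrer_params = parse_qs(referrer.query)

    def first(params, key):
        # parse_qs returns a list per key; take the first value if present.
        values = params.get(key)
        return values[0] if values else None

    return {
        "campaign": first(landing_params, "cmp"),    # assumed campaign parameter
        "source_id": first(landing_params, "src"),   # assumed source ID parameter
        "referring_host": referrer.netloc or None,
        "search_term": first(referrer_params, "q"),  # assumed search-term parameter
    }


channel = identify_channel(
    "https://www.example.com/landing?cmp=spring_sale&src=newsletter_7",
    "https://search.example.org/results?q=web+analytics",
)
```

Neither URL is sufficient on its own: the landing URL carries what marketing put into the tracking URL, while the referrer carries what the source added.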

The exact same approach can be used to track internal channels among properties of a site. Properties are typically sub-domains of the site’s primary domain. By characterizing internal channels such as the home page, internal search, or promotional ads used on other properties, one gets further information about the marketing the individual has been exposed to. So one can track the effectiveness of internal referrals to a property as a continuation of the external marketing channels. Again this does not come free out of the box but requires planning and discipline. Done correctly, one has a further breakdown of the visitor’s experience on the site.

In the end channels are a powerful tool for both marketing and analytics, but they are not visitor segments. Each channel presents a different audience, but that audience must still be characterized by either demographics or persona resulting from research. In traditional marketing, channels are selected because of the demographics they reach; for online marketing there is, besides demographics, the addition of personality. The users of Yahoo! have a different world view than the techno-savvy that are the Google audience. Apple users may be more insistent on a better user experience than someone who has a begrudging approach to computers and the Internet. On the other hand there may be a little of all these people in all of us, so how do we account for these variations in preferences and personality?

Behavior Segments

As has been said above, Web Analytics is primarily concerned with observing visitor behavior, and all web analytic measurements involve detecting and recording events from an individual. There are times when there are durations, such as page view and session duration, or criteria that involve the customer completing a task over long time intervals, such as browsing 5 web pages, downloading 3 white papers, or completing a request form for more information. In all these cases there are start times and end times that represent events.

The reason why this point is important is that the behavior of a visitor is characterized by a time ordered sequence of events with gaps in the sequence when visitors are off site. However, each time they return to the site is another opportunity to reach and interact with the visitor as they are changing and maturing in their intent and objectives. So the purpose of behavior segmentation is to quickly identify these phases of visitor perception as they move through the business’s customer engagement cycle (aka business funnel).

Conversions as Behavior Segments

To return to the difficulties I had with my head of marketing 11 years ago, how do events in visitor behavior become meaningful segments for customers? In theory every event and the variables associated with the event can be considered a viable segment. This includes the page content as well as the order in which the content is viewed. This is the starting point of path analysis and Customer Experience Management, to be discussed later.

Before going hog-wild mining this data there are more fundamental and immediate questions that need to be addressed by the web analyst. When a visitor has made a purchase or completed a major goal of the web site, what did they do prior to coming to that point? If we had that information, would there be patterns that strongly correlate with completing purchases that would permit more to complete the goal task?

Typically the goal events are easy to identify and become conversions (as in the customer has rung the bell and we have won the purchase or lead). At that time the conversion inherits everything we know about the visitor up to that time. The visitor state continues to change but the conversion state does not; it is set. The visitor always has the conversion associated with it, so that if the visitor “converts” again, the previous conversion along with its state is inherited by the new conversion.
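The inheritance described here can be sketched in a few lines of Python. The class and field names are invented for illustration and do not correspond to any vendor’s data model.

```python
import copy
from dataclasses import dataclass, field


@dataclass
class Conversion:
    name: str
    state: dict   # snapshot of the visitor state, fixed at conversion time
    prior: list   # earlier conversions inherited by this one


@dataclass
class Visitor:
    state: dict = field(default_factory=dict)
    conversions: list = field(default_factory=list)

    def convert(self, name):
        # Copy the state so later changes to the visitor do not alter
        # what this conversion recorded.
        conversion = Conversion(
            name=name,
            state=copy.deepcopy(self.state),
            prior=list(self.conversions),
        )
        self.conversions.append(conversion)
        return conversion


visitor = Visitor()
visitor.state["channel"] = "paid_search"
first = visitor.convert("new_visitor")

visitor.state["channel"] = "email"   # the visitor state keeps changing
second = visitor.convert("purchase")
```

After the second conversion, first.state still shows the paid search channel, and second.prior holds the first conversion with its own snapshot, which is the data a correlation analysis would need.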

Are historical conversions that are inherited by a new conversion drivers (causal factors) in the conversion? Maybe. Maybe not! At least in this scheme the data is available to do the correlation analysis. In truth any event can be declared a conversion and used to group and segment visitors. The criterion for declaring an event to be a conversion is that the event does indeed have a strong correlative link with the goal objective and can act as a predictive precursor within a causal model.

You can’t sell unless they come to your site, therefore a new visitor is an appropriate event to track as a conversion event. If they don’t buy during the first visit then they have to return, maybe more than once, to complete the sale. You can’t sell unless they are willing to put product into a shopping cart and then check out that cart. This sequence of phase transitions as the customer moves through stages of a purchase is called the business sales funnel, visitor life stages, or customer engagement cycle.

We will deal with funnels in a moment, but here a properly defined funnel should reflect the business’s customer engagement cycle and customer value metrics. It must also be sensitive to changes in marketing message and user experience so that modifications to these reflect causal impact on conversion rates. This is not to say that every change will improve conversion rate but that changes will be reflected in the funnel measurements — not only indicating where conversion rates are impacted but also whether or not the phases are indeed causally linked.

I might have a campaign that brings zillions of new visitors to the site with little change in conversions downstream. Time to reevaluate and tune the funnel or look at changes to the site. In either case these are useful activities prompted by the data.

One of the confusions related to the implementation of this approach is the concept of attribution, which is the way visitor state is attributed to a conversion event. In the complete picture of a conversion state, we find not only other conversions but all the channels to which the visitor has been exposed up to that point. There is also the visitor state from the inherited conversions. So we could construct a graph that illustrates changes in visitor state from new to returning and every subsequent visit prior to the current conversion.

This is a lot of data – not an impossible amount of data – but a lot of data for many vendors to maintain. It is also highly normalized data requiring a great deal of computational resources to construct various views for reporting. This is the reason that conversions are expensive and the number limited in many tools. Most vendors have recognized that marketing data from marketing campaigns is important to attribute to a conversion, so at least the last campaign encountered before a conversion is automatically attributed by the tool. As a result conversions can be further segmented by campaign parameters. So we got that going for us. However, if one wants to attribute other visitor (or page) parameters to a conversion event (including other conversions), a great deal more effort is involved.

Each tool has its own methods for handling attribution, and these differences cause confusion about how the various attribution methods can be applied to look further down the channel and conversion stacks of a visitor to understand how campaigns and previous conversions assist in reaching business objectives. This is one of the first places that motivates analysts to construct an in-house solution with a data warehouse, downloading the complete conversion state from the vendor and performing the attribution analysis themselves. One should determine if all the conversion data is collected and retrievable from the vendor even though the reporting from that vendor may truncate to simplify presentation. Something to consider when evaluating vendors.

Advanced Segmentation

In Google Analytics there is Advanced Segmentation that allows the user to select (query) visitor / visit / page profiles by any combination of dimensions and compute relative metrics for the selected populations. This is a very powerful tool for segmentation but has some limitations that one should be aware of if planning to make it a central component of segmentation analysis.

First is the obvious limitation in the number of custom variables (5). It is likely that you will be using a lot of variables for segmenting visitors as this post has already identified a number that are not automatically tracked by vendors. The number can be increased but then there is the implementation limitation that GA shares with many vendors including Omniture – visitor state in cookies.

The visitor state defined by these variables is stored in a cookie and the size of the cookie has an absolute limit of 4K bytes. Though I don’t usually advocate getting into the technical implementation details when defining analytic requirements, this is an exception because of the impact and limitations on collection. The alternative is visitor state held in the collection servers or presentation layer, which can be expensive to implement.

The cookie has the advantage of making the visitor state immediately available for attribution and targeting on subsequent page views, but has the disadvantage of limiting the state data that can be stored in this manner (as well as making visitor state vulnerable to cookie churn). One might also note this approach is covered by a patent that was awarded to WebSideStory and now owned by Omniture. Obviously to store more information, one wants to make the dimension values as compact as possible.

One approach is to generate a genetic encoding of the visitor state where different bits in the genetic code have specific meaning and value. This becomes the visitor’s DNA, so to speak, and captures all the segments of the visitor in a form that can be acted upon during run-time, such as targeting, without necessarily decoding the genetic code!

This is because patterns in visitor behavior that strongly correlate to subsequent conversions are captured in the genetic encoding. This is the basic premise of genetic algorithms and their ability to optimize outcomes (evaluation criteria) by selecting and propagating genetic patterns that are highly correlated to these outcomes.

To implement this approach, you will need to encode the visitor state in the cookie and then decode the visitor state in the back end for the reporting and analysis. For encoding the DNA in the cookie, a script on the client must combine all the visitor parameters on a page and encode and appropriately merge these into the current visitor state in the cookie.

All this can be done with an encoding JavaScript function that can be added to the instrumentation script of the vendor. You may find that a vendor already has such a module that you can configure. The result of this function is assigned to a custom variable (GA) or eVar (Omniture) and will become a dimension of the attribution of events that follow.

For decoding, you will need to replace this genetic code with visitor dimensions and values in the vendor’s data stores or download into a spreadsheet to perform the replacement. For Omniture, I have used SAINT to perform the classification where each genetic code is a unique key for a specific visitor state that can be shared by more than one visitor. For eVars, SiteCatalyst performs the appropriate attribution such that these dimensions are applied to the current or next event declared.
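The encode, merge, and decode steps can be sketched as a simple bit field. This is an illustration in Python of the logic only; on a live site the encoding side would run in the client script. The segment names and bit positions below are hypothetical.

```python
# Hypothetical bit assignments: each bit of the code stands for one segment.
SEGMENT_BITS = {
    "returning": 0,
    "registered": 1,
    "cart_started": 2,
    "purchaser": 3,
    "paid_search": 4,
}


def encode(flags):
    """Pack a set of segment names into one integer code."""
    code = 0
    for name in flags:
        code |= 1 << SEGMENT_BITS[name]
    return code


def merge(existing_code, page_flags):
    """Fold the segments seen on the current page into the stored state."""
    return existing_code | encode(page_flags)


def decode(code):
    """Back-end step: expand a code into named segments for reporting."""
    return {name for name, bit in SEGMENT_BITS.items() if (code >> bit) & 1}


def to_cookie_value(code):
    """Compact hexadecimal text suitable for a cookie or custom variable."""
    return format(code, "x")


def from_cookie_value(text):
    return int(text, 16)


code = encode({"returning", "paid_search"})
code = merge(code, {"purchaser"})
cookie_value = to_cookie_value(code)

# Run-time targeting can test the code directly, without decoding it.
target_mask = encode({"returning", "purchaser"})
is_target = (code & target_mask) == target_mask
```

Five segments fit in two hexadecimal characters here, which is the point of the approach: the cookie carries many dimensions in very little space, and the lookup table that gives the bits meaning lives on the back end.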

For both GA and Omniture the data can always be downloaded into a spreadsheet for decoding and analysis. This is a case where we are using the mechanism of the vendor to instrument pages to collect the data we need and then back-end APIs to collect this data for analysis. Most advanced offerings provide a mechanism for joining web analytic data with data from business intelligence systems such as CRM and Financials. Through this same mechanism one should be able to add dimensions from genetic codes.

One further caution with respect to this approach to segmentation. The visitor or session state will change with each page viewed by a visitor. The segmentation as a result will likewise change and potentially be applied retroactively in reports. Now reports have their own time frames such that conversions within the time frame are “attributed” to traffic in the same time frame, but the prerequisites for a cause and effect relationship are not preserved. For example, in a weekly report the conversions for that week will be attributed to all the paid searches within that same week (regardless of whether a search took place after the conversion or there were multiple searches prior to the conversion).
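The difference between report-window attribution and time-ordered attribution is easy to see on a toy event stream. The events, dates, and function names in this Python sketch are made up for the example.

```python
from datetime import datetime

# One visitor's events during a single reporting week (hypothetical data).
events = [
    (datetime(2010, 5, 3, 9, 0), "paid_search"),
    (datetime(2010, 5, 4, 14, 0), "conversion"),
    (datetime(2010, 5, 6, 11, 0), "paid_search"),
]

week_start = datetime(2010, 5, 3)
week_end = datetime(2010, 5, 10)
conversion_time = datetime(2010, 5, 4, 14, 0)


def same_window_touches(events, start, end, touch="paid_search"):
    """Report-style view: every touch inside the reporting window counts."""
    return [t for t, kind in events if kind == touch and start <= t < end]


def preceding_touches(events, conversion_time, touch="paid_search"):
    """Time-ordered view: only touches before the conversion can be causes."""
    return [t for t, kind in events if kind == touch and t < conversion_time]


window_view = same_window_touches(events, week_start, week_end)
ordered_view = preceding_touches(events, conversion_time)
```

The weekly view credits two paid searches for the conversion, while only one of them happened before it. Keeping the event timestamps is what makes the second, stricter view possible.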

For the built-in dimensions such as new and returning visitors, the interpretation has to take into account that conversions (by definition) change dimensions and subsequent measures. In this case, a purchase by a new visitor (in their first session) changes the visitor state to a purchaser. So to count new visitors that made the transition to purchaser, one has to block or count separately subsequent visits since the next visit will change the state to returning / purchaser. Therefore simply querying current visitor / visit state is not a replacement for conversion attribution or funnel analytics.

Funnels and Paths

The differences and uses of funnels and paths have already been discussed in detail here. Let’s summarize that discussion relative to segmentation. A funnel is an ordered set of conversion events that is a representation of the business’s customer engagement cycle and customer value metrics. The cycle gives the stages that the customer must go through to complete the objectives of the business. Later we will want to align this cycle with the objectives and expectations that customers may have for the business, but at this point the cycle should capture the opportunities to engage the customer and desired outcome at each stage.

All business organization and processes must be constructed and evaluated relative to this customer engagement cycle. Customer value metrics identify the total value over the lifetime of the customer and how the business will manage various valued customer segments. The ultimate objective of the engagement cycle is to mature the customer into its most valued categories.

The business and sales funnels are representations of the business and the customers’ engagements with that enterprise. One may have a different funnel for B2B processes and B2C interactions. There can be different funnels for onboarding, which brings new customers on board to business services, and for continuing service and up-sell once the customer has signed up. All this models how the business and its success have been defined but not how the business has been implemented.

Funnels are not channels. Though a channel may start a visitor through a funnel by bringing new visitors to the business, there should never be a funnel for a channel of any kind. Instead the measures within a funnel should allow conversions to be segmented by channel to compare the performance of different channels. One might argue that a particular channel provides more highly qualified leads and hence has different measures.

If the assertion of quality is true then the funnel measurements for that channel should outperform all the other channels in at least the initial transitions. Otherwise, if every channel has its own funnel metrics, it will be difficult if not impossible to perform multichannel performance comparisons and synchronized adjustments. Ultimately channel performance relative to a funnel should be understood relative to the visitor segments that move through various channels and the different marketing content exposed via these channels.

What is true for channels is even more true for paths and web flows that implement business processes on-line. Here again the funnel is an objective measure of successful processes. A company may have hundreds of web flows that can be combined in innumerable sequences. The purpose of path analysis is to evaluate and optimize the web flow sequence for minimal path abandonment (an objective that is more a pipe than a funnel, or maybe more like the stem of a funnel).

Most vendors implement this form of path analysis as “funnel” diagrams and reports. Some tools such as Google Analytics allow you to “hedge” the implementation. Most tools treat “funnels” as a sequence of events (typically page content events). If any of these events must be executed in the order they are listed, then the implementation is a path report where exits from the path represent path abandonment.

GA allows all the entries to not be tied to time sequence — the result is a funnel where each visitor that completes any or all the events regardless of order will be counted. Even when performing path analysis, a true funnel report for the same sequence will confirm that all visitors completed the sequence in order (but be prepared to be surprised).
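The two ways of counting can be compared on a small example. This Python sketch uses invented sessions and step names, and its ordered count allows other events between steps, which is only one possible reading of “in order”.

```python
# Hypothetical sessions: the ordered list of events each visitor produced.
sessions = {
    "visitor_a": ["product", "cart", "checkout"],
    "visitor_b": ["cart", "product", "checkout"],
    "visitor_c": ["product"],
}

steps = ["product", "cart", "checkout"]


def unordered_counts(sessions, steps):
    """Funnel-style count: a visitor counts for a step if the event occurred at all."""
    return [
        sum(1 for events in sessions.values() if step in events)
        for step in steps
    ]


def ordered_counts(sessions, steps):
    """Path-style count: a visitor counts for a step only after the earlier steps."""
    counts = [0] * len(steps)
    for events in sessions.values():
        position = 0  # index of the next step this visitor still has to reach
        for event in events:
            if position < len(steps) and event == steps[position]:
                counts[position] += 1
                position += 1
    return counts


funnel_view = unordered_counts(sessions, steps)
path_view = ordered_counts(sessions, steps)
```

Comparing the two lists shows how many visitors reached the later events by some route other than the designed sequence, which is where the surprises tend to be.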

To summarize, funnels are not segments but are a complementary extension of the persona as an element of use case analysis, as one determines how each persona would move through the engagement cycle represented by the funnel. Certainly there will be different needs that must be addressed by the product or web design, but all customers should move through the same funnel.

The implication is that funnels do not fail, but the processes that the funnel measures can. If it is necessary to define a special funnel, then as in the case of B2B and B2C customers, separate funnels for monitoring and evaluating performance should be devised that are independent of the business process being measured. In the end the resulting set of funnels is a high level model of the business and becomes a critical component of web analytic measurement and reporting as the business goes through its evolution of implementations.

Attitudinal Segments

When faced with uncertainty in interpreting behavior or the need to know why visitors do what they do, the answer may be as simple as just asking. With surveys one not only can get demographic data but also responses to specific questions that measure customers’ attitudes towards the business. This forms a rich data set from which visitor segments can be derived. It is especially powerful if the surveys can be followed up with actual outcomes that correlate visitor attitudes to business performance.

Add to this the further information that can be gleaned from modern text mining algorithms to derive such things as sentiment or stress of voice, and one has an analytic framework that is dual to behavior analytics. In fact these two forms can be described as the two eyes on customers: both are necessary and must be balanced together.

Online Voice of the Customer

Online voice of the customer is that part of the user’s voice that is specific to the online experience and user tasks. Besides wanting to know what they intended to accomplish and whether they completed their task, we would also like to know if they were satisfied with their experience. These satisfaction measures quantify customers expressing their needs in their own words.

Typically surveys of this type are initiated at the start of a visitor session and are completed at the end of the session to evaluate the session experience. Since visitors are randomly selected for a single visit, there is not any follow-up on the user’s experience unless the survey is combined with behavioral data to indicate actions in follow-up sessions or with surveys sent later to find out what the final outcome was. With that follow-up, the satisfaction scores can be directly related to business outcomes and predict likelihood of conversion. So conversions and conversion rates can be segmented by VoC segments while VoC scores can predict future conversion rates. This is how the left and right brain in analytics work together to “see” and “hear” the customer!
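As a small illustration of segmenting conversion rates by VoC score, consider the Python sketch below. The 1 to 5 satisfaction scale, the cut-off at 4, and the sample responses are assumptions made for the example.

```python
from collections import defaultdict

# Hypothetical survey responses joined with behavioral outcomes:
# (satisfaction score on a 1-5 scale, whether the visitor later converted).
responses = [
    (5, True),
    (4, False),
    (2, False),
    (1, False),
    (5, True),
    (3, True),
]


def conversion_rate_by_satisfaction(responses, threshold=4):
    """Split respondents into two VoC segments and compute each conversion rate."""
    totals = defaultdict(int)
    converted = defaultdict(int)
    for score, did_convert in responses:
        segment = "satisfied" if score >= threshold else "unsatisfied"
        totals[segment] += 1
        converted[segment] += int(did_convert)
    return {segment: converted[segment] / totals[segment] for segment in totals}


rates = conversion_rate_by_satisfaction(responses)
```

With real data the gap between the two rates, tracked over time, is what would support using satisfaction scores as an early indicator of conversion.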

Online vs Total

This exact same methodology and process can be applied to call centers and stores such that one has a “total” view of representative customer samples. Just as in the online case, the data is for a single call or visit. With proper questions and follow-up one should be able to develop a rather complete and hopefully seamless view of how these three elements of the business work together to define the overall customer experience. This means that all analytics should encompass the entire experience of the customers and value highly every opportunity to interact with them. This has been discussed in more detail here. Bottom line: we must look beyond the session.

Social Personas differentiated from Customer Personas

As for the other sources of VoC, in particular the social media of the blogosphere, YouTube, Twitter, and Facebook, there is the urge to cultivate these into viable marketing channels. Before that can happen these persons must be segmented. Some are actual customers expressing their voice publicly with respect to the business and its services. Others are prospective customers that may be influenced by these customers or by opinion shapers or influencers that review the business and propagate their message to their followers. Others act as amplifiers, selecting messages that are resent to their followers. In this mix is the business itself that is attempting to become an influencer.

Do we use it to brand, inform, or make a deal? That is the typical marketing dilemma. Before any of that can be decided, the population relative to the business’s offerings must be known and described similar to the personas that were defined above. There are several tools that help bring this data together for analysts, such as Lithium and ViralHeat. In that sense social media is just another marketing channel with its own demographics that needs to be characterized and messaged.

In terms of analytics that can be collected from social media, both Peterson and Sterne have separately proposed methodology and metrics similar to branding metrics in traditional marketing. These proposals do not cover the use of badges and widgets to draw customers into a brand network or whether or not this media should be used for direct sales through special promotions. These represent the active engagement elements afforded by this channel. Typically each new “social” channel has brought its own terminology with its new measures and analytics (and companies to support these analytics).

Regardless, none of these proposals can even be considered without an understanding of the various segments in this medium and how they apply to the business. For the problem discussed in this paper, these segments are tailored to the needs of the business and are the same as customer and prospect personas in marketing. Just as in the customer personas, these social personas must map to behavioral and attitudinal segments that are available in the run-time data stream (if collected at the appropriate place where decisions are made). Now how is that done?

Customer Experience Management

There is a realm of Analytics that has emerged lately, sometimes called Advanced Web Analytics, Beyond Web Analytics, Web Analytics Without Borders, or simply Data Mining in archaic terminology, that attempts to consider all data together to find patterns, trends, and segments in the data that are not available when each source is considered separately. The data considered in this higher dimensional space will usually result in strong clusters of visitor behavior/attitudinal traits that may have no relationship at all with personas that were defined at the start of the web development. Further behavior and functional analysis on these clusters is necessary to extract behavioral and attitudinal patterns that can lead to clear recommendations for action.

The personas capture the latent (hidden) aspects of the visitor and represent what we want to know about the visitor as a customer – “Ding an sich” for the more philosophically inclined. What we can observe or ask questions about are the manifest (observable) aspects of the visitor behavior or experience. To this point I have referred to this mapping of manifest measures to latent metrics as similar to a tracker building a tracking profile from observed evidence.

Here the analysis is more formalized and the mapping between manifest and latent variables much more rigorous whether using statistical multivariate analysis, econometric analysis or various non-linear analysis such as fuzzy sets, neural networks, Bayesian, vector or wisdom of the crowd analysis. The core premise in all this analysis is that the actual customer needs and preferences are indeed reflected in the measures we have collected from all the various data streams.

In this process the web analysts play key roles to ensure that processes converge on sound recommendations and support quality marketing and business decisions. Their role starts early in the initial setup of the web analytics and continues through to the eventual establishment of a quality data warehouse and visualization platform. The process is by necessity an iterative one since a business will seldom be able to justify the expense until it has demonstrated for itself that the data and analysis do impact the business.

Tagging to Differentiate Behaviors

As has been demonstrated from the very beginning here, one cannot track behaviors unless one can tag them. When trying to distinguish and track marketing personas from behavior measures, the tagging process is more complicated yet even more crucial. Personas are latent segments that must be mapped (or covered) by one or more observed segments. The match in most cases will not be exact.

As an example, consider the case presented by Brian and Jeffery Eisenberg in “Waiting for your Cat to Bark? Persuading Customers When They Ignore Marketing”. They present two user cases that could be the basis of user experience personas that must be considered in the web design. Some users want and need a lot of explanation and detail before they commit to a purchase (if you have gotten this far into this article you are likely this type of person). Others need only the bullet points and, if they agree, will quickly commit to an action; if they see long copy, they are gone. So now the dilemma: what marketing copy should be presented, and how, given these two personas – call them Brian the driller and Jeffery the decider? If only they would identify themselves at the start of the visit.

A web designer makes a brilliant yet unsubstantiated claim that the bullet points for Jeffery could be viewed as a study guide for Brian when he starts his drill down. What is not known is whether the order of the bullet points presented to Jeffery is important and whether that order will make sense for Brian in his investigation. To verify the designer’s original insight, the web analyst measures when visitors move from topic to topic, or drill down on certain topics or all topics.

The results of the first test are mixed – not everyone is pure Brian – investigating every topic in detail – or pure Jeffery – going quickly through each topic and converting. The metric that measures the ratio of topics to depth does form a distribution over all visitors, with two humps on an otherwise typical Poisson distribution that drops off as more pages must be consumed. However, the overall conversion rate has not changed and is not significantly different between the Brian-like and Jeffery-esque visitor segments.
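As a rough illustration of such a metric, here is a minimal Python sketch. The tagging scheme (each page view recorded as a topic plus a drill-down depth), the exact ratio (distinct topics divided by pages viewed), and the sample visits are all assumptions made for the example; the original test may well have defined the measure differently.

```python
def topic_depth_ratio(page_views):
    """Distinct topics divided by total pages viewed in one visit.

    page_views is a list of (topic, depth) tuples. Depth 0 stands for the
    bullet-point summary and larger values for drill-down pages. This is a
    hypothetical tagging scheme, not one taken from a real tool.
    """
    if not page_views:
        return 0.0
    topics = {topic for topic, _depth in page_views}
    return len(topics) / len(page_views)


def histogram(ratios, bins=5):
    """Count ratios in equal-width bins over the range 0 to 1."""
    counts = [0] * bins
    for ratio in ratios:
        index = min(int(ratio * bins), bins - 1)  # a ratio of 1.0 goes in the last bin
        counts[index] += 1
    return counts


# Made-up visits: v1 drills into every topic, v2 skims, v3 is in between.
visits = {
    "v1": [("price", 0), ("price", 1), ("price", 2),
           ("specs", 0), ("specs", 1), ("specs", 2)],
    "v2": [("price", 0), ("specs", 0), ("warranty", 0)],
    "v3": [("price", 0), ("price", 1), ("specs", 0)],
}
ratios = {visitor: topic_depth_ratio(views) for visitor, views in visits.items()}
print(ratios)
print(histogram(list(ratios.values())))
```

A ratio near 1 suggests Jeffery-esque skimming and a low ratio suggests Brian-like drilling; over many visitors, the two humps would show up as two separated peaks in the histogram.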

Subsequent tests show that the conversion rate of Jeffery-esque visitors does indeed increase with the right sequence of topics. It was also found that Brian-like visitors become Jeffery-like once they have consumed the details on certain topics. These topics are not the same as those that matter to Jeffery-esque visitors. The Almond Joy Rule also comes into effect: “Sometimes you feel like a nut, sometimes you don’t”. We can’t over-define the boundaries between Brian and Jeffery, since sometimes they switch roles.

A subtle point begins to become evident: though we may have captured the essence of Brian and Jeffery with measures of topic and drill-down navigation, there may be other elements of the visitor’s behavior that should also be considered to differentiate Brian-like and Jeffery-like states of mind. For example, where is Brian in the customer engagement cycle, and which topics does Jeffery find important enough to drill down into?

So when developing the tagging for distinguishing marketing and user experience personas, one must look beyond one or two apparent measures and consider multiple aspects of behavior that can distinguish marketing segments. ForeSee Results attempts to have three different measures for each latent variable. This is a good rule of thumb for mapping behavioral measures to persona characteristics as well.

With respect to the measures themselves, they should have sufficient variation for each visitor to allow fine-grained differentiation of behaviors that can eventually form clusters that map to personas (at least somewhat). This detailed data is difficult to collate and report without quantizing the results into gross bins and measures, since the raw data seldom presents patterns and trends that the human eye can quickly absorb. It is like looking at waves on the ocean: a nice scene, but what does it mean? However, for detailed correlation and trend analysis, the details are critical to the accuracy and precision of the results[*].

Test and Target Segments

Even after all this, there are more segments that have to be defined to be able to map our rich analytic stream to the marketing view represented by personas. With testing there are two additional degrees of freedom. The first is the variation in content and function that different visitors can experience; these variations are designated as treatments. The second is the identification of the groups of visitors that will experience the same treatment, called test groups or targets. Depending upon the test methodology, treatments are constructed algorithmically to cover the permutations of the different elements (called factors) that make up the treatment. The targets, at least initially, are randomly selected test groups that are representative of the general population of visitors. Representative here means that the proportions of segments found in the general population are the same in each test group. Then, by assigning a test group to each treatment, one can collect data for each test group that can be compared and evaluated on the back-end against performance metrics.
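The two degrees of freedom can be sketched in a few lines of Python. This is only one possible reading: it assumes a full-factorial design (every combination of factor levels) and a simple stratified shuffle to keep segment proportions equal across test groups. The factor names, segment labels, and function names are hypothetical.

```python
import itertools
import random
from collections import defaultdict


def build_treatments(factors):
    """Return every combination of factor levels (a full-factorial design)."""
    names = sorted(factors)
    level_lists = [factors[name] for name in names]
    return [dict(zip(names, levels)) for levels in itertools.product(*level_lists)]


def assign_test_groups(visitors, n_groups, seed=0):
    """Randomly split visitors into groups with matching segment proportions.

    visitors maps a visitor id to a segment label. Within each segment the
    ids are shuffled and then dealt out to the groups in turn.
    """
    rng = random.Random(seed)
    by_segment = defaultdict(list)
    for visitor_id, segment in visitors.items():
        by_segment[segment].append(visitor_id)

    groups = [[] for _ in range(n_groups)]
    for segment in sorted(by_segment):
        members = sorted(by_segment[segment])
        rng.shuffle(members)
        for position, visitor_id in enumerate(members):
            groups[position % n_groups].append(visitor_id)
    return groups


# Hypothetical factors: two headlines crossed with two copy styles.
factors = {"headline": ["short", "long"], "copy": ["bullets", "detail"]}
treatments = build_treatments(factors)

# Hypothetical population: 8 new and 4 returning visitors.
visitors = {f"n{i}": "new" for i in range(8)}
visitors.update({f"r{i}": "returning" for i in range(4)})
groups = assign_test_groups(visitors, n_groups=len(treatments))

for treatment, group in zip(treatments, groups):
    print(treatment, sorted(group))
```

With these inputs each of the four test groups receives two new visitors and one returning visitor, mirroring the two-to-one mix of the whole population. A real test methodology might use a fractional design instead of every combination.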

At the completion of a test, one can choose the treatment that provides the best overall performance across all visitor segments, or one may find that a specific treatment provides a significant improvement in performance for a particular visitor segment. The latter becomes the target segment for the treatment and requires that the presentation layer be able to identify the visitor as belonging to the target segment at run-time so that the appropriate treatment can be delivered to that visitor. This is what is referred to as behavior targeting, since the criteria are typically behavioral characteristics known at content serving. However, the criteria can also include demographic and attitudinal information that may be known about the visitor at content serving.
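The difference between picking one overall winner and picking a winner per segment can be shown with a small Python sketch. The record layout, treatment names, and counts are invented for illustration, and the sketch compares raw conversion rates only; deciding whether a difference is significant would need a proper statistical test on top of this.

```python
from collections import defaultdict


def rates(records, key):
    """Conversion rate per group, where key(record) chooses the grouping."""
    visits = defaultdict(int)
    conversions = defaultdict(int)
    for record in records:
        group = key(record)
        visits[group] += 1
        if record[2]:
            conversions[group] += 1
    return {group: conversions[group] / visits[group] for group in visits}


def best_overall(records):
    """Treatment with the highest conversion rate across all visitors."""
    by_treatment = rates(records, key=lambda r: r[0])
    return max(by_treatment, key=by_treatment.get)


def best_by_segment(records):
    """For each segment, the treatment with the highest conversion rate."""
    by_pair = rates(records, key=lambda r: (r[0], r[1]))
    winners = {}
    for (treatment, segment), rate in by_pair.items():
        current = winners.get(segment)
        if current is None or rate > by_pair[(current, segment)]:
            winners[segment] = treatment
    return winners


def make(treatment, segment, conversions, visits):
    """Build (treatment, segment, converted) records for made-up counts."""
    return [(treatment, segment, i < conversions) for i in range(visits)]


# Invented results: treatment B wins overall, but A is better for Jeffery.
records = (
    make("A", "brian", 2, 10) + make("A", "jeffery", 6, 10)
    + make("B", "brian", 5, 10) + make("B", "jeffery", 4, 10)
)
print(best_overall(records))
print(best_by_segment(records))
```

In this made-up data the overall winner hides the fact that Jeffery-esque visitors respond better to the other treatment, which is exactly the situation where a target segment is worth defining.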

Mapping Target Segments to Personas

Marketing and User Experience Design develop most of the content and function that make up the treatments. At other times, business policies concerning services define what is necessary, and product management must provide information and collect the necessary data both to implement the policy and to optimize its acceptance. All of these are developed from a perspective on the competitive marketplace that inbound marketing has constructed from research, interviews, surveys, and analysis. A product of these efforts, whether formal or informal, is a set of personas with user scenarios and use cases.

Essentially the site – whether it’s a web site, call center, or store front – is developed and constructed based upon these personas, as are all the outbound marketing materials that position, message, and promote the business during operation. So how does one confirm this marketing view and make necessary adjustments as the business goes into operation?

At least initially, almost all the function and content is directly related to personas. If one were to form test groups for each persona treatment, run the test, and evaluate the results, one would expect the back-end analysis to separate out the visitor behavior segments that performed best for each treatment and thereby to identify a set of visitor characteristics that maps to each persona.

This would not be an optimal test strategy, since it assumes that there will always be a set of visitor characteristics that correlates strongly with each persona. It may be that the persona does not have strong correlates in the behavior stream (can’t track unless it can be tagged), or that visitor characteristics are not strong predictors of future action or performance. This approach does, however, begin to outline and carve out the problem domain that must be addressed in mapping visitor characteristics to marketing personas.

Path and Cluster Analysis

Eventually the Marketing and User Experience personas will have to align with what can be observed and measured at run-time. Content authors and creative designers cannot develop content and themes based upon frequency and recency measures or, in the user case above, drill-down depth. These measures say nothing about the visitor’s intent and perspective. Imagine being asked to write one message for a visitor who has visited the site 10 times over the last 48 hours and another for one who has visited 10 times in the last two weeks. That is not quite a well-developed audience description for a creative type, but it is nonetheless a typical differentiation in web analytics reporting. Do these cut-offs even make sense as segments?

An alternative strategy for addressing the mapping problem is to search the visitor profiles for clusters that are significant differentiators or predictors of future behavior. The good news is that these clusters do exist. There is a continually growing set of use cases and studies that demonstrate that the data can be successfully applied to

  1. optimize bid management to CPA objectives [Overture] [eFrontier] [Omniture];
  2. increase relevance in landing page optimization [Omniture Test and Target, formerly Offermatica];
  3. use multivariate testing to optimize search creative content [Memetrics, now Accenture];
  4. optimize marketing message sequence and content through non-linear adaptation [Touch Clarity, now Omniture] [Optimost, now Interwoven];
  5. use a wisdom of the crowd approach to provide better context for internal search results [Baynote];
  6. use a Bayesian classifier to process unstructured visitor content and measures to discover customer relationships and clusters [Interwoven]; and
  7. apply statistical econometric models to predict future performance from customer satisfaction scores [ForeSee Results].

These are offerings and technologies that I have been able to evaluate and have discussed in previous posts. There are many others that can provide sound use cases that continue to confirm that the data, properly collected and mined, can support statistical modeling and non-linear optimization approaches. Some companies, such as Tealeaf, Omniture Discover OnPremise [formerly Visual Science], and SAS, are providing the infrastructure to support these detailed analyses.
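To give a feel for what searching visitor profiles for clusters involves, here is a minimal k-means sketch in plain Python. It is a generic textbook technique chosen for illustration, not the method used by any of the vendors named above. The two profile measures (visits in the last two weeks, average drill-down depth) and the sample values are assumptions.

```python
import math
import random


def assign(points, centers):
    """Label each point with the index of its nearest center."""
    return [
        min(range(len(centers)), key=lambda c: math.dist(point, centers[c]))
        for point in points
    ]


def kmeans(points, k, iterations=20, seed=0):
    """Plain k-means: alternate nearest-center assignment and center updates."""
    rng = random.Random(seed)
    centers = rng.sample(points, k)  # start from k randomly chosen points
    for _ in range(iterations):
        labels = assign(points, centers)
        for c in range(k):
            members = [p for p, label in zip(points, labels) if label == c]
            if members:  # keep the old center if a cluster ends up empty
                centers[c] = tuple(sum(axis) / len(members) for axis in zip(*members))
    return assign(points, centers), centers


# Hypothetical profiles: (visits in the last two weeks, average drill-down depth).
# Real measures would normally be put on a common scale before clustering.
profiles = [
    (2, 0.5), (3, 0.4), (2, 0.6),      # occasional skimmers
    (10, 2.5), (11, 3.0), (9, 2.8),    # frequent drillers
]
labels, centers = kmeans(profiles, k=2)
print(labels)
print(centers)
```

The clusters that come out of such a search are defined by behavior alone. Whether they line up with any marketing persona is a separate question that still has to be answered by further analysis.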

Summary for the Benefit of Jeffery

After having this post up for a couple of weeks, I have noticed that most of my readers are Jeffery, so the version that Jeffery would see is now posted as a separate article that includes updated graphs. It provides an excellent summary of the material as well as tables of definitions for all the concepts presented here. Let me know if you agree.


About Timothy Kraft

An accomplished and innovative Web Analytics Professional and Business Intelligence Strategist. Over 10 years experience in development and
This entry was posted in Fundamentals, Web Analytics.

3 Responses to Segments, Segments Everywhere

  1. Ophir Prusak says:

    Great (but a bit long) posting.

    I haven’t read the whole thing yet, but a couple of comments on readability:

    I would make the paragraphs shorter. It’s hard to read paragraphs that are 20 lines long (or even 10 lines long).

    I would change to a theme that has a background, or at least some border element to the sides of the text, it’s VERY hard reading content that’s “floating” in the middle of a white page.

    Also, how about a “WIIFM” opening paragraph on why I should care about segments 🙂

    Your content is GREAT but I think there is room for improvement in presentation.

    • Thanks Ophir for the input on readability. Shorter paragraphs – that I can do. I am actually looking for a new theme so your input gives me another criteria and more incentive to do this. This is a rather long post without a clear theme until the more bar – inverting the intro doesn’t make sense but let me see what I can come up with. Thanks for the comments.

  2. Pingback: Segment or Die: The Semantics of Segments | Mind Before You Mine
