Desperately Tracking Susan: Online / Offline Behavior

For my next post I was going to write something on non-linear methods for detecting novelty in time series data, when a series of posts popped up on my “twiky” from MineThatData. I have just started following Kevin Hillstrom, who is the mind behind MineThatData. He comes from a long and extensive career in data marketing that began even before there was web analytics. His forte is online e-commerce and retail brand marketing, a rather large and important niche in web analytics. I am just now becoming familiar with his work and opinions and am looking forward to learning more. Typically I don’t pick someone out of the cloud and start commenting, but in this case his laconic Twitter aphorisms have been needling me to the point that I have to respond. In a way, it is germane to my original intent.

Realizing MineThatData has a focus on retail brands, consider the following sage advice:

minethatdata If you are a web analyst at a retail brand, make it the focus of your career to link offline and online behavior.

minethatdata Web analysts who link online/offline behavior “walk on water”. It’s more important than your actual analysis!

I like the little kick on the last one. Does he mean if you can connect online/offline behavior without analysis you are like a god? Or once declared a god, the analysis that got you there can be forgotten? This little cynical twist seems to be a trademark of his style. Like this sequence a couple of days ago:

minethatdata: Talk to your CFO, s/he will share with you if your business is ethical: RT @priteshpatel9 Some businesses keep profit info secret. 5:34 PM Mar 30th via web

minethatdata: Ask your CFO if s/he agrees, maybe: @dawescott The value you give up with coupons is what you pay for accuracy of ROI. 5:35 PM Mar 30th via web

I know the answer to the last one. “Show me the profit!” This means developing business models based upon financial data, arguing the fine points of cannibalization rates (fancy term for double counting attribution), and eventually showing the connection between the data you want and profit in the form of a P&L statement. This is not the typical way one proceeds with analytics, which is often considered a cost center rather than a profit center. However, analysis must be able to show how data, or more appropriately knowledge, leads to profitable actions.

Then the gauntlet is thrown:

minethatdata: What do u think is the best way to demonstrate social media ROI within retail stores? Don’t say coupons!!

minethatdata: Anytime u use coupons to track ROI, you are making customer do extra work and u r giving up valuable gross margin $. Ugh! 12:02 PM Mar 30th via mobile web

Damn, coupons was the first thing that popped into my head. Often in web analytics we feel very clever working out ways to get cross tags into different channels – a special TV promotion URL, an online coupon that is brought into the store, special numbers to call centers, bar code badges in blogs that can be scanned in from my phone at the store. Sorry, the last one has not been invented yet. How clever we can be. But the key point Kevin makes is that we are making the customer do extra work, and at an additional cost.

So this comes down to work that we analysts can perform to track both online and offline behavior that does not require additional cost besides our salaries, consultant fees and WA resource cost. I think I got it. This is not going to be easy tracking Susan.

Now I realize choosing a variation on a movie title that is 25 years old (gad, has it been that long) has its risks. But consider a Want Ad now called a Tweet seeking an elusive, enigmatic person, think customer, and searching here and there for Madonna, obsession with connecting online and offline behavior. It seems to capture the gestalt, wouldn’t you agree?

As you can see from my previous posts, I have focused on the basics and fundamentals of web analytics. The primary reason is that there is no agreement on the basics and fundamentals, so I have felt that I needed to deal with these so that everyone knows where I am coming from before getting into the really bleeding edge stuff. Perhaps bleeding edge is over-rated.

minethatdata: Is it me, or have conferences/webinars moved too far away, are too futuristic for the mainstream marketing organization?

I could not agree more. As it turns out, with what has been covered so far, I can address this issue. But because it requires a great deal of work on the part of the analytic and data marketing efforts, the steps will not be easy.

Tracking Multi-Channel Behavior in 5 Difficult Steps

This is not a problem that is confined solely to retail trying to understand how online behavior affects in-store profits. The same problem arises when trying to understand how any mix of different online and offline marketing channels affects sales, or in marketing when relating brand and direct product marketing efforts. In all these cases, the processes are open and not subject to strict closed system experimentation.

Though individually there may be rational causal relationships, in aggregate, because of the innumerable variations possible, the relationships among channels are non-causal and any correlations are non-linear. So in the end as data analysts we depend upon drawing and correlating trends in different data streams to understand cross channel behaviors. Getting to a point where we can do this analysis is the hard work.

Step 1: Mind Your Data

Before you can mine the data for insights you must mind the data for consistency and quality. As mom would say, “Mind your manners.” I would say, “Mind your data”. There are two basic personas for data analysts – miners and minders. I like to refer to the latter as data wranglers, who after determining what data is needed go out and instrument the web site, store or call center to collect the data and establish processes to ensure timeliness, completeness, and quality. An extreme form of minder is the paranoid who typically sits in IT to defend the site against attack and is sometimes overzealous in preventing minders and miners from collecting the data they need. However, all these roles are necessary and must work together to ensure that not only is the site protected but that quality data can be collected and analyzed.

So the first part of this step is getting these personas and roles identified and coordinated through a data policy and data governance plan.

I came into analytics from AI as a miner ready to perform magic on data collected from users’ interactions, first with Windows applications and then focusing on the browser. At some point the physicist’s training kicked in and I became concerned about the mechanism for collecting the data – synchronizing server clocks, tracking down connection drops, and detecting and identifying anomalies in the data. Whereas a paranoid would attempt to block unwanted hits, as a data wrangler I wanted to collect the data and as a miner understand the patterns in this anomalous traffic to see if they can be separated from actual user behavior. So before you can do the magic you must do the work.

I have a specific and narrow definition of web analytic data that covers the event streams we can associate with individual users or user-agents. In “The Axioms of Web Analytics” I define the 5 fundamental axioms to which the data is assumed to conform to support the types of analysis we claim can be performed. These same axioms can be applied to non-web properties such that we are collecting user based event streams from your store and call centers. If you are really serious about tracking and exploiting visitor behavior, you as the analyst cannot assume that these come to you directly out of the box from a vendor.

The difficult part of this step is that you have to work with different people and disciplines to get to the point where you have the control needed to actually mind the data. Within a large organization this control most likely does not reside with a single individual. One of the elements of the data governance plan will be defining how data collection and processing will be managed and controlled and how the interests of the various stakeholders will be coordinated. The plan and processes will expand as your company’s maturity grows, so the plan has to have in place the mechanisms for changing.

If you are responsible for instrumenting the web site for data collection, either client side or server side, develop a strong and enduring relationship with web development and IT. Let them know that you require an agile and independent means of managing and changing the metadata involved with the content on the web site. They can work out how this can be done, but in the end you will need the controls to make sure that you can get the data you need and continue to make changes as you mind the data.

The best approach is to collect the data from the actual processes and work flows that come together to build and deploy the web site. With dynamic content, this is especially critical. Anytime you have content authors, web developers, or IT engineers performing a special task just for web analytics, it is a task that will be the first to be cut when schedules are tight and will never be done consistently in the long haul. So you will have to insert yourself into everyone’s business, trying to get consistency and discipline in the development to allow extracting data later, after deployment of a web site or enterprise application.

A good place to start is to see if you can get a consistent naming convention for each web page served. Good luck with that. If it is an afterthought once the site has already been deployed, you will be a candidate for instant failure. Once you have an approach agreed to by all parties and implemented, collect the data, track down the violations and go back and get them fixed. Part of minding is being tenacious and insistent on following procedure – in other words – a pain in the butt.
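Tracking down violations of a page naming convention can be automated once the convention is written down. The following is only a minimal sketch: the convention itself (two to four lowercase segments separated by colons), the pattern `PAGE_NAME_PATTERN`, the function `find_violations` and the sample page names are all hypothetical, chosen for illustration rather than taken from any vendor tool.

```python
import re

# Hypothetical convention: 2 to 4 lowercase segments separated by colons,
# for example "store:shoes:checkout". Replace with whatever your teams agree on.
PAGE_NAME_PATTERN = re.compile(r"[a-z0-9-]+(:[a-z0-9-]+){1,3}")


def find_violations(page_names):
    """Return the collected page names that do not follow the convention."""
    return [name for name in page_names if not PAGE_NAME_PATTERN.fullmatch(name)]


# Illustrative page names as they might appear in collected data.
collected = ["store:shoes:checkout", "Home Page", "support:contact-us", "untitled"]
for name in find_violations(collected):
    print("naming violation:", name)
```

Run against each day's collected page names, a check like this turns "track down the violations" into a short list that can be handed back to the content authors or developers responsible.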

If these processes are not set in place and changes are not agile and independent of the development cycle, then getting consistent quality data will fail … with absolute certainty. Vendors can help set this up, but in-house methods and procedures will have to be in place to succeed. My approach, which gets some push back from data miners, is to treat the web analytic data with the same data governance and policies used for financial data and business transactions. This typically means Sarbanes-Oxley (SOX) compliance, such that compliance to stated policy can be enforced and confirmed. The motivation for this is that one may want to use the data as integral to the core business process and will have to comply with SOX since it is customer (albeit anonymous) data. So it’s a case of planning for success: if you are putting the effort in to collect data, why not anticipate that it will be useful data? If not achievable near term, it should remain an objective guiding policy and process long term.

Step 2: Truly value every opportunity you have to interact with your customer.

“Customer centric” is a term bandied about these days. If you plan to do multi-channel tracking of your customers, then you will have to take customer centric to heart. It is one thing to say, “I have some log files and if I put a persistent cookie on the user-agent perhaps I could get some useful information.” and another to say, “I would like to know what my customers are doing through my online and offline marketing channels and track them to my stores to better promote my brands and increase profits.”

In both statements you are collecting data on individuals so at a minimum you must disclose how the data will be used and how their privacy will be protected. A privacy policy with P3P compliance and enforcement is a given. To work towards the second statement, you will have to be willing to append that statement with “and will bring increased service and value to our customers.” The business will need to reconcile profit and customer service.

The ideal business model would be that increased service to a customer directly increases profit, but this may not be practical in reality. Providing a coupon discount has value to the customer and reduces profit, but in return you have gotten implicit approval for tracking that customer from online to offline in the store, because the customer received tangible value for disclosing that information. There are of course more covert methods for tracking customers that don’t require giving away profit, but in the end your customers must trust that the information you collect will not harm them and must in some way benefit them over those that do not disclose or opt out.

It is important to know the true value of contacts with your customers and how your company provides value to them right up front at the start because you will need to expend resources to not only collect the data, but also enforce the data policy behind the collection. Also you have to be able to communicate both trust and service to your customers. Your responsibilities as an analyst will be to protect both the data and the customer so that the investment is not wasted.

As a data wrangler, you will need to value every opportunity with the customer and collect as much information as possible (consistent with data and privacy policy) with each contact. This means being able to track your customers regardless of the channel or medium of the contact. (This also means tracking every external to internal site crossing not just session entries.) What are all the ways you as a business interact with your customers? At each of these points there must be ways of bringing data concerning that contact into a central database. In most cases, there will need to be tools and processes in place at the contact points to facilitate this collection.

As a web analyst, you will need to move from visits to tenaciously tracking visitors over many visits, both anonymously and as registered users. You will have to go even further and deal with cookie churn to relink data that might be disconnected by a discarded cookie. One of the vendors that has been exceptionally good at keeping visitor state linked and exploiting all potential contacts with the customer, including call center and register inputs, is Coremetrics, whose niche happens to be online / offline retail brands. The down side is that they were also taken to task for doing this, emphasizing the need for a clear policy on privacy and data use. Full disclosure was how this volatile situation was defused. See “Visitors vs Visits” for a discussion on how to make this transition in your metrics and “Illustration of Skirting Axioms: Unique Visitors” for how vendor tools can make tracking unique visitors difficult.

The reason why this is important, if not self evident, is that there exists an accumulative knowledge base of customers and visitors such that if a coupon is necessary to link online and offline channels, the information gleaned is not temporary but has much longer lasting value. In P&L terms, the data acquired can be leveraged by multiple marketing campaigns and the cost spread over more revenue opportunities. That is the key to both minding the data and truly valuing customer contacts: there is a consistent long term memory of that customer that can be used for both the detriment and benefit of those customers.
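One way to picture relinking data across cookie churn is to treat every occasion where two identifiers are seen together (say, a cookie id present at a login) as evidence that they belong to the same visitor, and then group the identifiers transitively. The sketch below uses a plain union-find for this; it is an illustration under assumptions, not how any particular vendor does it. The function `link_visitors` and the identifier formats are hypothetical, and any real use would have to stay within the disclosed privacy and data policy.

```python
def link_visitors(pairs):
    """Group identifiers that were observed together into visitor groups.

    pairs: iterable of (id_a, id_b) observations, for example a cookie id
    seen at login together with a registered user id.
    Returns a list of sets, one set of identifiers per inferred visitor.
    """
    parent = {}

    def find(x):
        # Register unseen identifiers as their own root, then walk to the root.
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving keeps chains short
            x = parent[x]
        return x

    for a, b in pairs:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a  # merge the two groups

    groups = {}
    for ident in list(parent):
        groups.setdefault(find(ident), set()).add(ident)
    return list(groups.values())


# Illustrative observations: two cookies later tied to the same registered user.
observations = [
    ("cookie-1", "user-42"),
    ("cookie-2", "user-42"),
    ("cookie-3", "user-7"),
]
for group in link_visitors(observations):
    print(sorted(group))
```

Here `cookie-1` and `cookie-2` end up in the same group because both were seen with `user-42`, which is the kind of relinking a discarded cookie would otherwise break.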

It is the potential ramifications of that memory that have to be addressed directly and enforced to the protection and benefit of the customer. Your most loyal customers will have rather detailed and intimate profiles accumulated over time. Now you will need to work to protect the personal information in that profile while at the same time extract useful segmentation data that can be combined with anonymous behavior data for that same visitor. This will have to be done by clear policy disclosure and enforcement. During the policy formulation, you will need to deal with sticky issues such as correlating anonymous and registered user ids, cross correlating visitor data from different sources and channels, and ensuring that this effort is for the benefit of the customer and not used against them by the business. Not easy but necessary.

Step 3: Establish and understand normal for your business.

Even with the most benevolent intentions and conscientious efforts, it will not be possible to connect customers through all possible channels. For most initiatives you will have to rely on determining how a marketing campaign affects brand and traffic by how it affects your normal business patterns. To do that you have to establish and understand your normal business patterns both online and offline.

An excellent example is Apple, Inc. They have achieved a seamless boundary between their online properties and their offline stores. That is because their stores and call centers are actually online! I go to the web site to set up an appointment with the Genius Bar and later go to the store, talk with the genius, and pick up an Airport Drive while I am there. When the store assistant swipes my credit card, he asks whether I would like a paper receipt or a receipt sent to my email (which he has already because I have done previous online purchases). There I have been seamlessly tracked across 5 different channels and functions and did not get even a penny discount. I am happy because the process makes sense and in a way extends the online experience into the store.

Now what about those “I’m a Mac, I’m a PC” or the award winning Silhouette iPod commercials with billboard tie-ins? There does not seem to be a way of tracking these to online or stores. In these cases you have to go with correlating effects between marketing channels and business patterns. Since these ads affect brand as well as generate demand, they have to be evaluated through more traditional methods for brand and demand marketing – i.e. did brand awareness or product demand increase? However, in the case of Apple, having an understanding of their customers’ normal behavior patterns must add additional potency to their marketing analysis.

Now we are coming to a crossing point between the minders and miners. Before handing over this step to the miners, ask yourself, “Do I mine my financial database for profits?” The answer is of course no. There are very rigorous accounting procedures that record and verify every penny in and out, from which the profit is eventually calculated. On the other hand, you may indeed want to mine your financial data to identify areas of inefficiency or waste, or to determine the effect of specific decisions on cost and profits.

The same relation holds for analytic data. Though the data wranglers may have done an excellent job in the first two steps, these efforts will be of no value unless the data is reported and acted upon by the business managers. Just as managers must know the actual cost and margins affecting profits before they can act on the insights and trends from data mining, the same managers need to understand the measures and metrics affecting key performance indicators (KPIs) of the company before appreciating and acting upon trends and recommendations from data mining.

We will deal with the mining part in the next step; here we want to establish the cycle where data is collected and reported in a way that decision makers can monitor their domain and take actions as necessary as part of the normal business process. This is the primary function of most web analytic tools, but again one cannot assume that the vendor offering will solve all problems. The tools today are great at configuring custom reports and managing the work flow for distributing reports. However, more often than not, data still has to be downloaded into spreadsheets and combined with other data to get actionable results.

Manually collecting and manipulating data may be sufficient at the beginning, but each manual effort will be as grit to a sustained process, eventually bringing it to a halt. Eventually these manual operations will need to be automated and data stored and accessed in data warehouses. The reporting will likewise need to be automated, starting with moving beyond the dashboards that come with vendor tools to dashboards and drill down reports that combine and integrate all sources of data and business intelligence. We will return to marketing and business dashboards in the next step.

What is essential here is establishing the set of metrics that reflect your business and confirming and tuning these to your actual business cycle. Introduce a new product or service, change a policy, or initiate a marketing campaign, can you observe these effects in your data and are the effects what you expected? If not investigate, modify and improve the process.

I believe that funnels are a simple but eloquent way of incorporating the effect of your customers’ behavior on your business. Read “Funnels and the Paths They Make” for details on how this can be used to establish and understand your business as measured by your metrics.
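As a rough illustration of what a funnel gives you as a measure of normal, the sketch below computes the step-to-step and overall conversion rates from counts of visitors reaching each step. The function `funnel_report`, the step names and the counts are made up for illustration; they are not taken from the referenced post.

```python
def funnel_report(step_counts):
    """Compute conversion rates for an ordered funnel.

    step_counts: ordered list of (step_name, visitors_reaching_step).
    Returns a list of (step_name, count, rate_from_previous_step, rate_from_first_step).
    """
    report = []
    first = step_counts[0][1]
    previous = first
    for name, count in step_counts:
        step_rate = count / previous if previous else 0.0
        overall_rate = count / first if first else 0.0
        report.append((name, count, round(step_rate, 3), round(overall_rate, 3)))
        previous = count
    return report


# Illustrative counts for a simple four step purchase funnel.
steps = [("landing", 1000), ("product", 400), ("cart", 100), ("purchase", 25)]
for row in funnel_report(steps):
    print(row)
```

Tracked over the business cycle, the step rates become part of the baseline: a new product, policy change, or campaign should show up as a shift in one or more of them.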

All the cool things that one can do – A/B and Multivariate Testing, behavior targeting, optimizing marketing performance, predictive analytics – emerge from a well established foundation of analytics and decisions based upon the analysis. This means that we must not only trust the data quality and timeliness but have verified that the data that is collected reflects the real world behaviors of our customers and that the decisions we make affecting our customers can be measured by our metrics. Until we can establish and maintain this environment, all the cool things we want to do will be for naught and will instead quickly find the holes in that trust and work against our objectives.

Once the business is operating on a virtuous cycle of report – action – evaluation and adopting data driven decision processes then you are ready for the next step. I told you these were not easy.

Step 4: Detect novelty and adapt to change.

Now let the miners loose to reap through the data and find the nuggets of wisdom concealed within. Let them develop the trends and discover the patterns that characterize your customers’ behavior across all channels using Wisdom of the Crowd approaches (Baynote) or statistical analysis for correlations and significance (SAS); Bayesian classifiers and cluster algorithms to discover unexpected visitor segments (Interwoven); neural networks and genetic algorithms to form non-linear behavior models and test potential actions (Certona); or Kalman filters and Markov simulations to set up predictive models and adaptive controls. Go hog wild. All these methods have been demonstrated to be effective in analysis of the data.

Also while you’re at it, bring in your business analysts and economists to build models of your business from your data and suggest what further data is of value (ForeSeeResults). Make sure that they can give you a value for the data so you have some idea of the budget limits for acquiring the data. In most cases the data will be available from 3rd parties for a fee, or obtained through surveys, or even laboratory tests to confirm theories. Since you have established how data affects your business, these additional requests will have at least some value context associated with them.

However, the first task for data miners should be: characterize the normal trend and detect changes or novelty from this trend. This can be difficult but usually is not impossible. At first it may be sufficient to characterize your business in fuzzy notions, such as “We do more business during the week than on weekends” – “There are more visitors in the morning than in the evening” – etc. Eventually you will want a more refined model that is more nuanced to your different customers’ habits and needs.

Once you have established the normal trend based upon the ongoing business and marketing cycles, detect deviations or novelty from this trend and attempt to correlate them to actions or behavior changes of the customer. Is the novelty significant? Is it a good change that we want to take action to encourage, or an urgent problem that should be addressed ASAP? All these analyses are part of what are called recommendation engines that sort through the deviations; determine significance and confidence that the change is real; and then present possible causes and recommend actions in response to the change.
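To make "establish normal, then detect deviations" concrete, here is a deliberately simple sketch. It is not one of the non-linear methods mentioned earlier; it swaps in a plain per-weekday mean and standard deviation as the baseline and flags observations whose z-score exceeds a threshold. The functions `build_baseline` and `detect_novelty`, the threshold of 3.0 and the sample numbers are all assumptions made for illustration.

```python
from collections import defaultdict
from statistics import mean, stdev


def build_baseline(history):
    """Characterize normal per weekday.

    history: list of (weekday, value) with weekday 0=Monday .. 6=Sunday.
    Returns {weekday: (mean, standard_deviation)} for weekdays with 2+ samples.
    """
    by_day = defaultdict(list)
    for weekday, value in history:
        by_day[weekday].append(value)
    return {
        day: (mean(values), stdev(values))
        for day, values in by_day.items()
        if len(values) >= 2
    }


def detect_novelty(baseline, observations, threshold=3.0):
    """Return (weekday, value, z_score) for observations far from the baseline."""
    flagged = []
    for weekday, value in observations:
        if weekday not in baseline:
            continue  # no notion of normal for this weekday yet
        mu, sigma = baseline[weekday]
        if sigma == 0:
            continue  # cannot score against a baseline with no variation
        z = (value - mu) / sigma
        if abs(z) > threshold:
            flagged.append((weekday, value, round(z, 2)))
    return flagged


# Illustrative daily visit counts: Mondays (0) are busier than Saturdays (5).
history = [(0, 100), (0, 110), (0, 90), (0, 105), (0, 95),
           (5, 40), (5, 50), (5, 45), (5, 55), (5, 35)]
baseline = build_baseline(history)
print(detect_novelty(baseline, [(0, 180), (5, 48)]))
```

In this example only the Monday observation of 180 is flagged; the Saturday value sits well within its own normal range even though it is far below a typical Monday. Deciding whether a flagged point is a welcome change or an urgent problem is the part that still needs the business context.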

Returning to the Dashboard. The ideal dashboard would allow the user to quickly assess normal, whatever that is for her particular concern or discipline, and see a prioritized list of deviations in the data and recommendations for actions. The ideal report is a single list that allows the user to drill down on each point of novelty and see the supporting data with understandable annotations or graphics, with the ability to implement the appropriate action based on the data, in many cases incorporating the data in the action.

A good example is computing a Lift metric for a marketing or business action. One expects a multi-channel marketing campaign to cause some changes in customer perception or behavior. So with lift one is attempting to determine if a change from normal behavior can be attributed to the campaign and whether or not the trends in the campaign – reach and coverage – are correlated to the trends in the measured metrics. If our ads are scheduled for 5:00 PM on 20 TV channels, do we see an immediate lift in web site traffic and later increased sales in the store? For a branding campaign, do we observe more online traffic and positive buzz in the social media channels, which by the way we have been monitoring as part of the normal behavior? Here we are not tracking individuals across all the channels but are correlating actions by prospects and established customers across multiple channels. This is all possible because we have established and understand normal.
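One common way to express lift, used here as an assumption since the post does not fix a formula, is the relative change of the observed metric over its expected normal value:

$$\text{lift} = \frac{\text{observed} - \text{baseline}}{\text{baseline}}$$

A minimal sketch, with a hypothetical function name and made-up traffic numbers:

```python
def lift(observed, baseline):
    """Relative lift of an observed metric over its expected (normal) value."""
    if baseline == 0:
        raise ValueError("baseline must be non-zero to compute relative lift")
    return (observed - baseline) / baseline


# Illustrative numbers: normal web visits in the hour after 5:00 PM versus
# visits in the same hour on a day the TV ads ran.
normal_visits = 2000
campaign_visits = 2600
print(f"lift in web traffic: {lift(campaign_visits, normal_visits):.0%}")
```

The number only means something if the baseline is trustworthy, which is why establishing normal comes first. Attributing the lift to the campaign still requires checking that it lines up with the campaign's schedule and reach rather than with something else that changed that day.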

Don’t let go of the data wranglers yet, they still have important work to do. First they must work with web dev, marketing and IT to establish an environment that allows testing of variations of content and messaging and even presentation and web flow. Again to make this effort cost effective, you have to separate the various concerns so they can operate independently and in the end allow the testing to proceed independent of deployment – something that will take effort and coordination as well as trust and discipline. Then one can start testing and eventually experimenting with different approaches, without bankrupting the business!

If your business wants to move to more esoteric processes such as behavior targeting or predictive analytics to achieve site optimization, all the above goes into automatic – test and content changes are adapted within automated processes, or at least the changes show up in the recommendation list for approval. So no one is in the backroom sifting through the data, everyone works in the open together to bring it all together.

Step 5: Rinse and Repeat

Hardest step of them all. Once you have done all that hard work, you go back to step one and do it again. I suppose realizing that there is this step you can coast through your first iteration – get a vendor, tag your site, set up reports, show a dashboard with trends. At some point you will have to do the hard work – mind the data, value the customer, establish normal, detect and act on novelty. This will be rewarding work because you will be able to show some amazing results – 300% improvement in conversion rate, 500% improvement in ROI, 30% increase in profits! The next iteration you will likely not get these gains and you will need to act more shrewdly to get significant gains in ROI and profits.

However, I talk with businesses that have made their first couple of iterations of this effort and now want to take their business “to the next level”. Don’t expect that this will be done by initiating a magical marketing effort or somehow through clever analysis that leads to doubling the conversion rates. To move the business to the next level, the business model will have to evolve to the next level. There will have to be some business action – introduce a new product, define a new way to bring in revenue, become more responsive to customer need or satisfaction – that is outside of the marketing and analytics.

If the company has truly committed to these steps, then the discipline in marketing and analytics is there to support the transition necessary and bring out the best performance from every channel. So when you think work is done and things can go on automatic – you are back to step 1 and doing it all over again, hopefully with more cooperation and less grief than the first time.


Tracking Multi-Channel Behavior in 5 Difficult Steps

Step 1: Mind Your Data

Policy: Data Policy

Plan: Data Governance

Compliance: Sarbanes-Oxley (SOX)

Focus: The Axioms of Web Analytics

Implement: Data Instrumentation and Collection Processing across entire organization.

Step 2: Value Every Opportunity to Interact with Customers

Policy: Customer Privacy Policy

Plan: Marketing Channel Development Plan

Compliance: Platform for Privacy Preferences (P3P) Policy

Focus: Visitors vs Visits

Implement: Multi-Channel Tracking and Central Customer Database

Step 3: Establish and Understand Normal for Your Business

Policy: Business Policy

Plan: Business Online / Offline Development Plan

Compliance: IAB

Focus: Funnels and the Paths They Make

Implement: Performance Metrics and Business KPI

Step 4: Mine Your Data

Policy: Marketing and Brand Policy and Standards

Plan: Experimentation Strategy and Plan

Focus: non-linear methods for detecting novelty in time series data [Call]

Implement: Dashboards and Recommendation Engines

Step 5: Rinse and Repeat

Policy: Change or Expand Business Operations into new markets

Plan: Site Optimization Plan

Focus: Product and Service Road Map

Implement: Next Phase of Growth and repeat steps 1 through 4.

As the methodology and supporting infrastructure are built out, the execution of the steps should become easier. The primary driver will always be how well the processes driving the business are understood and how closely decisions affecting those processes are driven by metrics. The best data warehouse and premium tools will be for naught if this cannot be achieved. To give Kevin and J Billingsley the last word:

minethatdata: It’s hard to increase sales, best practices can be hollow. RT @jbillingsley Vendors rarely boast about improved sales. Show me the beef! 5:33 PM Mar 30th via web

The beef comes from effort and not vendors. So then imagine my surprise when my world was turned upside down and I found that vendors will solve all my problems from this MAJOR ADVERTISING ANNOUNCEMENT. When will I learn not to read blogs on April 1st?


About Timothy Kraft

An accomplished and innovative Web Analytics Professional and Business Intelligence Strategist. Over 10 years experience in development and
