Forrester reports that, in the USA, the companies that are best at data science …
• are twice as likely to be leaders of their segment
• have significantly higher revenue growth and profits
• and are most likely to have between 1,000 and 5,000 employees.
Now it’s tempting to read this as an endorsement for data science, and to infer that data science is potentially the ‘Secret Sauce’ for improving business results among the Hidden Champions of the Mittelstand. There are two big ‘buts’.
First, correlation is not the same as causation. Second, we need more information to determine what is cause and what is effect. Did those companies in fact become the best at data science because they spent twice as much budget on it as the others?
The goal of data-driven marketing is to create clarity and enable action. Revealing patterns, trends, and associations, especially relating to human behaviour and interactions, sounds like a great idea. But we can’t take data ownership for granted.
Sunand Menon notes that “many organisations assume that if they collect the data and house it in their systems, it must be their data”. But any processing of personal data in the EU immediately falls within the scope of GDPR. So it pays to evaluate data ownership early on, and if in any doubt at all, to get legal advice from a lawyer or Data Privacy expert before starting.
“It’s critical to treat customers and their data with respect.”
Where to begin?
How can marketers get started with data-driven marketing? Brad Brown suggests that managers first ask themselves “Where could data analytics deliver quantum leaps in business performance?” The next steps are to define a strategy for data analytics and implement it. While this may be a valid approach for large enterprises, it does not seem appropriate for the Mittelstand. This scenario describes a technique in search of a raison d’être. In mid-size organisations it’s called “putting the cart before the horse”.
In a brief “how-to” article, Thomas Redman advocates these steps to data analysis:
• formulate a question and write it down
• collect the data
• draw pictures to understand the data
• ask the “so what?” question
The “so what?” evaluation tells us whether the result is interesting or important. Many analyses end at this point, says Redman, because there is no value beyond the “so what?”.
Given that Mittelstand Marketers can afford to waste neither time nor resources, it makes more sense to ask the “so what?” question before we even start collecting the data. That way we can focus on the questions that will deliver results that are both interesting and important.
Common data standards
Before we can analyse the data, we have to get hold of it. Data mining – identifying and using the data you already have – is a sensible place to start. And yet this is where the difficulties begin.
Companies often hold data in multiple systems. Systems that were originally designed to serve distinct business units, departments or organizational functions. These systems were often built without reference to each other. As a result, they frequently use inconsistent data definitions and structures – even for the simplest of attributes.
The international standard ISO 3166, for example, defines several ways to describe a country: name, Alpha-2 code, Alpha-3 code and UN M49 numeric code (thus: Germany, DE, DEU, 276). To marry up systems that use different definitions, the structure must first be recognised and then translated into a common format before the data can be combined for analysis. In very old systems, however, the programmers may not have used ISO codes for standard dimensions and characteristics, which creates further complications and additional work.
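As a rough illustration of that translation step, the sketch below maps any of the four representations onto the Alpha-2 code. The lookup table is deliberately tiny and the function name `to_alpha2` is our own invention; a real project would load the complete ISO 3166 list from a maintained source.

```python
# Illustrative lookup table with three countries only (an assumption for
# this sketch); a real project would load the full ISO 3166 list.
ISO_3166 = [
    {"name": "Germany", "alpha2": "DE", "alpha3": "DEU", "numeric": "276"},
    {"name": "France", "alpha2": "FR", "alpha3": "FRA", "numeric": "250"},
    {"name": "Austria", "alpha2": "AT", "alpha3": "AUT", "numeric": "040"},
]

# Index every known representation (case-insensitive) to the Alpha-2 code.
_TO_ALPHA2 = {}
for row in ISO_3166:
    for value in row.values():
        _TO_ALPHA2[value.upper()] = row["alpha2"]


def to_alpha2(raw):
    """Return the Alpha-2 code for a country value, or None if unrecognised."""
    key = str(raw).strip().upper()
    if key.isdigit():
        key = key.zfill(3)  # numeric codes have three digits, e.g. 40 -> "040"
    return _TO_ALPHA2.get(key)


# Values as they might arrive from four different systems.
for value in ["Germany", "de", "DEU", 276, "West Germany"]:
    print(repr(value), "->", to_alpha2(value))
```

Values that return `None` are the ones that need manual attention, which is typically where the additional work in older systems shows up.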
69% of organisations are unable to provide a single, comprehensive customer view
Silos make it hard to manage and analyse enterprise-wide data. This example is just the tip of the iceberg. Suffice it to say, integrating data from a variety of silos is slow and resource intensive. Companies that have grown by acquisition will know this situation only too well. Rather than try to integrate two completely different systems, the usual decision is to keep one and close down the other.
There’s another factor at work here. The reality is that old systems reflect old business practices. And this has two important implications, both of which are uncomfortable. On the one hand, the way data is stored and processed today is not necessarily relevant for deciding the processes you need today or tomorrow. The reverse is also possible: there may well be types or categories of data that you need for an analysis that simply aren’t available in the form you want, because they have never been collected in that manner.
An example of this is the software vendor who sold a bundle of products using a single contract with a single price on the invoice and a single line entry in the CRM. It proved impossible to analyse accurately the market penetration, revenue attribution or competitive situation for each of the individual software products in the bundle. The information could not even be estimated by analysing customer technical support enquiries. The lack of information and insights at the desired level of detail hindered budget allocation, investment in product development and planning of marketing activities.
72% of companies say that managing multiple CRM systems across geographies/ technology silos is challenging
Next up: you may want or need to combine data from two different systems – only to find that they have been designed on a completely different view of the way the world works.
A classic example is the B2B Marketer who wants a “single view of the customer”. To achieve this, data from the CRM should be combined with data from the Online Marketing system. This is easier said than done. In the B2B world, CRM systems are designed around the fundamental unit of a customer organisation – each of which may have multiple contact persons. By contrast, the basic unit in an Online Marketing system is a contact person – and that contact person record can exist without belonging to a business.
Mapping contacts from the two systems against each other causes headaches whichever direction you try to solve it. The headache is that some data simply cannot be integrated – and is therefore unusable for analysis. Any loss of data in the analysis means a loss of accuracy in the evaluation.
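To make the data loss visible, here is a minimal sketch that tries to attach person-level contacts to organisation-level accounts. Matching on the email domain is purely an assumption for illustration, and all records are invented; real systems usually need several matching rules and still leave a remainder.

```python
# Invented CRM accounts (basic unit: organisation), keyed by email domain.
crm_accounts = {
    "example-gmbh.de": "Example GmbH",
    "muster-ag.de": "Muster AG",
}

# Invented online-marketing contacts (basic unit: person).
contacts = [
    {"email": "a.schmidt@example-gmbh.de"},
    {"email": "b.meier@muster-ag.de"},
    {"email": "c.freelancer@webmail.example"},
]


def match_contacts(contacts, accounts):
    """Split contacts into those matched to an account and those left over."""
    matched, unmatched = [], []
    for contact in contacts:
        domain = contact["email"].rsplit("@", 1)[-1].lower()
        account = accounts.get(domain)
        if account is None:
            unmatched.append(contact)
        else:
            matched.append({**contact, "account": account})
    return matched, unmatched


matched, unmatched = match_contacts(contacts, crm_accounts)
match_rate = len(matched) / len(contacts)
print(f"Matched {len(matched)} of {len(contacts)} contacts ({match_rate:.0%})")
```

Reporting the match rate alongside any result makes it clear how much of the data the analysis actually rests on.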
“Rubbish in, rubbish out” has been a mantra of computing since the earliest days; the need for data quality is acknowledged and understood. But once again reality gets in the way of the two key characteristics of data quality. Data completeness means that if you decide to add a characteristic to a database, you must add this piece of information throughout the entire database. Similarly, if you’re going to collect data, it has to be accurate – both at the time of collection and later on at the time of analysis.
As we know, the world is in a constant state of flux. Dun & Bradstreet – an organisation that provides credit rating services for businesses – invests huge amounts of effort in keeping its records of millions of global organisations up to date. The company knows only too well how fast the world is changing.
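Completeness, at least, is easy to measure. The sketch below counts, for each field, the share of records that actually hold a value. The records and field names are invented for illustration, and treating `None` and the empty string as "missing" is an assumption that will differ from system to system.

```python
def completeness(records, fields):
    """Return the share of records with a non-empty value, per field."""
    total = len(records)
    result = {}
    for field in fields:
        filled = sum(1 for r in records if r.get(field) not in (None, ""))
        result[field] = filled / total if total else 0.0
    return result


# Invented records; "industry" was added to the database later and only
# partially back-filled.
records = [
    {"company": "Example GmbH", "country": "DE", "industry": "Software"},
    {"company": "Muster AG", "country": "DE", "industry": ""},
    {"company": "Sample Ltd", "country": None},
    {"company": "Demo SARL", "country": "FR", "industry": "Retail"},
]

report = completeness(records, ["company", "country", "industry"])
for field, share in report.items():
    print(f"{field}: {share:.0%} complete")
```

Accuracy is harder to check automatically, because it usually requires comparing the record against an external reference.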
Each minute of an eight-hour working day:
• 211 businesses will move
• 429 business telephone numbers will change
• 284 executives or business owners will change.
Dun & Bradstreet
Faced with large volumes of data and the rapid velocity of change, it’s clear that a one-time data analysis project is going to have a very limited half-life for supporting decision-making.
Accessibility in real time
When data analysis is repeated on a regular basis, data quality becomes even more important. The data collection, cleansing and integration steps have to be repeated efficiently and effectively.
To improve data quality Thomas Redman advocates establishing a process management cycle. The first step is to measure data quality; the second, to decide which approach to use to improve quality. Redman lists and comments on three choices:
1. Unmanaged – not recommended.
2. Find and fix – resource intensive.
3. Prevent errors at the source – the best option.
“Improving data quality requires a cultural shift within the organization.”
Thomas C. Redman
Preventing errors at the source implies a change in practice. Instead of analysis being implemented as an activity (whether manual or batch), it has to be re-designed to become an ongoing process. A more advanced line of thought is to re-design and implement business processes so that they automatically generate the data that is needed for analysis. Data as a by-product of daily business, in fact. This transition from activity to process is a central and recurring aspect of digital transformation in marketing.
Data analysis describes the process of inspecting, cleaning, transforming, and modelling data to gain insights that support decision-making.
By establishing baselines, Marketers can identify patterns such as (say) seasonality. By distinguishing between noise and signal, medium-term trends can be accurately identified, enabling marketers to re-allocate resources more quickly in response to evolving markets.
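One simple way to separate a trend from a seasonal pattern is a moving average whose window spans one full season. The sketch below uses invented quarterly sales figures with a fourth-quarter peak; the four-quarter window and the function name `moving_average` are assumptions for illustration, not a recommendation for any particular dataset.

```python
def moving_average(values, window):
    """Trailing moving average; returns one value per full window."""
    return [
        sum(values[i - window + 1 : i + 1]) / window
        for i in range(window - 1, len(values))
    ]


# Invented quarterly sales over three years, with a peak in every Q4.
sales = [100, 110, 105, 150, 108, 118, 113, 160, 116, 126, 121, 170]

# A four-quarter window always contains exactly one Q4 peak, so the
# seasonal swing averages out and the underlying baseline remains.
baseline = moving_average(sales, window=4)

for quarter, value in enumerate(baseline, start=4):
    print(f"Quarter {quarter}: baseline {value:.2f}")
```

In the raw figures the quarter after each peak looks like a collapse; in the smoothed series the baseline rises steadily, which is the signal a marketer would act on.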
Scott Neslin advocates analysing the sales recency curve to investigate whether or not to invest marketing budget in a customer. He acknowledges that the results can be ambiguous. The sensible approach, he says, is to derive an action plan from the data, test it and measure the results.
“A lot of stories emerge from customer data.
The trick is figuring out which story to listen to.”
Scott A. Neslin
For Harald Fanderl, the greatest value of analysis comes from “pinpointing cause and effect and making predictions”. To improve customer journeys, Fanderl examines just the three to five journeys that contribute most to customers and the bottom line. “Narrow the focus to cut through the data clutter and prioritize,” he says.
What about sales managers – which metrics should they track? Scott Edinger prefers to measure the process rather than the outcome because managers have control over the process; whereas the outcome is determined by another variable which cannot be controlled (the customer).
“Managing the things you can control, will give you the best chance for success.”
The simplest analytics questions can be enormously powerful. Michael Schrage uses the Pareto principle to ask which 20% of customers generate 80% of the profits. And then he iterates this approach to identify the most profitable segments for future action.
“Learn which customers are profitable and which ones aren’t.
It makes it easier to see the opportunities.”
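The Pareto question can be answered with a sort and a running total. The sketch below finds the smallest group of top customers that accounts for a target share of profit; the customer names and profit figures are invented, the function name is our own, and it assumes no customer has a negative profit.

```python
def top_customers_for_share(profits, target_share=0.8):
    """Return the smallest set of top customers reaching target_share of
    total profit, together with the share they actually account for."""
    total = sum(profits.values())
    ranked = sorted(profits.items(), key=lambda item: item[1], reverse=True)
    selected, running = [], 0
    for customer, profit in ranked:
        if running >= target_share * total:
            break
        selected.append(customer)
        running += profit
    return selected, running / total


# Invented profit per customer (assumed non-negative).
profits = {
    "A": 500, "B": 320, "C": 50, "D": 40, "E": 30,
    "F": 25, "G": 15, "H": 10, "I": 7, "J": 3,
}

selected, share = top_customers_for_share(profits)
print(f"{len(selected)} of {len(profits)} customers generate {share:.0%} of profit")
```

To iterate the approach, run the same function again on the selected customers only, or on each market segment separately.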
The goal, says Chris Briggs, is to: “make informed decisions and not let the numbers lead you astray.” Though this is far from easy. As Andrew O’Connell and Walter Frick observe, the numbers don’t lie but: “can be slippery, cryptic, and, at times, two-faced. Whether they represent findings about your customers, products, or employees, they can be maddeningly open to interpretation.”
A good analysis provides a marketer with reference data that have predictive power. These insights are more than just a one-off event; they are patterns that describe baselines, trends, relationships.
There are several types of pattern that regularly appear in both the natural and man-made world: standard distributions; time series; 80:20 Pareto relationships; power curves with longtail distributions; direct and indirect causal relationships. Each of these describes a different context for analysis.
The approach to finding these patterns will depend in part on the available resources. If huge amounts of data, of high quality (completeness and accuracy) are readily available from a small number of systems and require little effort to integrate, then machine learning may be a good option. Machine-learning software identifies patterns in data and uses them to make predictions. So the work sequence is to let the machines identify the patterns in the data; and then test the patterns for their predictive power.
But who exactly is going to do the analysis? And what do you do if your organisation doesn’t have the skills or tools in-house? “Small and medium-size businesses are often intimidated by the cost and complexity of handling large amounts of digital information,” says Phil Simon. His solution: hire external data scientists via websites such as Kaggle [www.kaggle.com].
“Kaggle lets you easily put data scientists to work for you,
and renting is much less expensive than buying them.”
If on the other hand, the data is lacking in volume or quality, or if integration from disparate systems requires a lot of time and effort, then the best approach may be to begin by narrowing the scope of the project. Marketers do this by focussing on a clearly defined hypothesis before defining what data is necessary and which analysis will prove or disprove the hypothesis.
“In a world that’s flooded with data, there’s too much of it to make sense of.
You have to come to the data with an insight or hypothesis to test.”
Judy Bayer and Marie Taillard
At the very root of data-driven marketing is the ability to ask powerful questions. Asking questions is a skill. It is possible to develop it and get better at it, with practice, over time.
One approach is to reverse-engineer the issue and identify the really powerful questions by starting with a clearly defined goal in mind:
• What decision do you want to make?
• What insights will enable that decision?
• What questions will generate those insights?
• What data do you need to answer those questions?
Managers who have internalised their knowledge of a subject and their experience of a field know – seemingly intuitively – which questions need to be answered and whether it’s worth investing effort in rigorous data-driven analysis. Perhaps this is why page 7 of the Forrester report states: “48% of companies use intuition over data to guide their decisions”.
Human vs machine
So who makes the better decisions – the human or the machine? Andrew McAfee is one of several writers who have researched this area. In his view, “data-dominated firms are going to take market share, customers, and profits away from those who are still relying too heavily on their human experts”.
“When experts apply their judgment to the output of a data-driven algorithm, they generally do worse than the algorithm alone would,” he reports. “Things get a lot better when we flip this sequence around and have the expert provide input to the model.”
McAfee quotes from Ian Ayres’ book Super Crunchers: “Instead of having the statistics as a servant to expert choice, the expert becomes a servant of the statistical machine.” In other words, the expert’s job is to ensure that the process is: “quality data in, quality insights out”.
“The single biggest challenge any organization faces in a world awash in data is the time it takes to make a decision.”
Which brings us to the issue of what we actually do with the results. It may be a good idea to listen to Tom Davenport’s comment on decisions. In the final analysis, there’s not much point in investing time and effort in data-driven marketing if your management team can’t or won’t act promptly on the insights.