5 Challenges to Standardised Data Analytics Platforms

The data analytics industry is growing fast, with large established players vying for market share against exciting new start-up companies. With such a range of services available, it is very difficult for businesses to know which approach is best. Should they use a standardised data analytics platform? If so, which one? Or should they build their own “in-house” analytics team, or hire a consultant and develop a bespoke system?

These are difficult questions to answer, and the answers will generally depend upon the business. In this post I will outline a few potential risks of using standardised data analytics platforms and what, if anything, can be done to avoid them. Standardised platforms are arguably the largest area of growth in this sector: many are now being bundled with cloud computing services such as those from Amazon and Microsoft. IBM’s ‘Watson’ is becoming the most renowned of these platforms, having won the quiz show ‘Jeopardy!’ (it was an earlier IBM system, Deep Blue, that defeated chess world champion Garry Kasparov); it has since successfully turned its attention to medical services and is now being applied to legal services.

As an analytics consultant, I am clearly biased towards the benefits of consultancy. However, I don’t believe that any of the following points are particularly controversial.

1) Data Entry

Most standardised systems will struggle to cope with missing or incorrect data. Unless told otherwise, many will simply ignore missing values and assume that all the remaining data is correct. This is an important issue because, when values are not missing at random, it can bias your results and lead to erroneous conclusions. Dealing with missing data is a common challenge in statistics and there are sophisticated methods available. However, choosing the correct approach depends upon the problem, so it cannot be offered as a default in a standardised platform. Beyond a statistical approach, one could also look to enrich and validate the data using external data sources, but this would also require a tailored solution.
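To make the bias concrete, here is a minimal Python sketch. The spend figures and the rule that the highest spenders fail to report are purely hypothetical assumptions chosen for illustration; they do not come from any real platform or data set.

```python
from statistics import mean

# Hypothetical data: true monthly spend for ten customers (illustrative only).
true_spend = [20, 25, 30, 35, 40, 45, 50, 80, 120, 200]

# Assumption: the three highest spenders did not report,
# so their values are recorded as missing (None).
observed = [x if x < 80 else None for x in true_spend]


def mean_ignoring_missing(values):
    """Average only the values that are present, as a naive default might."""
    present = [v for v in values if v is not None]
    return mean(present)


true_mean = mean(true_spend)
naive_mean = mean_ignoring_missing(observed)

print(f"true mean:  {true_mean}")
print(f"naive mean: {naive_mean}")
```

With these made-up numbers, the mean that ignores the missing values is 35 against a true mean of 64.5. The gap arises because the missing values are not a random subset of the data, and which correction is appropriate depends on why they are missing.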

2) Statistical Inference

In order to conduct statistical inference, it is imperative that you understand both the problem and the data. For example, if your sample is not representative of the entire population or the target group, then you will have biased results. The same can occur if you only have a small sample, or one that does not include significant events. In this situation one may wish to use Bayesian statistics, which can incorporate expert knowledge of the problem. Unfortunately, expert knowledge of a particular business cannot be built into a standardised approach. Another risk arises when using machine learning for prediction. These methods are excellent at modelling what has happened, but they are often very poor at predicting regime change, that is, a structural shift in how the underlying system behaves. These problems can only really be solved on a case-by-case basis by statisticians and/or data scientists.
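As a rough sketch of how expert knowledge can help with a small sample, consider estimating a conversion rate with a Beta prior and binomial data. The prior Beta(2, 18), standing in for an expert's belief that the rate is around 10%, and the sample of 0 conversions in 5 trials are hypothetical values chosen for illustration.

```python
def posterior_mean(prior_a, prior_b, successes, trials):
    """Posterior mean of a rate under a Beta(prior_a, prior_b) prior
    combined with binomial data (the standard conjugate update)."""
    return (prior_a + successes) / (prior_a + prior_b + trials)


# Hypothetical small sample: no conversions observed in five trials.
successes, trials = 0, 5

# The raw sample proportion claims the rate is exactly zero.
raw_estimate = successes / trials

# Assumed expert prior Beta(2, 18), i.e. a prior mean of 2 / (2 + 18) = 0.1.
bayes_estimate = posterior_mean(2, 18, successes, trials)

print(f"raw estimate:   {raw_estimate}")
print(f"bayes estimate: {bayes_estimate:.2f}")
```

The raw estimate is 0, whereas the posterior mean is 2/25 = 0.08: the prior stops five unlucky trials from producing the conclusion that conversions never happen. Choosing that prior is exactly the problem-specific judgement described above.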

3) Interpretation

This leads back to the previous point: it is important to understand the problem, the data and the method of statistical inference, because together they determine which questions you can ask of the data and under what conditions your inferences are valid. That understanding is what prevents the business from acting on erroneous results. Again, this can only be solved on a case-by-case basis by statisticians and/or data scientists.

4) Functionality

Data analytics can be applied to a wide range of business functions, and if you wish to develop a data-driven organisation it is vital to apply it across those functions and to integrate your approach. However, it is unlikely that a standardised platform will have all the required functionality. To add functionality, the best approach is to select a platform that allows third-party add-ons. Unfortunately, this will require users to pay additional fees, and the add-ons may still not be a good fit.

5) Competitor Differentiation

It is natural that the more common standardised platforms become, the fewer opportunities there will be for competitor differentiation. Once the benefits from standardised techniques, of which there are many, are exhausted, businesses will have to start tailoring their systems to remain competitive.

Standardised platforms can certainly provide large benefits to businesses in a short period of time: reducing costs, improving efficiency and increasing sales. They may also be enhanced by third-party add-ons that tailor a system and improve its functionality. However, it is not clear that they provide a cheap alternative to hiring an “in-house” analytics team or employing a data analytics consultant. This is because they are simply tools and, as such, require qualified data scientists to mitigate the inherent risks. Consequently, you must build a de facto “in-house” analytics team, which is expensive.

Proponents of standardised platforms would likely argue that the systems i) are a great low-cost way for businesses to start developing data analytics capabilities, ii) are robust and well tested with great support, iii) can be implemented very quickly and iv) will improve over time. I agree with these points. Finally, I’m actually excited to have the opportunity to play around with IBM’s Watson, as it is likely to have the most sophisticated natural language processing available. It would be great for people to be able to embed this in their own applications.
