benchmarkingblog

Elisabeth Stahl on Benchmarking and IT Optimization

On Big Data: Count Me In, But Do It Right


Our local high school is now offering a new class in introductory statistics. And from what I’ve been seeing lately, we need this like my dog needs rawhide. (You see, otherwise he will chew on sticks, rocks, and cement.)

I was recently reviewing some availability statistics. A regulatory group (which shall remain unnamed) was comparing the number of outages across different types of equipment. Which is all very fine. The problem was that they were counting raw numbers of outages, not outages as a percentage of the equipment installed. A raw count means very little when you may have hundreds of instances of one type of equipment and a total of ONE of another.
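To make the counts-versus-percentages point concrete, here is a minimal sketch in Python. The equipment names and every number in it are made up for illustration; they are not the regulatory group's data. It only shows how the "worst" equipment type can flip once outages are divided by the number of units installed.

```python
# Hypothetical fleet data: all names and numbers are invented for illustration.
fleet = {
    "type_a": {"units": 200, "outages": 10},
    "type_b": {"units": 1, "outages": 1},
}


def outages_per_unit(units: int, outages: int) -> float:
    """Outages normalized by how many units are installed."""
    return outages / units


rates = {
    name: outages_per_unit(d["units"], d["outages"])
    for name, d in fleet.items()
}

# Ranking by raw count and ranking by rate can disagree.
worst_by_count = max(fleet, key=lambda name: fleet[name]["outages"])
worst_by_rate = max(rates, key=rates.get)

print("Most outages in total:", worst_by_count)
print("Most outages per unit:", worst_by_rate)
```

With these invented numbers, type_a has ten times the outages but type_b has the far higher rate. A rate built on a single unit is itself shaky, of course, which is one more reason to look at how many instances sit behind each figure.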

Another fallacy was that they were analyzing the 95% of the outages that traced back to one maintenance issue, an issue that had recently been solved. What they really needed to focus on was the other 5%, and especially the outliers.
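One way to act on that, sketched in Python under loud assumptions: the outage log below is invented, the cause labels are hypothetical, and "longer than three times the median duration" is just an arbitrary rule of thumb for flagging outliers, not a standard. The idea is simply to set aside the already-solved cause and look hard at what is left.

```python
from statistics import median

# Hypothetical outage log of (cause, duration in minutes). All values are invented.
outages = [("maintenance", 30)] * 95 + [
    ("power", 20),
    ("power", 25),
    ("network", 15),
    ("firmware", 22),
    ("network", 240),
]

# Assumption: this is the cause that has already been fixed.
RESOLVED_CAUSE = "maintenance"

# Keep only the outages that the fix does not explain.
remaining = [o for o in outages if o[0] != RESOLVED_CAUSE]
share_remaining = len(remaining) / len(outages)

# Arbitrary rule of thumb: flag anything longer than 3x the median duration.
typical = median(duration for _, duration in remaining)
outliers = [o for o in remaining if o[1] > 3 * typical]

print(f"Share still unexplained: {share_remaining:.0%}")
print("Outliers worth a closer look:", outliers)
```

Here the five leftover outages are the whole story going forward, and one of them is an order of magnitude longer than the rest.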

Another technique that drives me crazy is when someone rounds up when they should round down.

I’m not saying that everyone needs to have a deep understanding of multivariate ANOVA or the like. But with the plethora of Big Data applications and the way data is now woven into our society and into everything we do, it becomes exceedingly important to analyze and understand it in the right way.

We love to say “Do the Math.” But we need to make sure that when we do the math, we use the data in the correct and very best way to solve the problem.

************************************************

The postings on this site solely reflect the personal views of the author and do not necessarily represent the views, positions, strategies or opinions of IBM or IBM management.


Written by benchmarkingblog

September 18, 2013 at 11:42 am

Posted in Big Data


One Response


  1. Well said, Elisabeth.
    You inspired me to see if this is cited in the new Common Core.
    They have a section on Mathematics » High School: Statistics & Probability »
    http://www.corestandards.org/Math/Content/HSS/introduction
    “Decisions or predictions are often based on data—numbers in context. These decisions or predictions would be easy if the data always sent a clear message, but the message is often obscured by variability. Statistics provides tools for describing variability in data and for making informed decisions that take it into account.”

    Bob Creedon

    October 15, 2013 at 1:26 pm

