Bad Data is Worse Than No Data

If we’ve learned one thing from the Fake News era, and Lord knows we should have by now, it’s the perils of bad data.

Sometimes, bad data has nefarious origins. We all know where to look to see examples of purposely misleading or downright false information.

Often, though, bad data is not intentionally misleading, but is a result of competing priorities. In the Big Data era, our own personal data is a valuable commodity for companies that reprocess it and then sell it back to us, ostensibly in a form more helpful to us.

In their urgency to keep us hooked on their products and ultimately sell advertising, however, companies often present us with analyses of our data with a confidence and certainty that is not warranted, given the quality of the data itself.

Running And Bad Data

Runners tend to be a highly data-driven crowd. We quantify our speeds, our mileage, our heart rate, and just about anything else that can possibly be assigned a number. Good data, after all, is essential to know if we’ve achieved our goals.

New technology creates an increasingly humongous pool of data about ourselves. Just by strapping on my running watch, I generate thousands of data points per minute, including my GPS location, my heart rate, my motion, and a zillion other elements.

I primarily use two apps to help organize this data into useful form: Strava and Garmin. I use data from my Garmin watch throughout my run to monitor my pace, my distance travelled, and my elevation gain. Afterwards, I use Strava to learn how my run compared to others who’d done the same segments and to see my progress over the course of the run.

I use these two apps because of all the great data they help me collect and track. But not all the data I receive from them is good data.

In their eagerness to provide fitness and training-related data, these apps often show me information that is just plain meaningless. This includes my training level, which Garmin reports based on data collected from my runs, but which doesn’t always seem to take into account factors such as elevation gain. My Garmin also reports my goal steps, a largely meaningless figure that doesn’t take into account, say, steps running vs. steps walking, or other exercise I’ve had that day.

Strava, for its part, has the somewhat annoying habit of telling me if I’m “improving” on a run. But whether I’m improving has everything to do with what the goals of that run were. Was it a tempo run in which I was trying to beat a previous time? Or was it an easy recovery run, with no particular time goals?

In The Real World, Bad Data Is Toxic

If I receive bad data from a fitness app, it’s no big deal. We know to take the metrics they give us in stride and use our common sense in training and competing.

In the real world, though, bad data can be distinctly harmful, even when it is well-intentioned. IBM estimates that bad data costs the US economy roughly $3.1 trillion each year. Disasters from the Enron collapse to the 2016 presidential election predictions were rooted in bad data.

The solution is not to distrust all data. That, indeed, is a recipe for nihilism, and distrust of all media is exactly what has given rise to partisan media. Rather, the solution is to consider the source of your data and how it’s been collected and analyzed.

Moreover, amassing good data means collecting it from multiple sources when possible. If my Garmin watch tells me my training level has decreased but I feel strong and have been running on challenging courses, I factor those observations into the picture as well.
