I remember a news story from about 20 years ago in which a full-scale nuclear panic broke out in the back garden of a suburban house in the South East of England, after small capsules marked as nuclear waste were discovered when the flowerbeds were turned over.
To cut a long story short, the “nuclear waste” turned out to be the plastic pots attached to the bottom of a Dinky Toys Space 1999 Eagle Freighter, a toy I remember friends having in my own childhood. The tiny little triangular logo on a piece of TV merchandising was enough to summon the massed ranks of the emergency services. Describing something as “nuclear waste” is one heck of a metaphor.
Which is exactly what Maciej Cegłowski does in this cracking presentation from last year, describing how data should be regarded in such terms. We are living in an era of infophilia, where data and information are heralded as the new source of power, always benevolent. But of course every form of power can be turned to nefarious objectives, and maybe we should be thinking about that when creating and storing data.
Conceptually this chimed with me. I love a good metaphor. I appreciate really bad ones too. But it wasn’t until this week that I saw an example which made it all fit together.
I was hosting a session on the subject of gamification with a group of Learning and Development professionals organised by Winmark. The subject turned to an app designed to provide leaders with constant feedback data to give them information about how engaged their team were at any time.
The app provided data in a closed loop between team members and manager. What a great way to help gamify the task of leadership…
This, for me, is a classic example of data as nuclear waste.
Because whilst today the data will be private within teams, used only to help managers become more effective, the pressure within corporate life for everything to be metricised means that at some point, someone will have a bright idea: “Why don’t we turn that data into KPIs that can be used as part of our performance management processes to improve leadership?”
It is iconoclasm in the extreme these days to question the validity of the aphorism “You can’t manage what you can’t measure.”
But by creating these seemingly useful nuggets of data, a chain reaction can easily be imagined by which management and leadership quickly become much worse, fuelled by a meltdown caused by radioactive information.
There are two “laws” of social science that I often quote. The first, Goodhart’s Law, roughly speaking says that if you make a measure the goal of a social change, the meaning of that measure changes. The second, Campbell’s Law, roughly says that if you make a measure the goal of a social change, people cheat to hit it.
Take the engagement data from the app (useful as a barometer in the same way that an actual barometer might be useful for choosing what to wear in the morning) and use it to set goals, and it will no longer be a measure of engagement. It will be a measure of how good a manager is at managing their engagement score. That, crucially, is not the same thing. Nefarious means of managing that score will then emerge.
One of the most untrustworthy people I’ve ever worked with is a case in point. The manager, working in an organization that had decided to tie six-monthly staff survey results into their manager performance metrics, bluntly told his team members how to fill in their staff survey. He played the Prisoners’ Dilemma in real life, knowing that it was unlikely anyone would break ranks. Goodhart’s Law begat Campbell’s Law. It wasn’t a highly engaged workplace, but the metrics demonstrated the opposite.
As the costs of storing and capturing data continue to plummet, more and more of it is being collected and stored. Analytic and AI techniques might find some uses for it, although they will only ever be extrapolating the past into the future – good for some things, terrible for others.
But just like with nuclear waste, maybe we should now start to question this “well, it’s there” attitude to the creation and collection of data, adding a level of critique to the prevailing Data=Good orthodoxy, before we end up with multiple Data Sellafields that are costly and dangerous.
Tangential to this theme, but equally important, is the question of “how long is the data valid for”. I am frequently drawn back to a metaphor, admittedly posted by a data quality provider, that the half-life of most business data is far shorter than we think. https://www.dqglobal.com/2014/05/08/the-half-life-of-data/
I have no affiliation with the post in question, but the idea of applying a half-life approach to data quality resonates.
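
To make the half-life idea a little more concrete, here is a minimal sketch in Python. It assumes data goes stale exponentially, the way a radioactive isotope decays, which is the metaphor taken literally rather than anything I have measured. The function name and the two-year half-life in the usage lines are my own illustrative assumptions, not figures from the linked post.

```python
def fraction_still_valid(age_years: float, half_life_years: float) -> float:
    """Estimate the share of records still accurate after age_years.

    Assumes simple exponential decay: every half_life_years, half of the
    records that were still accurate have gone stale.
    """
    if half_life_years <= 0:
        raise ValueError("half_life_years must be positive")
    if age_years < 0:
        raise ValueError("age_years cannot be negative")
    return 0.5 ** (age_years / half_life_years)


# Hypothetical example: contact data with an assumed two-year half-life.
for age in (0, 2, 4):
    share = fraction_still_valid(age, half_life_years=2)
    print(f"after {age} years: {share:.0%} still valid")
```

Under those assumptions, a quarter of the records would still be trustworthy after four years. The real decay rate will differ from one kind of data to the next, so the half-life is something to estimate for your own data rather than take from a blog post.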