The biggest problems with HR analytics, and intelligent software like Machine Learning boils down to one thing. The behavior of people in organizations is not like Jeopardy, Chess, Go, Marketing funnels, Autonomous cars, or other problems with a relatively finite set of answers. Thus, the tools one uses to solve those problems will always be wanting when applied to organizations.
Current tools are fantastic for reviewing the past and speculating about a future where nothing changed in the interim. Intelligent software can tell you a lot about the past and nothing at all about the future. That should be the thing you notice most about the score given a recommendation. It should be marked ‘based on historical inputs.’
When you hear that intelligent software is less error-prone than human beings, that’s not exactly true. It’s clearer to say something like, “Historically speaking, this decision is 90% likely to produce the same result as a similar decision did some period of time ago. We don’t actually have enough data to give you a real-time answer. Let’s talk about what’s happened in the interim.”
Reductionism (bear with me) is, “the practice of analyzing and describing a complex phenomenon in terms of phenomena that are held to represent a simpler or more fundamental level, especially when this is said to provide a sufficient explanation.” Most analytics, data modeling, and linguistic analysis make the assertion that their simple models adequately and accurately reflect the reality they describe. It’s a self-fulfilling prophesy.
Just because we measure and model an organization doesn’t mean we understand it.
Models are judged ‘usable’ when they can predict the past with 80% accuracy. In English, when the model can predict who won last year’s NCAA Tournament 80% of the time it’s good enough to use. That’s fantastic for situations with finite permutations. It’s pretty risky in organizations.
Totally Wrong 20% Of The Time
Latency is the difference between what the machine knows and reality. It is impossible for the machine to know what it doesn’t know. The variance between what the machine can see and what’s actually there happens on a variety of fronts.
Machine learning depends on historical data as the foundation for anticipating the future (actually, it predicts that the past will repeat itself). Historical data is notorious for missing immediate but undocumented circumstances. The latency problem means that any machine led decision is subject to completely missing the mark. 80% accuracy means that the system is totally wrong 20% of the time (not 80% right all of the time).
The Sonoma County Fire Example
Here’s a simple example.
I live in Sonoma County, CA, right near last summer’s fires. Recently, a number of the local business people have been experiencing a dramatic decline in cash flow, as much as 15% each month of the first quarter. It’s taken a lot of head scratching to figure out what happened
1/5th of the local housing stock burned to the ground last summer. The vast majority of those local homeowners live and work in the community. The insurance companies were generous with cash immediately after the fires.
But, many of the insured were not insured well enough to rebuild. And, the emerging wisdom is that similar fires may well happen soon. So, there are a large number of people who are living awkwardly (in RVs, hotels, expensive rentals, or away from the area). Their houses are gone. Their mortgages are not. They are running out of money.
They are not spending in the local economy. It may well last longer than a quarter. The neighborhoods are changing.
It’s a new world. It doesn’t match California or National economics. It’s an anomaly. Historical data does not predict the current circumstance.
Predicting That The Past Will Repeat Itself
If you had machine learning in place, it might well predict another downturn next year. The fires invalidated the relevance of history to future forecasting. The system is restarting with a different population basis, new inflation in home prices and radical shifts in crime rates, divorce rates, domestic violence reports, and social services demands.
These sorts of systemic resets do not happen in systems with fixed rule sets. They happen routinely in organizations. Organizations are complex dynamic systems that are very good at adjusting to change. Internal rules always change to accomadate shifting circumstances.
Another example of the way data ages is the emerging set of tools for tagging and cataloging learning data (videos and powerpoints). Automation is the very best way to inventory the surge of small, grassroots learning objects. Policies and practices change without a lot of rhyme or reason. As the way things get done changes, micro training assets become outmoded. Unfortunately, the older the asset, the more likely it is to be recommended by the system.
There are no good ways for tracking the current relevant quality of the data under management in an organization. You should expect this problem to metastasize. There will be new jobs for the people who curate the material the machine has categorized.