The Need for Organized Data – or, Why Your Data is Sh!t

The most successful companies in the world have clean data at the source running through the pipes and out to all the people who need it. It’s a valid – and attainable – goal to want your enterprise to have a data ecosystem that is clean and free-flowing. 

Unfortunately, in many large organizations with disparate systems, that sort of clean and free-flowing data ecosystem is not a thing. 

Think of each system in a company’s current ecosystem as a toilet.

You want to drink Poland Spring, not water that’s going to give you Montezuma’s revenge.

You need clean, organized data 

Let’s use an example here: You call up your bank for a loan. The first thing they need to do is put you into their customer system and get all your basic information so it can be propagated to other systems that need it.  

A good system will not allow the person inputting the data to put bad characters in. It will make them use proper upper and lowercase, and it will give them dropdown menus limited to appropriate entries. Each of those allowed entries is what’s called a valid value, and the check that tests whether your data is valid and alerts you if it isn’t is called field-level validation.
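To make that concrete, here is a minimal sketch of field-level validation in Python. Everything in it is an assumption for illustration: the field names, the list of valid account types, and the casing rule are made up, not taken from any real banking system.

```python
import re

# Hypothetical valid values, i.e. what a dropdown menu would offer.
VALID_ACCOUNT_TYPES = {"personal", "small_business", "commercial"}

# Letters, spaces, hyphens, and apostrophes only; must start with a letter.
NAME_PATTERN = re.compile(r"^[A-Za-z][A-Za-z' -]*$")


def validate_customer(record):
    """Return a dict of field -> problem. An empty dict means the record is clean."""
    errors = {}

    name = record.get("name", "").strip()
    if not NAME_PATTERN.match(name):
        errors["name"] = "name is empty or contains bad characters"
    elif name.isupper() or not all(part[0].isupper() for part in name.split()):
        # Assumed casing rule: every word starts with a capital, and not ALL CAPS.
        errors["name"] = "name must use proper upper and lowercase"

    if record.get("account_type") not in VALID_ACCOUNT_TYPES:
        errors["account_type"] = "not a valid value"

    return errors


clean = validate_customer({"name": "Jane Doe", "account_type": "personal"})
dirty = validate_customer({"name": "jane d0e", "account_type": "checking?"})
```

Here `clean` comes back empty, while `dirty` flags both fields. The point is that the bad record gets rejected at the screen, before it is propagated to any other system.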

All of this input starts with the customer data. But the same holds when you’re putting in account data: there should be some sort of field-level validation that keeps people from putting in erroneous data.

Because if you have one of these front-facing systems propagating data down to other systems and the data put in at the front is sh!t, that sh!t rolls downhill and just causes more sh!t to happen.  

And ultimately, you end up with customer records that are all messed up from data that does not match the data taken in by various other lending systems in the bank, whether it be a small business loan system, a commercial business loan system, or any of the different systems that you as a business could have.  

If your customer data doesn’t match up to what the other systems have, when you try to aggregate all that data together to create that 360-degree view of the customer and their accounts, some of those accounts won’t even show up because they can’t be linked to the customer. So you might not be able to get that complete picture.
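Here is a small Python sketch of how that plays out. The record layout, the IDs, and the `build_customer_view` helper are hypothetical and exist only to illustrate the linking problem.

```python
def build_customer_view(customers, accounts):
    """Group accounts under their customer. Returns (view, orphans)."""
    view = {
        c["customer_id"]: {"name": c["name"], "accounts": []}
        for c in customers
    }
    orphans = []
    for account in accounts:
        owner = view.get(account["customer_id"])
        if owner is None:
            # The ID doesn't match any customer, so the account can't be linked.
            orphans.append(account)
        else:
            owner["accounts"].append(account["account_id"])
    return view, orphans


customers = [{"customer_id": "C001", "name": "Jane Doe"}]
accounts = [
    {"account_id": "A-1", "customer_id": "C001"},
    # Same customer, but keyed in differently by another lending system.
    {"account_id": "A-2", "customer_id": "c001 "},
]

view, orphans = build_customer_view(customers, accounts)
```

Jane’s view ends up with only account A-1. Account A-2 lands in `orphans` because of a lowercase letter and a stray space, which is exactly the kind of thing validation at the source would have caught.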

And that’s kind of a microcosm for what is happening in a lot of businesses: a bunch of data toilets are feeding a data sewer.  

Data sewers in the real world 

I’m going to use a bus as an example. A system on the bus tells you how many people get on and off at any given stop. You have systems within the bus to record where it’s going and where it’s been, and you have systems that look at any work scheduled, both for the operator driving the bus and for the type of service that needs to happen on the line in a certain time.

The data from all of these different systems is collected and used to try to forecast the next quarter: what ridership is going to look like, what service needs to be built, what routes need more coverage, etc. And if you get good, reliable, organized data on all these things, you’ll notice patterns.  

But the problem is, a lot of those systems, like the automated passenger count system or the GPS system that’s put into a bus, are not necessarily configured properly when they’re first installed to collect all of the granular types of data that you need. Not at first, anyway.

Then, when you get to that forecasting phase, say one or two quarters down, you may not have the full picture. You may need to start to piece the data together and make decisions.  

Time and again, I’ve seen the folks who were tasked with coming up with the data to make some of these forecasts making stuff up based on an educated guess because their systems were deficient. And on top of that, what they weren’t communicating back to the owners of those systems was that if a few small tweaks were made, they’d result in more accurate and granular data, allowing the owner to make better decisions.  

And so, a huge part of the “data sewer” concept is that the people in the ecosystem who act as the sewage treaters in the treatment plant tend to take in that crappy data and massage it to make it look like something else. In other words, they polish a turd. And too many enterprises work this way.

They need to really think about how all of their systems interrelate, how those pieces of data interrelate, what level of granular data they need in order to make better decisions, and whether those systems were configured properly up front. If they’d actually take some time to reconfigure them so that they are getting the right level of organized data, then they could really do some powerful things with regard to reporting and metrics.

Ultimately, you need clean springs flowing into the main spring where you get the water. It’s much better than a bunch of people taking dumps in different systems and then having all the water be aggregated at a plant where people have to clean up the poop. 

One final note 

When you’re doing a large-scale enterprise conversion or engagement, you have to have an initial assessment. If you’re putting in a new enterprise application that touches lots of other applications, you need to do a quick assessment and analysis of how all those systems are communicating with each other, all the business lines that use them and how they talk to each other, whether through the system or outside of it, and see if there are any logical holes in that communication pattern.  

Have an analyst sit down and watch people do their daily activities and they’ll see pretty quickly if a system is antiquated and doesn’t have the correct validation on the screen. And while you have the hood up, so to speak, you might as well fix some of those deficiencies so that when you finish the project, you have things flowing the correct way again.  

What will end up happening is, month over month, quarter over quarter, year over year, you’ll get all kinds of actionable data that will allow you to sell more products and create new products that you didn’t know you needed because you’ll start to see trends that you didn’t see before.  

This is also why data stewardship, or the people who are in charge of the data, is a very important piece of the puzzle. You need a manager who’s going to be in charge of the data stewardship program. A lot of businesses realize this is a problem, but they don’t articulate it very well or understand why it’s happening. That’s another key part of the data sewer analogy, on top of things not being configured properly. 

Personally, I’m a big fan of following the person who cuts all the checks. If you want a decision made, you need to go to the person who’s actually signing the checks. Now, this is somewhat figurative in a large enterprise, but not completely. You are going to have some sort of department manager who actually signs off on invoices, approves whether additions or modifications to a system happen, and decides where the money is allocated.

There needs to be someone to verify that their system is good and clean – someone to make the holy water holy, or usable by other systems. And so many companies are reluctant to take ownership of the data, even though they have ownership of the application.  

This is important: taking responsibility for your data is not a bad thing. Understand that you’re going to get questioned if some of your data is bad, but it takes only slightly more effort for the data you’re in charge of to be really good and organized. All you’ll need to do is make a plan to periodically review your data and clean up discrepancies, or put together subprojects that will fix the little pieces of the system that cause the problem.
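A periodic review can start as something very simple. The Python sketch below compares records in a source system against a downstream copy and lists what doesn’t match. The record layout and the `find_discrepancies` helper are assumptions for illustration, not a description of any particular tool.

```python
def find_discrepancies(source, downstream, fields):
    """Compare records keyed by ID.

    Returns a list of (record_id, field, source_value, downstream_value).
    A record missing downstream is reported with field set to None.
    """
    issues = []
    for record_id, src in source.items():
        dst = downstream.get(record_id)
        if dst is None:
            issues.append((record_id, None, "present", "missing"))
            continue
        for field in fields:
            if src.get(field) != dst.get(field):
                issues.append((record_id, field, src.get(field), dst.get(field)))
    return issues


source = {
    "C001": {"name": "Jane Doe", "zip": "04101"},
    "C002": {"name": "John Roe", "zip": "04102"},
}
downstream = {
    # The leading zero was dropped somewhere along the way.
    "C001": {"name": "Jane Doe", "zip": "4101"},
}

issues = find_discrepancies(source, downstream, ["name", "zip"])
```

Run on a schedule, a list like `issues` tells the data owner which records to clean up and, if the same field keeps showing up, which piece of the system needs a subproject of its own.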

And in the end, you’ll be able to trust that even more of your incoming data is good, reliable, organized, and useful. 

Do you have contaminated data flowing through your business? If so, give us a call – let’s get it fixed and start making decisions based on clean, actionable data.