Usually, when a company embarks on a data clean-up exercise, the focus is mainly on in-depth analysis and profiling of the data.
Sometimes it’s worthwhile to step back from the in-depth look at data cleaning to get some perspective.
You should start asking questions like “Why is the data dirty in the first place?”, “Are any of the clean-up exercises that we are currently doing reversible?” or “Does this really need to become an ongoing process?”
One of the biggest themes in operational databases and data warehouses alike, universally recognized but far too often ignored, is the cleanliness of the data.
From hundreds of meetings with data processing and IS staff, I have identified three consistent themes.
While working on different database projects, I usually find it necessary to perform a data sizing exercise on all tables, focusing mostly on the number of rows, in order to determine a growth factor for each table. This kind of information is useful for long-term planning of the growth and scalability of the data that we are managing.
Out of an exercise like this one we can also determine future needs for data archiving.
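
As a rough illustration of such a sizing exercise, the sketch below counts the rows in every table at two points in time and derives a growth factor per table. It assumes a SQLite database purely so the example is self-contained; the table names (`orders`, `customers`) and the helper functions `table_row_counts`, `growth_factor` and `project_rows` are hypothetical, and the projection assumes the same growth factor repeats each period, which real data may not follow.

```python
import sqlite3


def table_row_counts(conn):
    """Return {table_name: row_count} for every user table in a SQLite database."""
    names = [
        row[0]
        for row in conn.execute(
            "SELECT name FROM sqlite_master "
            "WHERE type = 'table' AND name NOT LIKE 'sqlite_%'"
        )
    ]
    # Names come straight from the catalog, so quoting them as identifiers is enough.
    return {
        name: conn.execute(f'SELECT COUNT(*) FROM "{name}"').fetchone()[0]
        for name in names
    }


def growth_factor(earlier_rows, later_rows):
    """Ratio of the later row count to the earlier one; None if the table was empty."""
    if earlier_rows == 0:
        return None
    return later_rows / earlier_rows


def project_rows(current_rows, factor, periods):
    """Project a row count forward, assuming the same factor applies every period."""
    return round(current_rows * factor ** periods)


# Hypothetical demo data: two tables, measured before and after some inserts.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY)")
conn.execute("CREATE TABLE customers (id INTEGER PRIMARY KEY)")
conn.executemany("INSERT INTO orders (id) VALUES (?)", [(i,) for i in range(100)])
conn.executemany("INSERT INTO customers (id) VALUES (?)", [(i,) for i in range(20)])
first_snapshot = table_row_counts(conn)

conn.executemany("INSERT INTO orders (id) VALUES (?)", [(i,) for i in range(100, 150)])
second_snapshot = table_row_counts(conn)

for table in sorted(second_snapshot):
    factor = growth_factor(first_snapshot[table], second_snapshot[table])
    print(table, first_snapshot[table], second_snapshot[table], factor)
```

In practice the two snapshots would be taken weeks or months apart and stored somewhere durable. Tables whose factor stays well above 1 over several periods are the natural candidates for an archiving plan.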