Category Archives: Big Data

How to implement a Credit Risk Data Warehouse – Part 2

In How to implement a Credit Risk Data Warehouse – Part 1 I gave an overview and an executive summary of the activities required to deliver a Credit Risk Data Warehouse project. If you haven’t read it yet, I recommend reading Part 1 first.

In addition to the above, the DW Architect, the Project Manager and their teams will have to define the implementation approach, which will allow the Implementation team to complete the work on an accelerated timeline.
A multistage “work breakdown structure” approach is usually preferred for the implementation phases of the project.

How to implement a Credit Risk Data Warehouse – Part 1

Over the past years I’ve worked as a hands-on architect on the design and implementation of data migration, data warehouse and business intelligence systems in industries such as Insurance, Retail, Food & Beverage, Pharma and Investment Banking. Of all these, nothing stands out quite like the complexity of building a Credit Risk Data Warehouse & Reporting system for compliance with the Basel regulatory requirements.

Data integrity & Business re-engineering

Usually, when a company embarks on a data clean-up exercise, the focus is mainly on in-depth analysis and profiling of the data.

Sometimes it’s worthwhile to step back from the in-depth look at data cleaning to get some perspective.

You should start asking questions like “Why is the data dirty in the first place?”, “Are any of the clean-up exercises that we are currently doing reversible?” or “Does this really need to become an ongoing process?”

Why do we end up with Dirty Data [platform agnostic]?

One of the biggest themes in operational databases and data warehouses alike that is universally recognized but far too often ignored is the cleanliness of the data.

From hundreds of meetings with data processing and IS staff, I have identified three consistent themes.

Although these three themes stand out dramatically as the biggest problems in corporate data access, the same data processing and IS staff who identify them are usually attacking only the first two.

What is Hadoop?

Here are two videos that explain in detail what Hadoop is and how it was invented.