Tag Archives: sdc

Data Warehouse – Slowly changing dimensions (High Level) in a dimensional model

Slowly changing dimensions (High Level) on a Dimensional model

So why do we need changes on a dimension in any case lets take a real world example: we have a dimension (Table) with customer information and one of the columns (field) are named Customer Surname and another Marital status now as we all know people grow old and things change and some of us get married so our marital status change from being a single person to having a partner thus we surely need to see this change in the data and in some cases we even take the surname of the partner.

We could have simply overwritten the data but then how do we see how many people was single at any point in time and what was that persons surname before. That was one sample where you would like to see the history of the dimension we have places where we simply will update the existing records it maybe the age of the person this does not need to be tracked (if we have the birth date) in the dimensions history thus we will simply overwrite the value.
Then we even have one more possibility that I know about the (not use it myself before) but here we go as from the Ralph Kimball books a type 3 where you have an additional column that will display the prev value from the fact table thus keeping the joining in history and going forward.

Ok now the breakdown and properties of all the above mentioned Types:

SCD Types

  • Type 1

    Overwrite existing attribute.
    No history
    Update by business KeyDay 1
Key Age DOB
1 12 1978/10/12

Day 2

Key Age DOB
2 13 1978/10/12
  • Type 2
    Inserting new records
    Update new records Open date. – My preference to make tracking easy.
    NB: Open date indicator is system column not related to the data’s data you may also use version number.
Key Surname M Status Close data
1 Small Single 01/01/2005
2 Small Married 01/02/2012
3 Van den Heever Married 01/03/2012
  • Type 3
    Allows us to associate existing Fact table data with a deferent view of the data like an alternative hierarchy depending on how you select the report.
Key
(Natural)
Product Description Current Department History Department
1 Games Children Children
1 Games Adult Children
1 Games Retired Adult
  • Type 4
    This is simply keeping history in a deferent table with a date stamp when the record was inserted on the history table.
  • Type 6/Hybrid
    Inserting new record and updating fields on the existing record before closing it.
    Update existing records active indicator – my preference to make tracking easy.NB: Active indicator is system column not related to the data’s data.
    Update existing records Close date – my preference to make tracking easy. Insert a default date here when no date far into the feature (makes MIS simpler)NB: Active Close date is system column not related to the data’s data. Active can be called Current record.
    Inserted new record must have closing records ID(Key) – my preference to make tracking easy.NB: Active closing records ID (Key) is system column not related to the data’s data.
Key Surname M Status Open data Close Date Active Prev Key
1 Small Single 01/01/2005 01/02/2012 False -1
2 Small Married 01/02/2012 01/03/2012 False 1
3 Van den Heever Married 01/03/2012 01/12/2999 True 2

These are all slowly changing attribute when dealing with rapidly changing attributes then we need to consider other alternatives – I will go into details on this in later blog posts.
As you can clearly see you can now play with def types on one table but I would suggest keep it simple you goal at the end of the day is to ensure that the end user can use the table and that its all fast!!.

Fast for read and insert do not overdo the SDC or you will entangle yourself in a spaghetti bowl once you need to make changes to any ETL process.

More details have a look at The Data Warehouse Toolkit from Kimball Group.

For more types and more advance stuff stay tuned and read on the web.

http://en.wikipedia.org/wiki/Slowly_changing_dimension

Remember Google is your friend with searching and make an informed decision when you create you model do not say for example all must be SDC type 2 this can be detrimental to your end gaol.

Did you notice we are missing a type 5?

A funny thing we asked Ralph why did he call it Type 1,2,3 … changes – his answer – “We could not think of a name and the industry started to use it” lol …

{Views and opinions on this Blog does not reflect current/past employers view(s).}