Fact Tables

Lesson Notes

Fact Tables Lesson Notes:

  1. Presentation Mode
  2. Static Mode

Video

Lecture Notes:

Fact Table Recap

“There is no point in hoisting fact tables up the flagpole unless they have been chosen to reflect urgent business priorities“ By Ralph Kimball, kimballgroup.com”

Fact Table

What is the fact table?

  • It is the foundation of the data warehouse.

  • It consists of facts and measurements of a particular business aspect and processes ex: daily revenue for a product.

  • It is the target of queries in most of DWH analysis and reports.

  • It contains measurements/facts and foreign keys to dimensions table.

  • It located at the center of the schema and surrounded by dimension tables.

How to design a fact table?

  • Choose the business process.

  • Identify the grain.

  • Identify the dimensions.

  • Identify the fact.

Fact Granularity

  • The grain is the definition of what a single row in the fact table will represent or contains.

  • The grain describes the physical event which needs to be measured.

  • Grain controls the dimensions which are available in fact.

  • Grain represents the level of information we need to represent. It is not always time; it could be the physical business measurement level.

  • Design from the lowest possible grain.

Fact Table Types

There are three types of fact tables:

  • Transaction.

  • Periodic.

  • Accumulated Snapshot.

Fact Types: Transaction Fact Table

  • Fact grain set at a single transaction

  • It has one row per transaction.

  • For each transaction, we add a new single record.

  • The transaction fact table is known to grow very fast as the number of transactions increases.

Fact Types:Transaction Example

customer_idtrns_datetrns_timecall_typeduration
12342020-01-0112:22:45.9Incoming29
12342020-01-0112:22:45.9Incoming3134
12342020-01-0215:22:45.0Outgoing890
12342020-01-0215:22:45.0International119
12342020-01-0323:22:45.0Incoming145
12342020-01-0323:22:45.0Outgoing124
12342020-01-0323:22:45.0Outgoing1200

Transaction fact example of telecom calls data.

Fact Types: Periodic Fact Table

  • A periodic fact table contains one row for a group of transactions over a period.

  • It must be from lower granularity to higher granularity hourly, daily, monthly, and quertrly, then yearly.

Fact Types: Periodic Fact Table Example

cust_idmonth_idincomingoutgoinginternational
12342020013133082124119

Fact Types: Accumulated Snapshot Fact Table

  • An accumulating fact table stores one row for the entire process.

  • It does not accumulate time it accumulates business process.

  • A row in an accumulating snapshot fact table summarizes the measurement events occurring at predictable steps between the beginning and the end of a process

  • Accumulating Fact tables are used to show the activity of progress through a well-defined process and are most often used to research the time between milestones.

  • These fact tables are updated as the business process unfolds, and each milestone is completed.

Fact Types: Accumulated Snapshot Fact Table Example

  • Accumulated Snapshot use cases are engaged when we need to report the entire process life-cycle.Fact Types: Accumulated Snapshot Use Cases.

  • It also uses to measure the process performance life-cycle.

    • Order life-cycle.

    • Insurance processing.

    • Hiring process.

Fact Types: Accumulated Snapshot Fact Table Example

Example of Accumlated Snapshot: An insurance company

  • It fact table named: fact_claim_processing.

  • This fact represents the claim life-cycle inside the company.

  • It contains detail related to claim.

  • This fact update after each stage finished.

  • The requirement it to report the number of days (lag) between stages (milestone) and the claim data (starting).

Virtualization Deployment

Fact Types:Accumulated Snapshot Example

  • One solution to implement the requirement is to use SCD.

  • In this case, we will have stages and dates, and we will calculate the difference between stages and dates using complex sub-query.

  • Another solution is to implement an accumulated snapshot fact

    Periodic fact example of telecom calls data:

    FACT_CLAIM_PROCESSING
    CLAIM_KEY
    CUSTOMER_KEY
    POLICY_KEY
    CLAIM_DATE
    INVESTIGATION_DATE
    REVIEW_DATE
    DECISION_DATE
    PAYMENT_DATE
    FACT_CLAIM_PROCESSING_ACCUM
    CLAIM_KEY
    CUSTOMER_KEY
    POLICY_KEY
    CLAIM_DATE
    INVESTIGATION_DATE
    DAY_TO_INVESTIGATE
    REVIEW_DATE
    DAY_TO_REVIEW
    DECISION_DATE
    DAY_TO_DECISION
    PAYMENT_DATE
    DAY_TO_PAYMENT

Fact Types: Accumulated Snapshot Table Example

Accumulated Snapshot Fact Example on Claim Process Data.

column_namecolumn_value
claim_key123
customer_key5235326
policy_key23632623
claim_date2020-01-01
investigation_date2020-01-03
day_to_investigate2
review_date2020-01-07
day_to_review6
decision_date2020-01-08
day_to_decision7
payment_date2020-01-11
day_to_payment10
process_completed_flag10

Fact Table Types: Comparison

Fact tables types comparison.

FeatureTransactionPeriodicAccumulating
Grain1 row/transaction1 row/time-periodevent stages
Date DimensionLowest granularityEnd-of-period granularityMultiple date
FactsTransaction activitiesPeriodic activitiesDefined lifetime activities
SizeLargestMediumSmallest
UpdateNoNoYes, after stage finished

Fact types

Each fact table includes facts and it has different types:

  • Additive facts.

  • Semi-additive facts.

  • Non-additive facts.

  • Derived facts.

  • Textual facts.

  • Factless fact.

Additive facts

  • It is the most flexible and useful facts.

  • Its measures can be summed across any of the dimensions associated with the fact table.

    Sales
    Date
    Store
    Product
    Sales_Amount

Semi-additive facts

  • It can be added across some dimensions but not all also known as (partially-additive).

  • what’s the total current balance for all accounts in the bank?

  • What’s the current balances for a given account for each day of the month does not give us any useful information?

    account_details
    Date
    Account
    Current_Balance
    Profit_Margin

Non-additive facts

  • It can’t be added for any of the dimensions.

  • Non-additive facts are usually the result of ratios (percentage) or other mathematical calculations.

  • Profit_Margin is an example non-additive.

    account_details
    Date
    Account
    Current_Balance
    Profit_Margin

Derived facts

  • Derived facts are created by performing a mathematical calculation on a number of other facts, and are sometimes referred to as calculated facts. Derived facts may or may not be stored inside the fact table.

  • Total_sales = Qty_Sold * ( Unit_price - Discount)

    Order_Details
    Order_id
    Item_id
    Order_date
    Qty_Sold
    Unit_price
    Discount
    Total_sales

Textual facts

  • A textual fact consists of one or more characters such as flags and indicators.

  • It should be avoided in the fact table

Factless fact

  • A fact table with only foreign keys and no facts is called a factless fact table.

References

Further Reading

Dimensional Modeling: In a Business Intelligence Environment. The book is free, and you can download it from this link. You can read the following:

  • Chapter 5.5 Identify the facts page 169
  • Chapter 6.4 Facts and facts types page 297