Saturday, September 5, 2015

Data Warehouse Concepts Interview Questions

1) Describe Enterprise Data Warehouse ?

Enterprise Data warehouse ( EDW or EDH ) store current and historical data integrated from one or more disparate sources. This is a system used for data analysis and reporting purposes.

Bill Inmon created the accepted definition of what a data warehouse is - a subject oriented, nonvolatile, integrated, time variant collection of data in support of management's decisions..


2) What is a Data mart ?



A data mart is a miniature version of data warehouse where the focus is on a single subject area or functional area like sales, marketing or finance. Data marts are normally owned and operated by individual departments.


3) What are the two most commonly used approaches for building a data warehouse ?

Bill Inmon’s "top down" approach : 


Bill Inmon is widely recognized as the father of data warehousing. According to him there one data warehouse at enterprise level and individual dimensional data marts source from EDW. "Atomic" data, that is, data at the lowest level of detail, are stored in the data warehouse in the 3rd normal form.

Here are the salient features of data warehouse according to Inmon.
  1. Subject-oriented: The data in the data warehouse is organized so that all the data elements relating to the same real-world event or object are linked together
  2. Time-variant: The changes to the data in the database are tracked and recorded so that reports can be produced showing changes over time
  3. Non-volatile: Data in the data warehouse is never over-written or deleted -- once committed, the data is static, read-only, and retained for future reporting
  4. Integrated: The database contains data from most or all of an organization's operational applications, and that this data is made consistent

Ralph Kimballs’s "bottom up" approach : 


According to this approach data marts are first created to provide reporting and analytical capabilities for specific functional areas. These data marts can be later integrated together to form data ware house.

Here are the highlights of this approach.

  1. Dimension Modelling: Data stored in a de-normalized ( 2NF ) form. A dimensional model packages the data in such a way so as to enhance user understandability, query performance, and resilience to change.
  2. Bus Architecture: Implementation of the "bus" is a collection of confirmed dimensions which are shared by facts across multiple data marts.

No comments:

Post a Comment