STAR-it Star Spaces

[Table of Contents]

Data Academy STAR-it Star Spaces are closely related to Transformation Spaces. In fact, they are a specialized form of Transformation Space, with some extra functionality to help you build Star Schemas for OLAP reporting.

A Star Space may contain the following entity types:

  • Fact Tables
  • Dimensions
  • Slowly Changing Dimensions
  • Time Dimensions

When working on a Star Space, you will set up one or more of each of these entities, and then define the relationships between them.

Fact Tables

A Fact Table sits at the center of our star, and contains the facts or measures upon which we will be reporting - for a manufacturing business, this might be the number of units sold and the unit price of each item. To add a Fact Table, we select the main source table from the drop-down. (Don't panic if your facts are stored across multiple tables - additional data sources can be joined-in later!)


Add the Fact fields to your Fact Table. Note that the default Target Option (Append New) requires you to identify a Primary Key. The Primary Key is identified by a yellow "key" icon.


Dimensions

A Dimension contains the descriptive data pertaining to your facts. You will be building multiple dimensions for products, customers, etc. Drill into your Dimension, and add the appropriate fields - Customer Name, City, etc.

When Data Academy builds your dimension, it will include a Surrogate Key (an Identity field) but you also need to identify the original Business Key of your data. The Business Key is identified by a red "key" icon.


Slowly Changing Dimensions

A Slowly Changing Dimension allows you to handle changes in the values of your data - for example, a customer may move from one city to another. A Slowly Changing Dimension will allow you to assign new orders to the customer in the new city, while "remembering" that older orders took place in the previous city; and linking the two together via the Business Keys.

To create a Slowly Changing Dimension, simply check the box in the Dimension Detail screen.

When Data Academy builds a Slowly Changing Dimension, it appends each new version of the record to the output data. By default it looks at all non-key fields to identify changed data, but if you want to only look at certain fields, set the Slowly Changing Dimension Constraint check-box in the Field Details.


In a Slowly Changing Dimension, Data Academy appends three special fields to the output:

  • Slowly Changing Dimension Start Date - A datetime field which defines when the record was created.
  • Slowly Changing Dimension End Date - A datetime field which defines when the record moved into history. While the record is "live" this will have a value of "09/09/9999".
  • Slowly Changing Dimension Status - A bit field, indicating the current record with a value of "1" and historical records with a value of "0".

Note: Existing "Slowly Changing Dimension" Data

If your data already contains fields corresponding to Slowly Changing Dimension Start/ End Date, then create a Regular Dimension instead, not forgetting to set the Fact Context Date property of the Fact Dataset!

Make sure that the target fields are called SlowlyChangingDimensionStartDate and SlowlyChangingDimensionEndDate; and that the End Date is a real date (not null). You can accomplish the latter with the following Field Function, which sets a null End Date to September 9 9999.

COALESCE(#, CAST('99990909' AS datetime))

Time Dimensions

A Time Dimension allows you to define the relationship between your data and the calendar - you might be interested in Financial Months or Calendar Quarters! When you add a new Time Dimension you need to identify the Start and End Dates. It is very important to ensure all of your data falls (and will continue to fall) within the identified dates, but you should also be careful not to define too wide a span, because a Time Dimension can get big - a Time Dimension with a granularity of "Minutes" will contain half a million records for every year! Unless you have an explicit need to support analysis by hour or minute, it is strongly recommended that the Granularity be left at the default "day" level.


When you save a new Time Dimension and drill back in, you will see that Academy has added some default fields. (These defaults are defined in the Dataset Date Expansions.) You can add new fields if you need them, or remove any that are unnecessary.


When Academy builds the Time Dimension it adds a special key to the output, called Time Key - this is a global key, derived from the number of minutes elapsed since midnight on 01/01/1753.

Automatic Scoping of Time Dimensions

In Data Academy 5.6+ you can set STAR-it up to automagically scope your time dimensions, by setting the Start and End Dates to be equal on the Time Dimension Details screen.

Data Academy will then derive the appropriate scope from the data which appears in any related FACT tables - see Defining Relationships, below.

It is important to note that this operation only ever WIDENS the scope of a time dimension - it will never narrow or reduce the scope!

Defining Relationships

Having specified your dimensions, you need to define how they relate to your Fact Tables. This is a analogous to setting up Foreign Keys in SQL Server! Within the Fact Table Detail screen, select the Relationships tab, and add a new relationship. You will be taken to the Relationship Details screen where you specify the keys which define the relationship.


You also have the option here to "Infer Missing Members" - this will create temporary records in the Dimension table to handle items which are currently missing. Use the "Inferred Defaults" tab to set up the default values which are created when a missing member is inferred.


When the Star Space is built, Academy uses the Surrogate Key in the Dimension table to link the Relationship to the Fact Table; and an "SK_" field is appended to the Fact Table to accommodate this.

For Time Dimension relationships, the Time Key is identified by the TK_ prefix. When a Time Dimension Relationship is defined, Academy also adds Time Floor and Date Floor fields to the Fact Table, which define the "rounded down" time ("05-07-2008 11:56" becomes "05-07-2008 00:00") as required by Analysis Services Time Dimensions.