An Academy Dataset is a relational transformation which creates a new table in your database - one way of thinking of it is as a "pre-built" view, allowing you to reshape and pre-calculate your data to make it more closely align with your reporting requirements. You will use Datasets in both Transformation Spaces and Star Spaces.
A Transformation Space is a logical collection of Datasets, and we access them via the "TRANSFORM-it" item in the main menu.
Creating a Transformation Space
When we add a new Transformation Space, we are presented with just two options - the Transformation Space name, and a "Rollback On Failure" check-box. The option to "Rollback On Failure" allows all of the Datasets in the Transformation Space to revert to their original state if there is an error in the build process - this is ideal in a live environment, but for development we recommend leaving this option un-checked, so that we can better understand when things go wrong.

Adding a Dataset
If we save our new Transformation Space and drill back in, we see a Datasets tab, which allows us to add new Datasets to our Transformation Space. Each Dataset is based on a primary, root table which we can select from the drop-down. In the example that follows, we will create a Dataset based on the Order Details table, so we select that in the drop-down and click "Regular" to add a standard Dataset. (Pivot and Un-Pivot Datasets are covered elsewhere.)

Adding & Customizing Fields
The Dataset Details screen allows us to add fields to our Dataset by selecting them from the drop-down. These columns will then be included in the output table (DS_tblOrderDetail) when it is built.

Drilling into the detail of each field allows us to set more advanced options, such as renaming fields or designating a field as part of the Primary Key. We can also use custom criteria to filter the data returned in our Dataset. For example, say we were only be interested in products with a Unit Price greater than $5 - we can specify this in the "Criteria" field using Academy Custom Criteria. This way, only rows matching the chosen expression will be included in our output.
Note that the Criteria is applied to the source table. This means that the typing has to match the source - if the source is a datetime field, then the criteria applied will be date-based, even if a Field Function is applied to turn the output into something else!
At this same point we can also manipulate our source field by applying a "Field Function" to it. A field function is any function available to you in SQL Server, and you can even write your own function using UDFs! In this example, we apply the native SQL "Round" function, to eliminate the pennies from our output "Unit Price" data. When we apply a Field Function, we can reference any of the fields from our source data, but the "#" character is a convenient short-cut to the currently scoped field.

Dataset Detail Options
The Dataset Details screen contains a wealth of other options to customize your output.

Target Table: The name of the output table. This can be a table which already exists, in which case columns will be appended to the table where required.
Target Option: The first time you run your Dataset, data is appended to your output table as you would expect. But what about the second (and subsequent) runs? Should the existing data be deleted? Or should only new records be added to the table? The most important options are:
- Append - Append all records every time the Dataset is run.
- Delete With Rollback - Delete all records in the target, and then append all records from the source.
- Truncate Without Rollback - Truncate all records from the target, and then append all records from the source. This will be faster than "Delete With Rollback" but any subsequent errors in the Transformation Space may leave your warehouse in an inconsistent state.
- Append New* - Only new records in the source are appended to the output.
- Append / Update* - New records are appended to the output, and existing records are updated.
- Append / Update / Delete* - New records are appended to the output, existing records updated, and any records no longer in the source are deleted. This is functionally equivilent to the "Delete" and "Truncate" options, but may be faster in some circumstances.
*
These options require you to identify a Primary Key.
Transitory: A "Transitory" Dataset is somewhat equivalent to a "temp" table - it exists while the Transformation Space is running, but is automatically dropped at the end of the process.
Apply Escalation: Apply Escalation Criteria to your Dataset, alerting you by email to the presence of data matching specified parameters.
Last Completed: The date and time on which the Dataset was last run.
SQL Button: This button is one of your most useful tools for diagnosing problems with your Dataset. If your Dataset runs without errors, but still doesn't work quite as expected, this is where you want to be - it shows you the SQL "SELECT" statement that Data Academy will run to create the output records for your Dataset; and you can copy-paste it into SQL Server Management Studio to debug it.
Description: A free-text field for you to document your Dataset.
Target Fields - Input Amended Date: Adds an Input Amended Date field to your output, driven by a SQL Trigger, allowing you to catch even changes made outside of Data Academy. The field is always named InputAmendedDate.
Target Fields - Identity - Adds a SQL Identity (auto-number) field to your output. In Data Academy prior to 5.6, the field is always named IndentityID. In 5.6+, you can amend the field name by clicking the "..." link next to the check-box.
Advanced WHERE Clause: - A pure, native SQL "WHERE" clause to add advanced filtering to your source data. (You don't have to type the word "WHERE", although if you really want to...!!)
Select Distinct: - Selects only unique records from your source, allowing you to de-dupe your output.
Running Your Transformation Space
Building your Transformation Space does two things - it creates (or adds newly required fields to) the target table; and then inserts the data from the source table(s). In Academy "lingo" we call this "Running" the Transformation Space; and to do so you need to click the "Run" icon on the root Transformation Space screen. (Note that, unlike some Data Academy components, there is no seperate option to "Generate" a Transformation Space.)

When you Run your Transformation Space, Academy will take you to the Data Services Event screen. Initially the event will be colored gray, inidcating that it is pending. Shortly after, it will turn orange/ bold as the processing begins. Finally, it will turn green on success, or red/ italic on failure.

If the Run fails, be sure to drill into the Data Services Event Detail screen to identify the problem. Academy Data Services keeps a thorough log of every operation it carries out, together with the SQL commands processed.
