LOAD-it

[Table of Contents]

Data Academy LOAD-it is the easiest way to load data into you warehouse, from almost any data source. There are explicit connectors for SQL Server, Oracle, Access and Text data sources; and an ODBC[1] connector which can deal with pretty-much anything else you might want to throw at it. Data Academy Data Loaders are thoroughly tested, and come predefined with logging and error handling built-in. Not only that, but LOAD-it also supports incremental loads[2], right out of the box, so you don't have to re-import your entire source system on a daily basis, but only the stuff that has changed.

LOAD-it works by querying your data sources for information about data types and field lengths, and then building an infrastructure to import the data into your warehouse. And it is so easy that in many cases you can have a working import within literally minutes of logging on to Data Academy.

Before you start, however, there are a couple of things you need to set up.

SQL Server Integration Services (SSIS)

LOAD-it works in conjunction with SQL Server Integration Services (SSIS) to import data from your source system into your warehouse; so you need a running instance of SSIS on your Data Academy server.

"Staging" Database

A Data Load is a two phase process. First Data Academy pulls your data into a temporary "staging" area, where it can be checked for errors using JAIL-it. Only then, when it is safe to do so, does Data Academy move your data into the warehouse. In order to provide this "staging" area, you need to create an additional database on your Data Academy server - call it something like "DataAcademyStaging". You will need to ensure that this database has enough storage space to handle your overnight load, and ideally it will have its own physical disk on the server.

SQL Agent

Data Academy uses the SQL Agent to run Data Loaders from the user interface, so make sure it is started on your server!

My First Data Loader

SSIS Service started? SQL Agent running? Staging database created? Ready to create a Data Loader in just 5 minutes? Right - let's get started!

 1 - Creating a Data Loader

Select LOAD-it from the Data Academy menu, and click "Add" to create a new Data Loader.

Thanks to all the smart defaults applied by Academy, there are just a handful of fields to fill out here.

  • Data Loader Name - Anything you like, but no punctuation please - SSIS doesn't like it!
  • Source Server - The name of the server we are pulling the data from.
  • Source Login/ Password - You can leave these bank to use Windows security, but remember that Academy will log into the Source Server using the account which is running IIS on your web server, so you may need to change this to one which has permissions on your source database. See below under 6. Windows Authentication.
  • Source Database - The database where the data is coming from.
  • Staging Database - The staging database we set up earlier.

Click "Save"!

2 - Adding Tables

When you drill back into your Data Loader, you will see the the "Data Loader Tables" section.

You can select the individual tables you want to import, but the easiest thing to do is simply click "Add All". This will carry out a full interrogation of your source system, so it may take a few moments - it is filling in all the details of column names, primary keys and data types, translating to SQL Server types when required. When the process is complete, your Tables section will list all the tables from your source system, together with the number of columns in each table.

3 - Table Details

The default values created by Academy are usually plenty sufficient to build a working Data Loader, but let's drill into the table details just the same.

You can see that Data Academy has added the fields in this table, and correctly assigned the Primary Key.

4 - Generating Our Data Loader

Our final task is to "Generate" the data loader. This will do two things.

  1. Create all the tables required to support our data import, both in the staging database and in the warehouse. In the staging database, Academy will create two tables - a "tmp" table into which the data is directly imported, and an "errors" table where any validation failures are placed. (A data item may fail validation if it contains invalid data - for example a date prior to 1/1/1753 - or if it fails one of our explicitly defined Data Jailer rules.) If you are doing an incremental data load, a "keys" table will also be created in the staging database. In the warehouse, Academy will create the target table, with a table name prefix as defined in the Data Loader.
  2. Create the SSIS packages to run our import. Three packages are created - the most important of which is the "Master" package - this is the one you should run if you want to execute your Data Loader in Management Studio.
To Generate the Data Loader, go back to the LOAD-it home screen and click the "Generate" icon.

You will be whisked off to the Data Services Event Log, and after a few minutes the Event should go green to indicate success. (If it goes red, you have an error, and you'll need to drill into the log to investigate. If it stays gray - check to ensure that Academy Data Services in running.)

5 - Running our Data Loader

Generating the Data Loader created an infrastructure, but it hasn't imported any data yet. To do that, you need to return to the LOAD-it home screen, where the "Run" button will now be enabled. Click it to "Run" the Data Loader and kick-off the import process.

The SSIS Import Log will provide details of all your data imports, and will indicate success or failure of the load. (Note that more detail on any errors will be found in the Data Academy Error Log. This Error Log reports failure conditions in some detail, and a failed load may result in several messages being posted, so don't forget to look at all of them!)

Drilling into the detail, we can also see the status of individual steps in the load, indicating how long each table took to import. (When examining this detail, bear in mind that unless you specified a "Serial Load" then tables are imported in parallel, at the discretion of the SSIS runtime engine - so the table that finished last may not be the one which took the longest!)

[1] ODBC Level 3 driver compliance required.

[2] Incremental loads are not supported for Text data sources.

6 - Windows Authentication

Ordinarily, Data Academy runs in a local account on the server on which it was installed. This account is, by default, NT AUTHORITY\NETWORK SERVICE and the reason for this account being used is that it has restricted permissions and, hence, a comparatively lower security risk.


However, the NETWORK SERVICE account (despite its name alluding to the contrary) cannot access files on other computers, network shares or mapped drives. So, if we want to build a Data Academy Data Loader sourcing, for example, an Access database on a remote computer, then we need to configure Data Academy to run in a domain account. Once this is done, then everything else just falls into place.

See the appendix Configuring Data Academy to run in a domain account