If seen quite a lot of questions on forums around extracting data from Excel workbooks using SSIS. I therefore decided to write this post to demonstrate how to extract data from and excel workbook. When extracting data from Excel, driver errors are frequently encountered that relate to 32-bit vs 64-bit drivers, e.g.:
Error at Extract data from Excel [Connection manager “Excel Connection Manager”]: The requested OLE DB provider Microsoft.ACE.OLEDB.12.0 is not registered. If the 32-bit driver is not installed, run the package in 64-bit mode.
This article will explain what causes these error messages and how to get around them.
- SSIS 2012 (although the process is very similar in 2008)
- Microsoft.ACE.OLEDB.12.0 provider (both 32 and 64-bit)
- An Excel 2010 or 2013 file with data on two sheets.
- Create a new SSIS project and package.
- Right click in the Connection Managers tab and select New Connection…
- On the Add SSIS Connection Manager dialog box, select EXCEL and click Add…
- On the Excel Connection Manager dialog box, enter the path of your existing Excel source file. Also specify the version of the Excel file and whether the first row contains column names or not. Click OK.
Note: You will notice that unlike most of the other connection managers, the Excel connection manager does not provide a way for you to specify the name of the connection manager. It defaults to “Excel Connection Manager”. You can rename it right clicking on the connection manager and then clicking on Rename, or by changing the Name property of the connection manager.
- Add a Data Flow task to the control flow and edit the Data Flow task.
- Add an Excel Source adapter to your dataflow and edit the source adapter.
The Excel connection manager treats each sheet in the Excel workbook as a separate table. At this stage, I tried to select the name of the sheet from the Name of the Excel sheet drop down list and got an error message:Could not retrieve the table information for the connection manager ‘Excel Connection Manager’.
This message was not very descriptive. I then tried a different approach. I change the data access mode from Table or view to SQL command and typed the following query:SELECT * FROM Geograhpy$
I then clicked on Preview, and got the following error message:
Error at Extract data from Excel [Connection manager “Excel Connection Manager”]: The requested OLE DB provider Microsoft.ACE.OLEDB.12.0 is not registered. If the 32-bit driver is not installed, run the package in 64-bit mode. Error code: 0x00000000.
That gave me a better clue. It seems like the 32-bit version of Excel was not installed. Remember that SSDT is a 32-bit IDE. Therefore, when data is access from SSDT the 32-bit data providers are used. When running the package outside of SSDT it runs in 64-bit mode (not always, but mostly) and uses the 64-bit data providers.
Always keep in mind that if you want to run your package in 64-bit (which you should aim for) you will need both the 32-bit data providers (for development in SSDT) as well as the 64-bit data providers (for executing the package in production).
I downloaded the 32-bit access drivers from http://www.microsoft.com/en-za/download/details.aspx?id=13255. After installation, I could see the worksheets as in the image below.
- Select one of the sheets and click on the columns section to update the metadata. Click OK.
- For this article, I will only be doing a row count of the data in the excel sheet to prove the concept. Create a variable of type Int32 (totore the row count in).
- Add a Row Count component to the data flow and connect the Excel source adapter data path to the Row Count component.
- Edit the Row Count component and assign the variable created in step 8. Your package should look like this:
You can now execute this package in either SSDT (32-bit) or through DTEXEC (64-bit) without any errors.
I hope this post was helpful to some of you.