How to Get Azure Notebook to Read Excel File in R
By: Ron L'Esteve | Updated: 2021-07-06
Problem
The need to load data from Excel spreadsheets into SQL databases has been a long-standing requirement for many organizations. Previously, tools such as VBA, SSIS, C# and more were used to perform this data ingestion process. Recently, Microsoft introduced an Excel connector for Azure Data Factory. Based on this new Excel connector, how can we go about loading Excel files containing multiple tabs into Azure SQL Database tables?
Solution
With the addition of the Excel connector in Azure Data Factory, we now have the capability of leveraging dynamic and parameterized pipelines to load Excel spreadsheets into Azure SQL Database tables. In this article, we will explore how to dynamically load an Excel spreadsheet residing in ADLS gen2 and containing multiple sheets into a single Azure SQL table, and also into a separate table for every sheet.
Pre-Requisites
Create an Excel Spreadsheet
The image below shows a sample Excel spreadsheet containing four sheets with the same headers and schema, which we will use in our ADF pipelines to load data into Azure SQL tables.
Upload to Azure Data Lake Storage Gen2
This same Excel spreadsheet has been loaded to ADLS gen2.
Within Data Factory, we can add an ADLS gen2 linked service for the location of the Excel spreadsheet.
Create Linked Services and Datasets
We'll need to ensure that the ADLS gen2 linked service credentials are configured accurately.
When creating a new dataset, notice that we have Excel format as an option which we can select.
The connection configuration properties for the Excel dataset can be found below. Note that we will need to configure the Sheet Name property with the dynamic parameterized @dataset().SheetName value. Also, since we have headers in the file, we will need to check 'First row as header'.
Within the parameters tab, we'll need to add SheetName.
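Behind the UI, ADF stores a dataset as JSON. As a rough sketch of what the settings above might translate to, the Python snippet below builds a dictionary mirroring such a definition and prints it as JSON. The dataset name, linked service name, file system and file name are hypothetical placeholders, not values taken from this article.

```python
import json

# Sketch of an Excel dataset definition with a SheetName parameter.
# All names (dataset, linked service, file system, file) are hypothetical.
excel_dataset = {
    "name": "DS_Excel_Sheets",
    "properties": {
        "type": "Excel",
        "linkedServiceName": {
            "referenceName": "LS_ADLS_Gen2",
            "type": "LinkedServiceReference",
        },
        # Declared on the Parameters tab of the dataset.
        "parameters": {"SheetName": {"type": "string"}},
        "typeProperties": {
            "location": {
                "type": "AzureBlobFSLocation",  # location type for ADLS Gen2
                "fileSystem": "data",
                "fileName": "SampleWorkbook.xlsx",
            },
            # Resolved at run time from the value passed into the dataset.
            "sheetName": {"value": "@dataset().SheetName", "type": "Expression"},
            # Corresponds to the 'First row as header' checkbox.
            "firstRowAsHeader": True,
        },
    },
}

print(json.dumps(excel_dataset, indent=2))
```

Because the sheet name is an expression rather than a literal, the same dataset can be reused for every sheet in the workbook.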
Next, a sink dataset for the target Azure SQL table will also need to be created with a connection to the appropriate linked service.
Create a Pipeline to Load Multiple Excel Sheets in a Spreadsheet into a Single Azure SQL Table
In the following section, we'll create a pipeline to load multiple Excel sheets from a single spreadsheet file into a single Azure SQL table.
Within the ADF pane, we can next create a new pipeline and then add a ForEach loop activity to the pipeline canvas. Next, click on the white space of the canvas within the pipeline to add a new Array variable called SheetName containing default values of all the sheets in the spreadsheet from Sheet1 through Sheet4, as depicted in the image below.
Next, add @variables('SheetName') to the items property of the ForEach Settings.
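For readers who prefer the code view, the variable and the ForEach setting described above could be sketched roughly as follows. This is a Python dictionary mirroring the pipeline JSON; the pipeline and activity names are hypothetical, and the inner activity list is left empty on purpose.

```python
import json

# Sketch of a pipeline with an Array variable and a ForEach that iterates it.
# Pipeline and activity names are hypothetical.
pipeline = {
    "name": "PL_Excel_To_Single_Table",
    "properties": {
        "variables": {
            "SheetName": {
                "type": "Array",
                "defaultValue": ["Sheet1", "Sheet2", "Sheet3", "Sheet4"],
            }
        },
        "activities": [
            {
                "name": "ForEachSheet",
                "type": "ForEach",
                "typeProperties": {
                    # The items property of the ForEach Settings tab.
                    "items": {
                        "value": "@variables('SheetName')",
                        "type": "Expression",
                    },
                    # The Copy activity is added inside the loop.
                    "activities": [],
                },
            }
        ],
    },
}

print(json.dumps(pipeline, indent=2))
```

Each iteration of the loop exposes the current array element as @item(), which is what makes the sheet name available to activities inside the ForEach.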
Next, navigate into the ForEach activity and add a Copy activity with source configurations as follows.
Within the sink configurations, we'll need to set the table option property to 'Auto Create Table' since we currently do not have a table created.
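A minimal sketch of such a Copy activity is shown below as a Python dictionary mirroring the activity JSON. It assumes the SheetName dataset parameter is fed from @item() (the current element of the ForEach loop), and the activity and dataset names are hypothetical.

```python
import json

# Sketch of a Copy activity placed inside the ForEach loop.
# Activity and dataset names are hypothetical.
copy_activity = {
    "name": "CopySheetToSql",
    "type": "Copy",
    "inputs": [
        {
            "referenceName": "DS_Excel_Sheets",
            "type": "DatasetReference",
            # Assumption: the current loop element is passed as the sheet name.
            "parameters": {
                "SheetName": {"value": "@item()", "type": "Expression"}
            },
        }
    ],
    "outputs": [
        {"referenceName": "DS_AzureSql_Table", "type": "DatasetReference"}
    ],
    "typeProperties": {
        "source": {
            "type": "ExcelSource",
            "storeSettings": {"type": "AzureBlobFSReadSettings"},
        },
        # 'Auto Create Table' in the UI is stored as tableOption = autoCreate.
        "sink": {"type": "AzureSqlSink", "tableOption": "autoCreate"},
    },
}

print(json.dumps(copy_activity, indent=2))
```

With auto-create enabled, the table is created on the first copy if it does not exist, and the remaining iterations load into that same table.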
After executing the pipeline, we can see that the four sheets have been loaded into the Azure SQL table.
When we navigate to the Azure SQL table and query it, we can see that the data from all the Excel sheets was loaded into the single Azure SQL table.
Create a Pipeline to Load Multiple Excel Sheets in a Spreadsheet into Multiple Azure SQL Tables
In this next example, we will test loading multiple Excel sheets from a spreadsheet into multiple Azure SQL tables. To begin, we will need a new Excel lookup table that will contain the SheetName and TableName values which will be used by the dynamic ADF pipeline parameters.
The following script can be used to create this lookup table.
SET ANSI_NULLS ON
GO
SET QUOTED_IDENTIFIER ON
GO
CREATE TABLE [dbo].[ExcelTableLookUp](
    [SheetName] [nvarchar](max) NULL,
    [TableName] [nvarchar](max) NULL
) ON [PRIMARY] TEXTIMAGE_ON [PRIMARY]
GO
Once the table is created, we can insert the SheetNames and corresponding TableNames into the table:
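As an illustration only, the short Python script below generates an INSERT statement for the lookup table from a sheet-to-table mapping. The table names Table1 through Table4 are hypothetical; substitute whichever target names you want the pipeline to create.

```python
# Hypothetical mapping of Excel sheet names to target SQL table names.
sheet_to_table = {
    "Sheet1": "Table1",
    "Sheet2": "Table2",
    "Sheet3": "Table3",
    "Sheet4": "Table4",
}


def sql_literal(value: str) -> str:
    """Quote a string as a T-SQL Unicode literal, doubling single quotes."""
    return "N'" + value.replace("'", "''") + "'"


# One "(sheet, table)" value tuple per mapping entry.
rows = ",\n    ".join(
    f"({sql_literal(sheet)}, {sql_literal(table)})"
    for sheet, table in sheet_to_table.items()
)

insert_sql = (
    "INSERT INTO [dbo].[ExcelTableLookUp] ([SheetName], [TableName])\n"
    "VALUES\n    " + rows + ";"
)

print(insert_sql)
```

The printed statement can then be run against the Azure SQL Database in a query editor.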
Next, we will also need to create a new dataset with a connection to the Excel lookup table.
The connection properties of the Excel spreadsheet dataset will be similar to the previous pipeline, where we parameterized SheetName as follows.
In this scenario, we will also need to add a parameter for the TableName in the Azure SQL Database dataset connection as follows.
In the Azure SQL DB connection section, we'll leave the schema hardcoded and add the parameter for the TableName as follows.
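In JSON terms, a parameterized sink dataset along these lines might look like the sketch below, again written as a Python dictionary. The dataset and linked service names are hypothetical, and dbo is assumed as the hardcoded schema.

```python
import json

# Sketch of an Azure SQL dataset whose table name is supplied at run time.
# Dataset and linked service names are hypothetical; 'dbo' is an assumption.
sql_dataset = {
    "name": "DS_AzureSql_DynamicTable",
    "properties": {
        "type": "AzureSqlTable",
        "linkedServiceName": {
            "referenceName": "LS_AzureSqlDb",
            "type": "LinkedServiceReference",
        },
        "parameters": {"TableName": {"type": "string"}},
        "typeProperties": {
            "schema": "dbo",  # hardcoded schema
            "table": {"value": "@dataset().TableName", "type": "Expression"},
        },
    },
}

print(json.dumps(sql_dataset, indent=2))
```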
In this pipeline, we will also need a Lookup activity, which will retrieve the values from the SQL lookup table through a select * query on the table.
The values from the lookup can be passed to the ForEach loop activity's items property on the Settings tab, as follows:
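The wiring between the two activities can be sketched as follows, with Python dictionaries standing in for the activity JSON. The activity and dataset names are hypothetical. One detail worth noting is that 'First row only' must be turned off on the Lookup so that all rows are returned as an array.

```python
import json

lookup_name = "LookupSheetTableMap"  # hypothetical activity name

# Sketch of the Lookup activity that reads every row of the lookup table.
lookup_activity = {
    "name": lookup_name,
    "type": "Lookup",
    "typeProperties": {
        "source": {
            "type": "AzureSqlSource",
            "sqlReaderQuery": "SELECT * FROM [dbo].[ExcelTableLookUp]",
        },
        "dataset": {
            "referenceName": "DS_ExcelTableLookUp",  # hypothetical dataset
            "type": "DatasetReference",
        },
        # Return all rows, not just the first one.
        "firstRowOnly": False,
    },
}

# Sketch of the ForEach that iterates the rows returned by the Lookup.
foreach_activity = {
    "name": "ForEachSheetTablePair",
    "type": "ForEach",
    "dependsOn": [
        {"activity": lookup_name, "dependencyConditions": ["Succeeded"]}
    ],
    "typeProperties": {
        "items": {
            "value": f"@activity('{lookup_name}').output.value",
            "type": "Expression",
        },
        "activities": [],  # the Copy Data activity is added inside the loop
    },
}

print(json.dumps([lookup_activity, foreach_activity], indent=2))
```

Each element of output.value is one row of the lookup table, so inside the loop the columns are available as @item().SheetName and @item().TableName.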
Next, within the ForEach loop activity, we'll need a Copy Data activity with the source dataset properties containing the parameterized SheetName value, as follows.
Next, the sink dataset properties will also need to contain the parameterized TableName value, as follows. Note that the table option is again set to 'Auto Create Table'.
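Put together, the dataset references of this Copy Data activity might look like the sketch below. It assumes the lookup rows expose SheetName and TableName columns, as in the lookup table script above; the activity and dataset names are hypothetical.

```python
import json


def expr(value: str) -> dict:
    """Wrap a string as an ADF expression object."""
    return {"value": value, "type": "Expression"}


# Sketch of the Copy Data activity inside the lookup-driven ForEach loop.
# Activity and dataset names are hypothetical.
copy_activity = {
    "name": "CopySheetToOwnTable",
    "type": "Copy",
    "inputs": [
        {
            "referenceName": "DS_Excel_Sheets",
            "type": "DatasetReference",
            "parameters": {"SheetName": expr("@item().SheetName")},
        }
    ],
    "outputs": [
        {
            "referenceName": "DS_AzureSql_DynamicTable",
            "type": "DatasetReference",
            "parameters": {"TableName": expr("@item().TableName")},
        }
    ],
    "typeProperties": {
        "source": {"type": "ExcelSource"},
        # Each target table is created on first load.
        "sink": {"type": "AzureSqlSink", "tableOption": "autoCreate"},
    },
}

print(json.dumps(copy_activity, indent=2))
```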
After we run this pipeline, we can see that the pipeline succeeded and four tables were created in the Azure SQL Database.
Upon navigating to the Azure SQL Database, we can see that all four tables were created with the appropriate names based on the TableName values we defined in the SQL lookup table.
As a final check, when we query all four tables, we can see that they all contain the data from the Excel sheets. This confirms that the pipeline executed successfully and with the correct mappings of sheets to tables, as defined in the lookup table.
Next Steps
- For more details on the Excel connector, read the Microsoft article ADF Adds Connectors for Delta Lake and Excel.
- For a list of all the other Azure Data Factory connectors, read Azure Data Factory Connector overview.
- Explore my article, Using SQL Server Integration Services to Generate Excel Files Based on Criteria, which was built using SSIS, and explore how to recreate a similar process in Azure Data Factory as well as other capabilities and patterns for working with Excel files in ADF.
About the author
Ron L'Esteve is a seasoned Data Architect who holds an MBA and MSF. Ron has over fifteen years of consulting experience with Microsoft Business Intelligence, information technology, emerging cloud and big data technologies.
Source: https://www.mssqltips.com/sqlservertip/6909/import-data-from-excel-to-azure-sql-database-azure-data-factory/