Connecting Azure Machine Learning Studio with an on-premises SQL Database
This article is originally published at https://tomaztsql.wordpress.com
Azure Machine Learning Studio has been around for the past three years, and a lot of new features have been added in that time. I am positive many more will follow. One of the more thrilling features is the ability to connect to additional data sources.
Even more welcome is the ability to connect to your on-premises SQL Server database as a data source and write T-SQL queries against it.
The On-Premises SQL Database data source is available in the Import Data object, as shown in the screenshot below:
But first things first: we need to set up a secure connection between Azure ML Studio and the on-premises SQL Server database.
Once you are logged in to Azure ML Studio, select Settings:
and choose DATA GATEWAYS. You will need to define the gateway here, and this is also where you will receive the Registration Key. Select "create new gateway", give it a name, and proceed to the next step, where you will get an overview of the gateway information and the registration key itself.
For the second part, you will need to download the Data Management Gateway. Start the download and install the software. Some 200+ MB later, you can start the Microsoft Integration Runtime Configuration Manager, and the welcome screen will ask you to enter the Gateway Registration Key (generated in Azure ML Studio Settings).
Once the key is entered, you should be able to register the gateway by clicking Register.
After the registration key is authenticated and the node components are initialized, you will be able to launch the Configuration Manager.
In the Configuration Manager you can run diagnostics (checking the connections), tweak the proxy settings, and run updates and backups. Most importantly, the gateway is now connected to your Azure account and your local machine.
Check your credentials under the Diagnostics tab; once your credentials for the on-premises SQL Server database have been entered successfully, you should get a green "go":
The credentials should be the same ones you enter in SSMS whenever you connect to the database or open a new query window, literally as:
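If you like to script such checks, the same login can be verified outside SSMS with a short Python sketch using pyodbc. This is only an illustration: the server, database, and login names below are placeholders, not values from this article.

```python
# Sketch: verify on-premises SQL Server credentials outside SSMS.
# All names here are placeholders -- substitute your own.

def build_conn_str(server: str, database: str, user: str, password: str) -> str:
    """Assemble an ODBC connection string for SQL Server."""
    return (
        "DRIVER={ODBC Driver 17 for SQL Server};"
        f"SERVER={server};DATABASE={database};UID={user};PWD={password}"
    )

conn_str = build_conn_str("MYSERVER\\SQL2017", "AzureLanding", "azure_user", "secret")
print(conn_str)

# To actually test the login (requires the pyodbc package and network access):
# import pyodbc
# with pyodbc.connect(conn_str, timeout=5) as conn:
#     print(conn.execute("SELECT @@SERVERNAME").fetchone()[0])
```

If the connection succeeds here, the same user name and password should satisfy the gateway's diagnostics check.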
Now we will set up a landing database (for the sake of hygiene) on our on-premises SQL Server and create a table, so we can immediately check and test the data types.
Connect to your on-premises SQL Server and run:
USE [Master];
GO
CREATE DATABASE [AzureLanding];
GO
USE [AzureLanding];
GO
DROP TABLE IF EXISTS test_azure_read;
GO
CREATE TABLE test_azure_read (
     ID INT IDENTITY(1,1) NOT NULL
    ,num1 FLOAT
    ,num2 DECIMAL(20,10)
    ,num3 BIGINT
    ,tex1 NVARCHAR(MAX)
    ,tex2 CHAR(10)
    ,bin1 VARBINARY(MAX)
);
GO

-- Populating the table with a couple of records
INSERT INTO test_azure_read (num1, num2, num3, tex1, tex2, bin1)
SELECT 2.5, 33.5774, 75643221345, N'Some text. Nothing to see here'
      ,'CharText', CAST('sometext' AS VARBINARY(MAX))
UNION ALL
SELECT 67654.87654, 87654.876543, 987654335467
      ,N'Pearl Jam-đšžćčž@ääü', 'Pearl Jam', CAST('đšžćčž@ääü' AS VARBINARY(MAX))
UNION ALL
SELECT -0.87654, 87.6543, -967, N'Pearl Jam-đšžćčž@ääü'
      ,'Pearl Jam', 0x05543f5323
Note that I have deliberately created the test table "test_azure_read" to check the data types. I am testing BIGINT, FLOAT, DECIMAL, NVARCHAR, and VARBINARY, and on the SQL Server side the data looks fine:
Once the on-premises data has been prepared, let's go back to Azure ML Studio. Create a new experiment and add the "Import Data" object to the canvas:
Set up the Import Data reader by pointing it to the same server we used in the diagnostics test in the Configuration Manager:
The only thing left is to enter the credentials (by clicking "Enter values" under "User name and password"). Note that: 1) you might need to install an additional plug-in if you are not using IE or Edge; 2) check your settings if you are blocking pop-up windows — in this case, you should allow them; and 3) you might get a dialog window for downloading the CredentialManager.application from the proxy cloud hub. No worries, you can confirm it:
Feel free to start it; even if you get prompted to continue, you should not worry, because it is a trusted application. Once this is through, you will see the connection being set up, and you will be prompted to enter the credentials: user name and password. The dialog window will look like this:
After successfully entering the user name and password, you should get the green "go" on your Import Data object:
Now we will test the previously created table by entering the query:
SELECT * FROM test_azure_read
You will now be testing 1) the connection with Azure ML Studio and 2) the import of data from your on-premises SQL Server database into Azure ML Studio.
Save and run the experiment.
After the initial run, the experiment will fail (deliberately)! This is due to data type conversions: the DECIMAL data type will not be converted by the data import reader, so you will need to cast or convert this particular field:
SELECT
     ID
    ,num1
    ,CAST(num2 AS FLOAT) AS num2_float
    ,num3
    ,tex1
    ,tex2
    ,bin1
FROM test_azure_read
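The reason the cast is needed is that DECIMAL(20,10) carries more precision than a 64-bit float can represent, so the reader cannot pass the value through unchanged. A small Python illustration (not part of the original article) of what CAST(num2 AS FLOAT) does to one of the test values:

```python
from decimal import Decimal

# A DECIMAL(20,10) value from the test table above.
exact = Decimal("87654.8765430000")

# CAST(... AS FLOAT) maps to a 64-bit binary float, which cannot
# represent most decimal fractions exactly.
as_float = float(exact)

# The round trip is not exact: the float is a binary approximation.
print(Decimal(as_float) == exact)  # False
```

In other words, the cast is lossy in the last decimal places, which is usually acceptable for machine learning features but worth knowing about.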
Once this field is converted, you can ingest the data and start doing machine learning in Azure ML Studio.
Once the import is finished, clicking on the connection node
will give you summary and descriptive statistics for the dataset.
All data types were imported. Prior to the import we converted DECIMAL to FLOAT, and note that the VARBINARY representation also needs an additional cast or convert:
-- final tweak
SELECT
     ID
    ,num1
    --,num2
    ,CAST(num2 AS FLOAT) AS num2_float
    ,num3
    ,tex1
    ,tex2
    --,bin1
    ,CONCAT('0x', CONVERT(VARCHAR(MAX), bin1, 2)) AS bin1_VARCHAR
FROM test_azure_read
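For reference, the CONCAT('0x', CONVERT(VARCHAR(MAX), bin1, 2)) trick simply renders the binary column as a hexadecimal string (style 2 produces the hex digits without a prefix, so we prepend '0x' ourselves). The same rendering in Python, using a byte value from the test table above:

```python
# Render bytes the way the T-SQL query does: hex digits with an
# explicit '0x' prefix. 'sometext' is the value CAST to VARBINARY
# in the first row of the test table.
bin1 = b"sometext"

hex_repr = "0x" + bin1.hex().upper()
print(hex_repr)  # 0x736F6D6574657874
```

This string form is what survives the import into Azure ML Studio, since the reader has no native binary column type.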
Happy coding!