2016-01-04

Cross Database Queries in SQL Azure DB - Part 1

Hi
 
I wanted to write this post a long time ago but have not had the chance, so now with the new year - this will be the first post for 2016.
 
Cross Database Queries in SQL Azure DB was a key feature that was missing from day 1 on SQL Azure DB.
Finally we have this  key feature, and no, it is not the same as in the regular SQL Server. We have a new way to do this. It is bases on the new key feature  in SQL Server 2016 called "External Tables" (https://msdn.microsoft.com/en-us/library/dn935021.aspx).
Also known as "PolyBase".
 
The logic behind the feature is that you can query tables with-in your DB (like you are doing today) and also at any other DATA source like it is a table within the DB.
 
From one SQL Azure DB to another we can not connect to it via linked server. However a second SQL Azure DB is referenced like any other external data source.
Microsoft thinking is that SQL Azure DB is another DATA sources for External tables ( like Hadoop, Blob storage ect..).
 
 
I am adding some scripts to make it all easy for you to run and test (here I will not elaborate now about the query performance and execution plans...).
 
First are 2 scripts with reference to security issues (its all about trusted connections between 2 DATA sources):
 
IN DB1
 
CREATE MASTER KEY ENCRYPTION BY PASSWORD = 'youramazingpassword';
 
CREATE DATABASE SCOPED CREDENTIAL ElasticDBQueryCred
WITH IDENTITY = 'youramazingID',
SECRET = 'youramazingpassword';

The "username" and "password" should be the username and password used to login into the DB2 database.
 
IN DB2 (you create a regular table):
 
CREATE TABLE [dbo].[Customers223](
[CustomerId] [int] NOT NULL,
[Name] [nvarchar](256) NOT NULL,
[RegionId] [int] NOT NULL

);

 
IN DB1  (you create an external table):
 

you create an external DATA Source :
 
CREATE EXTERNAL DATA SOURCE RemoteReferenceData
WITH
(
TYPE=RDBMS,
LOCATION='YOURSERVERNAME.database.secure.windows.net',
DATABASE_NAME='DB2',
CREDENTIAL= ElasticDBQueryCred

 

);


 
 
 
you create an external table that reference to that table in DB2 - with the same name:
 
CREATE EXTERNAL TABLE [dbo].[Customers223](
[CustomerId] [int] NOT NULL,
[Name] [nvarchar](256) NOT NULL,
[RegionId] [int] NOT NULL

)

WITH
(
DATA_SOURCE = RemoteReferenceData
);
 
IN DB2
insert into [Customers223] select 2,'TestPini',3
 
IN DB1
 
select * from [dbo].[Customers223]
 
And....................Yeeeeeeeessssssssssssssssssssssssssss
you will see data from DB2
4 long years I am waiting for this !!!!!!
 
A few things to remember:
  1. If you will do not have an object with the same name in DB2 - you will get error when query it saying "The underlying error message received was: 'Invalid object name 'dbo.Customers224'.'." 
  2. If you add column in DB2 in the table. and you do not add it in the EXTERNAL Table... then you will not see the data. BUT an error " The feature 'ALTER TABLE' is not supported with external tables." You need to drop and create the external table.
  3. the first query - when creating the connection is extremely slow - make your coffee - relax and then all will be good.

  4. the external table can be in a different name then the source, but then - when you create it you must add this in the "WITH" clause:
    WITH
    (
      DATA_SOURCE = RemoteReferenceData,
    SCHEMA_NAME = N'dbo',
    OBJECT_NAME = N'Customers226',
     
    ); 
in DB1 you can check these DMV's:
 
select * from sys.external_data_sources
select * from sys.external_tables
 
have a lovely year.