THE SQL Server Blog Spot on the Web

Welcome to - The SQL Server blog spot on the web Sign in | |
in Search

John Paul Cook

  • SSMS Presenter Mode

    Recent builds of SQL Server Management Studio (SSMS) have a feature called PresentOn that can be accessed from Quick Launch.


    Figure 1. Quick Launch location in SSMS.

    Enter Present in to Quick Launch and select it.


    Figure 2. Select PresentOn.

    Using PresentOn is helpful when you give a presentation. It doesn’t increase the size of the results, which I find is a limitation when actually giving presentations. Overall it is a good feature that you might want to use.


    Figure 3. Object Explorer and the Query Pane are larger, but not the Results Pane.

    To turn off PresentOn, select RestoreDefaultFonts.


    Figure 4. Use RestoreDefaultFonts to turn off presenter mode.

    If you don’t need to increase the font size of the Object Explorer, which is part of your environment, you can use the zoom feature in the Query Pane. This feature is available all of the time. You don’t need to switch to presenter mode. If the results are in text format instead of grid format, the results will also be magnified.


    Figure 5. Using the magnification feature.

    If you want more control over the formatting of SSMS, see my instructions found here.

  • Modeling an Adjacency List Hierarchy with SQL Server 2017 Graph Database

    The Northwind database implements the manager to employee hierarchy using an adjacency list, which is a type of graph. This post shows you how to migrate the adjacency list to a graph of one node and one edge. This is a continuation of what was started in this prior post.

    In the Employees table, the employee’s manager has an EmployeeID which is stored in the employee’s ReportsTo column. A foreign key relationship is defined where the ReportsTo value stores the primary key of the row containing the manager’s data. For the highest level manager, the ReportsTo column is NULL. The relationship can be queried using a self-join.


    Figure 1. Employee to Manager hierarchy implemented in the Northwind database.

    The hierarchy can be displayed using a self-join.

    SELECT e.FirstName, e.LastName, m.FirstName AS ManagerFirstName, m.LastName AS ManagerLastName
    FROM dbo.Employees AS e
    INNER JOIN dbo.Employees AS m
    ON e.ReportsTo = m.EmployeeID;

    FirstName LastName  ManagerFirstname ManagerLastName
    --------- --------- ---------------- ---------------
    Nancy     Davolio   Andrew           Fuller
    Janet     Leverling Andrew           Fuller
    Margaret  Peacock   Andrew           Fuller
    Steven    Buchanan  Andrew           Fuller
    Michael   Suyama    Steven           Buchanan
    Robert    King      Steven           Buchanan
    Laura     Callahan  Andrew           Fuller
    Anne      Dodsworth Steven           Buchanan

    To implement the hierarchy using a node and an edge, run the following code:

    CREATE TABLE dbo.reportsTo AS EDGE;

    INSERT INTO reportsTo
    SELECT e.$node_id, m.$node_id
    FROM dbo.EmployeesNode e
    INNER JOIN dbo.EmployeesNode m
    ON e.ReportsTo = m.EmployeeID;

    To retrieve the employees and their managers, run the following query:

    SELECT e.FirstName, e.LastName, m.FirstName AS ManagerFirstName, m.LastName AS ManagerLastName
    FROM EmployeesNode AS e, reportsTo, EmployeesNode AS m
    WHERE MATCH (e-(reportsTo)->m);

    FirstName LastName  ManagerFirstname ManagerLastName
    --------- --------- ---------------- ---------------
    Nancy     Davolio   Andrew           Fuller
    Janet     Leverling Andrew           Fuller
    Margaret  Peacock   Andrew           Fuller
    Steven    Buchanan  Andrew           Fuller
    Michael   Suyama    Steven           Buchanan
    Robert    King      Steven           Buchanan
    Laura     Callahan  Andrew           Fuller
    Anne      Dodsworth Steven           Buchanan

    Astute observers will notice that something is missing from these queries. Before giving the answer away, look at the following query:

    ;WITH CTE (EmployeeID, EmployeeFirstName, EmployeeLastName, ManagerFirstName, ManagerLastName)
    -- anchor query
        SELECT e.EmployeeID, e.FirstName, e.LastName, CAST(NULL        AS NVARCHAR(20)), CAST(NULL       AS NVARCHAR(10))
        FROM dbo.Employees AS e
        WHERE e.ReportsTo IS NULL
        UNION ALL
    -- recursive query
        SELECT e.EmployeeID, e.FirstName, e.LastName, CAST(m.FirstName AS NVARCHAR(20)), CAST(m.LastName AS NVARCHAR(10))
        FROM dbo.Employees AS e
        INNER JOIN dbo.Employees AS m
        ON e.ReportsTo = m.EmployeeID
        ON e.ReportsTo = CTE.EmployeeID
    SELECT EmployeeFirstName, EmployeeLastName, ManagerFirstName, ManagerLastName

    EmployeeFirstName EmployeeLastName ManagerFirstName ManagerLastName
    ----------------- ---------------- ---------------- ---------------
    Andrew            Fuller           NULL             NULL
    Nancy             Davolio          Andrew           Fuller
    Janet             Leverling        Andrew           Fuller
    Margaret          Peacock          Andrew           Fuller
    Steven            Buchanan         Andrew           Fuller
    Laura             Callahan         Andrew           Fuller
    Michael           Suyama           Steven           Buchanan
    Robert            King             Steven           Buchanan
    Anne              Dodsworth        Steven           Buchanan

    As you can see, special handling is required for handling the top of the hierarchy. A foreign key column in a relational table can allow NULL.

          NOTE: The CASTs were necessary to prevent the following error which has nothing to do with the subject of this post:

        Msg 240, Level 16, State 1, Line 1
        Types don't match between the anchor and the recursive part in column "ManagerFirstName" of recursive query "CTE".

    What about an edge? Can it handle a NULL?

    INSERT INTO reportsTo
    SELECT e.$node_id, NULL
    FROM dbo.EmployeesNode e
    WHERE e.ReportsTo IS NULL;

    Msg 515, Level 16, State 2, Line 70
    Cannot insert the value NULL into column 'to_obj_id_762C07225C8943A99CF4494DFF96B4EE', table 'Northwind.dbo.reportsTo'; column does not allow nulls. INSERT fails.

    The statement has been terminated.

    I propose an approach similar to the common table expression query shown above.

    SELECT FirstName, LastName, NULL AS ManagerFirstName, NULL AS ManagerLastName
    FROM EmployeesNode
    WHERE ReportsTo IS NULL
    SELECT e.FirstName, e.LastName, m.FirstName AS ManagerFirstName, m.LastName AS ManagerLastName
    FROM EmployeesNode AS e, reportsTo, EmployeesNode AS m
    WHERE MATCH (e-(reportsTo)->m);

    FirstName LastName  ManagerFirstName ManagerLastName
    --------- --------- ---------------- ---------------
    Andrew    Fuller    NULL             NULL
    Nancy     Davolio   Andrew           Fuller
    Janet     Leverling Andrew           Fuller
    Margaret  Peacock   Andrew           Fuller
    Steven    Buchanan  Andrew           Fuller
    Michael   Suyama    Steven           Buchanan
    Robert    King      Steven           Buchanan
    Laura     Callahan  Andrew           Fuller
    Anne      Dodsworth Steven           Buchanan

    I have another solution, but it is for another day.

  • Modeling Many-to-Many relationships in SQL Server 2017 Graph Database

    Relational databases do not natively support many to many relationships although a junction table is an easy workaround. Graph databases natively support many-to-many relationships. One of the first sample databases Microsoft provided for SQL Server was Northwind. It has three many-to-many relationships as can be seen in the entity relationship diagram shown below. I previously published a list of references for understanding graphs here.


    Figure 1. Northwind database ER diagram.

    I’m going to focus on one junction table, the EmployeeTerritories table.


    Figure 2. Many-to-many relationship implemented with the EmployeeTerritory junction table. The colors carry through to the code samples that follow.

    I always tell people to start with obtaining an understanding the data. Let’s run some queries to find out more about the nature of the data.

    SELECT * FROM dbo.Employees WHERE EmployeeID NOT IN (SELECT EmployeeID FROM dbo.EmployeeTerritories);
    SELECT * FROM dbo.Territories WHERE TerritoryID NOT IN (SELECT TerritoryID FROM dbo.EmployeeTerritories);

    Not all territories have employees. All employees have territories. As I pointed out in this post, directionality of an edge matters. It’s something you have to pay attention to. The following query shows how to display all of the rows in the many-to-many relationship.

    SELECT E.FirstName, E.LastName, T.TerritoryDescription
    FROM dbo.Employees AS E
    INNER JOIN dbo.EmployeeTerritories AS ET
    ON E.EmployeeID = ET.EmployeeID
    INNER JOIN dbo.Territories AS T
    ON T.TerritoryID = ET.TerritoryID;

    FirstName LastName  TerritoryDescription
    --------- --------- --------------------
    Nancy     Davolio   Wilton             
    Nancy     Davolio   Neward             
    Andrew    Fuller    Westboro           
    Andrew    Fuller    Bedford            
    Andrew    Fuller    Georgetow          
    Andrew    Fuller    Boston             
    Andrew    Fuller    Cambridge          
    Andrew    Fuller    Braintree          
    Andrew    Fuller    Louisville         
    Janet     Leverling Atlanta            
    Janet     Leverling Savannah           
    Janet     Leverling Orlando            
    Janet     Leverling Tampa              
    Margaret  Peacock   Rockville          
    Margaret  Peacock   Greensboro         
    Margaret  Peacock   Cary               
    Steven    Buchanan  Providence         
    Steven    Buchanan  Morristown         
    Steven    Buchanan  Edison             
    Steven    Buchanan  New York           
    Steven    Buchanan  New York           
    Steven    Buchanan  Mellvile           
    Steven    Buchanan  Fairport           
    Michael   Suyama    Phoenix            
    Michael   Suyama    Scottsdale         
    Michael   Suyama    Bellevue           
    Michael   Suyama    Redmond            
    Michael   Suyama    Seattle            
    Robert    King      Hoffman Estates    
    Robert    King      Chicago            
    Robert    King      Denver             
    Robert    King      Colorado Springs   
    Robert    King      Santa Monica       
    Robert    King      Menlo Park         
    Robert    King      San Francisco      
    Robert    King      Campbell           
    Robert    King      Santa Clara        
    Robert    King      Santa Cruz         
    Laura     Callahan  Philadelphia       
    Laura     Callahan  Beachwood          
    Laura     Callahan  Findlay            
    Laura     Callahan  Racine             
    Anne      Dodsworth Hollis             
    Anne      Dodsworth Portsmouth         
    Anne      Dodsworth Southfield         
    Anne      Dodsworth Troy               
    Anne      Dodsworth Bloomfield Hills   
    Anne      Dodsworth Roseville          
    Anne      Dodsworth Minneapolis

    Create and populate a node table of employee data.

    CREATE TABLE dbo.EmployeesNode(
        EmployeeID int IDENTITY(1,1) NOT NULL,
        LastName nvarchar(20) NOT NULL,
        FirstName nvarchar(10) NOT NULL,
        Title nvarchar(30) NULL,
        TitleOfCourtesy nvarchar(25) NULL,
        BirthDate datetime NULL,
        HireDate datetime NULL,
        Address nvarchar(60) NULL,
        City nvarchar(15) NULL,
        Region nvarchar(15) NULL,
        PostalCode nvarchar(10) NULL,
        Country nvarchar(15) NULL,
        HomePhone nvarchar(24) NULL,
        Extension nvarchar(4) NULL,
        Photo VARBINARY(MAX) NULL,  -- changed from deprecated IMAGE data type
        Notes NVARCHAR(MAX) NULL,   -- changed from deprecated TEXT data type
        ReportsTo int NULL,
        PhotoPath nvarchar(255) NULL
    ) AS NODE;

    SET IDENTITY_INSERT dbo.EmployeesNode ON

    INSERT INTO dbo.EmployeesNode
    (EmployeeID, LastName, FirstName, Title, TitleOfCourtesy, BirthDate, HireDate, Address, City, Region, PostalCode, Country, HomePhone, Extension, Photo, Notes, ReportsTo, PhotoPath)
    SELECT EmployeeID, LastName, FirstName, Title, TitleOfCourtesy, BirthDate, HireDate, Address, City, Region, PostalCode, Country, HomePhone, Extension, Photo, Notes, ReportsTo, PhotoPath
    FROM dbo.Employees;

    SET IDENTITY_INSERT dbo.EmployeesNode OFF

    Create and populate a node table of territories data.

    CREATE TABLE dbo.TerritoriesNode(
        TerritoryID nvarchar(20) NOT NULL,
        TerritoryDescription nchar(50) NOT NULL,
        RegionID int NOT NULL
    ) AS NODE;

    INSERT INTO dbo.TerritoriesNode
    (TerritoryID, TerritoryDescription, RegionID)
    SELECT TerritoryID, TerritoryDescription, RegionID
    FROM dbo.Territories;

    It’s necessary to create a query to retrieve node id pairs to populate the edge table. It’s a modification to the junction table query shown above. I suggest running just the SELECT portion of the following query so you can see what the edge table is being populated with.

    INSERT INTO belongsTo
    SELECT E.$node_id, T.$node_id
    FROM dbo.EmployeesNode E
    INNER JOIN dbo.EmployeeTerritories ET
    ON E.EmployeeID = ET.EmployeeID
    INNER JOIN dbo.TerritoriesNode T
    ON T.TerritoryID = ET.TerritoryID;

    Run a graph query to see the many-to-many relationship represented as 49 rows.

    SELECT EmployeesNode.FirstName, EmployeesNode.LastName, TerritoriesNode.TerritoryDescription
    FROM EmployeesNode, belongsTo, TerritoriesNode
    WHERE MATCH (EmployeesNode-(belongsTo)->TerritoriesNode);

    FirstName LastName TerritoryDescription
    --------- -------- --------------------
    Andrew    Fuller   Westboro           
    Andrew    Fuller   Bedford            
    Andrew    Fuller   Georgetow          
    Andrew    Fuller   Boston             
    Andrew    Fuller   Cambridge          
    Andrew    Fuller   Braintree          
    Steven    Buchanan Providence         
    Anne      Dodswort Hollis             
    Anne      Dodswort Portsmouth         
    Nancy     Davolio  Wilton             
    Steven    Buchanan Morristown         
    Steven    Buchanan Edison             
    Steven    Buchanan New York           
    Steven    Buchanan New York           
    Steven    Buchanan Mellvile           
    Steven    Buchanan Fairport           
    Laura     Callahan Philadelphia       
    Nancy     Davolio  Neward             
    Margaret  Peacock  Rockville          
    Margaret  Peacock  Greensboro         
    Margaret  Peacock  Cary               
    Janet     Leverlin Atlanta            
    Janet     Leverlin Savannah           
    Janet     Leverlin Orlando            
    Janet     Leverlin Tampa              
    Andrew    Fuller   Louisville         
    Laura     Callahan Beachwood          
    Laura     Callahan Findlay            
    Anne      Dodswort Southfield         
    Anne      Dodswort Troy               
    Anne      Dodswort Bloomfield Hills   
    Laura     Callahan Racine             
    Anne      Dodswort Roseville          
    Anne      Dodswort Minneapolis        
    Robert    King     Hoffman Estates    
    Robert    King     Chicago            
    Robert    King     Denver             
    Robert    King     Colorado Springs   
    Michael   Suyama   Phoenix            
    Michael   Suyama   Scottsdale         
    Robert    King     Santa Monica       
    Robert    King     Menlo Park         
    Robert    King     San Francisco      
    Robert    King     Campbell           
    Robert    King     Santa Clara        
    Robert    King     Santa Cruz         
    Michael   Suyama   Bellevue           
    Michael   Suyama   Redmond            
    Michael   Suyama   Seattle

    The results are the same, although the order is not. There is no implicit order in a relational database. I didn’t use a SORT to make sure you understand that you should not expect the same order of results from different queries.

    As you can see, the graph query has very simple code.

    I’m pretty sure somebody is going to ask what if the order of the nodes in the MATCH is changed?

    SELECT EmployeesNode.FirstName, EmployeesNode.LastName, TerritoriesNode.TerritoryDescription
    FROM EmployeesNode, belongsTo, TerritoriesNode
    WHERE MATCH (TerritoriesNode-(belongsTo)->EmployeesNode);

    No rows are returned because of the directionality of the edge. We defined our graph has having employees belonging to territories and not territories belonging to employees. It’s the direction of the arrow that matters, not the order of the nodes in the MATCH. The following query returns all 49 rows.

    SELECT EmployeesNode.FirstName, EmployeesNode.LastName, TerritoriesNode.TerritoryDescription
    FROM EmployeesNode, belongsTo, TerritoriesNode
    WHERE MATCH (TerritoriesNode<-(belongsTo)-EmployeesNode);

    Bidirectional matching is not supported. Original graph query with another arrow added.

    SELECT EmployeesNode.FirstName, EmployeesNode.LastName, TerritoriesNode.TerritoryDescription
    FROM EmployeesNode, belongsTo, TerritoriesNode
    WHERE MATCH (EmployeesNode<-(belongsTo)->TerritoriesNode);

    Msg 102, Level 15, State 1, Line 60
    Incorrect syntax near '>'.

    Graph nodes and edges appear under Graph Tables apart from regular tables.


    Figure 3. Node and Edge tables appear under Graph Tables.

    My next post in this series shows how to migrate Northwind's adjacency list in the Employees table to SQL Server 2017 nodes and edges. 

  • Using Hyper-V for SQL Server CTP and RC Versions

    Yesterday at SQL Saturday I was asked how to connect to SQL Server 2017 CTP 2.1 that was running in a virtual machine by using Remote Desktop Connection. If you want to learn how to use new software before it is released, running it in a virtual machine is a really good idea. It is wise to keep prerelease software off of your primary machine. Sometimes people call this keeping your machine clean. Microsoft no longer refers to prerelease versions of SQL Server as beta versions. Instead, they call the beta versions Community Technology Preview (CTP) versions. Prior to the final production release, they may be some Release Candidate (RC) versions. I do not install beta, CTP, or RC versions of software on my primary machine. Instead, I install such versions in a virtual machine. If you override the default settings for interacting with your virtual machine, you can greatly enhance your user experience. Among other things, you will likely experience a need to easily get files into and out of your virtual machine. CAUTION: The steps shown within this post remove the protections separating the virtual machine from the host machine. If you want to investigate the effects of potential malware within the safety of a virtual machine, you should not follow any of the steps within this post.

    Windows 10 and Windows Server 2016 come with Hyper-V. After initially installing an operating system in a Hyper-V virtual machine, you will connect to it using the Virtual Machine Connection application. Instead of clicking the Connect button, you might want to familiarize yourself with the available options. Although I have primarily worked as a database consultant for the past 20 years, I’ve done quite a bit technical training and presentations as well. One of the things I stress to those I’m teaching is that you should not click on buttons without understanding. You need to know what defaults you are implicitly accepting. You should know if there are advantages to changing the default environment. Click the arrow by Show Options to see what is available.


    Figure 1. Virtual Machine Connection. Click the arrow by Show Options.

    You might want to save your settings by checking the obviously named checkbox so that your customized settings will persist from session to session.


    Figure 2. Click the Local Resources tab.

    Notice that here is a tab that says Local Resources.


    Figure 3. Click the More… button.

    Click the More… button to see what additional options are available. Select the drives on your host machine that you want to make available to your virtual machine.


    Figure 4. Select host operating system drives to make available to virtual machine.

    Once you’ve made all of your selections, connect to your virtual machine. Notice that the host’s machines files are available to the virtual machine.


    Figure 5. Host operating system drives made available to virtual machine.

    Many people prefer the user experience of Remote Desktop Connection over that of the Virtual Machine Connection. I do. By default, Windows operating systems have Remote Desktop Connection turned off. You have to connect to your virtual machine to change that. You’ll do that with Virtual Machine Connection. Navigate to the System applet. It’s probably easiest to type system into the search bar and pick it from the list.


    Figure 6. Navigate to the System applet.


    Figure 7. Select Remote settings.

    By default, Windows operating systems are set to deny Remote Desktop Connection.


    Figure 8. This is why you can’t connect to your virtual machine using Remote Desktop Connection.

    Override the default to enable Remote Desktop Connection.


    Figure 9. Enable Remote Desktop Connection.

    Remote Desktop Connection presents you with the same options as Virtual Machine Connection, plus additional ones. You’ll probably want to override the defaults as show previously to enable access to drives on the host operating system.


    Figure 10. Enabling access to host operating system drives from Remote Desktop Connection.

    Open Windows Explorer in the Remote Desktop Connection so that you can see both the local drives and drives on the host machine. Click the Network node in Windows Explorer. Notice the message about network discovery being turned off. This is another default setting of Windows. Click the message


    Figure 11. Network devices are not visible by default inside the virtual machine.


    Figure 12. Virtual machine now has access to network resources.

  • SQL Server 2017 Graph Database References

    This is a list of references I used in creating my SQL Saturday presentation on SQL Server 2017 graph databases. To get started with the graph database features of SQL Server 2017, I recommend that you start with the following official Microsoft documentation: it provides code for creating and query a simple graph database

    or download all three web pages as this single pdf

    Next I recommend reading this to compare and contrast SQL Server 2017 graph to built for purpose graph databases as well as learn about limitations of graph in SQL Server 2017: Notice how it mentions directionality of edges.

    Once you’re ready for a larger graph database, look at You might want to learn more about graph databases than what you will find in the Microsoft documentation. It’s good to know about built for purpose graph databases.

    Here’s a good article to explain SQL Server 2017 graph in more detail and also discuss the broader topic of graph databases outside of just Microsoft. Once again notice that directionality is mentioned.

    If you’d like to see another simple graph database in SQL Server 2017, you might want to look at

    No discussion of graph databases is complete without mentioning Neo4j. You can download and install the community edition for free. Neo4j is built for purpose. The Cypher query language is used to query Neo4j. Once you familiarize yourself with Cypher you’ll see that Microsoft’s graph extensions to SQL Server are similar to Cypher syntax. Two people who work for Neo4j authored a free O’Reilly ebook on graph databases that you can obtain from The title of the book is The Definitive Book on Graph Databases and Introduction to Neo4j, so it tells you that it is biased toward Neo4j. Once again, you should probably pay attention to edge directionality. If you want to read more but want something short instead of a book, try this Wikipedia page on graph databases.

    If you’ve checked out any of these links, I’m confident you’ve seen at least one force-directed graph visualization. You’ll want a software tool to create these. There is a Force-Directed Graph visualization for Power BI. I wasn’t able to get it to work with SQL Server 2017 CTP 2.1. You can download this sample Power BI report and see what the Force-Directed Graph visualization looks like.


    Figure 1. Force-Directed Graph report in Power BI.


    Figure 2. Mouseover on the node named Logan.

    When you are reading about graph databases, it’s important that you don’t let the terminology confuse you. Sometimes authors use the word connections instead of relationships when referring to graph databases. Just kind in mind that a relationship in a graph database context is different from relationship in a relational database context. Graph nodes represent entities.

    Graph databases are good for applications in fraud detection, management of hierarchies such as bill of materials (BOM – don’t pronounce it at an airport) and parts explosions (opposite of a bill of materials, also not good to discuss in the airport lounge), social networks, and purchase recommendations among many other things. I liked reading the academic paper Incremental Anomaly Detection in Graphs. Graph based anomaly detection and description: a survey is another good paper but it is not free to the general public.

    If you don’t understand SQL Server columnar indexes, you really need to understand them to have a foundation for tuning your graphs in SQL Server 2017. You can find an introduction to columnar indexes here. Once you understand columnar indexes, you’ll see why they can be useful in improving graph performance.

    In the presentation at SQL Saturday today, a few people were unfamiliar with the term ASCII art. (If you like ASCII art, the entire first Star Wars movie can be found as ASCII art here.) Somewhere in the linked documentation above, the operators for the graph queries were described as ASCII art. In other words, the characters – and > are combined together to make a graphic construct. In the code below, that graphic construct is an arrow to indicate directionality of an edge as shown in the following query:

    FROM Person, likes, Restaurant
    WHERE MATCH (Person-(likes)
    AND = 'John';

    One person asked about graph queries that don’t return any data. Here’s some code to use with the sample data from the third link on this page. Add a new person to the Person node. Now we have a person who doesn’t like any restaurants because no companion inserts were done for the likes edge.

    INSERT INTO Person VALUES (6,'Dakota');

    FROM Person, likes, Restaurant
    WHERE MATCH (Person-(likes)->Restaurant)
    AND = 'Dakota';

    The result is no rows returned. But couldn’t restaurants for Dakota’s friends be returned? No. Dakota doesn’t have any friends. No inserts into the friends edge were made for Dakota. Run the query and you’ll see that no rows are returned.

    FROM Person person1, Person person2, likes, friendOf, Restaurant
    WHERE MATCH(person1-(friendOf)->person2-(likes)->Restaurant)

    As I pointed out in the presentation, notice that the Person node in the previous query is aliased to person1 and person2 in a manner similar to aliasing relational tables for a self-join. I also said that that I was working on modifying the Northwind database to use SQL Server 2017 graph tables. You can see the first blog post in this series here.

    Here’s a good paper explaining how hierarchies are graphs.

  • SQL Server 2016 Import and Export Wizard and Excel

    SQL Server 2016 is a 64-bit application and it comes with two versions of the Import and Export Wizard. If you launch the Import/Export Wizard from the SSMS, the 32-bit version is invoked. The 64-bit version does not natively provide support for Excel files. There’s a lot of bad, outdated, or complex advice on how to fix the problem. I was able to easily to fix the problem without installing old software on my machine. The version of Office you have affects what providers are available to the Import and Export Wizard. My machine has the 64-bit version of Office installed. In other words, it is lacking in 32-bit drivers.


    Figure 1. 32-bit Import and Export Wizard can export to Excel if you have the necessary providers.


    Figure 2. 32-bit Excel export options.

    It did not work as you can see because the necessary 32-bit providers were not present as further documented below.


    Figure 3. Error message trying to export to Excel 2016.


    Figure 4. Error message trying to export to Excel 2013.


    Figure 5. Error message trying to export to Excel 2007-2010.

    The error messages are listed below so that search engines can find them. Should I add SEO to my resume?

    The 'Microsoft.ACE.OLEDB.16.0' provider is not registered on the local machine. (System.Data)

    The 'Microsoft.ACE.OLEDB.15.0' provider is not registered on the local machine. (System.Data)

    The 'Microsoft.ACE.OLEDB.12.0' provider is not registered on the local machine. (System.Data)

    Selecting Microsoft Excel 97-2003 does work, by the way. But remember, my goal is to make the 64-bit provider work. The 64-bit Wizard can process larger files than the 32-bit Wizard, which is why I’m interested in getting the 64-bit version to work. I’m not interested in the 32-bit version. It still doesn’t work.

    If you do a default installation of SQL Server 2016, you will find the 32-bit version of the Import and Export Wizard here:

    C:\Program Files (x86)\Microsoft SQL Server\130\DTS\Binn\DTSWizard.exe

    The 64-bit version is found here:

    C:\Program Files\Microsoft SQL Server\130\DTS\Binn\DTSWizard.exe

    You might want to make a shortcut to the 64-bit version so that you can easily Access it – pun intended as you will soon see.

    The 64-bit version of the Import and Export Wizard does not offer export to any version of Excel.


    Figure 6. 64-bit version of the Import and Export Wizard does not by default have Excel as an option.

    To add Excel to the 64-bit Import and Export Wizard, install the 64-bit version of the Microsoft Access Database Engine 2016 Redistributable. Note: there is also a 32-bit redistributable. Installing it might resolve the problem with the 32-bit version of the Import and Export Wizard. I didn’t try this. Remember, this blog post is about getting the 64-bit version of the Import and Export Wizard to work with Excel files, either as sources or destinations.


    Figure 7. 64-bit version of the Import and Export Wizard after installing the 2016 Access Database Engine redistributable shows Microsoft Excel as a supported provider.


    Figure 8. Excel options made available by the 2016 Access Database Engine redistributable.

    That was the Customers table from the Northwind sample database, just in case you were wondering. I’m creating a Northwind graph database you’ll see in a future post.

  • BI Beginner: Using R Visualizations

    Last week I showed a simple line plot of a hypothetical college student’s GPA. The plot could have been done using R. Before showing you a visualization that requires the power of R, I’m starting with the simple line plot recreated in R. For additional information about using R with Power BI desktop, David Iseminger has done a great job writing Power BI documentation that can be found at Microsoft’s Power BI site. He also wrote about configuring Power BI desktop to use an external R IDE.

    Begin by placing an R visualization on the canvas. Resize as needed.


    Figure 1. Place an R visualization on the canvas.

    Power BI desktop creates a data frame named dataset. You’re going to have to accept that and work with it. Power BI desktop also eliminates duplicate rows. The input Excel file I’m using doesn’t have any duplicate rows so it is unchanged by the call to R’s unique function. If you have duplicate rows and want to keep them, you’ll have to make your rows unique by adding a column to make them unique – this is similar to what you might do in SQL Server when you add an IDENTITY property to a column. David Iseminger covered this in the first link I provided.


    Figure 2. Drag the X-axis variable into place. Power BI creates a data frame named dataset and eliminates duplicate rows.


    Figure 3. Drag the Y-axis variable into place.

    Notice that unlike the Line chart visualization, the R visualization does not automatically create the plot. You have to click the Run button. Your R code must create visual output or you’ll get an error message.


    Figure 4. Click the Run button.


    Figure 5. Completed plot using the R visualization. The data appears as individual points by default.

    You can customize your R visualization. You need to read the R documentation for plot and par to find out how to do this. I used the following R code to modify my R visualization:

    plot(dataset,main="GPA Over Time",type="l",col="magenta")


    Figure 6. Using plot options and par to modify your R visualization.

    The next post in this series covers more advanced R visualization.

  • [OT] Great American Eclipse of 2017

    I’m amazed at how many people aren’t yet aware of the eclipse visible from the middle United States on August 21, 2017. Hotel reservations along the route are scarce and expensive. There’s a great website where you can find out all about it: If you think that if you aren’t in the central United States that it’s not something to plan for, you’re mistaken. You need to be outside even if you’re not in the path of totality.

    Universities all along the path of totality are having viewing events. Nashville is a prime viewing spot. Vanderbilt has eclipse viewing activities planned on that day, the first day of orientation for graduate students. If you’re near Portland, Oregon, you’re a reasonable drive from prime viewing spots. You might want to drive east of the Cascade mountains to safeguard against potential cloudy skies. Columbia, South Carolina is another great spot. The Great Smoky Mountains National Park is another spot near the east coast to consider.

    If you’re in San Francisco, Dallas, or New Orleans, 80% of the sun will be covered. That’s impressive! Brownsville, Texas and Bangor, Maine will have about 60% of the sun covered. Daytona Beach, Florida will have 90% of the sun covered. There’s really something for everybody in the 48 contiguous states.

    If you miss this, the April 8, 2024 is a great eclipse for Mexico, Texas, Arkansas, Indiana, Ohio, and Maine. Alaskans have to wait until March 30, 2033 or May 11, 2097. Montana is the place to be on August 23, 2044. That’s also a great day for our Canadian neighbors in Alberta.

    An eclipse is also known as a syzygy. A syzygy does not have to be an eclipse.

    I wonder if fireflies will come out during totality.

  • Intel Active Management Technology Vulnerability

    Certain Intel processors have a security vulnerability that you can read about at the Intel site or at the NIST Computer Security Resource Center. Make certain that you install the latest version of the Intel vulnerability analyzer which as of today was found here. The tool tells you if your computer is vulnerable to allowing a nonprivileged user can obtain privileged access to your machine. I ran the tool and received this message:


    Figure 1. Computer not vulnerable.

    If you install an old version of the tool and then install a newer version, expect to have to reboot.

  • BI Beginner: Avoiding Mistakes With Averages

    I have seen mistakes in business reports, academic papers, and training materials when it comes to displaying and calculating averages. At first I created a sample dataset using sales data for widgets, but after seeing so many graduation posts this month, I decided to use college grades in this example. In our sample dataset, college kid (hereafter known as CK) took 5 years to complete a 4 year university degree, but mom and dad are just happy CK graduated. Dad told CK that graduating with a 3.0 grade point average (GPA) would result in a new car as a graduation gift. The input Excel file has a single worksheet named grades that is referenced when Power BI measures are created. I’m assuming that you already know how to load data into Power BI and make a simple visualization. If not, I’ve covered those steps in this blog post.


    Figure 1. Raw Excel data. The name of the Excel worksheet is grades, which is necessary for you to understand to follow the examples.

    Let’s look at the plot in Power BI. CK shows a very definite upward, improving trend in grades and deserves at least a pat on the back.


    Figure 2. Improving grades.

    Does CK deserve more than a pat on the back? Has CK earned the car? Dad really intended for CK to have a 3.0 GPA upon graduation after 4 years, but mom pointed out that dad never said anything about a 4 year time limit. Mom said just be glad it was 5 years and not 6. CK decided to use a Card visualization to show the overall GPA.


    Figure 3. Card visualization in Power BI.

    CK did some quick drag and drop moves and realized something was wrong because the maximum possible GPA is 4.0.


    Figure 4. GPA value is impossibly high.

    CK did a mouseover on the GPA field under Visualizations to see what the problem was.


    Figure 5. The default operation is to sum the data.

    CK knew something was wrong, but was confident that it was an easy fix. After all, CK took statistics in the first year and made a C in that and every other course that year. CK used the dropdown list and selected Average.


    Figure 6. Changing the default behavior of the Card visualization to obtain an average of averages.

    CK triumphantly showed dad the results and mentioned that the Tesla model 3 will be out later in the year and it would be a nice car to have and worth waiting for. CK was willing to drive the old Taurus in the meantime. Dad pointed out that it is not valid to average averages. A weighted average is called for so that proportional contributions are properly weighted. That last year of taking 10 semester hours of Klingon to finally satisfy the foreign language requirement doesn’t equal that full first year of science, philosophy, and history courses.

    Dad modified the Power BI model to properly calculate the overall GPA. There are several possible solutions. He took an approach to show multiple features of Power BI Desktop. The first step was adding a new calculated column. This process begins by going to the left edge of Power BI Desktop and selecting the Data view.


    Figure 8. Creating a calculated column.

    The calculated column definition is the product of the GPA and the Hours columns. Notice that the column definition uses the Excel worksheet name which is grades.

    Product GPA Hours = grades[GPA] * grades[Hours]

    Next, measures were added.


    Figure 9. Adding the first measure.


    Figure 10. Adding the third measure.


    Figure 11. Adding the final measure.

    The measures created are:

    Minimum GPA = 0.0
    Maximum GPA = 4.0
    Target GPA = 3.0
    Overall GPA = SUM(grades[Product GPA Hours]) / sum(grades[Hours])

    Dad completed the model by adding a Gauge visualization to the canvas and setting the properties.


    Figure 12. Gauge visualization showing the actual overall grade point average.

    CK is driving the beat up Taurus with no Tesla to replace it.

    The grades, graduates, and events in this blog post are fictional. No similarity to actual graduates (living or deceased) should be inferred. No graduates were harmed in the making of this model.

  • In The Cloud: Azure Cosmos DB

    Azure Cosmos DB is Microsoft’s new global scale, distributed database as a service that supersedes Azure Document DB. Cosmos is a technology that enables you to create applications that are unimaginable with a single conventional relational database, even if the single relational database is stored in the cloud. Cosmos is something you need to know more about if you make your living working with data.

    Read more about Azure Cosmos DB here. If you want to try code against Cosmos and you don’t have an Azure account or don’t want to incur usage charges, you can download and install the Azure Cosmos DB Emulator. Some of the documentation refers to Document DB instead of Cosmos. Remember, Cosmos supersedes Document DB.


    Figure 1. Azure Cosmos DB emulator.

  • In The Cloud: Azure Portal and HTML5 Storage Capacity Exceeded

    While preparing blog posts last week, the following message appeared in my browser window. I was in the Azure portal when the message appeared. It’s not an error with Azure or the browser. It could happen when using any website or browser. I tried again and everything was fine. The browser had many tabs open for days. A lot of things had been done in the browser.

    Because technical training is part of what I do from time to time, I saw this as a good educational opportunity. I think all technically oriented web users are aware of cookies. HTML5 has something different called localStorage and sessionStorage. If you’re interested in learning more about this, I highly recommend reading this from W3 Schools.


    Figure 1. Not enough storage capacity in the browser when using the Azure portal.

    Seeing the message is a very rare event. My point in posting this is not about the message per se, but to make you aware of the modern storage features of HTML5.

  • [OT] Never Bet Against Cloud Computing

    Today at the Preakness horse racing event, Classic Empire dominated and led the race appearing to be the certain winner with Always Dreaming staying a close second and appearing to be a possible contender. At the end, Cloud Computing charged ahead leaving Classic Empire behind. Always Dreaming faded to eighth place. The victory of Cloud Computing was called an upset, but really people, had I been there, I would have put my money on Cloud Computing from the get go.

  • In The Cloud: Owner Role and Security Administration

    Azure uses Role Based Access Control (RBAC), which is something people generally don’t pay at lot of attention to when initially learning how to use the Azure portal. Take a close look at the screen capture shown below. The Delete button is disabled. There are definitely times you want to protect Azure resources from accidental deletion. That’s just basic good governance.


    Figure 1. Delete button disabled for a Data Catalog resource.

    The reason that the Delete button is disabled can be understood by going to subscription management.


    Figure 2. In the Azure portal, click the arrow to show more services, then select Subscriptions.


    Figure 3. Notice that I was in the User Access Administrator role which doesn’t have delete privilege.

    The individual vertical sections in the Azure portal are called blades. Clicking the Add button causes the Add Permissions blade to appear. You need to select a role and then select the member(s) to add to the role.


    Figure 4. The Add Permissions blade before selecting member(s) to add to the role.


    Figure 5. The Add Permissions blade after selecting member(s). Click Save to make the RBAC change.


    Figure 6. Notice that the Delete button is now enabled. It doesn’t immediately perform the deletion. It prompts you to make sure.


    Figure 7. Prompt to make sure you really want to do this. Notice that the Delete button is disabled at this point.

    Typing the resource group name is a pain. I always move the mouse over the resource group name in the warning, double-click, copy, and paste.


    Figure 8. Use copy and paste to simplify the deletion process.


    Figure 9. The Delete button is enabled after entering it into the box.


    Figure 10. Deletion in progress.

  • BI Beginner: Stacked Charts in Power BI

    People new to Power BI Desktop have asked me how to create a columnar chart where each column has different colors for different values stacked on top of each other. There’s no reason to be intimidated. Creating such a chart is very simple with only two additional requirements since yesterday’s port. First, your input data must classify the data into groups. You’ll either need groups specified in the data or be able to classify the data into groups based on some criteria. Today the sample data has two columns of sales data, one for sales of Widget A, and the other for sales of Widget B.


    Figure 1. Sales data for Widget A and Widget B instead of total overall sales.

    Second, you must chose an appropriate visualization. Not all visualizations are suitable for stacking different data values on top of each other. You’ll need a visualization that has stacked within the visualization’s name. You want stacked data, so you need a stacked visualization. I told you this was easy!

    It is assumed that you understand the content in yesterday’s post. Those initial steps are not repeated here so that we can focus on the concept of stacked charts. What’s different from yesterday is that two columns are placed underneath Value.


    Figure 2. You must drag both A Units Sold and B Units Sold to place them underneath Value.


    Figure 3. Final stacked column chart.


    Figure 4. You can change from a stacked column chart to a stacked bar chart by simply clicking the icon for a stacked bar chart.

    With Power BI, it’s very easy to change your visualization by clicking on a different visualization. I caution you that you do need to pay attention. Not all visualizations are equal. Take a close look at the Y-axis when you take this data and select the line chart visualization.


    Figure 5. Notice that the line chart visualization isn’t stacked and has a maximum Y-axis value of 20.


    Figure 6. The stacked area chart has a maximum Y-axis value of 30.

This Blog



Privacy Statement