
SQLBI - Marco Russo

SQLBI is a blog dedicated to building Business Intelligence solutions with SQL Server.
You can follow me on Twitter: @marcorus

  • Format measures as dates in Power BI #dax #powerbi

    If you want to format a measure as a date in Power BI, you need an expression that returns a date/time data type. This is different from what you can do in Power Pivot and in SSAS Tabular, where you can format any numeric expression using a date and/or time format, and the conversion required to display the value is performed implicitly.

    The problem might not be very common, because you obtain a numeric expression only when you start manipulating a date without using the dedicated DAX functions. A date is a floating point number where the integer part is the distance in days from December 30, 1899, and the decimal part is the fraction of a day (for example, 42736.5 corresponds to January 1, 2017 at 12:00 PM). In the following examples I use NOW() to obtain a date and time, but you might have any other expression operating on your data.

    This measure in Power BI returns a date/time, and can be formatted using any “Date Time” format:

    Measure := NOW ()

    You can remove the time by truncating the number, obtaining only the day with the time corresponding to 12:00am:

    Measure := TRUNC ( NOW () )

    At this point, if you try to change the format of the measure, the “Date Time” format is disabled:

    [screenshot]

    In DAX there is no type conversion operator to date/time. However, you can add the numeric expression you have to a recognized date/time value. If you add the equivalent of 0 (corresponding to December 30, 1899), the final data type is a date/time. The following expression returns a date/time data type corresponding to the current day.

    Measure := TRUNC ( NOW() ) + DATE ( 1899, 12, 30 )

    The format of the measure now displays only Date Time formats.

    [screenshot]

    Please note that for the specific example used in this post, you can use the DAX function TODAY(), which returns the current date without the time. The function NOW has been used only to represent any date/time expression that you might obtain from other DAX expressions over your data.
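
    For example, since the goal here is just the current date, the last measure above could simply be written as:

    Measure := TODAY ()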

    Thanks to Jeffrey Wang for providing me the hint used in the technique I described in this blog post.

  • DAX Editor 2.0 finally available: it is a must have for #ssas #tabular #dax developers

    The SQL Server Data Tools (SSDT) editor for Tabular models has been improved in the recent updates. For example, the Tabular Model Explorer enables you to quickly jump between the entities of your model, organized in a hierarchical way. However, when it comes to editing DAX measures, you are still constrained to a user interface where you can modify only one measure at a time. Moreover, when you have hundreds of measures in the same Tabular model, every change you apply might require a few seconds of waiting time, because of the validation executed each time.

    Several years ago, a first version of DAX Editor for Visual Studio was created by Nick Medveditskov, a former developer of the Analysis Services team who released the tool to the community as an open source project. Later on, Teo Lachev and I contributed to upgrading the tool when new versions of SSDT were released (Visual Studio 2012 and 2013). However, that version was not compatible with the new compatibility level (1200) introduced by Analysis Services 2016, which saves the model in a JSON format instead of XML. Thanks to the collaboration of a few customers who wanted to migrate this tool to the new version of Analysis Services, we (at SQLBI) hired a developer who helped us with this migration, and now a new version of DAX Editor is available in the Visual Studio Gallery.

    DAX Editor 2.0 supports both XML and JSON formats (compatibility levels 1103 and 1200), and it can be installed on Visual Studio 2012, 2013, and 2015.
    • You can see a quick introduction of how to install and use it in the DAX Editor – Introduction video (just 3 minutes).
    • A more detailed explanation of how to install and start using DAX Editor is available in the article Using DAX Editor on SQLBI.
    • The source code is available on GitHub.
    • Finally, a single page with all the links is available in DAX Editor tools on SQLBI.

    One warning and one request. The warning first: despite the many tests we did, there will certainly be other bugs. Create a backup of your BIM file before using DAX Editor; you don’t want to lose your hard work because of a bug in an editor!
    The request: please report any bug in the Issues page on GitHub, write a review if you like the tool, and if you are interested in contributing to the development of new features, contact me to request access.

  • Install SQL Server 2016 SP1 for Analysis Services #ssas #tabular #sqlserver

    Microsoft recently released Service Pack 1 for SQL Server 2016. If you use Analysis Services, installing this update is a no-brainer and you should do it as soon as possible. The reasons are all related to memory management. Microsoft officially described this in a blog post on MSDN. I suggest reading it, but let me describe in a more informal way why this update should be a priority for you.

    1. SSAS Tabular had a heap fragmentation issue in SSAS 2012 and 2014, and most of the issues were still present in SSAS 2016 because of other bugs. If you run SSAS 2012/2014 (any version/SP) or SSAS 2016 RTM (any CU), you should mitigate the problems using the workarounds described in the article Heap Memory Settings for Analysis Services Tabular 2012 / 2014. The new allocator in SSAS 2016 SP1 should get rid of these issues, so you should reset the HeapTypeForObjects and MemoryHeapType settings to their new default values (MemoryHeapType=-1 and HeapTypeForObjects=0). I still don’t have metrics from real-world usage, so I invite you to test these parameters, monitor the memory on your server after a few hours/days, and report (using comments) whether the new settings work better than the previous ones.
    2. Improved performance thanks to the new memory allocator. I ran some stress tests on machines with one socket and multiple cores, and I measured improvements between 5% and 10% in tests stressing queries over large tables, using the same hardware and under the same conditions. Since the improvement is in memory allocation, I would not be surprised if there were improvements in processing time as well. In any case, getting faster performance for free is always welcome.
    3. NUMA support. The Microsoft post is very honest: it describes this as “NUMA awareness” and talks about “better NUMA support”. I still don’t have numbers, but based on the information I collected we can say that the benefits are relevant for large data models running on 4 sockets or more. I have seen statements indicating that models of around 30GB or more would start to see improvements, but I think you should also consider the model type. A single model with a huge number of tables and columns might not benefit too much, whereas if a single table has billions of rows, the benefits should be more visible. Remember: if you are able to store your model on a single-socket machine, you probably have (on average) better performance. But if you want to enter the club of 10+ billion rows in a single table, then this is an option you should consider. If you have these models and you have 4 sockets or more, please share your experience!
      At the same time, if you have many small models, it is probably better to split your workload across different SSAS Tabular instances, each one running on a single socket. But you can try, and share your results, especially if my prediction is wrong and you see performance benefits from scattering the data of small models across multiple sockets.
      Finally, if you have a large table (>2 billion rows) and only 2 sockets, please contact me to play with some non-default VertiPaq settings.

    Getting scalability over a NUMA architecture is not easy, but this is a long-awaited first step in the right direction. In any case, this update is a priority if you have SSAS Tabular 2016.

  • New Server Timings features in DAX Studio 2.5.0 #dax #powerbi #ssas #tabular

    Last week, a new version of DAX Studio (2.5.0) was released. You can find a summary of the new features in the blog post from Darren Gosbell - thanks Darren for your wonderful job with this tool!

    My small contribution to this tool is mainly in the area of performance analysis. In the last few months, I worked on implementing support for DirectQuery, which I described in the article Analyze DirectQuery requests using DAX Studio on SQLBI. I also fixed a few bugs in the xmSQL formatting code (we clean up a lot of verbose information, but sometimes we still cut too much from xmSQL; expect more fixes in upcoming releases). But I also added a small feature that will help save a lot of time in performance analysis.

    The Server Timings tab has two new columns, Rows and KB, which have the following meaning:

    • Rows: the number of rows that the engine estimated as the result of the storage engine query. This number is important to get an idea of the cardinality of the result. However, be careful: this is an estimate, and the actual result could be different, but in general the order of magnitude provided is relevant. When you spot one or more storage engine queries returning more rows than the result of the entire query, you know that such a materialization will be filtered or aggregated by the formula engine, which is not as efficient in doing that as the storage engine. In other words, a large number of rows in a storage engine query could be indirectly responsible for a bottleneck attributed to the formula engine.
    • KB: the estimated size in memory (measured in KB) of the result of the storage engine query (this result is also called a data cache). Usually this size is related to the number of rows, but when you materialize an entire table instead of a few columns, the KB number will be very high compared to the Rows. By identifying the storage engine queries with the largest KB size, it should be easier to identify which part of the DAX code is responsible for them. Classical examples are filters based on an entire table instead of one or two columns only, and a context transition iterating a table without a primary key (typical of fact tables) instead of iterating just the values of a single column (see the sketch after this list).
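
    As an illustration of the table-versus-column filter difference, here is a minimal sketch (the Sales table, the Product table, and the [Sales Amount] measure are assumed names, not taken from a specific model). The first measure uses a table filter that can force the storage engine to materialize a large data cache over Sales, whereas the second one filters a single column and usually produces a much smaller data cache:

    [Red Sales (table filter)] :=
    CALCULATE (
        [Sales Amount],
        FILTER ( Sales, RELATED ( 'Product'[Color] ) = "Red" )  -- table filter: large materialization
    )

    [Red Sales (column filter)] :=
    CALCULATE (
        [Sales Amount],
        'Product'[Color] = "Red"  -- column filter: only one column is required
    )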

    These two columns are populated only when you connect DAX Studio to Power BI, Excel 2016, or Analysis Services 2016 (if you connect to previous versions, you will see these columns empty). The reason is that we simply parse the text of the storage engine query: in these products, at the end of the query text there is an estimate of the rows and memory used, which we copy into the properties of the events captured in the trace session, as you can see in the following screenshot.

    [screenshot]

    This feature is particularly useful when you have many storage engine queries for a single MDX or DAX query, and you want to identify potential bottlenecks in both the storage engine (complex queries in SQL)  and the formula engine (which does not cache its results, and usually iterates all the rows of the data cache).

  • Analyze multiple EVALUATE statements in a single #dax query in DAX Studio

    A few hours ago, DAX Studio 2.5.0 was released, with a number of small new features (I will write about DirectQuery and the new columns in Server Timings in a future blog post and article). Among the many bug fixes, this version of DAX Studio no longer raises an error when multiple EVALUATE statements are executed within the same Run operation. This can be particularly useful when you analyze the DAX queries generated by Power BI, which optimizes the roundtrip between client and server by using exactly this technique. However, DAX Studio still doesn’t have full support for this scenario, but it’s good enough to start an analysis. Let’s see the current situation in detail:

    • Results: only the rows returned by the first EVALUATE are displayed in the Results pane. Currently, DAX Studio ignores the subsequent result sets, which are executed on the server and transferred to the client, but not displayed.
    • Query Plan: the logical query plan contains all the operations of all the EVALUATE statements. However, the physical query plan only displays the operations executed for the first statement, ignoring the physical query plans of the subsequent EVALUATE statements.
    • Server Timings: all the storage engine events of all the statements are displayed and computed. Thus, if you consider the set of EVALUATE statements as a single operation, Server Timings does exactly the right thing. However, you cannot easily split the time spent for each EVALUATE statement.

    The plan for future improvements is to align the Query Plan behavior with Server Timings, showing all the operations of all the EVALUATE statements. For the Results pane, we have to find a way to display the other result sets in an efficient way (feedback is welcome – I don’t like the idea of creating a pane for each result).

    Looking at this issue, I also found the answer to a question that came up one week ago while commenting on Chris Webb's post Defining Variables in DAX Queries. The question was: why should you use the VAR syntax before EVALUATE? For example, why should you use the first syntax below instead of the second one?

    DEFINE
        MEASURE Sales[Qt] =
            SUM ( Sales[Quantity] )
        VAR TotalQuantity = [Qt]
    EVALUATE
    ADDCOLUMNS (
        ALL ( 'Product'[Color] ), 
        "Qt %", [Qt] / TotalQuantity 
    )

    DEFINE
        MEASURE Sales[Qt] =
            SUM ( Sales[Quantity] )
    EVALUATE
    VAR TotalQuantity = [Qt]
    RETURN
    ADDCOLUMNS (
        ALL ( 'Product'[Color] ), 
        "Qt %", [Qt] / TotalQuantity 
    )

    The reason is now clear to me: when you want to share the same variable in multiple EVALUATE statements, the former syntax guarantees a single definition and evaluation!

    DEFINE
        MEASURE Sales[Qt] =
            SUM ( Sales[Quantity] )
        VAR TotalQuantity = [Qt]
    EVALUATE
    ADDCOLUMNS (
        ALL ( 'Product'[Color] ), 
        "Qt %", [Qt] / TotalQuantity 
    )

    EVALUATE 
    ADDCOLUMNS (
        ALL ( 'Product'[Brand] ), 
        "Qt %", [Qt] / TotalQuantity 
    )

    I know, these details are interesting only if you are writing a DAX client and you are not in the Power BI team (they already use this technique) – in this case, write your comments below, I’d like to know who is working on these tools!

  • Q&A from 24 Hours of PASS #pass24hop #powerbi

    One week ago I delivered a session for 24 Hours of PASS, the free online event delivered by PASS (recordings are now available), where I introduced my one-day preconference Create a Power BI Solution in one day that I will deliver at PASS Summit 2016 in Seattle on October 25, 2016.

    As usual, there were too many questions and not enough time, so I include in this blog post the Q&A that I was not able to answer online. I hope it will be helpful.

    • Can you give us some information about the best way to govern security for accessing reports?
      • This is a long topic that you can understand better by reading the free eBook Introducing Microsoft Power BI.
      • In short, you can share a dashboard from your personal workspace (you can invite people from outside your organization in this case), or you can create a group workspace within your organization, so that all the members can access all documents without requiring a separate authorization for each dashboard. You also have organizational content packs as a way to deliver shared content within an organization.
    • When can we get a BI solution like Power BI without having to publish my data to the cloud?
    • Do your company's network administrators have to open ports in order for the gateway to work?
      • The Data Gateway is like a client connecting to web services through ports 80/443, plus a few other outbound ports that have to be opened. You can find the details here, in the Ports section.
    • Can PowerBI connect to an on-premise SQL Server OLTP database or only to an Analysis Services database?
      • You can do both. The connection to SQL Server can be in Import or DirectQuery mode, and the one to Analysis Services can be in Import or Live mode. Fundamentally, Import creates a copy of the data on the Power BI service that you can refresh, and the data are available even if your gateway is not accessible. Using DirectQuery / Live connections, data are not stored in the Power BI service, but your on-premise server must be available at query time.
    • When using the Data Gateway with a Windows user, where can the AD that authenticates that user reside?
      • The data gateway connects to Analysis Services using an administrator account, and it can impersonate a user through the EffectiveUserName property in the connection string (see the sketch after this Q&A list).
      • I suggest reading the Power BI Security article written by Adam Saxton.
    • Any thoughts about the row-level security introduced in the July release of Power BI versus the use of row-level security in SSAS?
      • The row-level security is fundamentally the same feature you have in Analysis Services, just exposed through Power BI.
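
    For reference on the EffectiveUserName answer above, this is a minimal sketch of a connection string using that property (the server, database, and user names are placeholders, not real values):

    Provider=MSOLAP;Data Source=MySsasServer;Initial Catalog=SalesModel;EffectiveUserName=CONTOSO\marco
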
  • BLANK and Boolean functions like IF in #dax

    A recent change in the DAX language modified the behavior of the IF statement: when the result should be a logical expression, it no longer returns BLANK, but only TRUE/FALSE. In that case, the BLANK is transformed into a FALSE condition.

    For example, consider that the result of the following expression is FALSE and not BLANK:

    IF ( 1 = 1, BLANK(), TRUE )

    I wrote a longer explanation of that, thanks to Jeffrey Wang who provided the details of the implementation. Also a big thanks to Darren Gosbell, who raised the initial question.

  • Upcoming conference speeches and workshops in 2016 #ssas #tabular #dax #powerpivot #powerbi

    The summer is almost over and, while we are working on new content (books and other material for www.sqlbi.com), I already have the plans for this autumn’s conferences.

    If you are interested in attending the PASS Summit 2016, don’t miss 24 Hours of PASS (live online, September 7-8, 2016), where I will preview the full-day seminar about Power BI on 07 Sep 2016 at 21:00 GMT. This event is free, you just have to register, and there are many other interesting sessions to watch.

    Alberto Ferrari and I will also have a number of public trainings:

    The Analysis Services Tabular Workshop course has been renewed and updated to Analysis Services 2016. The one in Amsterdam will be the first delivery in a public classroom; depending on the demand, we will propose new dates in 2017.

    See you around the world!

  • Update custom visuals on OKViz (and name survey result) #powerbi

    In the last few days, users of the Synoptic Panel and Smart Filter (custom visuals for Power BI) experienced some issues in the behavior of these components. Changes applied to the API and automatic updates pushed through the Microsoft Power BI Gallery created some unexpected problems. Now the components are synchronized on www.okviz.com and the Power BI Gallery. If you have used the components in Power BI Desktop, make sure to download and update the components to the latest version available and, if necessary, publish the report to the Power BI service, too.

    We worked to make sure such a disruption will not happen again! We also have some interesting improvements for Smart Filter, but we have to make sure certain APIs stabilize before deploying them.

    In the meantime we closed the survey for the OKViz name, and the result is… stay with OKViz! Complete results are available here.

  • Leverage INTERSECT to apply relationships in DAX

    If you are used to virtual relationships in DAX (see Handling Different Granularities in DAX), you probably use the following pattern relatively often:

    [Filtered Measure] :=
    CALCULATE (
        <target_measure>,
        FILTER (
            ALL ( <target_granularity_column> ),
            CONTAINS (
                VALUES ( <lookup_granularity_column> ),
                <lookup_granularity_column>,
                <target_granularity_column> 
            )
        )
    )

    In the new DAX available in Excel 2016*, Power BI Desktop, and Analysis Services 2016, you can use a simpler syntax, which offers a minimal performance improvement and is much more readable:

    [Filtered Measure] :=
    CALCULATE (
        <target_measure>,
        INTERSECT (
            ALL ( <target_granularity_column> ),
            VALUES ( <lookup_granularity_column> )
        )
    )
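
    As a concrete illustration of this pattern (just a sketch: the Budget table, the Date table, and the column names below are hypothetical and not part of the article), a budget stored at the month granularity could be filtered by the months selected on the Date table as follows:

    [Budget Amount Filtered] :=
    CALCULATE (
        SUM ( Budget[Amount] ),
        INTERSECT (
            ALL ( Budget[YearMonth] ),      -- target granularity column
            VALUES ( 'Date'[YearMonth] )    -- lookup granularity column
        )
    )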

    You can find a longer explanation of this new pattern and download some examples in the new article Physical and Virtual Relationships in DAX, on SQLBI web site.

  • Happy Birthday Power BI #powerbi

    Power BI has been on the market for one year. My biggest concern, when the product was still in private beta, was the promise of monthly releases made by Microsoft. Today, I can say that the promise was real. I see a long road ahead in terms of features and possible improvements. But it is a matter of fact that Power BI is a product with growing user adoption, one that every month adds features that increase the number of companies and users that can consider its usage.

    If you go back to the situation of Microsoft BI two years ago, you should remember the lack of a mobile story, the requirement for SharePoint, and the poor situation on the client side. A great server product (such as Analysis Services) was limited in its adoption because of the client options available. Today the trend is completely different, and in certain areas of the BI platform Microsoft has become the leader instead of a follower.

    I always try to find the missing part, what can be improved, without spending too much time praising what is good (and there are many things that are). But, today, I just want to join the choir you will see in the video:

    HAPPY BIRTHDAY POWER BI !

  • A new MemoryHeapType default in #ssas #tabular 2016 (please, fix your setting in production!)

    If you already installed Analysis Services 2016, you should change the MemoryHeapType setting. There is a new default value (-1), which is an automatic choice that currently applies a new hybrid allocator (corresponding to the value 5). It should resolve the memory fragmentation problem causing the performance issues described in an article I wrote a few years ago. However, the setup does not write the new default value as the current value: it still writes the old default “2”, which is not good for Tabular. Thus, if you installed SSAS Tabular 2016, you probably have this setting (note that the Default Value is different from the Value!). The new setting is also the suggested one for Multidimensional.

    [screenshot]

    You should change the value to -1 and then restart the service. After that, reconnect to SSAS Properties and check that you have the following configuration:

    [screenshot]

    Of course, we hope future updates of SQL 2016 will fix this setup issue. In the meantime, fix the setting to avoid performance issues on a production server!

  • Free Introducing Power BI eBook and new DAX recorded video course #powerbi #dax

    Microsoft Press released a free eBook you can already download, Introducing Microsoft Power BI, which Alberto Ferrari and I wrote in the last few months. Please note it is a very introductory book, don’t expect an “inside out” reference. As we wrote in the introduction:

    We wanted to write an introduction to Power BI that covers the basics of the tool and, at the same time, shows you what the main capabilities of Power BI are. […] At the beginning, we go for an easy introduction of the concepts along with an educational approach that lets you follow on your PC the same steps we show in the book. […] After the first chapters, we begin to run a bit faster, knowing that we are no longer guiding you step by step. […]

    This book is targeted to a variety of readers. There are information workers and people who are totally new to the BI world. For those readers, the book acts as a simple introduction to the concepts that are the foundation of BI. Yet, another category of readers we wanted to target is that of IT professionals and database administrators who might need to drive the decisions of the company in adopting Power BI, because their users are asking for it. If this is you, this book acts as both a simple introduction to the basic concepts, to help you understand why users are so interested in Power BI, and as an overview of the capabilities and tools available in Power BI, so that you can make educated choices in adopting it.

    As you see in the side picture, we included some real-world reports, and we have an entire chapter titled “Improving Power BI reports” where you will find a number of useful examples and guidelines. And, of course, they are included in the companion content, which is a separate download available here. We used some of the components available on okviz.com, even if, when we wrote the book, that web site was not ready yet, so we didn’t mention it in the book.

    The goal of the book is to be introductory. So, what’s next? We are working on some new content for later this year, but in the last few months we also worked on a recorded version of our Mastering DAX course. So, if you cannot join us in one of the many classes we deliver around the world, you can now get a recorded version, which is complete with all the exercises. You will not have the same interaction that is possible in a classroom, but early adopters who also attended the live class told us that the recorded video works well as a review tool. The structure and the flow are the same, and we tried to compensate for the lack of interactivity with a physical presence on the screen.

    You can watch a number of segments for free, and you can save 70 USD until July 5 by getting access to the entire course.

  • Many-to-many relationships in DAX and CROSSFILTER #dax #powerbi #powerpivot #ssas #tabular

    I wrote many articles and white papers about implementing many-to-many relationships in Analysis Services - for both Multidimensional and Tabular models. For the latter, the same techniques are also useful in Power Pivot and Power BI. With the recent release of Excel 2016, Analysis Services 2016, and the engine available in Power BI Desktop, there are new tools and techniques that we can use to implement this type of data model, which are more efficient and easier to manage in your DAX code.

    The article Many-to-many relationships in Power BI and Excel 2016 describes how to use the bidirectional filters in Power BI and Analysis Services 2016, and also how to obtain the same performance in Excel 2016 by using the CROSSFILTER function (see the sketch below). These approaches rely on a smart technique that applies the filter through the relationship in the many-to-one direction only when it is really necessary, that is, when there is an existing filter over the bridge table of the many-to-many relationship. This is possible in the "old" DAX with a more complex conditional statement, which makes the code less readable and also suffers from another performance issue related to the usage of an IF statement in a measure (also solved in the "new" engine).
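
    As a minimal sketch of the CROSSFILTER approach (the Transactions, Account, and BridgeAccountCustomer names below are assumptions used only for illustration, not taken from the article), the bidirectional filter on the bridge relationship can be activated for a single measure:

    [Transactions Amount M2M] :=
    CALCULATE (
        SUM ( Transactions[Amount] ),
        -- enable the bidirectional filter on the bridge relationship only for this measure
        CROSSFILTER ( BridgeAccountCustomer[AccountKey], Account[AccountKey], BOTH )
    )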

    I think there is a huge number of data modeling options that are now possible thanks to these innovations, many of which will have an impact on several DAX Patterns, which we will revisit in a few months using the new techniques.

  • Licensing and upgrade options for Analysis Services Tabular in SQL Server 2016 #ssas #tabular

    With the release of Microsoft SQL Server 2016, there are several options to use Analysis Services Tabular.

    • SQL Server Standard: it includes Analysis Services Tabular, and it is available in two licensing models: Server + Client Access License (CAL) and Core-based.
    • SQL Server Enterprise: it has all the features, and it is available only with a per-core license.

    For a Tabular model, the limitations of the Standard edition affect the memory available for an instance (16GB) and the lack of these features, which are available only in Enterprise: DirectQuery, partitions, and perspectives. The 24-core limit of the Standard edition is fundamentally a non-issue, because with the right single-socket hardware you can already have that many cores (avoiding NUMA is still better for Tabular, at least in the current release of 2016). You can find more details in this detailed matrix of the features available in each edition of SQL Server 2016.

    If you have a Tabular model in SSAS 2012 or SSAS 2014 today, you might want to upgrade immediately because of the big performance improvements available in SSAS 2016 (almost any data model and report should benefit). However, what are the options to upgrade? It depends on your current license and whether you have a software assurance (SA) in place or not.

    • SQL 2012/14 Enterprise with SA: you can upgrade today; you should already be covered.
    • SQL 2012/14 Business Intelligence with SA: for the duration of your current SA contract, you can upgrade to SSAS 2016 using a Server+CAL licensing model that is not available otherwise. At the end of your SA, you will have to renew using the current SQL Server Enterprise licensing terms. You can also acquire new Business Intelligence licenses (to upgrade in this way) up to 25% of the number you had on May 1, 2016.
    • SQL 2012/14 Enterprise or Business Intelligence without SA: you have to buy a license for SQL 2016. You might consider using Standard for SSAS Tabular in case your model does not need the features available in Enterprise; otherwise, you should get an Enterprise license.

    Please note this is a summary I created to recap the current situation. I suggest you consider all the details and terms explained in this PDF. You can also visit this Microsoft website with all the information about SQL Server licensing.

    Finally, very welcome news is that the Microsoft SQL Server 2016 Developer edition is now free. It had a minimal cost in the past, but now it will be much easier to install development environments also in those companies with a very long and bureaucratic procurement process, as well as for all the consultants who want to study the new environment on their workstations without using a trial version that expires in 6 months.
