DAX – Find the Items Ranked in Top n for Multiple Periods (with Dynamic Slicing)

One of my previous blog posts introduced how to find the items that rank in the top n for multiple periods, using the INTERSECT and TOPN functions. However, that approach hard-codes the periods and the number of top items in the DAX script. This blog post introduces an approach that lets users specify the periods and the number of top items dynamically, using interactive dashboard slicers.

In this blog post, we will again use the Eurovision dataset as the example. It contains the country-to-country votes for each year.

We will create four measures: “Rank”, “In Top N (This Year)”, “In Top N (All Selected Years)”, and “All Selected Years in Top N”. These measures are evaluated in a context made up of each combination of year and country. To build that evaluation context, we can use a Power BI table visual and add the “Year” and “ToCountry” columns from the Eurovision dataset to it. The four measures will be added to the table later to show the rank of each country in each year and whether it is in the top N.

A “Year” slicer is added to the dashboard so that users can filter the table by the selected years. Any number of years can be selected, and they can be consecutive or non-consecutive.

Measure – Rank

The first measure to create is the “Rank” measure that computes the ranks of the countries in each selected year.

Rank = RANKX(ALL(data[ToCountry]), CALCULATE(SUM(data[Points])))

Measure – In Top N (This Year)

Based on the “Rank” measure, we create the “In Top N (This Year)” measure, which computes whether the current country is ranked in the top N in the current year-country evaluation context. We need to let users specify N (the number of top items) dynamically. We can achieve that with a disconnected parameter table that defines the options for N.

In the DAX measure, we read the user-selected N with the VALUES function and compare it with the “Rank” measure created earlier to evaluate whether the current country is in the top N in the current year. If no single value is selected, N falls back to 10.

In Top N (This Year) = 
    IF([Rank] <=
        IF(HASONEVALUE('TopN'[Top N ]),
            VALUES('TopN'[Top N ]),
            10
        ), 1, 0)

We then filter the table visual on the “In Top N (This Year)” measure so that it keeps only the countries ranked in the top N in at least one of the selected years.

Measure – In Top N (All Selected Years)

After we apply the filter on the “In Top N (This Year)” measure, the table contains only the rows for countries ranked in the top N in at least one selected year. If we count the rows in the filtered table for a country, we get the number of selected years in which that country is ranked in the top N. This is what the “In Top N (All Selected Years)” measure does.

In Top N (All Selected Years) = 
   CALCULATE(
        DISTINCTCOUNT(data[Year]),
        ALLSELECTED(data[Year])
    )

Measure – All Selected Years in Top N

Now that the “In Top N (All Selected Years)” measure tells us in how many of the selected years a country is ranked in the top N, we can calculate the total number of selected years and compare the two. If the value of the “In Top N (All Selected Years)” measure equals the total number of selected years, the country is ranked in the top N in all the selected years.

All Selected Years in Top N = 
    VAR NumberOfSelectedYears = 
        CALCULATE(
            DISTINCTCOUNT(data[Year]),
            ALLSELECTED(data[Year]),
            ALLSELECTED(data[ToCountry])
        )
    RETURN
        [In Top N (All Selected Years)] = NumberOfSelectedYears
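
To sanity-check the logic of these measures outside Power BI, the same steps can be sketched in base R. This is only an illustration under stated assumptions: the `votes` data frame holds made-up numbers rather than the real Eurovision data, and `selected_years` and `top_n` are hypothetical stand-ins for the “Year” slicer and the Top N parameter table.

```r
# Toy votes (made-up numbers, not the real Eurovision data)
votes <- data.frame(
  Year      = c(2015, 2015, 2015, 2016, 2016, 2016),
  ToCountry = c("A", "B", "C", "A", "B", "C"),
  Points    = c(30, 20, 10, 5, 25, 15)
)

selected_years <- c(2015, 2016)  # hypothetical stand-in for the "Year" slicer
top_n <- 2                        # hypothetical stand-in for the "Top N" slicer

# Total points per year and country, restricted to the selected years
totals <- aggregate(Points ~ Year + ToCountry,
                    data = votes[votes$Year %in% selected_years, ],
                    FUN = sum)

# Rank within each year (1 = most points); tied countries share the lowest rank
totals$Rank <- ave(-totals$Points, totals$Year,
                   FUN = function(x) rank(x, ties.method = "min"))

# Keep the year-country rows in the top N, then count the years per country
in_top <- totals[totals$Rank <= top_n, ]
years_in_top <- table(in_top$ToCountry)

# Countries that are in the top N in every selected year
all_years <- names(years_in_top)[as.vector(years_in_top) == length(selected_years)]
print(all_years)
```

With this toy data, only country B is in the top 2 in both years, so `all_years` contains just "B".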

Please find the pbix file here.

R Visual – Create Gartner Magic Quadrant-Like Charts in Power BI using ggplot2

In this blog post, I am going to create an R visual that renders Gartner Magic Quadrant-like charts in Power BI using the ggplot2 package.

A dummy dataset is created with three columns: the “Company” column, holding the names of the companies to be placed in the quadrant chart, and the “ExcutionScore” and “VisionScore” columns, corresponding to the “Ability to Execute” and “Completeness of Vision” metrics in the Gartner Magic Quadrant assessment. In the dummy dataset, “ExcutionScore” and “VisionScore” are scaled from 0 to 100.

We drag an R visual onto the Power BI editor canvas and add the three columns from the dummy dataset. We can bind the RStudio IDE to Power BI and use it to author and test the R script.

In the R script editor, we first reference the “ggplot2” library and the “grid” library. The “grid” library is used to draw custom annotations outside of the main ggplot2 panel.

library(ggplot2)
library(grid)

We then create a ggplot2 object using the dataset referenced in the R visual, assigning the “VisionScore” value to x-axis and assigning the “ExcutionScore” value to y-axis.

p <- ggplot(dataset, aes(VisionScore, ExcutionScore))
p <- p + scale_x_continuous(expand = c(0, 0), limits = c(0, 100)) 
p <- p + scale_y_continuous(expand = c(0, 0), limits = c(0, 100))

We now have our base panel and we can start our journey to build the Gartner Magic Quadrant-Like chart.

First of all, we set the x-axis label to “COMPLETENESS OF VISION” and the y-axis label to “ABILITY TO EXECUTE”, and left-align both. We then remove the axis ticks and text from the plot. We also add a title at the top of the plot.

p <- p + labs(x="COMPLETENESS OF VISION",y="ABILITY TO EXECUTE")
p <- p + theme(axis.title.x = element_text(hjust = 0, vjust=4, colour="darkgrey",size=10,face="bold"))
p <- p + theme(axis.title.y = element_text(hjust = 0, vjust=0, colour="darkgrey",size=10,face="bold"))

p <- p + theme(
          axis.ticks.x=element_blank(), 
          axis.text.x=element_blank(),
          axis.ticks.y=element_blank(),
          axis.text.y=element_blank()
        )

p <- p + ggtitle("Gartner Magic Quadrant - Created for Power BI using ggplot2") 

With these steps, the chart has its axis titles and plot title in place.

We then add four rectangle annotations to fill the four quadrant areas using the Gartner Magic Quadrant colour scheme. We also need to create a border and the split lines for the quadrant chart.

p <- p +
      annotate("rect", xmin = 50, xmax = 100, ymin = 50, ymax = 100, fill= "#F8F9F9")  + 
      annotate("rect", xmin = 0, xmax = 50, ymin = 0, ymax = 50 , fill= "#F8F9F9") + 
      annotate("rect", xmin = 50, xmax = 100, ymin = 0, ymax = 50, fill= "white") + 
      annotate("rect", xmin = 0, xmax = 50, ymin = 50, ymax = 100, fill= "white")

p <- p + theme(panel.border = element_rect(colour = "lightgrey", fill=NA, size=4))
p <- p + geom_hline(yintercept=50, color = "lightgrey", size=1.5)
p <- p + geom_vline(xintercept=50, color = "lightgrey", size=1.5)

We also need to add a label to each quadrant area:

p <- p + geom_label(aes(x = 25, y = 97, label = "CHALLENGERS"), 
                    label.padding = unit(2, "mm"),  fill = "lightgrey", color="white")
p <- p + geom_label(aes(x = 75, y = 97, label = "LEADERS"), 
                    label.padding = unit(2, "mm"), fill = "lightgrey", color="white")
p <- p + geom_label(aes(x = 25, y = 3, label = "NICHE PLAYERS"), 
                    label.padding = unit(2, "mm"),  fill = "lightgrey", color="white")
p <- p + geom_label(aes(x = 75, y = 3, label = "VISIONARIES"), 
                    label.padding = unit(2, "mm"), fill = "lightgrey", color="white")

Up to this point, our chart starts to look like the Gartner magic quadrant. Next, we need to draw the company points to the chart with the position corresponding to their “Ability to Execute” value and “Completeness of Vision” value.

p <- p + geom_point(colour = "#2896BA", size = 5) 
p <- p  + geom_text(aes(label=Company),colour="#2896BA", hjust=-0.3, vjust=0.25, size=3.2)

Our quadrant chart is nearly done. Only one part is missing: the arrows next to the “Ability to Execute” and “Completeness of Vision” axis labels.

As the arrows need to sit outside the main panel, we create custom annotations (annotation_custom) with linesGrob to draw a straight line with an arrow at its far end. To make the arrows visible outside the main panel, we need to turn off clipping for the main panel.

p <- p + annotation_custom(
            grob = linesGrob(arrow=arrow(type="open", ends="last", length=unit(2,"mm")), 
                   gp=gpar(col="lightgrey", lwd=4)), 
            xmin = -2, xmax = -2, ymin = 25, ymax = 40
          )
p <- p + annotation_custom(
            grob = linesGrob(arrow=arrow(type="open", ends="last", length=unit(2,"mm")), 
                   gp=gpar(col="lightgrey", lwd=4)), 
            xmin = 28, xmax = 43, ymin = -3, ymax = -3
          )

gt = ggplot_gtable(ggplot_build(p))
gt$layout$clip[gt$layout$name=="panel"] = "off"
grid.draw(gt)

We now have our completed quadrant chart.

You can find the complete source code here:

Please find the pbix file here.

DAX – Find the Items Ranked in Top n for Multiple Periods

Update: I have suggested another approach here that allows users to dynamically specify the periods and the number of top items to evaluate, using interactive dashboard slicers.

When analysing the best performers against a specific measure, such as the best-selling products, we sometimes need to take multiple periods into consideration. For example, we may want to find the products that are ranked in the top 10 not only this year but also in other years. This blog post shows how to achieve this type of calculation using DAX.

Here we will use the Eurovision competition dataset as the example and compute the countries that are ranked in the top 10 in both 2015 and 2016.

The Eurovision competition dataset contains the rows of country-to-country votes for each year.

To find the countries that are ranked in the top 10 in both 2015 and 2016, we first compute the top 10 countries for each year, using the CALCULATETABLE function to filter on the year and the TOPN function to return the set of countries ranked in the top 10 for that year. Then we use the INTERSECT function to return the countries that appear in both years.

Top Countries In Both 2015 And 2016 = 
    INTERSECT(
        CALCULATETABLE(
            TOPN(10,
                SUMMARIZE(data, data[ToCountry]),
                CALCULATE(SUM(data[Points]))
             ),
            data[Year]=2016
        ),
        CALCULATETABLE(
            TOPN(10,
                SUMMARIZE(data, data[ToCountry]),
                CALCULATE(SUM(data[Points]))
             ),
            data[Year]=2015
        )
    )
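
The set logic behind the TOPN and INTERSECT combination can also be sketched in base R, which may help when checking the DAX result by hand. This is a minimal sketch under stated assumptions: the `votes` data frame holds made-up numbers, `top_n_countries` is a hypothetical helper written for this illustration, and top 2 is used instead of top 10 to keep the toy data small. Unlike TOPN, `head()` cuts off ties at exactly n rows.

```r
# Toy votes (made-up numbers, not the real Eurovision data)
votes <- data.frame(
  Year      = c(2015, 2015, 2015, 2016, 2016, 2016),
  ToCountry = c("A", "B", "C", "A", "B", "C"),
  Points    = c(30, 20, 10, 5, 25, 15)
)

# Hypothetical helper: the n countries with the most points in one year
top_n_countries <- function(df, year, n) {
  totals <- aggregate(Points ~ ToCountry, data = df[df$Year == year, ], FUN = sum)
  totals <- totals[order(-totals$Points), ]
  as.character(head(totals$ToCountry, n))
}

# Countries that appear in the top 2 of both years
both <- intersect(top_n_countries(votes, 2015, 2),
                  top_n_countries(votes, 2016, 2))
print(both)
```

Here the top 2 is A and B in 2015 and B and C in 2016, so the intersection is "B".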

The DAX script above will return the three countries which are ranked in top 10 for both year 2015 and 2016.

We can further improve the DAX script to make it return not only the name of the country but also the rank of the country for each year.

We can use the SUMMARIZECOLUMNS function combined with the RANKX function to compute the rank of every country, and then use the NATURALINNERJOIN function to inner-join the result with the set we created earlier, which holds the countries ranked in the top 10 in both 2015 and 2016.

Top Countries In Both 2015 And 2016 = 
  NATURALINNERJOIN(
    CALCULATETABLE(
        SUMMARIZECOLUMNS(
                        data[Year], 
                        data[ToCountry], 
                        "Rank", RANKX(ALL(data[ToCountry]), CALCULATE(SUM(data[Points])))
                        ), 
        data[Year]=2016 || data[Year]=2015
    ),
    INTERSECT(
        CALCULATETABLE(
            TOPN(10,
                SUMMARIZE(data, data[ToCountry]),
                CALCULATE(SUM(data[Points]))
             ),
            data[Year]=2016
        ),
        CALCULATETABLE(
            TOPN(10,
                SUMMARIZE(data, data[ToCountry]),
                CALCULATE(SUM(data[Points]))
             ),
            data[Year]=2015
        )
    )  
  )

 

R Visual – from Grid-Facet to Geo-Facet in Power BI

In one of my previous blog posts, I used the facet_wrap function in the ggplot2 package to build a grid facet that displays the rank history of each Eurovision competition country.

The grid facet looks pretty neat, as all sub-panels are perfectly aligned. However, it does not show the geospatial information of the countries, which may reveal some useful insights. For example, in my last blog post, I built a voting network chart of the Eurovision competition that revealed mutual high voting scores between some neighbouring countries.

There is an R package named geofacet that comes with a list of pre-built geospatial grids for a number of geographical areas, countries and states. One of the pre-built grids is for the Europe area, which is perfect for our Eurovision example.

The geofacet package is very straightforward to use. After referencing the package in our R script, all we need to do is replace the facet_wrap function in our ggplot2 code with the facet_geo function provided by geofacet. We need to specify the column by which the facet is divided and the name of the pre-built grid to use. In this example, we use “eu_grid1”, which is the grid for the Europe area.
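
As a rough sketch of that replacement, the script could look like the following. It is an illustration under assumptions rather than the exact script from the report: the toy `dataset` (with made-up ranks) stands in for the data frame that Power BI passes to the R visual, the line geometry and reversed y-axis are my own choices, and it assumes the ToCountry values match the country names used in the grid. Countries that are not in the grid are not drawn.

```r
library(ggplot2)
library(geofacet)

# In Power BI the R visual supplies `dataset`; this toy frame
# (made-up ranks) only makes the sketch runnable on its own
dataset <- data.frame(
  Year      = rep(2014:2016, times = 3),
  ToCountry = rep(c("France", "Spain", "Sweden"), each = 3),
  Rank      = c(5, 9, 4, 12, 8, 10, 2, 1, 3)
)

p <- ggplot(dataset, aes(x = Year, y = Rank)) +
  geom_line() +
  scale_y_reverse() +                          # put rank 1 at the top
  facet_geo(~ ToCountry, grid = "eu_grid1")    # replaces facet_wrap(~ ToCountry)

print(p)
```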

That is all the work needed to convert our standard grid facet into a geospatial facet. You can download the pbix file here.

Apart from the Europe area grid, you can find a list of other pre-built grids here. Considering where I live at the moment, another pre-built grid I am particularly interested in is the London Borough grid. This is a geo-facet chart I have created to visualise the unemployment rate in the London boroughs.

You can also create your own grid, which is simply a data frame with four columns: the name and code columns, which map to the facet label column in the dataset, and the row and col columns, which specify the grid locations.

This is a test grid I have created to demonstrate how to create a custom grid:

customGrid <- data.frame(
  name = c("Enfield", "Haringey", "Islington", "Hackney", "Camden", "Hackeny", "Redbridge", "Brent", "Ealing"),
  code = c("Enfield", "Haringey", "Islington", "Hackney", "Camden", "Hackeny", "Redbridge", "Brent", "Ealing"),
  row = c(1, 2, 3, 3, 3, 3, 3, 4, 5),
  col = c(3, 3, 5, 4, 1, 2, 3, 3, 3),
  stringsAsFactors = FALSE
)

R Visual – Build Eurovision Voting Network Chart in Power BI

I have been watching the Eurovision competition for several years. I personally think the voting results can be a very good source for researching the relationships between European countries. In this blog post, I will create a social network R visual using the igraph package and use the visual to analyse the voting network of the Eurovision competition.

Firstly, we need to prepare our raw Eurovision dataset in a format with three columns: “From country” (where the vote comes from), “To Country” (where the vote goes to), and “Avg Point” (the average points the “From country” has given to the “To Country” over the years). You can prepare the data using either DAX (creating the “AvgPoint” measure) or Power Query (grouping by “From country” + “To Country” and calculating the average points).
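
For readers who want to see what this aggregation does, here is the equivalent group-by in base R. It is only a sketch: the `votes` data frame holds made-up numbers, and the column names `FromCountry`, `ToCountry`, `Points` and `AvgPoint` are assumed for the illustration.

```r
# Toy raw votes (made-up numbers); one row per vote per year
votes <- data.frame(
  Year        = c(2015, 2016, 2015, 2016),
  FromCountry = c("A", "A", "B", "B"),
  ToCountry   = c("B", "B", "A", "A"),
  Points      = c(12, 10, 8, 6)
)

# Average points per (FromCountry, ToCountry) pair across all years
avg_points <- aggregate(Points ~ FromCountry + ToCountry, data = votes, FUN = mean)
names(avg_points)[names(avg_points) == "Points"] <- "AvgPoint"

print(avg_points)
```

In this toy data A gave B an average of 11 points and B gave A an average of 7.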

We add an R visual to the Power BI canvas and add the three columns to it. If you prefer to use another R IDE (e.g., RStudio) to edit the R scripts, you can bind your IDE to Power BI.

We will use the igraph R package to render the voting network chart. Firstly, we load the igraph library and create an igraph graph object from the dataset specified on the Power BI R visual. We then use the plot function to render the network chart, setting the style attributes of the vertices, edges, etc.

# use the igraph library
library(igraph)

# create an igraph graph from the dataset specified on the Power BI R visual
df.g <- graph_from_data_frame(d = dataset, directed = TRUE)

# define colors
comps <- components(df.g)$membership
colbar <- rainbow(max(comps)+1)
V(df.g)$color <- colbar[comps+1]

# render the network chart and set the style attributes of vertex, edge etc.
plot(df.g, 
     vertex.label = V(df.g)$name,
     layout=layout_with_fr, 
     vertex.size=12,
     vertex.label.dist=0, 
     vertex.label.color= "darkblue",
     vertex.shape = "circle",
     vertex.label.cex = 1,
     vertex.label.font = 2,
     edge.arrow.size=0.5,
     edge.curved=TRUE,
     margin =-0.05
 )

After authoring and testing the R script in RStudio, we can add the script to the R visual in Power BI, where it can interact with the other visuals on the same page.

Before we set any threshold on the average voting points, all voting paths between the countries are drawn on the network chart, which makes the chart unreadable.

However, when we set a higher threshold on the average voting points, so that only the voting paths above the threshold are shown, some relationship patterns emerge.
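
The effect of the threshold, and what a mutual high vote means, can be sketched in base R. This is an illustration only: in the report the threshold is applied through Power BI filtering, the `edges` data frame holds made-up numbers, and the names `From`, `To`, `AvgPoint` and `threshold` are assumed.

```r
# Toy edge list (made-up numbers): one row per voting path
edges <- data.frame(
  From     = c("A", "B", "C", "D", "A"),
  To       = c("B", "A", "D", "C", "C"),
  AvgPoint = c(11, 10, 9.5, 4, 2)
)

threshold <- 9  # assumed cut-off for a "high" average vote

# Keep only the voting paths at or above the threshold
strong <- edges[edges$AvgPoint >= threshold, ]

# A path is mutual when its reverse path also survived the threshold
key     <- paste(strong$From, strong$To, sep = "->")
rev_key <- paste(strong$To, strong$From, sep = "->")
mutual  <- strong[rev_key %in% key, ]

print(mutual)
```

Only the A and B pair remains in `mutual`, because C votes highly for D but D does not return the favour.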

For example, we can see the mutual high votes between neighbouring countries such as Spain <-> Andorra, Romania <-> Moldova, and Greece <-> Cyprus.

Please find the pbix file here.

 

R Visual – Building Facet Grid in Power BI

Since Power BI started to support R visuals, it has become difficult to criticise Power BI’s visualisation capability, because we can now take full advantage of R’s powerful visualisation packages such as ggplot2 to create Power BI reports. Unlike creating a Power BI custom visual, which is a rather time-consuming task, we can create eye-catching charts in just a few lines with an R visual.

The facet grid is a popular chart type but is not supported by Power BI yet. However, we can easily build a facet grid chart with the help of the ggplot2 package.

Firstly, we need to get our data into the right format. In this example, we use the Eurovision competition dataset, which contains the voting records from 1975 to 2016.

We need to calculate the rank of each country for each year based on the points it received from the other countries. We can use the DAX RANKX function to create the rank measure.

Now we are ready to create our facet grid visual. We add an R visual to the Power BI canvas and add three columns, Year, ToCountry and Rank, to the visual.

In the R script editor, we first reference the ggplot2 library and then create a ggplot object, placing Year on the x-axis and Rank on the y-axis. The key step is to add facet_wrap(~ToCountry), which generates the facet grid by voting destination country (the ToCountry column).
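
A minimal sketch of such a script is shown below. It is an illustration under assumptions, not necessarily the exact script used in the report: the toy `dataset` (made-up ranks for placeholder countries A, B and C) stands in for the data frame that Power BI passes to the R visual, and the line and point geometries and the reversed y-axis are my own choices.

```r
library(ggplot2)

# In Power BI the R visual supplies `dataset`; this toy frame
# (made-up ranks) only makes the sketch runnable on its own
dataset <- data.frame(
  Year      = rep(2014:2016, times = 3),
  ToCountry = rep(c("A", "B", "C"), each = 3),
  Rank      = c(3, 1, 5, 21, 3, 16, 7, 2, 4)
)

p <- ggplot(dataset, aes(x = Year, y = Rank)) +
  geom_line() +
  geom_point() +
  scale_y_reverse() +        # put rank 1 at the top of each panel
  facet_wrap(~ ToCountry)    # one panel per destination country

print(p)
```

Inside the Power BI R visual, the block that builds the toy `dataset` should be left out, since the visual provides `dataset` itself.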

Please download the pbix file here.