Tuesday, April 3, 2018

Power BI: M Query Basics (Part 1)

When you work with Power BI, you'll find that there are two languages to work with: DAX and M. Since most of the M query that you will regularly use is automagically (*that's not a real word, but it SHOULD be one) written by the Power BI user interface, the majority of Power BI developers focus on becoming proficient in DAX and really only look at M when they're stuck in a situation that requires that they learn it.

This M&M is a mashup of chocolate
 and pretzels
So, what exactly is M?

Some people think M is for "Microsoft".

Not quite.

According to the Microsoft Power Query M Reference,  "The Power Query M formula language is optimized for building highly flexible data mashup queries. It's a functional, case sensitive language similar to F#."

Generally speaking, most Power BI professionals settle upon saying that M is for 'Mashup'.

Now, when I heard that, the first thing I thought of was the orange pretzel M&M guy - he's a mashup of a favorite chocolate candy with pretzel and maybe some other stuff. Quite enjoyable, but not your standard or traditional M&M - just like M Query.

M is a functional script language with a step by step language structure used for data preparation. With it, one can transform or combine data sets, pivot, unpivot, split columns,  add or remove columns, remove unwanted characters, modify data types and perform data manipulations, among other things.

In Power BI, M Query can be automagically generated within its Query Editor user interface. The "Applied Steps" listing there summarizes all of the sequential M Query steps that have been applied to the current query, allowing users who aren't proficient in M Query to quickly view the changes made to the data set and to modify the changes without intimate knowledge of the M Query syntax.

Let's take a look at the structure of an M Query. You can access the M Query code from the Query Editor in Power BI by clicking on "Advanced Editor".

Here is an example of a Power BI M Query:

In Dark Blue, we have the "let" expression. This is equivalent to our SQL "CREATE PROCEDURE" declaration - it demarks the beginning of the series of steps used to transform the data into the final table to be used in Power BI.

Next, in Purple, we have the "in" expression. The Named Expression which follows "in" is the Output or final result of our M query. This defines the values that are seen in the grid view within the query editor.

In Aqua, we have the Named Expressions and variables. Each Step is a Named Expression which functions as a label to identify that step. Steps that are a single word may be declared by simply using the word alone, however, step names which contain spaces must be marked by using # and then enclosing the words within ". An example would be #"Renamed Columns"... Note that Source, which is only one word, does not need either the # or the " surrounding it.

In Green, we have the M Functions. These are the transformations performed upon the data set and the commands used to shape or mashup the data as needed by the final Power BI report.

In Black are the references to the previous steps. If you check each line, you will see that the Previous Step reference (in black) exactly matches the Named Expression (in Aqua) in the line above it. This helps demonstrate the step by step nature of the language - each successive step references and builds upon prior steps.

 There are a few other types within the M Query syntax, such as strings in Brick Red, data types and key words, but generally speaking the above summarizes the M Query syntax.

No comments:

Post a Comment

Resoved: Error Creating Storage Event Trigger in Azure Synapse

My client receives external files from a vendor and wants to ingest them into the Data Lake using the integration pipelines in Synapse (Syna...