I’ve recently discovered Power
BI. Basically, it’s the child of Power Pivot and SQL Reporting Services, but
without Kryptonite to slow it down. You can connect to all kinds of
datasources – Excel, SQL, Hadoop, folders ….and 55+ other data sources – and
then perform analyses of the data therein. Microsoft has even allowed developers
to publish / create new “visualizations” (charts, graphs, maps, etc.) for users
to adopt for their data analysis.
I decided to give it a try, and used my Wine Cellar Excel spreadsheet as the
test data set so I could more easily recognize any trends that the
Visualizations pointed out.
I first evaluated my collection by Country & Region by Quantity of Wine
(in ml). The light blue circles correspond to “France”, and the pink is
“Portugal”…Somehow Burgundy, France wound up near New York…and our Douro,
Portugal (Port!!) sat near the Great Lakes. This I cannot explain.. .I had to
edit “Rhone, France” to “Lyon, France” to move it from somewhere in Asia to
France. I’m not sure how it was handling the mapping, but I guess it cannot handle Region names.
The only way I could get the mapping to look better was to look up the regions
and set them to specific cities in each country. I guess that the mapping
program cannot understand Regions (Bordeaux, Burgundy, Cotes D’Or, Willamette
Valley, Douro Valley, etc.). Therefore, if one wishes to use Power BI to display
data on maps, then the locations listed must be City names, and not Region
names, nor a mix of Cities and Regions as I had in my Wine List spreadsheet.
I then tried and analysis of Quantity (in ml) by Year and Appellation. This
provided the expected results: we own a LOT of Chateauneuf du Pape 2005 (Pink in below graph).
I also evaluated it using the pie chart to see how that turned out.
Again, it’s obvious what we like to stock in our wine cellar (light blue is all 2005, and the majority is Chateauneuf du Pape), but it also
shows that Power BI makes it extremely simple to illustrate that fact through
the visualizations it provides - I spend only a few minutes to generate these charts from my data.
All in all, I’d say that Power BI is very useful for quick and efficient analysis of any dataset
where a graph or chart would best illustrate the relationship of data elements
within it.
Tuesday, April 26, 2016
Tuesday, April 19, 2016
Mutually INCLUSIVE Sets In SQL
SQL is your best friend when you need to find mutually exclusive sets of
data – the more conditions you add to your query, the fewer records are
returned. But how do you handle mutually INCLUSIVE sets where you want the
content of each group to expand as more criteria are met?
Most people when faced with a request such as that will respond similar to our Mikos (on the left) – with the “You want what? When?” look.
I had the person requesting this code twice demonstrate how he calculated the mutually inclusive set within Excel before taking on the project so I could be certain I understood the logic he was employing to calculate them.
The first demonstration included some “black magic”. During the second demonstration, I asked him to slow down and explain why he was doing what he did.
Normally, there’s a one-to-one or one-to-many relationship between Milestones and Contracts for Projects (at least in the software that I have worked with), so this was to handle the exceptional relationship so they could calculate Project costs and revenue appropriately.
Project – A project – like “build a house”
Project Milestone – Completion of certain tasks for a project – like Painting or Install Carpeting
Contract – the agreement between Customer & the subcontractor doing the work – the painter who also installs carpeting
Reporting Groups – Mutually inclusive sets related to the Milestone & Contract combinations – one reporting group contains all milestones for each Contract as well as all Contracts for each milestone
The following may not be ideal SQL, but I found that it returned results exactly matching several examples provided by the Customer. Yes, the seemingly redundant updates are necessary because without them the final output does not match his final results. I have not found any alternative method to replicate the same results more efficiently.
Please note that I’ve simplified the content:
DECLARE @ContractMilestones TABLE (
Co TINYINT
,Project VARCHAR(10)
,[Contract] VARCHAR(16)
,Milestone VARCHAR(20)
)
INSERT @ContractMilestones (Co, Project, [Contract], Milestone)
VALUES
( 102,'Test', '5','493')
,( 102,'Test', '1.01','111')
,( 102,'Test', '1.01', '211')
,( 102,'Test', '1.01', '311')
,( 102,'Test', '2.02', '111')
,( 102,'Test', '3.01', '211')
DECLARE @ReportingGroups TABLE (
Co TINYINT
,Project VARCHAR(10)
,Milestone VARCHAR(20)
,[Contract] VARCHAR(16)
,MilestoneRptGroup INT
,ContractGrouping INT
,FinalGroup INT
);
WITH preranked (
Co
,Project
,Milestone
,[Contract]
,MilestoneRptGroup
)
AS (
SELECT Co
,Project
,Milestone
,[Contract]
,DENSE_RANK() OVER (
PARTITION BY Co
,Project ORDER BY Co
,Project
,Milestone
) AS MilestoneRptGroup
FROM @ContractMilestones
)
,contractgroupd (
Co
,Project
,Milestone
,[Contract]
,MilestoneRptGroup
,ContractGrouping
)
AS (
SELECT Co
,Project
,Milestone
,[Contract]
,MilestoneRptGroup
,min(MilestoneRptGroup) OVER (
PARTITION BY [Contract] ORDER BY [Contract] DESC
) AS ContractGrouping
FROM preranked
)
INSERT @ReportingGroups (
Co
,Project
,Milestone
,[Contract]
,MilestoneRptGroup
,ContractGrouping
,FinalGroup
)
SELECT Co
,Project
,Milestone
,[Contract]
,MilestoneRptGroup
,ContractGrouping
,DENSE_RANK() OVER (
ORDER BY Co
,Project
,ContractGrouping
) AS FinalGroup
FROM contractgroupd
ORDER BY FinalGroup
DECLARE @ItemGrouping INT
DECLARE @MilestoneGroup TABLE (MilestoneRptGroup INT)
DECLARE @ContractGroup TABLE (ContractRptGroup INT)
SELECT @ItemGrouping = min(FinalGroup)
FROM @ReportingGroups
WHILE @ItemGrouping IS NOT NULL
BEGIN
INSERT @MilestoneGroup (MilestoneRptGroup)
SELECT DISTINCT MilestoneRptGroup
FROM @ReportingGroups
WHERE FinalGroup = @ItemGrouping
UPDATE @ReportingGroups
SET FinalGroup = @ItemGrouping
,MilestoneRptGroup = @ItemGrouping
WHERE MilestoneRptGroup IN (
SELECT MilestoneRptGroup
FROM @MilestoneGroup
)
DELETE
FROM @MilestoneGroup
INSERT @ContractGroup (ContractRptGroup)
SELECT DISTINCT ContractGrouping
FROM @ReportingGroups
WHERE FinalGroup = @ItemGrouping
UPDATE @ReportingGroups
SET FinalGroup = @ItemGrouping
,ContractGrouping = @ItemGrouping
WHERE ContractGrouping IN (
SELECT ContractRptGroup
FROM @ContractGroup
)
DELETE
FROM @ContractGroup
INSERT @MilestoneGroup (MilestoneRptGroup)
SELECT DISTINCT MilestoneRptGroup
FROM @ReportingGroups
WHERE FinalGroup = @ItemGrouping
UPDATE @ReportingGroups
SET FinalGroup = @ItemGrouping
,MilestoneRptGroup = @ItemGrouping
WHERE MilestoneRptGroup IN (
SELECT MilestoneRptGroup
FROM @MilestoneGroup
)
DELETE
FROM @MilestoneGroup
INSERT @ContractGroup (ContractRptGroup)
SELECT DISTINCT ContractGrouping
FROM @ReportingGroups
WHERE FinalGroup = @ItemGrouping
UPDATE @ReportingGroups
SET FinalGroup = @ItemGrouping
,ContractGrouping = @ItemGrouping
WHERE ContractGrouping IN (
SELECT ContractRptGroup
FROM @ContractGroup
)
DELETE
FROM @ContractGroup
SELECT @ItemGrouping = min(FinalGroup)
FROM @ReportingGroups
WHERE FinalGroup > @ItemGrouping
END;
-- This is the final table that holds the data we've retrieved
-- insert commented out so we can do a simple select.
-- INSERT FinalReportingGroups (
-- Co
-- ,Project
-- ,[Contract]
-- ,ReportingGroup
-- ,Milestone
-- )
SELECT Co
,Project
,[Contract]
,DENSE_RANK() OVER (
ORDER BY Co
,Project
,ContractGrouping
) AS ReportingGroup
,Milestone
FROM @ReportingGroups
ORDER BY Co
,Project
,[Contract]
,Milestone
Most people when faced with a request such as that will respond similar to our Mikos (on the left) – with the “You want what? When?” look.
I had the person requesting this code twice demonstrate how he calculated the mutually inclusive set within Excel before taking on the project so I could be certain I understood the logic he was employing to calculate them.
The first demonstration included some “black magic”. During the second demonstration, I asked him to slow down and explain why he was doing what he did.
Terminology
Essentially, they had a many to many relationship between Project Milestones and Contracts for each Project and they needed to be able to accurately report the revenue for these items.Normally, there’s a one-to-one or one-to-many relationship between Milestones and Contracts for Projects (at least in the software that I have worked with), so this was to handle the exceptional relationship so they could calculate Project costs and revenue appropriately.
Project – A project – like “build a house”
Project Milestone – Completion of certain tasks for a project – like Painting or Install Carpeting
Contract – the agreement between Customer & the subcontractor doing the work – the painter who also installs carpeting
Reporting Groups – Mutually inclusive sets related to the Milestone & Contract combinations – one reporting group contains all milestones for each Contract as well as all Contracts for each milestone
Algorithm
Determining the correct Reporting Group value is a multi-step process calculated over all Contracts and Milestones associated to the Project.- Sort by Milestone
- Assign First-pass group value using DENSE_RANK() OVER (ORDER BY Milestone)
- Sort By Contract
- For each Contract with multiple Milestones, if the grouping values differ, assign all instances of the grouping value to the lowest value (Example: In Item 16, there are two Milestones. One has Grouping Value 4 & other has Grouping Value 6. Set all Grouping Value 6 instances = 4. Repeat for each Contract where a similar situation exists)
- Sort by Grouping Value
- Update Grouping Values so that any gaps in the sequence are closed (Example: Grouping Value 6 was removed in an earlier step, so all items with Grouping Value 7 become 6, then all the 8s become 7, etc. so that the Grouping Values are consecutive.)
Code
I found that the only way to reproduce the results was to loop through the data sets similar to the demonstrated method which includes suspiciously redundant code required to produce the “correct-according-to-the-customer” results.The following may not be ideal SQL, but I found that it returned results exactly matching several examples provided by the Customer. Yes, the seemingly redundant updates are necessary because without them the final output does not match his final results. I have not found any alternative method to replicate the same results more efficiently.
Please note that I’ve simplified the content:
- Only include what was directly relevant to the calculation of the mutually inclusive data sets
- The sample dataset that is entered into the @ContractMilestones table is not nearly close to the 100+ line examples given to me
- I have excluded any direct references to proprietary tables
DECLARE @ContractMilestones TABLE (
Co TINYINT
,Project VARCHAR(10)
,[Contract] VARCHAR(16)
,Milestone VARCHAR(20)
)
INSERT @ContractMilestones (Co, Project, [Contract], Milestone)
VALUES
( 102,'Test', '5','493')
,( 102,'Test', '1.01','111')
,( 102,'Test', '1.01', '211')
,( 102,'Test', '1.01', '311')
,( 102,'Test', '2.02', '111')
,( 102,'Test', '3.01', '211')
DECLARE @ReportingGroups TABLE (
Co TINYINT
,Project VARCHAR(10)
,Milestone VARCHAR(20)
,[Contract] VARCHAR(16)
,MilestoneRptGroup INT
,ContractGrouping INT
,FinalGroup INT
);
WITH preranked (
Co
,Project
,Milestone
,[Contract]
,MilestoneRptGroup
)
AS (
SELECT Co
,Project
,Milestone
,[Contract]
,DENSE_RANK() OVER (
PARTITION BY Co
,Project ORDER BY Co
,Project
,Milestone
) AS MilestoneRptGroup
FROM @ContractMilestones
)
,contractgroupd (
Co
,Project
,Milestone
,[Contract]
,MilestoneRptGroup
,ContractGrouping
)
AS (
SELECT Co
,Project
,Milestone
,[Contract]
,MilestoneRptGroup
,min(MilestoneRptGroup) OVER (
PARTITION BY [Contract] ORDER BY [Contract] DESC
) AS ContractGrouping
FROM preranked
)
INSERT @ReportingGroups (
Co
,Project
,Milestone
,[Contract]
,MilestoneRptGroup
,ContractGrouping
,FinalGroup
)
SELECT Co
,Project
,Milestone
,[Contract]
,MilestoneRptGroup
,ContractGrouping
,DENSE_RANK() OVER (
ORDER BY Co
,Project
,ContractGrouping
) AS FinalGroup
FROM contractgroupd
ORDER BY FinalGroup
DECLARE @ItemGrouping INT
DECLARE @MilestoneGroup TABLE (MilestoneRptGroup INT)
DECLARE @ContractGroup TABLE (ContractRptGroup INT)
SELECT @ItemGrouping = min(FinalGroup)
FROM @ReportingGroups
WHILE @ItemGrouping IS NOT NULL
BEGIN
INSERT @MilestoneGroup (MilestoneRptGroup)
SELECT DISTINCT MilestoneRptGroup
FROM @ReportingGroups
WHERE FinalGroup = @ItemGrouping
UPDATE @ReportingGroups
SET FinalGroup = @ItemGrouping
,MilestoneRptGroup = @ItemGrouping
WHERE MilestoneRptGroup IN (
SELECT MilestoneRptGroup
FROM @MilestoneGroup
)
DELETE
FROM @MilestoneGroup
INSERT @ContractGroup (ContractRptGroup)
SELECT DISTINCT ContractGrouping
FROM @ReportingGroups
WHERE FinalGroup = @ItemGrouping
UPDATE @ReportingGroups
SET FinalGroup = @ItemGrouping
,ContractGrouping = @ItemGrouping
WHERE ContractGrouping IN (
SELECT ContractRptGroup
FROM @ContractGroup
)
DELETE
FROM @ContractGroup
INSERT @MilestoneGroup (MilestoneRptGroup)
SELECT DISTINCT MilestoneRptGroup
FROM @ReportingGroups
WHERE FinalGroup = @ItemGrouping
UPDATE @ReportingGroups
SET FinalGroup = @ItemGrouping
,MilestoneRptGroup = @ItemGrouping
WHERE MilestoneRptGroup IN (
SELECT MilestoneRptGroup
FROM @MilestoneGroup
)
DELETE
FROM @MilestoneGroup
INSERT @ContractGroup (ContractRptGroup)
SELECT DISTINCT ContractGrouping
FROM @ReportingGroups
WHERE FinalGroup = @ItemGrouping
UPDATE @ReportingGroups
SET FinalGroup = @ItemGrouping
,ContractGrouping = @ItemGrouping
WHERE ContractGrouping IN (
SELECT ContractRptGroup
FROM @ContractGroup
)
DELETE
FROM @ContractGroup
SELECT @ItemGrouping = min(FinalGroup)
FROM @ReportingGroups
WHERE FinalGroup > @ItemGrouping
END;
-- This is the final table that holds the data we've retrieved
-- insert commented out so we can do a simple select.
-- INSERT FinalReportingGroups (
-- Co
-- ,Project
-- ,[Contract]
-- ,ReportingGroup
-- ,Milestone
-- )
SELECT Co
,Project
,[Contract]
,DENSE_RANK() OVER (
ORDER BY Co
,Project
,ContractGrouping
) AS ReportingGroup
,Milestone
FROM @ReportingGroups
ORDER BY Co
,Project
,[Contract]
,Milestone
Subscribe to:
Posts (Atom)
How to Turn Off Smart Charging
After a recent update, I discovered that my battery was down to 80% while plugged in, which I hadn't expected to see. I also noticed tha...
-
I've been playing with PowerBI recently, and have discovered their Gallery of Custom Visuals . I decided to try some to see what they ...
-
This post is part of T-SQL Tuesday , a monthly blog party on the second Tuesday of each month which was started back in 2009. This month...
-
TSQL2sday Linkback Happy TSQL Tuesday Everyone! This week's TSQL2day is hosted by the lovely Jen McCown ( blog | twitter ), ...