SQL is your best friend when you need to find mutually exclusive sets of
data – the more conditions you add to your query, the fewer records are
returned. But how do you handle mutually INCLUSIVE sets where you want the
content of each group to expand as more criteria are met?
Most people when faced with a request such as that will respond similar to
our Mikos (on the left) – with the “You want what? When?” look.
I had the person requesting this code twice demonstrate how he calculated the
mutually inclusive set within Excel before taking on the project so I could
be certain I understood the logic he was employing to calculate them.
The first demonstration included some “black magic”. During the second
demonstration, I asked him to slow down and explain why he was doing what he
did.
Terminology
Essentially, they had a many to many relationship between Project Milestones
and Contracts for each Project and they needed to be able to accurately report
the revenue for these items.
Normally, there’s a one-to-one or one-to-many relationship between Milestones
and Contracts for Projects (at least in the software that I have worked with),
so this was to handle the exceptional relationship so they could calculate
Project costs and revenue appropriately.
Project – A project – like “build a house”
Project Milestone – Completion of certain tasks for a
project – like Painting or Install Carpeting
Contract – the agreement between Customer & the
subcontractor doing the work – the painter who also installs carpeting
Reporting Groups – Mutually inclusive sets related to the
Milestone & Contract combinations – one reporting group contains all
milestones for each Contract as well as all Contracts for each milestone
Algorithm
Determining the correct Reporting Group value is a multi-step process
calculated over all Contracts and Milestones associated to the Project.
- Sort by Milestone
- Assign First-pass group value using DENSE_RANK() OVER (ORDER BY
Milestone)
- Sort By Contract
- For each Contract with multiple Milestones, if the grouping values differ,
assign all instances of the grouping value to the lowest value (Example: In Item
16, there are two Milestones. One has Grouping Value 4 & other has Grouping
Value 6. Set all Grouping Value 6 instances = 4. Repeat for each Contract where
a similar situation exists)
- Sort by Grouping Value
- Update Grouping Values so that any gaps in the sequence are closed (Example:
Grouping Value 6 was removed in an earlier step, so all items with Grouping
Value 7 become 6, then all the 8s become 7, etc. so that the Grouping Values are
consecutive.)
Code
I found that the only way to reproduce the results was to loop through the
data sets similar to the demonstrated method which includes suspiciously
redundant code required to produce the “correct-according-to-the-customer”
results.
The following may not be ideal SQL, but I found that it returned results
exactly matching several examples provided by the Customer. Yes, the seemingly
redundant updates are necessary because without them the final output does not
match his final results. I have not found any alternative method to
replicate the same results more efficiently.
Please note that I’ve simplified the content:
- Only include what was directly relevant to the calculation of the mutually
inclusive data sets
- The sample dataset that is entered into the @ContractMilestones table is not
nearly close to the 100+ line examples given to me
- I have excluded any direct references to proprietary tables
Please note that the final insert has been commented out since the
“FinalReportingGroups” table was not defined in the code, but it exists within
the Customer’s database to store the Reporting Groups thus calculated. I’m sure
that Readers of this post understand that they only need to view the final
results of the calculations since updating the final table would depend upon
their own personal database schema.)
DECLARE @ContractMilestones TABLE
(
Co TINYINT
,Project VARCHAR(10)
,[Contract]
VARCHAR(16)
,Milestone VARCHAR(20)
)
INSERT
@ContractMilestones (Co, Project, [Contract], Milestone)
VALUES
(
102,'Test', '5','493')
,( 102,'Test', '1.01','111')
,( 102,'Test',
'1.01', '211')
,( 102,'Test', '1.01', '311')
,( 102,'Test',
'2.02', '111')
,( 102,'Test', '3.01', '211')
DECLARE @ReportingGroups TABLE
(
Co TINYINT
,Project VARCHAR(10)
,Milestone
VARCHAR(20)
,[Contract] VARCHAR(16)
,MilestoneRptGroup
INT
,ContractGrouping INT
,FinalGroup INT
);
WITH preranked (
Co
,Project
,Milestone
,[Contract]
,MilestoneRptGroup
)
AS (
SELECT Co
,Project
,Milestone
,[Contract]
,DENSE_RANK() OVER (
PARTITION BY Co
,Project ORDER BY Co
,Project
,Milestone
) AS MilestoneRptGroup
FROM
@ContractMilestones
)
,contractgroupd (
Co
,Project
,Milestone
,[Contract]
,MilestoneRptGroup
,ContractGrouping
)
AS
(
SELECT Co
,Project
,Milestone
,[Contract]
,MilestoneRptGroup
,min(MilestoneRptGroup) OVER
(
PARTITION BY [Contract] ORDER BY [Contract]
DESC
) AS ContractGrouping
FROM
preranked
)
INSERT @ReportingGroups (
Co
,Project
,Milestone
,[Contract]
,MilestoneRptGroup
,ContractGrouping
,FinalGroup
)
SELECT Co
,Project
,Milestone
,[Contract]
,MilestoneRptGroup
,ContractGrouping
,DENSE_RANK() OVER (
ORDER BY
Co
,Project
,ContractGrouping
) AS FinalGroup
FROM
contractgroupd
ORDER BY FinalGroup
DECLARE @ItemGrouping INT
DECLARE
@MilestoneGroup TABLE (MilestoneRptGroup INT)
DECLARE @ContractGroup
TABLE (ContractRptGroup INT)
SELECT @ItemGrouping =
min(FinalGroup)
FROM @ReportingGroups
WHILE @ItemGrouping IS NOT NULL
BEGIN
INSERT @MilestoneGroup (MilestoneRptGroup)
SELECT
DISTINCT MilestoneRptGroup
FROM @ReportingGroups
WHERE
FinalGroup = @ItemGrouping
UPDATE @ReportingGroups
SET FinalGroup = @ItemGrouping
,MilestoneRptGroup =
@ItemGrouping
WHERE MilestoneRptGroup IN (
SELECT
MilestoneRptGroup
FROM @MilestoneGroup
)
DELETE
FROM
@MilestoneGroup
INSERT @ContractGroup
(ContractRptGroup)
SELECT DISTINCT ContractGrouping
FROM
@ReportingGroups
WHERE FinalGroup = @ItemGrouping
UPDATE @ReportingGroups
SET FinalGroup = @ItemGrouping
,ContractGrouping =
@ItemGrouping
WHERE ContractGrouping IN (
SELECT
ContractRptGroup
FROM @ContractGroup
)
DELETE
FROM
@ContractGroup
INSERT @MilestoneGroup
(MilestoneRptGroup)
SELECT DISTINCT MilestoneRptGroup
FROM
@ReportingGroups
WHERE FinalGroup = @ItemGrouping
UPDATE @ReportingGroups
SET FinalGroup = @ItemGrouping
,MilestoneRptGroup =
@ItemGrouping
WHERE MilestoneRptGroup IN (
SELECT
MilestoneRptGroup
FROM @MilestoneGroup
)
DELETE
FROM
@MilestoneGroup
INSERT @ContractGroup
(ContractRptGroup)
SELECT DISTINCT ContractGrouping
FROM
@ReportingGroups
WHERE FinalGroup = @ItemGrouping
UPDATE @ReportingGroups
SET FinalGroup = @ItemGrouping
,ContractGrouping =
@ItemGrouping
WHERE ContractGrouping IN (
SELECT
ContractRptGroup
FROM @ContractGroup
)
DELETE
FROM
@ContractGroup
SELECT @ItemGrouping =
min(FinalGroup)
FROM @ReportingGroups
WHERE FinalGroup
> @ItemGrouping
END;
-- This is the final table that holds the
data we've retrieved
-- insert commented out so we can do a simple
select.
-- INSERT FinalReportingGroups (
-- Co
--
,Project
-- ,[Contract]
-- ,ReportingGroup
--
,Milestone
-- )
SELECT Co
,Project
,[Contract]
,DENSE_RANK() OVER (
ORDER BY
Co
,Project
,ContractGrouping
) AS ReportingGroup
,Milestone
FROM @ReportingGroups
ORDER BY Co
,Project
,[Contract]
,Milestone