Tuesday, October 20, 2015

Counting things that do not exist with a correlated subquery

Today I had a problem where I needed to find the difference between total and in-active row counts based on a FK value from a parent table. I had a developer explaining that she wanted to do a count for values that don't exist. She gave me a query like this:

SELECT leader_id, count(*) as count
FROM lists
WHERE active = 1
GROUP BY leader_id
HAVING count = 0

I told her that you cannot count something that doesn't exist. It took me a little bit to understand what was needed but I was able to do it with a correlated subquery. Take this example table and example data:


CREATE TABLE `lists` (
  `id` int(10) unsigned NOT NULL AUTO_INCREMENT,
  `leader_id` int(10) unsigned NOT NULL,
  `active` tinyint(4) NOT NULL DEFAULT '1',
  `name` varchar(100) DEFAULT NULL,
  PRIMARY KEY (`id`),
  KEY `idx_fk_leader_id` (`leader_id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8;

INSERT INTO `lists` (`id`, `leader_id`, `active`, `name`) VALUES (NULL, '1', '1', 'one');
INSERT INTO `lists` (`id`, `leader_id`, `active`, `name`) VALUES (NULL, '2', '1', 'two');
INSERT INTO `lists` (`id`, `leader_id`, `active`, `name`) VALUES (NULL, '1', '0', 'One old');
INSERT INTO `lists` (`id`, `leader_id`, `active`, `name`) VALUES (NULL, '1', '1', 'One also old');
INSERT INTO `lists` (`id`, `leader_id`, `active`, `name`) VALUES (NULL, '3', '1', 'three');
INSERT INTO `lists` (`id`, `leader_id`, `active`, `name`) VALUES (NULL, '3', '0', 'three old');
INSERT INTO `lists` (`id`, `leader_id`, `active`, `name`) VALUES (NULL, '4', '0', 'four old');

I wrote this query to get the difference:


SELECT l1.leader_id,
count(*) as total_count,
l2.inactive_count,
count(*) - l2.inactive_count as difference
FROM lists l1
INNER JOIN
(SELECT leader_id, count(*) as inactive_count FROM lists l3 WHERE active = 0
GROUP BY leader_id) l2 ON l1.leader_id = l2.leader_id
GROUP BY l1.leader_id

Results look like this:



What we really wanted was just the value where total count was equal to the inactive count (difference = 0) so I added a HAVING clause also:

SELECT l1.leader_id,
count(*) as total_count,
l2.inactive_count,
count(*) - l2.inactive_count as difference
FROM lists l1
INNER JOIN
(SELECT leader_id, count(*) as inactive_count FROM lists l3 WHERE active = 0
GROUP BY leader_id) l2 ON l1.leader_id = l2.leader_id
GROUP BY l1.leader_id
HAVING difference = 0

Results look like this:





No comments:

Post a Comment