In implementing a comprehensive SNOMED CT browser as part of a research clinical information system, I cache the results of several processor and disk-intensive queries to speed up lookup and SNOMED CT navigation. I was finding some odd results and found that my caching was missing for several items. I needed a way of comparing the contents of one table with another in PostgreSQL and finding out which rows were not present.
To do this, PostgreSQL supports set operations. For instance, I have two tables:
rsdb=# \d t_concept
Table "public.t_concept"
Column | Type | Modifiers
----------------------+------------------------+-----------------------------------------------------
concept_id | bigint | not null default nextval('t_concept_seq'::regclass)
concept_status_code | integer | not null
ctv_id | character varying(10) | not null
fully_specified_name | character varying(255) | not null
is_primitive | integer | not null
snomed_id | character varying(10) | not null
rsdb=# \d t_cached_parent_concepts
Table "public.t_cached_parent_concepts"
Column | Type | Modifiers
-------------------+--------+-----------
child_concept_id | bigint | not null
parent_concept_id | bigint | not null
I want to identify rows in t_concept
without a corresponding row in t_cached_parent_concepts
using SQL:
Try this!
select concept_id from t_concept except select child_concept_id from t_cached_parent_concepts;
Now this doesn’t explain why my sophisticated caching system is not so sophisticated…..
rsdb=# select count(*) from t_concept;
count
--------
388289
(1 row)
rsdb=# select count(*) from (select concept_id from t_concept except select child_concept_id from t_cached_parent_concepts) as tmp;
count
--------
142960
(1 row)
Hmm.. that’s a lot of concepts without a cache….
Update 8th March 2010
rsdb=# select count(*) from t_concept;
count
--------
2
(1 row)
Now that’s better. But why isn’t there only a single root node with no parent concepts? Further investigation identifies a UK-only SNOMED term that hasn’t been classified correctly. A bug report will be filed with Connecting for Health!