Channel: medicalnerds.com

Finding rows in one table not present in another table with PostgreSQL


In implementing a comprehensive SNOMED CT browser as part of a research clinical information system, I cache the results of several processor- and disk-intensive queries to speed up lookup and SNOMED CT navigation. I was getting some odd results and found that cache entries were missing for several items. I needed a way to compare the contents of one table with another in PostgreSQL and find out which rows were not present.

PostgreSQL supports set operations that do this. For instance, I have two tables:


rsdb=# \d t_concept
                                      Table "public.t_concept"
        Column        |          Type          |                      Modifiers
----------------------+------------------------+-----------------------------------------------------
 concept_id           | bigint                 | not null default nextval('t_concept_seq'::regclass)
 concept_status_code  | integer                | not null
 ctv_id               | character varying(10)  | not null
 fully_specified_name | character varying(255) | not null
 is_primitive         | integer                | not null
 snomed_id            | character varying(10)  | not null


rsdb=# \d t_cached_parent_concepts
Table "public.t_cached_parent_concepts"
      Column       |  Type  | Modifiers
-------------------+--------+-----------
 child_concept_id  | bigint | not null
 parent_concept_id | bigint | not null

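I want to identify the rows in t_concept that have no corresponding row in t_cached_parent_concepts. The EXCEPT set operation returns exactly that: rows from the first query that do not appear in the second. Try this: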


select concept_id from t_concept except select child_concept_id from t_cached_parent_concepts;
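The same question can also be phrased as an anti-join, which is handy when other columns from t_concept are wanted too, since EXCEPT compares whole result rows. A minimal sketch, assuming concept_id is unique within t_concept (the table definition above does not show a primary key, so that is an assumption); if it were not unique, EXCEPT would collapse duplicates while this form would not:

```sql
-- Concepts with no row in the cache table, written as an anti-join.
-- Assumes concept_id is unique in t_concept (not shown in the \d output above).
select c.concept_id
from t_concept c
where not exists (
    select 1
    from t_cached_parent_concepts p
    where p.child_concept_id = c.concept_id
);
```

Both columns are declared not null here, so NOT IN would behave the same way on these tables. NOT EXISTS is the safer habit in general, because NOT IN returns no rows at all as soon as the subquery yields a single NULL.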

Knowing how to find the gaps doesn’t explain why my sophisticated caching system is not so sophisticated…


rsdb=# select count(*) from t_concept;
 count
--------
 388289
(1 row)

rsdb=# select count(*) from (select concept_id from t_concept except select child_concept_id from t_cached_parent_concepts) as tmp;
 count
--------
 142960
(1 row)

Hmm… that’s a lot of concepts without a cache entry: 142,960 out of 388,289.
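To see which concepts are missing rather than just how many, the EXCEPT result can be joined back to t_concept. A rough sketch using only the columns shown in the table definitions above; the limit of 20 is an arbitrary choice for eyeballing the output:

```sql
-- List a sample of concepts that have no cached parent rows,
-- together with their names, to look for a pattern in what the cache missed.
select c.concept_id, c.fully_specified_name, c.concept_status_code
from t_concept c
join (
    select concept_id from t_concept
    except
    select child_concept_id from t_cached_parent_concepts
) as missing on missing.concept_id = c.concept_id
order by c.concept_id
limit 20;
```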

 

Update 8th March 2010


rsdb=# select count(*) from (select concept_id from t_concept except select child_concept_id from t_cached_parent_concepts) as tmp;
 count
--------
 2
(1 row)

Now that’s better. But why isn’t there just a single root node with no parent concepts? Further investigation identified a UK-only SNOMED CT term that hasn’t been classified correctly. A bug report will be filed with Connecting for Health!

