From: Syan Tan
Subject: [Gnumed-devel] optimizations
Date: Thu, 09 Mar 2006 22:25:23 +0800
Experimenting with a large dataset, I found some problems with the PostgreSQL query planner
which require some manipulation of the SQL to compensate for.
For instance, a 1-2 minute access of a fairly large record drops to 4 seconds.
The problem is selecting on a base table where there exist child tables with large datasets,
e.g. clin_root_item and clin_narrative (with 15000+ entries).
Indexes exist on the search condition columns in both the base table and the child table, but
by default the query planner sequentially scans the 15000 entries of the child table
without using the index. This was mentioned as a problem in the online documentation,
in one of the comment posts on the section about partitioning a table using inheritance
and range check conditions on the child tables (each child table has a check constraint that
restricts it to a certain subrange of the index values).
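As a rough sketch (the layout is simplified; the column, constraint, and index names here are illustrative, not GNUmed's actual schema), that inheritance setup looks like:

```sql
-- Simplified, hypothetical version of the layout described above.
create table clin_root_item (
    pk serial primary key,
    fk_encounter integer not null,
    narrative text
);

-- The child table inherits all columns of the base table; the check
-- constraint restricts it to a subrange of key values, as in the
-- partitioning-by-inheritance section of the documentation.
create table clin_narrative (
    check (pk between 10000 and 99999)
) inherits (clin_root_item);

-- Indexes are not inherited, so the search column must be indexed
-- on each table separately.
create index idx_root_encounter on clin_root_item (fk_encounter);
create index idx_narr_encounter on clin_narrative (fk_encounter);
```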
The optimization is to explicitly search each child table and join, and then take the union of
the joins. This reduces a 10000 msec search time to about 1 msec.
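In SQL terms, the rewrite is roughly the following (the search column and value are hypothetical; `only` tells PostgreSQL not to descend into child tables):

```sql
-- Slow form: selecting on the base table pulls in the child tables,
-- and the planner sequentially scans the large child table.
select pk, narrative
from clin_root_item
where fk_encounter = 42;

-- Faster form: query each table explicitly with ONLY and take the
-- union, so every branch can use its own index.
select pk, narrative from only clin_root_item where fk_encounter = 42
union all
select pk, narrative from only clin_narrative where fk_encounter = 42;
```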
You can turn on statement logging and duration logging to see where the problem queries are,
and then run "explain analyze" in psql; this will tell you the execution plan and how
long each part of the plan takes.
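For reference, the knobs involved are along these lines (setting names as in the PostgreSQL documentation; the threshold value and the query are illustrative):

```sql
-- In postgresql.conf (or via SET), something like:
--   log_statement = 'all'               -- log every statement
--   log_min_duration_statement = 1000   -- log statements taking >= 1000 ms
-- Then, in psql, prefix the suspect query with EXPLAIN ANALYZE to get
-- the plan actually executed, with per-node timings:
explain analyze
select pk, narrative
from clin_root_item
where fk_encounter = 42;
```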
Attached are some diffs of gmClinicalRecord, a diff of the dumped gnumed schema, and the
gmClinicalViews.sql I was using (sorry, no good diff).
gmClinical.py.diff
Description: Binary data
gnumed_v2.schema.diff
Description: Binary data
gmClinicalRecord.py
Description: Binary data