Talk About Network

Google


Register and Login
Nick
Password
Register create new account Sign up is FREE and you can post replies, new topics, bookmark posts and more!
Recover lost password


Data Bases > Pgsql Performance > Re: Question ab...
Latest [ Topics | Posts ] Archive Post A New Topic Post a Reply
<< Topic < Post Post 2 of 5 Topic 4017 of 4296
Post > Topic >>

Re: Question about disk IO an index use and seeking advice

by matthew@[EMAIL PROTECTED] (Matthew Wakeling) Apr 24, 2008 at 04:27 PM

On Thu, 24 Apr 2008, Nikolas Everett wrote:
> The setup is kind of a beast.

No kidding.

> When I run dstat I see only around 2M/sec and it is not consistent at
all.

Well, it is having to seek over the disc a little. Firstly, your table may

not be wonderfully ordered for index scans, but goodness knows how long a 
CLUSTER operation might take with that much data. Secondly, when doing an 
index scan, Postgres unfortunately can only use the performance equivalent

of a single disc, because it accesses the pages one by one in a 
single-threaded manner. A large RAID array will give you a performance 
boost if you are doing lots of index scans in parallel, but not if you are

only doing one. Greg Stark has a patch in the pipeline to improve this 
though.

> When I do a similar set of queries on the hardware raid I see similar 
> performance except the numbers are both more than doubled.

Hardware RAID is often better than software RAID. 'Nuff said.

> Here is the explain output for the queries:

EXPLAIN ANALYSE is even better.

> Sort  (cost=16948.80..16948.81 rows=1 width=10)"
>   Sort Key: count(*)"
>   ->  HashAggregate  (cost=16948.78..16948.79 rows=1 width=10)"
>         ->  Index Scan using date_idx on bigtable (cost=0.00..16652.77
rows=59201 width=10)"
>               Index Cond: (date > '2008-04-21 00:00:00'::timestamp
without time zone)"

That doesn't look like it should take too long. How long does it take? 
(EXPLAIN ANALYSE, in other words). It's a good plan, anyway.

> So now the asking for advice part.  I have two questions:
> What is the fastest way to copy data from the smaller table to the
larger
> table?

INSERT INTO bigtable (field1, field2) SELECT whatever FROM staging_table
        ORDER BY staging_table.date

That will do it all in Postgres. The ORDER BY clause may slow down the 
insert, but it will certainly speed up your subsequent index scans.

If the bigtable isn't getting any DELETE or UPDATE traffic, you don't need

to vacuum it. However, make sure you do vacuum the staging table, 
preferably directly after moving all that data to the bigtable and 
deleting it from the staging table.

> Can someone point me to a good page on partitioning? My
> gut tells me it should be better, but I'd like to learn more about why.

You could possibly not bother with a staging table, and replace the mass 
copy with making a new partition. Not sure of the details myself though.

Matthew

-- 
Me... a skeptic?  I trust you have proof?

-- 
Sent via pgsql-performance mailing list (pgsql-performance@[EMAIL PROTECTED]
)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance
 




 5 Posts in Topic:
Question about disk IO an index use and seeking advice
nik9000@[EMAIL PROTECTED]  2008-04-24 09:59:20 
Re: Question about disk IO an index use and seeking advice
matthew@[EMAIL PROTECTED]  2008-04-24 16:27:39 
Re: Question about disk IO an index use and seeking advice
lists@[EMAIL PROTECTED]   2008-04-24 18:56:34 
Re: Question about disk IO an index use and seeking advice
nik9000@[EMAIL PROTECTED]  2008-04-24 14:19:08 
Re: Question about disk IO an index use and seeking advice
lists@[EMAIL PROTECTED]   2008-04-24 22:02:40 

Post A Reply:
  Go here to Signup

AddThis Feed Button


About - Advertising - Contact - Frequently Asked Questions - Privacy Policy - Terms of Use - Signup

Contact
tan12V112 Mon Sep 8 3:09:39 CDT 2008.