Talk About Network

Google


Register and Login
Nick
Password
Register create new account Sign up is FREE and you can post replies, new topics, bookmark posts and more!
Recover lost password


Data Bases > Pgsql General > Need to update ...
Latest [ Topics | Posts ] Archive Post A New Topic Post a Reply
<< Topic < Post Post 1 of 13 Topic 15361 of 17602
Post > Topic >>

Need to update all my 60 million rows at once without transactional

by christian_behrens@[EMAIL PROTECTED] Apr 21, 2008 at 12:19 AM

Hi!

How can I make a Update of a column in a very large table for all rows
without using the double amount of disc space and without any need for
atomic operation?

I have a very large table with about 60 million rows. I sometimes need to
do a simple update to ALL rows that resets a status-flag to zero.

I don't need to have transactional integrity (but of course if the system
crashes, there should be no data corruption. A separate flag in the file
system can well save the fact that that bulk update was in progress) for
this, I don't care or want a abort or "all or nothing" like SQL mandates.
The idea is basically that either this update succeeds or it succeeds or -
there is no "not". It must succeed. It must be tried until it works. If the
update is halfway finished, that's okay.

If I just do an 
UPDATE table SET flag=0;
then Pg will make a copy of every row which must be cleaned up by vaccuum.
I understand - and don't like during this specific problem - that PG is a
MVCC database and will keep my old versions of the rows before the update
alive. This is very bad.

If I do a batched loop like this:
UPDATE table SET flag=0 where id>=0 and id <200;
UPDATE table SET flag=0 where id>=200 and id <400;
UPDATE table SET flag=0 where id>=400 and id <600;
....

then PG will seek all over my harddrive I think.

It would be much better if it could just start in the beginning of the
table and work it's way towards the end. But which sort-criteria would
satisfy this? I don't think that there is any SQL that does something like
that.


Another ideas (and I think it's a quite good idea) would be to
drop the column and recreate it with a new default value.

But the problem is that this is not actually MY database, but an appliance
(which has a harddrive that does not have the double amount of space
available btw) and it has to work reliably whenever something special
happens.

And I don't think I should create new columns (the old one would be hidden
and their internal column ids lost I think) all the time, that might have a
limit.

Can I do it maybe every day??


Is there any other way to go?

I would really like to kind of "skip transactions". Of course basic data
integrity in on disc structures, but not atomicy for this bulk update!

I read that PG has an internal command language that is used to build up a
database when all the tables for e.g. table-names are not present yet.

Could I use that to hack my way around transactions?

Basically I can do everything to this PG installation, this is an extremly
controlled, locked down environment. I don't need to care for anyone else,
it's always the same schema, the same version, same OS, etc. and I could
happily apply any hack that solves the problem.

Many thanks for any pointers or ideas how to solve this problem!
 Christian
-- 
Ist Ihr Browser Vista-kompatibel? Jetzt die neuesten 
Browser-Versionen downloaden: http://www.gmx.net/de/go/browser

-- 
Sent via pgsql-general mailing list (pgsql-general@[EMAIL PROTECTED]
)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general
 




 13 Posts in Topic:
Need to update all my 60 million rows at once without transactio
christian_behrens@[EMAIL   2008-04-21 00:19:34 
Re: Need to update all my 60 million rows at once without transa
andreas.kretschmer@[EMAIL  2008-04-23 10:22:55 
Re: Need to update all my 60 million rows at once without transa
pavan.deolasee@[EMAIL PRO  2008-04-23 15:18:41 
Re: Need to update all my 60 million rows at once without transa
tometzky@[EMAIL PROTECTED  2008-04-23 13:49:30 
Re: Need to update all my 60 million rows at once
alvherre@[EMAIL PROTECTED  2008-04-23 09:04:29 
Re: Need to update all my 60 million rows at once without transa
mmoncure@[EMAIL PROTECTED  2008-04-23 11:27:14 
Re: Need to update all my 60 million rows at once without transa
pavan.deolasee@[EMAIL PRO  2008-04-23 20:48:28 
Re: Need to update all my 60 million rows at once without
scrawford@[EMAIL PROTECTE  2008-04-23 09:32:02 
Re: Need to update all my 60 million rows at once
simon@[EMAIL PROTECTED]   2008-04-23 19:15:52 
Re: Need to update all my 60 million rows at once without transa
tom@[EMAIL PROTECTED] (T  2008-04-23 17:35:14 
Re: Need to update all my 60 million rows at once without transa
dalroi@[EMAIL PROTECTED]   2008-04-24 00:36:49 
Re: Need to update all my 60 million rows at once without transa
Chris Browne <cbbrowne  2008-04-23 18:19:53 
Re: Need to update all my 60 million rows at once without transa
DCorbit@[EMAIL PROTECTED]  2008-04-23 17:02:30 

Post A Reply:
  Go here to Signup

AddThis Feed Button


About - Advertising - Contact - Frequently Asked Questions - Privacy Policy - Terms of Use - Signup

Contact
tan12V112 Mon Dec 1 19:31:26 CST 2008.