Re: Verified fix for Bug 4137

by heikki@[EMAIL PROTECTED] ("Heikki Linnakangas") May 6, 2008 at 12:02 PM

Simon Riggs wrote:
> The problem was that at the very start of archive recovery the %r
> parameter in restore_command could be set to a filename later than the
> currently requested filename (%f). This could lead to early truncation
> of the archived WAL files and would cause warm standby replication to
> fail soon afterwards, in certain specific circumstances.
> 
> Fix applied to both core server in generating correct %r filenames and
> also to pg_standby to prevent acceptance of out-of-sequence filenames.

So the core problem is that we use ControlFile->checkPointCopy.redo in 
RestoreArchivedFile to determine the safe truncation point, but when 
there's a backup label file, that value still comes from the pg_control 
file, which is wrong.

The patch fixes that by determining the safe truncation point as 
Min(checkPointCopy.redo, xlogfname), where xlogfname is the xlog file 
being restored. That depends on the assumption that everything before 
the first file we (try to) restore is safe to truncate. IOW, we never 
try to restore file B first, and then A, where A < B.
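
To make the Min() rule concrete, here is a minimal sketch, not the actual 
patch. The helper names are hypothetical, and it assumes that both values are 
already 24-character WAL file names that can be ordered with a plain string 
comparison (which ignores any special handling of timelines):

```c
#include <assert.h>
#include <string.h>

#define XLOG_FNAME_LEN 24	/* assumed: three 8-digit hex fields */

/*
 * Hypothetical helper: return whichever of two WAL file names sorts
 * earlier. Equal-length upper-case hex names order correctly under
 * strcmp(); comparing the timeline prefix as well is a simplification.
 */
static const char *
earlier_xlog_fname(const char *a, const char *b)
{
	assert(strlen(a) == XLOG_FNAME_LEN);
	assert(strlen(b) == XLOG_FNAME_LEN);
	return (strcmp(a, b) <= 0) ? a : b;
}

/*
 * Hypothetical helper: pick the name to substitute for %r. It never
 * points past the file currently being requested (%f), which mirrors
 * Min(checkPointCopy.redo, xlogfname) from the description above.
 */
static const char *
safe_truncation_point(const char *redo_fname, const char *requested_fname)
{
	return earlier_xlog_fname(redo_fname, requested_fname);
}
```

With a stale redo name of 000000010000000000000005 and a requested file of 
000000010000000000000003, this returns the requested file, so %r can never 
be later than %f. It is still only safe if no earlier file is requested 
afterwards.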

I'm not totally convinced that's a safe assumption. As an example, 
consider doing an archive recovery, but without a backup label, and the 
latest checkpoint record is broken. We would try to read the latest 
(broken) checkpoint record first, and call RestoreArchivedFile to get 
the xlog file containing it. But because that record is broken, we 
fall back to using the previous checkpoint, and will need the xlog file 
that contains the previous checkpoint record.

That's a pretty contrived example, but the point is that the assumption 
feels fragile. At the very least it should be noted explicitly in the 
comments. A less fragile approach would be to use a dummy value, like 
"000000000000000000000000", as the truncation point until we've 
successfully read the checkpoint/restartpoint record and started the
replay.
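
The dummy-value idea could look roughly like the sketch below. The struct, 
its fields and the function name are hypothetical and do not correspond to 
real PostgreSQL code. The sketch assumes that the consumer of %r only removes 
files whose names sort before it, so an all-zeros name removes nothing:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define XLOG_FNAME_LEN 24
/* Sorts before (or equal to) every WAL file name, so nothing is older. */
#define NO_TRUNCATION_FNAME "000000000000000000000000"

/* Hypothetical recovery state, for illustration only. */
typedef struct RecoveryState
{
	bool		replay_started;	/* checkpoint/restartpoint read OK? */
	char		redo_fname[XLOG_FNAME_LEN + 1];	/* file holding redo ptr */
} RecoveryState;

/*
 * Hypothetical helper: choose the value for %r. Until replay has
 * started we do not trust the redo location, so report the dummy
 * name; afterwards report the file that holds the redo pointer.
 */
static const char *
restart_fname_for_restore(const RecoveryState *state)
{
	assert(state != NULL);
	if (!state->replay_started)
		return NO_TRUNCATION_FNAME;
	return state->redo_fname;
}
```

The trade-off is that no archived files get cleaned up during the first 
few restore attempts, which seems harmless compared to truncating too early.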

-- 
   Heikki Linnakangas
   EnterpriseDB   http://www.enterprisedb.com

-- 
Sent via pgsql-patches mailing list (pgsql-patches@[EMAIL PROTECTED])
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-patches
 




