tracking down a crash bug

by orion.henry@[EMAIL PROTECTED] ("Orion Henry") Apr 13, 2008 at 02:42 PM

Hello.  I need help tracking down a crash bug.  I'm running 8.2.7.
My database has gone into recovery mode three times so far today under
user load.  /etc/init.d/postgresql-8.2 stop would stop the backend but
leave a few processes behind, like this:

postgres 22318  0.0  0.0  45724  1272 ?        Ss   Apr11   0:13 postgres: app1101 app1101 10.255.7.159(44567) idle
postgres 24365  0.0  0.0  45724  1224 ?        Ss   Apr11   0:02 postgres: app2280 app2280 10.255.7.159(51010) idle
postgres  5649  0.0  0.0  45368  1180 ?        Ss   Apr11   0:00 postgres: app9452 app9452 10.255.7.159(43141) idle

I would then have to kill -9 these processes.  Looking at the postgres log,
I find only this:

2008-04-13 12:20:10 PDT STATEMENT:  SELECT version FROM schema_info
2008-04-13 12:21:14 PDT ERROR:  relation "schema_info" does not exist
2008-04-13 12:21:14 PDT STATEMENT:  SELECT version FROM schema_info
2008-04-13 12:26:48 PDT LOG:  background writer process (PID 965) was terminated by signal 9
2008-04-13 12:26:48 PDT LOG:  terminating any other active server processes
2008-04-13 12:26:48 PDT WARNING:  terminating connection because of crash of another server process
2008-04-13 12:26:48 PDT DETAIL:  The postmaster has commanded this server process to roll back the current transaction and exit, because another server process exited abnormally and possibly corrupted shared memory.
2008-04-13 12:26:48 PDT HINT:  In a moment you should be able to reconnect to the database and repeat your command.
[ repeat several hundred times ]
2008-04-13 12:28:11 PDT FATAL:  the database system is in recovery mode
[ repeat several hundred times ]
2008-04-13 12:33:00 PDT LOG:  incomplete startup packet
2008-04-13 12:33:00 PDT LOG:  received fast shutdown request
2008-04-13 12:33:12 PDT FATAL:  the database system is shutting down
[ repeat a dozen times ]
2008-04-13 12:34:00 PDT LOG:  received immediate shutdown request
2008-04-13 12:34:02 PDT LOG:  could not load root certificate file "root.crt": No such file or directory
2008-04-13 12:34:02 PDT DETAIL:  Will not verify client certificates.
2008-04-13 12:34:20 PDT LOG:  could not create IPv6 socket: Address family not supported by protocol
2008-04-13 12:34:20 PDT LOG:  could not resolve "localhost": Name or service not known
2008-04-13 12:34:20 PDT LOG:  disabling statistics collector for lack of working socket
2008-04-13 12:34:20 PDT WARNING:  autovacuum not started because of misconfiguration
2008-04-13 12:34:20 PDT HINT:  Enable options "stats_start_collector" and "stats_row_level".
2008-04-13 12:34:20 PDT LOG:  database system was interrupted at 2008-04-13 12:22:44 PDT
2008-04-13 12:34:20 PDT LOG:  checkpoint record is at 0/594FDF58
2008-04-13 12:34:20 PDT LOG:  redo record is at 0/5946B830; undo record is at 0/0; shutdown FALSE
2008-04-13 12:34:20 PDT LOG:  next transaction ID: 0/2979312; next OID: 106497
2008-04-13 12:34:20 PDT LOG:  next MultiXactId: 1; next MultiXactOffset: 0
2008-04-13 12:34:20 PDT LOG:  database system was not properly shut down; automatic recovery in progress
2008-04-13 12:34:20 PDT LOG:  redo starts at 0/5946B830
2008-04-13 12:34:21 PDT LOG:  incomplete startup packet
2008-04-13 12:34:21 PDT LOG:  record with zero length at 0/5957A3EC
2008-04-13 12:34:21 PDT LOG:  redo done at 0/5957A3C4
2008-04-13 12:34:21 PDT LOG:  database system is ready
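
With several hundred repeated lines in a log like this, the one entry that names the killed process is easy to miss. The following is a minimal sketch for pulling such entries out, not something taken from the original report: the function name find_signal_kills and the regular expression are assumptions based only on the log format shown above (timestamp, zone, "LOG:", then "... (PID n) was terminated by signal n"). It also rejoins entries that a mail client wrapped across lines.

```python
import re

# Assumed pattern, derived from the log excerpt in this post, e.g.:
#   2008-04-13 12:26:48 PDT LOG:  background writer process (PID 965) was terminated by signal 9
SIGNAL_RE = re.compile(
    r"^(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2} \S+) LOG:\s+"
    r"(?P<proc>.+?) \(PID (?P<pid>\d+)\) was\s+terminated by signal (?P<sig>\d+)"
)

# A log entry is assumed to start with a YYYY-MM-DD date.
ENTRY_START_RE = re.compile(r"\d{4}-\d{2}-\d{2} ")


def find_signal_kills(log_text):
    """Return (timestamp, process, pid, signal) for each signal-terminated process."""
    entries = []
    for line in log_text.splitlines():
        if ENTRY_START_RE.match(line) or not entries:
            # New entry (or the very first line of the text).
            entries.append(line.strip())
        else:
            # Continuation of a wrapped entry: glue it back on.
            entries[-1] += " " + line.strip()

    hits = []
    for entry in entries:
        m = SIGNAL_RE.match(entry)
        if m:
            hits.append((m["ts"], m["proc"], int(m["pid"]), int(m["sig"])))
    return hits
```

Calling find_signal_kills(open(path).read()) on a log file returns one tuple per terminated process, which makes it easier to line the crash times up against other system logs.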

Any advice on how I can get this bug identified and squashed?  I suspect
it's in the create/drop [database,role,schema] code.  I've used postgres for
7 years without issues up to this point.  The only thing different now is my
usage pattern.  Since I'm offering a database as a service to my users,
I'm adding and dropping databases, roles, and schemas constantly.

Thanks
