Hi,
I'm running into an performance problem where a Postgres db is running
at 99% CPU (4 cores) with about 500 concurrent connection doing various
queries from a web application. This problem started about a week ago,
and has been steadily going downhill. I have been tweaking the config a
bit, mainly shared_memory but have seen no noticeable improvements.
at any given time there is about 5-6 postgres in startup
(ps auxwww | grep postgres | grep startup | wc -l)
about 2300 connections in idle
(ps auxwww | grep postgres | idle)
and loads of "FATAL: sorry, too many clients already" being logged.
The server that connects to the db is an apache server using persistent
connections. MaxClients is 2048 thus the high number of connections
needed. Application was written in PHP using the Pear DB class.
Here are some typical queries taking place
(table media has about 40,000 records and category about 40):
LOG: duration: 66141.530 ms statement:
SELECT COUNT(*) AS CNT
FROM media m JOIN category ca USING(category_id)
WHERE CATEGORY_ROOT(m.category_id) = '-1'
AND m.deleted_on IS NULL
LOG: duration: 57828.983 ms statement:
SELECT COUNT(*) AS CNT
FROM media m JOIN category ca USING(category_id)
WHERE CATEGORY_ROOT(m.category_id) = '-1'
AND m.deleted_on IS NULL AND m.POSTED_ON + interval '7 day'
System
======
cpu Xeon(R) CPU 5160 @[EMAIL PROTECTED]
3.00GHz stepping 06 x 4
L1, L2 = 32K, 4096K
mem 8GB
dbms postgresql-server 8.2.4
disks scsi0 : LSI Logic SAS based MegaRAID driver
SCSI device sda: 142082048 512-byte hdwr sectors (72746 MB)
SCSI device sda: 142082048 512-byte hdwr sectors (72746 MB)
Stats
======
top - 00:28:40 up 12:43, 1 user, load average: 46.88, 36.55, 37.65
Tasks: 2184 total, 63 running, 2119 sleeping, 1 stopped, 1 zombie
Cpu0: 99.3% us, 0.5% sy, 0.0% ni, 0.0% id, 0.0% wa, 0.0% hi, 0.2% si
Cpu1: 98.3% us, 1.4% sy, 0.0% ni, 0.2% id, 0.0% wa, 0.0% hi, 0.0% si
Cpu2: 99.5% us, 0.5% sy, 0.0% ni, 0.0% id, 0.0% wa, 0.0% hi, 0.0% si
Cpu3: 99.5% us, 0.5% sy, 0.0% ni, 0.0% id, 0.0% wa, 0.0% hi, 0.0% si
Mem: 8166004k total, 6400368k used, 1765636k free, 112080k buffers
Swap: 1020088k total, 0k used, 1020088k free, 3558764k cached
$ vmstat 3
procs ---------memory---------- ---swap-- -----io---- --system--
----cpu----
r b swpd free buff cache si so bi bo in cs us sy id
wa
4 0 0 559428 109440 3558684 0 0 11 27 31 117 96 2 2
0
5 0 0 558996 109452 3558672 0 0 0 41 1171 835 93 1 7
0
4 0 0 558996 109452 3558740 0 0 0 38 1172 497 98 1 1
0
11 0 0 554516 109452 3558740 0 0 0 19 1236 610 97 1 2
0
25 0 0 549860 109452 3558740 0 0 0 32 1228 332 99 1 0
0
12 0 0 555412 109452 3558740 0 0 0 4 1148 284 99 1 0
0
15 0 0 555476 109452 3558740 0 0 0 23 1202 290 99 1 0
0
15 0 0 555476 109452 3558740 0 0 0 1 1125 260 99 1 0
0
16 0 0 555460 109452 3558740 0 0 0 12 1214 278 99 1 0
0
# -----------------------------
# PostgreSQL configuration file
# -----------------------------
#data_directory = 'ConfigDir' # use data in another directory
# (change requires restart)
#hba_file = 'ConfigDir/pg_hba.conf' # host-based authentication file
# (change requires restart)
#ident_file = 'ConfigDir/pg_ident.conf' # ident configuration file
# (change requires restart)
# If external_pid_file is not explicitly set, no extra PID file is
written.
#external_pid_file = '(none)' # write an extra PID file
# (change requires restart)
#---------------------------------------------------------------------------
# CONNECTIONS AND AUTHENTICATION
#---------------------------------------------------------------------------
# - Connection Settings -
listen_addresses = 'localhost' # what IP address(es) to listen
on;
# comma-separated list of
addresses;
# defaults to 'localhost', '*' =
all
# (change requires restart)
****t = 5432 # (change requires restart)
max_connections = 2400 # (change requires restart)
# Note: increasing max_connections costs ~400 bytes of shared memory per
# connection slot, plus lock space (see max_locks_per_transaction). You
# might also need to raise shared_buffers to sup****t more connections.
superuser_reserved_connections = 3 # (change requires restart)
#unix_socket_directory = '' # (change requires restart)
#unix_socket_group = '' # (change requires restart)
#unix_socket_permissions = 0777 # octal
# (change requires restart)
#bonjour_name = '' # defaults to the computer name
# (change requires restart)
# - Security & Authentication -
#authentication_timeout = 1min # 1s-600s
#ssl = off # (change requires restart)
#password_encryption = on
#db_user_namespace = off
# Kerberos
#krb_server_keyfile = '' # (change requires restart)
#krb_srvname = 'postgres' # (change requires restart)
#krb_server_hostname = '' # empty string matches any keytab
entry
# (change requires restart)
#krb_caseins_users = off # (change requires restart)
# - TCP Keepalives -
# see 'man 7 tcp' for details
#tcp_keepalives_idle = 0 # TCP_KEEPIDLE, in seconds;
# 0 selects the system default
#tcp_keepalives_interval = 0 # TCP_KEEPINTVL, in seconds;
# 0 selects the system default
#tcp_keepalives_count = 0 # TCP_KEEPCNT;
# 0 selects the system default
#---------------------------------------------------------------------------
# RESOURCE USAGE (except WAL)
#---------------------------------------------------------------------------
# - Memory -
shared_buffers = 600MB # min 128kB or
max_connections*16kB
# (change requires restart)
temp_buffers = 10MB # min 800kB
#max_prepared_transactions = 5 # can be 0 or more
# (change requires restart)
# Note: increasing max_prepared_transactions costs ~600 bytes of shared
memory
# per transaction slot, plus lock space (see max_locks_per_transaction).
work_mem = 8MB # min 64kB
maintenance_work_mem = 512MB # min 1MB
max_stack_depth = 8MB # min 100kB
# - Free Space Map -
max_fsm_pages = 1536000 # min max_fsm_relations*16, 6 bytes each
# (change requires restart)
max_fsm_relations = 10000 # min 100, ~70 bytes each
# (change requires restart)
# - Kernel Resource Usage -
max_files_per_process = 1024 # min 25
# (change requires restart)
#shared_preload_libraries = '' # (change requires restart)
# - Cost-Based Vacuum Delay -
#vacuum_cost_delay = 0 # 0-1000 milliseconds
#vacuum_cost_page_hit = 1 # 0-10000 credits
#vacuum_cost_page_miss = 10 # 0-10000 credits
#vacuum_cost_page_dirty = 20 # 0-10000 credits
#vacuum_cost_limit = 200 # 0-10000 credits
# - Background writer -
#bgwriter_delay = 200ms # 10-10000ms between rounds
#bgwriter_lru_percent = 1.0 # 0-100% of LRU buffers
scanned/round
#bgwriter_lru_maxpages = 5 # 0-1000 buffers max written/round
#bgwriter_all_percent = 0.333 # 0-100% of all buffers
scanned/round
#bgwriter_all_maxpages = 5 # 0-1000 buffers max written/round
#---------------------------------------------------------------------------
# WRITE AHEAD LOG
#---------------------------------------------------------------------------
# - Settings -
#fsync = on # turns forced synchronization on
or off
#wal_sync_method = fsync # the default is the first option
# sup****ted by the operating
system:
# open_datasync
# fdatasync
# fsync
# fsync_writethrough
# open_sync
#full_page_writes = on # recover from partial page writes
#wal_buffers = 64kB # min 32kB
# (change requires restart)
#commit_delay = 0 # range 0-100000, in microseconds
#commit_siblings = 5 # range 1-1000
# - Checkpoints -
#checkpoint_segments = 3 # in logfile segments, min 1, 16MB
each
#checkpoint_timeout = 5min # range 30s-1h
#checkpoint_warning = 30s # 0 is off
# - Archiving -
#archive_command = '' # command to use to archive a logfile
segment
#archive_timeout = 0 # force a logfile segment switch after
this
# many seconds; 0 is off
#---------------------------------------------------------------------------
# QUERY TUNING
#---------------------------------------------------------------------------
# - Planner Method Configuration -
#enable_bitmapscan = on
#enable_hashagg = on
#enable_hashjoin = on
#enable_indexscan = on
#enable_mergejoin = on
#enable_nestloop = on
#enable_seqscan = on
#enable_sort = on
#enable_tidscan = on
# - Planner Cost Constants -
#seq_page_cost = 1.0 # measured on an arbitrary scale
#random_page_cost = 4.0 # same scale as above
#cpu_tuple_cost = 0.01 # same scale as above
#cpu_index_tuple_cost = 0.005 # same scale as above
#cpu_operator_cost = 0.0025 # same scale as above
effective_cache_size = 1000MB
# - Genetic Query Optimizer -
#geqo = on
#geqo_threshold = 12
#geqo_effort = 5 # range 1-10
#geqo_pool_size = 0 # selects default based on effort
#geqo_generations = 0 # selects default based on effort
#geqo_selection_bias = 2.0 # range 1.5-2.0
# - Other Planner Options -
#default_statistics_target = 10 # range 1-1000
#constraint_exclusion = off
#from_collapse_limit = 8
#join_collapse_limit = 8 # 1 disables collapsing of
explicit
# JOINs
#---------------------------------------------------------------------------
# ERROR RE****TING AND LOGGING
#---------------------------------------------------------------------------
# - Where to Log -
#log_destination = 'stderr' # Valid values are combinations of
# stderr, syslog and eventlog,
# depending on platform.
# This is used when logging to stderr:
#redirect_stderr = off # Enable capturing of stderr into
log
# files
# (change requires restart)
# These are only used if redirect_stderr is on:
#log_directory = 'pg_log' # Directory where log files are
written
# Can be absolute or relative to
PGDATA
#log_filename = 'postgresql-%Y-%m-%d_%H%M%S.log' # Log file name pattern.
# Can include strftime() escapes
#log_truncate_on_rotation = off # If on, any existing log file of the same
# name as the new log file will be
# truncated rather than appended
to. But
# such truncation only occurs on
# time-driven rotation, not on
restarts
# or size-driven rotation. Default
is
# off, meaning append to existing
files
# in all cases.
#log_rotation_age = 1d # Automatic rotation of logfiles
will
# happen after that time. 0 to
# disable.
#log_rotation_size = 10MB # Automatic rotation of logfiles
will
# happen after that much log
# output. 0 to disable.
# These are relevant when logging to syslog:
#syslog_facility = 'LOCAL0'
#syslog_ident = 'postgres'
# - When to Log -
#client_min_messages = notice # Values, in order of decreasing
detail:
# debug5
# debug4
# debug3
# debug2
# debug1
# log
# notice
# warning
# error
#log_min_messages = notice # Values, in order of decreasing
detail:
# debug5
# debug4
# debug3
# debug2
# debug1
# info
# notice
# warning
# error
# log
# fatal
# panic
#log_error_verbosity = default # terse, default, or verbose
messages
#log_min_error_statement = error # Values in order of increasing
severity:
# debug5
# debug4
# debug3
# debug2
# debug1
# info
# notice
# warning
# error
# fatal
# panic (effectively off)
log_min_duration_statement = -1 # -1 is disabled, 0 logs all
statements
# and their durations.
#silent_mode = off # DO NOT USE without syslog or
# redirect_stderr
# (change requires restart)
# - What to Log -
#debug_print_parse = off
#debug_print_rewritten = off
#debug_print_plan = off
#debug_pretty_print = off
#log_connections = off
log_disconnections = on
log_duration = on
log_line_prefix = '%u@[EMAIL PROTECTED]
%h %m' # Special values:
# %u = user name
# %d = database name
# %r = remote host and ****t
# %h = remote host
# %p = PID
# %t = timestamp (no
milliseconds)
# %m = timestamp with
milliseconds
# %i = command tag
# %c = session id
# %l = session line number
# %s = session start timestamp
# %x = transaction id
# %q = stop here in non-session
# processes
# %% = '%'
# e.g. '<%u%%%d> '
#log_statement = 'none' # none, ddl, mod, all
#log_hostname = off
#---------------------------------------------------------------------------
# RUNTIME STATISTICS
#---------------------------------------------------------------------------
# - Query/Index Statistics Collector -
#stats_command_string = on
#update_process_title = on
#stats_start_collector = on # needed for block or row stats
# (change requires restart)
#stats_block_level = off
#stats_row_level = off
#stats_reset_on_server_start = off # (change requires restart)
# - Statistics Monitoring -
#log_parser_stats = off
#log_planner_stats = off
#log_executor_stats = off
#log_statement_stats = off
#---------------------------------------------------------------------------
# AUTOVACUUM PARAMETERS
#---------------------------------------------------------------------------
#autovacuum = off # enable autovacuum subprocess?
# 'on' requires
stats_start_collector
# and stats_row_level to also be
on
#autovacuum_naptime = 1min # time between autovacuum runs
#autovacuum_vacuum_threshold = 500 # min # of tuple updates before
# vacuum
#autovacuum_analyze_threshold = 250 # min # of tuple updates before
# analyze
#autovacuum_vacuum_scale_factor = 0.2 # fraction of rel size before
# vacuum
#autovacuum_analyze_scale_factor = 0.1 # fraction of rel size before
# analyze
#autovacuum_freeze_max_age = 200000000 # maximum XID age before forced
vacuum
# (change requires restart)
#autovacuum_vacuum_cost_delay = -1 # default vacuum cost delay for
# autovacuum, -1 means use
# vacuum_cost_delay
#autovacuum_vacuum_cost_limit = -1 # default vacuum cost limit for
# autovacuum, -1 means use
# vacuum_cost_limit
#---------------------------------------------------------------------------
# CLIENT CONNECTION DEFAULTS
#---------------------------------------------------------------------------
# - Statement Behavior -
#search_path = '"$user",public' # schema names
#default_tablespace = '' # a tablespace name, '' uses
# the default
#check_function_bodies = on
#default_transaction_isolation = 'read committed'
#default_transaction_read_only = off
#statement_timeout = 0 # 0 is disabled
#vacuum_freeze_min_age = 100000000
# - Locale and Formatting -
datestyle = 'iso, mdy'
#timezone = unknown # actually, defaults to TZ
# environment setting
#timezone_abbreviations = 'Default' # select the set of available
timezone
# abbreviations. Currently, there
are
# Default
# Australia
# India
# However you can also create your
own
# file in share/timezonesets/.
#extra_float_digits = 0 # min -15, max 2
#client_encoding = sql_ascii # actually, defaults to database
# encoding
# These settings are initialized by initdb -- they might be changed
lc_messages = 'C' # locale for system error message
# strings
lc_monetary = 'C' # locale for monetary formatting
lc_numeric = 'C' # locale for number formatting
lc_time = 'C' # locale for time formatting
# - Other Defaults -
#explain_pretty_print = on
#dynamic_library_path = '$libdir'
#local_preload_libraries = ''
#---------------------------------------------------------------------------
# LOCK MANAGEMENT
#---------------------------------------------------------------------------
#deadlock_timeout = 1s
#max_locks_per_transaction = 64 # min 10
# (change requires restart)
# Note: each lock table slot uses ~270 bytes of shared memory, and there
are
# max_locks_per_transaction * (max_connections +
max_prepared_transactions)
# lock table slots.
#---------------------------------------------------------------------------
# VERSION/PLATFORM COMPATIBILITY
#---------------------------------------------------------------------------
# - Previous Postgres Versions -
#add_missing_from = off
#array_nulls = on
#backslash_quote = safe_encoding # on, off, or safe_encoding
#default_with_oids = off
#escape_string_warning = on
#standard_conforming_strings = off
#regex_flavor = advanced # advanced, extended, or basic
#sql_inheritance = on
# - Other Platforms & Clients -
#transform_null_equals = off
#---------------------------------------------------------------------------
# CUSTOMIZED OPTIONS
#---------------------------------------------------------------------------
#custom_variable_cl***** = '' # list of custom variable class
names
--
Bryan Buecking
--
Sent via pgsql-performance mailing list (pgsql-performance@[EMAIL PROTECTED]
)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


|