Talk About Network

Google


Register and Login
Nick
Password
Register create new account Sign up is FREE and you can post replies, new topics, bookmark posts and more!
Recover lost password


Data Bases > Berkely DB > Re: Very slow r...
Latest [ Topics | Posts ] Archive Post A New Topic Post a Reply
<< Topic < Post Post 7 of 8 Topic 1766 of 1805
Post > Topic >>

Re: Very slow record-retrieval through cursor

by Don Anderson <don.anderson@[EMAIL PROTECTED] > Aug 31, 2007 at 08:31 PM

Hello Rabbit,

Some comments inline.

On Aug 30, 10:12 pm, Rabbit <McE...@[EMAIL PROTECTED]
> wrote:
> Hi Don,
>
> Thanks for your help! It's during the initialization period of my
> program so there is no writing operations. I put detailed answers
> below.
>
> On 8 31 ,   2 58 , Don Anderson <don.ander...@[EMAIL PROTECTED]
> wrote:
>
> > Hello,
>
> > I don't know valgrind either, but I know quite a bit about BDB.  You
> > might want to ask
> > your question on the Oracle forums for BDB, in the meantime, here are
> > some thoughts.
>
> > You say 'After a while, the cursor read process becomes very slow'.  I
> > guess that means
> > it is running fast until then?  It sounds like you are doing all read
> > operations during this time?
>
> It becomes slow very quickly, after reading couple of dbs.
> Yes, all read operations.
>
> > If not, there may be some write operations holding locks - possible
> > contention.  If you
> > are doing writing, are you using transactions?
>
> No writing at this time.
>
>
>
> > Has your cache been prewarmed - that is, is there reason to believe
> > that the operations
> > before the slowdown are running fast because the cache is already
> > filled with pages
> > from previous accesses?  A quick calculation shows that raw data size
> > alone:
> >   (16+8) * 500,000 * 250 => 3G.
> > With typical overhead and page fill, the database sizes may be more
> > like 5 or 6 G (BDB
> > generally sacrifices space for speed) so if your testing is 'random'
> > and you are doing
> > complete database scans, it probably won't fit in 4G cache.
>
> I think the cache is not prewarmed (how to prewarm it?).

DB cache is a simple cache of database pages, when the
cache fills up the least recently used page is kicked out.
Prewarming just means that you've accessed the database
in advance so the pages you need are already there.

> I noticed a phenomena that, if I stop the program during the
> initializaiton and start it again,
> it will be very fast before it reaches where it's stopped last time.
> After that, the db read becomes
> slow again.
> If I clean the evironment (just delete those __db* files), the db read
> becomes slow very quickly.

This all indicates that the performance differences are explained
by things being in cache or out of cache.  Memory speeds are
a few orders of magnitude faster than disk speeds.

>
> The total physical files' size of these dbs is about 7.6G.
> Yes I'm doing a complete database scan but I don't know whether it is
> random.
> I just put a DB_NEXT flag (sequence scan?) to the cursor and keep
> reading until finished.
>
> In my thought, the cache won't fit all the data, but is enough for one
> db.

The BDB cache operates differently than you probably think.
It is not: here the guy is walking through a cursor, so I can predict
what the next access should be and prefetch the next page.
It is more like - the cursor visits all the data that is stored on
a page and when it is done, it must read the next page in the DB
file - that's a disk op that must wait until compilation.

One thing I have to ask - is this the typical operation for your
application,
reading sequentially through an entire database?  Because we are
worried
about optimizing this, let's be sure this is the im****tant use case.
What is the point for the scan?  If you are searching for something,
you might want to set up a secondary table instead to find your your
data in a fraction of the time.  Or if you need some tabulation (like
a
total of a column), you can maintain that as you are building the
data.
These things have costs of course, you just need to choose the
use cases that matter to you.

For optimizing this particular thing, you might try: using a BTREE
database if you aren't already.  That alone probably won't fix
anything,
but you could then reload the database:
  <shutdown the database environment and application>
  db_dump foo.db | db_load foo2.db
  mv foo2.db foo.db
  <start the application>

That gets the order of the pages in the DB file to closely match
the logical order of the keys, and a sequential scan would
probably work best with most OS's which will do prefetching,
even if BDB isn't.

>
>
> > If it isn't a cache issue, then maybe there's a resource that is
> > leaking.  Are all cursors
> > closed when finished, and tem****ary memory freed?  
>
> Yes
>
> > You'd probably see process size
> > starting to balloon if that is what is going on.  Perhaps there's some
> > other resource
> > contention between the threads.  What happens if you run with fewer
> > (or even one) thread?
>
> It's also very slow even I run with only one thread.
> If I clean the environment, it's very fast when read the first 6 or 7
> dbs, but becomes slow after that.

That is the data that doesn't mesh with a typical cache problem -
but if the OS is also caching than the OS would 'remember'
a previous run.

Another thing to try:
Use the DB_ENV->set_mp_mmapsize() API with a small number,
maybe 1 megabyte.  This only makes a difference if you
are opening your databases readonly.

Also -- how big is the memory on your system?  If it is not much
more than 4 GB, then you have made your BDB cache too large,
you are thra****ng.  Regardless, you might want to experiment with a
*smaller*
BDB cache to make sure.

> > Answers to any of these could be helpful clues.  Another thing to look
> > at is DB statistics.
> > The first one to look at is the output from db_stat -m, which would
> > tell you about the cache,
>
> The head part of db_stat -m:
>
> 4GB     Total cache size
> 4       Number of caches
> 4       Maximum number of caches
> 1GB     Pool individual cache size
> 0       Maximum memory-mapped file size
> 0       Maximum open file descriptors
> 0       Maximum sequential buffer writes
> 0       Sleep after writing maximum sequential buffers
> 0       Requested pages mapped into the process' address space
> 24M     Requested pages found in the cache (98%)
> 360522  Requested pages not found in the cache
> 0       Pages created in the cache
> 360521  Pages read into the cache
> 0       Pages written from the cache to the backing file
> 0       Clean pages forced from the cache
> 0       Dirty pages forced from the cache
> 0       Dirty pages written by trickle-sync thread
> 360510  Current total page count
> 360510  Current clean page count
> 0       Current dirty page count
> 524284  Number of hash buckets used for page location
> 24M     Total number of times hash chains searched for a page
> (24509250)
> 11      The longest hash chain searched for a page
> 28M     Total number of hash chain entries checked for page (28559440)
> 0       The number of hash bucket locks that required waiting (0%)
> 0       The maximum number of times any hash bucket lock was waited
> for (0%)
> 0       The number of region locks that required waiting (0%)
> 0       The number of buffers frozen
> 0       The number of buffers thawed
> 0       The number of frozen buffers freed
> 361545  The number of page allocations
> 0       The number of hash buckets examined during allocations
> 0       The maximum number of hash buckets examined for an allocation
> 0       The number of pages examined during allocations
> 0       The max number of pages examined for an allocation
> 0       Threads waited on page I/O

No big surprises.  Your cache hit rate is high - 98%.
But because of the high volume, you still have 360522 misses,
these are all waiting on disk.  To verify this is the issue,
perhaps do db_stat -m once every 10 seconds while you
are running your test, and correlate it to your response time.
The numbers in db_stat -m are ***ulative, so you should
be able to see lots of cache misses exactly during the times
of the slowdown.  With a little arithmetic, you can see what your
disk access time is.  Perhaps there are good reasons it is slow -
is the disk on a network drive?

> > and db_stat -Co, which can give some clues about contention.  Run
> > these after the program
> > slows down.  The output can be big, so capture it in a file.
>
> The output of db_stat -Co:
>
> =-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
> Lock REGINFO information:
> Lock    Region type
> 8       Region ID
> data/__db.008   Region name
> 0x2affad7b6000  Original region address
> 0x2affad7b6000  Region address
> 0x2affad7b6068  Region primary address
> 0       Region maximum allocation
> 0       Region allocated
> Region allocations: 8394757 allocations, 0 failures, 0 frees, 1
> longest
> REGION_JOIN_OK  Region flags
> =-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
> Locks grouped by object:
> Locker   Mode      Count Status  ----------------- Object
> ---------------
>      17d READ          1 HELD   #040.db             handle        0
>
>      10b READ          1 HELD   #002.db             handle        0
>
>      114 READ          1 HELD   #005.db             handle        0
>
>      19c READ          1 HELD   #050.db             page       3987
>
>      126 READ          1 HELD   #011.db             handle        0
>
>      117 READ          1 HELD   #006.db             handle        0
>
>      165 READ          1 HELD   #032.db             handle        0
>
>      19c READ          1 HELD   #050.db             page       2584
>
>      10e READ          1 HELD   #003.db             handle        0
>
>      13e READ          1 HELD   #019.db             handle        0
>
>      171 READ          1 HELD   #036.db             handle        0
>
>      12f READ          1 HELD   #014.db             handle        0
>
>      15f READ          1 HELD   #030.db             handle        0
>
>      144 READ          1 HELD   #021.db             handle        0
>
>      186 READ          1 HELD   #043.db             handle        0
>
>      162 READ          1 HELD   #031.db             handle        0
>
>      12c READ          1 HELD   #013.db             handle        0
>
>      177 READ          1 HELD   #038.db             handle        0
>
>      174 READ          1 HELD   #037.db             handle        0
>
>      108 READ          1 HELD   #001.db             handle        0
>
>      105 READ          1 HELD   #000.db             handle        0
>
>      17a READ          1 HELD   #039.db             handle        0
>
>      13b READ          1 HELD   #018.db             handle        0
>
>      189 READ          1 HELD   #044.db             handle        0
>
>      102 READ          1 HELD    sys.db                  handle
> 0
>
>      180 READ          1 HELD   #041.db             handle        0
>
>      156 READ          1 HELD   #027.db             handle        0
>
>      120 READ          1 HELD   #009.db             handle        0
>
>      159 READ          1 HELD   #028.db             handle        0
>
>      132 READ          1 HELD   #015.db             handle        0
>
>      138 READ          1 HELD   #017.db             handle        0
>
>      18f READ          1 HELD   #046.db             handle        0
>
>      123 READ          1 HELD   #010.db             handle        0
>
>      14d READ          1 HELD   #024.db             handle        0
>
>      19b READ          1 HELD   #050.db             handle        0
>
>      11d READ          1 HELD   #008.db             handle        0
>
>      168 READ          1 HELD   #033.db             handle        0
>
>      129 READ          1 HELD   #012.db             handle        0
>
>      198 READ          1 HELD   #049.db             handle        0
>
>      141 READ          1 HELD   #020.db             handle        0
>
>      135 READ          1 HELD   #016.db             handle        0
>
>      153 READ          1 HELD   #026.db             handle        0
>
>      147 READ          1 HELD   #022.db             handle        0
>
>      111 READ          1 HELD   #004.db             handle        0
>
>      150 READ          1 HELD   #025.db             handle        0
>
>      195 READ          1 HELD   #048.db             handle        0
>
>      11a READ          1 HELD   #007.db             handle        0
>
>      15c READ          1 HELD   #029.db             handle        0
>
>      192 READ          1 HELD   #047.db             handle        0
>
>      18c READ          1 HELD   #045.db             handle        0
>
>      16e READ          1 HELD   #035.db             handle        0
>
>      16b READ          1 HELD   #034.db             handle        0
>
>      14a READ          1 HELD   #023.db             handle        0
>
>      183 READ          1 HELD   #042.db             handle        0
>

Since everyone is a reader, there are no surprises here.
If you ever see WAIT instead of HELD, that can be an indication of
contention.

- Don
 




 8 Posts in Topic:
Very slow record-retrieval through cursor
Rabbit <McEase@[EMAIL   2007-08-30 07:54:36 
Re: Very slow record-retrieval through cursor
Don Anderson <don.ande  2007-08-30 11:58:16 
Re: Very slow record-retrieval through cursor
Rabbit <McEase@[EMAIL   2007-08-31 02:12:53 
Re: Very slow record-retrieval through cursor
Rabbit <McEase@[EMAIL   2007-08-31 02:50:14 
Re: Very slow record-retrieval through cursor
Rabbit <McEase@[EMAIL   2007-08-31 04:06:55 
Re: Very slow record-retrieval through cursor
Rabbit <McEase@[EMAIL   2007-08-31 09:12:55 
Re: Very slow record-retrieval through cursor
Don Anderson <don.ande  2007-08-31 20:31:33 
Re: Very slow record-retrieval through cursor
Rabbit <McEase@[EMAIL   2007-09-02 18:50:51 

Post A Reply:
  Go here to Signup

AddThis Feed Button


About - Advertising - Contact - Frequently Asked Questions - Privacy Policy - Terms of Use - Signup

Contact
tan12V112 Mon Oct 13 2:20:04 CDT 2008.