Talk About Network

Google


Register and Login
Nick
Password
Register create new account Sign up is FREE and you can post replies, new topics, bookmark posts and more!
Recover lost password


Data Bases > Oracle Server > Re: Differences...
Latest [ Topics | Posts ] Archive Post A New Topic Post a Reply
<< Topic < Post Post 6 of 6 Topic 16522 of 17248
Post > Topic >>

Re: Differences in Range Scan Performance between Binary and NLS Indexes

by "Jonathan Lewis" <jonathan@[EMAIL PROTECTED] > May 10, 2008 at 10:49 AM

"Pat" <pat.casey@[EMAIL PROTECTED]
> wrote in message 
news:b0cf7d61-3547-48e0-a6e4-f2cdbfcde83a@[EMAIL PROTECTED]
 have a table set with about 340k rows in it. User's execute queries
against this table for arbitrary search strings which are resolved
against the name table in the column. The table contains mixed case,
but the users expect to match their search regardless of case. So, for
example, both of these should match a search term of "badg":

ARabidBadger
somebadglowercase

A typical search, for all those entries whose name contains "badg"
would look like this:

SELECT "sys_id" FROM (SELECT /*+ FIRST_ROWS(250) */ a.*, ROWNUM rnum
FROM (SELECT cmdb_ci0."sys_id" FROM cmdb_ci cmdb_ci0  WHERE
lower(cmdb_ci0."name") LIKE '%badge%' ORDER BY lower(cmdb_ci0."name"))
a where ROWNUM <= 250) WHERE rnum > 0

To make this query efficient, I added a covering index

create index cmdb_ci_lower_name_sys_id on cmdb_ci (lower("name"),
"sys_id");

And our query plan looks like this and completes in about 120 ms. This
was worst case since only 1 row was returned. Short search strings
like "b" will hit their stop key quickly and exit after finding 250
matches without having to exhaust the index.

Still, worst case, this query does a full index scan in about 120ms;

SQL> SELECT "sys_id" FROM (SELECT /*+ FIRST_ROWS(250) */ a.*, ROWNUM
rnum FROM (SELECT cmdb_ci0."sys_id" FROM cmdb_ci cmdb_ci0  WHERE
lower(cmdb_ci0."name") LIKE '%badge%' ORDER BY lower(cmdb_ci0."name"))
a where ROWNUM <= 250) WHERE rnum > 0;

Elapsed: 00:00:00.12

Execution Plan
----------------------------------------------------------
Plan hash value: 424525705

------------------------------------------------------------------------------------------------
| Id  | Operation    | Name        | Rows  | Bytes | Cost (%CPU)|
Time     |
------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT   |        |   250 | 11750 |    46 (0)|
00:00:01 |
|*  1 |  VIEW    |        |   250 | 11750 |    46 (0)| 00:00:01 |
|*  2 |   COUNT STOPKEY    |        |       |       |     |        |
|   3 |    VIEW    |        |   251 |  8534 |    46 (0)| 00:00:01
|
|*  4 |     INDEX FULL SCAN| CMDB_CI_LOWER_NAME_SYS_ID | 16997 |
813K|    46 (0)| 00:00:01 |
------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - filter("RNUM">0)
   2 - filter(ROWNUM<=250)
   4 - filter(LOWER("name") LIKE '%badge%')


Statistics
----------------------------------------------------------
141  recursive calls
  0  db block gets
       2964  consistent gets
  0  physical reads
  0  redo size
439  bytes sent via SQL*Net to client
400  bytes received via SQL*Net from client
  2  SQL*Net roundtrips to/from client
  0  sorts (memory)
  0  sorts (disk)
  1  rows processed

Recently though, we started storing NLS characters in this column
(european customers), so we'll have values like:

badger
bädger <-- note the accented a

The users want both of the above to collate next to each other.

So we, not unnaturally, put the database in linguisitc mode:

alter session set nls_sort=german;
alter session set nls_comp=linguistic;

Then we issue the same query. The query plan here involves a sort, but
the query returns a grand total of 1 record, so the sort isn't
material to the response time.

Note that the same query, running the same index full scan against the
same index, now takes 1.23 second, roughly 10X as long as it took when
running in binary mode.

SQL> SELECT "sys_id" FROM (SELECT /*+ FIRST_ROWS(250) */ a.*, ROWNUM
rnum FROM (SELECT cmdb_ci0."sys_id" FROM cmdb_ci cmdb_ci0  WHERE
lower(cmdb_ci0."name") LIKE '%badge%' ORDER BY lower(cmdb_ci0."name"))
a where ROWNUM <= 250) WHERE rnum > 0;

Elapsed: 00:00:01.23

Execution Plan
----------------------------------------------------------
Plan hash value: 3626452865

--------------------------------------------------------------------------------------------------------------
| Id  | Operation | Name      | Rows  | Bytes |TempSpc| Cost
(%CPU)| Time     |
--------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT |      | 250 | 11750 |      | 870   (3)|
00:00:11 |
|*  1 |  VIEW |      | 250 | 11750 |      | 870   (3)| 00:00:11
|
|*  2 |   COUNT STOPKEY |      |      |      |      |   |      |
|   3 |    VIEW |      | 16997 | 564K|      | 870   (3)|
00:00:11 |
|*  4 |     SORT ORDER BY STOPKEY|      | 16997 | 813K| 2008K|
870   (3)| 00:00:11 |
|*  5 |      INDEX FAST FULL SCAN| CMDB_CI_LOWER_NAME_SYS_ID | 16997
| 813K|      | 659   (3)| 00:00:08 |
--------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - filter("RNUM">0)
   2 - filter(ROWNUM<=250)
   4 - filter(ROWNUM<=250)
   5 - filter(LOWER("name") LIKE '%badge%')


Statistics
----------------------------------------------------------
134  recursive calls
  0  db block gets
       2978  consistent gets
       2929  physical reads
  0  redo size
439  bytes sent via SQL*Net to client
400  bytes received via SQL*Net from client
  2  SQL*Net roundtrips to/from client
  1  sorts (memory)
  0  sorts (disk)
  1  rows processed

If I take off the order by clause so we use exactly the same plan as
our original query, we still get:

SQL> SELECT "sys_id" FROM (SELECT /*+ FIRST_ROWS(250) */ a.*, ROWNUM
rnum FROM (SELECT cmdb_ci0."sys_id" FROM cmdb_ci cmdb_ci0  WHERE
lower(cmdb_ci0."name") LIKE '%badge%' ) a where ROWNUM <= 250) WHERE
rnum > 0;

Elapsed: 00:00:01.20

Execution Plan
----------------------------------------------------------
Plan hash value: 1941712263

----------------------------------------------------------------------------------------------------
| Id  | Operation        | Name    | Rows  | Bytes | Cost (%CPU)|
Time    |
----------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT       |    |   250 | 11750 | 11   (0)|
00:00:01 |
|*  1 |  VIEW        |    |   250 | 11750 | 11   (0)| 00:00:01 |
|*  2 |   COUNT STOPKEY        |    |    |    | |    |
|*  3 |    INDEX FAST FULL SCAN| CMDB_CI_LOWER_NAME_SYS_ID |   251 |
12299 | 11   (0)| 00:00:01 |
----------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - filter("RNUM">0)
   2 - filter(ROWNUM<=250)
   3 - filter(LOWER("name") LIKE '%badge%')


Statistics
----------------------------------------------------------
134  recursive calls
  0  db block gets
       2979  consistent gets
  0  physical reads
  0  redo size
439  bytes sent via SQL*Net to client
400  bytes received via SQL*Net from client
  2  SQL*Net roundtrips to/from client
  0  sorts (memory)
  0  sorts (disk)
  1  rows processed


So my questions are is:

1) Am I doing something wrong here? Why does putting the system into
NLS_SORT and NLS_COMP make this big a difference on performance?
2) Is there an index I should add here that would allow some form of
efficient querying across this set? I know I'm going to have to do a
full scan here (contains queries being what they are), but why is my
full scan 10X slower in linguistic mode?
3) Is there some other recommended approach to getting linguistic
collation working efficiently? Shadow columns aren't really practical
here (and even then I'm not sure what I'd put in the shadow unless I
wrote my own german to english unaccenter).

Any help would be much appreciated.


A couple of points.

In starting case, you are doing an "INDEX FULL SCAN".
Because the "order by" clause matches an index that can be
used for the "like" - Oracle knows that if it walks the index
in order then it can stop as soon as it has found enough data
without sorting.

In the second case, Oracle has to do the "table scan" - but
in fact, since all the necessary data is available in an index,
it can do an "index FAST full scan", which is the tablescan
mechanism (multiblock read) applied to an index segment.
Because of this, it has to read the whole data set, and then
sort it to get the first 250.  This gives you three possible
reasons for the extra time:
a) time for reading all the data
b) time for sorting
c) physical reads - a 'full scan' will buffer the index blocks,
a 'fast full scan' may not, unless the index is a small one.
This means you have to check your test conditions very
carefully to see that you are comparing like with like, and
also creating a test that will reflect the production behaviour
closely.


In the last case (without the order by), Oracle could stop
the fast full scan early (after finding 250) rows, and doesn't
have to sort. It also happens that you don't do any physical
reads - so the index blocks were buffered for this test, when
they weren't for the previous test.

Note - your third test wasn' the same as your first - the third
one is still doing the FAST full scan of the second test.


-- 
Regards

Jonathan Lewis
http://jonathanlewis.wordpress.com

Author: Cost Based Oracle: Fundamentals
http://www.jlcomp.demon.co.uk/cbo_book/ind_book.html

The Co-operative Oracle Users' FAQ
http://www.jlcomp.demon.co.uk/faq/ind_faq.html
 




 6 Posts in Topic:
Differences in Range Scan Performance between Binary and NLS Ind
Pat <pat.casey@[EMAIL   2008-05-06 13:16:56 
Re: Differences in Range Scan Performance between Binary and NLS
joel garry <joel-garry  2008-05-06 14:26:43 
Re: Differences in Range Scan Performance between Binary and NLS
Pat <pat.casey@[EMAIL   2008-05-06 14:36:47 
Re: Differences in Range Scan Performance between Binary and NLS
joel garry <joel-garry  2008-05-07 11:29:51 
Re: Differences in Range Scan Performance between Binary and NLS
Pat <pat.casey@[EMAIL   2008-05-07 12:29:06 
Re: Differences in Range Scan Performance between Binary and NLS
"Jonathan Lewis"  2008-05-10 10:49:52 

Post A Reply:
  Go here to Signup

AddThis Feed Button


About - Advertising - Contact - Frequently Asked Questions - Privacy Policy - Terms of Use - Signup

Contact
tan12V112 Tue Oct 7 1:20:24 CDT 2008.