Talk About Network

Google


Register and Login
Nick
Password
Register create new account Sign up is FREE and you can post replies, new topics, bookmark posts and more!
Recover lost password


Data Bases > Pgsql Sql > Re: in() VS exi...
Latest [ Topics | Posts ] Archive Post A New Topic Post a Reply
<< Topic < Post Post 1 of 1 Topic 3345 of 3799
Post > Topic >>

Re: in() VS exists()

by Bart.Degryse@[EMAIL PROTECTED] ("Bart Degryse") Mar 13, 2008 at 03:12 PM

--=__PartB1972271.0__=
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

I think that just
select count(sp.id) from specimens sp INNER JOIN sequences s on
s.specimen_id = sp.id;
should be enough

>>> Julien <jcigar@[EMAIL PROTECTED]
> 2008-03-13 17:10 >>>
If I understood well the query plan, the planner optimize the
IN(SELECT ...) version with a JOIN (line 19-20 of the first paste) :

->  Hash IN Join  (cost=240.95..4011.20 rows=1436 width=4) (actual
time=93.971..201.908 rows=1431 loops=1)
         Hash Cond: ("outer".id = "inner".specimen_id)

so I guess that :

select count(sp.id) from specimens sp where sp.id in (select
specimen_id
from sequences);

is almost the same as :

select count(sp.id) from specimens sp INNER JOIN (select specimen_id
from sequences GROUP BY specimen_id) as foo on foo.specimen_id =
sp.id;

?

Thanks,
Julien

On Thu, 2008-03-13 at 14:46 +0100, Bart Degryse wrote:
> The chapter on indexes in the manual should give you a pretty good
> idea on the why.
> IN and EXISTS are not the only possibilities, you can also use inner
> or outer joins.
> Which solution performs best depends on the data, the database
> version, the available indexes, ...
> 
> >>> Julien <jcigar@[EMAIL PROTECTED]
> 2008-03-13 15:47 >>>
> Hello,
> 
> Does anyone has an idea why sometimes:
> - select ... where ... in (select ...)
> is faster than :
> - select ... where ... exists(select ...)
> and sometimes it's the opposite ?
> 
> I had such a situation, I've pasted the queries on:
> http://rafb.net/p/KXNZ6892.html
and http://rafb.net/p/jvo5DO38.html

> 
> It's running PostgreSQL 8.1 with an effective_cache_size of 30000.
> 
> specimens.id is the primary key and there are indexes on
> sequences(specimen_id) and specimen_measurements(specimen_id)
> 
> Is there a general "rule" to know when to use the in() version and
> when
> to use the exists() version ? Is it true to say that the exists()
> version is more scalable (with many rows) than the in() version
(from
> the little tests I made it seems the case) ?
> 
> Thanks,
> Julien
> 
> -- 
> Julien Cigar
> Belgian Biodiversity Platform
> http://www.biodiversity.be
( http://www.biodiversity.be/
)
> Université Libre de Bruxelles (ULB)
> Campus de la Plaine CP 257
> Bâtiment NO, Bureau 4 N4 115C (Niveau 4)
> Boulevard du Triomphe, entrée ULB 2
> B-1050 Bruxelles
> Mail: jcigar@[EMAIL PROTECTED]
 
> @[EMAIL PROTECTED]
 http://biobel.biodiversity.be/person/show/471

> Tel : 02 650 57 52
> 
> 
> -- 
> Sent via pgsql-sql mailing list (pgsql-sql@[EMAIL PROTECTED]
)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-sql

> 
-- 
Julien Cigar
Belgian Biodiversity Platform
http://www.biodiversity.be
( http://www.biodiversity.be/
)
Université Libre de Bruxelles (ULB)
Campus de la Plaine CP 257
Bâtiment NO, Bureau 4 N4 115C (Niveau 4)
Boulevard du Triomphe, entrée ULB 2
B-1050 Bruxelles
Mail: jcigar@[EMAIL PROTECTED]
 
@[EMAIL PROTECTED]
 http://biobel.biodiversity.be/person/show/471

Tel : 02 650 57 52


-- 
Sent via pgsql-sql mailing list (pgsql-sql@[EMAIL PROTECTED]
)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-sql

--=__PartB1972271.0__=
Content-Type: text/html; charset=UTF-8
Content-Transfer-Encoding: quoted-printable

<HTML><HEAD>
<META http-equiv=3DContent-Type content=3D"text/html; charset=3Dutf-8">
<META content=3D"MSHTML 6.00.2800.1264" name=3DGENERATOR></HEAD>
<BODY style=3D"MARGIN: 4px 4px 1px; FONT: 10pt Tahoma">
<DIV>I think that just</DIV>
<DIV>select count(sp.id) from specimens sp INNER JOIN&nbsp;sequences s on
=
s.specimen_id =3D sp.id;<BR>should be enough<BR><BR>&gt;&gt;&gt; Julien =
&lt;jcigar@[EMAIL PROTECTED]
 2008-03-13 17:10 &gt;&gt;&gt;<BR>If I understood
=
well the query plan, the planner optimize the<BR>IN(SELECT ...) version =
with a JOIN (line 19-20 of the first paste) :<BR><BR>-&gt;&nbsp; Hash IN =
Join&nbsp; (cost=3D240.95..4011.20 rows=3D1436 width=3D4) (actual<BR>time=
=3D93.971..201.908 rows=3D1431
loops=3D1)<BR>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;=
&nbsp;&nbsp;&nbsp; Hash Cond: ("outer".id =3D
"inner".specimen_id)<BR><BR>s=
o I guess that :<BR><BR>select count(sp.id) from specimens sp where sp.id
=
in (select specimen_id<BR>from sequences);<BR><BR>is almost the same as =
:<BR><BR>select count(sp.id) from specimens sp INNER JOIN (select =
specimen_id<BR>from sequences GROUP BY specimen_id) as foo on
foo.specimen_=
id =3D sp.id;<BR><BR>?<BR><BR>Thanks,<BR>Julien<BR><BR>On Thu, 2008-03-13
=
at 14:46 +0100, Bart Degryse wrote:<BR>&gt; The chapter on indexes in the
=
manual should give you a pretty good<BR>&gt; idea on the why.<BR>&gt; IN =
and EXISTS are not the only possibilities, you can also use inner<BR>&gt;
=
or outer joins.<BR>&gt; Which solution performs best depends on the data,
=
the database<BR>&gt; version, the available indexes, ...<BR>&gt; <BR>&gt;
=
&gt;&gt;&gt; Julien &lt;jcigar@[EMAIL PROTECTED]
 2008-03-13 15:47
&gt;&gt;&gt;<=
BR>&gt; Hello,<BR>&gt; <BR>&gt; Does anyone has an idea why
sometimes:<BR>&=
gt; - select ... where ... in (select ...)<BR>&gt; is faster than =
:<BR>&gt; - select ... where ... exists(select ...)<BR>&gt; and sometimes
=
it's the opposite ?<BR>&gt; <BR>&gt; I had such a situation, I've pasted =
the queries on:<BR>&gt; <A
href=3D"http://rafb.net/p/KXNZ6892.html">http://=
rafb.net/p/KXNZ6892.html</A> and <A
href=3D"http://rafb.net/p/jvo5DO38.html=
">http://rafb.net/p/jvo5DO38.html</A><BR>&gt;
<BR>&gt; It's running =
PostgreSQL 8.1 with an effective_cache_size of 30000.<BR>&gt; <BR>&gt; =
specimens.id is the primary key and there are indexes on<BR>&gt;
sequences(=
specimen_id) and specimen_measurements(specimen_id)<BR>&gt; <BR>&gt; Is =
there a general "rule" to know when to use the in() version and<BR>&gt; =
when<BR>&gt; to use the exists() version ? Is it true to say that the =
exists()<BR>&gt; version is more scalable (with many rows) than the in() =
version (from<BR>&gt; the little tests I made it seems the case) ?<BR>&gt;
=
<BR>&gt; Thanks,<BR>&gt; Julien<BR>&gt; <BR>&gt; -- <BR>&gt; Julien =
Cigar<BR>&gt; Belgian Biodiversity Platform<BR>&gt; <A
href=3D"http://www.b=
iodiversity.be/">http://www.biodiversity.be</A><BR>&gt;
Universit=C3=A9 =
Libre de Bruxelles (ULB)<BR>&gt; Campus de la Plaine CP 257<BR>&gt; =
B=C3=A2timent NO, Bureau 4 N4 115C (Niveau 4)<BR>&gt; Boulevard du =
Triomphe, entr=C3=A9e ULB 2<BR>&gt; B-1050 Bruxelles<BR>&gt; Mail: =
jcigar@[EMAIL PROTECTED]
>&gt; @[EMAIL PROTECTED]
 <A
href=3D"http://biobel.biodiversity.be/=
person/show/471">http://biobel.biodiversity.be/person/show/471</A><BR>&gt;
=
Tel : 02 650 57 52<BR>&gt; <BR>&gt; <BR>&gt; -- <BR>&gt; Sent via =
pgsql-sql mailing list (pgsql-sql@[EMAIL PROTECTED]
)<BR>&gt; To make changes
=
to your subscription:<BR>&gt; <A
href=3D"http://www.postgresql.org/mailpref=
/pgsql-sql">http://www.postgresql.org/mailpref/pgsql-sql</A><BR>&gt;
=
<BR>-- <BR>Julien Cigar<BR>Belgian Biodiversity Platform<BR><A
href=3D"http=
://www.biodiversity.be/">http://www.biodiversity.be</A><BR>Universit=C3=A9
=
Libre de Bruxelles (ULB)<BR>Campus de la Plaine CP 257<BR>B=C3=A2timent =
NO, Bureau 4 N4 115C (Niveau 4)<BR>Boulevard du Triomphe, entr=C3=A9e ULB
=
2<BR>B-1050 Bruxelles<BR>Mail: jcigar@[EMAIL PROTECTED]
>@[EMAIL PROTECTED]
 <A
href=3D"http=
://biobel.biodiversity.be/person/show/471">http://biobel.biodiversity.be/pe=
rson/show/471</A><BR>Tel : 02 650 57 52<BR><BR><BR>-- <BR>Sent via =
pgsql-sql mailing list (pgsql-sql@[EMAIL PROTECTED]
)<BR>To make changes to =
your subscription:<BR><A
href=3D"http://www.postgresql.org/mailpref/pgsql-s=
ql">http://www.postgresql.org/mailpref/pgsql-sql</A><BR></DIV></BODY></HTML=
>

--=__PartB1972271.0__=--
 




 1 Posts in Topic:
Re: in() VS exists()
Bart.Degryse@[EMAIL PROTE  2008-03-13 15:12:49 

Post A Reply:
  Go here to Signup

AddThis Feed Button


About - Advertising - Contact - Frequently Asked Questions - Privacy Policy - Terms of Use - Signup

Contact
tan12V112 Mon Dec 1 21:36:11 CST 2008.