"Ed Prochak" <edprochak@[EMAIL PROTECTED]
> wrote in message
news:cb72cff3-17c7-4bf9-b61d-bb5c0b268a41@[EMAIL PROTECTED]
> On Jun 27, 7:59 am, "Roy Hann" <specia...@[EMAIL PROTECTED]
>
> wrote:
> > "Ed Prochak" <edproc...@[EMAIL PROTECTED]
> wrote in message
> >
> >
news:4b1507c9-207e-4555-bd27-42dd0e84de9d@[EMAIL PROTECTED]
> >
> > > On Jun 26, 12:31 pm, "David Cressey" <cresse...@[EMAIL PROTECTED]
> wrote:
> > > When I see
> > > SELECT DISTINCT
> > > in anything but an ad hoc query, I know something is wrong with the
> > > query.
> >
> > Eh?! How can you possibly "know"any such thing? Projecting away
> > distingui****ng columns is perfectly respectable, desirable and
necessary.
> > And since SQL won't discrard the resulting duplicates automatically
(unless,
> > bizarrely, it is in a UNION) we need to tell it to do so explicitly.
> >
> > > Possibly it is joining to a wrong table (to a detail table
> > > instead of a header/parent table for example) or some filtering
> > > condition is missed (a column not in the select list needs to be
> > > used), or some join condition was missed (there is a compound key
and
> > > one component was left out). Even in an ad hoc query I tend to use
> > > SELECT COUNT(*)
> > > since that gives me a little bit more information for the same cost
as
> > > DISTINCT (i.e. it reads the same amount of data).
> >
> > > Seeing this post, I'm actually a little disappointed in you, David.
> > > Your last sentence does show you mainly follow a good approach. It's
> > > just that DISTINCT should be left to ad hoc queries IMO.
> >
> > This is not a matter to be decided on the basis of mere opinion.
> >
> > Roy
>
> Keep the context in mind. David was noting cases where he was asked to
> <quote> help with a query
> that produces wrong results, and when I try to change their "select"
> to
> "select distinct" to see if maybe that's the problem </quote>
>
> I am saying DISTINCT should be one of the last things tried in this
> situation. I usually begin by asking the programmer: so exactly what
> is wrong with the results? It is easier to debug code when you know
> what the result should be.
You weren't talking to the programmers I was talking to.
>
> But yes I stick to my opinion that DISTINCT is seldom needed in
> production queries. Right now I cannot think of any case where I
> needed it in a production application.
>
There are a large body of queries where SELECT and SELECT DISTINCT can be
shown to necessarily produce identical results. I can't help wi****ng that
the optimizer could figure it out. But that's probably way down on the
list
of things an optimizer should be able to do.
I will strongly suggest to you that SELECT DISTINCT is more faithful to
the
relational data model than SELECT ALL is. I also think that it's wierd
that
UNION defaulted to UNION DISTINCT instead of UNION ALL.
I stand by my original comment. If I can show that SELECT DISTINCT
produces
right results where SELECT (meaning SELECT ALL) produces wrong results
then
the programmer and I have saved about a half an hour of discussing "what
are
you really trying to do?".
Sometimes the DISTINCT feature had to be added to a subquery. That gets
really interesting.
And I didn't always do this when consulted by a programmer. Only when
something gave me a hunch that the programmer was thyinking in terms of
sets
while SQL was thinking in terms of bags.


|