Tom,
Thank you for your prompt reply.
On Tue, Apr 29, 2008 at 10:19 PM, Tom Lane <tgl@[EMAIL PROTECTED]> wrote:
> Len Shapiro <len@[EMAIL PROTECTED]> writes:
> > 1. Why does Postgres come up with a negative n_distinct?
>
> It's a fractional representation. Per the docs:
>
> > stadistinct float4 The number of distinct nonnull data
> > values in the column. A value greater than zero is the actual number of
> > distinct values. A value less than zero is the negative of a fraction of
> > the number of rows in the table (for example, a column in which values
> > appear about twice on the average could be represented by stadistinct =
> > -0.5). A zero value means the number of distinct values is unknown.
I asked about n_distinct, whose documentation reads in part "The
negated form is used when ANALYZE believes that the number of distinct
values is likely to increase as the table grows", and I asked why
ANALYZE believes that the number of distinct values is likely to
increase. I'm unclear why you quoted the documentation on
stadistinct to me.
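For reference, the sign convention in the quoted documentation can be written out as a small Python sketch. The function name and arguments are hypothetical, chosen only for illustration; this is not code from the PostgreSQL source, just a restatement of the rule as quoted.

```python
def distinct_count(stadistinct: float, row_count: float) -> float:
    """Turn a stadistinct-style value into an estimated number of distinct values.

    Hypothetical helper: it only restates the documented sign convention.
    """
    if stadistinct > 0:
        # Positive: the value is the actual number of distinct values.
        return stadistinct
    if stadistinct < 0:
        # Negative: the negative of a fraction of the table's row count,
        # so the estimate scales as the table grows.
        return -stadistinct * row_count
    # Zero: the number of distinct values is unknown.
    raise ValueError("number of distinct values is unknown")
```

With stadistinct = -0.5 and 1000 rows this gives 500 distinct values, which matches the "values appear about twice on the average" example in the quoted text.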
>
>
> > The "rows=2" estimate makes sense when const = 1 or 5, but it makes no
> > sense to me for other values of const not in the MCV list.
> > For example, if I run the query
> > EXPLAIN SELECT * from sailors where rank = -1000;
> > Postgres still gives an estimate of "rows=2".
>
> I'm not sure what estimate you'd expect instead?
I would expect an estimate of "rows=0" for values of const
that are neither in the MCV list nor in the histogram. When the
histogram has fewer than the maximum number of entries, implying (I am
guessing here) that all non-MCV values are in the histogram list, this
seems like a simple strategy and has the virtue of being accurate.
Where in the source is the code that manipulates the histogram?
> The code has a built in
> assumption that no value not present in the MCV list can be more
> frequent than the last member of the MCV list, so it's definitely not
> gonna guess *more* than 2.
That's interesting. Where is this in the source code?
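As a rough illustration of the assumption described above, here is a minimal Python sketch of an estimator that spreads the non-MCV rows evenly over the remaining distinct values and then caps the result at the frequency of the last MCV entry. The function name, its parameters, and the even-spread step are assumptions made for the example; this is not taken from the PostgreSQL source and the real planner may differ in detail.

```python
def estimate_non_mcv_rows(row_count, null_frac, mcv_freqs, n_distinct):
    """Estimate rows matching a constant that is NOT in the MCV list.

    Hypothetical sketch:
      row_count  - estimated number of rows in the table
      null_frac  - fraction of rows that are NULL in the column
      mcv_freqs  - MCV frequencies, most common first
      n_distinct - estimated number of distinct non-null values
    """
    # Fraction of rows not accounted for by NULLs or by the MCV list.
    remaining = 1.0 - sum(mcv_freqs) - null_frac
    # Assume the remaining rows are spread evenly over the other values.
    other_distinct = n_distinct - len(mcv_freqs)
    selectivity = remaining / other_distinct if other_distinct > 1 else remaining
    # The cap: a value outside the MCV list is assumed to be no more
    # frequent than the last (least common) member of the MCV list.
    if mcv_freqs:
        selectivity = min(selectivity, mcv_freqs[-1])
    return selectivity * row_count
```

With made-up statistics (100 rows, no NULLs, MCV frequencies 0.5 and 0.02, five distinct values), the even spread alone would give 16 rows, but the cap brings the estimate down to 2. An estimator of this shape never consults the constant itself, which is why it cannot return 0 for a value such as -1000.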
Thanks for all your help.
All the best,
Len Shapiro
> regards, tom lane
>