Bob Badour wrote:
> Marshall Spight wrote:
> >>Obviously, missing information is a difficult problem no matter what
> >>data model one uses. We currently have no theory regarding missing
> >>information which means we have no theory to overcome the practical
> >>problem in any data model.
> >
> > Well, I would propose that we understand cardinality-0 relations
> > pretty well, and I think that provides a lot of value as far as
> > a theoretical basis goes. (This doesn't help us when using SQL,
> > though.)
>
> I think you overstate the value of empty sets. The closed world
> assumption limits the usefulness of empty sets with respect to missing
> information.
>
> The most common situation where people demand NULL is the situation
> where we know a true statement exists for some value of an attribute,
> but we do not know for which value.
>
> One proposed solution for modelling such a situation is to replace the
> simple valued attribute that is possibly unknown with a relation valued
> attribute having an empty candidate key. Then, when the value is known,
> the relation valued attribute will have a single tuple with the known
> value, and when the value is unknown, the relation valued attribute will
> have zero tuples.
Yes; this is a fine technique. In fact, to me this technique is one of
the prime motivations for RVAs.
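
As a rough illustration of the technique, here is a minimal Python sketch. The `Employee` type, the `phone` attribute, and the `known_phones` helper are all hypothetical names invented for this example. The relation-valued attribute is modelled as a frozenset of 1-tuples, and the empty candidate key as a check that it holds at most one tuple.

```python
from dataclasses import dataclass
from typing import FrozenSet


@dataclass(frozen=True)
class Employee:
    name: str
    # Relation-valued attribute: a set of 1-tuples holding the phone number.
    # Zero tuples means no value is recorded; one tuple holds the known value.
    phone: FrozenSet[tuple] = frozenset()

    def __post_init__(self):
        # An empty candidate key permits at most one tuple in the relation.
        if len(self.phone) > 1:
            raise ValueError("empty candidate key: at most one tuple allowed")


def known_phones(employees):
    """Unnest the RVA; employees with zero tuples simply drop out."""
    return [(e.name, number) for e in employees for (number,) in e.phone]


staff = [
    Employee("Alice", frozenset({("555-0100",)})),  # value known
    Employee("Bob"),                                # zero tuples
]

print(known_phones(staff))
```

Note that unnesting needs no special-case logic for the missing value: the empty relation just contributes no rows.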
> But if we accept the closed world assumption, the cardinality-0 relation
> says we know that no instances of the predicate are true. This, of
> course, contradicts the initial condition where we know some value
> satisfies the predicate, but we do not know which value.
>
> And if we use this trick to model the missing information, how do we
> distinguish it from the situation where we genuinely know that no value
> satisfies the predicate?
If we have a requirement to distinguish between exists-but-we-don't-know
and doesn't-exist, then the cardinality-0 model is insufficient. But
there are also cases where we won't need to so distinguish, and cases
where doesn't-exist is not a possibility. In those cases, I would
propose that cardinality-0 is often a better choice than SQL's NULL.
I don't think it's a complete solution, though. I think some kind of
sum type, as in SML, is also quite desirable. This lets the
modeller create special values according to the requirements
of the domain.
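
A minimal sketch of what such a sum type might look like, written in Python rather than SML. The variant names `Known`, `Unknown`, and `NotApplicable` and the `describe` function are assumptions made up for this example. They are chosen to separate exists-but-we-don't-know from doesn't-exist, and a different domain might need different variants.

```python
from dataclasses import dataclass
from typing import Union


@dataclass(frozen=True)
class Known:
    """The value is known."""
    value: str


@dataclass(frozen=True)
class Unknown:
    """A value exists, but we do not know which one."""


@dataclass(frozen=True)
class NotApplicable:
    """We know that no value exists."""


# The sum type: exactly one of the three variants.
Phone = Union[Known, Unknown, NotApplicable]


def describe(phone: Phone) -> str:
    # Each variant is handled explicitly; nothing is silently skipped.
    if isinstance(phone, Known):
        return f"phone is {phone.value}"
    if isinstance(phone, Unknown):
        return "has a phone, number not known"
    if isinstance(phone, NotApplicable):
        return "has no phone"
    raise TypeError(f"unexpected variant: {phone!r}")


for p in (Known("555-0100"), Unknown(), NotApplicable()):
    print(describe(p))
```

Unlike SQL's NULL, each special value here is an ordinary value of the type: `Unknown() == Unknown()` is simply true, and the modeller decides which variants exist.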
> I really see no way around knowing all of the requirements and
> reluctantly choosing the least among evils for each individual
> situation. I admit this is ad hoc and risky, but I find it less risky
> and no more ad hoc than the known alternatives.
I agree.
To clarify, I am certainly *not* proposing that there is any
shortcut around the hard work of building a good model
for the domain in question.
Marshall