MESSAGE
DATE | 2015-02-26 |
FROM | Ruben Safir
|
SUBJECT | Subject: [LIU Comp Sci] [mrbrklyn@panix.com: (fwd) Re: Role of functional dependencies in
|
From owner-learn-outgoing-at-mrbrklyn.com Thu Feb 26 07:09:42 2015 Return-Path: X-Original-To: archive-at-mrbrklyn.com Delivered-To: archive-at-mrbrklyn.com Received: by mrbrklyn.com (Postfix) id 68640161174; Thu, 26 Feb 2015 07:09:42 -0500 (EST) Delivered-To: learn-outgoing-at-mrbrklyn.com Received: by mrbrklyn.com (Postfix, from userid 28) id 4DDB9161186; Thu, 26 Feb 2015 07:09:42 -0500 (EST) Delivered-To: learn-at-nylxs.com Received: from mailbackend.panix.com (mailbackend.panix.com [166.84.1.89]) by mrbrklyn.com (Postfix) with ESMTP id 8873A161174 for ; Thu, 26 Feb 2015 07:09:41 -0500 (EST) Received: from panix2.panix.com (panix2.panix.com [166.84.1.2]) by mailbackend.panix.com (Postfix) with ESMTP id 6124F1073A for ; Thu, 26 Feb 2015 07:09:41 -0500 (EST) Received: by panix2.panix.com (Postfix, from userid 20529) id 5413733C81; Thu, 26 Feb 2015 07:09:41 -0500 (EST) Date: Thu, 26 Feb 2015 07:09:41 -0500 From: Ruben Safir To: learn-at-nylxs.com Subject: [LIU Comp Sci] [mrbrklyn-at-panix.com: (fwd) Re: Role of functional dependencies in database design] Message-ID: <20150226120941.GB7105-at-panix.com> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline User-Agent: Mutt/1.5.23 (2014-03-12) Sender: owner-learn-at-mrbrklyn.com Precedence: bulk Reply-To: learn-at-mrbrklyn.com
----- Forwarded message from Ruben Safir -----
Date: Thu, 26 Feb 2015 07:08:29 -0500 (EST) From: Ruben Safir To: mrbrklyn-at-panix.com Subject: (fwd) Re: Role of functional dependencies in database design User-Agent: tin/2.2.1-20140504 ("Tober an Righ") (UNIX) (NetBSD/6.1.5 (i386))
-- forwarded message -- X-Received: by 10.182.126.168 with SMTP id mz8mr8530830obb.9.1424412973765; Thu, 19 Feb 2015 22:16:13 -0800 (PST) X-Received: by 10.50.1.113 with SMTP id 17mr278751igl.8.1424412973677; Thu, 19 Feb 2015 22:16:13 -0800 (PST) Path: reader1.panix.com!panix!news.linkpendium.com!news.linkpendium.com!news.glorb.com!hl2no19617434igb.0!news-out.google.com!db6ni32977igc.0!nntp.google.com!hl2no19617425igb.0!postnews.google.com!glegroupsg2000goo.googlegroups.com!not-for-mail Newsgroups: comp.databases.theory Date: Thu, 19 Feb 2015 22:16:12 -0800 (PST) In-Reply-To: Complaints-To: groups-abuse-at-google.com Injection-Info: glegroupsg2000goo.googlegroups.com; posting-host=24.68.153.253; posting-account=FUDi7AoAAACOn42BTSjw9ZlWNiwGl2Nx NNTP-Posting-Host: 24.68.153.253 References: User-Agent: G2/1.0 MIME-Version: 1.0 Message-ID: <80f34c94-e62e-4d02-8244-7e928acdf476-at-googlegroups.com> Subject: Re: Role of functional dependencies in database design From: compdb-at-hotmail.com Injection-Date: Fri, 20 Feb 2015 06:16:13 +0000 Content-Type: text/plain; charset=ISO-8859-1 Content-Transfer-Encoding: quoted-printable Lines: 96 Xref: panix comp.databases.theory:75563
On Thursday, February 12, 2015 at 8:02:19 AM UTC-8, Nicola wrote:
> So, the question is: what is the role of FDs (and possibly other types of= dependencies) in database design?
We determine predicates that describe application situations. Basic predica= tes correspond to base tables. A FD holds when a certain constraint predica= te built from a table predicate holds. (A JD corresponds to being able to w= rite a table predicate as a conjunction of others. A FD corresponds to a ce= rtain JD.)
> In general, I don't think that it is possible for them to inform the desi= gn activity since the early stages or that they should be used as the main = design tool.
As soon as you have a predicate you can determine FDs.
> > Erwin wrote: > >There you have it. When people tell you "there can only be one can=20 > >opener on any given desert island", then what do you see ? > > > >A table {desert island, can opener} with (a.o.) a key "desert island",= =20 > >or > >A relation schema of whatever set of attributes in which you must add=20 > >the FD desert island -> can opener ? Where this then tells you "the=20 > >identity of a desert island is a determinant factor to the identity of= =20 > >the can opener that is on it" or "the relation over {desert island, can= =20 > >opener} is a function from desert island to can opener" ?
You don't need keys or entities, you just need predicates and determinants.
"If desert island I has opener Oa and desert island I has opener Ob then Oa= =3D Ob."
Ergo I -> O in "desert island I has opener O".
> At an early stage, you probably cannot say: it might also be that desert = island and can opener should be represented as two relation schemas (for w= hich we have identified no attributes yet) related through a one-one or one= -many relationship. My point here is simply that FDs are not invented, they= are discovered.
First *say something* then you can find the FDs. There is no "one-one or o= ne-many relationship" until then. First comes *a particular* relationship i= e predicate then come parameters/attributes and then arities and other cons= traints.
> >> FDs are a special type of first-order formulas. I can imagine defining= an English-like syntax for them. >=20 > (Not that it would necessarily be a good idea, I should add.) >=20 > >Hmmmmmmmm.
FD F1,... -> T holds in predicate R(F1,...,T,...) when (for all situations)= :
FORALL F1,...,Ta,...,Tb,... R(F1,...,Ta,...) AND R(F1,...,Tb,...) =3D> Ta =3D Tb
Notice that the determinant variables are unchanged between atoms, that the= re are two determined variables for one value, and that the other variables= are don't-cares.
We can describe this pattern. Eg an FD holds "when each satisfying determin= ant subtuple has just one determined value". But all we have to do is *use*= it with each predicate.
> given two lecturers L1 and L2 and a time T > when L1 is teaching at time T and L2 is teaching at time T > then L1 and L2 are in different rooms
FORALL L1,L2,T,R1,R2 teaches(L1,T,R1) AND teaches(L2,T,R2) =3D> R1 <> R2
> (this is absolutely off the top of my head: don't take it as a serious at= tempt).
No FD.
For "then it was the same room", T -> R. For "then it was the same lecturer= ", T -> L.
> >And besides, avoiding redundancy was only a relevant topic in database= =20 > >design as long as there were no feasible ways to control the=20 > >redundancies, e.g. via ASSERTIONs. Those days are gone.
But predicates, queries and updates are simpler in 5NF, ie when the only JD= s are implied by the CKs.
philip -- end of forwarded message --
----- End forwarded message ----- ----- Forwarded message from Ruben Safir -----
Date: Thu, 26 Feb 2015 07:08:29 -0500 (EST) From: Ruben Safir To: mrbrklyn-at-panix.com Subject: (fwd) Re: Role of functional dependencies in database design User-Agent: tin/2.2.1-20140504 ("Tober an Righ") (UNIX) (NetBSD/6.1.5 (i386))
-- forwarded message -- X-Received: by 10.236.11.193 with SMTP id 41mr8795080yhx.53.1424424172307; Fri, 20 Feb 2015 01:22:52 -0800 (PST) X-Received: by 10.140.28.11 with SMTP id 11mr170251qgy.21.1424424172287; Fri, 20 Feb 2015 01:22:52 -0800 (PST) Path: reader1.panix.com!panix!bloom-beacon.mit.edu!bloom-beacon.mit.edu!newsswitch.lcs.mit.edu!ottix-news.ottix.net!border1.nntp.dca1.giganews.com!nntp.giganews.com!j7no1842033qaq.1!news-out.google.com!n6ni12qar.0!nntp.google.com!i13no2515482qae.0!postnews.google.com!glegroupsg2000goo.googlegroups.com!not-for-mail Newsgroups: comp.databases.theory Date: Fri, 20 Feb 2015 01:22:52 -0800 (PST) In-Reply-To: Complaints-To: groups-abuse-at-google.com Injection-Info: glegroupsg2000goo.googlegroups.com; posting-host=212.123.30.210; posting-account=-nQufgoAAABsreOCZNqo2Uyh8O-fYVPT NNTP-Posting-Host: 212.123.30.210 References: <80f34c94-e62e-4d02-8244-7e928acdf476-at-googlegroups.com> User-Agent: G2/1.0 MIME-Version: 1.0 Message-ID: <64b9cd98-8a5e-4f95-a805-a8ab8c80ed70-at-googlegroups.com> Subject: Re: Role of functional dependencies in database design From: Erwin Injection-Date: Fri, 20 Feb 2015 09:22:52 +0000 Content-Type: text/plain; charset=ISO-8859-1 Lines: 56 Xref: panix comp.databases.theory:75566
Op vrijdag 20 februari 2015 10:08:58 UTC+1 schreef Nicola: > In article <>, > pk wrote: > > > > given two lecturers L1 and L2 and a time T > > > when L1 is teaching at time T and L2 is teaching at time T > > > then L1 and L2 are in different rooms > > > > FORALL L1,L2,T,R1,R2 > > teaches(L1,T,R1) AND teaches(L2,T,R2) => R1 <> R2 > > I think that it should be: > > FORALL L1,L2,T,R1,R2 > L1 <> L2 AND teaches(L1,T,R1) AND teaches(L2,T,R2) => R1 <> R2 > > > > (this is absolutely off the top of my head: don't take it as a serious > > > attempt). > > > > No FD. > > Correct. The FD is implied, though: > > Let X be L1 <> L2 > Let Y be teaches(L1,T,R1) AND teaches(L2,T,R2) > Let Z be R1 <> R2 > > Then the sentence above is the universal closure of > > X AND Y -> Z, > > If my coffee has worked as it should, the latter is equivalent to > > (NOT Z) AND Y -> NOT X, > > that is, > > R1 = R2 AND teaches(L1,T,R1) AND teaches(L2,T,R2) -> L1 = L2, > > or, simplifying, > > teaches(L1,T,R) AND teaches(L2,T,R) -> L1 = L2. > > The idea behind my example is that a hypothetical formalism based on > natural language might allow the user to write specifications that are > not a direct translation of the logical definition of a FD, but from > which the system might *infer* a FD, giving the user more freedom in the > way constraints may be expressed. But again, I haven't really thought > about it too much. > > Nicola > > --- news://freenews.netfront.net/ - complaints: ---
Well, even if you can manage to get that formalism fully specced (a tall order imo), the main problem will still remain that if you apply that formalism to any given input, the output produced will only be as reliable as the input it is applied to.
Which is by the way illustrated perfectly by the example used : I carelessly overlooked the possibility of L1=L2, thus neglected to explicitly specify L1<>L2 as a premisse (tacitly assuming it would be tacitly understood), and the rest is history ... -- end of forwarded message --
----- End forwarded message ----- ----- Forwarded message from Ruben Safir -----
Date: Thu, 26 Feb 2015 07:08:29 -0500 (EST) From: Ruben Safir To: mrbrklyn-at-panix.com Subject: (fwd) Re: Role of functional dependencies in database design User-Agent: tin/2.2.1-20140504 ("Tober an Righ") (UNIX) (NetBSD/6.1.5 (i386))
-- forwarded message -- Path: reader1.panix.com!panix!goblin3!goblin.stu.neva.ru!news.netfront.net!nvitacolonna From: Nicola Newsgroups: comp.databases.theory Subject: Re: Role of functional dependencies in database design Date: Fri, 20 Feb 2015 10:08:54 +0100 Organization: --- Lines: 52 Message-ID: References: <80f34c94-e62e-4d02-8244-7e928acdf476-at-googlegroups.com> NNTP-Posting-Host: 158.110.226.55 Mime-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 7bit X-Trace: adenine.netfront.net 1424423336 19422 158.110.226.55 (20 Feb 2015 09:08:56 GMT) X-Complaints-To: news-at-netfront.net NNTP-Posting-Date: Fri, 20 Feb 2015 09:08:56 +0000 (UTC) User-Agent: MT-NewsWatcher/3.5.3b3 (Intel Mac OS X) Xref: panix comp.databases.theory:75565
In article <80f34c94-e62e-4d02-8244-7e928acdf476-at-googlegroups.com>, compdb-at-hotmail.com wrote:
> > given two lecturers L1 and L2 and a time T > > when L1 is teaching at time T and L2 is teaching at time T > > then L1 and L2 are in different rooms > > FORALL L1,L2,T,R1,R2 > teaches(L1,T,R1) AND teaches(L2,T,R2) => R1 <> R2
I think that it should be:
FORALL L1,L2,T,R1,R2 L1 <> L2 AND teaches(L1,T,R1) AND teaches(L2,T,R2) => R1 <> R2 > > (this is absolutely off the top of my head: don't take it as a serious > > attempt). > > No FD.
Correct. The FD is implied, though:
Let X be L1 <> L2 Let Y be teaches(L1,T,R1) AND teaches(L2,T,R2) Let Z be R1 <> R2
Then the sentence above is the universal closure of
X AND Y -> Z,
If my coffee has worked as it should, the latter is equivalent to
(NOT Z) AND Y -> NOT X,
that is,
R1 = R2 AND teaches(L1,T,R1) AND teaches(L2,T,R2) -> L1 = L2,
or, simplifying,
teaches(L1,T,R) AND teaches(L2,T,R) -> L1 = L2.
The idea behind my example is that a hypothetical formalism based on natural language might allow the user to write specifications that are not a direct translation of the logical definition of a FD, but from which the system might *infer* a FD, giving the user more freedom in the way constraints may be expressed. But again, I haven't really thought about it too much. Nicola
--- news://freenews.netfront.net/ - complaints: news-at-netfront.net --- -- end of forwarded message --
----- End forwarded message ----- ----- Forwarded message from Ruben Safir -----
Date: Thu, 26 Feb 2015 07:08:29 -0500 (EST) From: Ruben Safir To: mrbrklyn-at-panix.com Subject: (fwd) Re: Role of functional dependencies in database design User-Agent: tin/2.2.1-20140504 ("Tober an Righ") (UNIX) (NetBSD/6.1.5 (i386))
-- forwarded message -- Path: reader1.panix.com!panix!goblin3!goblin.stu.neva.ru!news.netfront.net!nvitacolonna From: Nicola Newsgroups: comp.databases.theory Subject: Re: Role of functional dependencies in database design Date: Mon, 23 Feb 2015 17:34:06 +0100 Organization: --- Lines: 19 Message-ID: References: <80f34c94-e62e-4d02-8244-7e928acdf476-at-googlegroups.com> <64b9cd98-8a5e-4f95-a805-a8ab8c80ed70-at-googlegroups.com> NNTP-Posting-Host: 158.110.144.157 Mime-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 7bit X-Trace: adenine.netfront.net 1424709247 95448 158.110.144.157 (23 Feb 2015 16:34:07 GMT) X-Complaints-To: news-at-netfront.net NNTP-Posting-Date: Mon, 23 Feb 2015 16:34:07 +0000 (UTC) User-Agent: MT-NewsWatcher/3.5.3b3 (Intel Mac OS X) Xref: panix comp.databases.theory:75571
In article <64b9cd98-8a5e-4f95-a805-a8ab8c80ed70-at-googlegroups.com>, Erwin wrote:
> Well, even if you can manage to get that formalism fully specced (a tall > order imo), the main problem will still remain that if you apply that > formalism to any given input, the output produced will only be as reliable as > the input it is applied to.
That is the unsolvable problem of every specification: you may be guaranteed that the output is correct wrt the specs, but how do you ensure that the specs are correct? In principle, they should be easier to write and understand than the system you want to verify, but it doesn't mean that they do not require a good deal of thought and technical competence. Still, formal specifications are arguably useful: safety-critical software would not be as reliable, for example.
Nicola
--- news://freenews.netfront.net/ - complaints: news-at-netfront.net --- -- end of forwarded message --
----- End forwarded message ----- ----- Forwarded message from Ruben Safir -----
Date: Thu, 26 Feb 2015 07:08:28 -0500 (EST) From: Ruben Safir To: mrbrklyn-at-panix.com Subject: (fwd) Role of functional dependencies in database design User-Agent: tin/2.2.1-20140504 ("Tober an Righ") (UNIX) (NetBSD/6.1.5 (i386))
-- forwarded message -- Path: reader1.panix.com!panix!goblin3!goblin.stu.neva.ru!news.netfront.net!nvitacolonna From: Nicola Newsgroups: comp.databases.theory Subject: Role of functional dependencies in database design Date: Thu, 12 Feb 2015 17:02:14 +0100 Organization: --- Lines: 74 Message-ID: NNTP-Posting-Host: 158.110.144.157 Mime-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 7bit X-Trace: adenine.netfront.net 1423756936 48464 158.110.144.157 (12 Feb 2015 16:02:16 GMT) X-Complaints-To: news-at-netfront.net NNTP-Posting-Date: Thu, 12 Feb 2015 16:02:16 +0000 (UTC) User-Agent: MT-NewsWatcher/3.5.3b3 (Intel Mac OS X) Xref: panix comp.databases.theory:75523
I hope you won't mind if I start a new thread for the subject. The other thread has got quite long and somewhat off-topic.
So, the question is: what is the role of FDs (and possibly other types of dependencies) in database design? My personal take:
In general, I don't think that it is possible for them to inform the design activity since the early stages or that they should be used as the main design tool. I do think, however, that when a system is understood sufficiently well (e.g., by means of a conceptual design), they offer an aid (are my schemas well-designed according to certain criteria?) and insight (allowing us to possibly find alternative ways to design a database, for example).
>> Nicola wrote: >> FDs do not come out of void, they follow from your requirements. They >> are just a formal version of a very special (and relatively simple) >> type >> of requirement. I don't arbitrarily decide to assume that {desert >> island} -> {can opener}, unless you tell me that there cannot be two >> can >> openers in the same desert island, in the world of desert islands you >> are interested in.
> Erwin wrote: >There you have it. When people tell you "there can only be one can >opener on any given desert island", then what do you see ? > >A table {desert island, can opener} with (a.o.) a key "desert island", >or >A relation schema of whatever set of attributes in which you must add >the FD desert island -> can opener ? Where this then tells you "the >identity of a desert island is a determinant factor to the identity of >the can opener that is on it" or "the relation over {desert island, can >opener} is a function from desert island to can opener" ?
At an early stage, you probably cannot say: it might also be that desert island and can opener should be represented as two relation schemas (for which we have identified no attributes yet) related through a one-one or one-many relationship. My point here is simply that FDs are not invented, they are discovered.
>> FDs are a special type of first-order formulas. I can imagine >> defining >> an English-like syntax for them.
(Not that it would necessarily be a good idea, I should add.)
>Hmmmmmmmm. > >"The projection on {desert island, can opener} of the join of all >tables in the database, represents a function from desert island to can >opener."
Well no, like this certainly no :) Maybe something along these lines (paraphrasing the given-when-then constructs existing in some testing tools):
given two lecturers L1 and L2 and a time T when L1 is teaching at time T and L2 is teaching at time T then L1 and L2 are in different rooms
(this is absolutely off the top of my head: don't take it as a serious attempt).
>And besides, avoiding redundancy was only a relevant topic in database >design as long as there were no feasible ways to control the >redundancies, e.g. via ASSERTIONs. Those days are gone.
Could you provide a small example?
Nicola
--- news://freenews.netfront.net/ - complaints: news-at-netfront.net --- -- end of forwarded message --
----- End forwarded message ----- ----- Forwarded message from Ruben Safir -----
Date: Thu, 26 Feb 2015 07:08:28 -0500 (EST) From: Ruben Safir To: mrbrklyn-at-panix.com Subject: (fwd) Re: Role of functional dependencies in database design User-Agent: tin/2.2.1-20140504 ("Tober an Righ") (UNIX) (NetBSD/6.1.5 (i386))
-- forwarded message -- X-Received: by 10.43.91.68 with SMTP id bl4mr8947228icc.17.1423769980808; Thu, 12 Feb 2015 11:39:40 -0800 (PST) X-Received: by 10.140.83.41 with SMTP id i38mr78531qgd.11.1423769980673; Thu, 12 Feb 2015 11:39:40 -0800 (PST) Path: reader1.panix.com!panix!usenet.stanford.edu!hl2no15126234igb.0!news-out.google.com!n6ni6qar.0!nntp.google.com!j7no154482qaq.1!postnews.google.com!glegroupsg2000goo.googlegroups.com!not-for-mail Newsgroups: comp.databases.theory Date: Thu, 12 Feb 2015 11:39:40 -0800 (PST) In-Reply-To: Complaints-To: groups-abuse-at-google.com Injection-Info: glegroupsg2000goo.googlegroups.com; posting-host=213.219.142.118; posting-account=-nQufgoAAABsreOCZNqo2Uyh8O-fYVPT NNTP-Posting-Host: 213.219.142.118 References: User-Agent: G2/1.0 MIME-Version: 1.0 Message-ID: <890a51ff-e03f-48a2-ab1d-3c301ae8b501-at-googlegroups.com> Subject: Re: Role of functional dependencies in database design From: Erwin Injection-Date: Thu, 12 Feb 2015 19:39:40 +0000 Content-Type: text/plain; charset=ISO-8859-1 Content-Transfer-Encoding: quoted-printable Lines: 30 Xref: panix comp.databases.theory:75526
Op donderdag 12 februari 2015 19:58:19 UTC+1 schreef Erwin: > Op donderdag 12 februari 2015 17:02:19 UTC+1 schreef Nicola: > > Well no, like this certainly no :) Maybe something along these lines=20 > > (paraphrasing the given-when-then constructs existing in some testing= =20 > > tools): > >=20 > > given two lecturers L1 and L2 and a time T > > when L1 is teaching at time T and L2 is teaching at time T > > then L1 and L2 are in different rooms >=20 > ... convoluted ...
I now see why I feel so uncomfortable with this formulation, and why it doe= sn't reveal the real FD right away.
First, yet another alternative formulation of the same.
Given two lecturers L1 and L2 and a room R when L1 is teaching in room R and L2 is teaching in room R then this is at different times.
The FD is {room, time} on the LHS. room and time are of equal "importance"= here, they play "similar roles". But now look at the word usage for "room= " and "time". Your original speaks of "different rooms" but _NOT_ of diffe= rent times (quite the contrary) !
The _formulation_ in natural language "breaks" a certain kind of "symmetry"= that is indeed present in the math formulation. That makes the discovery/= analysis process harder ! -- end of forwarded message --
----- End forwarded message ----- ----- Forwarded message from Ruben Safir -----
Date: Thu, 26 Feb 2015 07:08:28 -0500 (EST) From: Ruben Safir To: mrbrklyn-at-panix.com Subject: (fwd) Re: Role of functional dependencies in database design User-Agent: tin/2.2.1-20140504 ("Tober an Righ") (UNIX) (NetBSD/6.1.5 (i386))
-- forwarded message -- X-Received: by 10.50.29.6 with SMTP id f6mr4383098igh.2.1423767497936; Thu, 12 Feb 2015 10:58:17 -0800 (PST) X-Received: by 10.140.40.242 with SMTP id x105mr74619qgx.14.1423767497801; Thu, 12 Feb 2015 10:58:17 -0800 (PST) Path: reader1.panix.com!panix!usenet.stanford.edu!hl2no15102169igb.0!news-out.google.com!n6ni6qar.0!nntp.google.com!j7no145566qaq.1!postnews.google.com!glegroupsg2000goo.googlegroups.com!not-for-mail Newsgroups: comp.databases.theory Date: Thu, 12 Feb 2015 10:58:17 -0800 (PST) In-Reply-To: Complaints-To: groups-abuse-at-google.com Injection-Info: glegroupsg2000goo.googlegroups.com; posting-host=213.219.142.118; posting-account=-nQufgoAAABsreOCZNqo2Uyh8O-fYVPT NNTP-Posting-Host: 213.219.142.118 References: User-Agent: G2/1.0 MIME-Version: 1.0 Message-ID: Subject: Re: Role of functional dependencies in database design From: Erwin Injection-Date: Thu, 12 Feb 2015 18:58:17 +0000 Content-Type: text/plain; charset=ISO-8859-1 Content-Transfer-Encoding: quoted-printable Lines: 145 Xref: panix comp.databases.theory:75524
Op donderdag 12 februari 2015 17:02:19 UTC+1 schreef Nicola: > I hope you won't mind if I start a new thread for the subject. The other= =20 > thread has got quite long and somewhat off-topic. >=20 > So, the question is: what is the role of FDs (and possibly other types=20 > of dependencies) in database design? My personal take: >=20 > In general, I don't think that it is possible for them to inform the=20 > design activity since the early stages or that they should be used as=20 > the main design tool. I do think, however, that when a system is=20 > understood sufficiently well (e.g., by means of a conceptual design),=20 > they offer an aid (are my schemas well-designed according to certain=20 > criteria?) and insight (allowing us to possibly find alternative ways to= =20 > design a database, for example).
The problem is that formally speaking, you must first have that conceptual = design (entities+relships) translated into a logical one (relvars). Only a= fter that can you start thinking about FDs and normalization, and it requir= es that you somehow derived the FDs from, well what else other than the con= ceptual schema you have, perhaps incremented with a pack of tacit understan= dings of what it represents.
And what in a conceptual schema is it that you can get the FDs from ? NOT = from any key specifications that are already there, because if you already = have them, it's pointless deriving FDs from them only to find out that norm= alization theory gives you exactly the keys you started off with.
The normalization procedure/process as rooted in normalization theory, assu= mes a bottom-up methodology. First assume the database is a single giganti= c table, then start decomposing. The trouble is that most people intuitive= ly prefer top-down, in practice they also do it, and I have certainly known= a time when they got things close to 99% right without any application of = any kind of normalization theory.
>=20 > >> Nicola wrote:=20 > >> FDs do not come out of void, they follow from your requirements. They= =20 > >> are just a formal version of a very special (and relatively simple)=20 > >> type=20 > >> of requirement. I don't arbitrarily decide to assume that {desert=20 > >> island} -> {can opener}, unless you tell me that there cannot be two= =20 > >> can=20 > >> openers in the same desert island, in the world of desert islands you= =20 > >> are interested in. >=20 > > Erwin wrote: > >There you have it. When people tell you "there can only be one can=20 > >opener on any given desert island", then what do you see ? > > > >A table {desert island, can opener} with (a.o.) a key "desert island",= =20 > >or > >A relation schema of whatever set of attributes in which you must add=20 > >the FD desert island -> can opener ? Where this then tells you "the=20 > >identity of a desert island is a determinant factor to the identity of= =20 > >the can opener that is on it" or "the relation over {desert island, can= =20 > >opener} is a function from desert island to can opener" ? >=20 > At an early stage, you probably cannot say: it might also be that desert= =20 > island and can opener should be represented as two relation schemas (for= =20 > which we have identified no attributes yet) related through a one-one or= =20 > one-many relationship. My point here is simply that FDs are not=20 > invented, they are discovered.
My question, and what I would like to understand is : what exactly is the a= vailable input (its semantics and in which possible syntactic forms does it= appear, and where) that allows this "discovery" to be made ? I think JKL'= s post where he used the word "pesky" expressed the very same concern.
>=20 > >> FDs are a special type of first-order formulas. I can imagine >>=20 > defining=20 > >> an English-like syntax for them. >=20 > (Not that it would necessarily be a good idea, I should add.) >=20 > >Hmmmmmmmm. > > > >"The projection on {desert island, can opener} of the join of all=20 > >tables in the database, represents a function from desert island to can= =20 > >opener." >=20 > Well no, like this certainly no :) Maybe something along these lines=20 > (paraphrasing the given-when-then constructs existing in some testing=20 > tools): >=20 > given two lecturers L1 and L2 and a time T > when L1 is teaching at time T and L2 is teaching at time T > then L1 and L2 are in different rooms
Well, there you have it. If you think that this sentence indicates an FD {= L,T} -> {room}, then you are wrong ! This statement per se does not prohib= it the same lecturer being in two distinct rooms at the same time. (Physic= s does, perhaps, but that really is another issue. My issue is correct log= ical inferences from given requirements as stated.)
The correct conclusion here is the FD {room,T} -> {L}. But before spotting= that, I first had to twist the requirement into "only one lecturer can be = in any given room at any time". Or "if two observations of a lecturer bein= g in a room at time T, both involve the same room and the same time T, then= they involve the same lecturer". But (knowing what the FD turned out to b= e and looking back on the natural language formulation) I find those altern= atives only slightly less convoluted than your original.
(Your example being "temporal" is very tempting to go drifting off into the= temporal arena. In fact, the temporal proposals by Darwen/Date/Lorentzos = talk about keys almost all the time, but normalization theory isn't mention= ed even once. My feeling is the current normalization theory is insufficie= nt to underpin their approach soundly, and although I very much like their = approach altogether, I'm amazed I never see any complaint about this.)
(...leaving the ASSERTIONs example for another post)
>=20 > (this is absolutely off the top of my head: don't take it as a serious=20 > attempt). >=20 > >And besides, avoiding redundancy was only a relevant topic in database= =20 > >design as long as there were no feasible ways to control the=20 > >redundancies, e.g. via ASSERTIONs. Those days are gone. >=20 > Could you provide a small example? >=20 > Nicola >=20 > --- news://freenews.netfront.net/ - complaints: ---
-- end of forwarded message --
----- End forwarded message ----- ----- Forwarded message from Ruben Safir -----
Date: Thu, 26 Feb 2015 07:08:28 -0500 (EST) From: Ruben Safir To: mrbrklyn-at-panix.com Subject: (fwd) Re: Role of functional dependencies in database design User-Agent: tin/2.2.1-20140504 ("Tober an Righ") (UNIX) (NetBSD/6.1.5 (i386))
-- forwarded message -- Path: reader1.panix.com!panix!goblin2!goblin.stu.neva.ru!newsgate.cuhk.edu.hk!news.netfront.net!nvitacolonna From: Nicola Newsgroups: comp.databases.theory Subject: Re: Role of functional dependencies in database design Date: Thu, 12 Feb 2015 22:43:20 +0100 Organization: --- Lines: 68 Message-ID: References: NNTP-Posting-Host: 151.49.175.176 Mime-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 7bit X-Trace: adenine.netfront.net 1423777402 54438 151.49.175.176 (12 Feb 2015 21:43:22 GMT) X-Complaints-To: news-at-netfront.net NNTP-Posting-Date: Thu, 12 Feb 2015 21:43:22 +0000 (UTC) User-Agent: MT-NewsWatcher/3.5.3b3 (Intel Mac OS X) Xref: panix comp.databases.theory:75527
In article , Erwin wrote:
> My question, and what I would like to understand is : what exactly is the > available input (its semantics and in which possible syntactic forms does it > appear, and where) that allows this "discovery" to be made ? I think JKL's > post where he used the word "pesky" expressed the very same concern.
Formally, FDs are axioms. As such, *within* the formal system they are given facts, but the discovery must happen *outside* the formal system. Hence, the available input cannot be described formally (or, if it can, it will be in a different formal system that will have its own axioms).
You don't have any difficulty with the process of "discovering" keys, do you? I don't see the process of "discovering" other constraints as intrinsically different. > > > > >> FDs are a special type of first-order formulas. I can imagine >> > > defining > > >> an English-like syntax for them. > > > > (Not that it would necessarily be a good idea, I should add.) > > > > >Hmmmmmmmm. > > > > > >"The projection on {desert island, can opener} of the join of all > > >tables in the database, represents a function from desert island to can > > >opener." > > > > Well no, like this certainly no :) Maybe something along these lines > > (paraphrasing the given-when-then constructs existing in some testing > > tools): > > > > given two lecturers L1 and L2 and a time T > > when L1 is teaching at time T and L2 is teaching at time T > > then L1 and L2 are in different rooms > > Well, there you have it. If you think that this sentence indicates an FD > {L,T} -> {room}
Of course not.
> The correct conclusion here is the FD {room,T} -> {L}.
Yes.
> But before spotting > that, I first had to twist the requirement into "only one lecturer can be in > any given room at any time". Or "if two observations of a lecturer being in > a room at time T, both involve the same room and the same time T, then they > involve the same lecturer". But (knowing what the FD turned out to be and > looking back on the natural language formulation) I find those alternatives > only slightly less convoluted than your original.
I said not to take this example too seriously ;) The point of a hypothetical specification in natural language is that you should not bother with the mathematical formulation of the corresponding FD, because the system would take care of it. My example may fail miserably in being higher level compared to the corresponding FD, but that does not mean much. If you had a program that takes as input the sentence "no two teachers can teach in the same room at the same time" and outputs the schema Teaches(room, time, teacher) with key {room, time} as output, would you be satisfied? :)
Nicola
--- news://freenews.netfront.net/ - complaints: news-at-netfront.net --- -- end of forwarded message --
----- End forwarded message ----- ----- Forwarded message from Ruben Safir -----
Date: Thu, 26 Feb 2015 07:08:28 -0500 (EST) From: Ruben Safir To: mrbrklyn-at-panix.com Subject: (fwd) Re: Role of functional dependencies in database design User-Agent: tin/2.2.1-20140504 ("Tober an Righ") (UNIX) (NetBSD/6.1.5 (i386))
-- forwarded message -- X-Received: by 10.66.196.11 with SMTP id ii11mr6016306pac.37.1423779068058; Thu, 12 Feb 2015 14:11:08 -0800 (PST) X-Received: by 10.140.21.146 with SMTP id 18mr95640qgl.30.1423779067797; Thu, 12 Feb 2015 14:11:07 -0800 (PST) Path: reader1.panix.com!panix!bloom-beacon.mit.edu!bloom-beacon.mit.edu!newsswitch.lcs.mit.edu!ottix-news.ottix.net!border1.nntp.dca1.giganews.com!nntp.giganews.com!hl2no10774377igb.0!news-out.google.com!c1ni4qar.1!nntp.google.com!i13no854996qae.0!postnews.google.com!glegroupsg2000goo.googlegroups.com!not-for-mail Newsgroups: comp.databases.theory Date: Thu, 12 Feb 2015 14:11:07 -0800 (PST) In-Reply-To: Complaints-To: groups-abuse-at-google.com Injection-Info: glegroupsg2000goo.googlegroups.com; posting-host=213.219.142.118; posting-account=-nQufgoAAABsreOCZNqo2Uyh8O-fYVPT NNTP-Posting-Host: 213.219.142.118 References: User-Agent: G2/1.0 MIME-Version: 1.0 Message-ID: <76a7403c-cea0-43bd-9af0-3bfd89226661-at-googlegroups.com> Subject: Re: Role of functional dependencies in database design From: Erwin Injection-Date: Thu, 12 Feb 2015 22:11:07 +0000 Content-Type: text/plain; charset=ISO-8859-1 Content-Transfer-Encoding: quoted-printable Lines: 99 Xref: panix comp.databases.theory:75528
Op donderdag 12 februari 2015 22:43:24 UTC+1 schreef Nicola: > In article <>, > Erwin <> wrote: >=20 > > My question, and what I would like to understand is : what exactly is t= he=20 > > available input (its semantics and in which possible syntactic forms do= es it=20 > > appear, and where) that allows this "discovery" to be made ? I think J= KL's=20 > > post where he used the word "pesky" expressed the very same concern. >=20 > Formally, FDs are axioms. As such, *within* the formal system they are=20 > given facts, but the discovery must happen *outside* the formal system.= =20 > Hence, the available input cannot be described formally (or, if it can,= =20 > it will be in a different formal system that will have its own axioms). >=20 > You don't have any difficulty with the process of "discovering" keys, do= =20 > you? I don't see the process of "discovering" other constraints as=20 > intrinsically different. > =20 > > >=20 > > > >> FDs are a special type of first-order formulas. I can imagine >>= =20 > > > defining=20 > > > >> an English-like syntax for them. > > >=20 > > > (Not that it would necessarily be a good idea, I should add.) > > >=20 > > > >Hmmmmmmmm. > > > > > > > >"The projection on {desert island, can opener} of the join of all=20 > > > >tables in the database, represents a function from desert island to = can=20 > > > >opener." > > >=20 > > > Well no, like this certainly no :) Maybe something along these lines= =20 > > > (paraphrasing the given-when-then constructs existing in some testing= =20 > > > tools): > > >=20 > > > given two lecturers L1 and L2 and a time T > > > when L1 is teaching at time T and L2 is teaching at time T > > > then L1 and L2 are in different rooms > >=20 > > Well, there you have it. If you think that this sentence indicates an = FD=20 > > {L,T} -> {room} >=20 > Of course not. >=20 > > The correct conclusion here is the FD {room,T} -> {L}. >=20 > Yes. >=20 > > But before spotting=20 > > that, I first had to twist the requirement into "only one lecturer can = be in=20 > > any given room at any time". Or "if two observations of a lecturer bei= ng in=20 > > a room at time T, both involve the same room and the same time T, then = they=20 > > involve the same lecturer". But (knowing what the FD turned out to be = and=20 > > looking back on the natural language formulation) I find those alternat= ives=20 > > only slightly less convoluted than your original. >=20 > I said not to take this example too seriously ;) The point of a=20 > hypothetical specification in natural language is that you should not=20 > bother with the mathematical formulation of the corresponding FD,=20 > because the system would take care of it. My example may fail miserably= =20 > in being higher level compared to the corresponding FD, but that does=20 > not mean much. If you had a program that takes as input the sentence "no= =20 > two teachers can teach in the same room at the same time" and outputs=20 > the schema Teaches(room, time, teacher) with key {room, time} as=20 > output, would you be satisfied? :) >=20 > Nicola >=20 > --- news://freenews.netfront.net/ - complaints: ---
I probably would.
For what it's worth : "no two teachers can ..." immediately makes me think = of a certain RENAME/JOIN combination that then in turn immediately makes me= conclude "aha, a key".
But my mental process in that regard is probably heavily influenced by the = work I've done on constraint enforcement. Give me a database [design] and = I immediately start looking for all possible [data] faults (and their defin= ing relational expressions). But somehow for keys, that's not the process = I follow. I just wonder "what makes this thing unique". It's probably jus= t too deeply instilled, but it works without FDs. -- end of forwarded message --
----- End forwarded message ----- ----- Forwarded message from Ruben Safir -----
Date: Thu, 26 Feb 2015 07:08:28 -0500 (EST) From: Ruben Safir To: mrbrklyn-at-panix.com Subject: (fwd) Re: Role of functional dependencies in database design User-Agent: tin/2.2.1-20140504 ("Tober an Righ") (UNIX) (NetBSD/6.1.5 (i386))
-- forwarded message -- X-Received: by 10.236.41.73 with SMTP id g49mr3546893yhb.57.1424418161027; Thu, 19 Feb 2015 23:42:41 -0800 (PST) X-Received: by 10.50.1.113 with SMTP id 17mr282774igl.8.1424418160891; Thu, 19 Feb 2015 23:42:40 -0800 (PST) Path: reader1.panix.com!panix!bloom-beacon.mit.edu!bloom-beacon.mit.edu!newsswitch.lcs.mit.edu!ottix-news.ottix.net!border1.nntp.dca1.giganews.com!nntp.giganews.com!j7no1827585qaq.1!news-out.google.com!qk8ni37643igc.0!nntp.google.com!hl2no13372894igb.0!postnews.google.com!glegroupsg2000goo.googlegroups.com!not-for-mail Newsgroups: comp.databases.theory Date: Thu, 19 Feb 2015 23:42:40 -0800 (PST) In-Reply-To: Complaints-To: groups-abuse-at-google.com Injection-Info: glegroupsg2000goo.googlegroups.com; posting-host=24.68.153.253; posting-account=FUDi7AoAAACOn42BTSjw9ZlWNiwGl2Nx NNTP-Posting-Host: 24.68.153.253 References: User-Agent: G2/1.0 MIME-Version: 1.0 Message-ID: Subject: Re: Role of functional dependencies in database design From: compdb-at-hotmail.com Injection-Date: Fri, 20 Feb 2015 07:42:40 +0000 Content-Type: text/plain; charset=ISO-8859-1 Content-Transfer-Encoding: quoted-printable Lines: 87 Xref: panix comp.databases.theory:75564
On Thursday, February 12, 2015 at 1:43:24 PM UTC-8, Nicola wrote: > In article , Erwin= wrote: >=20 > > My question, and what I would like to understand is : what exactly is t= he=20 > > available input (its semantics and in which possible syntactic forms do= es it=20 > > appear, and where) that allows this "discovery" to be made ? I think J= KL's=20 > > post where he used the word "pesky" expressed the very same concern. >=20 > Formally, FDs are axioms. As such, *within* the formal system they are=20 > given facts, but the discovery must happen *outside* the formal system.= =20 > Hence, the available input cannot be described formally (or, if it can,= =20 > it will be in a different formal system that will have its own axioms).
Each base table holds rows that make a corresponding designer-chosen predic= ate true of the current application situation. A user evaluates the predica= te for every row that could fit in a table and puts the ones making true pr= opositions in. They look at the tables and plug the present and absent rows= into its predicate to learn how the world is. They query by mapping a pred= icate about the application situation to a hybrid relation/predicate expres= sion that the DBMS evaluates to the rows making the predicate true.
A designer shows that particular predicates built from these base predicate= s are true in every application situation that can arise. These are the con= straints. A constraint is simultaneously a truth about each database state = and its corresponding application situation. So the designer describes the = application but this simultaneously describes table values.
> > > >> FDs are a special type of first-order formulas.
From: com... > F1,... -> T > FORALL F1,...,Ta,...,Tb,... > R(F1,...,Ta,...) AND R(F1,...,Tb,...) > =3D> Ta =3D Tb
> > > given two lecturers L1 and L2 and a time T > > > when L1 is teaching at time T and L2 is teaching at time T > > > then L1 and L2 are in different rooms
From: com... > FORALL L1,L2,T,R1,R2 > teaches(L1,T,R1) AND teaches(L2,T,R2) =3D> L1 <> L2
> > Well, there you have it. If you think that this sentence indicates an = FD=20 > > {L,T} -> {room} >=20 > Of course not. >=20 > > The correct conclusion here is the FD {room,T} -> {L}. >=20 > Yes.
No. That FD is:
FORALL L1,L2,T,R teaches(L1,T,R) AND teaches(L2,T,R) =3D> L1 =3D L2
> > But before spotting=20 > > that, I first had to twist the requirement into "only one lecturer can = be in=20 > > any given room at any time". Or "if two observations of a lecturer bei= ng in=20 > > a room at time T, both involve the same room and the same time T, then = they=20 > > involve the same lecturer". But (knowing what the FD turned out to be = and=20 > > looking back on the natural language formulation) I find those alternat= ives=20 > > only slightly less convoluted than your original.
It's not the same as the original constraint. Which isn't an FD. (Not surpr= ising.)
But the natural language for an FD is not hard:
// {room,T} -> {L} Given lecturers L1 and L2 and time T and room R, if L1 teaches at T in R and L2 teaches at T in R then L1 =3D L2
philip -- end of forwarded message --
----- End forwarded message ----- ----- Forwarded message from Ruben Safir -----
Date: Thu, 26 Feb 2015 07:08:28 -0500 (EST) From: Ruben Safir To: mrbrklyn-at-panix.com Subject: (fwd) Re: Role of functional dependencies in database design User-Agent: tin/2.2.1-20140504 ("Tober an Righ") (UNIX) (NetBSD/6.1.5 (i386))
-- forwarded message -- X-Received: by 10.68.241.163 with SMTP id wj3mr5176663pbc.8.1423769142286; Thu, 12 Feb 2015 11:25:42 -0800 (PST) X-Received: by 10.140.43.10 with SMTP id d10mr77882qga.26.1423769142021; Thu, 12 Feb 2015 11:25:42 -0800 (PST) Path: reader1.panix.com!panix!usenet.stanford.edu!hl2no15119363igb.0!news-out.google.com!n6ni6qar.0!nntp.google.com!j7no151813qaq.1!postnews.google.com!glegroupsg2000goo.googlegroups.com!not-for-mail Newsgroups: comp.databases.theory Date: Thu, 12 Feb 2015 11:25:41 -0800 (PST) In-Reply-To: Complaints-To: groups-abuse-at-google.com Injection-Info: glegroupsg2000goo.googlegroups.com; posting-host=213.219.142.118; posting-account=-nQufgoAAABsreOCZNqo2Uyh8O-fYVPT NNTP-Posting-Host: 213.219.142.118 References: User-Agent: G2/1.0 MIME-Version: 1.0 Message-ID: <91803a70-ff44-43d8-ab9b-09c994df292a-at-googlegroups.com> Subject: Re: Role of functional dependencies in database design From: Erwin Injection-Date: Thu, 12 Feb 2015 19:25:42 +0000 Content-Type: text/plain; charset=ISO-8859-1 Content-Transfer-Encoding: quoted-printable Lines: 47 Xref: panix comp.databases.theory:75525
Op donderdag 12 februari 2015 17:02:19 UTC+1 schreef Nicola: > >And besides, avoiding redundancy was only a relevant topic in database= =20 > >design as long as there were no feasible ways to control the=20 > >redundancies, e.g. via ASSERTIONs. Those days are gone. >=20 > Could you provide a small example?
I overplayed my hand, slightly.
Take an example relvar DBR {A B C} subject to the FD {A} -> {B} (and its ke= y thus being {AC}).
Rewind 40 years.
Enforcing this FD in that design involves extra checks, and in the time whe= n all that theory originated, this meant : extra coding work, precisely in = the update process that was exposed to the update "anomaly" involved. As f= ar as I can tell, all that theory was developed so that the update "anomali= es" implied by a given design, could be _identified_, so that after that an= _informed_ decision could be made regarding how to deal with it : (a) elim= inate it through redesign, or (b) live with it and keep the update processe= s that are exposed to the "anomaly" "guarded","monitored","under supervisio= n",... But in that day and age, (b) was almost never a viable option, simp= ly because the _means_ for doing so (which is "if a program gets it wrong, = make sure it won't leave its devastating effects persisted in the database"= , or iow, "declare a constraint") simply weren't available.
Fast forward 40 years.
This day and age, I can simply do the (equivalent of the) following (hypoth= etical SQL) :
CREATE VIEW PROJ AS SELECT DISTINCT A,B FROM DBR CONSTRAINT UNIQUE (A) ;
That is, I have a system to my avail where I simply declare the view (proje= ction on all the attributes mentioned anywhere in an FD), and declare a con= straint to the effect that the LHS portion of the FD constitutes a key to t= hat view. And if a program fails to obey the FD, it won't get any updates = through (unless it is accidentally a good one).
Where I "overplayed my hand", is that introducing the constraint to enforce= the FD to be obeyed, will not help in simplifying the potentially more com= plex proceedings a program has to follow to get the database updated. That= is, while there is no possibility that the update "anomaly" will lead to i= ncorrect data, the "anomaly" is still there and must still be dealt with by= the programs that are exposed to it. -- end of forwarded message --
----- End forwarded message ----- ----- Forwarded message from Ruben Safir -----
Date: Thu, 26 Feb 2015 07:07:18 -0500 (EST) From: Ruben Safir To: mrbrklyn-at-panix.com Subject: (fwd) Re: [LIU Comp Sci] Need tutoring on Relational Calculus User-Agent: tin/2.2.1-20140504 ("Tober an Righ") (UNIX) (NetBSD/6.1.5 (i386))
-- forwarded message -- Path: reader1.panix.com!panix!bloom-beacon.mit.edu!bloom-beacon.mit.edu!newsswitch.lcs.mit.edu!ottix-news.ottix.net!Xl.tags.giganews.com!border1.nntp.dca1.giganews.com!nntp.giganews.com!local2.nntp.dca.giganews.com!nntp.megapath.net!news.megapath.net.POSTED!not-for-mail NNTP-Posting-Date: Fri, 02 Jan 2015 17:02:33 -0600 Date: Fri, 2 Jan 2015 18:02:33 -0500 From: "James K. Lowden" Newsgroups: comp.databases.theory Subject: Re: [LIU Comp Sci] Need tutoring on Relational Calculus Message-ID: <20150102180233.fa0f3f4c.jklowden-at-speakeasy.net> References: <20141221174830.17907dd8.jklowden-at-speakeasy.net> <20141231172312.ea580d34.jklowden-at-speakeasy.net> X-Newsreader: Sylpheed 3.0.2 (GTK+ 2.20.1; x86_64--netbsd) Mime-Version: 1.0 Content-Type: text/plain; charset=ISO-8859-1 Content-Transfer-Encoding: quoted-printable Lines: 109 X-Usenet-Provider: http://www.giganews.com NNTP-Posting-Host: 209.166.36.60 X-Trace: sv3-7SD9L/3QY0QJc/LWrlfOn+/FtrrtZFOR1STQDYYV9s8SLQc+gSJCvFhxQnNDPHoAZYp64UPbE4Zwsf8!n4DoihgjJpQGu9Wp6UHinRRZkHir0CM6AzU3raddRgxK01GSfoPr878Iwd/+w9lwjQ8UVCYuuJyJ!sVgK9rwJrLWF3WdhgBrQRHkjB9KZzzDC X-Abuse-and-DMCA-Info: Please be sure to forward a copy of ALL headers X-Abuse-and-DMCA-Info: Otherwise we will be unable to process your complaint properly X-Postfilter: 1.3.40 X-Original-Bytes: 6786 Xref: panix comp.databases.theory:75310
On Fri, 2 Jan 2015 04:24:35 +0000 (UTC) ruben wrote:
> On Wed, 31 Dec 2014 17:23:12 -0500, James K. Lowden wrote: >=20 > > I can think of no way in which "normalization is a failure". >=20 > well, think harder about it then=20
I am sorry to have left the impression I have not thought long and hard about it. =20
> because just this morning its failure was facing me, eye to eye when > some expert normalized our database and exhausted the system > resources and dropped productivity through the floor, killing a good > part of my new year.
Normalization is a logical concept. It cannot affect physical resources. That said, it is of course possible and usually desirable to implement a normalized physical design. When that is not done, extra care must be taken to ensure the database remains internally consistent. =20
Whether or not "some expert" in fact produced a BCNF design is not an assertion I would accept at face value. I have seen many terrible designs declared "normalized" that were nothing of the kind. Like most things, normalization is easier said than done. =20
The easiest steps are the removal of repeating groups (1NF) and determination of keys (functional dependency, 2NF). If your design meets those criteria and performs badly under the intended workload, the usual reason is underpowered hardware. While it's always possible to craft a purpose-built system that outperforms a DBMS, it's usually not economical. That's why it's a $30 billion annual market. =20
Regarding, http://lambda-the-ultimate.org/node/3762, we left the reservation at:
"Nowadays it is accepted that ORM (Object-Role Modeling), Object Oriented (OO) and other post-OO methodologies, such as Agile Modeling, Agile Development (Erickson, Lyytinen, & Siau, 2005) are most appropriate not only for programming, but also for analyzing and designing information systems, including, of course, database design."
Far from being "accepted" for the purpose, software development methodologies have exactly nothing to say about database design. The author doesn't seem to know that that "OO databases" rest on the very same theoretical void that the hierarchical and network models did before them. =20
If you want to use UML or whatever to help you identify attributes and their keys, fine. Just recognize that what you're doing *is* normalization, albeit informally and likely sufficiently. =20
The article is a bit polemic and contains many factual errors or oversights. For example, while it's true that atomicity and nested relations have been subject to some to-and-fro over the years, the *theorectical* concerns cannot be divorced from implementation: because SQL has (almost) no recursion, nested relations cannot be defined and manipulated. Because any such relationship can be defined in unnested form -- and manipuated with a first-order predicate logic language like SQL (mostly) -- that's how SQL-based designs are implemented. =20
Two others, briefly: addresses *do* require decomposition if their components are to be used or verified. If you don't care if you have a zip code or otherwise an address that can be delivered to, and you have no need to collate by zip code or city etc., then fine, use a blob. The need will make itself apparent in time, and the mistake is not fatal. And the whole professor-department dustup is really just a failure to understand the problem's own terms. Professor's have departments & offices, and departments have offices. That does not imply that a professor's office's department determines the professor's department. =20
None of that, by the way, has any bearing at all on your two-day experiment with normalization. Yes, you can find articles that seem to support your opinion. One of the problems with database theory is the average quality of the literature you find online. Maybe that's because SQL is so widely used and tables seem to so easy to understand. Be that as it may, I caution you not to accept the gloss if you can't analyze the content. =20
I'm sympathetic to what I guess the author's point: that normalization is not the mechanical process it was hoped (hyped?) to be, and that the simple practitioner (or, gee, student) looking for guidance may be led astray by pointy-headed logicians on one hand and know-nothings on the other. The former bury you in language like "relation R(A, B, C) where FDs (A, B) ? C and C ? B hold", and the latter fail to acknowledge the reasons for and value of eliminating redundancy in the database. =20
Yet if we discard the fancy language and mathematical exactitude, the basic problem is embarrasingly easy. Most normalization issues can be tested with two questions:
1. If I know X do I know Y? (functional dependency) 2. Is there more than one Y for X? (repeating groups)
You don't need a degree or a class or a book to understand those questions or (if you know the universe of discourse) their answers. That's part of the beauty of the model and why Codd developed it: relations are easy to understand and develop intuition for. It takes practice to find all the attributes and ask those questions for every one and, sure, there's more to it. But those two questions, asked repeatedly of the right person, will carry you a long way in the right direction. =20
HTH. =20
--jkl -- end of forwarded message --
----- End forwarded message ----- ----- Forwarded message from Ruben Safir -----
Date: Thu, 26 Feb 2015 07:07:18 -0500 (EST) From: Ruben Safir To: mrbrklyn-at-panix.com Subject: (fwd) Re: [LIU Comp Sci] Need tutoring on Relational Calculus User-Agent: tin/2.2.1-20140504 ("Tober an Righ") (UNIX) (NetBSD/6.1.5 (i386))
-- forwarded message -- Path: reader1.panix.com!panix!bloom-beacon.mit.edu!bloom-beacon.mit.edu!newsswitch.lcs.mit.edu!ottix-news.ottix.net!Xl.tags.giganews.com!border1.nntp.dca1.giganews.com!nntp.giganews.com!local2.nntp.dca.giganews.com!nntp.megapath.net!news.megapath.net.POSTED!not-for-mail NNTP-Posting-Date: Fri, 02 Jan 2015 17:02:33 -0600 Date: Fri, 2 Jan 2015 18:02:33 -0500 From: "James K. Lowden" Newsgroups: comp.databases.theory Subject: Re: [LIU Comp Sci] Need tutoring on Relational Calculus Message-ID: <20150102180233.fa0f3f4c.jklowden-at-speakeasy.net> References: <20141221174830.17907dd8.jklowden-at-speakeasy.net> <20141231172312.ea580d34.jklowden-at-speakeasy.net> X-Newsreader: Sylpheed 3.0.2 (GTK+ 2.20.1; x86_64--netbsd) Mime-Version: 1.0 Content-Type: text/plain; charset=ISO-8859-1 Content-Transfer-Encoding: quoted-printable Lines: 109 X-Usenet-Provider: http://www.giganews.com NNTP-Posting-Host: 209.166.36.60 X-Trace: sv3-7SD9L/3QY0QJc/LWrlfOn+/FtrrtZFOR1STQDYYV9s8SLQc+gSJCvFhxQnNDPHoAZYp64UPbE4Zwsf8!n4DoihgjJpQGu9Wp6UHinRRZkHir0CM6AzU3raddRgxK01GSfoPr878Iwd/+w9lwjQ8UVCYuuJyJ!sVgK9rwJrLWF3WdhgBrQRHkjB9KZzzDC X-Abuse-and-DMCA-Info: Please be sure to forward a copy of ALL headers X-Abuse-and-DMCA-Info: Otherwise we will be unable to process your complaint properly X-Postfilter: 1.3.40 X-Original-Bytes: 6786 Xref: panix comp.databases.theory:75310
On Fri, 2 Jan 2015 04:24:35 +0000 (UTC) ruben wrote:
> On Wed, 31 Dec 2014 17:23:12 -0500, James K. Lowden wrote: >=20 > > I can think of no way in which "normalization is a failure". >=20 > well, think harder about it then=20
I am sorry to have left the impression I have not thought long and hard about it. =20
> because just this morning its failure was facing me, eye to eye when > some expert normalized our database and exhausted the system > resources and dropped productivity through the floor, killing a good > part of my new year.
Normalization is a logical concept. It cannot affect physical resources. That said, it is of course possible and usually desirable to implement a normalized physical design. When that is not done, extra care must be taken to ensure the database remains internally consistent. =20
Whether or not "some expert" in fact produced a BCNF design is not an assertion I would accept at face value. I have seen many terrible designs declared "normalized" that were nothing of the kind. Like most things, normalization is easier said than done. =20
The easiest steps are the removal of repeating groups (1NF) and determination of keys (functional dependency, 2NF). If your design meets those criteria and performs badly under the intended workload, the usual reason is underpowered hardware. While it's always possible to craft a purpose-built system that outperforms a DBMS, it's usually not economical. That's why it's a $30 billion annual market. =20
Regarding, http://lambda-the-ultimate.org/node/3762, we left the reservation at:
"Nowadays it is accepted that ORM (Object-Role Modeling), Object Oriented (OO) and other post-OO methodologies, such as Agile Modeling, Agile Development (Erickson, Lyytinen, & Siau, 2005) are most appropriate not only for programming, but also for analyzing and designing information systems, including, of course, database design."
Far from being "accepted" for the purpose, software development methodologies have exactly nothing to say about database design. The author doesn't seem to know that that "OO databases" rest on the very same theoretical void that the hierarchical and network models did before them. =20
If you want to use UML or whatever to help you identify attributes and their keys, fine. Just recognize that what you're doing *is* normalization, albeit informally and likely sufficiently. =20
The article is a bit polemic and contains many factual errors or oversights. For example, while it's true that atomicity and nested relations have been subject to some to-and-fro over the years, the *theorectical* concerns cannot be divorced from implementation: because SQL has (almost) no recursion, nested relations cannot be defined and manipulated. Because any such relationship can be defined in unnested form -- and manipuated with a first-order predicate logic language like SQL (mostly) -- that's how SQL-based designs are implemented. =20
Two others, briefly: addresses *do* require decomposition if their components are to be used or verified. If you don't care if you have a zip code or otherwise an address that can be delivered to, and you have no need to collate by zip code or city etc., then fine, use a blob. The need will make itself apparent in time, and the mistake is not fatal. And the whole professor-department dustup is really just a failure to understand the problem's own terms. Professor's have departments & offices, and departments have offices. That does not imply that a professor's office's department determines the professor's department. =20
None of that, by the way, has any bearing at all on your two-day experiment with normalization. Yes, you can find articles that seem to support your opinion. One of the problems with database theory is the average quality of the literature you find online. Maybe that's because SQL is so widely used and tables seem to so easy to understand. Be that as it may, I caution you not to accept the gloss if you can't analyze the content. =20
I'm sympathetic to what I guess the author's point: that normalization is not the mechanical process it was hoped (hyped?) to be, and that the simple practitioner (or, gee, student) looking for guidance may be led astray by pointy-headed logicians on one hand and know-nothings on the other. The former bury you in language like "relation R(A, B, C) where FDs (A, B) ? C and C ? B hold", and the latter fail to acknowledge the reasons for and value of eliminating redundancy in the database. =20
Yet if we discard the fancy language and mathematical exactitude, the basic problem is embarrasingly easy. Most normalization issues can be tested with two questions:
1. If I know X do I know Y? (functional dependency) 2. Is there more than one Y for X? (repeating groups)
You don't need a degree or a class or a book to understand those questions or (if you know the universe of discourse) their answers. That's part of the beauty of the model and why Codd developed it: relations are easy to understand and develop intuition for. It takes practice to find all the attributes and ask those questions for every one and, sure, there's more to it. But those two questions, asked repeatedly of the right person, will carry you a long way in the right direction. =20
HTH. =20
--jkl -- end of forwarded message --
----- End forwarded message ----- ----- Forwarded message from Ruben Safir -----
Date: Thu, 26 Feb 2015 07:07:18 -0500 (EST) From: Ruben Safir To: mrbrklyn-at-panix.com Subject: (fwd) Re: [LIU Comp Sci] Need tutoring on Relational Calculus User-Agent: tin/2.2.1-20140504 ("Tober an Righ") (UNIX) (NetBSD/6.1.5 (i386))
-- forwarded message -- Path: reader1.panix.com!panix!bloom-beacon.mit.edu!bloom-beacon.mit.edu!newsswitch.lcs.mit.edu!ottix-news.ottix.net!Xl.tags.giganews.com!border1.nntp.dca1.giganews.com!nntp.giganews.com!local2.nntp.dca.giganews.com!nntp.megapath.net!news.megapath.net.POSTED!not-for-mail NNTP-Posting-Date: Fri, 02 Jan 2015 17:02:33 -0600 Date: Fri, 2 Jan 2015 18:02:33 -0500 From: "James K. Lowden" Newsgroups: comp.databases.theory Subject: Re: [LIU Comp Sci] Need tutoring on Relational Calculus Message-ID: <20150102180233.fa0f3f4c.jklowden-at-speakeasy.net> References: <20141221174830.17907dd8.jklowden-at-speakeasy.net> <20141231172312.ea580d34.jklowden-at-speakeasy.net> X-Newsread |
|