Lossless join decomposition

In database design, a lossless join decomposition is a decomposition of a relation $R$ into relations $R_{1},R_{2}$ such that a natural join of the two smaller relations yields back the original relation. This is central in removing redundancy safely from databases while preserving the original data.^[1]

Criteria

Lossless join can also be called nonadditive.^[2]

If $R$ is split into $R_{1}$ and $R_{2}$, then for this decomposition to be lossless (i.e., $R_{1}\bowtie R_{2}=R$), at least one of the following two criteria should be met.

Check 1: Verify join explicitly

Projecting $R$ onto $R_{1}$ and $R_{2}$, and joining the projections back, results in the relation you started with.^[3]
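The explicit check can be sketched in Python on a single relation instance. This is a minimal illustration, not a reference implementation: the helper names `project`, `natural_join` and `is_lossless_on_instance`, the representation of tuples as dictionaries, and the sample rows are all assumptions made for this example. Passing the check on one instance only shows that this instance is recovered; the schema-level property must hold for every legal instance of $R$.

```python
def project(rows, attrs):
    """Project rows (a list of dicts) onto attrs; the set removes duplicates."""
    return {frozenset((a, row[a]) for a in attrs) for row in rows}


def natural_join(left, right):
    """Natural join of two sets of tuples, each a frozenset of (attr, value)."""
    joined = set()
    for l in left:
        for r in right:
            ld, rd = dict(l), dict(r)
            # Tuples combine only if they agree on every shared attribute.
            if all(ld[a] == rd[a] for a in ld.keys() & rd.keys()):
                joined.add(frozenset({**ld, **rd}.items()))
    return joined


def is_lossless_on_instance(rows, attrs1, attrs2):
    """True if joining the two projections gives back exactly the given rows."""
    original = {frozenset(row.items()) for row in rows}
    return natural_join(project(rows, attrs1), project(rows, attrs2)) == original


# Hypothetical instance of R = (A, B, C, D) that satisfies A -> BC.
rows = [
    {"A": 1, "B": "x", "C": "p", "D": 10},
    {"A": 1, "B": "x", "C": "p", "D": 20},
    {"A": 2, "B": "y", "C": "p", "D": 10},
]

print(is_lossless_on_instance(rows, ["A", "B", "C"], ["A", "D"]))  # True
print(is_lossless_on_instance(rows, ["A", "B", "C"], ["C", "D"]))  # False
```

In the second call the join on $C$ produces a spurious tuple (A = 2 paired with D = 20), so the result has more tuples than the original relation.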

Check 2: Via functional dependencies

Let $R$ be a relation schema.

Let $F$ be a set of functional dependencies on $R$ .

Let $R_{1}$ and $R_{2}$ form a decomposition of $R$ .

The decomposition is lossless if the attribute set of one of the sub-relations (i.e. $R_{1}$ or $R_{2}$ ) is a subset of the closure of their intersection $R_{1}\cap R_{2}$ under $F$ . In other words, the decomposition is a lossless-join decomposition of $R$ if at least one of the following functional dependencies is in $F^{+}$ (where $F^{+}$ is the closure of $F$ , the set of all functional dependencies implied by $F$ ):^[4]

$R_{1}\cap R_{2}\rightarrow R_{1}$
$R_{1}\cap R_{2}\rightarrow R_{2}$
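The functional-dependency check can be carried out by computing the attribute closure of $R_{1}\cap R_{2}$ and testing whether it covers $R_{1}$ or $R_{2}$. The following Python sketch assumes functional dependencies are given as pairs of attribute sets; the function names `closure` and `is_lossless_binary` are illustrative choices, not part of any standard library.

```python
def closure(attrs, fds):
    """Attribute closure of attrs under fds, a list of (lhs, rhs) attribute sets."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            # Apply lhs -> rhs whenever lhs is already covered and rhs adds something.
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def is_lossless_binary(r1, r2, fds):
    """True if (R1 ∩ R2) -> R1 or (R1 ∩ R2) -> R2 follows from fds."""
    common_closure = closure(set(r1) & set(r2), fds)
    return set(r1) <= common_closure or set(r2) <= common_closure


# F = {A -> BC} on R = (A, B, C, D)
fds = [({"A"}, {"B", "C"})]

print(is_lossless_binary({"A", "B", "C"}, {"A", "D"}, fds))  # True
print(is_lossless_binary({"A", "B", "C"}, {"C", "D"}, fds))  # False
```

Working with the closure of the intersection avoids enumerating $F^{+}$ itself, which can be exponentially large.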

Criteria for multiple sub-relations

Multiple sub-relations $R_{1},R_{2},\ldots ,R_{n}$ have a lossless join if there is some way in which we can repeatedly perform lossless joins until all the relations have been joined into a single relation. Once we have a new sub-relation made from a lossless join, we are not allowed to use any of its isolated sub-relations to join with any of the other relations. For example, if we can do a lossless join on a pair of relations $R_{i},R_{j}$ to form a new relation $R_{i,j}$ , we use this new relation (rather than $R_{i}$ or $R_{j}$ ) to form a lossless join with another relation $R_{k}$ , which may itself already be the result of a join (e.g., $R_{k,l}$ ).
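The repeated pairwise procedure can be sketched as a greedy search in Python. This is an illustration under stated assumptions: the function names `closure` and `greedy_lossless` and the example schemas are hypothetical, and the search merges the first lossless pair it finds. A result of `True` means a sequence of lossless binary joins was found; a result of `False` only means this greedy search did not find one.

```python
from itertools import combinations


def closure(attrs, fds):
    """Attribute closure of attrs under fds, a list of (lhs, rhs) attribute sets."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def greedy_lossless(schemas, fds):
    """Repeatedly merge any pair whose binary join is lossless under fds."""
    parts = [frozenset(s) for s in schemas]
    while len(parts) > 1:
        for left, right in combinations(parts, 2):
            common_closure = closure(left & right, fds)
            if left <= common_closure or right <= common_closure:
                # Replace the pair by its join; the isolated parts are not reused.
                parts.remove(left)
                parts.remove(right)
                parts.append(left | right)
                break
        else:
            return False  # no remaining pair can be joined losslessly
    return True


# Hypothetical example: R = (A, B, C, D) with F = {B -> C, C -> D}
fds = [({"B"}, {"C"}), ({"C"}, {"D"})]

print(greedy_lossless([{"A", "B"}, {"B", "C"}, {"C", "D"}], fds))  # True
print(greedy_lossless([{"A", "B"}, {"C", "D"}], fds))              # False
```

In the first call, $(A,B)$ and $(B,C)$ are merged first because $B\rightarrow BC$, and the result $(A,B,C)$ is then joined with $(C,D)$ because $C\rightarrow CD$.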

Examples

Let $R=(A,B,C,D)$ be the relation schema, with attributes $A$ , $B$ , $C$ and $D$ .
Let $F=\{A\rightarrow BC\}$ be the set of functional dependencies.
Decomposition into $R_{1}=(A,B,C)$ and $R_{2}=(A,D)$ is lossless under $F$ because $R_{1}\cap R_{2}=(A)$ and $A$ is a superkey of $R_{1}$ : the functional dependency $A\rightarrow BC$ implies $A\rightarrow ABC$ , that is, $R_{1}\cap R_{2}\rightarrow R_{1}$ . In other words, we have proven that $(R_{1}\cap R_{2}\rightarrow R_{1})\in F^{+}$ .^[5]^[6]

References

[1] Pohler, K (2015). "Lossless-Join Decomposition: applications in quantitative computing metrics". International Journal of Applied Computer Science. 21 (4): 190–212.
[2] Elmasri, Ramez (2016). Fundamentals of database systems (Seventh ed.). Hoboken, NJ: Pearson. p. 461. ISBN 978-0133970777.
[3] "Lossless Join Property". Stackoverflow.com. Retrieved 2016-02-07.
[4] "Lossless Join Decomposition" (PDF). University at Buffalo. Jan Chomicki. Retrieved 2012-02-08.
[5] "Lossless-Join Decomposition". Cs.sfu.ca. Retrieved 2016-02-07.
[6] "www.data-e-education.com - Lossless Join Decomposition". Archived from the original on 2014-02-21. Retrieved 2014-02-12.
