We know how to create a paper representation of a database, but how do we create a real-life database out of that diagram?
What we used to do is make a hierarchical model like so, where we map the ER diagram directly to a tree-like structure:
This sounds nice, but the issue is that we have some redundancy due to how some aspects don’t have clear hierarchical relations.
The other problem is with many-to-many relationships since they don’t fit neatly into a hierarchical model, so database designers had to form weird workarounds.
What revolutionized the database was a paper by Ted Codd in 1970, which explained the relational data model, the most used database model today.
Since the relational model is based around keys, it’s useful to know what a key actually is.
A key is an attribute (or combination of attributes) that differentiates two tuples/records.
The benefit of having keys be what differentiates records is that it makes relations easy to represent.
What we do is have a foreign key as an attribute representing a relation.
In order to be a primary key, an attribute must be unique and NON-NULL (called the entity integrity constraint), with most DBMS enforcing these rules.
This is essentially how relational databases function in a structural sense, but how do we know they work? For that, we need some math and some new terms.
To explain what relations are in the mathematical sense, we need a few terms.
First off, we look at the term domain, which is a set of atomic values (e.g., Department_IDs, Employee_Ages, etc.) with a data type specified for each (e.g., integer, number, 10-character string, etc.).
Next, we look at schemas, which are defined as such:
A relation schema (aka table) $R$ denoted by $R(A_1, A_2, …, A_n)$ is made of a relation name and a list of attributes, where each attribute $A_i$ is the name of a role played by some domain $D$ in the relation schema $R$.
Now we can define a relation as such:
A relation (or relation state) $r$ of a relation schema $R(A_1, A_2, …, A_n)$ denoted by $r(R)$ is a set of tuples $r={t_1, t_2, …,t_m}$ where each n-tuple $t$ is an ordered list of $n$ values $t=<v_1,v_2,…,v_n>$ where each value $v_i, 1 \leq i \leq n$ is an element of $dom(A_i)$ or a special null value.
We can elaborate on these definitions by connecting them to terms we already know:
YearPublished
is integers.Mathematically, a relation is a subset of all possible tuples between the Cartesian products of a collection of sets.
For example:
This represents all the possible states of our table.
Now that we have an idea of the relational model, we can take our ER diagrams and convert them into a relational database with only tables, rows, columns, and cells.
For entities and attributes, it’s fairly simple; the only thing we need to do is decide on a domain for each attribute.
Name
is a composite attribute, we can split it into FirstName
and LastName
.We underline the primary key for readability.
Relationships are slightly more complex.
For a one-to-many relationship, it’s as simple as creating a new attribute in the “many” side with the keys of the associated “one” side as values.
DeptID is highlighted as a foreign key by the * beside it.
For attributes of the relationship itself, we simply add the attributes onto the “many” side.
In a one-to-one relationship, the same principle of adding foreign keys as attributes applies, but the question is which side we store it in.
Since some profs aren’t chairs of any department, we make the corresponding cell NULL.
For a many-to-many relationship, to remain in first normal form (i.e., every cell only has one value), we create a new table with the foreign keys of both sides matching up in each record.
Since each combination is unique, this still works.
For a ternary relationship, the same logic applies:
rattr is just an attribute of the relationship.
For weak entities, since they only have partial keys, we need another way to represent them in our relational model.
We use the foreign key with the partial key as the primary key for Section.
For multivalued attributes, we create a new table with the attributes of the foreign key and the attribute itself.
The combination of the two attributes is unique, so it still counts as a primary key for the table LanguagesSpoken.
For mapping ER diagrams, there are 7 rules listed out by Codd:
We can ensure database integrity through constraints.
Enforce these using triggers or assertions:
Trigger.
Assertion in MySQL.