Models

Making Actual Databases

We know how to create a paper representation of a database, but how do we create a real-life database out of that diagram?

What we used to do is make a hierarchical model like so, where we map the ER diagram directly to a tree-like structure:

Untitled

This sounds nice, but the issue is that we have some redundancy due to how some aspects don’t have clear hierarchical relations.

The other problem is with many-to-many relationships since they don’t fit neatly into a hierarchical model, so database designers had to form weird workarounds.

What revolutionized the database was a paper by Ted Codd in 1970, which explained the relational data model, the most used database model today.

Since the relational model is based around keys, it’s useful to know what a key actually is.

Keys

A key is an attribute (or combination of attributes) that differentiates two tuples/records.

The benefit of having keys be what differentiates records is that it makes relations easy to represent.

What we do is have a foreign key as an attribute representing a relation.

In order to be a primary key, an attribute must be unique and NON-NULL (called the entity integrity constraint), with most DBMS enforcing these rules.

This is essentially how relational databases function in a structural sense, but how do we know they work? For that, we need some math and some new terms.

Theoretical Stuff

To explain what relations are in the mathematical sense, we need a few terms.

First off, we look at the term domain, which is a set of atomic values (e.g., Department_IDs, Employee_Ages, etc.) with a data type specified for each (e.g., integer, number, 10-character string, etc.).

Next, we look at schemas, which are defined as such:

A relation schema (aka table) $R$ denoted by $R(A_1, A_2, …, A_n)$ is made of a relation name and a list of attributes, where each attribute $A_i$ is the name of a role played by some domain $D$ in the relation schema $R$.

Now we can define a relation as such:

A relation (or relation state) $r$ of a relation schema $R(A_1, A_2, …, A_n)$ denoted by $r(R)$ is a set of tuples $r={t_1, t_2, …,t_m}$ where each n-tuple $t$ is an ordered list of $n$ values $t=<v_1,v_2,…,v_n>$ where each value $v_i, 1 \leq i \leq n$ is an element of $dom(A_i)$ or a special null value.

We can elaborate on these definitions by connecting them to terms we already know:

Example Table

Untitled

Mathematically, a relation is a subset of all possible tuples between the Cartesian products of a collection of sets.

For example:

This represents all the possible states of our table.

Properties of Relations

Doing Actual Modelling

Now that we have an idea of the relational model, we can take our ER diagrams and convert them into a relational database with only tables, rows, columns, and cells.

Entities and Attributes

For entities and attributes, it’s fairly simple; the only thing we need to do is decide on a domain for each attribute.

Untitled

Untitled

We underline the primary key for readability.

Relationships

Relationships are slightly more complex.

One-to-Many Relationships

For a one-to-many relationship, it’s as simple as creating a new attribute in the “many” side with the keys of the associated “one” side as values.

Untitled

DeptID is highlighted as a foreign key by the * beside it.

For attributes of the relationship itself, we simply add the attributes onto the “many” side.

Untitled

One-to-One Relationships

In a one-to-one relationship, the same principle of adding foreign keys as attributes applies, but the question is which side we store it in.

Untitled

Since some profs aren’t chairs of any department, we make the corresponding cell NULL.

Untitled

Many-to-Many Relationships

For a many-to-many relationship, to remain in first normal form (i.e., every cell only has one value), we create a new table with the foreign keys of both sides matching up in each record.

Untitled

Since each combination is unique, this still works.

Ternary Relationships

For a ternary relationship, the same logic applies:

Untitled

rattr is just an attribute of the relationship.

Weak Entities

For weak entities, since they only have partial keys, we need another way to represent them in our relational model.

Untitled

We use the foreign key with the partial key as the primary key for Section.

Multivalued Attributes

For multivalued attributes, we create a new table with the attributes of the foreign key and the attribute itself.

Untitled

The combination of the two attributes is unique, so it still counts as a primary key for the table LanguagesSpoken.

7 Rules of Mapping ER Diagrams to Relational Models

For mapping ER diagrams, there are 7 rules listed out by Codd:

  1. For each regular entity type $E$, create a relation $R$ that includes all simple attributes of $E$, excluding composite attributes but including their simple components. Choose a key attribute as the primary key of $R$.
  2. For each weak entity type $W$, create a relation $R$ and include all attributes of $W$. Also, include the primary key of the owner entity as a foreign key in $R$. Combine the foreign key with the partial key of $W$ to form the primary key.
  3. For each 1:1 relationship, include the primary key of one side as a foreign key in the other. Preference is given to the side with total participation.
  4. For each 1:N relationship, include the primary key of the “one” side as a foreign key in the “many” side.
  5. For each M:N relationship, create a new relation to represent the relationship. Include the primary keys of both participating entities as foreign keys and combine them to form the primary key.
  6. For each multivalued attribute, create a new relation that includes the attribute and the primary key of the parent entity as a foreign key.
  7. For each n-ary relationship ($n > 2$), create a new relation to represent the relationship. Include the primary keys of all participating entities as foreign keys and combine them to form the primary key.

Constraints

We can ensure database integrity through constraints.

Key Constraints

Referential Integrity

Semantic Constraints

  1. State constraints: Define valid states (e.g., hours worked ≤ 60).
  2. Transition constraints: Define valid state changes (e.g., wages can only increase).

Enforce these using triggers or assertions:

Untitled

Trigger.

Untitled

Assertion in MySQL.