Entity integrity and referential integrity are two forms of data integrity that are particularly important in relational databases. Relational databases break the storage of data down into elements that usually have to be joined back together again in order to produce meaningful results. Without guarantees of these two types of integrity, data would get dropped or duplicated.
Relational Databases
Video of the Day
Relational database management systems represent the most widely implemented database model in the world today. The relational model relies on "Normalized" data. Normalization is a process carried out in the database design process to ensure that the tables holding the data do not have repeating data and cannot create repetition or loss when they are joined together
Video of the Day
Reference Data
Most database applications use two different types of data: transaction data and reference data. As an example, a very basic warehouse database will record stock movement transactions. Imagine a transaction record such as: Item 1010, Quantity 5, Direction Out. A manager may wish to see a report of this data giving the item name and the item description as well as the item number. These details are attributes of a different entity that describes each part in the warehouse. This is a reference table.
Normalization
In the warehouse database example, the transactions table only needs to hold an identifier for the part. If details such as name and description were held in the transactions table, this would cause needless repetition. Any item always has the same name and description, and these details relate to the part and not to the transaction. In this example system, the parts details are held in a different table. Every record in the table has to be uniquely identified by one field, or a combination of fields, called the primary key. Splitting out repeating groups is called "normalization."
Entity Integrity
In the warehouse database example, the primary key of the parts table should be an attribute that uniquely identifies each record. This is entity integrity. Generally, numeric fields are better for a primary key than text fields. In the parts table, the best field for the primary key would be the Item number. The organization may sell a variety of products with the same name, like "small wrench." The Item number will always be unique per item, and so this ensures entity integrity for the parts table. If several records had the same primary key value, referencing details from this table by joining it to the transaction table would cause transaction records to duplicate in order to match with each record with the same identifier.
Referential Integrity
Referential integrity ensures that the link between the transactions table and the parts table is correctly formed. In the transaction table, the item number forms the foreign key, which is the attribute that links to the primary key of the parts table. Only item numbers listed in the parts table can appear in the transaction table. This is referential integrity. If the transactions table contains an item number that is not also in the parts table, joining the tables together would cause transactions with the missing item number to be left out of the results.