E-R model and normalization of our plugin
Simply, data redundancy means a field repeats in two or more tables. It may cause inconsistency and should be avoided when designing our database. Normalization helps preventing redundancy and makes the best possible usage of storage. However, sometimes redundant data makes our operation more efficient. We should make a balance between space and time.
I will explain the process of normalizing the tables in our plugin.
The E-R model of our plugin is
The original table in our plugin is as follows.
Name |
ID |
Weight |
Height |
Date |
Time |
Duration |
Meal_calorie |
Sports_info |
Note that Name, ID, Date and Time are primary key.
We notice that Sports_info is not an atomic attribution and we should decompose it into atomic attributes.
So the tables turn to
Name |
ID |
Weight |
Height |
Date |
Time |
Duration |
meal_calorie |
Exercise type |
Exercise calorie |
Exercise intesity |
Upper limbs |
lower limbs |
abdominal muscle |
muscles of back |
psoas |
Note that these two are one table.
Now all the attributes become atomic. This is 1NF.
To reduce the redundancy, we should eliminate all the partially functional dependency. The result is as follows:
Table 1 – basic user information
ID |
Name |
Weight |
Height |
Table 2 – user activity and meals information
ID |
Date |
Time |
Duration |
Meal calorie |
Exercise type |
Exercise calorie |
Table 3 – exercise information
Exercise type |
Exercise intensity |
Upper limbs |
lower limbs |
abdominal muscle |
muscles of back |
psoas |
This is 2NF.
Now, we should eliminate the transitive functional dependency. The only transitive functional dependency occurs that “Exercise calorie” is transitively functional dependent on PK “ID” “Date” and “Time”.
However, we notice that
Exercise calorie= duration×Exercise intensity.
So there is no need to keep record of exercise calorie.
Moreover, we can divide the information of meals and sports in two tables because at one time only one type of event could happen. If they are in the same table, there may be many NULL elements.
Then table 2 is changed into
ID |
Date |
Time |
Meal calorie |
And
ID |
Date |
Time |
Duration |
Exercise type |
The final design of our tables are as follows:
Table 1 – basic user information
ID |
Name |
Weight |
Height |
Table 2 – user meals information
ID |
Date |
Time |
Meal calorie |
Table 3 – user exercise information
ID |
Date |
Time |
Duration |
Exercise type |
Table 4 – sports information
Exercise type |
Exercise intensity |
Upper limbs |
lower limbs |
abdominal muscle |
muscles of back |
psoas |
Table 5 – food information(this is added for the sake of comprehensiveness)
Food type |
Food calorie |
Now I will analysis the domain of each attributes.
Table 1 – basic user information
ID |
Name |
Weight |
Height |
int |
Char(20) |
float |
float |
Table 2 – user meals information
ID |
Date |
Time |
Meal calorie |
int |
Char(12) |
Char(5) |
float |
Table 3 – user exercise information
ID |
Date |
Time |
Duration |
Exercise type |
int |
Char(12) |
Char(5) |
float |
Char(30) |
Table 4 – sports information
Exercise type |
Exercise intensity |
Upper limbs |
lower limbs |
abdominal muscle |
muscles of back |
psoas |
Char(30) |
Char(8) |
bool |
bool |
bool |
bool |
bool |
Table 5 – food information (this is added for the sake of comprehensiveness)
Food type |
Food calorie |
Char(30) |
float |
Without doubt, ID should be int and the number of weight, height, calorie, duration of sports should be float. The name of sports and food is also char.
We record the date in a type of ‘Dec.27th, 2011’, so its size is char and no more than 12bit. And, we record the time in a type of ’00:00’, so its size is char and no more than 5bit.
The use of bool in exercise part is to illustrate whether this part is strengthened.
When we added the new function, data visualization, we found that it is inconvenient for us to read net calorie from one table and do operations on them. It is a waste of time. So we consider creating a new table called accumulate calorie.
Date |
Time |
Accumulate calorie |
This table is for drawing accumulative curve.
We add the number of calorie today to yesterday’s sum and get the sum of the number of calorie until today. This can save time for us and won’t occupy much space.
But time limited, this is not realized.
This is the process of designing table for our plugin.
Though I am in charge of database implementation for our plugin, i was not familiar with these theoretical things then. So many thanks to Brother Ben. Thanks for his clarity of thinking and good understanding of normalization.
Share your knowledge and creativity with the world, protect your work with CC Licenses and discover other creative works and CC activities around the world. For Chinese CC, please go to: (中文版知识共享协议网站请点击) http://cn.creativecommons.org.
Creative Commons licenses provide a flexible range of protections and freedoms for authors, artists, and educators.
Toyhouse.cc is licensed under a Creative Commons Attribution-NoDerivs 3.0 Unported License.
© 2012 Created by Toyhouse Administrator.

You need to be a member of Toyhouse to add comments!
Join Toyhouse