If you work in a traditional software company probably you will have a team. Depending how the project is organized (scrum, agile or canban) you may be asked to create requirements and presentations for the project.
If the project use large amount of data requirements will include a database and will make recommendations about the database structure but will not go in much details. Sometimes requirements will include a "Conceptual Model" (CM).
A database model is a graphical representation of database elements. This is used by database designers to communicate high level design of database structure. There are 3 model types you can make before you implement a large database. You need to understend how to read these models if you wish to create a large project.
Each model has its own merits but the most valuable is the Conceptual Model. This is the only model that I will show in details how to make using entry level tools.
Most database designers are using tools to create a database model diagrams. These tool are based on IBM UML notation: (Unified Modeling Language) and can connect to existing databases. You can use them to generate code for you. However some companies that are less traditional and more efficient are using paper and pencil for design.
On-line tools are implemented as SaaS (Softare as a Service) platform. Yes there are websites that enable you to design on-line for a small price. One of these is DB Designer. Honestly I have no intention to use this tool for my open source projects. It can not export your diagram as SVG so I will pass thank you.
To represent an entity use a rectangle. To represent relations use a line. If you wish to represent the relation cardinality, use a conceptual notation for the line end. This consist of a bar, a circle or a crow foot like in the picture below:
Conceptual Symbols
The value of a good conceptual model is very high. One picture can value as 1000 words. However the most advanced the tools you use the more money you spend. The most simple tool you can use is: Microsoft paint. It is not very professional to use Paint becouse it is producing raster type of images: (jpg/png).
A second tool I found useful is Google Drawings. With this tool you can create workflow diagrams and export them as SVG. As you know, SVG is a vector format of higher quality for Internet pages. We have selected for Sage-Code diagrams a set of symbols similar to the official ones. These symbols will be used in our diagrams to make efficient SVG based models.
Sage Symbols
One of the most strong principles in relational database design is the principle of normalization. This is the process of organizing data in a database. There are two goals of the normalization process:
Edgar Codd is the inventor of the relational model (RM). He has introduced the concept of normalization and what we now know as the First normal form (1NF) in 1970. Codd went on to define the Second normal form (2NF) and Third normal form (3NF) in 1971
References to Wikipedia
The normalization process is using several normalization rules:
First normal form (1NF): This is a property of a relation in a relational database. A relation is in first normal form if and only if the domain of each attribute contains only atomic indivisible values, and the value of each attribute contains only a single value from that domain.
First normal form enforces these criteria:
Second normal form (2NF): A table that is in first normal form (1NF) must meet additional criteria for second normal form. Specifically: a table is in 2NF if it is in 1NF and no non-prime attribute is dependent on any proper subset of any candidate key of the table.
A non-prime attribute of a table is an attribute that is not a part of any candidate key of the table. Atrbutes that are part of a candidate kye are called prime attributes.
Third normal form (3NF): is a normal form that is used in normalizing a database design to reduce the duplication of data and ensure referential integrity by ensuring that:
1. the entity is in second normal form
2. all the attributes in a table are determined only by the candidate keys of that table and not by any non-prime attributes.
Most of applications use Third Normal Form (3NF).
These principles are very scientific and difficult to understand. However a good designer must know how to apply these principles. This comes within experience, with many try and errors.
In Relational Algebra the concepts are Logical or Abstract. In a Relational database, we implement the Logical or Conceptual model using Tables. So here are some example of conceptual diagrams patterns that you can use do make a database model:
1:1 Relations
1:M Relations
Association
Inheritance
Extension
Note: Detail design of database do not contain description of stored procedures. This design is too complex to be included in database design document. So most Architects let the developement team to decide what packages to implement.
For implementing a database, developers will create DDL scripts. This is a subset of SQL that means: "Data Definition Language". This language enable creation and modification of objects in a database.
The order of scripts is very important. You can not create indexes for a table that do not exist. So here is the natural order of making objects in a database. First you make all the tables with primary keys and check constraints. Second you make the indexes. Third you make the foreign key costraints.<>
Security: Critical databases must have good design for data access rules. This is done using a special script that is using DCL (Data Control Language). You will learn this language in our Programming Language tutorial for SQL.
Maintenance: Database administrator will peform no optimizations if this is not required by software developement team or database architect. Therefore a good design document should include directives for database maintenance tasks, like frequency fo index rebuid, and frequency for compacting and reorganization of table data to reset the free space.
In an agile company one team is assigned a single project. A project consist of several epics and each epic is represented by several stories. One story may have several tasks to be completed. A developer can work in a single team and can be assigned to resolve multiple stories. A story need a resolver, so when we create a story it must be assigned immediatly, otherwise we can not create stories.
Logic Model
At first we design tables with primary key and foreign keys. This is the high level design of database table. Tables may have additional attributes that will be part of Phisical Model.
Table Type | Name | Primary Key | Foreign Keys |
---|---|---|---|
Entity | PROJECT | ID | OWNER_ID |
Entity | DEVELOPER | ID | PROJECT_ID |
Association | TEAM | ID | MASTER_ID |
Detail | EPIC | ID | PROJECT_ID |
Association | STORY | ID | EPIC_ID |
REZOLVER_ID | |||
Detail | TASK | ID | STORY_ID |
REVIEWER_ID |
These relations are implementing several fundamental Agile rules. For some of the rules we need to implement triggers and attribute constraints. This can be done in the phisical model. We do not show you the phisical model in this article.
ER diagram is used to show attributes, data types, primary keys and references. This is also called physical model. This diagram is used by developers to create the database structure script using SQL.
Also can be used by developers who make use the database to write better SQL.
ER-Diagram
This has demonstrate how to make a data model to resolve a real problem. We have shown the Logical Model, and the Phisical Model. Using these models you can understend the design of a database. Having a model will help you build better SQL to analyze your data.
Read next: Testing