Understanding about relational database m-square systems inc

Knowledge Sharing Session
Understanding Relational Databases – IBM DB2
October 27, 2014
Presented By: Muthukumaran Natarajan

Introduction to Relational Database
Management System (RDBMS)
— A relational database is built from a collection of one or more relations (tables) and
the data they hold.
— Leading (commercial) vendors of relational database products:
✓ Oracle
✓ Microsoft (MS Access, SQL Server)
✓ IBM (DB2 LUW & z/OS, Informix)
✓ Sybase
— A language called SQL (Structured Query
Language) was developed to work with relational
databases.
— All data is stored in the form of tables (relations)
comprised of rows (records) and columns (fields).

INTRODUCTION TO RELATIONAL
DATABASE – DB2

RELATIONAL DATABASE
A well-designed database should:
✓ Eliminate data redundancy: the same piece of data
should not be stored in more than one place. Duplicate
data not only wastes storage space
but also easily leads to inconsistencies and
performance issues.
✓ Ensure data integrity and accuracy.

There are several types of database relationships:
✓ One-to-One Relationships
✓ One-to-Many and Many-to-One Relationships
✓ Many-to-Many Relationships
✓ Self-Referencing Relationships

ONE TO MANY RELATIONSHIP
— A one-to-many (1:m) relationship is
where, for each instance of table A, many
instances of table B can exist, but for each
instance of table B, only one instance of
table A exists.
— For example:
◦ For each artist there are many paintings, but each painting
has only one artist, so the relationship is one-to-many and not
many-to-many.

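Continued..
One-to-many, with a foreign key in the child table:
author(id, otherAttributes)
books(id, authorid, otherAttributes)
Or, with a separate linking table (which also supports many-to-many):
author(id, otherAttributes)
books(id, otherAttributes)
authorConnectsBooks(authorid, booksid)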
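The first layout above (a foreign key in the child table) can be tried out with a minimal sketch. It uses Python's built-in sqlite3 module as a stand-in for DB2, so SQLite's dialect is an assumption here; the `name` and `title` columns and the sample rows are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when asked
conn.executescript("""
    CREATE TABLE author (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
    -- One-to-many: each book row carries the id of its single author.
    CREATE TABLE books (
        id       INTEGER PRIMARY KEY,
        authorid INTEGER NOT NULL REFERENCES author(id),
        title    TEXT NOT NULL
    );
    INSERT INTO author VALUES (1, 'Author A'), (2, 'Author B');
    INSERT INTO books VALUES (10, 1, 'Book X'), (11, 1, 'Book Y'), (12, 2, 'Book Z');
""")

# One author row matches many book rows; each book matches exactly one author.
books_per_author = conn.execute("""
    SELECT a.name, COUNT(b.id)
    FROM author a JOIN books b ON b.authorid = a.id
    GROUP BY a.name
    ORDER BY a.name
""").fetchall()
print(books_per_author)
```

If a book could have several authors, the `authorid` column would move out of `books` into the linking table.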

Examples of One-to-Many Relationships
This is the most commonly used relationship
type. Consider an e-commerce website with
the following:
— Customers can make many orders.
— Orders can contain many items.
— Items can have descriptions in many
languages.

Relationship & Understanding
One-to-many relationships
◦ The most common relationship used when creating relational
databases. A row in one table can be associated with one or
(likely) more rows in another table. An example of a one-to-many
relationship is a single order that has many items on it. Since
relationships work both ways, it is not uncommon to hear references to
many-to-one relationships as well.
One-to-one relationships
◦ A row in a table is associated with one and only one row in another table. An
example of a one-to-one relationship is that a person can have one social security
number, and a social security number can be assigned to only one person.
◦ In most cases there is no need for a one-to-one relationship, as the contents of
the two tables can be combined into one table.
Many-to-many relationships
◦ One or more rows in a table are associated with one or more rows in
another table. An example of a many-to-many relationship is a table of
customers who can purchase many different products and a table of products
that can be purchased by many different customers.

PRIMARY KEY
— A primary key is a field in a table which
uniquely identifies each row/record in a
database table.
— Primary keys must contain unique
values.
— A primary key column cannot have
NULL values.
— For example, a unique
number, customerID, can be used as the
primary key for the Customers table.

Types of Constraints
— The following types of constraints are
available:
ü NOT NULL constraints
ü Unique (or unique key) constraints
ü Primary key constraints
ü Foreign key (or referential integrity) constraints
ü (Table) Check constraint
12

Referential integrity
— Referential integrity is a relational
database concept in which multiple
tables share a relationship based on
the data stored in the tables, and that
relationship must remain consistent.
— Referential integrity enforces the
following rules:

— Referential Integrity Rule: Each foreign key value
must be matched to a primary key value in the table
referenced (or parent table).
◦ You can insert a row with a foreign key in the child table only if the value
exists in the parent table.
◦ If the value of the key changes in the parent table (e.g., the row is updated or
deleted), all rows with this foreign key in the child table(s) must be handled
accordingly. You could either (a) disallow the change; (b) cascade the
change (or delete the records) in the child tables accordingly; or (c) set the key
value in the child tables to NULL.
◦ Most RDBMSs can be set up to perform the check and ensure referential
integrity in the specified manner.
— Business Logic Integrity: Besides the general
integrity rules above, there can be integrity
(validation) rules pertaining to the business logic, e.g., a zip
code shall be 5 digits within a certain range; delivery
date and time shall fall within business hours; quantity
ordered shall be equal to or less than quantity in stock,
etc. These can be enforced through validation rules (for
the specific column) or programming logic.

SET RULES – RI Types
Referential actions can be defined for UPDATE and DELETE. In general SQL there are
four different options available:
— 1. SET NULL:
The foreign key column is set to NULL
when the referenced row is updated/deleted.
— 2. CASCADE:
The foreign key column is updated when the
referenced column is updated, and child rows are deleted when
the referenced rows are deleted.
— 3. SET DEFAULT:
The foreign key column is set to its DEFAULT value when an UPDATE/
DELETE is performed on the referenced rows.
— 4. NO ACTION:
This is the default behaviour. If a DELETE/UPDATE is
executed on referenced rows, the operation is denied and an
error is raised.
Note: DB2 does not support every option for every operation (for example,
DB2 LUW allows only NO ACTION and RESTRICT for ON UPDATE), so check the
CREATE TABLE documentation for your platform.
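The delete rules above can be observed with a small sketch. It runs on Python's built-in sqlite3 module rather than DB2, so the dialect is an assumption; the `Employee` and `Asset` tables and all sample rows are hypothetical and exist only to give each rule its own child table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # off by default in SQLite
conn.executescript("""
    CREATE TABLE Department (BranchID INTEGER PRIMARY KEY, BranchName TEXT NOT NULL);
    CREATE TABLE Customer (
        CustID   INTEGER PRIMARY KEY,
        Name     TEXT NOT NULL,
        BranchID INTEGER REFERENCES Department(BranchID) ON DELETE CASCADE
    );
    CREATE TABLE Employee (
        EmpID    INTEGER PRIMARY KEY,
        BranchID INTEGER REFERENCES Department(BranchID) ON DELETE SET NULL
    );
    CREATE TABLE Asset (
        AssetID  INTEGER PRIMARY KEY,
        BranchID INTEGER REFERENCES Department(BranchID)  -- NO ACTION is the default
    );
    INSERT INTO Department VALUES (1, 'North'), (2, 'South');
    INSERT INTO Customer VALUES (100, 'Ann', 1), (101, 'Bob', 2);
    INSERT INTO Employee VALUES (200, 1);
    INSERT INTO Asset VALUES (300, 2);
""")

# Inserting a child row whose parent does not exist is rejected.
try:
    conn.execute("INSERT INTO Customer VALUES (102, 'Cy', 99)")
    orphan_rejected = False
except sqlite3.IntegrityError:
    orphan_rejected = True

# Deleting branch 1 cascades to Customer and nulls Employee.BranchID.
conn.execute("DELETE FROM Department WHERE BranchID = 1")
customers_left = conn.execute("SELECT CustID FROM Customer ORDER BY CustID").fetchall()
employee_branch = conn.execute(
    "SELECT BranchID FROM Employee WHERE EmpID = 200"
).fetchone()[0]

# Deleting branch 2 is denied because Asset still references it (NO ACTION).
try:
    conn.execute("DELETE FROM Department WHERE BranchID = 2")
    delete_denied = False
except sqlite3.IntegrityError:
    delete_denied = True

print(orphan_rejected, customers_left, employee_branch, delete_denied)
```

The denied delete leaves customer 101 in place, because the whole statement is backed out when the NO ACTION check fails.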

Parent Table - RI
— A FOREIGN KEY in one table points to a PRIMARY KEY in
another table.
Consider the structure of two tables, Customer and Department, starting with the parent table:
CREATE TABLE Department (
  BranchID        INTEGER     NOT NULL,
  BranchName      VARCHAR(20) NOT NULL,
  BranchStartDate DATE,
  PRIMARY KEY (BranchID) );
Or, to create a PRIMARY KEY constraint on the BranchID column when the Department table
already exists, use the following SQL syntax:
— ALTER TABLE Department ADD PRIMARY KEY (BranchID);

Parent-Child Relations (RI)
CREATE TABLE Customer (
  CustID   INTEGER     NOT NULL,
  Name     VARCHAR(20) NOT NULL,
  AccNo    VARCHAR(20),
  BranchID INTEGER );
db2 "ALTER TABLE Customer ADD FOREIGN KEY
(BranchID) REFERENCES Department ON DELETE
CASCADE"

Delete Rule
◦ The delete rule specifies what happens to rows in the child table when
a row in the parent table is deleted or updated.
◦ Cascade Delete: all child rows are deleted when the parent row is
deleted.
◦ Cascade Set Null: foreign key columns are set to NULL
when the parent row is deleted.
◦ Note: When you delete or update a row in a parent table for which a
Cascade Delete or Cascade Set Null rule is defined, the related rows
in the child table are adjusted accordingly, whether or not they are
explicitly included in the Access Definition or process.
Table Name
— Table Name identifies the table affected by
the delete or update of parent rows.

Normalization
— Normalization is a technique for organizing
the data in a database's tables.
◦ It is mainly used for two purposes:
– Eliminating redundant data
– Ensuring data dependencies make sense, i.e. data is logically stored.
Problem without normalization:
It becomes difficult to handle and update the database
without facing data loss.

Normalization
— First Normal Form (1NF):
◦ No row may contain a repeating group of information.
◦ Each column must hold a single value.
◦ Each row should have a primary key (a unique column or set of columns).
The following table is not in First Normal Form:
Student | Age | Subject
Adam    | 15  | Biology, Mathematics
Alex    | 14  | Mathematics
Stuart  | 17  | Mathematics
— In First Normal Form, no row may have a column in which more
than one value is saved, such as a comma-separated list. We should separate
such data into multiple rows.

First Normal Form
Student | Age | Subject
Adam    | 15  | Biology
Adam    | 15  | Mathematics
Alex    | 14  | Mathematics
Stuart  | 17  | Mathematics
With First Normal Form, data redundancy increases: the same column values
are repeated across multiple rows.

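Second Normal Form
— A table in Second Normal Form must not have any
partial dependency of any column on the
primary key. Each column in the table
that is not part of the primary key must
depend upon the entire concatenated key
for its existence.
In the First Normal Form table above, the key is (Student, Subject) but Age depends on
Student alone, so Student and Age move to their own table.
New Student Table for 2NF:
Student | Age
Adam    | 15
Alex    | 14
Stuart  | 17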

New Subject Table for 2NF:
Student | Subject
Adam    | Biology
Adam    | Mathematics
Alex    | Mathematics
Stuart  | Mathematics
Both of the above tables qualify for Second Normal Form. However, in Second Normal
Form some updates and insertions can still be complex, requiring changes in two
places.
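The two Second Normal Form tables can be sketched as follows, using Python's built-in sqlite3 module as a stand-in for DB2. The table names `Student` and `StudentSubject` and the age correction are assumptions made for illustration. The sketch shows that a join rebuilds the original three-column view, and that a student's age now needs to be changed in only one row.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Student (Student TEXT PRIMARY KEY, Age INTEGER NOT NULL);
    CREATE TABLE StudentSubject (
        Student TEXT NOT NULL REFERENCES Student(Student),
        Subject TEXT NOT NULL,
        PRIMARY KEY (Student, Subject)
    );
    INSERT INTO Student VALUES ('Adam', 15), ('Alex', 14), ('Stuart', 17);
    INSERT INTO StudentSubject VALUES
        ('Adam', 'Biology'), ('Adam', 'Mathematics'),
        ('Alex', 'Mathematics'), ('Stuart', 'Mathematics');
""")

# Age is stored once per student, so a single-row UPDATE corrects it everywhere.
conn.execute("UPDATE Student SET Age = 16 WHERE Student = 'Adam'")

# A join rebuilds the original three-column view on demand.
rows = conn.execute("""
    SELECT s.Student, s.Age, ss.Subject
    FROM Student s JOIN StudentSubject ss ON ss.Student = s.Student
    ORDER BY s.Student, ss.Subject
""").fetchall()
for row in rows:
    print(row)
```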

Third Normal Form
— Every non-prime attribute of a table must
depend directly on the primary key, not on another non-key attribute.
St-id | St-Name | DOB | Add1 | Add2 | City | State | Zipcode
In the above table, street (Add1, Add2), city and state depend upon the zip code; this is
called a transitive dependency. We need to apply 3NF and move the
street, city and state to a new table with ZIP as the primary key.
St-id | St-Name | DOB | ZIP
ZIP | Add1 | Add2 | City | State

Normal Form
— Higher Normal Forms: 3NF has its
inadequacies, which lead to higher
normal forms, such as Boyce-Codd
Normal Form (BCNF) and Fourth Normal Form (4NF).

Schema
— A schema is a collection of named database objects.
— Schemas provide a way to logically classify objects such as tables,
views, triggers, routines, or packages.
— A schema name is used as the first part of a two-part object name.
— A schema is itself a database object that is created using the
CREATE SCHEMA statement. The syntax of the CREATE
SCHEMA statement is as follows:
— CREATE SCHEMA { <schema-name> | AUTHORIZATION
<authorization-name> |
<schema-name> AUTHORIZATION <authorization-name> }
[ <schema-SQL-statement> ... ]

Performance
— DB2 has a number of performance optimization capabilities that
give you the insight and ability to optimize workload execution.
— These capabilities can save money and lower your risks by helping
you to do more work with your existing hardware, ensure Service
Level Agreements (SLAs) are met or exceeded, and increase DBA
productivity.
Performance can be considered at different levels:
— Server Performance
— Database Performance
— Query Performance
◦ Index scan
◦ Table scan
◦ Sorting
◦ Access methods

Naming Standards
Database Naming Conventions:
— Database object naming standards should be
developed in conjunction with all other IT
naming standards in your organization.
— In all cases, database naming standards should
be developed in cooperation with the data
administration department (if one exists) and,
wherever possible, should peacefully coexist
with other IT standards, but not at the expense
of impairing the database environment.

Data Definition Language (DDL)
The DDL statements include:
— Create
— Alter
— Drop
— Rename

Data Manipulation Language (DML)
The DML statements are:
— Select
— Insert
— Delete
— Update

JOINS
The different types of joins are:
— Inner Join
— Outer Join
◦ Left Outer Join
◦ Right Outer Join
◦ Full Outer Join

Inner Join Example
— An inner join of A and B gives the result of A intersect B, i.e. the
inner part of a Venn diagram intersection.
— An outer join of A and B gives the result of A union B, i.e. the
outer parts of a Venn diagram union.
◦ Example
◦ Suppose you have two tables, with a single column each, and data as follows:
A    B
-    -
1    3
2    4
3    5
4    6
Note that (1,2) are unique to A, (3,4) are common, and (5,6) are unique to B.
◦ Inner join
◦ An inner join using either of the equivalent queries gives the intersection of the two tables,
i.e. the two rows they have in common.
◦ select * from a INNER JOIN b on a.a = b.b;
◦ select a.*, b.* from a, b where a.a = b.b;
a | b
--+--
3 | 3
4 | 4

Left Outer Join Example
— A left outer join gives all rows in A, plus any
matching rows in B. Where a row in A has no match in B, the B columns are null.
— select * from a LEFT OUTER JOIN b on a.a = b.b;
a | b
--+-----
1 | null
2 | null
3 | 3
4 | 4

FULL OUTER JOIN EXAMPLE
— A full outer join gives the union of A and B, i.e. all the rows in A
and all the rows in B. If a row in A has no corresponding row
in B, the B portion is null, and vice versa.
— select * from a FULL OUTER JOIN b on a.a = b.b;
a    | b
-----+-----
1    | null
2    | null
3    | 3
4    | 4
null | 6
null | 5
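The three join results on these slides can be reproduced with a minimal sketch on Python's built-in sqlite3 module, used here as a stand-in for DB2. Because older SQLite builds lack FULL OUTER JOIN, the full outer join is emulated with two left joins; that emulation is a choice made for this sketch, not something DB2 requires.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE a (a INTEGER);
    CREATE TABLE b (b INTEGER);
    INSERT INTO a VALUES (1), (2), (3), (4);
    INSERT INTO b VALUES (3), (4), (5), (6);
""")

# Inner join: only the rows the two tables have in common.
inner = conn.execute(
    "SELECT a.a, b.b FROM a INNER JOIN b ON a.a = b.b ORDER BY a.a"
).fetchall()

# Left outer join: every row of a, with NULL (None) where b has no match.
left = conn.execute(
    "SELECT a.a, b.b FROM a LEFT OUTER JOIN b ON a.a = b.b ORDER BY a.a"
).fetchall()

# Emulated full outer join: the left join, plus the rows of b
# that found no partner in a.
full = conn.execute("""
    SELECT a.a, b.b FROM a LEFT OUTER JOIN b ON a.a = b.b
    UNION ALL
    SELECT a.a, b.b FROM b LEFT OUTER JOIN a ON a.a = b.b WHERE a.a IS NULL
""").fetchall()

print(inner)
print(left)
print(full)
```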

Column Selection
— Specify only the columns needed
— Avoid SELECT *
— Extra columns increase the row size of
the result set
— Retrieving only a few columns can
encourage index-only access

Use FOR FETCH ONLY
— When a SELECT statement is used
only for data retrieval, add the FOR
FETCH ONLY clause
— The FOR READ ONLY clause provides
the same function

Avoid Sorting
— DISTINCT - usually results in a sort to remove duplicates
— UNION - usually results in a sort to remove duplicates
— UNION ALL - does not sort, but
retains any duplicates
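The difference in results between UNION and UNION ALL can be seen in a small sketch. It uses Python's built-in sqlite3 module as a stand-in for DB2 and two hypothetical single-column tables, `a` and `b`, that share the values 3 and 4. It only shows which rows come back; whether the database actually sorts is an optimizer decision that this sketch does not measure.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE a (a INTEGER);
    CREATE TABLE b (b INTEGER);
    INSERT INTO a VALUES (1), (2), (3), (4);
    INSERT INTO b VALUES (3), (4), (5), (6);
""")

# UNION removes the duplicate 3 and 4, which costs extra work.
union_rows = conn.execute("SELECT a FROM a UNION SELECT b FROM b").fetchall()

# UNION ALL simply appends the second result to the first.
union_all_rows = conn.execute("SELECT a FROM a UNION ALL SELECT b FROM b").fetchall()

print(len(union_rows), len(union_all_rows))
```

When duplicates are impossible or harmless, UNION ALL avoids the duplicate-removal step.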

SQL TUNING TIPS
— ORDER BY
– may be faster if the columns are indexed
– use it to guarantee the sequence of the data
— GROUP BY
– specify only the columns that need to be
grouped
– may be faster if the columns are indexed
– do not include extra columns in the SELECT
list or GROUP BY, because DB2 must sort
the rows

Indexes
— Create indexes for columns you
frequently use in:
– ORDER BY
– GROUP BY (better than a
DISTINCT)
– SELECT DISTINCT
– JOIN

Join Predicates
— Response time is determined
mostly by the number of rows
participating in the join
— Provide accurate join predicates
— Never use a JOIN without a
predicate
— Join on indexed
columns
— Prefer joins over subqueries

Use BETWEEN
— BETWEEN is usually more efficient
than a >= predicate combined with a <=
predicate

Use IN Instead of LIKE
— If you know that only a certain
number of values exist and can be
put in a list, use IN or BETWEEN:
— IN ('Value1', 'Value2', 'Value3')
— BETWEEN :valuelow
AND :valuehigh
— Rather than:
— LIKE 'Value_'

Avoid Percentage
— Avoid the % or the _ at the beginning of a LIKE pattern,
because it prevents DB2 from using a
matching index and may cause a
table scan.
— Use the % or the _ at the end to
encourage index usage

Avoid NOT
— Predicates formed using NOT are
generally not indexable
— For subqueries, when using negative
logic:
– Use NOT EXISTS (rather than NOT IN)

Use EXISTS
— Use EXISTS to test for a condition
and get a true or false result from
DB2 without returning any rows from the
subquery:
SELECT col1 FROM table1
WHERE EXISTS
(SELECT 1 FROM table2
WHERE table2.col2 = table1.col1)
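The EXISTS query above, and its NOT EXISTS counterpart, can be exercised with a short sketch. It runs on Python's built-in sqlite3 module rather than DB2, and the sample rows (including a NULL in `table2.col2`) are assumptions chosen to show one way NOT EXISTS and NOT IN give different answers.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table1 (col1 INTEGER);
    CREATE TABLE table2 (col2 INTEGER);
    INSERT INTO table1 VALUES (1), (2), (3);
    INSERT INTO table2 VALUES (2), (3), (3), (NULL);
""")

# EXISTS: one row per table1 row with a match, even though 3 matches twice.
with_match = conn.execute("""
    SELECT col1 FROM table1
    WHERE EXISTS (SELECT 1 FROM table2 WHERE table2.col2 = table1.col1)
    ORDER BY col1
""").fetchall()

# NOT EXISTS: the table1 rows that have no match in table2.
without_match = conn.execute("""
    SELECT col1 FROM table1
    WHERE NOT EXISTS (SELECT 1 FROM table2 WHERE table2.col2 = table1.col1)
""").fetchall()

# NOT IN returns nothing here because table2 contains a NULL.
not_in = conn.execute(
    "SELECT col1 FROM table1 WHERE col1 NOT IN (SELECT col2 FROM table2)"
).fetchall()

print(with_match, without_match, not_in)
```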

Avoid Arithmetic in Predicates
— An index is not used for a column
when the column appears in an arithmetic
expression.
SELECT col1 FROM table1
WHERE col2 = :hostvariable + 10
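The general idea can be observed in a rough sketch on Python's built-in sqlite3 module, which reports its access path through EXPLAIN QUERY PLAN. This shows SQLite's optimizer, not DB2's; DB2's indexability rules differ by platform and version, and the index name `idx_col2` and the helper `plan` are assumptions made for this illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table1 (col1 INTEGER, col2 INTEGER);
    CREATE INDEX idx_col2 ON table1 (col2);
""")

def plan(sql, params=()):
    """Return the plan text SQLite reports for a statement."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    return " | ".join(row[-1] for row in rows)  # last column holds the description

# Arithmetic applied to the column: the index on col2 cannot be matched.
on_column = plan("SELECT col1 FROM table1 WHERE col2 - 10 = ?", (5,))

# Same condition with the arithmetic moved to the value side.
on_value = plan("SELECT col1 FROM table1 WHERE col2 = ? + 10", (5,))

print(on_column)
print(on_value)
```

In SQLite the first plan is a full scan of `table1`, while the second is a search that uses `idx_col2`. Computing the value in the application program before binding it keeps the column side of the predicate free of arithmetic.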

Limit Scalar Function Usage
— Predicates that apply scalar functions to a column are generally not indexable
— But you can use scalar functions to
offload work from the application
program
— Examples:
– DATE functions
– SUBSTR
– CHAR
– etc.

Other Cautions
— Predicates that contain
concatenated columns are not
indexable
— SELECT COUNT(*) can be expensive
— CASE expressions are powerful but can
be expensive

Difference between OLAP and OLTP

Database Design Process
Steps in designing a Database:
1. Determine the purpose of your database
2. Determine the tables you need
3. Determine the fields, data type, size and
primary/foreign key constraints required for
each table.
4. Determine the Relationships
5. Refine your design

Discussion Scenarios
Scenario: 1
SELECT Name, NVL(Salary, 0)
FROM TBL_EMP
WHERE Salary IS NULL
ORDER BY Name
Question: What is displayed when the salary is
NULL?

Discussion Scenarios …
Scenario: 2
SELECT Name
FROM TBL_EMP
WHERE Name LIKE '_a%'
Question: Which names are displayed?

Scenario: 3
Which two relationships exist for patient and
doctor if a patient can have many doctors, a doctor
can have many patients, and a doctor can be a
patient?
Scenario: 4
Which type of entity relationship exists between
patient and doctor if a patient can have only one
doctor but a doctor can have many patients?
Note: A doctor cannot be a patient.

Scenario: 5
List the employee names, their roles and their respective
managers, showing who works under each manager,
grouped by department.
Scenario: 6
Write a query to analyze how long it takes for your orders to be shipped from the
date the order was placed. Create a report that displays the customer
number, order date, date shipped and the number of months, in whole
numbers, from the time the order is placed to the time the order is
shipped.

SELECT Customer_ID, Order_Dt, Ship_Dt,
ROUND(MONTHS_BETWEEN(Ship_Dt,
Order_Dt)) AS "Months Taken"
FROM TBL_Order

Getting Started or Support –
Muthu Natarajan
info@msquaresystems.com
www.msquaresystems.com
Phone: 703-222-5500/212-941-6000
