Data Management
Goals
Know the differences between traditional file organization methods
and the database approach.
Know how database management systems are used.
Know the importance of a relational database.
Understand how databases are changing business operations across industries.
File Management
Historical Problems
Lacked data mechanisms for:
tagging
retrieving
manipulating
In other words, everything was custom coded and program specific.
File System Problems
Program/Data Dependence
Data Redundancy
Lack of Data Integrity
Lack of Data Protection
Databases: A Structured Solution
Databases offer a structured solution.
A structured collection of data is called a Database.
A collection of tools to implement and manipulate the data is called
a Data Base Management System
Database Components
The smallest unit within a database is a field
A related collection of fields is called a record
A related collection of records may be a file
A related collection of files, or a single file, may be a database
Database Functionality
All databases offer analogous functionality including:
Queries
Reports
Forms
Security
This functionality is implemented in a variety of different ways.
Database Improvements
Reduced data redundancy
Application/data independence
Better control
Flexibility
Database Models
The Hierarchical Model
Sometimes called Tree Model
Single Parent, multiple child records
The Network Model
Multiple parents, multiple child records
The Relational Model
Structured Tables with well defined relationships
Relational Model Terms
Because Relational Technology is derived from mathematical set theory,
its terminology is different than conventional computer terminology. For
example:
Objects -- Entities
Tuples -- records
Fields -- attribute
Relation -- Table
Joins -- Logical structures that can be dealt with like tables.
Keys
Keys can be used to identify unique records within a table
Keys can also used to define the relationship between (join) tables
When a key is composed of multiple fields it is called a composite key
The Object-Oriented Structure
Some Object Characteristics
Encapsulation
Inheritance
OO is currently being used primarily in Graphics but will become more mainstream
in the future.
As operating systems both Windows 95 and Windows NT are said to have
object characteristics.
Entity-Relationship Diagrams
Conceptual blueprint for the database
E-R diagrams are graphical representations of all entity relationships.
No standard methodology exists for creating E-R diagrams.
DBMS Components
The Schema
Describes the structure of the database, the names and types of fields
in each record type, and the general relationships among different sets
of records.
Metadata -- anything that describes the data, but is not data is called
metadata (data about data).
DBMS Components
Types of Data
Numeric
Alpha
Graphic
Each type of data can also have a format
Data types and formats are important because they determine what
type of future functionality that your Data Base will have as well as how
much storage space that it will need.
DBMS Components
The Data Dictionary
Contains all the information supplied by the database developer when
constructing the schema
The Data Definition Language (DDL)
Normally internal to the program and not visible
The Data Manipulation Language (DML)
Used to query the database
Relational Operations
Popular Relational Data Bases include:
Oracle
Access
SQL Server
Structured Query Language
The DDL and DML of choice.
Originally developed by IBM, it is now an open standard.
SQL commands have an English like structure such as:
Insert
Delete
Select
Update
Structured Query Language
SQL statements can be embedded in languages such as COBOL and Visual
Basic.
SQL routines should be portable across different Data Bases effecting
a DB independence.
Graphical front ends are available for some SQL Data Bases.
Popular Database Management Systems
IMS -- A hierarchical DBMS developed by IBM.
FOCUS -- Information Builder
DB2 -- A relational Data Base from IBM
SQL Server -- A relational Data Base from Microsoft
Database Architecture
Distributed Databases
A response to changing environments.
Countered existing mainframe logic.
Distributed Databases may use Replication.
Replication is where a copy of the Data Base is replicated at each
site that might use it.
Replication
Used by many popular programs:
NT Server
Lotus Note
WIN95
Leverages investment in distributed computers.
Client/Server Systems
Shares work between two or more computers.
In two level client server, the front end handles the interface and
the back end handles the database.
Multilevel client server generally splits the interface onto the client,
the business rules onto a middle layer, and the data base onto the back
end.
Data Warehousing and Data Mining
Data Warehousing supports the concept that all of a companies information
needs to reside upon a single system.
This system is called the data warehouse.
Once, this is accomplished, the data can be mined for relationships
that would not be apparent if the data were on different systems.
Jobs
DBA -- Data Base Administrator A person that maintains a data base
Data Base Designer A person that designs and sets up a data base.
Privacy
Large amounts of data in databases can lead to concerns about privacy