Download Distributed Databases: Definition, Advantages, Problems, and Solutions and more Slides Introduction to Database Management Systems in PDF only on Docsity!
Database Management Systems
1
Objectives
• Why do you need a distributed database?
• What are distributed databases?
• How is data distributed with client/server
systems?
• Can a Web approach solve the data distribution
issues?
• How much data can you send to a client form?
• How do you transfer data across diverse systems?
• How will Sally’s employees access the database?
2
Distributed Database Definition
- Multiple independent databases
- Each DBMS is a complete DBMS (engine, queries, locking, transactions, etc.)
- Usually on different machines.
- Usually in different locations.
- Connected by a network.
- Might be different environments
- Hardware
- Operating System
- DBMS Software
4
Database Zeus
Database Apollo
Database Athena
United States
England
France
Distributed Database Rules
- C.J. Date
- Rule 0: Transparency: the user
should not know or care that the
database is distributed.
- Local autonomy.
- No reliance on a central site.
- Continuous operation.
- Location independence.
- Fragmentation independence (physical storage).
- Replication independence.
- Distributed query processing.
- Distributed transaction management.
- Hardware independence.
- Operating system independence.
- Network independence.
- DBMS independence.
5
Advantages and Applications
- Business operations are often
distributed
- Work and data are segmented by department.
- Work and data are segmented by geographical location.
- Improved performance
- Most updates and queries are performed locally.
- Maintain local control and responsibility over data.
- Can still combine data across
the system.
- Scalability and expansion
7
local transactions
future expansion
Creating a Distributed Database
• Design administration plan.
• Choose hardware and DBMS vendor, and
network.
• Set up network and DBMS connections.
• Choose locations for data.
• Choose replication strategy.
• Create backup plan and strategy.
• Create local views and synonyms.
• Perform stress test: loads and failures.
8
Distributed Query Processing• Example
- NY: Customers: 1 M rows
- LA: Production: 10 M rows
- Chicago: Sales: 20 M rows
- Query: List customers who bought blue products on March 1
- Bad idea #
- Transfer all rows to Chicago
- Then JOIN and select.
- Better idea #2 (probably)
- Transfer blue products from LA to Chicago
- Better idea #
- Get sale items on March 1
- Get blue products from LA
- Send C# to NY
10
Customers(C#, …) 1,000,
NY
Products(P#, Color…) 10,000,
Sales(S#, C#, Sdate) 20,000, SaleItem(S#, P#,…) 50,000,
Chicago
LA
P# sold on March 1 Blue P# sold on March 1
C# list from desired P# Matching Customer data
Data Replication
- Goals
- Minimize transmissions
- Improve performance
- Support heavy multiuser access.
- Problems
- Updating copies
- Bulk transmissions
- Site unavailable
- Concurrency
- Easier for two people to change the same data at the same time.
- Decision support systems.
- Data warehouse.
11
Britain: Customers & Sales France: Customers & Sales
Spain: Customers & Sales
Britain
Britain: Customers & Sales France: Customers & Sales
Spain: Customers & Sales
Spain
Update data.
Market research & data corrections.
Periodic updates
Transactions & Two-Phase Commit
- Two (or more) separate lock
managers.
- DBMS initiating update serves
as the coordinator.
- Two phases
- Coordinator sends message and data to all machines to “get ready.”
- Local machines save data in logs, verify update status and return message.
- If all locals report OK, then coordinator writes log and instructs others to proceed. If any fail, it sends Rollback message.
13
Database 1 Initiate Transaction
Database 2 Database 3
- Prepare to commit. All agree? 2. Commit
Lock tables. Save log. Update all tables.
Distributed Transaction Managers
14
Transaction Processing Monitor
Transaction Manager
Resource Manager
Transaction Manager
Resource Manager
Transaction Manager
Resource Manager
DBMS DBMS
The distributed transaction^ DBMS coordinator/transaction processing monitor handles the transaction decisions and coordinates across the participating systems.
Client-Server
17
Server
Server
Clients Clients
Shared
Database
Front-end
User Interface
LAN File Server
- Not a distributed database.
- Data file stored on server.
- Server is passive, appears as giant disk drive to PC.
- PC processes all data.
- Retrieves all needed data across the network.
- Performance improvements.
- Indexes are crucial.
- Store some data on each PC (replication).
- Store applications on PC (graphics & forms).
- Convert to SQL-Server
18
File Server
DBMS data file
Application SharedData
SELECT Name, SaleDate FROM Customer INNER JOIN Sales ON Customer.C# = Sales.C# WHERE SaleDate BETWEEN #1-Mar-97# AND #9-Mar-97#;
All data from all tables are read by PC, which performs JOIN and WHERE test. If available, reads index first.
Client-Server Databases
dominant (server) and handles
data for many clients.
- Client machines handle front-end
tasks and small data tables that
are not shared.
20
File Server
DBMS
SQL Server
Shared Data
application
Send SQL statement.
Return matching data.
Three-Tier Client-Server
- Server Databases
- Client front-end
- Middle
- Locate databases
- Business rules
- Program code
22
Client
Middleware
Database
Servers
Application.
Front-end.
User Interface.
Databases.
Transactions.
Legacy applications.
Database links.
Business rules.
Program code.