Docsity
Docsity

Prepare for your exams
Prepare for your exams

Study with the several resources on Docsity


Earn points to download
Earn points to download

Earn points by helping other students or get them with a premium plan


Guidelines and tips
Guidelines and tips

Distributed Databases: Definition, Advantages, Problems, and Solutions, Slides of Introduction to Database Management Systems

An overview of distributed databases, including their definition, advantages, and problems. It covers topics such as data distribution, client-server systems, network transfer rates, distributed query processing, data replication, concurrency, and transactions. It also discusses distributed database rules and features, as well as creating and managing a distributed database.

Typology: Slides

2012/2013

Uploaded on 04/27/2013

prakash
prakash 🇮🇳

4.6

(10)

63 documents

1 / 33

Toggle sidebar

This page cannot be seen from the preview

Don't miss anything!

bg1
Database Management Systems
1
Docsity.com
pf3
pf4
pf5
pf8
pf9
pfa
pfd
pfe
pff
pf12
pf13
pf14
pf15
pf16
pf17
pf18
pf19
pf1a
pf1b
pf1c
pf1d
pf1e
pf1f
pf20
pf21

Partial preview of the text

Download Distributed Databases: Definition, Advantages, Problems, and Solutions and more Slides Introduction to Database Management Systems in PDF only on Docsity!

Database Management Systems

1

Objectives

• Why do you need a distributed database?

• What are distributed databases?

• How is data distributed with client/server

systems?

• Can a Web approach solve the data distribution

issues?

• How much data can you send to a client form?

• How do you transfer data across diverse systems?

• How will Sally’s employees access the database?

2

Distributed Database Definition

  • Multiple independent databases
    • Each DBMS is a complete DBMS (engine, queries, locking, transactions, etc.)
    • Usually on different machines.
    • Usually in different locations.
  • Connected by a network.
  • Might be different environments
    • Hardware
    • Operating System
    • DBMS Software

4

Database Zeus

Database Apollo

Database Athena

United States

England

France

Distributed Database Rules

  • C.J. Date
  • Rule 0: Transparency: the user

should not know or care that the

database is distributed.

  • Local autonomy.
  • No reliance on a central site.
  • Continuous operation.
  • Location independence.
  • Fragmentation independence (physical storage).
  • Replication independence.
    • Distributed query processing.
    • Distributed transaction management.
    • Hardware independence.
    • Operating system independence.
    • Network independence.
    • DBMS independence.

5

Advantages and Applications

  • Business operations are often

distributed

  • Work and data are segmented by department.
  • Work and data are segmented by geographical location.
  • Improved performance
  • Most updates and queries are performed locally.
  • Maintain local control and responsibility over data.
  • Can still combine data across

the system.

  • Scalability and expansion
    • Add on, not replacement.

7

local transactions

future expansion

Creating a Distributed Database

• Design administration plan.

• Choose hardware and DBMS vendor, and

network.

• Set up network and DBMS connections.

• Choose locations for data.

• Choose replication strategy.

• Create backup plan and strategy.

• Create local views and synonyms.

• Perform stress test: loads and failures.

8

Distributed Query Processing• Example

  • NY: Customers: 1 M rows
  • LA: Production: 10 M rows
  • Chicago: Sales: 20 M rows
  • Query: List customers who bought blue products on March 1
  • Bad idea #
    • Transfer all rows to Chicago
    • Then JOIN and select.
  • Better idea #2 (probably)
    • Transfer blue products from LA to Chicago
  • Better idea #
    • Get sale items on March 1
    • Get blue products from LA
    • Send C# to NY

10

Customers(C#, …) 1,000,

NY

Products(P#, Color…) 10,000,

Sales(S#, C#, Sdate) 20,000, SaleItem(S#, P#,…) 50,000,

Chicago

LA

P# sold on March 1 Blue P# sold on March 1

C# list from desired P# Matching Customer data

Data Replication

  • Goals
    • Minimize transmissions
    • Improve performance
    • Support heavy multiuser access.
  • Problems
    • Updating copies
      • Bulk transmissions
      • Site unavailable
    • Concurrency
      • Easier for two people to change the same data at the same time.
  • Decision support systems.
  • Data warehouse.

11

Britain: Customers & Sales France: Customers & Sales

Spain: Customers & Sales

Britain

Britain: Customers & Sales France: Customers & Sales

Spain: Customers & Sales

Spain

Update data.

Market research & data corrections.

Periodic updates

Transactions & Two-Phase Commit

  • Two (or more) separate lock

managers.

  • DBMS initiating update serves

as the coordinator.

  • Two phases
    • Coordinator sends message and data to all machines to “get ready.”
    • Local machines save data in logs, verify update status and return message.
    • If all locals report OK, then coordinator writes log and instructs others to proceed. If any fail, it sends Rollback message.

13

Database 1 Initiate Transaction

Database 2 Database 3

  1. Prepare to commit. All agree? 2. Commit

Lock tables. Save log. Update all tables.

Distributed Transaction Managers

14

Transaction Processing Monitor

Transaction Manager

Resource Manager

Transaction Manager

Resource Manager

Transaction Manager

Resource Manager

DBMS DBMS

The distributed transaction^ DBMS coordinator/transaction processing monitor handles the transaction decisions and coordinates across the participating systems.

Client-Server

17

Server

Server

Clients Clients

Shared

Database

Front-end

User Interface

LAN File Server

  • Not a distributed database.
    • Data file stored on server.
    • Server is passive, appears as giant disk drive to PC.
    • PC processes all data.
    • Retrieves all needed data across the network.
  • Performance improvements.
    • Indexes are crucial.
    • Store some data on each PC (replication).
    • Store applications on PC (graphics & forms).
    • Convert to SQL-Server

18

File Server

DBMS data file

Application SharedData

SELECT Name, SaleDate FROM Customer INNER JOIN Sales ON Customer.C# = Sales.C# WHERE SaleDate BETWEEN #1-Mar-97# AND #9-Mar-97#;

All data from all tables are read by PC, which performs JOIN and WHERE test. If available, reads index first.

Client-Server Databases

  • One machine machine is

dominant (server) and handles

data for many clients.

  • Client machines handle front-end

tasks and small data tables that

are not shared.

20

File Server

DBMS

SQL Server

Shared Data

application

Send SQL statement.

Return matching data.

Three-Tier Client-Server

  • Server Databases
  • Client front-end
  • Middle
    • Locate databases
    • Business rules
    • Program code

22

Client

Middleware

Database

Servers

Application.

Front-end.

User Interface.

Databases.

Transactions.

Legacy applications.

Database links.

Business rules.

Program code.