Content

  1. Overview
  2. Memory Hierarchy
  3. Reliability & Efficiency
    1. Disk Failure
    2. Mirroring
    3. Data Striping
    4. Parity Check
    5. RAID
  4. File Organization
    1. Records
      1. Fixed-Length Records
      2. Variable-Length Records
    2. Organizing Records in Files
      1. Heap File
      2. Sequential File
      3. Hashing
      4. Multitable Clustering
  5. Buffer Manager

Overview

  • Cache
    • Query processing data structure/algorithm design
    • KB ~ MB
  • Memory
    • Too small for the entire enterprise database
    • 2 GB ~ 8 GB
    • The buffer for disk so that DBMS can operate on
  • Magnetic disk
    • 2 surfaces per platter; track 1 on surface 1, track 2 on surface 2, etc.
    • 50000 ~ 100000 track on a platter
    • 500 ~ 1000 sectors per inner track
    • 1000 ~ 2000 sectors per outer track
    • Sector size typically 512 B
    • Steps
      1. Seek track
      2. Rotate disk to place data under head
      3. Transfer data
    • Access time: request ~ start of data transfer
      • Seek + rotational delay
    • Data transfer rate: rate retrieving/storing data from/to disks
      • Disk controller may be shared across multiple disks; its processing speed is crucial
    • Data Block: data transfer unit between memory & disk
      • 4 KB ~ 16 KB (multiple sectors)
      • I/O operation: reading/writing of data blocks
        • Seek time + rotational delay + transfer time
  • Magnetic tape
    • Backup storage
    • Only sequential accesses (no seeks)
    • ~5 TB can be achieved
  • Optical storage
    • Slower than disk
    • CD-ROM (640 MB), DVD (4.7~17 GB)
    • Usually write-once read-many (WORM)
  • Flash memory
    • Reads roughly as fast as memory; writes slow
    • Each location write-once, or erase the entire bank
    • Limit of number of writes at each location (100000~1000000)

Memory Hierarchy

  • Data put on disk
  • Data manipulated on memory
  • Data backup on tertiary storage

Reliability & Efficiency

Disk Failure

  • Mean time to failure (MTTF): average time the disk is expected to run continuously without failure
  • Mean time to data loss: depends on MTTF & disk organization
    • Shortened if multiple disks are present with the same MTTF; MTTF/number_of_disks

Mirroring

Storing redundant copies in other disks.

  • Reliability
    • High yet expensive
  • Efficiency
    • Read request handling rate doubled
    • Speed of processing each read request remains the same

Data Striping

Partition data into blocks and store in different disks.

  • Reliability
    • No improvement
  • Efficiency
    • High data transfer rate by parallel read

Parity Check

Extra parity disk storing the XOR results of all disks.

  • Reliability
    • Tolerates failure of 1 disk
  • Efficiency
    • No improvement

RAID

Redundant Arrays of Independent Disks, different combinations of mirroring, striping, and error-correction strategies.

  • RAID 0: striping only
  • RAID 1/ RAID 10/ RAID 1+0: striping + mirroring
  • Others: RAID 2, 3, 4, 5, 6
RAID 5

All disks share the same workload of read requests.

File Organization

Large-scale DB systems don't usually rely on the underlying OS for file management; one large OS file is allocated to the DB system instead, and all relations are stored in this file and are managed by the DB system.

  • Blocks: storage/data transfer unit
  • Records: several records in a block

Records

Fixed-Length Records

  • Disallow records to cross block borders to avoid doubling the I/O requests

Variable-Length Records
  • Different types of record may appear in the same block
  • Variable lengths are allowed for the same record types
  • Slotted-page structure

Organizing Records in Files

Heap File

Stores records in any block with empty space.

  • Adv.
    • Simplicity
  • Disadv.
    • Blocks in a file scattered over the disk
    • Need to scan all data to locate a record
Sequential File

Records in sequential order of the search key.

  • Adv.
    • Faster processing for search keys in sorted order
    • e.g. SELECT * FROM T WHERE ID < xxx;
  • Disadv.
    • Hard to maintain
  • Insertion
    1. Locate the record just before
    2. If free slot in the same block, insert there
    3. Otherwise, insert into overflow block
Hashing

Hash values computed based on some attributes of each record.

Multitable Clustering

Put >= 2 related relations in the same file to achieve faster joins.

  • Adv.
    • Faster processing for joins and finding related records in relations
  • Disadv.
    • Suffers data on only 1 relation
    • Hard to deal with variable-length records

Buffer Manager

  • Buffer: portion of memory storing copies of disk blocks
    • Buffer miss
    • Buffer hit
    • Buffer write
      • Data written to buffer will be updated to disk when buffer is flushed or upon other reliability requirements
    • Buffer eviction
  • Buffer replacement policy
    • LRU, LFU, ...
  • Buffer manager: program responsible for buffer allocation in memory & moving blocks between memory and disk to minimize I/O
  • Data dictionary/system catalog: stores metadata; frequently accessed by buffer manager & query optimizer, hence stay in memory
    • Info about relations e.g. names/types of relations/attributes, integrity constraints, views
    • Statistical data e.g. number of tuples
    • Physical file organization

results matching ""

    No results matching ""