Database Management System - Relational Design Theory

CS564 - DBMS - Relational Design Theory

SQL

Relational algebra and Calculus

Keys

  • A superkey is a set of one or more attributes that, taken collectively, allows us to uniquely identify a tuple in the relation
  • A superkey for which no proper subset is also a superkey is minimal; such minimal superkeys are called candidate keys.
  • We shall use the term primary key to denote a candidate key that is chosen by the database designer as the principal means of identifying tuples within a relation.
  • Foreign Key is a set of attributes in a referencing relation, such that for each tuple in the referencing relation, the values of the foreign key attributes are guaranteed to occur as the primary key value of a tuple in the referenced relation.
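As a sketch of how these key constraints are declared in practice, here is a minimal SQLite session (the table and column names are hypothetical, loosely modeled on the university schema used later in these notes):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite checks FKs only when enabled

# dept_name is the primary key of department; instructor.dept_name is a
# foreign key referencing it.
conn.execute("CREATE TABLE department (dept_name TEXT PRIMARY KEY, budget NUMERIC)")
conn.execute("""CREATE TABLE instructor (
    id        INTEGER PRIMARY KEY,
    name      TEXT,
    dept_name TEXT REFERENCES department(dept_name))""")

conn.execute("INSERT INTO department VALUES ('Physics', 100000)")
conn.execute("INSERT INTO instructor VALUES (1, 'Einstein', 'Physics')")

try:
    # Rejected: 'Math' does not occur as a primary-key value in department.
    conn.execute("INSERT INTO instructor VALUES (2, 'Gauss', 'Math')")
except sqlite3.IntegrityError as e:
    print("rejected:", e)
```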

Join

  • The natural join combines two tables by matching tuples on their identically named columns
  • Cartesian product operation combines tuples from two relations, but unlike the join operation, its result contains all pairs of tuples from the two relations
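The contrast between the two operations can be seen on a toy pair of relations (hypothetical data; SQLite supports both operators):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE r (a INTEGER, b INTEGER);
CREATE TABLE s (b INTEGER, c INTEGER);
INSERT INTO r VALUES (1, 10), (2, 20);
INSERT INTO s VALUES (10, 100), (30, 300);
""")

# Cartesian product: all pairs of tuples, 2 x 2 = 4 rows.
cross = conn.execute("SELECT * FROM r CROSS JOIN s").fetchall()
# Natural join: only the pairs that agree on the shared column b.
nat = conn.execute("SELECT * FROM r NATURAL JOIN s").fetchall()

print(len(cross))  # 4
print(nat)         # one row: the pair agreeing on b
```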

SQL Language

  • SQL allows us to use the keyword all to specify explicitly that duplicates should not be removed (in contrast to distinct, which removes them)
  • varchar(n): A variable-length character string with user-specified maximum length n. The full form, character varying, is equivalent.
  • numeric(p, d): A fixed-point number with user-specified precision. The number consists of p digits (plus a sign), and d of the p digits are to the right of the decimal point.
  • Percent (%): The % character matches any substring.
  • Underscore (_): The _ character matches any single character.
  • The set operations (union, intersect, except) automatically eliminate duplicates
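A quick illustration of the pattern-matching and duplicate-handling rules above (toy data):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE t (name TEXT);
INSERT INTO t VALUES ('Smith'), ('Smyth'), ('Smithson');
""")

# '_' matches exactly one character, '%' matches any substring.
five = conn.execute("SELECT name FROM t WHERE name LIKE 'Sm_th'").fetchall()
pre  = conn.execute("SELECT name FROM t WHERE name LIKE 'Smith%'").fetchall()

# UNION removes duplicates; UNION ALL keeps them.
u  = conn.execute("SELECT name FROM t UNION SELECT name FROM t").fetchall()
ua = conn.execute("SELECT name FROM t UNION ALL SELECT name FROM t").fetchall()

print(len(five), len(pre), len(u), len(ua))  # 2 2 3 6
```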

Database Design (ER Model)

Terms

  • Entity
  • Attribute
    • Domain
    • Key
    • Primary Key cannot be null
  • Relationship
    • Descriptive attribute: an attribute of the relationship itself, rather than of either participating entity
    • A relationship is uniquely identified by its participating entities
    • An instance of a relationship set is a set of relationships
    • Sometimes a relationship can involve two entities of the same entity set (a recursive relationship)
  • Constraints
    • Participation constraint, total, partial
    • weak entity set
    • ISA inheritance relationship set
      • specialization, generalization
      • overlap, covering constraint

The Relational Model

  • Integrity Constraint
    • Foreign key constraint

Relational Design Theory

Functional Dependency

  • Consider the following university database schema:

    Replacing the tables instructor and department with the result of their natural join is not a good choice,

    because the department's budget information is duplicated for every instructor in that department, so we must keep every copy of the budget in sync on any update. Another drawback is that a natural inner join drops tuples with no match on the other side, so a department with no instructors cannot be represented in the table at all.
    But if we had started with the inst_dept table, how would we know that the design is bad and should be decomposed into instructor and department?
    We would need to discover that each department must have exactly one building and each department must have exactly one budget.
    Not every ER design can be precise enough to avoid such problems,so we need to allow designers to specify rules such as “each specific value of dept_name corresponds to at most one budget” even in cases where dept_name is not the primary key for the schema.
    In this case, we need to write a rule: "if there were a schema (dept_name, budget), then dept_name would be able to serve as its primary key." This rule is specified as a functional dependency:

    dept_name -> budget

    This gives sufficient information to recognize the problem with the inst_dept schema. A functional dependency thus specifies which relation instances are allowed under certain constraints.
    Functional dependency generalizes the notion of a key in a relation: it describes relationships between attributes.

  • Definition: A relation instance satisfies the functional dependency A->B if any two tuples having the same value of attribute (set) A also have the same value of attribute (set) B, i.e. A uniquely determines B
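This definition translates directly into a check over a relation instance; a minimal sketch with hypothetical rows:

```python
# An instance satisfies A -> B iff no two tuples agree on A but differ on B.
# A relation is represented here as a list of dicts (hypothetical data).
def satisfies_fd(rows, lhs, rhs):
    seen = {}
    for row in rows:
        a = tuple(row[x] for x in lhs)
        b = tuple(row[x] for x in rhs)
        if a in seen and seen[a] != b:
            return False  # same A-value, different B-value: violation
        seen[a] = b
    return True

inst_dept = [
    {"id": 1, "dept_name": "Physics", "budget": 100000},
    {"id": 2, "dept_name": "Physics", "budget": 100000},
    {"id": 3, "dept_name": "Music",   "budget": 80000},
]
print(satisfies_fd(inst_dept, ["dept_name"], ["budget"]))  # True
print(satisfies_fd(inst_dept, ["budget"], ["id"]))         # False
```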

  • A trivial functional dependency is the one which will always hold in a relation. A->B and B is a subset of A.

  • Non-Trivial functional dependency may or may not hold in a relation. A->B and B is NOT a subset of A.

  • Properties of functional dependency
    "Armstrong's Axioms":
    • Reflexivity: if B is a subset of A, then A -> B
    • Augmentation: if A -> B, then AC -> BC
    • Transitivity: if A -> B and B -> C, then A -> C

    Other derived rules:
    • Union: if A -> B and A -> C, then A -> BC
    • Decomposition: if A -> BC, then A -> B and A -> C
    • Pseudotransitivity: if A -> B and CB -> D, then CA -> D

  • A set of functional dependencies F is a set of FD constraints on the legal instances of a relation.

  • The closure F+ of an FD set F is the set of all FDs that can be inferred from F (note that this includes all FDs in F itself).

  • Attribute Closure: Attribute closure of an attribute set A is a set of all attributes that can be functionally determined from elements of set A (Note that this includes attributes in A itself).
    Examples:
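A sketch of the standard fixpoint algorithm, with a made-up FD set (FDs are (lhs, rhs) pairs of frozensets):

```python
# Attribute closure: repeatedly fire every FD whose left side is already
# contained in the closure, until nothing changes.
def attr_closure(attrs, fds):
    closure = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= closure and not rhs <= closure:
                closure |= rhs
                changed = True
    return closure

F = [(frozenset("A"), frozenset("B")),
     (frozenset("B"), frozenset("C")),
     (frozenset("CD"), frozenset("E"))]

print(sorted(attr_closure({"A"}, F)))       # ['A', 'B', 'C']
print(sorted(attr_closure({"A", "D"}, F)))  # ['A', 'B', 'C', 'D', 'E']

# FD membership: A -> C is in F+ because C is in the closure of {A}.
print("C" in attr_closure({"A"}, F))        # True
```

The same helper answers FD-membership questions (A -> B is in F+ iff B is a subset of A+), which is also how equivalence of two FD sets is checked.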

  • Determining equivalence of functional dependencies
    Two FD sets F and G are equivalent iff each is a subset of the other's closure, i.e. F is a subset of G+ and G is a subset of F+.

  • Computing minimal cover
    A minimal cover of an FD set F is a smallest equivalent set of FDs: every right side is a single attribute, no left side contains an extraneous attribute, and no FD is redundant.
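One way to compute it, sketched as the usual three-step algorithm (singleton right sides, drop extraneous left-side attributes, drop redundant FDs); the FD set is a made-up example:

```python
def closure(attrs, fds):
    c, changed = set(attrs), True
    while changed:
        changed = False
        for l, r in fds:
            if l <= c and not r <= c:
                c |= r
                changed = True
    return c

def minimal_cover(fds):
    # Step 1: canonical form A -> b with a single attribute on the right.
    fds = [(frozenset(l), frozenset([b])) for l, r in fds for b in r]
    # Step 2: remove extraneous attributes from left sides
    # (a is extraneous in l iff r is still derivable from l - {a}).
    out = []
    for l, r in fds:
        for a in sorted(l):
            smaller = l - {a}
            if smaller and r <= closure(smaller, fds):
                l = smaller
        out.append((l, r))
    # Step 3: remove FDs derivable from the remaining ones.
    result = []
    for i, (l, r) in enumerate(out):
        rest = result + out[i + 1:]
        if not r <= closure(l, rest):
            result.append((l, r))
    return result

F = [(frozenset("A"), frozenset("BC")),
     (frozenset("B"), frozenset("C")),
     (frozenset("AB"), frozenset("C"))]
print(minimal_cover(F))  # a minimal cover: A -> B and B -> C
```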

  • Other notes
    Every non-prime attribute is fully functionally dependent on every candidate key.
    Every prime attribute is fully functionally dependent on every candidate key that does not contain it.
    No attribute is fully functionally dependent on any attribute set that is not a candidate key.

Data Anomalies

The student-course relation below has the functional dependency Sno, Cname -> Sname, Sdept, Mname, Grade, and its key is {Sno, Cname}. That is, once the student and the course are fixed, all the remaining information is determined.

Sno Sname Sdept Mname Cname Grade
1 Student-1 Dept-1 Dean-1 Course-1 90
2 Student-2 Dept-2 Dean-2 Course-2 80
2 Student-2 Dept-2 Dean-2 Course-1 100
3 Student-3 Dept-2 Dean-2 Course-2 95

A relation that does not satisfy the normal forms suffers from many anomalies, chiefly the following four:

  • Redundant data: e.g. Student-2 appears twice.
  • Update anomaly: information is modified in one record, but the same information in another record is left unchanged.
  • Deletion anomaly: deleting one piece of information also loses other information. E.g. deleting Course-1 requires deleting the first and third rows, and all information about Student-1 is then lost.
  • Insertion anomaly: e.g. we cannot insert a student's information if that student has not yet enrolled in any course.

Normal Forms

Normal-form theory exists to resolve the four anomalies listed above.
Each higher-level normal form builds on the lower-level ones; 1NF is the lowest-level normal form.

Decomposition

Universal Relation: A relation that captures all the information in schema and is decomposable into smaller relations.

  • Decomposition of a relation is done when a relation in the relational model is not in an appropriate normal form. A good, detailed ER model should yield schemas that are directly in 3NF or BCNF. The functional dependencies guide us in determining entities and their relationships

  • Ideal decomposition should be lossless and dependency preserving.
    For relation R decomposed into R1 and R2:

  • Example:


    The employee table is decomposed into employee1 and employee2: because employees can share a name, the natural join of the decomposition results cannot reconstruct the original employee table.

  • Lossless decomposition (lossless-join decomposition) requirements
    • Attribute preservation: R1 ∪ R2 = R
    • R1 ∩ R2 -> R1 or R1 ∩ R2 -> R2 must be in F+, i.e. the common attributes must form a superkey of R1 or of R2

  • Dependency preserving decomposition
    If we decompose a relation R into relations R1 and R2, all dependencies of R must either be a part of R1 or R2, or be derivable from the combination of the FDs of R1 and R2.

    • Example:
    1. R(A, B, C, D) with FD set {A->BC} is decomposed into R1(ABC) and R2(AD); this is dependency preserving because the FD A->BC is a part of R1(ABC).
    2. R(A, B, C, D) with functional dependencies A->B and C->D. The decomposition of R into R1(AB) and R2(CD) is dependency preserving, since A->B can be enforced in R1(AB) and C->D in R2(CD); but it is not a lossless join, because R1 ∩ R2 is empty and thus violates the second lossless-join condition.
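The binary lossless-join test (R1 ∩ R2 must determine R1 or R2) can be checked with the attribute-closure routine; both examples above are reproduced:

```python
def closure(attrs, fds):
    c, changed = set(attrs), True
    while changed:
        changed = False
        for l, r in fds:
            if l <= c and not r <= c:
                c |= r
                changed = True
    return c

# A binary decomposition R -> (r1, r2) is lossless iff the closure of the
# common attributes covers all of r1 or all of r2.
def lossless(r1, r2, fds):
    common = closure(r1 & r2, fds)
    return r1 <= common or r2 <= common

F = [(frozenset("A"), frozenset("BC"))]
print(lossless(set("ABC"), set("AD"), F))  # True: A -> ABC covers R1

G = [(frozenset("A"), frozenset("B")), (frozenset("C"), frozenset("D"))]
print(lossless(set("AB"), set("CD"), G))   # False: empty intersection
```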

1. First Normal Form (1NF): Atomicity

The attributes of a table must be atomic (a table attribute may not be a composite attribute). Note, however, that composite attributes are sometimes useful and are heavily used in practice in object-oriented databases.

2. Boyce–Codd Normal Form (BCNF): the left side of every nontrivial FD must be a superkey

BCNF is a slightly stronger version of 3NF.
Every schema in BCNF also satisfies 3NF. BCNF eliminates all redundancy that can be discovered based on functional dependencies.
A database design is in BCNF if each member of the set of relation schemas that constitutes the design is in BCNF.

Definition:
A relational schema R is in BCNF with respect to a set of functional dependencies F if for all functional dependencies in F+ of form A->B where A,B are subsets of R, at least one of the following holds:

  • A->B is a trivial functional dependency (B is a subset of A)
  • A is a super key of schema R

Another equivalent definition (important):
For all nontrivial FD A->B, A must be a super key of schema R.
The negated definition
A relational schema R is not in BCNF w.r.t FDs set F if there exists a FD in F+ of form A->B where A,B are subsets of R s.t.
A->B is a nontrivial FD (B is not a subset of A) AND A is not a super key of schema R.
Example:
inst_dept (ID, name, salary, dept_name, building, budget) is not in BCNF because dept_name->budget is a nontrivial FD and dept_name is not a super key of inst_dept.
The schemas instructor and department are in BCNF, because in each of them the left side of every nontrivial FD is a superkey (ID and dept_name, respectively, are superkeys of their schemas).
Decomposing a relational schema to BCNF
For a relational schema R not in BCNF, we must have at least 1 nontrivial FD A->B s.t. A is not a superkey for R.
We decompose R into 2 schemas: (A ∪ B) and (R − (B − A)).

For inst_dept, A = dept_name, B = {building, budget}; it can be decomposed into (dept_name, building, budget) AND (ID, name, salary, dept_name)
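A sketch of that decomposition loop in code (simplified: it checks the given FDs restricted to each fragment rather than computing fully projected FD sets, which is enough for this example):

```python
def closure(attrs, fds):
    c, changed = set(attrs), True
    while changed:
        changed = False
        for l, r in fds:
            if l <= c and not r <= c:
                c |= r
                changed = True
    return c

# While some fragment Ri has a nontrivial FD l -> rhs with l not a superkey
# of Ri, replace Ri by (l ∪ rhs) and (Ri − rhs).
def bcnf_decompose(R, fds):
    result = [set(R)]
    changed = True
    while changed:
        changed = False
        for i, Ri in enumerate(result):
            for l, r in fds:
                rhs = (r & Ri) - l
                if l <= Ri and rhs and not Ri <= closure(l, fds):
                    result[i] = set(l) | rhs   # (A ∪ B)
                    result.append(Ri - rhs)    # (R − (B − A))
                    changed = True
                    break
            if changed:
                break
    return result

inst_dept = {"ID", "name", "salary", "dept_name", "building", "budget"}
F = [(frozenset(["dept_name"]), frozenset(["building", "budget"]))]
print(bcnf_decompose(inst_dept, F))
```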
Problem of BCNF:

  • BCNF is not dependency preserving.

3. Third Normal Form (3NF): 2NF plus elimination of transitive functional dependencies

BCNF requires A to be a superkey for every nontrivial FD A->B. 3NF relaxes this constraint by allowing A not to be a superkey, provided the third condition below holds.
Definition:
A relational schema R is in 3NF with respect to a set of functional dependencies F if for all functional dependencies in F+ of form A->B where A,B are subsets of R, at least one of the following holds:

  • A->B is a trivial functional dependency (B is a subset of A)
  • A is a super key of schema R
  • Each attribute X in B − A is contained in a candidate key of R

Sno Sname Sdept Mname
1 Student-1 Dept-1 Dean-1
2 Student-2 Dept-2 Dean-2
3 Student-3 Dept-2 Dean-2

The following transitive functional dependency exists:

  • Sno -> Sdept -> Mname

It can be decomposed as follows:

Sno Sname Sdept
1 Student-1 Dept-1
2 Student-2 Dept-2
3 Student-3 Dept-2

Sdept Mname
Dept-1 Dean-1
Dept-2 Dean-2

Multi-dimensional database

Think spreadsheets and reporting but generalized.

  • A star schema is the simplest form of a dimensional model, in which data is organized into facts and dimensions. A fact is an event that is counted or measured, such as a sale or login. A dimension contains reference information about the fact, such as date, product, or customer. A star schema is diagramed by surrounding each fact with its associated dimensions. The resulting diagram resembles a star.
  • Example:

Levels of abstraction in DBMS

  • Physical Schema is the way the relations are actually stored in SSD/HDD. It also defines indexes, statistics etc. defined on the table. (Indexes are defined using DDL.)
  • Logical (Conceptual) Schema is the DDL for creating the table. (It can sometimes specify the physical schema.)
  • External Schema is the DDL that defines the external user's view of the database. Though some "base" tables are directly visible, most of the data is protected through views.

Storage and File Structure

Storage Overview

  • Persistent storage (nonvolatile memory)
    • Data in a DBMS has to be persistent
    • SSD: flash memory based, faster and more expensive
    • HDD: magnetic storage, slower and cheaper
    • Tapes: disks are the new tapes, still great for archiving
    • Cloud server
  • Volatile storage
    • DRAM

Memory hierarchy

Disk

  • The disk surface is logically divided into tracks, which are subdivided into sectors.

    • A sector is the smallest unit of information that can be read from or written to the disk.
    • The read–write head stores information on a sector magnetically as reversals of the direction of magnetization of the magnetic material.
    • The platters, fixed on a spindle rod, spin at, say, 90 rps
    • Arm assembly moves in or out to position a head on desired track.
    • The tracks form an imaginary (hollow) cylinder: the tracks with the same radius on all surfaces of the disk platters. #cylinders = #tracks per surface
    • #surfaces = #tracks / #cylinders = 2 × #platters
    • Only one head is used at a time
    • Block size is a multiple of the sector size (fixed, usually 512 bytes)
    • capacity = #heads (#surfaces) × #cylinders × #sectors per track × bytes per sector (typically 512) = #surfaces × #tracks per surface × #sectors per track × bytes per sector
  • Seek time: the time for repositioning the arm to change track.

  • Rotational delay: the time to get to the right sector = 0.5 × (60 s / rpm), where the factor 0.5 gives the average (half a revolution)

  • Transfer time: the actual overhead of transferring the data = moving the head across the block

Seek time + rotational delay is the major contributor to delay.
Formatting: defining polar coordinate system on disk surface.
Wear leveling: storing hot data on less-worn sectors and cold data on more-worn sectors to extend the service life of the disk.
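Plugging hypothetical numbers into the capacity and rotational-delay formulas above:

```python
# Back-of-the-envelope disk numbers (all parameters hypothetical).
platters        = 4
surfaces        = 2 * platters        # both sides of each platter
tracks_per_surf = 100_000             # = number of cylinders
sectors_per_trk = 500
bytes_per_sec   = 512

capacity = surfaces * tracks_per_surf * sectors_per_trk * bytes_per_sec
print(capacity / 1e9, "GB")           # 204.8 GB

rpm = 7200
avg_rotational_delay_ms = 0.5 * (60 / rpm) * 1000  # half a revolution
print(round(avg_rotational_delay_ms, 2), "ms")     # 4.17 ms
```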

  • RAID 0 Data striping
    RAID 0 consists of striping, but no mirroring or parity. Compared to a spanned volume, the capacity of a RAID 0 volume is the same; it is the sum of the capacities of the disks in the set. But because striping distributes the contents of each file among all disks in the set, the failure of any disk causes the entire RAID 0 volume, and all its files, to be lost. A broken spanned volume at least preserves the files on the unfailing disks. The benefit of RAID 0 is that the throughput of read and write operations to any file is multiplied by the number of disks because, unlike spanned volumes, reads and writes are done concurrently; the cost is complete vulnerability to drive failures.

  • RAID 1 Mirroring
    RAID 1 consists of data mirroring, without parity or striping. Data is written identically to two drives, thereby producing a “mirrored set” of drives. Thus, any read request can be serviced by any drive in the set. If a request is broadcast to every drive in the set, it can be serviced by the drive that accesses the data first (depending on its seek time and rotational latency), improving performance. Sustained read throughput, if the controller or software is optimized for it, approaches the sum of throughputs of every drive in the set, just as for RAID 0. Actual read throughput of most RAID 1 implementations is slower than the fastest drive. Write throughput is always slower because every drive must be updated, and the slowest drive limits the write performance. The array continues to operate as long as at least one drive is functioning.

  • RAID 5 Striping with Distributed Parity
    Upon failure of a single drive, subsequent reads can be calculated from the distributed parity such that no data is lost. RAID 5 requires at least three disks.

  • RAID 6 two parity disks
    RAID 6 consists of block-level striping with double distributed parity. Double parity provides fault tolerance for up to two failed drives. This makes larger RAID groups more practical, especially for high-availability systems, as large-capacity drives take longer to restore. RAID 6 requires a minimum of four disks. As with RAID 5, a single drive failure results in reduced performance of the entire array until the failed drive has been replaced. With a RAID 6 array, using drives from multiple sources and manufacturers, it is possible to mitigate most of the problems associated with RAID 5. The larger the drive capacities and the larger the array size, the more important it becomes to choose RAID 6 instead of RAID 5.
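The parity mechanism behind single-failure recovery in RAID 5 is just XOR; a minimal sketch with made-up data blocks:

```python
import functools
import operator

# Parity block = byte-wise XOR of the data blocks. Because XOR is its own
# inverse, any single lost block equals the XOR of all surviving blocks.
def xor_blocks(blocks):
    return bytes(functools.reduce(operator.xor, t) for t in zip(*blocks))

d0 = b"hello---"
d1 = b"world---"
parity = xor_blocks([d0, d1])

# The drive holding d1 fails: reconstruct it from the survivors + parity.
recovered = xor_blocks([d0, parity])
print(recovered == d1)  # True
```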

Database Buffer

  • A major goal of the database system is to minimize the number of block transfers between the disk and memory.
  • The subsystem responsible for the allocation of buffer space is called the buffer manager.

File Organization

A fixed-length record is one where the length of the fields in each record has been set to a certain maximum number of characters. Suppose a field that is going to contain a name is set to be 25 characters long; the field can then only ever contain up to 25 characters. If all the fields in a record have a fixed length like this, the record is said to be a fixed-length record.

The problem with fixed-length records is that each field rarely contains the maximum number of characters allowed, so a lot of space is needlessly set aside and wasted; also, values sometimes cannot be entered because they are too large to fit in the space allowed for a field. The advantage of fixed-length records is that they make file processing much easier, because the start and end of each record are always a fixed number of characters apart. This makes it much easier to locate both individual records and fields.


A variable-length record is one where the length of a field can change to allow data of any size to fit. The advantage of variable-length records is that space is not wasted: only the space needed is ever used. The main problem with variable-length records is that it is much more difficult to locate the start and end of individual records and fields, because they are not separated by a fixed number of characters. To separate variable-length records, each field has a special character to mark where it ends, called an end-of-field marker. When records need to be located, the computer must count through the end-of-field markers to locate individual records and fields.




  • Columnar storage
    A column-oriented DBMS (or columnar database management system) is a database management system (DBMS) that stores data tables by column rather than by row. Practical use of a column store versus a row store differs little in the relational DBMS world. Both columnar and row databases can use traditional database query languages like SQL to load data and perform queries. Both row and columnar databases can become the backbone in a system to serve data for common extract, transform, load (ETL) and data visualization tools. However, by storing data in columns rather than rows, the database can more precisely access the data it needs to answer a query rather than scanning and discarding unwanted data in rows. Query performance is increased for certain workloads.

Storage summary

  • Databases must have persistent storage (SSD, HDD); their performance characteristics affect database design
  • Buffer manager tries to keep the optimal set of data blocks (pages) in memory to minimize I/O
  • must be able to “lock” (pin) a page in memory
  • must be able to write / flush page to disk on demand
  • Rows comprise both fixed and variable length fields
  • Slotted page format is flexible to keep records organized and accessible on a page
  • Single column values could be stored in fixed length records, but are usually compressed via encodings
  • Compression is used at both the column and block level. Heap files are OK, but most databases use B+ trees
  • Columnar storage is another tool in the database tool-kit, soon all DBMS vendors will have it

Indexing and Hashing

B+ tree

  • Properties
    • It can be shown that the number of I/O operations needed in the worst case for an insertion is proportional to log⌈n/2⌉(N), where n is the maximum number of pointers in a node, and N is the number of records in the file being indexed.

    • It contains up to n − 1 search-key values K1 , K2 , . . . , Kn − 1 , and n pointers P1 , P2 , . . . , Pn . The search-key values within a node are kept in sorted order; thus, if i < j, then Ki < Kj

    • Each non-leaf node in the tree has between ⌈n/2⌉ and n children, where n is fixed for a particular tree.

    • A non-leaf node may hold up to n pointers, and must hold at least ⌈n/2⌉ pointers. The number of pointers in a node is called the fanout of the node.

    • Each leaf can hold up to n − 1 values. We allow leaf nodes to contain as few as ⌈(n − 1)/2⌉ values. With n = 4 in our example B+-tree, each leaf must contain at least 2 values, and at most 3 values.

    • Since n = 4 and a leaf with only 1 value falls below ⌈(n − 1)/2⌉ = 2, we must either merge the node with a sibling node, or redistribute the entries between the nodes, to ensure that each node is at least half full.

  • Insertion
    • Split
    • Coalesce
  • Deletion
    • if the occupancy of a node falls below 2n/3, the system attempts to borrow an entry from one of the sibling nodes
    • Borrow left, pick max one
    • Borrow right, pick min one
    • Merge left, shift left, change parent
    • Merge right, shift right, change parent
  • Non-unique search key: If a relation can have more than one record containing the same search key value (that is, two or more records can have the same values for the indexed attributes), the search key is said to be a non-unique search key.
  • Bulk-loading in B+ tree
    • Insertion of a large number of entries at a time into an index is referred to as bulk loading of the index.
    • sort the file on the search key of the index being constructed
    • There is a significant benefit to sorting the entries before inserting them into the B+tree.
    • nodes will never have to be read from disk during bulk load, if the B+ tree was empty to start with. Each leaf node will thus incur only one I/O operation even though many entries may be inserted into the node.
  • bottom-up B+ tree construction
    • it can be constructed faster by building it bottom-up, from the leaf level, instead of using the usual insert procedure.
    • We break up the sorted entries into blocks, keeping as many entries in a block as can fit; the resulting blocks form the leaf level of the B+-tree. The minimum value in each block, along with the pointer to the block, is used to create entries in the next level of the B+-tree, pointing to the leaf blocks

HASHING

Static Hashing

  • We use the term bucket to denote a unit of storage that can store one or more records.
  • Let K denote the set of all search-key values, and let B denote the set of all bucket addresses. A hash function h is a function from K to B.
  • Hash function should be random and uniform
  • The algorithm is similar to a standard data-structure hash table

Dynamic Hashing

BITMAP

  • A bitmap is simply an array of bits. In its simplest form, a bitmap index on the attribute A of relation r consists of one bitmap for each value that A can take. Each bitmap has as many bits as the number of records in the relation. The ith bit of the bitmap for value vj is set to 1 if the record numbered i has the value vj for attribute A. All other bits of the bitmap are set to 0.

  • To recognize deleted records, we can store an existence bitmap, in which bit i is 0 if record i does not exist and 1 otherwise.
    • If some records have been deleted, however, just computing the complement of a bitmap is not sufficient. Bits corresponding to such records would be 0 in the original bitmap, but would become 1 in the complement, although the records don’t exist.
  • Limitation
    • If there are further conditions, the fraction of records satisfying all the conditions is likely to be quite small.
    • If the fraction is large, scanning the entire relation would remain the cheaper alternative.
  • Counting the number of bits that are 1 in a bitmap can be done quickly by a clever technique. We can maintain an array with 256 entries, where the ith entry stores the number of bits that are 1 in the binary representation of i.
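A toy bitmap index plus the 256-entry popcount table just described (all records hypothetical):

```python
# One bitmap per attribute value; bit i is set iff record i has that value.
values = ["m", "f", "f", "m", "f"]  # attribute A, indexed by record number

bitmaps = {}
for i, v in enumerate(values):
    bitmaps[v] = bitmaps.get(v, 0) | (1 << i)

print(bin(bitmaps["f"]))  # 0b10110: records 1, 2, 4

# Complements must be masked by the existence bitmap (here: all exist).
existence = (1 << len(values)) - 1
males = existence & ~bitmaps["f"]

# Precomputed popcount table for 8-bit chunks, as described above.
table = [bin(i).count("1") for i in range(256)]
def popcount(bm):
    n = 0
    while bm:
        n += table[bm & 0xFF]
        bm >>= 8
    return n

print(popcount(bitmaps["f"]))  # 3 records with A = 'f'
print(popcount(males))         # 2 records with A = 'm'
```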

Storage and Indexing Summary

Atomicity problem: a transaction is executed in its entirety or not at all.
Integrated storage structure: the index file and the record file are stored together.
Separate file organization: the opposite — index and records are kept in separate files.

A clustered index keeps the same order as the underlying data.
clustered indexes do not guarantee sequential storage on the disk. Managing exactly where data is placed on the disk is the job of the OS, not the DBMS. But it suggests that items are ordered generally according to the clustering key.

With a non clustered index there is a second list that has pointers to the physical rows. You can have many non clustered indexes, although each new index will increase the time it takes to write new records.

It is generally faster to read from a clustered index if you want to get back all the columns. You do not have to go first to the index and then to the table.

Writing to a table with a clustered index can be slower, if there is a need to rearrange the data

primary index: using primary key for indexing

secondary index: otherwise

Indexing is done at the granularity of blocks; within a block, an offset is used.

A hash index maps a key through a hash function to the hash address of a bucket, with a possible overflow chain for managing collisions.
It is cheaper than a B+ tree if no overflow occurs. Access: O(1 + #overflow buckets).
The defining property of a hash index is therefore fast equality lookup; it cannot support range queries.
Bitmap indexes are suited to static, low-cardinality, repetitive data.
B-tree indexes support both range and equality queries.

A B-tree of order m is an m-way search tree (order m, fanout m, m − 1 info fields) with additional constraints: all leaves are at the same height; the root has at least 2 children; every other node is at least half full (⌈order/2⌉) so as to minimize the height. If the position where a key should be inserted is already full, the median is recursively moved up (in in-order sequence) while the predecessor and successor entries are split into separate nodes, always maintaining the half-full requirement.
A B+ tree is closer to a multilevel index. It builds on the B-tree: the non-leaf nodes form a sparse index that reduces disk page accesses and supports equality search; the keys of the non-leaf nodes are copied down to the leaf level in in-order sequence. The leaf level contains record pointers and is chained into a linked list in in-order sequence, forming a dense & clustered index that also supports range search.
Deletion: merge with the left or right sibling to satisfy the half-full constraint; splits, if necessary, can propagate to the root.
order = #pointer fields = p per node
#key/value fields = p − 1 per node

(p-1)(key_ptr_size + record_ptr_size) + p(block_ptr_size) <= blocksize=512
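Solving this inequality for the largest feasible order p (the field sizes here are hypothetical):

```python
block_size      = 512
key_size        = 9   # bytes per key field
record_ptr_size = 7
block_ptr_size  = 6

# A node of order p holds p - 1 (key, record-pointer) pairs plus p block
# pointers, all of which must fit in one block.
def fits(p):
    return (p - 1) * (key_size + record_ptr_size) + p * block_ptr_size <= block_size

p = 1
while fits(p + 1):
    p += 1
print(p)  # largest order that fits: 24 with these sizes
```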

Static hashing: a linear congruential hash function with a fixed number of hash buckets; overflow chains are used to manage collisions.

Extendible hashing: a nonlinear congruential hash family such as h_k(v) = h(v) mod 2^k; a directory of size 2^k stores pointers to the hash buckets.
When a bucket overflows, the k value is incremented and the entries are remapped.

  • Copyright notice: unless otherwise stated, all articles on this blog are licensed under the Apache License 2.0. Please credit the source when reposting!
  • © 2016-2020 th2zz
