Conjunctive Queries
a query (mapping) is from (or over) its input schema and to its output schema.
- query: a syntactic object
- query mapping: a function defined by a query interpreted under a specified semantics
- its domain: the family of all instances of an input schema
- its range: the family of instances of an output schema
- we often blur query and query mapping when the meaning is clear from context
- input schema: a specified relation or database schema
- output schema: a relation schema or database schema
- For a relation schema, the relation name may be specified as part of the query syntax or by the context
- denotes two queries and over are equivalent, i.e., they have the same output schema and for each instance over .
Logic-Based Perspective
Three versions of conjunctive queries:
- Rule-Based Conjunctive Queries
- Tableau Queries
- Conjunctive Calculus
Rule-Based Conjunctive Queries
Definition
A rule-based conjunctive query (or often more simply called rules) over a relation schema is an expression of the form where , are relation names in ; is a relation name not in ; and are free tuples (i.e., may use either variables or constants).
- : the set of variables occurring in .
- body:
- head:
- range restricted: each variable occurring in the head also occurs in the body
- all conjunctive queries considered here are range restricted
- valuation: a valuation over , a finite subset of , is a total function from to (of constants)
- extended to be identity on (so that the domain can contain both variables and constants)
- extended to map free tuples to tuples (so that the domain/range can be in the tuple form)
Semantics
Let be the query, and let be a database instance of schema . The image of under is:
- active domain
- of a database instance , denoted , is the set of constants that occur in
- of a relation instance , denoted , is the set of constants that occur in
- of a query , denoted , is the set of constants that occur in
- is an abbreviation for
- extensional relations: relations in the body of the query, i.e.,
- because they are known/provided by the input instance
- intensional relation: the relation in the head of the query, i.e.,
- because it is not stored and its value is computed on request by the query
- extensional database (edb): a database instance associated with the extensional relations
- intensional database (idb): the rule itself
- idb relation: the relation defined by the idb
Properties
Conjunctive queries are:
- monotonic: a query over is monotonic if for each over ,
- satisfiable: a query is satisfiable if there exists a database instance such that
Tableau Queries
A tableau query is simply a pair where is a tableau and each variable in also occurs in .
This is closest to the visual form provided by Query-By-Example (QBE).
- summary: the free tuple representing the tuples included in the answer to the query
- embedding: a valuation for the variables occurring in such that
- the output of on consists of all tuples for each embedding of into
- typed: a tableau query under the named perspective, where is over relation schema and , is typed if no variable of is associated with two distinct attributes in
Examples
Table Movies
| Title | Director | Actor |
|---|---|---|
| “Bergman” |
Table Location
| Theater | Address | Phone |
|---|---|---|
Table Pariscope
| Theater | Title | Schedule |
|---|---|---|
The above tableau query is typed because each variable is associated with only one attribute:
- :
- :
- :
- :
- :
- :
However, the following tableau query is untyped:
Table Movies
| Title | Director | Actor |
|---|---|---|
Because is associated with both and .
Conjunctive Calculus
The conjunctive query
can be expressed as the following conjunctive calculus query that has the same semantics:
where are all the variables occurring in the body and not the head.
Conjunctive Calculus Formula
Let be a relation schema. A (well-formed) formula over for the conjunctive calculus is an expression having one of the following forms:
- an atom over ;
- , where and are formulas over ; or
- , where is a variable and is a formula over .
An occurrence of a variable in formula is free if:
- is an atom; or
- and the occurrence of is free in or ; or
- , , and the occurrence of is free in .
: the set of free variables in .
An occurrence of a variable that is not free is bound.
Conjunctive Calculus Query
A conjunctive calculus query over database schema is an expression of the form
where
- is a conjunctive calculus formula,
- is a free tuple, and
- the set of variables occurring in is exactly .
For named perspective, the above query can be written as:
Note: In the previous chapter, a named tuple was usually denoted as:
But here we denote the named tuple as:
Not sure which one is better or whether the author did this intentionally.
Semantics
Valuation
A valuation over is a total function from to , which can be viewed as a syntactic expression of the form:
where
- is a listing of ,
- for each .
Interpretation as a Set
If , and , then is the valuation over that agrees with on and maps to .
Satisfaction
Let be a database schema, a conjunctive calculus formula over , and a valuation over . Then , if
- is an atom and ; or
- and and ; or
- and there exists a constant such that .
Image
Let be a conjunctive calculus query over . For an instance over , the image of under is:
- active domain
- of a formula , denoted , is the set of constants that occur in
- is an abbreviation for
- If , then the range of is contained in
- to evaluate a conjunctive calculus query, one need only consider valuations with range constrained in , i.e., only a finite number of them
- equivalent: conjunctive calculus formulas and over are equivalent,
if
- they have the same free variables and,
- for each over and valuation over , if and only if
Normal Form
Each conjunctive calculus query is equivalent to a conjunctive calculus query in normal form.
Expressiveness of Query Languages
Let and be two query languages.
- : is dominated by (or is weaker than ), if for each query , there exists a query such that .
- : and are equivalent, if and .
The rule-based conjunctive queries, tableau queries, and conjunctive calculus queries are all equivalent.
Incorporating Equality
The conjunctive query
can be expressed as:
Problem 1: Infinite Answers
Unrestricted rules with equality may yield infinite answers:
A rule-based conjunctive query with equality is a range-restricted rule-based conjunctive query with equality.
Problem 2: Unsatisfiable Queries
Consider the following query:
where is a unary relation and with .
Each satisfiable rule with equality is equivalent to a rule without equality.
No expressive power is gained if the query is satisfiable.
Query Composition and Views
a conjunctive query program (with or without equality) is a sequence of rules having the form:
where
- each is a distinct relation name not in ,
- for each , the only relation names that may occur in are in .
Closure under Composition
If conjunctive query program defines final relation , then there is a conjunctive query , possibly with equality, such that on all input instances , .
If is satisfiable, then there is a without equality.
Composition and User Views
views are specified as queries (or query programs), which may be
- materialized: a physical copy of the view is stored and maintained
- virtual: relevant information about the view is computed as needed
- queries against the virtual view generate composed queries against the underlying database
Algebraic Perspectives
The Unnamed Perspective: The SPC Algebra
Unnamed conjunctive algebra:
- Selection ()
- Projection ()
- Cross-Product (or Cartesian Product, )
Selection (“Horizontal” Operator)
Primitive Forms: and
where are positive integers, and
(constants are usually surrounded by quotes).
is sometimes called atomic selection.
Projection (“Vertical” Operator)
General Form:
where and are positive integers
(empty sequence is written ).
Cross-Product (or Cartesian Product)
Formal Inductive Definition
Let be a relation schema.
There are two kinds of base SPC (algebra) queries:
- Input Relation: Expression ; with arity equal to
- Unary Singleton Constant: Expression , where ; with arity equal to 1
The family of SPC (algebra) queries contains all above base SPC queries and, for SPC queries with arities , respectively,
- Selection: and , whenever and ; these have arity .
- Projection: , where ; this has arity .
- Cross-Product: ; this has arity .
Unsatisifiable Queries
E.g. where and .
This is equivalent to .
Generalized SPC Algebra
- Intersection (): is easily simulated by selection and cross-product.
- Generalized Selection (): permits the specification of multiple conditions.
- Positive Conjunctive Selection Formula: where each is either or
- Positive Conjunctive Selection Operator: an expression of the form , where is a positive conjunctive selection formula
- can be simulated by a sequence of selections
- Equi-Join (): a binary operator that combines cross-product and selection
- Equi-Join Condition: where each is of the form
- Equi-Join Operator: an expression of the form , where is an equi-join condition
- can be simulated by , where is obtained from by replacing each condition with
Normal Form
where
- ;
- ;
- ;
- are relation names (repeats permitted);
- is a positive conjunctive selection formula.
For each (generalized) SPC query, there is an equivalent SPC query in normal form.
The proof is based on repeated application of the following equivalence-preserving SPC algebra rewrite rules (or transformations):
- Merge-Select:
- Merge-Project:
- Push-Select-Though-Project:
- Push-Select-Though-Singleton:
- Associate-Cross:
- Commute-Cross: where and
- Push-Cross-Though-Select: where is obtained from by replacing each with
- Push-Cross-Though-Project: where is obtained from by replacing each with
Notation for Rewriting
- for a set of rewrite rules and algebra expressions ,
- if is understood from the context,
- if is the result of replacing a subexpression of according to one of the rules in .
: the reflexive and transitive closure of .
A family of rewrite rules is sound if implies . If is sound, then implies .
The Named Perspective: The SPJR Algebra
Named conjunctive algebra:
- Selection ()
- Projection (); repeats not permitted
- Join (); natural join
- Renaming ()
Selection
where
These operators apply to any instance with .
Projection
where and
(Natural) Join
where and have sorts and , respectively.
- When , .
- When , .
Renaming
An attribute renaming for a finite set of attributes is a one-one mapping from to .
where
- input is an instance over
- is an attribute renaming for
- which can be described by , where
- usually written as to indicate that for each
- which can be described by , where
Formal Inductive Definition
The base SPJR algebra queries are:
- Input Relation: Expression ; with sort equal to
- Unary Singleton Constant: Expression , where ; with
The remainder of the syntax and semantics of the SPJR algebra is defined in analogy to those of the SPC algebra.
Normal Form
where
- ;
- ;
- occurs in ;
- the s are distinct;
- are relation names (repeats permitted);
- is a renaming operator for ;
- no occur in any ;
- the sorts of are pairwise disjoint;
- is a positive conjunctive selection formula.
For each SPJR query, there is an equivalent SPJR query in normal form.
Equivalence Theorem
- Lemma: The SPC and SPJR algebras are equivalent.
- Equivalence Theorem: The rule-based conjunctive queries, tableau queries, conjunctive calculus queries, satisifiable SPC algebra, and satisifiable SPJR algebra are all equivalent.
Adding Union
The following have equivalent expressive power:
- the nonrecursive datalog programs (with single relation target)
- the SPCU queries
- the SPJRU queries
- union can only be applied to expressions having the same sort
Nonrecursive Datalog Program
A nonrecursive datalog program (nr-datalog program) over schema is a set of rules:
where
- no relation name in occurs in a rule head
- the same relation name may appear in more than one rule head
- there is some ordering of the rules such that the relation name in the head of does not occur in the body of any rule with
Union and the Conjunctive Calculus
Simply permitting disjunction (denoted ) in the formula of a conjunctive calculus along with conjunction can have serious consequences.
E.g.,
The answer of on nonempty instance will be
Because decides that the first two components in are in and can be anything in , and decides that the last two components in are in and can be anything in .