Theory and Construction of
Object-Oriented Query Languages.
Editors of the Polish-Japanese
Institute of Information Technology, Warsaw 2004, ISBN 83-89244-29-2 (522
pages)
Extended abstract
and Content
in English
Polish
Edition:
Kazimierz Subieta. Teoria i konstrukcja
obiektowych języków zapytań. Wydawnictwo PJWSTK, Warszawa 2004, ISBN
83-89244-29-2 (522 stron)
Słowniczek terminów polsko-angielski
Słowniczek terminów angielsko-polski
English translation
in preparation, but with no firm deadlines.
Query languages are
user-friendly interfaces enabling the users to search a database according to
any criteria and freely determined retrieval goals. The most popular query
language is SQL. It is considered by many professionals as an essential factor
of the commercial success of relational databases. SQL is a subject of several
standards, in particular SQL-89, SQL-92 and recently SQL-99/SQL2003. It is a
basis for many application programming interfaces, such as ODBC and JDBC. SQL
has been extended in many ways - through database updating clauses, database
views, stored procedures, triggers and in direction of programming languages.
Several theories
have been developed for query languages, in particular relational algebra,
relational calculus, and variants of mathematical logic. Many empirical methods
have also appeared, in particular concerning query optimization. There is,
however, controversy concerning their role and usability. In particular, the
relational algebra covers at most a few percents of SQL functionalities and it
is inconsistent with it, and query optimization methods are tightly bound to
physical properties of systems, they have restricted scope, and they have some
conceptual gaps and inconsistencies.
Object-oriented and
object-relational database systems have created new quality in the domain. This
also concerns Web technologies based on the XML standard. New data models,
standards, practical solutions, methods and theories have created the state
that could be characterised as total chaos, lack of consistent, homogeneous
theoretical background and casualty of practical solutions. The evidence of the
chaos is by standards SQL-99 and ODMG OQL, which are considered by many
professionals as entirely non-implementable for specification flaws.
This book presents
a consistent approach to theory and construction of query languages. It is an
attempt to discipline the chaos on the ground of homogeneous theory referred to
as the stack-based approach (SBA). In SBA a query language is considered
a special kind of a programming language. Thus, the semantics of queries is
based on mechanisms well known from programming languages like the environment
stack (thus the stack-based approach). SBA extends this concept for the case of
query operators, such as selection, projection/navigation, join, quantifiers
and others. Using SBA one is able to determine precisely the operational
semantics (abstract implementation) of query languages, including relationships
with object-oriented concepts, embedding queries into imperative constructs,
and embedding queries into programming abstractions: procedures, functional
procedures, views, methods, modules, etc.
The stack-based
approach is a theory of query languages that is independent of a specific data
model. It can be applied to relational, object-oriented, object-relational
databases, and to XML repositories. Because various object models introduce a
lot of incompatible notions, SBA assumes some family of object store models
which are enumerated M0, M1, M2 and M3[1].
The simplest is M0, which covers relational, nested-relational and XML-oriented
databases. M0 assumes hierarchical objects with no limitations concerning
nesting of objects and collections. M0 covers also binary links (relationships)
between objects. Higher-level store models introduce classes and static
inheritance (M1), object roles and dynamic inheritance (M2), and encapsulation
(M3).
As an inherent part
of SBA the book introduces the query language SBQL (Stack-Based Query
Language), which has abstract syntax and rigorous, clean formal semantics. SBQL
is incomparably more powerful than ODMG OQL and W3C XML query languages. SBQL,
together with imperative extensions and abstractions, has the computational
power of programming languages, similarly to Oracle PL/SQL or SQL-99/SQL-2003
standards.
SBQL can also be
considered a theoretical frame in the spirit of the relational algebra and
relational calculus known from the relational model and their object-oriented
counterparts, such as object algebras and F-logic. In comparison, however, SBQL
offers much wider conceptual basis concerning both data structures to be
queried and query operators. Clean, rigorous semantics of SBQL, supporting the
principle of object relativism and full orthogonality of SBQL operators create
big potential for query optimization and for strong static type checking of
SBQL queries and programs integrated with SBQL queries. The book presents how
SBQL can be extended by imperative statements (programming options such as
updating, inserting and deleting), programming abstractions (functions,
procedures and methods) and updateable object-oriented database views defined
similarly to “instead of” trigger views of Oracle and MS SQL
Server.
The book is
designated to students, Ph.D. students, lecturers, and to all computer
professionals who would like to know on query languages a bit more than one can
find in popular database textbooks. Understanding of introduced notions and
methods will allow one to design and quick implement an own, very powerful
query language for practically any purpose. Efficiency of the theory has been confirmed
by many implementations, in particular, for the object-oriented database
management system Loqis, for the XML DOM model, for the object-oriented DBMS
Objectivity/DB, for the European project ICONS (Intelligent Content Management
Systems), for workflow systems based on the XPDL standard, for experimental
object-oriented DBMS YAOD and ODRA and for several other academic projects.
Content (translated from Polish)
1 Introduction 7
2 Motivation
and general assumptions of query languages 17
2.1 What
are query languages? 20
2.2 The
significance of query languages 25
2.3 Applications
of query languages 25
2.4 Properties
of query languages 27
2.5 Impedance
mismatch 29
2.6 Data
schema and organization - inherent features of a query languages 32
2.7 Between
complexity of a data model and complexity of queries 36
2.8 Architecture
of DBMS involving query processing 40
3 Introduction to object-orientedness in databases 49
3.1 Object 50
3.2 Methods
associated with an object 54
3.3 Complex
object 56
3.4 Object
relativism 57
3.5 The
principle of internal identification 58
3.6 Links
between objects 60
3.7 Encapsulation
and information hiding 65
3.8 Mechanism
of messages 69
3.9 Classes
and static inheritance 73
3.10 Polymorphism 87
3.11 Roles
and dynamic inheritance 92
3.12 Collections 98
3.13 Persistence 110
3.14 Modules 115
4 Semantic foundations of query languages 117
4.1 Syntax,
semantics and pragmatics of formal languages 117
4.2 Abstract
syntax and syntax-driven semantics 120
4.3 Semantics
of a query language in a nutshell 121
4.4 Query
optimization 123
4.5 Principles
of of query languages 125
4.6 Elliptic
queries 126
4.7 Other
formal approaches to query languages 128
4.8 Typical
flaws of theories of query languages 134
4.9 What
is (or should be) a theory of query languages? 138
5 Semantic assumptions of the stack-based approach 141
5.1 Names,
scopes and binding 141
5.2 Operational
semantics of queries and programs 142
5.3 Syntactic
assumptions of the Stack-Based Query Language (SBQL) 143
5.4 Abstract
object-oriented store model 144
5.5 Models
M0, M1, M2 i M3 for an object store 146
5.6 Environmental
stack and name binding 174
6 SBQL (Stack-Based Query Language) 193
6.1 SBQL
syntax 193
6.2 Query
result stack (QRES) 197
6.3 General
architecture of the query processing mechanism 199
6.4 Procedure
of query evaluation eval 201
6.5 Algebraic
operators 204
6.6 Non-algebraic
operators 218
6.7 SBQL
- examples of queries and comparisons with SQL and OQL 230
7 Further properties of SBQL 243
7.1 Extensions
of SBQL in the model M1 243
7.2 Extensions
of SBQL in the model M2 252
7.3 Sorting
operator order by 258
7.4 Operator
group by - do we need it? 265
7.5 Processing
irregular data structures 270
7.6 Transitive
closures and fixed-point equations 284
8 Extensions of query languages by imperative constructs 299
8.1 Declarative
and imperative constructs 302
8.2 The
correspondence principle 305
8.3 Elementary
language constructs in imperative languages 306
8.4 Procedures,
functional procedures, methods 327
8.5 Procedure
parameters 329
8.6 Procedures
in SBQL 331
8.7 Extensions
of SBQL in the model M3 341
8.8 Scoping
rules 344
8.9 Parameters
of procedures, functions and methods 345
8.10 Recursive
procedures 351
8.11 Optimization
via query modification 353
9 Updateable views 359
9.1 Motivation 359
9.2 Definition
of views 368
9.3 Examples
of views 383
9.4 Optimization
of queries invoking views 390
9.5 Comparison
with views based on instead of
triggers 393
10 Query optimization 397
10.1 Methods
of data organization and access 403
10.2 The
method of independent subqueries 409
10.3 Utilizing
of operators distributivity property 430
10.4 Utilizing
of indices 433
10.5 Removing
dead subqueries 443
10.6 The
order of using optimization methods 449
11 Strong type checking 451
11.1 Types
and database schemata 452
11.2 Static
and dynamic type checking 455
11.3 Types
in query languages 456
11.1 The
concept of type 458
11.2 External
and internal type system 460
11.3 Utilizing
of the procedure static_type_check 463
11.4 Involving
attributes of signatures 465
11.5 Automatic
coercions and dynamic typechecking 468
11.6 Type
checking of imperative operators 470
11.7 Type
checking of procedures 472
11.8 Dynamic
object checking 473
11.9 Referential
and parameterized types 474
Glossary of terms 481
Polish-English 481
English-Polish 491
Bibliography 501
Index 517
Last modified: January 1, 2008
[1] In later sources devoted to SBA and SBQL store
models M0,
M1, M2 and M3 are denoted AS0, AS1, AS2 and AS3.