Kazimierz Subieta.

Theory and Construction of Object-Oriented Query Languages.

Editors of the Polish-Japanese Institute of Information Technology, Warsaw 2004, ISBN 83-89244-29-2 (522 pages)

Extended abstract and Content in English


Polish Edition:

 

 Kazimierz Subieta. Teoria i konstrukcja obiektowych języków zapytań. Wydawnictwo PJWSTK, Warszawa 2004, ISBN 83-89244-29-2 (522 stron)

 

Streszczenie

Spis treści

Podziękowania

Wprowadzenie

Notka historyczna

Bibliografia

Słowniczek terminów polsko-angielski

Słowniczek terminów angielsko-polski


 English translation in preparation, but with no firm deadlines.


Extended Abstract

Query languages are user-friendly interfaces enabling the users to search a database according to any criteria and freely determined retrieval goals. The most popular query language is SQL. It is considered by many professionals as an essential factor of the commercial success of relational databases. SQL is a subject of several standards, in particular SQL-89, SQL-92 and recently SQL-99/SQL2003. It is a basis for many application programming interfaces, such as ODBC and JDBC. SQL has been extended in many ways - through database updating clauses, database views, stored procedures, triggers and in direction of programming languages.

Several theories have been developed for query languages, in particular relational algebra, relational calculus, and variants of mathematical logic. Many empirical methods have also appeared, in particular concerning query optimization. There is, however, controversy concerning their role and usability. In particular, the relational algebra covers at most a few percents of SQL functionalities and it is inconsistent with it, and query optimization methods are tightly bound to physical properties of systems, they have restricted scope, and they have some conceptual gaps and inconsistencies.

Object-oriented and object-relational database systems have created new quality in the domain. This also concerns Web technologies based on the XML standard. New data models, standards, practical solutions, methods and theories have created the state that could be characterised as total chaos, lack of consistent, homogeneous theoretical background and casualty of practical solutions. The evidence of the chaos is by standards SQL-99 and ODMG OQL, which are considered by many professionals as entirely non-implementable for specification flaws.

This book presents a consistent approach to theory and construction of query languages. It is an attempt to discipline the chaos on the ground of homogeneous theory referred to as the stack-based approach (SBA). In SBA a query language is considered a special kind of a programming language. Thus, the semantics of queries is based on mechanisms well known from programming languages like the environment stack (thus the stack-based approach). SBA extends this concept for the case of query operators, such as selection, projection/navigation, join, quantifiers and others. Using SBA one is able to determine precisely the operational semantics (abstract implementation) of query languages, including relationships with object-oriented concepts, embedding queries into imperative constructs, and embedding queries into programming abstractions: procedures, functional procedures, views, methods, modules, etc.

The stack-based approach is a theory of query languages that is independent of a specific data model. It can be applied to relational, object-oriented, object-relational databases, and to XML repositories. Because various object models introduce a lot of incompatible notions, SBA assumes some family of object store models which are enumerated M0, M1, M2 and M3[1]. The simplest is M0, which covers relational, nested-relational and XML-oriented databases. M0 assumes hierarchical objects with no limitations concerning nesting of objects and collections. M0 covers also binary links (relationships) between objects. Higher-level store models introduce classes and static inheritance (M1), object roles and dynamic inheritance (M2), and encapsulation (M3).

As an inherent part of SBA the book introduces the query language SBQL (Stack-Based Query Language), which has abstract syntax and rigorous, clean formal semantics. SBQL is incomparably more powerful than ODMG OQL and W3C XML query languages. SBQL, together with imperative extensions and abstractions, has the computational power of programming languages, similarly to Oracle PL/SQL or SQL-99/SQL-2003 standards.

SBQL can also be considered a theoretical frame in the spirit of the relational algebra and relational calculus known from the relational model and their object-oriented counterparts, such as object algebras and F-logic. In comparison, however, SBQL offers much wider conceptual basis concerning both data structures to be queried and query operators. Clean, rigorous semantics of SBQL, supporting the principle of object relativism and full orthogonality of SBQL operators create big potential for query optimization and for strong static type checking of SBQL queries and programs integrated with SBQL queries. The book presents how SBQL can be extended by imperative statements (programming options such as updating, inserting and deleting), programming abstractions (functions, procedures and methods) and updateable object-oriented database views defined similarly to “instead of” trigger views of Oracle and MS SQL Server.

The book is designated to students, Ph.D. students, lecturers, and to all computer professionals who would like to know on query languages a bit more than one can find in popular database textbooks. Understanding of introduced notions and methods will allow one to design and quick implement an own, very powerful query language for practically any purpose. Efficiency of the theory has been confirmed by many implementations, in particular, for the object-oriented database management system Loqis, for the XML DOM model, for the object-oriented DBMS Objectivity/DB, for the European project ICONS (Intelligent Content Management Systems), for workflow systems based on the XPDL standard, for experimental object-oriented DBMS YAOD and ODRA and for several other academic projects.

 


Content (translated from Polish)

1   Introduction  7

2   Motivation and general assumptions of query languages           17

2.1   What are query languages?    20

2.2   The significance of query languages  25

2.3   Applications of query languages       25

2.4   Properties of query languages            27

2.5   Impedance mismatch 29

2.6   Data schema and organization - inherent features of a query languages     32

2.7   Between complexity of a data model and complexity of queries    36

2.8   Architecture of DBMS  involving query processing 40

3   Introduction to object-orientedness in databases 49

3.1   Object 50

3.2   Methods associated with an object   54

3.3   Complex object          56

3.4   Object relativism        57

3.5   The principle of internal identification          58

3.6   Links between objects           60

3.7   Encapsulation and information hiding          65

3.8   Mechanism of messages         69

3.9   Classes and static inheritance            73

3.10 Polymorphism 87

3.11 Roles and dynamic inheritance          92

3.12 Collections      98

3.13 Persistence      110

3.14 Modules          115

4   Semantic foundations of query languages            117

4.1   Syntax, semantics and pragmatics of formal languages       117

4.2   Abstract syntax and syntax-driven semantics           120

4.3   Semantics of a query language in a nutshell  121

4.4   Query optimization    123

4.5   Principles of of query languages        125

4.6   Elliptic queries            126

4.7   Other formal approaches to query languages            128

4.8   Typical flaws of theories of query languages            134

4.9   What is (or should be) a theory of query languages?            138

5   Semantic assumptions of the stack-based approach        141

5.1   Names, scopes and binding   141

5.2   Operational semantics of queries and programs        142

5.3   Syntactic assumptions of the Stack-Based Query Language (SBQL)         143

5.4   Abstract object-oriented store model            144

5.5   Models M0, M1, M2 i M3 for an object store          146

5.6   Environmental stack and name binding        174

6   SBQL (Stack-Based Query Language)      193

6.1   SBQL syntax  193

6.2   Query result stack (QRES)    197

6.3   General architecture of the query processing mechanism     199

6.4   Procedure of query evaluation eval   201

6.5   Algebraic operators    204

6.6   Non-algebraic operators         218

6.7   SBQL - examples of queries and comparisons with SQL and OQL           230

7   Further properties of SBQL                243

7.1   Extensions of SBQL in the model M1          243

7.2   Extensions of SBQL in the model M2          252

7.3   Sorting operator order by       258

7.4   Operator group by - do we need it?  265

7.5   Processing irregular data structures   270

7.6   Transitive closures and fixed-point equations           284

8   Extensions of query languages by imperative constructs            299

8.1   Declarative and imperative constructs          302

8.2   The correspondence principle            305

8.3   Elementary language constructs in imperative languages     306

8.4   Procedures, functional procedures, methods            327

8.5   Procedure parameters 329

8.6   Procedures in SBQL  331

8.7   Extensions of SBQL in the model M3          341

8.8   Scoping rules  344

8.9   Parameters of procedures, functions and methods   345

8.10 Recursive procedures 351

8.11 Optimization via query modification 353

9   Updateable views      359

9.1   Motivation      359

9.2   Definition of views    368

9.3   Examples of views     383

9.4   Optimization of queries invoking views        390

9.5   Comparison with views based on instead of triggers            393

10 Query optimization   397

10.1    Methods of data organization and access     403

10.2    The method of independent subqueries        409

10.3    Utilizing of operators distributivity property            430

10.4    Utilizing of indices     433

10.5    Removing dead subqueries    443

10.6    The order of using optimization methods     449

11 Strong type checking           451

11.1    Types and database schemata            452

11.2    Static and dynamic type checking     455

11.3    Types in query languages       456

11.1    The concept of type   458

11.2    External and internal type system     460

11.3    Utilizing of the procedure static_type_check            463

11.4    Involving attributes of signatures      465

11.5    Automatic coercions and dynamic typechecking  468

11.6    Type checking of imperative operators         470

11.7    Type checking of procedures 472

11.8    Dynamic object checking       473

11.9    Referential and parameterized types 474

Glossary of terms    481

Polish-English           481

English-Polish           491

Bibliography          501

Index   517


 

Last modified: January 1, 2008

 

 



[1] In later sources devoted to SBA and SBQL store models M0, M1, M2 and M3 are denoted AS0, AS1, AS2 and AS3.