Stack-Based Approach (SBA) and Stack-Based Query Language (SBQL)
by Kazimierz Subieta (May 2006)
Back to Description of SBA
and SBQL.
Preface
The Stack-Based Approach (SBA) is a formal
approach to object-oriented database query and programming languages. In SBA we
reconstruct query languages’ concepts from the point of view of programming
languages (PLs). The approach is motivated by our belief that there is no
definite border-line between querying and programming. All attempts to
establish it failed; see the relational completeness, being essentially
a random, poorly motivated concept on the scale of the universality of query
languages. Query languages, as facilities for database programming, absorb a
lot of PLs’ functionalities; see imperative programming extensions of
SQL-92 (update, insert, delete, stored procedures, etc.), a new SQL
standard known as SQL-99, Oracle PL/SQL, MS SQL Server Transact-SQL, the recent
J2EE Hybernate tool, the Microsoft LINQ.NET project, and a lot of Rapid
Application Development tools. Another stream of persistent and/or polymorphic
database PLs follows this line through integrating queries with programming
languages. SBA is an attempt to create a unified conceptual and semantic basis
for queries and programs involving queries, including programming abstractions
such as procedures, functions, classes, types, methods, views, etc.
SBA and SBQL are neutral with respect to data
models. SBA covers all the database models that we are aware of, starting from
the relational model, through XML-oriented data model, RDF-oriented data model,
up to sophisticated object-oriented models with static and dynamic inheritance,
collections, associations, polymorphism, etc. Our fundamental assumption is
that SBA and SBQL address data structures
rather than specific ideological assumptions and constraints known as data
models. Once one would determine how particular concepts in a data model are to
be mapped as abstract data structures, we could propose a corresponding subset
of SBA/SBQL that will be able to handle these structures with full algorithmic
universality and precision. In this way the discussion of query language we
have shifted on another level: we can talk how particular features of data
structures are to be served by SBQL rather than sticking a concrete query
language with a concrete data model. For instance, when one will determine how
XML files will be mapped as abstract data structures we can propose SBQL to
serve these structures. In this way we achieved the unique universality,
flexibility and performance optimization potential. In particular SBQL is the
first in the history query language that deals with dynamic object roles and
dynamic inheritance. Moreover, powerful query optimization methods that are
developed for SBQL are prepared to work with such advanced features.
SBA can be considered as a theoretical approach with a strong and
complete bridge to practice. Because development of SBA was preceded by several
implementations of query languages, it can also be considered as a practical
approach resulting in a consistent and universal theory.
The design of modern and universal
database PLs having querying capabilities requires methods and principles that
are already acknowledged by the common practice of developing compilers and
interpreters. Practically useful PLs must deal with object-orientedness, procedures
and functional procedures (including recursive ones), parameter passing,
various control statements, binding strategies, scoping rules, modularization,
typing, etc. They should follow software engineering principles such as
orthogonality, modularity, minimality, universality, genericity, typing safety,
and clean, precise semantics.
The above issues turn out to be very
severe for theoretical concepts developed in the database domain for dealing
with query languages, including the relational algebra, relational calculus and
formal logics. SBA is an alternative to theoretical concepts emerging on the
object-orientedness wave, such as nested relational algebras, object algebras,
object calculi, F-logic, comprehensions, structural recursion, monoid calculus,
functional approaches, etc. Careful analysis of these theoretical frames has
led us to the conclusion that all of them are too limited (and sometimes
totally inadequate, cf. object algebras)
as the semantic basis for this kind of query languages that we intend to
develop.
The SBA solution relies on adopting the
classical run-time mechanism of PLs, and then, introducing to it necessary
improvements. The main syntactic decision of our approach is unification of PL
expressions and queries; queries remain as the only kind of PL expressions. For
instance, in SBA there is no conceptual difference between expressions such as
2+2 and (x+y)*z, and queries such as
Employee where
Salary = 1000
or
(Employee where Salary = (x+y)*z).Name
All such expressions (or queries, if you like)
can be used as arguments of imperative statements, as parameters of procedures,
functions or methods, as a return from a functional procedure, etc. Note that
in our expressions/queries we avoid the extensive SQL-like syntactic sugar,
which is non-orthogonal, sometimes illogical and makes complex queries
illegible.
Concerning semantics, we focus on the classical
naming-scoping-binding paradigm. Each name occurring in a query
(or in a program involving queries) is bound to run-time programming
entities (persistent data, procedures, actual parameters of procedures, local
procedure objects, etc.) according to the actual scope for the name. The
common PLs’ approach (that we follow in SBA) is that the scopes are
organized in an environmental stack with the “search from the
top” rule (thus the stack-based approach). Some extensions to the
structure of stacks used in PLs are necessary to accommodate the fact that in a
database we have persistent and bulk data structures and the fact that data are
on a server machine, while a stack is on a client machine. Hence the stack
contains data identifiers rather than data themselves (i.e., we separate the
stack from a store of objects), and possibly multiple objects can be
simultaneously bound to a name occurring in a query, which makes the many-data-at-a-time
processing possible. The operational semantics (abstract implementation) of
query operators, imperative programming constructs and procedures (functions,
methods, views, etc.) is defined in terms of an abstract object store
and operations on two classical stacks: environmental stack (abbreviated
as ENVS) and query results stack (abbreviated as QRES).
Almost all the issues presented below are
already discussed in detail in the Polish version of the SBA/SBQL
book and in a lot of papers and reports. They will be consecutively inserted here as hyperlinks during the
progress in writing the English version of the book. Majority of language
concepts and constructs presented below are already implemented in our
prototypes and commercial systems.
Last modified: July 30, 2006