AS1 Store Model: Classes and Inheritance
by Kazimierz Subieta (March 2006)
The concept of class is an abstraction in thinking and programming whose
intention is to capture both the static properties of objects (i.e. their
structure) and their dynamic properties (i.e. operations that can be
performed on objects or by objects). The definition of a class addresses both human minds
(i.e. it supports conceptual modeling) and software engines (i.e. it allows the
software-side data structures called objects to be maintained correctly). The conceptual modeling role of
a class (emphasized, in particular, in UML) is very important, but it is of
secondary importance for the semantics of query languages; hence we do not discuss
it. We consider a class as a programming entity that is associated with
the definition and maintenance of the data structures called objects.
In this role the concept of class has two different meanings. The first
one (popular among theoreticians) has its origin in mathematics and says that
from the semantic point of view a class is a set of objects (cf. the equivalence
classes induced by an equivalence relation). This is a wrong definition for
classes understood as software units, because it does not reflect important
properties of classes, such as methods. All known object-oriented programming
languages, standards and systems assume (usually implicitly) another definition
of the class concept, which says that:
A class is a container
storing invariants (common features)
for some population of objects.
Usually, object-oriented literature distinguishes two kinds of
invariants that are stored within classes: typing information (i.e. names of
an object's attributes together with their types) and methods (operations)
that can be invoked on objects (together with typing information concerning
the methods' parameters and output). Neither kind is obligatory. For instance,
Smalltalk objects have no types, hence classes in Smalltalk contain no typing
information. On the other hand, some object-based models do not involve
methods. Other kinds of invariants can also be stored within
classes. In particular, CORBA IDL interfaces may also specify exceptions and
reactions to exceptions, while ODMG interfaces and classes may additionally contain
relationships among objects, extents and keys. Some database models also include
the objects' name as an invariant stored within a class.
Inheritance means that
more general invariants are stored in more general classes and more specific
invariants are stored in their sub-classes.
A class that is most specific for an object inherits more general
invariants from its super-classes. For instance, a class FirstYearStudentClass inherits more general invariants from the
class StudentClass, and yet more
general invariants from the class
PersonClass.
As follows from the discussion, we consider classes and types as two
different semantic beings, with different roles. Typing information is
necessary for strong static type checking of queries or programs acting on
objects and/or for dynamic checking of the objects' structure. Classes may
contain types as a particular kind of invariant, but in principle this is not
obligatory (although desirable). Classes contain all the invariants that can be
factored out as a common part of the objects' semantics, in particular
methods, the objects' name, etc.
Typing information is also the main component of an object interface,
but again types and interfaces are different programming beings, with different
semantic roles.
An interface is a
programming being that informs the programmer how a corresponding object
can be manipulated, what constraints apply to the manipulations, and what
the results of the manipulations will be.
Again, interfaces can also be defined without typing information,
although we consider combining typing information with interfaces desirable.
Classes and interfaces are also different programming beings, which
should not be confused (as, e.g., in the ODMG standard). Classes contain
implementation, while interfaces are specifications only. Perhaps the most
fundamental difference between classes and interfaces is that classes can be
the subject of trade (they can be sold and bought), while interfaces cannot.
This subdivision is independent of the keywords that are used in a particular
artifact. In the ODMG standard no classes can be specified; both keywords, class and interface, denote interfaces, and the semantic difference between the
concepts that the ODMG suggests is artificial.
Interfaces are important as a pragmatic part of a query language, but
essentially they have little significance for the semantics of query languages. More
precisely, typing information that is a component of an interface has some
influence on the semantics, but not the major one; we return to this issue
later. The AS1 store model that we intend to define will not be associated with
types and interfaces. Interfaces, however, are a usual way to deal with encapsulation; we return to this issue
when we consider the AS3 store model. Types and schemata (which are more
difficult features than usually expected) will be introduced much later.
Concerning classes, we can distinguish two forms of them:
· Classes that are parts of a source text file prepared by the programmer in some text
editor. In some languages and systems (e.g. C++) this is the only form in which
classes exist. No concept of a class exists in the run-time environment;
after compilation a class loses its identity, becomes a part of the executable
code and cannot be identified by any programming means. In such cases we
say that classes have second-class citizenship: they exist in the source code,
but they cannot be identified or manipulated during run time.
· Classes that are run-time entities, which can be identified and manipulated (e.g. tested,
bound, created, removed or altered) during run time. Such a class must possess
its identity on the same principle as objects do.
If only the first form exists, all names that occur in a query and refer to
properties of a class must be bound at compile time,
i.e. a query must be compiled and linked together with
the class it refers to. However, this is contrary to a basic
property of queries, which in many cases must be created and interpreted during
run time. For instance, one can create and execute an ad-hoc query while
the database is operating, when the whole program containing classes is already
compiled and is currently being executed. Because queries occur within client
applications rather than within database servers, second-class citizenship
means that classes are not properties of a database, which is contrary to the data
independence principle.
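The need for classes as run-time entities can be illustrated with a minimal sketch in Python, a language in which classes happen to survive as identifiable run-time objects. The names Emp, net_sal and run_ad_hoc are hypothetical and do not come from SBQL or ODMG; the sketch only shows that a method name arriving as plain text at run time can be bound because the class can still be found.

```python
class Emp:
    """Hypothetical class; in Python it remains a run-time object."""

    def __init__(self, name, sal):
        self.name = name
        self.sal = sal

    def net_sal(self):
        # Illustrative computation only (an assumed 20% deduction).
        return self.sal * 80 // 100


def run_ad_hoc(obj, method_name):
    """Bind a method name that is known only at run time, then call it."""
    cls = type(obj)                      # the class is reachable at run time
    method = getattr(cls, method_name)   # late binding by name
    return method(obj)


poe = Emp("Poe", 2000)
result = run_ad_hoc(poe, "net_sal")      # the name arrives as plain text
```

If the class existed only in the source text, the lookup performed by getattr would have nothing to search, which is the situation described above for classes with second-class citizenship.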
Hence, concerning the semantics of a query language, the second
form is essential. Of course, the first form must nevertheless exist: the
programmer defines classes within a source text file, including the source code
of the method implementations. After compilation such a class is converted into
the second form, which is then used by the query engine.
In the AS1 store model we deal with classes in the second form only. A
class is an object recorded in an object store. We associate a special meaning
and special operations with this object; they will become clear after we define
the semantics of query operators.
ODMG essentially assumes that no class is stored as an object on the side of the
database server. The standard presents only the first form of the
class/interface representation; the same concerns the meta-model of the
database (which is introduced informally and extremely obscurely). Because
no object represents a class during run time, the binding of
class properties (e.g. names of methods) within OQL queries (which are
run-time rather than compile-time entities) has no known target. Hence, in
our opinion, the ODMG standard violates the assumptions of the early binding
mechanisms of typical programming languages. In practice, this means that binding
methods in OQL is non-implementable in the majority of cases.
In AS1 an object store is defined as a five-tuple <S, C, R, CC, SC>, where:
· S is a set of (perhaps nested and linked) objects, as in AS0.
· C is a set of classes. Classes are objects too.
· R is a set of identifiers of root (start) objects, as in AS0. Usually we assume that identifiers of classes are not among root identifiers.
· Relation CC ⊂ IC × IC determines inheritance among classes. IC ⊂ I denotes identifiers of classes. If <i1, i2> ∈ CC, then the class identified by i1 inherits from the class identified by i2. The relation CC should not contain cycles.
· Relation SC ⊂ IS × IC determines membership of objects in classes. IS ⊂ I denotes identifiers of objects which are not classes. If <i1, i2> ∈ SC, then the object identified by i1 is a member of the class identified by i2.
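The five-tuple can be sketched as a small data structure with a check of the stated constraints. This is a minimal sketch under assumptions: the name AS1Store, the dictionary layout and the check method are hypothetical, identifiers are plain strings, and only top-level objects are treated as class members. It is not the representation used by any actual SBQL implementation.

```python
from dataclasses import dataclass, field


@dataclass
class AS1Store:
    """Hypothetical container for the five-tuple <S, C, R, CC, SC>."""
    S: dict = field(default_factory=dict)   # object id -> (name, value)
    C: dict = field(default_factory=dict)   # class id  -> (name, invariants)
    R: set = field(default_factory=set)     # identifiers of root objects
    CC: set = field(default_factory=set)    # pairs (subclass id, superclass id)
    SC: set = field(default_factory=set)    # pairs (object id, class id)

    def check(self):
        """Raise ValueError if a constraint from the definition is violated."""
        for sub, sup in self.CC:
            if sub not in self.C or sup not in self.C:
                raise ValueError(f"CC pair ({sub}, {sup}) must relate two classes")
        for obj, cls in self.SC:
            if obj not in self.S or cls not in self.C:
                raise ValueError(f"SC pair ({obj}, {cls}) must relate an object to a class")
        if self._has_cycle():
            raise ValueError("CC must not contain cycles")

    def _has_cycle(self):
        """Depth-first search over CC; reaching an 'open' node means a cycle."""
        supers = {}
        for sub, sup in self.CC:
            supers.setdefault(sub, set()).add(sup)
        state = {}                           # id -> "open" or "done"

        def visit(node):
            if state.get(node) == "open":
                return True
            if state.get(node) == "done":
                return False
            state[node] = "open"
            found = any(visit(nxt) for nxt in supers.get(node, ()))
            state[node] = "done"
            return found

        return any(visit(node) for node in list(supers))


store = AS1Store(
    C={"i40": ("PersonClass", {}), "i50": ("EmpClass", {})},
    CC={("i50", "i40")},
)
store.check()   # passes: both ids are classes and CC is acyclic
```

Keeping CC and SC as plain sets of pairs mirrors the relational wording of the definition; a physical implementation is free to choose something else.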
Each invariant stored within a
class should be decorated by a flag determining its kind (a method, an object
name, an export list, a trigger, etc.) but in our examples we skip these flags
treating them as self-evident.
Note that the AS1 model is a superset
of the AS0 model. We do not require that each object belong to a class. It
makes sense to establish classes only in cases when they store some
non-trivial invariant of a population of objects. If no such invariant can
be established, defining a class makes no sense because it does not change
anything in the semantics.
In Fig.8 and Fig.9 we present an
example of an AS1 object store.
S - Objects
< i1, Person, { < i2, name, "Doe" >, ... } >,
< i5, Emp, { < i6, name, "Poe" >, < i7, sal, 2000 >, < i8, worksIn, i22 >, ... } >,
< i9, Emp, { < i10, name, "Lee" >, < i11, sal, 900 >, < i16, worksIn, i33 >, ... } >
C - Classes
< i40, PersonClass, { < i41, age, (... the code of the method age ...) >, ... other invariants of the PersonClass ... } >,
< i50, EmpClass, { < i51, changeSal, (... the code of the method changeSal ...) >, < i52, netSal, (... the code of the method netSal ...) >, ... other invariants of the EmpClass ... } >
R - Start identifiers
i1, i5, i9
CC - Inheritance among classes
< i50, i40 >
SC - Membership of objects within classes
< i1, i40 >, < i5, i50 >, < i9, i50 >
Fig.8. Example of an AS1 object store

Fig.9. Graphical representation of the example
AS1 object store
As before, in the graphical representation the identifiers of root
objects are within circles. Identifiers of classes are not among root identifiers,
hence we assume that queries cannot directly refer to classes. For some
purposes, e.g. administration of the store, we can imagine that class objects
such as PersonClass and EmpClass can be manipulated, e.g.
removed or altered. Under this
assumption, their identifiers should belong to root identifiers, but perhaps
only in a special administrative mode. An arrow with a big white triangle end
denotes inheritance (CC) and thick gray arrows denote membership of objects
within classes (SC). Classes contain methods, together with their compiled
implementation. Methods are understood as procedures with some specific scoping
rules; this will be explained later. To simplify the picture, in this representation
we do not present formal parameters of the methods; this feature will also be
introduced later. Note that this is an abstract view; the relations CC and SC
can be implemented physically in many ways. For instance, the SC relation can
be implemented by containers storing objects belonging to particular classes.
So far we also say nothing about how the relations CC and SC will be used by
the query execution engine; this will be considered later. Our intention is to
define the abstract store in which such relations can be recorded.
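The store of Fig.8 can be transcribed into plain Python data to show how CC and SC record the structure. This is a sketch under assumptions: the dictionary layout and the helpers classes_of and invariant_names are hypothetical, the elided parts of the figure are left out, and method code (elided in the figure) is represented by the method name only. The traversal is one possible reading of the relations, not the binding algorithm of the query engine, which is defined later.

```python
# Data transcribed from Fig.8; elided parts ("...") are simply left out.
S = {
    "i1": ("Person", {"i2": ("name", "Doe")}),
    "i5": ("Emp", {"i6": ("name", "Poe"), "i7": ("sal", 2000),
                   "i8": ("worksIn", "i22")}),
    "i9": ("Emp", {"i10": ("name", "Lee"), "i11": ("sal", 900),
                   "i16": ("worksIn", "i33")}),
}
C = {
    "i40": ("PersonClass", {"i41": "age"}),
    "i50": ("EmpClass", {"i51": "changeSal", "i52": "netSal"}),
}
R = {"i1", "i5", "i9"}
CC = {("i50", "i40")}                                # EmpClass inherits from PersonClass
SC = {("i1", "i40"), ("i5", "i50"), ("i9", "i50")}   # object -> class


def classes_of(obj_id):
    """Ids of classes the object belongs to directly (SC) or through CC."""
    found = []
    pending = sorted(cls for obj, cls in SC if obj == obj_id)
    while pending:
        cls = pending.pop(0)
        if cls not in found:
            found.append(cls)
            pending.extend(sorted(sup for sub, sup in CC if sub == cls))
    return found


def invariant_names(obj_id):
    """Invariant names reachable from the object, most specific class first."""
    return [name for cls in classes_of(obj_id) for name in C[cls][1].values()]


emp_invariants = invariant_names("i5")      # invariants available to Poe
person_invariants = invariant_names("i1")   # invariants available to Doe
```

Under these assumptions the Emp object i5 reaches the invariants of EmpClass and, through the pair < i50, i40 >, those of PersonClass, while the Person object i1 reaches only those of PersonClass.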
The AS1 store model also covers multiple inheritance and the possibility
that one object is a direct member of more than one class. We allow for pairs
<i, i1>, <i, i2> ∈ CC such that i1 ≠ i2; similarly for SC. Such
situations can be handled by the defined query engine with no difficulties.
There are cases when multiple inheritance and multiple membership are
reasonable, thus we do not forbid them.
The AS1 model is an abstraction over the most popular models of
object-oriented programming languages, modeling tools and database systems. It
allows the substitutability principle to be accomplished.
The principle is quite easy to implement through a proper name
binding algorithm within the query execution engine. Although the model seems
to be simple and natural, it leads to problems concerning, in particular,
multiple inheritance and repeated inheritance. In particular, if
classes A and B are developed independently, they may contain methods
having the same name and type; if we then define a class C that inherits from both A and
B, then one of the two fundamental principles of object-orientedness, the
substitutability principle or the open-closed
principle, must be violated. The AS1 store model also has severe
disadvantages as a database model, because (as we have argued before)
substitutability is in contradiction with the concept of collections of objects
and the open-closed principle. For these reasons we introduce the AS2 store
model, which is the cure for all the conceptual shortcomings of AS1.
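The name conflict between independently developed classes can be illustrated with a minimal Python sketch. The class names A, B and C follow the text; the method name m and the function client_of_b are hypothetical. The way Python resolves the conflict (its method resolution order) is a choice of that language and is used here only as an example; it is not part of the AS1 model.

```python
class A:
    def m(self):
        return "A.m"


class B:
    def m(self):          # same name and signature, developed independently
        return "B.m"


class C(A, B):            # C inherits from both A and B
    pass


def client_of_b(b):
    """Written against B only; it expects B's behaviour of m."""
    return b.m()


observed = client_of_b(C())   # a C object is used where a B is expected
```

Here the C object answers with the method of A, so code written against B does not get the behaviour of B: substitutability is broken for B. Avoiding that would require changing A, B or the way their methods are named, which goes against leaving independently developed classes closed for modification.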
Last modified: December 31, 2007