Next: 4 Tuple Trees Up: Schema and Tuple Trees: Previous: 2 A motivating example

3 Schema trees

A schema tree describes the structure of tuples in the database. It indicates which attributes are structured and/or expandable. It also contains type information for leaf attributes. As shown above, schema trees represent many linked relational schemas. Schema trees have the following properties.

Each node in the schema tree has a unique path name.
A subtree of the schema tree headed by any node contains all the attributes logically associated with that node.
By traversing the subtree headed at any node, we can find all the relational tables logically associated with that node.
By traversing the associated subtree of the tuple tree and collecting values at leaf nodes, we can build the join of all the associated relational tables.
We can traverse the schema tree to build an intuitive user interface showing the relationship among the attributes.

Subtrees allow us to restrict our attention. For example, suppose we want to add a new residence to a client's record. To perform this task in a database using relational tables, we would (1) find the client in the Client table, read Client Id, (2) add a new row to the Residence table with Client Id and a new unique Residence Id, and (3) add one or more phone numbers to the Phones table, each with the appropriate Residence Id. With schema trees, however, we only need to find the client and add a new section of the tuple tree corresponding to the Residence attribute. Multiple phone numbers can be added to the new instance by creating a new section of the tuple tree corresponding to the Residence.Phones attribute under the same instance of Residence.

We have demonstrated that a recursively-structured schema tree can describe a set of relational tables where rows in a subordinate table are related to a single entity. In database terminology, these are one-many relations.

Tables that are related in more complex ways, such as many-many relationships, can still be described by a set of schema trees. Two schema trees are linked much in the same way that conventional relational tables are linked. For example, consider the following two schema trees linked by the SocialSecurityNumber field:

        # Person schema tree
        Name (First Last)
        SocialSecurityNumber
        Address (
            Street City State ZipCode
        )*

        # Company schema tree
        CompanyName
        CompanyAddress (
            Street City State ZipCode
        )*
        Employees (
            SocialSecurityNumber
            Position
            SalaryHistory (
                Date
                Salary
            )*
        )*

These particular relations should not be represented by a single schema tree because it is a many-many relationship. A person can be an employee of multiple companies. If you change the Address field of the Company relation, you want the Address field to change whenever accessing employees of that company.

3.1 Schema trees as full database descriptors

We have seen how schema trees can be used to describe the relationship between relational tables and to create new, logically related, entries in those tables. We now describe how Qddb uses the schema tree to fully describe a database by establishing types, verbose field names, search options, word descriptions and value formatting.

Qddb associates a type with each leaf attribute, either a numeric type (real, integer, or date), or string. Numeric values may be associated with a format for display (not input) purposes. The string type consists of words interspersed with separators. Usually, separators are taken to be any non-alphanumeric characters, but the schema may specify other separators. Strings are of arbitrary length.

Each attribute in the schema may be associated with a verbose (many-word) name for display purposes. By default, database searches cover values in all leaf attributes (in relational database parlance, all attributes are indexed), but the designer can exclude individual attributes from the indices to reduce their space requirements.

The schema itself is maintained in a free-format Ascii file that can be created by any text editor. Each attribute or subattribute in the schema is of the form:

        AttributeName ?<options>? ?(<subattributes>)? ?*?

where we use the ?? notation to indicate optional syntax. The <options> may be any of:

Option Purpose

verbosename ``string'' Verbose name of attribute

type string Attribute has type string [default]

type date Attribute has type date

type integer Attribute has type integer

type real Attribute has type real

alias AttributeName A unique alias for the attribute

separators ``string'' Words are separated by one

of the characters in ``string''

format ``string'' Format the attribute based on ``string''

exclude Exclude the attribute from indexing

defaultvalue ``string'' A default value for each instance

To represent a simple database of potential employees, we might keep their name, addresses, phone numbers, rank, and date of application. A schema for this relation could be:

        Name ( First Middle Last )
        Address (
            Street exclude type string
            City
            State
            Zip verbosename "Zip Code"
        )*
        Phones ( Desc verbosename "Description" Number )*
        Rank 
            type integer 
            format "%2d"
            verbosename "Prospect ranking"
        Date type date format "%D %T"
            verbosename "Date Applied"

Real and integer formats are specified in the standard format for printf() in the ANSI C standard, and dates are specified in the standard form accepted by the POSIX function strftime(). All formats pertain to output conventions; input conventions are those accepted by atoi() (integers), atof() (reals), and a wide variety of forms (dates).

3.2 Enumerating attributes within a schema tree

It is important to be able to uniquely distinguish attributes in a schema tree. For this reason, a schema must uniquely name attributes at the top level of each subtree. Attributes across levels may have identical names. For example, consider this schema:

        A ( B C )* B C

The leaf attributes are A.B, A.C, B, and C. When we refer to B, we are referring to the leaf attribute B, not A.B.

A depth-first traversal of the schema tree generates the set of leaf attributes. This set, displayed with one column for each leaf attribute, is called a complete row of the schema. If we wish to elide certain uninteresting parts of the complete row, we may ignore the columns comprising those parts, resulting in a partial row. For example, if we are only interested in the leaf attributes B and C, then we exclude attribute A and all its leaves from our traversal.

Next: 4 Tuple Trees Up: Schema and Tuple Trees: Previous: 2 A motivating example

Herrin Software Development, Inc.

Option	Purpose
verbosename ``string''	Verbose name of attribute
type string	Attribute has type string [default]
type date	Attribute has type date
type integer	Attribute has type integer
type real	Attribute has type real
alias AttributeName	A unique alias for the attribute
separators ``string''	Words are separated by one
	of the characters in ``string''
format ``string''	Format the attribute based on ``string''
exclude	Exclude the attribute from indexing
defaultvalue ``string''	A default value for each instance