Advanced Techniques
in
Personalized Information Delivery
Dr.-Ing. W. Lehner (Ed.)
Lehrstuhl für
Datenbanksysteme
Institut für Informatik
Friedrich-Alexander-Universität
Erlangen-Nürnberg
Preface
While most of our day-to-day interactions with information systems are pull-based, the availability of mobile information appliances like smart phones, PDAs, or even laptop computers demands a permanent and personalized information supply without explicitly querying the information sources. This shift from pull- to push-based information delivery holds tremendous potential as well as the risk of information flooding. Thus the need for systems providing a push-based and personalized information service is obvious.
In general we may distinguish four classes of personalized information services. The lowest class applies to broadcasting, where information is simply distributed to every consumer device. Either the end user or an appropriate application on his/her side is responsible for filtering and further processing. The next class, data dissemination, provides a channel-based addressing scheme, where users register explicitly for certain channels. All information fed to the selected channels by any information producer is forwarded to the end user without further filtering. Sample application scenarios are news or stock tickers. The third class of personalized information services is the content-based filtering scheme provided by data notification services. In this context, the user may specify a predicate addressing the relevant information with regard to a certain channel, thus providing extensive information filtering.
All the concepts covered by broadcast, dissemination, or notification preserve the original messages produced by the publisher. Any further processing, like combining information from multiple information sources or aggregating numerical or textual information, is not supported. Therefore the fourth class, subscription systems, allows users to register complex queries in combination with specific delivery criteria. Once messages are produced by publishing components, they are integrated into a publication-consistent global database and propagated to the appropriate subscribers. The application context, the resulting requirements, the existing technologies, and advanced strategies from a modeling as well as an implementation perspective are the focus of this report.
The collection of papers is subdivided into four parts. The first two articles give a general overview of and a comprehensive introduction to the publish/subscribe area. While the first paper emphasizes the interface between publish/subscribe systems and classical database concepts, the second paper provides a general overview of the subscription system PubScribe, which aims to build an open information market where information may be freely published and users may specify their personalized delivery strategies.
The second part of this report contains a comprehensive discussion of personalized information delivery in the context of data warehousing and eBusiness application areas. The first paper in this part, ’Revealing Real Problems in Real Data Warehouse Applications’, pinpoints several requirements for a successful data warehouse installation. The following contribution extends the classical demand-driven data warehouse architecture by an additional subscription component providing flexible data integration as well as push-based data delivery. The use of subscription technology in the context of contract negotiation in electronic business environments is discussed in the third article, where subscription technology is used during an informational matchmaking phase between the selling and buying sides.
The third part includes a collection of papers addressing technological issues from several directions. The first contribution discusses maintenance strategies, implemented in IBM DB2/UDB, to synchronize materialized views with changes to the base tables. Efficient synchronization techniques may be considered the core of an efficient mechanism for propagating incoming messages to the corresponding subscribers. While the second paper addresses the producer side of a subscription system by reviewing web site scraping technologies and proposing a new iterative mechanism called XWeb, the third article in this part gives an example of possible target systems of subscription delivery in the context of footprint database systems running on handheld devices.
The last part finally provides an in-depth explanation of the research prototype PubScribe. In
general the PubScribe system aims at implementing a highly scalable and open subscription
system based on relational database technology. The first contribution introduces the general
architecture at a microscopic and at a macroscopic level. The second paper emphasizes the
different modeling perspectives from a structural and dynamic perspective. The last article
closes the description by sketching implementational issues like mass query optimization
and mapping schemes to relational structures.
In summary, the reader of the report will gain insight into general requirements, special characteristics, and technological impacts as well as ongoing work in the field of subscription systems.
Erlangen, February 2001
Dr.-Ing. Wolfgang Lehner
Table of Contents
Preface
Table of Contents
PART A: INTRODUCTION
The Revolution Ahead: Publish/Subscribe meets Database Systems
(Wolfgang Lehner, Wolfgang Hümmer)
Abstract
1. Introduction
2. Information Delivery Criteria
3. An Overview of Existing Event Notification and Data Dissemination Systems
4. The PubScribe Subscription Model
5. The PubScribe Message Processing Model
6. Summary
References
Building An Information Marketplace using a Content and Memory based Publish/Subscribe System
(Wolfgang Lehner, Wolfgang Hümmer, Michael Redert)
Abstract
1. Introduction
2. Related Work
3. Subscription Semantics
4. The PubScribe Communication Pattern
5. The Message Processing Model
6. Implementational Perspectives
7. Summary and Conclusion
References
PART B: APPLICATIONS
Revealing Real Problems in Real Data Warehouse Applications
(Wolfgang Lehner, Thomas Ruf)
Abstract
1. Introduction
2. Modeling the World of Business
3. User-Friendly Warehouse Access
4. Ways of Optimizing Aggregation Processing
5. The Magic Triangle of Summary Tables
6. Summary and Conclusion
References
Publish/Subscribe-Systeme im Data-Warehousing: Mehr als nur eine Renaissance der Batch-Verarbeitung
(W. Lehner, W. Hümmer, M. Redert, C. Reinhard)
Kurzfassung
1. Einleitung
2. Logische PubScribe Architektur
3. Verarbeitungsmodell
4. PubScribe Implementierung
5. Verwandte Arbeiten und Projekte
6. Zusammenfassung und Ausblick
Literatur
Über Anbahnung und Aushandlung von Verträgen im eBusiness
(H. Wedekind, W. Hümmer, W. Lehner)
1. Einleitung
2. Aspekte des eBusiness
3. Anbahnung auf Basis von ‚Publish/Subscribe‘
4. Vertragsverhandlungen als eine dialogische Inhaltsbearbeitung
5. Zusammenfassung und Ausblick
Literatur
PART C: TECHNOLOGICAL ISSUES
Maintenance of Automatic Summary Tables in IBM DB2/UDB
(Wolfgang Lehner, Bobbie Cochrane, Richard Sidle, Hamid Pirahesh, Markos Zaharioudakis)
Abstract
1. Introduction
2. Definition of Automatic Summary Tables
3. Incremental Maintenance of ASTs
4. Correctness of the ‘Cube Delta’ Approach
5. Full Refresh of ASTs
6. Summary and Future Work
References
Server Side Website Wrapping: The XWeb approach
(Jürgen Lukasczyk, Wolfgang Hümmer)
Abstract
1. Introduction
2. Related Work
3. General Aspects of Wrapping
4. The XWeb Framework
5. An Example: XWeb wraps news.com
6. Conclusion
References
Platform Invariant Database Design for Information Appliances
(Ulrich Grießer, Wolfgang Hümmer, Wolfgang Lehner)
Abstract
1. Introduction
2. Overview of Handheld Device Operating Systems
3. EPOC Release 5
4. Palm OS Version 3.x
5. Windows CE Version 3.0 and ”PocketPC”
6. Other Handheld Operating Systems
7. Platform Invariant Design of Applications
8. Summary and Conclusion
References
Internet Links
PART D: PUBSCRIBE
PubScribe Mikro- und Makroarchitektur
(M. Redert, C. Reinhard, W. Lehner, W. Hümmer)
Kurzfassung
1. Einleitung
2. Architekturentwurf
3. Informationskanäle des Brokers
4. Verteilter Architekturentwurf
5. Zusammenfassung und Ausblick
Literatur
Strukturelle und Operationelle Modellierungsaspekte in PubScribe
(M. Redert, W. Lehner, W. Hümmer)
Kurzfassung
1. Einleitung
2. Beispielszenario
3. Struktureller Aspekt
4. Operationeller Aspekt
5. Modellierung von Produzenten
6. Modellierung von Subskriptionen
7. Zusammenfassung
Literatur
Ausgewählte Konzepte der Realisierung von PubScribe
(M. Redert, C. Reinhard, W. Hümmer, W. Lehner)
Kurzfassung
1. Einleitung
2. Registrierung einer Subskription
3. Produzentenregistrierung und Publikation
4. Anfragenübergreifende Optimierung
5. Evaluierung anfrageübergreifender Optimierungsmethoden
6. Zusammenfassung
Literatur
PART A: INTRODUCTION
THE REVOLUTION AHEAD:
PUBLISH/SUBSCRIBE MEETS DATABASE SYSTEMS
Wolfgang Lehner, Wolfgang Hümmer
University of Erlangen-Nuremberg
Martensstr. 3, Erlangen, 91058, Germany
EMail: {lehner, huemmer}@informatik.uni-erlangen.de
Abstract
The traditional way of user interaction with a database system follows the classical ’request/response’ query paradigm, where the user issues a query and retrieves the result as fast as possible. The primary issues are efficiency and consistency. Driven by huge numbers of concurrent users accessing large data sets, especially in the context of Data Warehouse Systems, the ’request/response’ paradigm no longer looks feasible. It is widely agreed that the novel query paradigm of ’publish/subscribe’ may be one way to alleviate the aforementioned problems. In the ’publish/subscribe’ context, a user registers a subscription (also called a standing query) once at a subscription management system and periodically or aperiodically receives notifications, i.e. the result of the query with regard to the currently valid state of the underlying database. In this paper, we give an overview of the ’publish/subscribe’ paradigm in general and introduce the framework of the content- and memory-based subscription system PubScribe. The PubScribe framework provides an open platform for publishing information and registering complex structured subscriptions. The PubScribe subscription model as well as the processing model are discussed.
1 Introduction
The everyday interaction with any sort of database follows the classical ’request/response’ paradigm, where the user (or client) poses a query, the (database) system (or server) tries to execute the query as fast as possible, and the result set is transferred into the user’s application context (Figure 1a). However, it would be desirable to move from a system-oriented perspective, where the user has to interact explicitly with a database system, to a data-oriented perspective, where the user simply registers some kind of information template (subscription) and automatically receives notifications about new or updated information within the underlying database ([Norm98]).
In this paper we address such an interaction paradigm ([Birm93]), typically called “publish/subscribe”, where producer and consumer of information do not know each other and communicate via a subscription management system (Figure 1b; [OPSS93], [Chan98]). Producers (’publishers’) send new information in the form of messages to the subscription management system as soon as the information is available. Newly incoming messages are merged into a global logical database, and standing queries, specified in the context of a subscription, are evaluated. Furthermore, query specification and result transmission are decoupled from a subscriber’s point of view. Once a query is formulated, the user is no longer in contact with the database system, but receives a notification only if new messages of interest have arrived at the database system or a given time interval has elapsed. The advantage for the user is tremendous: once a query (in the form of a subscription) is specified, the user does not have to wait for the answer because the result will be delivered automatically according to the predefined delivery properties. From a more global information provider perspective, we can argue that the ’publish/subscribe’ paradigm helps users to efficiently filter the vast amount of available information down to the ’relevant’ pieces. The advantages of the ’publish/subscribe’ paradigm for a database system are also of fundamental importance: queries no longer have to be executed in isolation from each other. Instead, multiple subscription queries may be clustered together and executed simultaneously ([Sell88]), thus decreasing the overall query runtime and increasing the overall capacity of a database system. Obviously, handling thousands of subscriptions within a single system requires specific support from a database system.
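The decoupling described above can be illustrated with a minimal sketch in Python. It is not the PubScribe implementation: the class `SubscriptionManager`, the dictionary-based message format, and the rule "notify only when the query result changes" are assumptions made purely for illustration.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class SubscriptionManager:
    """Toy broker: publishers and subscribers only ever talk to this object."""
    database: list = field(default_factory=list)       # the "global logical database"
    subscriptions: dict = field(default_factory=dict)  # subscriber id -> standing query
    outbox: dict = field(default_factory=dict)         # subscriber id -> notifications sent

    def subscribe(self, subscriber_id: str, query: Callable[[list], object]) -> None:
        # The query is registered once; the subscriber never polls for results.
        self.subscriptions[subscriber_id] = query
        self.outbox[subscriber_id] = []

    def publish(self, message: dict) -> None:
        # Merge the new message, then re-evaluate every standing query.
        self.database.append(message)
        for subscriber_id, query in self.subscriptions.items():
            result = query(self.database)
            sent = self.outbox[subscriber_id]
            if not sent or sent[-1] != result:
                sent.append(result)  # assumption: notify only on a changed result


# Hypothetical usage: a standing query summing sales for one region.
manager = SubscriptionManager()
manager.subscribe(
    "alice",
    lambda db: sum(m["amount"] for m in db if m["region"] == "north"),
)
manager.publish({"region": "north", "amount": 10})
manager.publish({"region": "south", "amount": 99})  # does not change alice's result
manager.publish({"region": "north", "amount": 5})
print(manager.outbox["alice"])
```

The publisher never learns who receives the data, and the subscriber never issues a second request. Re-evaluating every query on every message is the naive baseline that clustering of subscription queries is meant to improve on.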
’Publish/Subscribe’ Sample Scenario 1: News-Service
A very popular and already prospering service implementing a very limited kind of ’publish/subscribe’ system may be seen in news services that regularly send emails containing abstracts of various news articles ([RaDa98], [Mari97]). However, we envision a more intelligent subscription management system that attaches additional information to incoming articles, e.g. adding stock information to company names and weather information to city or region names. Figure 2a shows this scenario with three different information publishers on the one hand but only a single type of subscriber (email recipients) on the other hand. It is worth mentioning here that even a (passive) web server (e.g. the weather service) may play the role of a publisher. This implies that the terms ’publish’ and ’subscribe’ must not be confused with the very modern concepts of ’push’ and ’pull’ (section 2) mainly stressed in the web application domain ([Hack97], [Fran97]).
Fig. 1.1: Comparison of the ’request/response’ and ’publish/subscribe’ paradigms: a) the ’request/response’ paradigm (a consumer sends a request, a producer returns a response); b) the ’publish/subscribe’ paradigm (publishers publish to, and subscribers subscribe to and receive deliveries from, a subscription management system)
’Publish/Subscribe’ Sample Scenario 2:
Decision Support in Data Warehouse Environments
A Data Warehouse System provides an integrated, long-term, and logically centralized database to retrieve information necessary for decision support, supply chain management, customer relationship management etc. Due to the nature of data warehousing, these database
applications exhibit specific characteristics with regard to data volume, update characteristics, and aggregation-oriented organization of data. Information stored in a data warehouse
is usually accessed using predefined reports printed on a regular basis, interactively explored
using standard OLAP tools, or exported into special statistical software packages
([CoCS93]). Figure 2b shows a scenario, where different data sources are publishing to the
subscription management system. The subscription management system performs analyses
on the data and satisfies a huge variety of different subscriber types. In general, a subscription
management system may be seen as an addition to a state-of-the-art data warehouse environment, where decision makers are informed about regularities as well as irregular patterns
within the database.
Fig. 1.2: Examples of ’Publish/Subscribe’ scenarios: a) news service example (publishers: news agency, stock broker, weather service; subscriber: email user); b) subscription service for a data warehouse environment (publishers and subscribers drawn from business units such as marketing, sales, customer service, logistics, production, distribution, finance, accounting, auditing, HR, purchasing, and controlling)
The PubScribe Framework
The proposed framework of a content- and memory-based subscription-management system
extends the classical paradigm of ’publish/subscribe’ in the following directions:
• The role model
Every communication within the PubScribe framework relies on the principles of ’publish/subscribe’. This implies that publisher and subscriber are only roles which may be taken on by any participating component.
• The processing model
PubScribe not only routes incoming messages efficiently to subscribed recipients but also provides a processing model as a means to specify complex processing operators. Thus, incoming messages holding new sales data in a data warehouse scenario may be evaluated in multiple directions to answer complex decision support queries.
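As a rough illustration of evaluating one incoming message in several directions, the following Python sketch keeps running totals along more than one dimension. The class `SalesAggregator` and the message fields `region`, `product`, and `amount` are hypothetical; the actual PubScribe operators are described in section 5 and are not reproduced here.

```python
from collections import defaultdict


class SalesAggregator:
    """Maintains one running total per value of each chosen dimension."""

    def __init__(self, dimensions):
        # dimension name -> {dimension value -> accumulated amount}
        self.totals = {dim: defaultdict(float) for dim in dimensions}

    def on_message(self, message):
        # A single incoming message contributes to every dimension at once.
        for dim, totals in self.totals.items():
            totals[message[dim]] += message["amount"]


# Hypothetical sales messages arriving from a publisher.
aggregator = SalesAggregator(["region", "product"])
aggregator.on_message({"region": "north", "product": "tv", "amount": 10.0})
aggregator.on_message({"region": "south", "product": "tv", "amount": 20.0})
aggregator.on_message({"region": "north", "product": "radio", "amount": 5.0})
print(dict(aggregator.totals["region"]))
print(dict(aggregator.totals["product"]))
```

Subscriptions asking for sales per region and sales per product can then both be answered from state that is updated once per message.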
Structure of the Paper
As we will show in the remainder of this paper, the ’publish/subscribe’ paradigm is an efficient and adequate means to overcome limitations caused by the traditional query paradigm of ’request/response’ and may form the basis of a powerful platform for open and selective information distribution. The next section reveals different parameters which have to be considered in the design of a subscription management system. Section 3 provides a comprehensive overview of existing subscription management systems following a content-based subscription specification. Section 4 details the PubScribe subscription model, and section 5 outlines the PubScribe message processing model and sketches the internal representation on top of a relational database system. The paper concludes with a short summary given in section 6.
2 Information Delivery Criteria
In this section, we identify and explain a number of different information delivery characteristics ([AcFZ97]), which are useful to address general problems in subscription management
systems as well as to classify our own novel approach PubScribe.
Push and Pull++ versus simple Pull
From the point of view of the traditional ’request/response’ paradigm, the consumer or client follows a pull strategy in fetching results from the database server into its application context: the consumer prompts the server to send the desired information. The opposite strategy is to push information from the server to the clients ([Fran97]), i.e. the consumer receives new information whenever the server considers it necessary. Multiple applications already follow that paradigm in the web context, e.g. news articles or stock trade information supplied by various information brokers. However, pushing data to a client is not directly supported by the underlying transport protocols (e.g. TCP, HTTP, ...). This implies that all such push services are implemented in an extended pull style: a piece of software runs permanently in the background at the client side and polls for new information, thus simulating a pushing server to the user. Such a strategy is called smart pull or pull++ ([DeJe97]). Another technique for simulating push is server-initiated pull: the server sends a short notification to the client stating that a new document is ready for delivery, and the client then downloads this document by a pull operation ([Hack97]). Once more, it should be stressed that the concepts of push vs. pull and publish/subscribe vs. request/response are completely orthogonal to each other. Publish/subscribe, for example, can be implemented using push as well as pull techniques for data delivery.
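The difference between smart pull (pull++) and server-initiated pull can be sketched in a few lines of Python. The classes `Server`, `SmartPullClient`, and `NotifiedPullClient` are hypothetical and run in a single process; a real client would poll on a timer and talk to the server over a network.

```python
class Server:
    """Hypothetical document server; payloads leave it only via pull requests."""

    def __init__(self):
        self.documents = []  # published documents, oldest first
        self.listeners = []  # callbacks used only for server-initiated pull

    def publish(self, doc):
        self.documents.append(doc)
        for notify in self.listeners:
            notify(len(self.documents))  # short hint only, no payload

    def fetch_since(self, position):
        # Ordinary pull request: everything published after `position`.
        return self.documents[position:]


class SmartPullClient:
    """Pull++: a background task polls, so the user perceives a push."""

    def __init__(self, server):
        self.server = server
        self.seen = 0        # number of documents already fetched
        self.inbox = []
        self.requests = 0    # pull requests issued so far

    def poll(self):
        self.requests += 1
        new = self.server.fetch_since(self.seen)
        self.seen += len(new)
        self.inbox.extend(new)


class NotifiedPullClient(SmartPullClient):
    """Server-initiated pull: pull only after the server's short notification."""

    def __init__(self, server):
        super().__init__(server)
        server.listeners.append(lambda _count: self.poll())


server = Server()
poller = SmartPullClient(server)
notified = NotifiedPullClient(server)
for tick in range(5):            # five timer ticks, one publication at tick 2
    if tick == 2:
        server.publish("news-1")
    poller.poll()
print(poller.requests, notified.requests)
```

Both clients end up with the same document, but the polling client issues a request on every tick while the notified client issues one request per publication.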
Time Driven versus Data Driven Notifications
The second criterion with regard to information delivery distinguishes the kind of ’event’ which has to happen for a notification to be sent out to the client ([HaNo99]). Notifications are either sent out periodically, after a specified amount of time has passed, or initiated by a data-driven event. A typical example of the first case is an electronic newsletter sent out every day at 6 p.m.: a new letter (or the collection of accumulated single news articles) is simply sent out after another 24 hours have gone by. On the other hand, a subscriber may be interested in being notified by the dissemination system aperiodically, i.e. only when a certain event occurs, e.g. when a stock value the subscriber is interested in passes a certain threshold. Data-driven events are usually connected to insert or update operations in the underlying database and result in aperiodic notifications. They are closely related to the trigger concept of active and relational database systems. In practice the combination of both notification modes is most interesting, e.g. you want to be informed immediately if the IBM stock falls below a certain value and additionally get a weekly summary of how it performed through the week.
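The combined stock example can be sketched as follows in Python. The class `StockWatch`, the integer tick counter standing in for wall-clock time, and the choice of a minimum/maximum summary are assumptions for illustration only.

```python
from dataclasses import dataclass, field


@dataclass
class StockWatch:
    threshold: float  # data-driven: alert when the price drops below this value
    period: int       # time-driven: send a summary every `period` ticks
    history: list = field(default_factory=list)
    notifications: list = field(default_factory=list)

    def on_price(self, tick: int, price: float) -> None:
        was_below = bool(self.history) and self.history[-1] < self.threshold
        self.history.append(price)
        # Data-driven (aperiodic): fires on the update that crosses the threshold.
        if price < self.threshold and not was_below:
            self.notifications.append(("alert", tick, price))
        # Time-driven (periodic): fires regardless of what the data looks like.
        if tick % self.period == 0:
            window = self.history[-self.period:]
            self.notifications.append(("summary", tick, min(window), max(window)))


watch = StockWatch(threshold=100, period=3)
for tick, price in enumerate([105, 98, 97, 103, 99, 101], start=1):
    watch.on_price(tick, price)
for notification in watch.notifications:
    print(notification)
```

The alert mirrors a database trigger on insert or update, while the summary depends only on elapsed time.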
Subject Based versus Content Based Subscriptions
A subject-based subscription implies that all events belong to a certain group (or channel) and are automatically sent to all subscribers without further filtering, i.e. without analyzing the content of the event. Thus in a subject-based system publishers and subscribers are bound to a number of predefined channels representing specific topics ([Elle97], [RaDa98]). Publishers have to assign their products to a certain channel, and subscribers may choose from the set of all existing channels those they want to subscribe to. A good example of subject-based subscriptions is mailing lists, where a user may register for a mailing list at providers like www.onelist.com. All postings to this mailing list are forwarded to the subscriber either right away or as a daily digest. In contrast, content-based subscriptions refer to specific information, or a portion of a channel, inside the individual messages without regard to the subject. Typically, content-based subscriptions come with a predicate identifying the context of the subscription. Therefore, content-based subscriptions free producers and consumers from binding to specific channels. Publishers can publish independently and subscribers can subscribe independently. In particular, the subscriber is not restricted to choosing from a predefined set of channels. Certainly someone has to pay the price for the freedom of publishers and subscribers: managing content-based subscriptions is far more complex than managing subject-based ones, because the system has to ’scan’ all incoming messages/events and check all predicates provided by the subscribers. Thus the subscription management system has to be very powerful.
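The cost difference between the two subscription styles shows up even in a minimal Python sketch. The classes `SubjectBroker` and `ContentBroker` and the message fields are hypothetical; the content-based broker uses the naive strategy of checking every predicate against every message.

```python
from collections import defaultdict


class SubjectBroker:
    """Subject-based: routing is a lookup on the channel name."""

    def __init__(self):
        self.channels = defaultdict(list)  # channel -> subscriber ids

    def subscribe(self, subscriber, channel):
        self.channels[channel].append(subscriber)

    def route(self, channel, message):
        # The message content is never inspected.
        return list(self.channels.get(channel, []))


class ContentBroker:
    """Content-based: every predicate is checked against every message."""

    def __init__(self):
        self.predicates = []  # (subscriber id, predicate) pairs

    def subscribe(self, subscriber, predicate):
        self.predicates.append((subscriber, predicate))

    def route(self, message):
        return [s for s, matches in self.predicates if matches(message)]


subject = SubjectBroker()
subject.subscribe("ann", "stocks")

content = ContentBroker()
content.subscribe("bob", lambda m: m["symbol"] == "IBM" and m["price"] < 100)

cheap = {"symbol": "IBM", "price": 95}
pricey = {"symbol": "IBM", "price": 120}
print(subject.route("stocks", cheap), subject.route("stocks", pricey))
print(content.route(cheap), content.route(pricey))
```

Subject-based routing costs one lookup per message, whereas the naive content-based broker does work proportional to the number of registered predicates for each message.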
Broadcast versus Unicast Data Delivery Mechanisms
The way of data delivery opens up a third dimension to classify proactive data delivery services. Broad-/multicasting (or flooding) is often used to disseminate data from servers to clients. Although this 1-to-N data delivery strategy seems adequate for subject-based subscriptions, it is not feasible for (the more interesting) content-based subscriptions, because the client has to post-process, i.e. filter, the received data according to the content predicates. Depending on the selectivity of the client’s content predicate, flooding has a huge potential for wasting resources. In contrast, using unicast connections to deliver the information from the server to the client is recommended for highly specialized content-based subscriptions but produces even more load on the subscription management server. A reasonable compromise for certain applications may be the use of multicast connections to implement channel-based subscriptions directly on the network level ([HuSu97]).
Full Update versus Incremental Update
For the subscription management system, it is important to know what to do with the data the client has already received in a previous delivery. In case of a thin client like a simple web browser without any application logic and local storage capacity, the server always has to deliver a full update to the client. However, if the client is only interested in the current value of a business figure, e.g. a certain stock value, or is able to combine the values already received with the latest value on its own, the server system should choose an incremental update strategy, i.e. it sends only the delta changes and thus saves network bandwidth and perhaps server memory.
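The two strategies can be contrasted in a few lines. This is only a sketch under simplifying assumptions: the state is a flat mapping from a key to a numeric value, deletions are not handled, and all function names are hypothetical.

```python
from typing import Dict

State = Dict[str, float]


def full_update(server_state: State) -> State:
    """Thin client without local storage: ship the complete current state."""
    return dict(server_state)


def incremental_update(server_state: State, last_sent: State) -> State:
    """Ship only the entries that are new or changed since the last delivery."""
    return {k: v for k, v in server_state.items() if last_sent.get(k) != v}


def apply_delta(client_state: State, delta: State) -> State:
    """Client side: combine the values already received with the delta."""
    merged = dict(client_state)
    merged.update(delta)
    return merged
```

Note that the incremental strategy requires the server to remember what was last sent to each client, which is one reason why the memory savings are only a possibility.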
3 An Overview of Existing Event Notification and Data Dissemination Systems
This section characterizes the event notification and data dissemination services that are most relevant in the context of ’publish/subscribe’ systems. We intentionally focus on content-based systems and omit the vast number of channel-based systems found especially in the context of web applications, like PointCast, Marimba, BackWeb, etc. For a comprehensive overview and comparison, we refer to [Inri97] and [Hack97].
• SIENA (Scalable Internet Event Notification Architecture, [BKS+99]) targets a scalable event service for highly distributed applications, which is realized by a network of distributed event servers organized in a hierarchical or peer-to-peer topology. Notifications are sets of attributes, each consisting of an identifying name, a data type, and a value. Subscriptions are combinations of ’event filters’ representing logical predicates over attributes. A subscription and a notification are said to be compatible if the notification sent out by a server contains all attributes of the subscription and also fulfills its event filters.
• In the Elvin system ([FMK+99]), publishers (producer clients) are special servers that do not use the subscription engine and are never notified. Subscribers (consumer clients) are also special servers, which ignore any subscription requests from the server. The content-based subscriptions consist of boolean expressions specified in a specifically designed subscription language, which even includes full support for regular expressions. Sample applications based on Elvin are Tickertape, a configurable news ticker, and Breeze, an event-driven workflow engine.
• SIFT (Stanford Information Filtering Tool; [YaGa95]) is a large-scale information dissemination service supporting full-text filtering and emphasizing the information retrieval application area ([FoDu92]). Subscriptions are set up by specifying an information profile with different parameters concerning update frequency or amount of information. Two filtering models (vector space model and boolean model) are used to compute a relevance factor for an incoming message. Documents whose relevance factor exceeds the relevance threshold given in the subscription are delivered to the subscriber.
• Gryphon ([SBC+98]) is a scalable message brokering system developed by IBM that allows, similar to PubScribe, content-based subscriptions based on an information filtering graph consisting of information spaces (the nodes, storing event histories or interpretations) and information flows (the edges). Along these edges, filtering, merging (union semantics, no join), transformation, interpretation, and expand operations can be applied. The Gryphon project specializes in optimizing this graph and in distributing it over several broker processes.
• The Continual Query Project (CQ, [LiPT99]) is a publish/subscribe system for querying large distributed and heterogeneous information sources. Within the CQ project a
subscription consists of a standard SQL query, a time/event based trigger condition for
delivery and a termination condition. The CQ Project is able to process subscriptions
over multiple sources as demonstrated with one of CQ’s related projects (Query Router, [LiPH00]).
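To make the notion of compatibility used in the SIENA description above more concrete, the following sketch checks a notification against a subscription. It only follows the informal definition given in this section; the class names, the representation of an event filter as a Python callable, and the sample attributes are assumptions and do not reflect SIENA's actual interface.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass(frozen=True)
class Attribute:
    name: str       # identifying name
    dtype: type     # data type
    value: object   # value


@dataclass(frozen=True)
class EventFilter:
    name: str                       # attribute the filter refers to
    dtype: type                     # expected data type of that attribute
    test: Callable[[object], bool]  # logical predicate over the value


def is_compatible(notification: List[Attribute],
                  subscription: List[EventFilter]) -> bool:
    """True if the notification contains every attribute named by the
    subscription (with matching type) and fulfills every event filter."""
    by_name = {a.name: a for a in notification}
    for f in subscription:
        attr = by_name.get(f.name)
        if attr is None or attr.dtype is not f.dtype:
            return False
        if not f.test(attr.value):
            return False
    return True
```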
4 The PubScribe Subscription Model
This section explains the PubScribe Subscription Model in detail. First of all we sketch the
logical architecture of PubScribe, then introduce different types of supported subscriptions
and finally outline the components of a subscription in the PubScribe context.
The Logical PubScribe Architecture
The logical architecture of the proposed PubScribe service is based on the ANSI/SPARC 3-schema architecture for database systems ([TsKl78]). This architecture proposes a three-layered approach with an internal, conceptual, and external schema. Transferring this approach into the context of ’publish/subscribe’, we identify the internal schema with the representation of all incoming messages from the publishing side, and we view the subscription-oriented part as the external schemata. Thus, as in database systems, a single, evaluation-independent conceptual schema decouples the incoming messages from the outgoing subscription deliveries. Figure 4.1 illustrates the PubScribe framework according to the 3-schema architecture. Note that a subscription (more precisely, the instantiation of a subscription, i.e. a piece of data) may also appear as a message on the publishing side of another PubScribe service.
The internal schema holds all incoming messages as individual entities in the order of their arrival in the system. These published messages are then compiled into a single conceptual schema. This separation yields a first decoupling between incoming messages and the evaluation process of subscriptions. The external schemata represent the set of registered subscriptions in the system. Obviously, a subscription is not just a single database query but is extended with additional properties. Logically, each subscription is derived from the integrated database represented by the conceptual schema. Physically, however, similar subscriptions are clustered and evaluated simultaneously (section 5).
Types of PubScribe Subscriptions
As one extension with regard to existing subscription systems, the PubScribe system supports
three different types of subscriptions based on the data requirement:
• snapshot subscriptions:
A snapshot ([LHM+86]) subscription may be answered by processing only the most
current message of a publisher. Snapshot subscriptions require ’update-in-place’ semantics.
Example: A subscription regarding the current weather condition only refers to the
last available information.
Fig. 4.1: PubScribe architecture according to the ANSI/SPARC 3-schema architecture (external schema: registered subscriptions; conceptual schema: integrated database; internal schema: contributing publications)
• ex-nunc (’from now on’) subscriptions:
Ex-nunc subscriptions are based on a set of messages. This set of messages is constructed starting from an empty set at the time of registering a subscription. Some kind
of data store is necessary in this case.
Example: Computing the value of a three-hour moving average of a stock price starts
with a single value for the first hour, the average of two values for the second hour, and
the average of three values for all other hours.
• ex-tunc (’starting in the past’) subscriptions:
Ex-tunc subscriptions are based on data from the past plus current information.
Example: A subscription to the cumulative sum of the trading information of a specific stock needs an initial overall sum, which can then be maintained using new messages.
To provide ex-tunc subscriptions, the system implements an initial evaluation mechanism.
Such an initial evaluation of a subscription is performed as a set of queries referring to the
corresponding publishers, asking for historic data. If the publisher is capable of providing
this information (e.g. in the context of a data warehouse scenario), this data is used for an
initial evaluation of the ex-tunc subscription. Otherwise, the subscription is rejected.
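The three subscription types differ mainly in the state they need. The sketch below mirrors the three examples given above; the class names are hypothetical, and passing `None` as history is merely our way of modelling a publisher that cannot provide historic data.

```python
from collections import deque
from typing import Deque, Iterable, Optional


class SnapshotSubscription:
    """Only the most current message matters (update-in-place)."""

    def __init__(self) -> None:
        self.current: Optional[object] = None

    def on_message(self, value: object) -> object:
        self.current = value
        return self.current


class ExNuncMovingAverage:
    """Starts from an empty set at registration time and fills a window."""

    def __init__(self, window: int = 3) -> None:
        self.values: Deque[float] = deque(maxlen=window)

    def on_message(self, value: float) -> float:
        self.values.append(value)
        return sum(self.values) / len(self.values)


class ExTuncCumulativeSum:
    """Needs an initial evaluation over historic data from the publisher."""

    def __init__(self, history: Optional[Iterable[float]]) -> None:
        if history is None:
            # Publisher cannot provide historic data: reject the subscription.
            raise ValueError("subscription rejected: no historic data available")
        self.total = sum(history)

    def on_message(self, value: float) -> float:
        self.total += value
        return self.total
```

With a window of three, the moving average yields the first value itself, then the average of two values, and from the third message on the average of the last three, exactly as in the ex-nunc example.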
Components of a Subscription
A PubScribe subscription consists of multiple atoms stating the type of delivery, the address of the recipient, three conditions, and, most importantly, the body (query) of the subscription. The evaluation of this subscription body is controlled by the following conditions:
• opening condition
• closing condition
• delivery condition
The complete condition evaluation semantics works as follows: once the opening condition is satisfied for the first time, the delivery and closing conditions are instantiated and evaluated. If the delivery condition is satisfied, the subscription body is evaluated and the result is delivered to the corresponding subscriber instance. If the closing condition evaluates to true, the subscription is removed from the system. It is worth noting that the PubScribe system provides ’at least once’ semantics with regard to the execution of a subscription in the context of an initial evaluation for ex-tunc subscriptions: the delivery condition is checked before the closing condition is evaluated. Thus, if the delivery condition is satisfied, the subscription body is evaluated and delivered before a satisfied closing condition removes the subscription from the system.
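The evaluation order described above can be summarized in a small state machine. This is a sketch only: conditions and the body are modelled as plain callables over a state dictionary, which is an assumption made for illustration, and the names `Subscription` and `step` are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Callable, List

Condition = Callable[[dict], bool]


@dataclass
class Subscription:
    opening: Condition
    delivery: Condition
    closing: Condition
    body: Callable[[dict], object]
    opened: bool = False
    closed: bool = False
    delivered: List[object] = field(default_factory=list)

    def step(self, state: dict) -> None:
        """Evaluate the subscription once against the current state."""
        if self.closed:
            return
        if not self.opened:
            if not self.opening(state):
                return
            self.opened = True
        # The delivery condition is checked BEFORE the closing condition,
        # so a final result is still delivered when both hold at once.
        if self.delivery(state):
            self.delivered.append(self.body(state))
        if self.closing(state):
            self.closed = True
```

Swapping the last two checks would lose the delivery in the step in which the closing condition becomes true, which is precisely the case the ’at least once’ semantics is meant to cover.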
5 The PubScribe Message Processing Model
This section discusses the message processing model from a conceptual as well as from an
implementational point of view. First of all, we introduce the notion of a ’message scheme’,
followed by a description of the available operators on messages disseminated by registered
publishers. The section closes by outlining the representation of the message processing
model on top of a relational database.
Message Schemes and Message Queues
The messages produced by registered publishers must follow a message scheme announced during the registration process of a publisher at the subscription management system. The scheme of a message M = (H, B) consists of a header H = (H_1, ..., H_n), a (possibly empty) set of header attributes H_i (1 ≤ i ≤ n), and a message body B = (B_1, ..., B_m) with at least one body attribute B_j (1 ≤ j ≤ m). For example, the scheme of messages produced by a stock information service may look like the following:
M = ( {TimeStamp, Stock, StockExchange}, {Price, ChangeAbs, ChangePercent} )
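As an illustration, a message scheme can be represented as a pair of attribute sets with the constraint that the body is non-empty. The class `MessageScheme` and its `conforms` method are hypothetical names introduced for this sketch; attribute types are ignored for brevity.

```python
from dataclasses import dataclass
from typing import Dict, FrozenSet


@dataclass(frozen=True)
class MessageScheme:
    header: FrozenSet[str]  # possibly empty set of header attributes
    body: FrozenSet[str]    # at least one body attribute

    def __post_init__(self) -> None:
        if not self.body:
            raise ValueError("a message scheme needs at least one body attribute")

    def conforms(self, header: Dict[str, object], body: Dict[str, object]) -> bool:
        """Check that a message carries exactly the announced attributes."""
        return set(header) == self.header and set(body) == self.body


STOCK_SCHEME = MessageScheme(
    header=frozenset({"TimeStamp", "Stock", "StockExchange"}),
    body=frozenset({"Price", "ChangeAbs", "ChangePercent"}),
)
```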
A message queue is an ordered list of messages following the same message scheme, i.e. the entries of a queue are produced by the same publisher. The order is given implicitly by the arrival time of the messages in the system. Operators are applied to such message queues. A subscription body is represented as a directed acyclic graph (DAG) whose nodes are message operators; messages flow in the form of queues along the edges of this graph. The opening, closing, and delivery conditions are also expressed in terms of such operator graphs. Figure 5.1 shows the internal representation of a subscription referring to stock information with a corresponding delivery condition.
Fig. 5.1: Operator Tree of a Sample Subscription (subscription body and delivery condition)
[Figure: operator graph over the input queues StockWatch, Timer, TimeIntervals [(), (DeliveryInterval)], and Repository [(), (LastDelivery)]. Subscription body: selection StockID IN (’IBM’, ’Oracle’, ’MSE’), followed by collapse SUM(TradeVol) AS TradeVolSum over (Exchange). Delivery condition: selection StockID = ’MSE’ AND Exchange = ’FSE’; scalar >(Price, Const 70.0) AS DelivDataOK; scalar +(DeliveryInterval, LastDelivery) AS NextDelivery, marked (*); scalar <(NextDelivery, CurrentTime) AS DelivTimeOK; scalar AND(DelivTimeOK, DelivDataOK). Merge nodes combine the inputs of the scalar operators, and a switch node at the root combines subscription body and delivery condition.]
Message Operators
As can be seen in figure 5.1, the PubScribe processing model provides a wide variety of operators to specify complex conditions and subscription queries. The subscription in this figure reports to its subscriber the summed-up trade volumes of the stocks of IBM, Oracle, and MSE per stock exchange whenever the MSE stock price at FSE (Frankfurt) is greater than 70 and it is time for the next delivery. In the following, we go through the set of operators and sketch each of them.
• selection operator
The selection operator is defined only with respect to header attributes. Thus, the resulting queue holds only messages with values in the header attributes satisfying a given predicate.
• merge operator
The merge operator allows the horizontal combination of messages from different
sources. The resulting message queue exhibits all common header attributes with the
same value together with the attributes from both operand queues. The body of the resulting queue consists of the union of the body attributes of the operand messages.
• scalar operator
A scalar operator may be applied to body attributes only and, therefore, does not have any impact on the header attributes. The set of applicable scalar operators depends on the type of the body attributes to which the scalar operator is applied.
Example: Node (*) performs a scalar operation by adding the delivery time interval to
the last delivery time.
• collapse operator
A collapse operation performs a projection operation on the messages of an operand
message queue, so that the scheme of an output queue contains a subset of the original
header attributes. However, the set of body attributes is extended by an additional attribute holding the number of collapsed messages per outgoing message. Furthermore,
explicitly specified collapse functions are applied to the body attributes. A collapse
function may be an aggregate function like SUM(), MIN(), MAX() for numerical body
attributes or CONCAT(<root>) for attributes holding XML-structured data.
Example: The subscription body in figure 5.1 exhibits a collapse operation over the
stock exchange by building the sum over the trading volumes.
• copy operator
A copy operator may be applied to a header attribute. This attribute is then redundantly
stored in the body of the message, so that scalar operations may be applied to it.
• shift operator
The shift operator may be seen as the inverse of the copy operator. A body attribute migrates from the body to the header, thus increasing the size of the header.
Example: Suppose stock information is delivered hourly. To perform a collapse function with regard to a daily time granularity, we first copy the TimeStamp attribute into the body, perform a scalar operation to retrieve the day, move this attribute back into the header using a shift operator, and perform a collapse operation.
• switch operator
The special switch operator returns NULL as long as the control input (right input in figure 5.1), which represents the condition testing, is FALSE. Otherwise, the switch operator returns the messages from the data input (left input in figure 5.1).
Fig. 5.2: PubScribe Message Processing Model (internal schema: message tables; conceptual schema: base tables; external schema: subscription staging tables, generalized subscription tables, notification documents; phases: integrate, propagate, apply, specialize; operations shown: join, split)
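The copy/scalar/shift/collapse sequence from the shift example can be sketched over a simple in-memory queue. Everything here is an assumption made for illustration: a message is a pair of dictionaries (header, body), timestamps are ISO-like strings, the counter attribute is called `Count`, and the function names are hypothetical rather than PubScribe's actual operator interface.

```python
from collections import defaultdict
from typing import Callable, Dict, List, Tuple

Message = Tuple[Dict[str, object], Dict[str, object]]  # (header, body)
Queue = List[Message]


def selection(queue: Queue, predicate: Callable[[dict], bool]) -> Queue:
    """Keep messages whose HEADER satisfies the predicate."""
    return [m for m in queue if predicate(m[0])]


def copy(queue: Queue, attr: str) -> Queue:
    """Store a header attribute redundantly in the body."""
    return [(h, {**b, attr: h[attr]}) for h, b in queue]


def scalar(queue: Queue, target: str, fn: Callable[[dict], object]) -> Queue:
    """Compute a new body attribute from the body; the header is untouched."""
    return [(h, {**b, target: fn(b)}) for h, b in queue]


def shift(queue: Queue, attr: str) -> Queue:
    """Move a body attribute into the header."""
    return [({**h, attr: b[attr]}, {k: v for k, v in b.items() if k != attr})
            for h, b in queue]


def collapse(queue: Queue, keep: List[str],
             aggregates: Dict[str, Callable[[list], object]]) -> Queue:
    """Project the header onto `keep`, aggregate the body, count the inputs."""
    groups: Dict[tuple, List[dict]] = defaultdict(list)
    for h, b in queue:
        groups[tuple(h[k] for k in keep)].append(b)
    result: Queue = []
    for key, bodies in groups.items():
        body = {name: fn([b[name] for b in bodies]) for name, fn in aggregates.items()}
        body["Count"] = len(bodies)  # number of collapsed messages
        result.append((dict(zip(keep, key)), body))
    return result


def daily_volume(queue: Queue, stock: str) -> Queue:
    """Hourly messages in, one message per stock and day out."""
    q = selection(queue, lambda h: h["Stock"] == stock)
    q = copy(q, "TimeStamp")
    q = scalar(q, "Day", lambda b: str(b["TimeStamp"])[:10])
    q = shift(q, "Day")
    return collapse(q, ["Stock", "Day"], {"TradeVol": sum})
```

The detour via copy and shift is needed because scalar operators work on body attributes only, while collapse groups by header attributes.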
Internal Representation of the PubScribe Message Processing Model
The overall goal of the proposed PubScribe approach is to clearly decouple the incoming messages from the resulting notifications as much as possible so that the database system gains
the potential to optimize the processing of the subscriptions by operator clustering ([Sell88])
and materialization ([GuMS93]).
As shown in figure 5.2, the internal representation of the PubScribe message processing model
is oriented along the 3-schema architecture and consists of four different phases:
• integration phase
Incoming messages are stored in ’message staging tables’, where they are kept in their original (received) form. In a first step, data from single messages is integrated into the base tables, which reflect the instantiation of the conceptual schema. A message either contributes to exactly one base table, needs a join partner to generate an entry for a base table, or feeds two or more base tables, i.e. the content of the message is split into multiple entries in the base tables. The latter happens, for example, if the message holds data structures that are denormalized from the conceptual point of view ([NeAM98]).
The transition from the conceptual schema to the external schema is sub-divided into three phases, which are introduced to share as much work as possible, to enable the evaluation of a huge number of subscriptions, and to provide a robust framework for subscription support in database systems. Before sketching these phases, we point out that we need two kinds of temporary tables, which provide temporary storage space for data between the individual phases:
• subscription staging tables
The purpose of staging tables is to keep track of all changes to the base tables that have not yet been completely considered by all dependent subscriptions. It is worth mentioning that a system may have multiple staging tables and that each staging table covers only those parts of subscriptions that exhibit lossless joins. Lossy joins are delayed into the propagate or specialize phase.
• generalized subscription tables
A generalized subscription table serves as a base to compute the notifications for multiple subscriptions (a) referring to the same set of base tables (or at least a portion of it) and (b) exhibiting similar delivery constraints. Each subscription may either be answered directly from a generalized subscription table, or be retrieved from the generalized subscription table with a join to another generalized subscription table or with a backjoin to the original base table. Note that it must be ensured that the state of the base tables is the same as the state at the propagation time of this delta.
It is important to understand that subscription staging tables are organized from a data perspective, whereas the generalized subscription tables are organized according to delivery constraints, thus providing a real extension to former work. We are now able to outline the phases necessary to compute subscriptions:
• propagation phase
Comparable to the incremental maintenance of materialized views ([BlLT86], [GuMS93]), we propose a second phase that propagates the changes from the base tables to a temporary staging area. The resulting data is already aligned with the schema of the outgoing message, i.e. the relational counterparts of the message operators (joins, selections, projections, and aggregation operations) are already applied to the delta information. Note that the propagation takes place immediately after the update of the base table.
• apply phase
The staging table holds accrued delta information from multiple updates. This sequence of delta information is collapsed so that the primary key condition is satisfied
again and the resulting data is applied to one or more generalized subscription tables.
In this phase subscriptions exhibiting non-lossless joins are combined from entries of
multiple staging tables or a backjoin to the base tables.
• publishing phase
The last phase in computing the notification for a subscription consists in evaluating
the underlying queries based on the generalized subscription tables. Since these tables
are already quite similar to the final subscriptions, only a minor specialization step has to be performed to retrieve the final result. This step also includes a transformation from the relational representation into an XML document ([ACM+99]). The resulting notification documents are stored as CLOBs within one large notification table.
Sub-dividing the process of subscription evaluation into multiple independent phases gives the system considerable potential for optimization. This proves particularly beneficial for large-scale subscription services operating on the same conceptual schema.
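The interplay of the propagation, apply, and publishing phases can be illustrated with in-memory stand-ins for the tables involved. This is a strongly simplified sketch, not the relational implementation: the staging table is a list of deltas, the generalized subscription table is a dictionary keyed by exchange, and the function names as well as the XML element and attribute names are hypothetical.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

Key = Tuple[str, ...]
Delta = Tuple[Key, float]


def propagate(staging: List[Delta], update: dict) -> None:
    """Propagation phase: record a base-table change as a delta that is
    already aligned with the outgoing schema (Exchange -> trade volume)."""
    staging.append(((update["Exchange"],), float(update["TradeVol"])))


def apply_deltas(staging: List[Delta], generalized: Dict[Key, float]) -> None:
    """Apply phase: collapse the accrued deltas to one entry per key and
    apply them to the generalized subscription table."""
    collapsed: Dict[Key, float] = defaultdict(float)
    for key, amount in staging:
        collapsed[key] += amount
    for key, amount in collapsed.items():
        generalized[key] = generalized.get(key, 0.0) + amount
    staging.clear()


def publish(generalized: Dict[Key, float], exchange: str) -> str:
    """Publishing phase: specialize for one subscription and emit XML."""
    total = generalized.get((exchange,), 0.0)
    return f'<notification exchange="{exchange}" tradeVolSum="{total}"/>'
```

Because several subscriptions can read from the same generalized table, the propagation and apply work is done once, and only the final specialization is performed per subscription.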
6 Summary
This paper aims to give an idea of the publish/subscribe paradigm in the context of database systems. A subscription is formulated once and registered at a subscription management system. As messages arrive from various information producers (publishers), the queries of the subscriptions are evaluated and their results are shipped back to the user in the form of notifications. In a way, subscriptions may be seen as ’query templates’ or ’standing queries’, which are registered once at a database system and evaluated multiple times against the currently valid database content.
The proposed PubScribe framework is organized according to the ANSI/SPARC 3-schema architecture, yielding a separation of incoming messages (internal schema) and registered subscriptions (external schema). A subscription may assume multiple states during its life cycle, controlled by an opening, a closing, and a delivery condition. The messaging model provides powerful operators to specify complex conditions and subscription bodies. The internal representation of the message processing model also follows the 3-schema architecture and focuses on decoupling the different stages in evaluating subscriptions. In fact, during the four phases of subscription evaluation, messages are integrated, deltas are propagated and applied to generalized subscription tables, and the results are finally converted into XML documents.
In summary, we think that the ’publish/subscribe’ paradigm in general and the proposed PubScribe system in particular yield a sound basis for a robust platform handling complex subscription evaluation on a large scale.
References
AcFZ97 Acharya, S.; Franklin, M.; Zdonik, S.: Balancing Push and Pull for Data Broadcast. In: Proceedings of the International Conference on Management of Data (SIGMOD’97, Tucson (AZ), U.S.A., May 13-15), 1997, pp. 183-194
Birm93 Birman, K.P.: The Process Group Approach to Reliable Distributed Computing. In: Communications of the ACM, 36(1993)12, pp. 36-53
BKS+99 Banavar, G.; Kaplan, M.; Shaw, K.; Strom, R.E.; Sturman, D.C.; Tao, W.: Information Flow Based Event Distribution Middleware. In: Proceedings of the Middleware Workshop at the International Conference on Distributed Computing Systems, 1999
BlLT86 Blakeley, J.; Larson, P.; Tompa, F.: Efficiently Updating Materialized Views. In: Proceedings of the ACM International Conference on Management of Data (SIGMOD’86, Washington, D.C., May 28-30), 1986, pp. 61-71
Chan98 Chan, A.: Transactional Publish/Subscribe: The Proactive Multicast of Database Changes. In: Proceedings of the 27th International Conference on Management of Data (SIGMOD’98, Seattle (WA), U.S.A., June 2-4), 1998, pp. 521-522
CoCS93 Codd, E.; Codd, S.; Salley, C.: Providing OLAP (On-line Analytical Processing) to User-Analysts: An IT Mandate, E. F. Codd & Associates, 1993
DeJe97 DeJesus, E.X.: The Pull of Push, 1997, (http://www.byte.com/art/9708/sec6/art4.htm)
Elle97 Ellermann, C.: Channel Definition Format (CDF), 1997 (http://www.w3.org/TR/NOTE-CDFsubmit.html)
FMK+99 Fitzpatrick, G.; Mansfield, T.; Kaplan, S.; Arnold, D.; Phelps, T.; Segall, B.: Instrumenting and Augmenting the Workaday World with a Generic Notification Service called Elvin. In: ECSCW99
FoDu92 Foltz, P.W.; Dumais, S.T.: Personalized Information Delivery: An Analysis of Information Filtering Methods. In: Communications of the ACM 35(1992)12, pp. 51-60
Fran97 Frank, M.: Pondering Push Technology. In: DBMS Magazine, 1997, (http://www.dbmsmag.com/9703d02.html)
GuMS93 Gupta, A.; Mumick, I.; Subrahmanian, V.: Maintaining Views Incrementally. In: Proceedings of the ACM International Conference on Management of Data (SIGMOD’93, Washington, D.C., May 26-28), 1993, pp. 157-166
Hack97 Hackathorn, R.: Publish or Perish. In: Byte Magazine, September 1997, http://www.byte.com/art9709/sec6/art1.htm
HaNo99 Hanson, E.N.; Noronha, L.X.: Timer-Driven Triggers and Alerters: Semantics and a Challenge. In: SIGMOD Record, 28(1999)4, pp. 11-16
HuSu97 Husum, D.: When Push Becomes IP Multicast. In: WebVantage Magazine, May 1997
Inri97 N.N.: INRIA WebCanal, White Paper, http://webcanal.inria.fr/index.html, 1997
LHM+86 Lindsay, B.; Haas, L.; Mohan, C.; Pirahesh, H.; Wilms, P.: A Snapshot Differential Refresh Algorithm. In: Proceedings of the ACM International Conference on Management of Data (SIGMOD’86, Washington, D.C., May 28-30), 1986, pp. 53-60
LiPH00 Liu, L.; Pu, C.; Han, W.: XWRAP: An XML-enabled Wrapper Construction System for Web Information Sources, Technical Report, 2000
LiPT99 Liu, L.; Pu, C.; Tang, W.: Continual Queries for Internet Scale Event-Driven Information Delivery. In: IEEE Transactions on Knowledge and Data Engineering 11(1999)4, pp. 610-628
Mari97 N.N.: Castanet Proxy. Marimba Corporation, 1997 (http://www.marimba.com/products)
NeAM98 Nestorov, S.; Abiteboul, S.; Motwani, R.: Extracting Schema from Semistructured Data. In: SIGMOD98, 1998, pp. 295-308
Norm98 Norman, D.A.: The Invisible Computer. MIT Press, 1998
OPSS93 Oki, B.; Pfluegl, M.; Siegel, A.; Skeen, D.: The Information Bus - An Architecture for Extensible Distributed Systems. In: Operating Systems Review, 27(1993)5, pp. 58-68
RaDa98 Ramakrishnan, S.; Dayal, V.: The PointCast Network. In: SIGMOD98, 1998, p. 520
SBC+98 Strom, R.; Banavar, G.; Chandra, T.; Kaplan, M.; Miller, K.; Mukherjee, B.; Sturman, D.; Ward, M.: Gryphon: An Information Flow Based Approach to Message Brokering. In: Proceedings of the International Symposium on Software Reliability Engineering, 1998
Sell88 Sellis, T.: Multiple Query Optimization. In: ACM Transactions on Database Systems 13(1988)1, pp. 23-52
TsKl78 Tsichritzis, D. C.; Klug, A.: The ANSI/X3/SPARC DBMS framework report of the study group on database management systems. In: Information Systems 3 (3), 1978, pp. 173-191
YaGa95 Yan, T.W.; Garcia-Molina, H.: SIFT - A Tool for Widearea Information Dissemination. In: Proceedings of the USENIX Technical Conference, (USENIX’95), 1995, pp. 177-186
BUILDING AN INFORMATION MARKETPLACE USING A
CONTENT AND MEMORY BASED PUBLISH/SUBSCRIBE
SYSTEM
Wolfgang Lehner, Wolfgang Hümmer, Michael Redert
EMail: {lehner, huemmer, mlredert}@immd6.uni-erlangen.de
Abstract
The well-known publish/subscribe paradigm offers a reliable and scalable group communication mechanism. This paper proposes the PubScribe framework, which uses a content and memory based publish/subscribe system to provide XML-based access that is open for everybody to participate as a publisher as well as a subscriber of new information. In addition to pure event notification or message brokering systems, the PubScribe framework targets the processing of published messages and the derivation of result sets for registered subscriptions. The processing strategy aims to support ’Business Intelligence’ applications by providing appropriate operators as well as local memory inside the brokerage component. The paper discusses different subscription semantics, the communication model, and the processing model, and finally gives an overview of the prototypical implementation.
1 Introduction
Imagine receiving an email that tells you of a new Italian restaurant around the corner, a vacant and perfectly suited three-bedroom apartment in your favorite residential area, the quarterly sales figures of your company as soon as they are released, or the news that MSE stocks are finally over $70 again. Currently there are several ways to accomplish this task: organize your personal ’web bookmarks’ and pull the required information (the Italian restaurant), subscribe to simple document delivery services to receive document-based emails (the apartment scenario), start Excel or your favorite OLAP tool to compute the quarterly sales figures, or write your own SQL query with trigger conditions based on your own stock database (the $70 MSE scenario). Thus, there is a need for an open marketplace where information may be offered by producers, somehow processed, and selectively delivered to consumers.
This paper outlines the general framework of the open publish/subscribe system ’PubScribe’, which offers XML-based access to specify subscriptions on the one side and to participate as a publisher of information on the other side. To provide such a reliable and scalable group communication mechanism ([Birm93]), the proposed PubScribe service relies on the asynchronous communication model of publish and subscribe, a very well-known concept for implementing asynchronous communication in distributed systems ([Powe96], [OPSS93]).
Figure 1.1 depicts the logical data flow of a classical publish/subscribe system. Publisher components send messages to a brokerage component. The brokerage component dispatches these messages according to registered subscriptions to the appropriate subscribers. Subscriptions are either subject-based, if they refer to a specific data source (channel), or content-based, if the subscription is based on predicates over the incoming messages. Thus, traditional publish/subscribe systems implement a document-based, asynchronous, and anonymous dispatching of messages.
Fig. 1.1: Classical Publish/Subscribe Paradigm (producers/publishers send publications to the subscription management system; consumers/subscribers register subscriptions and receive deliveries)
However, the PubScribe approach does not only address document-based routing of published messages to the appropriate subscribers, but focuses on processing the incoming data and deriving the result of complex queries, the operational core of a subscription. The PubScribe system especially seeks to support ’Business Intelligence Applications’ like ’Online Analytical Processing’ or ’Decision Support’. Evaluations in those application areas are based not only on snapshot data (the current message), but also on historic data stored either locally within the subscription management system or within an underlying database (usually a data warehouse). Therefore the PubScribe framework adapts the original publish/subscribe ideas ([OPSS93]) and extends them in several directions:
• role model:
Within the PubScribe framework, ’publisher’ and ’subscriber’ do not reflect the names
of single system components, but are used as roles, which may be adapted by different
components. Thus, since everybody may act as subscriber or publisher, obviously every interaction between system components follows the publish/subscribe paradigm.
For example, bootstrapping of the PubScribe kernel is implemented as a subscription
with regard to the internal repository, which in turn publishes the necessary meta data.
• processing model:
The main idea of publish/subscribe is to route incoming messages from multiple publishers to a set of appropriate subscribers. The routing process is either based on predefined channels ([RaDa98]) or content-based ([SAB+00]) using predicates on the
header or content of the messages. The PubScribe framework extends this perspective
and emphasizes the processing of messages. To accomplish message processing from
an operational point of view, the PubScribe system provides a rich set of sequence operators like selection, merge, collapse, and windowing for transforming published messages. From a structural point of view, the PubScribe system provides customized storage of former messages to be able to provide subscription services referring not only to
the current state of an entity (current weather condition) but also to access previously
published messages to compute the accurate result of a subscription (moving average
over the last 7 days of the MSE stock closing price).
• considering the spirit of XML:
The PubScribe system takes advantage of the high availability of XML technology from
two perspectives: On the one hand, everybody who is able to generate an XML document satisfying PubScribe's DTD is able to join the system either in the role of a publisher or a subscriber. On the other hand, all system components communicate via SOAP
([BEK+00]), which enables the setup of a highly distributed system across multiple
platforms and firewalls.
Within this paper, we give an overview of the PubScribe framework and architecture. After
sketching the main ideas and key differences of the PubScribe system with regard to other publish/subscribe projects, we first present the proposed subscription semantics. In section 4 we
point out certain key communication patterns. Section 5 focuses on the presentation of the
message processing model. We close the description of the PubScribe system by giving an idea
of the internal architecture and sketching some implementational issues.
2 Related Work
Although a high number and variety of systems are based on the publish/subscribe paradigm,
we identified two categories, which may be used to classify existing systems. To keep the description as brief as possible, we concentrate on CORBA, Siena, and Elvin as representatives
of publish/subscribe systems at the generic and completely application independent service
level and on SIFT, Gryphon, and CQ on a more application-oriented level.
Service Level Systems
The Event Service is one of CORBA’s services (COSS, [OMG00]). Similar to other parts of
the CORBA specification, the event service is defined in a general multi-purpose way. Demanders and suppliers exchange events with each other in a push, pull or mixed mode, i.e. a
pushing supplier can even feed a pulling demander by using an intermediate event channel.
CORBA allows subject based as well as content based subscriptions depending on the event
messages.
SIENA (Scalable Internet Event Notification Architecture, [BKS+99]) targets a scalable
event service for highly distributed applications, which is realized by a network of distributed
event servers organized in a hierarchical or peer-to-peer topology. Notifications are sets of
attributes consisting of an identifying name, a data type and a value. Subscriptions are combinations of ’event filters’ representing logical predicates over attributes. Subscription and
notification are said to be compatible if a notification sent out by a server contains all attributes of a subscription and further fulfills its event filters.
In the Elvin system ([FMK+99]) publishers (producer clients) are special servers not using
the subscription engine and never getting notified. Subscribers (consumer clients) are also
special servers ignoring any subscription requests from the server. The content-based subscriptions consist of boolean expressions specified in a specifically designed subscription
language even comprising full support for regular expressions. Sample applications based on
Elvin are Tickertape, a configurable news ticker, or Breeze, an event driven workflow engine.
Application Level Systems
In contrast to publish/subscribe systems designed for the service level, application level systems are more interesting from our perspective. SIFT (Stanford Information Filtering Tool;
[YaGa95]), for example, is a large scale information dissemination service supporting full
text filtering and emphasizing the information retrieval application area ([FoDu92]). Subscriptions are set up by specifying an information profile with different parameters concerning update frequency or information amount. Two filtering models (vector space model and
boolean model) are used to compute a relevance factor for an incoming message. Documents
with a higher relevance factor than the specific relevance threshold given in the subscription
are delivered to the subscriber.
Gryphon ([SBC+98]) is a scalable message brokering system developed by IBM that allows,
similar to PubScribe, content-based subscriptions based on an information filtering graph consisting of information spaces (the nodes storing event histories or interpretations) and information flows (the edges). Along these edges filtering, merging (union semantics, no join), transformation, interpretation and expand operations can be applied. The Gryphon project specializes
in optimizing this graph and in distributing it over several broker processes.
The Continual Query Project (CQ, [PuLi98]) is a publish/subscribe system for querying large
distributed and heterogeneous information sources. Within the CQ project, a subscription
consists of a standard SQL query, a time/event based trigger for delivery and a closing condition. The CQ Project is able to process subscriptions over multiple sources as demonstrated
with one of CQ's dependent projects ([LiPH00]).
Summary
Gryphon as well as the Continual Query Project were influential for the design of PubScribe.
However, PubScribe exhibits far more capabilities with regard to analytical operations and
operator nodes with local memory. PubScribe is neither just an extended database system
with triggers and the support of continual queries - like CQ - ([TGNO92]), nor a pure message brokering system without opening, closing and delivery conditions like Gryphon. PubScribe is both.
3 Subscription Semantics
Before delving into modelling details and architectural issues let us point out certain important issues in the more general context of subscriptions.
Theory behind Subscriptions
In theory a subscription may be represented as a mathematical function which is not yet saturated, i.e. the result of this function is still under computation or, in other words, the data on
which the computation of the function is based is either not yet complete or changing over
time. The bottom line with regard to subscription systems from a database perspective is that
a user registers a query once and regularly receives a notification of the query result with regard to the current state of the underlying data set. Therefore, the query may be considered
the 'body' of a subscription, which is evaluated whenever a corresponding delivery condition is met. Furthermore, subscriptions are instantiated if corresponding opening conditions are satisfied. Analogously, subscriptions are canceled if given closing conditions evaluate
to true.
Different Types of Subscriptions
From a theoretical point of view one can distinguish between fictitious and real subscriptions. An example of a fictitious subscription is the query regarding "the largest prime
number". Of course, we restrict ourselves to real subscriptions. The set of real subscriptions,
however, may be further classified into feasible and non- or not-yet-feasible subscriptions. A
subscription on "the largest pair of twin primes" is an example of a not-yet-feasible
subscription, because it is (still) unknown whether such numbers exist at all. Therefore, we
restrict ourselves to real and feasible subscriptions. Moreover, we are able to classify these types
of subscriptions in more detail from a data point of view into the following three categories:
• snapshot subscriptions:
A snapshot ([AdLi80]) subscription may be answered by referring only to the currently
valid information, i.e. the answer may be retrieved by processing only the most current
message of a publisher. Snapshot subscriptions require ’update-in-place’ semantics.
Example: A subscription regarding the current weather condition only refers to the
last available information. Old data is no longer of interest.
• ex-nunc (’from now on’) subscriptions:
Ex-nunc subscriptions are based on a set of messages. This set of messages is constructed starting from an empty set at the time of registering a subscription.
Example: Computing the value of a three-hour moving average of a stock price starts
with a single value for the first hour, the average of two values for the second hour, and
the average of three values for all other hours.
• ex-tunc (’starting in the past’) subscriptions:
Ex-tunc subscriptions are based on data from the past plus current information.
Example: A subscription of the cumulative sum of trading information of a specific
stock needs an initial overall sum, which can be maintained using new messages.
The PubScribe system supports (classical) snapshot based, ex-nunc and ex-tunc subscriptions. To provide ex-tunc subscriptions, the system implements an initial evaluation mechanism (Section 4).
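The three categories can be contrasted in a minimal Python sketch. The function names and the representation of messages as plain numbers are illustrative assumptions and not part of the PubScribe system.

```python
from collections import deque


def snapshot(messages):
    """Snapshot: only the most recent message is relevant (update-in-place)."""
    return messages[-1] if messages else None


def ex_nunc_moving_average(messages, window=3):
    """Ex-nunc: the message set starts empty when the subscription is registered."""
    recent = deque(maxlen=window)  # keeps at most `window` values
    results = []
    for value in messages:
        recent.append(value)
        results.append(sum(recent) / len(recent))
    return results


def ex_tunc_cumulative_sum(initial_sum, messages):
    """Ex-tunc: an initial value from the past is maintained with new messages."""
    results = []
    total = initial_sum
    for value in messages:
        total += value
        results.append(total)
    return results
```

The ex-nunc variant reproduces the behaviour described above: the first result is based on one value, the second on two, and all later results on three values.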
Condition Evaluation Semantics
The evaluation of a subscription query (body of a subscription) is controlled by conditions.
The PubScribe system uses the following three conditions to control the execution and delivery of a result of a subscription:
• opening condition:
A subscription becomes active, i.e. the body and the following two conditions are instantiated, as soon as the opening condition is satisfied for the first time.
• closing condition:
A subscription is removed from the system as soon as this condition evaluates to true.
• delivery condition:
If and only if the delivery condition evaluates to true, the body of the subscription gets
updated, i.e. messages which have arrived since the last delivery are 'merged' into the
current state of the subscription.
Figure 3.1 illustrates the complete condition evaluation semantics in form of a state machine ([McDo89]). Once the opening condition is satisfied, the delivery and closing conditions are evaluated. If the delivery condition is satisfied, the subscription body is evaluated and the result is delivered to the corresponding subscriber instance. If the closing condition evaluates to true, the subscription is removed from the system. It is worth noting here that the PubScribe system provides an 'at least once' semantics with regard to the execution of a subscription in the context of an initial evaluation for ex-tunc subscriptions (Section 4, 'Initial Evaluation of a New Subscription'): the delivery condition is checked before the closing condition is evaluated. Thus, if the delivery condition is satisfied, the subscription body is evaluated and delivered before a satisfied closing condition removes the subscription from the system.

Fig. 3.1: Condition Evaluation Semantics (state machine with the transitions 'opening condition satisfied', 'delivery condition satisfied' (evaluate, publish), 'delivery condition NOT satisfied', 'closing condition NOT satisfied', and 'closing condition satisfied')
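The evaluation order described above can be sketched as a small state machine in Python. The class `Subscription`, the state names, and the representation of conditions as callables are assumptions made for illustration; the sketch only mirrors the order 'opening, then delivery, then closing'.

```python
from enum import Enum


class State(Enum):
    WAITING = "waiting"  # opening condition not yet satisfied
    ACTIVE = "active"    # delivery and closing conditions are evaluated
    CLOSED = "closed"    # subscription removed from the system


class Subscription:
    def __init__(self, opening, delivery, closing, body):
        self.opening = opening
        self.delivery = delivery
        self.closing = closing
        self.body = body
        self.state = State.WAITING
        self.delivered = []  # results sent to the subscriber

    def step(self, data):
        """Process one incoming piece of data."""
        if self.state is State.CLOSED:
            return
        if self.state is State.WAITING:
            if not self.opening(data):
                return
            self.state = State.ACTIVE
        # The delivery condition is checked before the closing condition,
        # which yields the 'at least once' behaviour.
        if self.delivery(data):
            self.delivered.append(self.body(data))
        if self.closing(data):
            self.state = State.CLOSED
```

With all three conditions constantly true, as in the broker startup subscription, a single call to `step` delivers one result and then closes the subscription.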
Quality of Service Requirements
A subscription may exhibit quality of service requirements (QoS) specified in addition to
opening, closing, and delivery condition. To satisfy these quality of service requirements, the
broker component has to initiate certain actions on time as well as to make sure that the necessary messages are published with a sufficient frequency.
To achieve this goal a publisher tells the brokerage component the highest publishing frequency it is willing or able to provide. Each incoming subscription has to be checked against
these requirements. A subscription has to be rejected if the required delivery frequency can
not be met by one of the addressed publishers. A second reason which may cause a subscription to get rejected is the case where the subscription is of type ex-tunc but the corresponding
publisher is not able to provide historic information to the PubScribe brokerage component.
Therefore a publisher tells the brokerage component its properties at registration time. Further properties of a publisher are the ability to evaluate conditions and selection operations
based on the source the publisher is responsible for. Those abilities are optional but are supported by the brokerage system to reduce network traffic from the publisher to the brokerage
component.
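The two rejection rules can be sketched as a simple admission check in Python. The class names, the field names, and the unit 'messages per hour' are hypothetical; only the two rules themselves are taken from the text.

```python
from dataclasses import dataclass


@dataclass
class PublisherProperties:
    max_frequency: float    # highest publishing frequency (assumed unit: messages per hour)
    provides_history: bool  # can the publisher deliver historic data?


@dataclass
class SubscriptionRequest:
    required_frequency: float  # required delivery frequency (same assumed unit)
    kind: str                  # "snapshot", "ex-nunc", or "ex-tunc"


def admit(request, publishers):
    """Return (accepted, reason) after checking every addressed publisher."""
    for name, props in publishers.items():
        if request.required_frequency > props.max_frequency:
            return False, f"{name} cannot publish often enough"
        if request.kind == "ex-tunc" and not props.provides_history:
            return False, f"{name} cannot provide historic data"
    return True, "accepted"
```

A request is accepted only if every addressed publisher passes both checks.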
4 The PubScribe Communication Pattern
Basically, three fundamental rules determine the communication pattern within the PubScribe
system:
(1) Every data flow is modeled as a subscription and a series of publications.
(2) Publishers and subscribers are only roles. Different PubScribe components may take on
these roles depending on the current context.
(3) A complete decoupling of subscriber and publisher (or one-way communication) is not
feasible to support complex subscriptions referring to historic data.
We sketch these properties in this section by explaining communication patterns for different
situations during the lifetime of a PubScribe system.
System Startup
Although rule (2) states that every component may play the role of a publisher or subscriber,
we use the notions of publisher and subscriber from a global point of view in the traditional
way of thinking (Figure 4.1): data flows from multiple publishing components to multiple
consumers. The routing of messages according to the existing subscriptions is done by a central broker component acting as a dispatcher and processing unit of messages.
The very first step at system startup is to initialize the central broker component. Following
rule (1), technical meta data necessary to boot the brokerage component is published by a
repository. Thus, the first action in a broker’s life is to formulate a subscription referring to
the publishing component of an internal repository. The existence of an internal repository
publisher and the schema of repository messages are static and hard-coded inside the broker.
Figure 4.1 shows the simple communication pattern for the system startup procedure. The
broker submits a subscription at the repository. The opening, closing, and delivery conditions are set to 'true', i.e. they refer to a logical channel disseminating constant 'true' values. The body of the subscription refers to data from the repository data source,
which, of course, is represented by the recipient of the subscription. The repository then publishes a message containing the data necessary for bootstrapping into the 'Repository' channel of the broker.
Fig. 4.1: Broker Startup Communication Pattern (the broker, in the role of a subscriber, sends subscribe with opening[True], closing[True], body[Repository], delivery[True] to the repository, which, in the role of a publisher, answers with publish msg[Repository])

Fig. 4.2: Publisher Registration Communication Pattern (subscriber User1 sends subscribe with opening[TRUE], closing[TRUE], body[ChannelDirectory], delivery[TRUE] to the broker; publisher Stockwatch sends publish msg[ChannelDirectory] to the broker; the broker sends publish msg[User1] to User1)
Part of the initialization tasks of the broker is to start up the following system publishers:
• channel directory:
The broker publishes all known publishing components together with their message
schemes. Thus, a (real) subscriber may subscribe to this channel directory to retrieve a
list of currently known publishers and to receive notifications of new publishers.
• timer:
The timer sends timestamps of the current time in a requested resolution.
• subscription repository:
In analogy to the system repository needed for startup, the subscription repository is
used to receive meta data (like last delivery time) from each subscription.
It is worth mentioning that, in contrast to other publish/subscribe systems, publishers do not
immediately start publishing but wait for incoming subscriptions (deferred publishing). Therefore, the timer component, for example, does not disseminate time signals after
startup. Only after receiving a subscription with certain quality of service conditions does the timer start sending signals with the required resolution. This mechanism, introduced on the service level in [SAB+00] as 'quenching', is important because it prevents publishers from
disseminating messages which are not used by any consumer, as would happen under broadcast semantics.
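Deferred publishing for the timer can be sketched as follows in Python. The class `DeferredTimer`, the resolution in seconds, and the externally driven `tick` method are illustrative assumptions; a real timer publisher would be driven by a clock.

```python
class DeferredTimer:
    """Timer publisher that stays silent until at least one subscription exists."""

    def __init__(self):
        self.resolutions = []  # requested resolutions in seconds

    def subscribe(self, resolution_seconds):
        self.resolutions.append(resolution_seconds)

    def tick(self, now_seconds):
        """Return a timestamp message, or None if nothing has to be published."""
        if not self.resolutions:
            return None  # deferred publishing: nobody is listening
        finest = min(self.resolutions)  # serve the most demanding subscriber
        if now_seconds % finest != 0:
            return None
        return now_seconds
```

Serving the finest requested resolution corresponds to publishing with the highest frequency that any subscription requires.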
Registering a new Publisher and Querying the Channel Directory
A new publishing component is able to join the system by publishing a message to the directory channel of a PubScribe broker. The schema of an announcement message is predefined,
static and consists of the new publisher’s address, the schema of the proposed messages and
a statement of the provided quality of service (see ’Registering a new Subscription’). Every
user of the PubScribe system with a subscription referring to the directory channel receives a
notification of the arrival of a new publisher and, therefore, gains knowledge of new information sources which may be objects of future subscriptions. Figure 4.2 illustrates a sample
communication pattern. A subscriber holds a subscription on the directory channel, where
the opening condition depends on an entry of the subscription repository (e.g. at least two
known channels). As soon as a message of a new publisher arrives at the broker, the message
is processed and the result is published into the result channel of the subscriber.
Registering a new Subscription
For registering a new subscription a user or an application program formulates a subscription
consisting of an opening condition, a closing condition, a delivery condition, and a body and
approaches the broker for registering. In a first step, the broker checks whether the subscription refers to known publishers and whether the quality of service conditions can be met. In
a second step the broker either submits a new subscription to the publishing component or,
if there is already a user subscription based on this publishing component, modifies the existing subscriptions, so that the quality of service conditions of both subscriptions can be met.
Figure 4.3 shows the communication pattern for the case when a new user subscription triggers new subscriptions from the broker at some publishers. After receiving a subscription requiring access to the stockwatch, timer and repository publishers, the broker submits three
atomic subscriptions to the corresponding publishers. While timer and repository are internal
publishers, the StockWatch publisher wraps a web site providing snapshots of stock prices,
trading volumes, etc.
It is worth mentioning here that the bodies of the subscriptions are completely different.
While the user subscription is a complex network of operators transforming and processing
the incoming messages, the bodies of broker-publisher subscriptions may be of very simple
nature.
The communication pattern slightly changes in the second case where a broker already holds
a subscription on the necessary publishing components. In this situation, an existing subscription is altered to meet the new quality of service requirements; no additional subscriptions are installed in the system. In detail:
• publishing frequency:
The frequency of publishing messages is set to the maximum of all quality of service
requirements.
• content restriction:
If the data source is capable of providing the application of selection predicates, the
predicate of the new subscription is conjunctioned with the existing selection predicate
of the publisher.
• opening and closing conditions:
If the publisher provides the service to evaluate conditions, then the conditions are modified to publish a first message as soon as the earliest of all opening conditions is satisfied. The closing conditions are modified the opposite way.

Fig. 4.3: New Subscription Communication Pattern (subscriber User1 subscribes at the broker with opening[Stockwatch], closing[Timer], body[Stockwatch], delivery[Stockwatch, Timer, Repository]; the broker in turn subscribes at the publishers Stockwatch (opening[TRUE], closing[TRUE], body[Stockwatch], delivery[Stockwatch]), Timer (body[Timer], delivery[Timer]), and Repository (body[Repository], delivery[Repository]))
Initial Evaluation of a New Subscription
The memory-based character of the PubScribe system requires that data items
which were valid some time ago, but are no longer present in the broker's memory,
can be retrieved directly from the publisher. The motivating example, asking for the 7 day moving
average of the IBM, MSE, and Oracle stock closing prices, needs access to price information
of the last 7 days. To accomplish this task we introduce a further publishing QoS property
telling the broker about the capability of the publisher to deliver historic data. If a subscription requires historic data, but refers to a snapshot publisher, the subscription is either rejected or cumulatively evaluated based only on new messages by storing the results in the broker’s local memory.
The publication of historic messages is provided by an initial subscription: the communication pattern of Figure 4.3 is extended by an initial subscription and publishing step before the
regular or permanent subscription from the broker to the publisher (in this case 'StockPriceDB') is initialized (Figure 4.4).
Fig. 4.4: Communication Pattern for a new Subscription with Initial Evaluation (subscriber User1 subscribes at the broker with opening[StockPriceDB], closing[Timer], body[StockPriceDB], delivery[StockPriceDB, Timer, Repository]; in the initial evaluation step the broker subscribes at the publisher StockPriceDB with body[StockPriceDB], delivery[TRUE] and receives publish msg[StockPriceDB] before the permanent subscription with body[StockPriceDB], delivery[TRUE] is installed)

5 The Message Processing Model
This section discusses the message processing model by sketching the data flow from an incoming message to the reflection of this message within the affected subscriptions. Basically
each subscription is internally represented as a DAG with edges describing the message flow
and the nodes reflecting various operators transforming published messages. It is worth
mentioning that the model allows the tagging of operators as 'immediate' or 'lazy', denoting the
execution time of the operators in case of new messages. Usually operators participating in the
evaluation of conditions are tagged 'immediate', whereas operators for the evaluation of a subscription body are tagged as 'lazy' operations to be performed on demand.
Message Schemes, Messages and Message Queues
One part of the registering data of a publisher consists of the scheme of the publishing messages. The scheme of a message M = (H, B) consists of a header H = (H1,...,Hn), a (possibly
empty) set of header attributes Hi (1≤i≤n) and a message body B = (B1,...,Bm) with at least a
single body attribute Bj (1≤j≤m). The type of a header attribute is an element of {number,
string, datetime, boolean}. A body attribute may additionally be of type XML reflecting a not
further interpreted XML data string.
To refer to our ongoing example, a publisher disseminating stock information may exhibit
the following message scheme which is specified within an XML document at the time of
registering the publisher at a PubScribe broker:
<?xml version="1.0" encoding="UTF-8"?>
<PublisherRegistration>
  ...
  <Scheme>
    <Header name="Stock" type="integer"/>
    <Header name="StockExchange" type="string" length="32"/>
    <Body name="TimeStamp" type="datetime"/>
    <Body name="Price" type="float"/>
    <Body name="ChangeAbs" type="float"/>
    <Body name="ChangePercent" type="float"/>
    <Body name="BidAsk" type="string"/>
  </Scheme>
  <QualityOfService notificationDelay="5"/>
  <Channel value="stockwatch"/>
</PublisherRegistration>
As already outlined in the preceding section, every piece of data flow is modeled as a published message. Thus, the system conceptually provides dedicated publishers for constant
values, for the current time or meta data from a repository publisher.
An instance of a message schema is a single message representing a single real world event
including accompanying data (the message body). Messages of a single publisher are stored
within a message queue, where the single messages are ordered according to the timestamp
of entering the brokering system. The scheme of a message queue corresponds to the scheme
of the stored messages.
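The definitions of message schemes and message queues can be sketched in Python as follows. The class names, the dictionary representation of attributes, and the explicit arrival timestamp argument are assumptions made for illustration only.

```python
from dataclasses import dataclass, field

HEADER_TYPES = {"number", "string", "datetime", "boolean"}
BODY_TYPES = HEADER_TYPES | {"XML"}  # body attributes may also hold XML


@dataclass
class MessageScheme:
    header: dict  # attribute name -> type, may be empty
    body: dict    # attribute name -> type, at least one entry

    def __post_init__(self):
        if not self.body:
            raise ValueError("a scheme needs at least one body attribute")
        if not set(self.header.values()) <= HEADER_TYPES:
            raise ValueError("invalid header attribute type")
        if not set(self.body.values()) <= BODY_TYPES:
            raise ValueError("invalid body attribute type")


@dataclass
class MessageQueue:
    scheme: MessageScheme
    messages: list = field(default_factory=list)

    def enqueue(self, arrival, header, body):
        """Store a message and keep the queue ordered by arrival timestamp."""
        if set(header) != set(self.scheme.header) or set(body) != set(self.scheme.body):
            raise ValueError("message does not match the queue scheme")
        self.messages.append((arrival, header, body))
        self.messages.sort(key=lambda message: message[0])
```

A queue for the internal timer would use an empty header and a single body attribute, for example `MessageScheme({}, {"CurrentTime": "datetime"})`.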
The following queue q1 shows a sample queue of the incoming messages from the publisher
mentioned above. The queue q2 reflects the logical queue holding a single message denoting
the current time. It is worth to mention here that this queue exhibits an empty header. The
information, i.e. the timestamp of the current time, is part of the message body.
q1 = [ ( { MSE, FSE } , { 2000-07-18-11.28, 74.22, +4, +1.2% } )
       ( { MSE, NYSE } , { 2000-07-18-11.28, 73.12, -1.1, -1.5% } )
       ... ]
q2 = [ ( { }, { 2000-07-18-11.28 } ) ]
Subscriptions
According to the XML-based nature of the PubScribe system, a single subscription is specified
by an XML document following a given DTD. As explained earlier, a subscription basically
consists of three conditions, a subscription body, a specification of the delivery method and
a specification of the expected quality of service.
The subscription body and every condition is modeled as an operator tree over possibly multiple source message queues ([Forg82]). To reflect the characteristics of a condition the system requires a ’boolean’ message scheme (MC = ({},{B}) with type(B) = boolean) at the root of
each condition operator subtree. The message scheme of the root of a subscription body determines the message scheme of the delivered message.
To illustrate the different operators, consider a subscription asking for the 3-day moving average over a stock price and for the cumulative sum of the trading volumes for IBM, Oracle,
and MSE stocks per stock exchange. Delivery should depend on the current price of the MSE
stock. Furthermore, there should be only one delivery per day.
Figure 5.1 shows the corresponding operator tree of this subscription. The leaf nodes represent either messages from external publishers or internal data in message format (e.g. constant values). The inner nodes of the tree are operators on message queues which are formally
introduced in the remainder of this section.
Switch Nodes
The root of the tree is a special switch node storing the messages produced by the child node
of the right subtree in a local memory. If and only if the left child node sends a 'true' boolean
message, the switch node either requests an evaluation of the subscription body or, if the local
memory reflects the currently valid state of the subscription body, flushes the memory by
sending the messages to the delivery component (Section 6).
Selection Operator
The selection operation on messages of a queue is only defined on header attributes, i.e. the
resulting queue holds only messages with values in the header attributes satisfying a given
predicate.
Example: The following expression restricts the set of interesting messages to those from
MSE and Frankfurt Stock Exchange (FSE).
q1 = [ ( { MSE, FSE } , { 2000-07-18-11.28, 74.22, +4, +1.2% } )
( { MSE, NYSE } , { 2000-07-18-11.28, 73.12, -1.1, -1.5% } )
... ]
q2 = selection(q1, StockID=’MSE’ AND Exchange = ’FSE’ )
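A minimal Python sketch of the selection operator follows. Representing a message as a pair of dictionaries (header, body) and a predicate as a callable on the header are assumptions for illustration; the attribute names follow the expression above.

```python
def selection(queue, predicate):
    """Keep the messages whose header satisfies the predicate; bodies are untouched."""
    return [(header, body) for header, body in queue if predicate(header)]


q1 = [
    ({"StockID": "MSE", "Exchange": "FSE"},
     {"TimeStamp": "2000-07-18-11.28", "Price": 74.22}),
    ({"StockID": "MSE", "Exchange": "NYSE"},
     {"TimeStamp": "2000-07-18-11.28", "Price": 73.12}),
]

q2 = selection(q1, lambda h: h["StockID"] == "MSE" and h["Exchange"] == "FSE")
```

Because the predicate only receives the header, the restriction to header attributes is enforced by construction.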
Merge Operator
The merge operator allows the horizontal combination of messages from different sources.
In a relational context this operation may be visualized as a natural full outer join over the
header attributes of the participating message queues. Therefore the resulting queue exhibits
all common header attributes with the same value together with the attributes from both operand queues. The body of the resulting queue consists of the union of the body attributes of
the operand messages. To clarify the merge operation, let us look at two examples:
Example 1: The subscription body requires a moving average over the stock price as well as
a cumulative sum over the trading volumes. Both figures are computed separately using the
’windowing operator’ (see below). To join the two data streams again, a merge operation defined over the header attributes is used.
Example 2: The message holding the timestamp of the last delivery of the subscription (from
the internal 'repository' publisher) and the message holding the delivery interval are merged. Since neither of them exhibits a header attribute, the set of
header attributes of the output queue is empty, too. The set of body attributes, however,
encompasses both the timestamp of the last delivery and the delivery interval.
q1 = [ ( { }, { 2000-07-18-11.28 } ) ]
q2 = [ ( { }, { 1440 } ) ]              // 24 hours = 1440 minutes
q = merge(q1, q2)
  = [ ( { }, { 2000-07-18-11.28, 1440 } ) ]
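The join semantics of the merge operator can be sketched in Python. Messages are modelled as (header, body) pairs of dictionaries, all messages of a queue are assumed to share one scheme, and unmatched messages are kept without padding the missing body attributes; these are simplifying assumptions, and the attribute names `LastDelivery` and `DeliveryInterval` are borrowed from the scalar operator example.

```python
def merge(left, right):
    """Natural full outer join of two queues over their common header attributes."""
    common = set(left[0][0]) & set(right[0][0]) if left and right else set()

    def key(header):
        return tuple(sorted((name, header[name]) for name in common))

    result = []
    matched_right = set()
    for l_header, l_body in left:
        partners = [i for i, (r_header, _) in enumerate(right)
                    if key(r_header) == key(l_header)]
        if not partners:
            result.append((dict(l_header), dict(l_body)))  # unmatched left message
        for i in partners:
            matched_right.add(i)
            r_header, r_body = right[i]
            result.append(({**l_header, **r_header}, {**l_body, **r_body}))
    for i, (r_header, r_body) in enumerate(right):
        if i not in matched_right:
            result.append((dict(r_header), dict(r_body)))  # unmatched right message
    return result


q1 = [({}, {"LastDelivery": "2000-07-18-11.28"})]
q2 = [({}, {"DeliveryInterval": 1440})]
q = merge(q1, q2)
```

With empty headers there is no common attribute to compare, so the two single messages are combined into one message whose body is the union of both bodies.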
Fig. 5.1: Operator Tree of a Sample Subscription (subscription body and delivery condition). The root is a switch node. The delivery condition subtree evaluates AND(DelivTimeOK, DelivDataOK), where DelivTimeOK is >(NextDelivery, CurrentTime) with NextDelivery = +(DeliveryInterval, LastDelivery) computed from the Timer, Repository [(), (LastDelivery)] and TimeIntervals [(), (DeliveryInterval)] queues, and DelivDataOK is >(Price, Const) with the constant 70.0 applied to the selection StockID='MSE' AND Exchange='FSE'. The subscription body subtree applies the selection StockID IN ('IBM', 'Oracle', 'MSE') to StockWatch and merges the results of two window operators: SUM(Index) AS ThreeDaySum over (<CURRENT>-2, <CURRENT>), followed by /(ThreeDaySum, <COUNT>) AS ThreeDayAVG, and SUM(Index) AS TradingCumSum over (<FIRST>, <CURRENT>), both ordered by (TradingDay) and partitioned by (StockExchange).
Scalar Operator
A scalar operator may be applied to body attributes only and, therefore, does not have any
impact on the header attributes. The set of applicable scalar operators depends on the type of
the body attributes to which the scalar operator is applied. The following sample queues
and operators demonstrate the evaluation of the delivery condition:
q = [ ( { }, { 2000-07-18-11.28, 1440 } ) ]        // last delivery and delivery interval
q1 = scalar(q, +(LastDelivery, DeliveryInterval) as NextDelivery)
   = [ ( { }, { 2000-07-19-11.28 } ) ]
q2 = [ ( { }, { 2000-07-18-17.22 } )
       ... ]                                        // current time
q3 = merge(q1, q2)
   = [ ( { }, { 2000-07-19-11.28 , 2000-07-18-17.22 } ) ]
q4 = scalar(q3, <(NextDelivery, CurrentTime) as ReadyForDelivery)
   = [ ( { }, { FALSE } ) ]                         // the next delivery is not yet due
In a first scalar operation, the delivery interval is added to the timestamp of the last delivery.
After attaching the current time to the resulting queue, the second scalar operation compares
the current and next delivery time resulting in a queue with a boolean message scheme of the
single body attribute indicating if a given message is to be delivered or not.
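The two scalar steps can be traced with a short Python sketch. The helper names `scalar` and `ready_for_delivery` as well as the keyword-argument style are illustrative assumptions; the timestamp layout follows the listings, and attaching the current time stands in for the merge with the timer queue.

```python
from datetime import datetime, timedelta

FMT = "%Y-%m-%d-%H.%M"  # timestamp layout used in the listings


def scalar(queue, **outputs):
    """Compute new body attributes from the old body; headers pass through unchanged."""
    return [(header, {name: fn(body) for name, fn in outputs.items()})
            for header, body in queue]


def ready_for_delivery(last_delivery, interval_minutes, current_time):
    q = [({}, {"LastDelivery": datetime.strptime(last_delivery, FMT),
               "DeliveryInterval": timedelta(minutes=interval_minutes)})]
    # First scalar operation: add the delivery interval to the last delivery.
    q1 = scalar(q, NextDelivery=lambda b: b["LastDelivery"] + b["DeliveryInterval"])
    # Attach the current time (stands in for the merge with the timer queue).
    q3 = [(h, {**b, "CurrentTime": datetime.strptime(current_time, FMT)})
          for h, b in q1]
    # Second scalar operation: compare next delivery time and current time.
    q4 = scalar(q3, ReadyForDelivery=lambda b: b["NextDelivery"] < b["CurrentTime"])
    return q4[0][1]["ReadyForDelivery"]
```

The final queue has the boolean message scheme required at the root of a condition subtree: an empty header and a single boolean body attribute.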
Collapse Operator
In a certain sense, a collapse operation performs a projection operation on the messages of
an operand message queue. Therefore the scheme of an output queue after applying a collapse operation does only contain a subset of the original header attributes. However, the set
of body attributes is extended by an additional attribute holding the number of collapsed messages. Furthermore, collapse functions are applied to the body attributes. A collapse function
may be an aggregate function like SUM(), MIN(), MAX() for numerical body attributes or
CONCAT(<root>) for attributes holding uninterpreted XML data.
Example: Although the semantics of aggregate functions is obvious, we give an example to
compute an average value using the collapse operation and propagating XML documents
through an additional scalar function applied afterwards.
q1 = [ ...
       ( {2000-07-18, NYSE},
         {7342, "<opinion>Market will break together.</opinion>"} )
       ( {2000-07-18, FSE },
         {7543, "<opinion>Europe is doing pretty well.</opinion>"} )
       ... ]
q2 = collapse(q1, (TradingDay),
              (SUM(Index) as SUM_IDX,
               CONCAT(<trade opinions>, Comment) as Comments) )
   = [ ...
       ({2000-07-18}, {2, 14885, "<trade opinions>
           <opinion>Market will break together.</opinion>
           <opinion>Europe is doing pretty well.</opinion>
         </trade opinions>"} )
       ... ]
q3 = scalar(q2, dividedby(SUM_IDX, <COUNT>) as AVG_IDX,
            identity(SUM_IDX) as SUM_IDX,
            identity(Comments) as Comments )
The collapse operation computes the sum of the trading numbers and generates a new XML
document with a new document root element given in the CONCAT aggregation operator.
Since a collapse operator does not exhibit a direct average computation, it implicitly produces an additional column <COUNT> holding the number of collapsed messages. The count column is then used to compute the required average value using the scalar function 'dividedby'.
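The collapse example can be reproduced with a minimal Python sketch. The function signatures, the dictionary-based message representation, and the root element name `opinions` (chosen because XML element names cannot contain spaces) are assumptions for illustration.

```python
def collapse(queue, keep, **aggregates):
    """Group messages by the kept header attributes and aggregate body attributes.

    Each aggregate is given as name=(source_attribute, function).
    A COUNT attribute with the number of collapsed messages is always added.
    """
    groups = {}
    for header, body in queue:
        key = tuple(header[name] for name in keep)
        groups.setdefault(key, []).append(body)
    result = []
    for key, bodies in groups.items():
        new_body = {"COUNT": len(bodies)}
        for name, (source, fn) in aggregates.items():
            new_body[name] = fn([b[source] for b in bodies])
        result.append((dict(zip(keep, key)), new_body))
    return result


def concat(root):
    """Collapse function wrapping XML fragments in a new root element."""
    return lambda docs: f"<{root}>" + "".join(docs) + f"</{root}>"


q1 = [
    ({"TradingDay": "2000-07-18", "StockExchange": "NYSE"},
     {"Index": 7342, "Comment": "<opinion>Market will break together.</opinion>"}),
    ({"TradingDay": "2000-07-18", "StockExchange": "FSE"},
     {"Index": 7543, "Comment": "<opinion>Europe is doing pretty well.</opinion>"}),
]
q2 = collapse(q1, ["TradingDay"],
              SUM_IDX=("Index", sum),
              Comments=("Comment", concat("opinions")))
# The average is derived afterwards from the sum and the implicit count.
avg_idx = q2[0][1]["SUM_IDX"] / q2[0][1]["COUNT"]
```

Dropping `StockExchange` from the header is what makes the two messages fall into the same group.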
Windowing Operator
The windowing operator is a well known sequence-based mechanism to define partitions
based on a relative addressing scheme with regard to the neighboring elements in a sequence
([SeLR94], [IBM00]). In contrast to the collapse operator, a windowing operator does not
affect the cardinality of the queue, i.e. the number of messages remains constant.
Example: The following sequence of operators computes a 3-day moving average over the stock index for every stock exchange. Compared to a collapse operator, the windowing operator takes additional parameters that specify the ordering of the single messages (the default criterion is the arrival time of the messages), define the partitions for which the windowing operation takes place (here: every stock exchange), and state the relative addressing scheme (the current message, the last message, and the message before the last one). A scalar operation is then used to compute the average value. The second windowing operator computes the cumulative sum over the trading volumes, again for each stock exchange. To compare the values, the two merge operators combine the original values with the computed values from the windowing queues.
q1 = [ ...
( {2000-07-18, NYSE }, {10342, 533 } )
( {2000-07-19, NYSE }, {10543, 234} )
( {2000-07-20, NYSE }, {10428, 422} )
( {2000-07-21, NYSE }, {10812, 766} )
( {2000-07-18, FSE }, { 7143, 254} )
( {2000-07-19, FSE }, { 7937, 723} )
( {2000-07-20, FSE }, { 7402, 453} )
( {2000-07-21, FSE }, { 7422, 711} )
... ]
q2 = windowing(q1, (SUM(Index) as ThreeDaySum_IDX), // the aggregate operation
(TradingDay), // ordering
(StockExchange), // partitioning attributes
(<CURRENT> -2, <CURRENT>) ) // lower and upper limits
q3 = scalar(q2, dividedby(ThreeDaySum_IDX, <COUNT>) as ThreeDayAvg )
q4 = windowing(q1, (SUM(TradingVolumes) as TradingCumSum), // the aggregate operation
(TradingDay), // ordering
(StockExchange), // partitioning attributes
(<FIRST>, <CURRENT>) ) // lower and upper limits
q5 = merge(q3, q4)
q6 = merge(q1, q5)
q6 = [ ...
( {2000-07-18, NYSE }, {10342, 533, 10342, 533} )
( {2000-07-19, NYSE }, {10543, 234, 10442, 767} )
( {2000-07-20, NYSE }, {10428, 422, 10438, 1189} )
( {2000-07-21, NYSE }, {10812, 766, 10594, 1955} )
( {2000-07-18, FSE }, { 7143, 254, 7143, 254} )
( {2000-07-19, FSE }, { 7937, 723, 7540, 977} )
( {2000-07-20, FSE }, { 7402, 453, 7494, 1430} )
( {2000-07-21, FSE }, { 7422, 711, 7587, 2141} )
... ]
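The two windowing steps can be reproduced with a small Python sketch. It assumes that messages are plain dictionaries and that a window is given by a lower offset relative to the current message (None standing in for <FIRST>); the function name and signature are illustrative only. The source does not state how the averages are rounded; Python's round(), which rounds halves to the nearest even integer, happens to yield the values of the merged result shown above.

```python
def windowing(queue, value, order_by, partition_by, lower):
    """Sum `value` over a sliding window inside each partition.

    lower: offset of the window start relative to the current message
           (e.g. -2), or None for "from the first message".
    The number of messages is unchanged; each gets SUM and COUNT added.
    """
    partitions = {}
    for msg in queue:
        partitions.setdefault(msg[partition_by], []).append(msg)
    out = []
    for msgs in partitions.values():
        msgs = sorted(msgs, key=lambda m: m[order_by])
        for i, msg in enumerate(msgs):
            start = 0 if lower is None else max(0, i + lower)
            window = [m[value] for m in msgs[start:i + 1]]
            out.append({**msg, "SUM": sum(window), "COUNT": len(window)})
    return out

rows = [
    ("2000-07-18", "NYSE", 10342, 533), ("2000-07-19", "NYSE", 10543, 234),
    ("2000-07-20", "NYSE", 10428, 422), ("2000-07-21", "NYSE", 10812, 766),
    ("2000-07-18", "FSE", 7143, 254), ("2000-07-19", "FSE", 7937, 723),
    ("2000-07-20", "FSE", 7402, 453), ("2000-07-21", "FSE", 7422, 711),
]
q1 = [dict(zip(("TradingDay", "StockExchange", "Index", "TradingVolumes"), r))
      for r in rows]

three_day = windowing(q1, "Index", "TradingDay", "StockExchange", -2)
three_day_avg = [round(m["SUM"] / m["COUNT"]) for m in three_day]
cum_sum = [m["SUM"] for m in
           windowing(q1, "TradingVolumes", "TradingDay", "StockExchange", None)]

print(three_day_avg)  # [10342, 10442, 10438, 10594, 7143, 7540, 7494, 7587]
print(cum_sum)        # [533, 767, 1189, 1955, 254, 977, 1430, 2141]
```

At the start of a partition the window simply contains fewer messages, which is why the count column rather than the constant 3 is used as the divisor.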
6 Implementational Perspectives
This section sketches the prototypical implementation of the proposed PubScribe system, following the overview of the architecture given in figure 6.1. Each broker may be divided into handlers for the incoming and outgoing side, the broker's kernel, and the publishing component. Each part is outlined below.
Publishers
Publishers, i.e. simple brokers acting only as publishers, have to provide messages according to a committed message scheme and specified quality of service requirements. Since publishers only cover real data sources, a publisher may be, for example, a web site embedded into a wrapper that is responsible for regularly polling the site, converting the results into the export schema, and pushing the new information to higher processing layers. We intentionally do not emphasize the extraction or wrapping process, but refer to well-known principles and techniques ([LiPH00], [RoSc97]). Another kind of publisher is a database system or data warehouse system pushing incoming tuples by utilizing triggers (see demo below). Special publishers like the timer, the channel directory, or the PubScribe repository are created during startup of the system. All publishers are managed by the publisher handler, which registers new publishers at the repository and forwards new publications to the kernel (see below).
[Fig. 6.1: Architecture of the PubScribe System (from 10.000 feet). Labeled components: Subscribers, Subscription Handler, Broker's Kernel, Data Processing Unit (DPU), Operator Clustering Unit (OCU), Staging Area (STA), Publishing Component, Publisher Handler, Data Warehouse, WebSite.]
Subscription Handler
A subscription is specified in the form of an XML document using either an XML editor or an appropriate graphical interface. The subscription is submitted to the subscription handler, which registers it at the repository in the form of a new publication.
Broker's Kernel
The central unit of the complete system is the broker's kernel. Towards the demander side, the kernel is responsible for the life cycle management of each single subscription. From the producer side, the kernel is informed about new publishers or publications and initiates that the system levels below create or update their corresponding data structures.
In case of a new subscription, the underlying data processing unit (DPU) integrates the new operator tree into its DAG representing the already existing subscriptions, so that existing nodes are shared as much as possible. As part of the inter-query optimization, it further decides which nodes are to be materialized and which remain ordinary views in the underlying database system. Each materialized view together with its (virtual) subtree is handed to the operator clustering unit (OCU), which tries to compose several operators into one combined operator. For each of these combined operators, the staging area below creates exactly one database view. Furthermore, the staging area maintains the set of materialized views when processing new publications by inserting them into the basic staging tables.
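How a new operator tree can be merged into a shared DAG may be illustrated by the following Python sketch. It is not the PubScribe implementation: it assumes that an operator tree is a nested tuple (operator label followed by its child trees) and that two nodes may be shared whenever their operator label and their children are identical. The class name OperatorDag and the labels are hypothetical, and the decision which nodes to materialize is not modeled.

```python
class OperatorDag:
    """Stores every structurally distinct operator node exactly once."""

    def __init__(self):
        self._ids = {}  # (operator label, child node ids) -> node id

    def add(self, tree):
        """Integrate an operator tree and return the id of its root node."""
        operator, *children = tree
        child_ids = tuple(self.add(child) for child in children)
        key = (operator, child_ids)
        # An existing node is reused; otherwise a new id is assigned.
        return self._ids.setdefault(key, len(self._ids))

    def __len__(self):
        return len(self._ids)

def tree_size(tree):
    """Number of nodes of an operator tree without any sharing."""
    return 1 + sum(tree_size(child) for child in tree[1:])

source = ("publisher:stocks",)
subscription_1 = ("scalar:avg", ("windowing:3day", source))
subscription_2 = ("merge",
                  ("windowing:3day", source),
                  ("windowing:cumsum", source))

dag = OperatorDag()
dag.add(subscription_1)
dag.add(subscription_2)
print(tree_size(subscription_1) + tree_size(subscription_2), len(dag))  # 8 5
```

The two subscriptions share the publisher node and the 3-day windowing node, so the DAG holds five nodes instead of eight.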
Communication between most components of the architecture described above uses the Simple Object Access Protocol (SOAP, [BEK+00]), which essentially means conveying messages formulated in XML over the standard HTTP protocol. The great advantage of this solution is that communication can easily take place across the firewalls guarding many systems in the Internet. The benefits of XML are assumed to be well known.
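To make the idea of XML messages in a SOAP envelope concrete, the following Python sketch wraps a payload element into a SOAP 1.1 envelope using only the standard library. The payload (a publication element with a channel attribute) is purely hypothetical; the actual PubScribe message format is not specified here, and the HTTP transport is left out.

```python
import xml.etree.ElementTree as ET

# Namespace of the SOAP 1.1 envelope.
SOAP_ENV = "http://schemas.xmlsoap.org/soap/envelope/"

def build_envelope(payload):
    """Wrap an XML payload element into a SOAP envelope and serialize it."""
    ET.register_namespace("SOAP-ENV", SOAP_ENV)
    envelope = ET.Element("{%s}Envelope" % SOAP_ENV)
    body = ET.SubElement(envelope, "{%s}Body" % SOAP_ENV)
    body.append(payload)
    return ET.tostring(envelope, encoding="unicode")

# Hypothetical payload: one message published to a stock channel.
publication = ET.Element("publication", {"channel": "stock-ticker"})
ET.SubElement(publication, "index").text = "10342"

message = build_envelope(publication)
print(message)
```

The resulting string would be sent as the body of an HTTP POST request, which is what allows the messages to pass through firewalls that admit ordinary web traffic.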
Demo
We are currently running a PubScribe system under http://www.pubscribe.org. The current installation allows the configuration of subscription templates (free subscription definition is not possible due to the lack of an appropriate graphical interface) and offers access to a web-based stock service and to an Oracle 8i database storing the log file of the department web server.
7 Summary and Conclusion
Many application areas require information to be delivered directly to the consumer. The well-known publish/subscribe paradigm offers a reliable and scalable service to provide information delivery based on predefined subscriptions (in contrast to advertising). The proposed PubScribe framework adopts this paradigm and extends it in several directions:
• The role model states that every module may act as a publisher or subscriber. This implies that all internal data flows are modeled as messages in the context of a publication. Furthermore, this conceptual view provides a deferred publishing mechanism without any additional effort, because publishers also react only to incoming subscriptions.
• The memory-based characteristics do not only allow the dispatching of published messages to the appropriate subscribers, but enable a wide range of processing capabilities. The PubScribe framework distinguishes between snapshot subscriptions, deriving a subscription result only from the single currently valid message, and historic (ex-nunc and ex-tunc) subscriptions referring to a sequence of historic messages, which may be evaluated using powerful operators like collapsing and windowing.
• The PubScribe system is designed to be open for publishing components as well as for subscribers. XML documents are used as a means to register new publishers, publish messages, and specify subscriptions. Thus, everybody who is able to 'speak' XML may join the system.
In summary, the PubScribe framework is a solid base for building an open information marketplace, which does not only provide a snapshot view of the world by dispatching document-style messages, but allows subscription definitions based on historic data analyzed with complex operators.
References
AdLi80 M. Adiba, B. Lindsay: Database Snapshots. In: VLDB Conference 1980, pp. 86-91
BEK+00 D. Box, D. Ehnebuske, G. Kakivaya, A. Layman, N. Mendelsohn, H.F. Nielsen, S. Thatte, D. Winer: SOAP: Simple Object Access Protocol, http://msdn.microsoft.com/workshop/xml/general/soapspec.asp, 2000
Birm93 K.P. Birman: The Process Group Approach to Reliable Distributed Computing. In: Communications of the ACM, 36(1993)12, pp. 36-53
BKS+99 G. Banavar, M. Kaplan, K. Shaw, R.E. Strom, D.C. Sturman, W. Tao: Information Flow Based Event Distribution Middleware. Middleware Workshop at the International Conference on Distributed Computing Systems 1999
FMK+99 G. Fitzpatrick, T. Mansfield, S. Kaplan, D. Arnold, T. Phelps, B. Segall: Instrumenting and Workshop on Community Knowledge, 1999
FoDu92 P.W. Foltz, S.T. Dumais: Personalized Information Delivery: An Analysis of Information Filtering Methods. In: Communications of the ACM, 35(1992)12, pp. 51-60
Forg82 C.L. Forgy: Rete: A Fast Algorithm for the Many Pattern/Many Object Pattern Match Problem. In: Artificial Intelligence, 19(1982)1, pp. 17-37
IBM00 N.N.: IBM DB2/UDB SQL Reference Version 7. IBM Corporation, pp. 177-183
LiPH00 L. Liu, C. Pu, W. Han: XWRAP: An XML-enabled Wrapper Construction System for Web Information Sources, 2000 (to be published)
McDo89 D.R. McCarthy, U. Dayal: The Architecture Of An Active Data Base Management System. In: SIGMOD Conference 1989, pp. 215-224
OMG00 N.N.: Event Service, Version 1.0. In: Corba Services Specifications, Object Management Group, 2000
OPSS93 B.M. Oki, M. Pflügl, A. Siegel, D. Skeen: The Information Bus - An Architecture for Extensible Distributed Systems. In: SOSP 1993, pp. 58-68
Powe96 D. Powell: Group Communication (Introduction to the Special Section). In: Communications of the ACM, 39(1996)4, pp. 50-53
PuLi98 C. Pu, L. Liu: Update Monitoring: The CQ Project. In: The 2nd International Conference on Worldwide Computing and Its Applications - WWCA'98, Tsukuba, Japan, Lecture Notes in Computer Science 1368, pp. 396-411
RaDa98 S. Ramakrishnan, V. Dayal: The PointCast Network. In: SIGMOD Conference 1998, pp. 520
RoSc97 M.T. Roth, P.M. Schwarz: Don't Scrap It, Wrap It! A Wrapper Architecture for Legacy Data Sources. In: VLDB Conference 1997, pp. 266-275
SAB+00 B. Segall, D. Arnold, J. Boot, M. Henderson, T. Phelps: Content Based Routing with Elvin4
SBC+98 R. Strom, G. Banavar, T. Chandra, M. Kaplan, K. Miller, B. Mukherjee, D. Sturman, M. Ward: Gryphon: An Information Flow Based Approach to Message Brokering. International Symposium on Software Reliability Engineering '98 Fast Abstract, 1998
SeLR94 P. Seshadri, M. Livny, R. Ramakrishnan: Sequence Query Processing. In: SIGMOD Conference 1994, pp. 430-441
TGNO92 D.B. Terry, D. Goldberg, D. Nichols, B.M. Oki: Continuous Queries over Append-Only Databases. In: SIGMOD Conference 1992, pp. 321-330
YaGa95 T.W. Yan, H. Garcia-Molina: SIFT - a Tool for Wide-Area Information Dissemination. In: USENIX Winter 1995, pp. 177-186
Part B: Applications
REVEALING REAL PROBLEMS IN REAL DATA WAREHOUSE
APPLICATIONS
Wolfgang Lehner*, Thomas Ruf+
*) Martin-Luther Universität Halle-Wittenberg - Institut für Informatik
Kurt-Mothes-Str. 1, D-06120 Halle (Saale)
[email protected]
+) Marketing Services Europe (MSE)
Nordwestring 101, D-90319 Nürnberg
[email protected]
Abstract
Data warehouse systems offer novel techniques and services for data-rich applications both from a data modeling and a data processing point of view. In this paper, we
investigate how well state-of-the-art concepts fulfill the requirements of real-world data
warehouse applications. Referring to a concrete example from the market research area,
key problems with current approaches are identified in the areas of dimensional modeling, aggregation management, metric definitions, versioning and duality of both master
and tracking data, context-sensitive fact calculations, derived attributes, heterogeneous
reports, and data security. For some of these problems, solutions are shown both within
and beyond the framework of the data warehouse platform used for building GfK’s market research data warehouse system. The paper concludes with a list of requirements for
extensions of current data warehouse and OLAP systems.
1 Introduction
If one follows the promises of various OLAP ('Online Analytical Processing'; [CoCs93]) tool and relational database vendors, building a data warehouse is an easy and straightforward task. The purpose of this paper is to share with the reader some observations made and lessons learned from applying state-of-the-art data warehouse technology to a real-world application scenario. It is assumed that the reader is familiar with the basic concepts of data warehousing and Online Analytical Processing (OLAP), in particular with dimensional data modeling using classification hierarchies. By discussing how these concepts are used for building the GfK data warehouse system, some general requirements for next-generation data warehouse systems are derived, along with some proposals of how to overcome specific problems in current systems.
Our presentation is based on the experiences in building a data warehouse for the GfK Marketing Services. The GfK group (http://www.gfk.de/) is a market research company operating worldwide, with more than 3300 employees and revenues of more than $350 million in 1998. The headquarters is based in Nuremberg, Germany. Besides media, ad-hoc and consumer panel research, GfK offers a bundle of services in the non-food retail panel area. The non-food retail panel, which is run by the organizational unit "GfK Marketing Services", monitors basic market information (e.g. prices, stocks, sales units) from a selected sample of shops on a regular basis. The data monitored from the sample shops are transformed into a common format, identified (i.e. mapped to the GfK product master data), cleansed, extrapolated to the total markets and transformed into information on key market factors, e.g. market share and model distribution information. This information helps GfK's customers to measure their performance in the markets they operate in and to optimize their marketing and logistics efforts.
Since 1998, GfK has been developing a new data production and reporting system for its non-food business. The aim is to replace the outdated, host-based system currently in use by a modern, client/server-based solution and thus to substantially extend the data analysis capabilities. The new data production and reporting system is based on data warehouse technology offered by MicroStrategy, which may be classified as a ROLAP (relational OLAP) approach. The underlying database system is Oracle 8 running on an HP Unix server system.
To give an impression of the data volume: sales, stock, and price figures are monitored from over 3000 shops and for more than 250,000 single articles for Germany alone on a bi-monthly basis. To be able to perform trend-analysis operations, back data for at least 3 years is required to be kept online. The application will be described in more detail throughout the remainder of this paper.
The GfK data warehouse application will be used to reveal general requirements as well as to show specific problems in building the data warehouse system for the GfK Marketing Services. Section 2 starts the discussion by describing all modeling-oriented perspectives, with a special focus on comprehensive support of dynamic classification mechanisms and a versioning scheme for master as well as tracking data. Thereafter, we address various requirements from a data warehouse user and administrator point of view in Section 3. Section 4 outlines the general requirements for an efficient aggregation process. We discuss several perspectives which are essential from our practical point of view for implementing an efficient data warehouse system. Section 5 picks up the management of summary tables, which is crucial in the physical database design for a data warehouse. The paper closes with a summary and a conclusion.
2 Modeling the World of Business
Data modeling in data warehouse systems is based on a distinction between quantifying and qualifying data. The former, often also called tracking data or fact data, describe time-variant, typically numerical attribute values (e.g. price or stock data). The latter span a multidimensional descriptive context which is necessary to assign a meaning to the quantifying data (e.g. product, shop, and time information). The typical data warehouse architecture is characterized by a central fact table, surrounded by some descriptive dimensions.
In our scenario, there are three basic dimensions (product, segment, and time) and one auxiliary dimension (price classes). The basic dimensions directly correspond to the data collection parameters in the field, i.e. measured data are collected in an article/shop/period context; the price class dimension will be described in Sections 2.2 and 2.3. In a proper dimensional model, dimensions are orthogonal to one another, meaning that values in the dimensions can be selected and combined in an arbitrary manner. If dimensions are modeled properly, they can be discussed independently from one another. In the following, we will concentrate on the product dimension to elaborate on the internal modeling of dimensions.
2.1 Context-Sensitive Dimensional Attributes
As already outlined, the data schema for our data warehouse project seems to fit seamlessly into what is commonly known as a star or snowflake schema. As we will discuss in this section, however, there are many requirements which cannot be directly supported by such a simple approach. Nevertheless, we believe that our list of defects is not specific to the market research application domain but addresses a wide area of applications.
The product dimension is hierarchically organized; single articles are grouped into product groups, these in turn are recursively grouped into product categories and those into product sectors. Each product within a product group holds a set of features describing properties of products for that specific product group.
According to the general multidimensional modeling idea, dimensional attributes are used to further describe elements of a dimension. For the product dimension, typical dimensional attributes are brand, color, or packaging type, which may be assigned globally to every article. However, our product dimension additionally holds local dimensional attributes. For example, the feature 'video system' is only valid for video-related articles, whereas the average water usage may be applicable only to the product group of 'dish washers'. Conceptually, the dimensional hierarchy represents an inheritance hierarchy, where each node within that hierarchy reflects a specialization of its parent node.
Although local dimensional attributes are a very natural phenomenon, we could not find sufficient support for them in a commercial product. Therefore, we demand further support from a modeling perspective ([Lehn98]) as well as from an architectural point of view by exploiting already existing object-relational features of commercial database systems. We currently solve the problem of attributes specific to a certain product group by introducing an additional layer, which maps locally valid properties to generic global properties. A generic attribute 'M02' may therefore reflect the property 'video system' in the product group of camcorders and, within the same data schema, denote the property 'water usage' for all dish washers. Unfortunately, no OLAP vendor was able to provide such a resolution, so that we are currently going to implement this layer in our end-user applications.
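The mapping layer described above can be pictured as a lookup that resolves a generic column name in the context of a product group. The following Python sketch is only an illustration: the two 'M02' entries come from the text, while the dictionary layout, the function name localize, and the sample article are assumptions.

```python
# (product group, generic attribute) -> locally valid property name.
# The two entries for 'M02' are taken from the text above.
ATTRIBUTE_MAP = {
    ("camcorders", "M02"): "video system",
    ("dish washers", "M02"): "water usage",
}

def localize(article):
    """Rename generic attributes of an article row to their local meaning.

    Global attributes (and any attribute without a mapping) keep their name.
    """
    group = article["product_group"]
    return {ATTRIBUTE_MAP.get((group, name), name): value
            for name, value in article.items()}

# Hypothetical article rows as they might be stored in the generic schema.
camcorder = {"product_group": "camcorders", "brand": "BrandA", "M02": "VHS-C"}
dish_washer = {"product_group": "dish washers", "brand": "BrandB", "M02": 18}

print(localize(camcorder))
print(localize(dish_washer))
```

Because the resolution happens outside the database, every end-user application has to apply the same mapping, which is exactly the overhead the text complains about.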
2.2 Secondary and Ad-Hoc Classifications
Tightly coupled with the specialized properties of dimensional elements discussed above is the requirement to define secondary classifications. Many OLAP tools provide this mechanism as a parallel classification ('week' and 'month' classifications are parallel under the 'year' node). In our context, the existence of a secondary classification depends on a specific product group and primarily refers to a dimensional attribute. Moreover, classifications based on values of dimensional attributes are not necessarily balanced. A typical example might be 'sales of camcorders split by VideoSystemGroups', where 'VideoSystemGroup' is defined as a secondary classification based on the specific VideoSystem of a single camcorder. For example, the VideoSystem instances 'VHS', 'VHS-C' and 'S-VHS' may be classified into the VideoSystemGroup 'VHS'.
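A secondary classification over a dimensional attribute is essentially a mapping from attribute values to classes, applied at aggregation time. The Python sketch below illustrates this for the VideoSystemGroup example. Only the three VHS variants are taken from the text; the '8mm' group, the sales figures, and the function name are invented for illustration.

```python
from collections import defaultdict

# VideoSystem -> VideoSystemGroup. The VHS entries follow the text;
# the '8mm' group is a hypothetical second class.
VIDEO_SYSTEM_GROUP = {
    "VHS": "VHS", "VHS-C": "VHS", "S-VHS": "VHS",
    "Video8": "8mm", "Hi8": "8mm",
}

def sales_by_class(facts, classification):
    """Aggregate (attribute value, units) facts along a classification."""
    totals = defaultdict(int)
    for value, units in facts:
        totals[classification[value]] += units
    return dict(totals)

# Invented camcorder sales per video system.
camcorder_sales = [("VHS", 120), ("VHS-C", 80), ("S-VHS", 30),
                   ("Video8", 90), ("Hi8", 45)]

print(sales_by_class(camcorder_sales, VIDEO_SYSTEM_GROUP))
# {'VHS': 230, '8mm': 135}
```

A classification created 'on the fly' by a power user would simply be another such mapping, which is why the physical support for the resulting new aggregation combinations is the harder part.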
In combination with this requirement, our typical power users, who work intensively with the data warehouse system, want to be able to create classifications 'on the fly' to test whether the resulting baskets of products yield classes of higher value for further analysis. This is important from a modeling point of view, from an end-user program design point of view, and especially with regard to the physical support of the new aggregation combinations that come along with such 'ad-hoc' classifications.
The third observation in this context may be denoted as 'functionally determined classifications'. For example, price classifications are defined so that each class of the classification subsumes the same number of articles. Since the price of an article is recorded in the context of tracking data, this implies that tracking data have an influence on the design of master data! Unfortunately, we have not found any OLAP product supporting both requirements to our full satisfaction, resulting again in a high additional implementation overhead for our data warehouse project.
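A price classification in which every class subsumes the same number of articles can be sketched as an equal-count (equi-depth) partitioning of the articles ranked by price. The Python sketch below uses invented article names and prices, and the function name is hypothetical; it also shows how a single price change in the tracking data reshuffles the classes of other articles.

```python
def equal_count_classes(prices, n_classes):
    """Assign price classes so that each class holds the same number of
    articles (up to rounding). Class 0 is the cheapest class."""
    ranked = sorted(prices, key=prices.get)
    return {article: rank * n_classes // len(ranked)
            for rank, article in enumerate(ranked)}

# Invented average prices of six articles.
prices = {"A": 199, "B": 249, "C": 299, "D": 499, "E": 899, "F": 1299}
before = equal_count_classes(prices, 3)

# A new price for B arrives with the tracking data.
prices["B"] = 950
after = equal_count_classes(prices, 3)

print(before)  # {'A': 0, 'B': 0, 'C': 1, 'D': 1, 'E': 2, 'F': 2}
print(after)   # {'A': 0, 'C': 0, 'D': 1, 'E': 1, 'B': 2, 'F': 2}
```

Note that articles C and E change their class although their own prices are unchanged: the class boundaries themselves are derived from the tracking data.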
2.3 Dynamic Classifications
A further major issue from a modeling perspective is that classifications within our data warehouse schema are not static but change dynamically. In general, master data changes must be considered from different perspectives. We will outline the problems and resulting requirements again referring to the product dimension of our concrete scenario.
Since each market is changing rapidly, new articles are added to the master database nearly every day. The assignment of an article to a product group is extremely critical, since the presence or absence of specific articles has a great semantic impact on the market shares within a single segment. On the other hand, single items of a dimensional structure must never be deleted. Instead, each item must be accompanied by a valid time indicator denoting the time frame during which the specific item has been valid. The third case reflects the situation in which a single item moves from one product group to another. This may happen explicitly or implicitly. The explicit move refers to the situation in which a new class is introduced and populated with already existing products. The implicit move of a single article corresponds to the situation in which the corresponding classification is functionally determined. For example, since the assignment of a product to a specific price class depends on the (average) price of that product, a product may move from one class to another if its price changes.
Consider a range query over multiple periods. In this case, the system has to decide to which price (and to which price classification) the query has to refer. The more general problem, which from our point of view needs a detailed investigation, is known as the 'duality of master and tracking data' ([Shos82]). On the one hand, price figures, for example, are gathered periodically in the context of tracking data, and minimum/average/maximum prices are derived from these detailed data. On the other hand, price information is used to define the classifications and therefore should be considered and treated like master data.
These characteristics require a comprehensive management of the validity and versioning of dimensional structures (or of single items within a dimension). Especially in combination with ad-hoc classifications, attention has to be paid to the notion of different variants of dimensional structures. To validate former query results, each valid state of a dimensional structure must be easily retrievable and multidimensional data must be consistently queryable.
Technically, versioning may be implemented with explicit time stamps ('valid_from', 'valid_through') or implicitly by the use of surrogate keys. In the latter case, the multi-column key consisting of the identifiers of all dimensional attributes which have to be managed under version control is replaced by a single-column surrogate key, which is also used as the foreign key to the fact table. Whenever one of the underlying dimensional attributes changes, a new surrogate key is created. This allows the data to be accessed alternatively in a version-aware (key-based access) and a version-free (ID-based access) manner; validity information may be derived from the first and last occurrences of a specific key in the fact table.
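The surrogate key approach may be illustrated by the following Python sketch. It assumes that a version is identified by the article ID together with the values of its version-controlled attributes, that facts are (period, surrogate key, units) triples, and that periods are strings that sort chronologically. All names and figures are invented.

```python
class VersionedDimension:
    """Hands out one surrogate key per distinct version of an article."""

    def __init__(self):
        self._keys = {}       # (article id, attribute values) -> surrogate key
        self.article_of = {}  # surrogate key -> article id

    def key_for(self, article_id, **attributes):
        version = (article_id, tuple(sorted(attributes.items())))
        if version not in self._keys:
            surrogate = len(self._keys) + 1  # new version, new surrogate key
            self._keys[version] = surrogate
            self.article_of[surrogate] = article_id
        return self._keys[version]

def validity(facts):
    """Derive the validity span of each key from its first and last fact."""
    spans = {}
    for period, key, _ in facts:
        first, last = spans.get(key, (period, period))
        spans[key] = (min(first, period), max(last, period))
    return spans

def total_units(facts, group):
    """Sum the units, grouping the facts by group(surrogate key)."""
    sums = {}
    for _, key, units in facts:
        sums[group(key)] = sums.get(group(key), 0) + units
    return sums

dim = VersionedDimension()
facts = [
    ("1999-01", dim.key_for(4711, price_class="low"), 10),
    ("1999-03", dim.key_for(4711, price_class="low"), 12),
    ("1999-05", dim.key_for(4711, price_class="mid"), 7),  # attribute changed
]

print(total_units(facts, lambda key: key))     # version-aware: {1: 22, 2: 7}
print(total_units(facts, dim.article_of.get))  # version-free:  {4711: 29}
print(validity(facts))
```

The fact table itself stores only the surrogate key; whether a query is version-aware or version-free is decided by the grouping applied on top of it.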
Without going into detail, we note at this point that not only versioning of master data but also versioning of tracking data is of fundamental importance. As an example of this requirement, consider a single shop that reports incorrect sales figures (which, in the real world, happens all the time!), where derived information has already been delivered to the customer by the time the sales figures in the warehouse database are corrected. To provide report consistency (for example w.r.t. yearly summaries), it must always be possible to refer to the old and wrong, but already delivered tracking data.
3 User-Friendly Warehouse Access
In this section, we will briefly summarize the problems and requirements of data warehouse access from a user point of view. We will omit classical requirements like 'as fast as the old system' or 'as flexible as my current spreadsheet program', since those requirements are of fundamental importance for the acceptance of every software system and need not be repeated here.
3.1 Report Management, Design, and Access
To outline the requirements for comprehensive report management, we again refer to our scenario, where for each reporting period thousands of predefined reports and charts have to be generated. It is easy to imagine that many persons from different departments are involved in developing and designing those reports and charts. Therefore, concurrent access to a library of such reports is crucial. Moreover, reports may in turn be classified according to specific characteristics ('ranking reports', 'running reports', 'distributions', ...). Furthermore, reports have to be subject to a versioning process, since report definitions evolve over time. A comprehensive report library must consider general report definitions as well as definitions specific to single customers and groups of customers. As a final point, we would like to mention that report definitions must be subject to user access restrictions. We will elaborate on this point in more detail in Section 3.4.
From a report management point of view, we could not find any commercial solution which meets all of the requirements mentioned above. Our current implementation basis (DSS Suite; [MSI98]) allows templates to be grouped and reports to be organized, at least in a quite restrictive manner. Versioning of reports is not supported.
From the design and access point of view, we experienced that current products are quite restrictive in generating complex OLAP reports. In our case, single reports do not consist of a single homogeneous spreadsheet but of a combination of more than one spreadsheet, designed according to the wishes of our customers (and not some algorithmic or schematic principle). Unfortunately, current products support only complex reports defined on a schematic level (i.e. by giving attribute combinations). As explained in full detail in [RuGR99], we were forced to implement a way to define composite reports on the instance level by ourselves.
3.2 Proactive Information Delivery
As the concept of data warehousing is becoming more and more attractive for everyday business, there is a growing need for proactive information delivery as an alternative way to retrieve knowledge from the data warehouse database. The supported scenario should be like this: In a first step, a user poses an Inquire()-Call to retrieve a list of channels or sources of information. In a second step, the user may submit a Subscription()-Call to place an order for incoming information. This 'order' is registered in the data warehouse system. As soon as new information arrives in the system and the corresponding delivery property is satisfied, the subscription is evaluated and the result is delivered to the appropriate user. The delivery of these pieces of information may be done in various ways. Small and urgent information may be delivered using mobile communication techniques (like SMS messages to cell phones). Customized information may be translated into different data formats (like Excel) and prepared for download (laptop computers) or synchronisation (handheld computers). It should be noted here that proactive information delivery may complement any of the query-driven analysis methods mentioned before. The technique targets a preliminary step to get informed about important and interesting changes in the data warehouse.
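The inquire/subscribe/deliver cycle described above can be sketched as follows in Python. This is a hypothetical interface, not an existing API: the class and method names, the channel name, and the representation of the delivery property as a predicate are all assumptions made for illustration.

```python
class WarehouseNotifier:
    """Toy model of proactive delivery on top of a data warehouse."""

    def __init__(self, channels):
        self._channels = set(channels)
        self._subscriptions = []

    def inquire(self):
        """Step 1: list the available channels."""
        return sorted(self._channels)

    def subscribe(self, user, channel, condition, deliver):
        """Step 2: register an 'order' for incoming information."""
        if channel not in self._channels:
            raise KeyError(channel)
        self._subscriptions.append((user, channel, condition, deliver))

    def publish(self, channel, message):
        """New information arrives: evaluate the matching subscriptions."""
        for user, subscribed, condition, deliver in self._subscriptions:
            if subscribed == channel and condition(message):
                deliver(user, message)

outbox = []  # stands in for SMS, download, or synchronisation delivery
notifier = WarehouseNotifier(["market_share"])
notifier.subscribe("analyst", notifier.inquire()[0],
                   condition=lambda m: m["change"] <= -5.0,
                   deliver=lambda user, m: outbox.append((user, m["brand"])))

notifier.publish("market_share", {"brand": "BrandA", "change": -1.2})
notifier.publish("market_share", {"brand": "BrandB", "change": -7.5})
print(outbox)  # [('analyst', 'BrandB')]
```

Passing the delivery routine as a parameter keeps the choice of the delivery medium independent of the subscription logic.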
3.3 Programming Interface
Since a data warehouse in general targets the information supply for a whole company, we learned that a data warehouse has to be open even for department-specific extensions which are not covered by the data warehouse project team directly (either due to financial or political situations). Therefore, we believe that a data warehouse system has to provide efficient access to its data for the end user as well as for application programmers in other departments or organizational units.
With an underlying relational database system (see Section 4.1 for a discussion of the system architecture), SQL is one way, but certainly not the right way, to access multidimensional information. Our current implementation of end-user tools ([RuGR99]) is based on the object-oriented interface provided by our commercial OLAP engine (DSS Objects). This programming interface provides easy-to-use, high-level access to existing data warehouse 'objects' (like report definitions). On the other hand, this interface is highly proprietary w.r.t. the specific OLAP solution. We therefore demand standardized, high-level access to multidimensional data. One promising approach is the current state of OLE-DB, especially in conjunction with the 'MultiDimensional eXpression' language MDX ([MSC98]).
3.4 User-Friendliness Unlimited?
When talking about fast and user-friendly access to data, we also have to discuss the other side
of that coin: the security of detailed tracking data. As already mentioned, shops deliver their highly critical business data (like sales and turnover figures) to GfK down to the
individual article level. Thus, it is extremely important to keep those detailed tracking data
secret from the public and especially from the main competitors of a specific data provider. Unfortunately, this cannot be achieved simply by rejecting a query that asks for specific data. It is much more challenging (and, unfortunately, we see nobody currently dealing
with that problem) to recognize and deny access to so-called tracker queries dynamically
at run-time. Tracker queries are mathematically well understood ([Mich91]) and may be
seen as a set of queries each retrieving regular-looking aggregated information. However,
combining the results of a sequence of non-critical queries may narrow down the data to the
single data item level, especially if some knowledge can be obtained from outside the database
(e.g. about trade brands that are only sold in specific shops). This problem becomes even
more critical when we think about connecting our warehouse database to the WWW. We demand further research efforts in this area and believe that this is a general problem and not
only specific to our scenario.
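The following minimal sketch illustrates how two harmless-looking aggregate queries can be combined into a tracker. The table layout, shop names, brand names, and figures are invented for illustration; the only assumption taken from the text is the outside knowledge that a trade brand is sold in a single shop.

```python
# Hypothetical sales facts: (shop, brand, turnover). All values are invented.
SALES = [
    ("shop_a", "brand_x", 120),
    ("shop_a", "own_label", 45),   # trade brand, sold only in shop_a
    ("shop_b", "brand_x", 200),
    ("shop_c", "brand_x", 310),
]


def total_turnover(rows, predicate):
    """Aggregate query: returns only a sum, never individual rows."""
    return sum(turnover for shop, brand, turnover in rows if predicate(shop, brand))


# Query 1: turnover of the whole product group (looks harmless).
q1 = total_turnover(SALES, lambda shop, brand: True)

# Query 2: the same total without the trade brand (also looks harmless).
q2 = total_turnover(SALES, lambda shop, brand: brand != "own_label")

# Combining both results isolates a single data item: because the trade brand
# is known to be sold only in shop_a, the difference is that shop's figure.
leaked = q1 - q2
print(leaked)
```

Rejecting either query on its own would not help, since each one touches several shops; the leak only appears when the sequence of queries is considered as a whole.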
4 Ways of Optimizing Aggregation Processing
Aggregation processing is the most challenging and mission-critical aspect of building an efficient data warehouse system. In this section, we outline some ways to optimize
aggregation processing in a data warehouse system. We enumerate the most interesting
points and detail the current state of research and commercial systems. Moreover, we
attach a list of requirements and demands which are, from our practical point of view, most
beneficial for a successful data warehouse project.
4.1 The System Architecture
With the advent of data warehousing and multidimensional data exploration, there
has been a long discussion about the right way of bringing multidimensional data ‘down to
the metal’, i.e. mapping data cubes to main memory and to hard disk ([Coll96]). In the meantime, it has turned out that storing the high volumes of detailed tracking data in a relational
system is most promising, especially from a scalability point of view. To evaluate this,
we initially performed a case study in the context of our data warehouse project, in which we
compared a relational with a multidimensional database system for a pre-defined and, as
far as possible, representative set of queries ([LeRT95]). Based on our investigations, we strongly
support the recommendation that multidimensional data structures are suited mainly for organizing data warehouse information in main memory and for storing data on disk for small
‘Desktop Online Analytical Processing’ applications (e.g. Plato [MSC99]). Hybrid
architectures try to combine both techniques. They are either derived from the relational approach and provide a multidimensional caching mechanism at the client side, or they come
with a truly multidimensionally organized data cube that offers a ‘drill-through’ to a relational database system for the very detailed tracking data.
Although relational database technology has proven to be reliable as well as scalable up
to multiple terabytes, running a relational database system in a data warehouse mode shows
different usage patterns compared to running it in a transactional mode.
Instead of single-row access, large volumes of data are touched by a single query and mostly
aggregated along pre-defined hierarchies (see Section 2). To speed up data warehouse queries
by aggregation support, several independent as well as combined techniques have been proposed in
the literature and are being implemented step by step in commercial relational database products.
The current state of the art in commercially available systems is that aggregates are registered in the data warehouse system and automatically used if they match the query predicate.
Little work is known so far in the area of self-adapting aggregation management techniques
and partial-match predicate usage.
4.2 Support of Specialized Index Structures
Whereas traditional B-tree-like index structures are useful for queries with a high selectivity
(e.g. “give me the address of Mr. Miller”), the old idea of bit-wise index structures
([ONei87]) is enjoying a revival in the context of data warehouse applications
([ONQu97]). In contrast to B-tree index structures, bit-wise indexes are designed to speed up
queries ranging over attributes with a low cardinality (e.g. sales by gender). Nowadays, every
major relational database system has implemented a flavor of those index structures, varying
mostly in compression methods and extension to join indexes. Pointing in the same direction as summary tables, join indexes are basically tables holding tuple identifiers of a precomputed join operation between the fact table and a dimension table of a specific star- or
snowflake schema. In our application scenario, all of these techniques may be applied, as cardinality ranges from 2 (e.g. ‘yes’/‘no’ or ‘with’/‘without’) to a few hundred (e.g.
brands) for frequently needed data warehouse attributes.
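The idea behind bit-wise indexes can be sketched in a few lines. This is an illustrative, uncompressed variant that uses Python integers as bitmaps; the column names and values are assumptions, and commercial implementations differ mainly in the compression methods mentioned above.

```python
from collections import defaultdict


def build_bitmap_index(values):
    """Map each distinct value to an integer whose bit i is set if row i has that value."""
    index = defaultdict(int)
    for row_id, value in enumerate(values):
        index[value] |= 1 << row_id
    return dict(index)


def rows_of(bitmap):
    """Decode a bitmap back into a sorted list of row ids."""
    return [i for i in range(bitmap.bit_length()) if bitmap >> i & 1]


# Hypothetical fact-table columns; position i of each list belongs to the same fact row.
gender = ["f", "m", "f", "f", "m", "m"]
promo = ["yes", "no", "no", "yes", "yes", "no"]

gender_idx = build_bitmap_index(gender)
promo_idx = build_bitmap_index(promo)

# A conjunctive predicate becomes a single bit-wise AND of two bitmaps.
hits = gender_idx["f"] & promo_idx["yes"]
print(rows_of(hits))
```

With only two distinct values per column, each index consists of just two bitmaps, which is why the technique pays off for low-cardinality attributes and not for attributes such as customer addresses.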
4.3 Sampling
Sampling traces its roots back to the area of Statistical Databases ([Olke93]). Informix
(http://www.informix.com/) was the first major database vendor to implement a sampling
mechanism to speed up queries, especially in the development and testing phase of a database
application. New approaches like [HeHW97] apply sampling techniques iteratively,
yielding a result as exact as the user wants it to be. Sampling techniques in general show
a high potential, especially in the context of risk analyses, trend explorations, or trend forecast
applications. In our application, the concept must not be confused with sampling during data
collection: whilst the data collection sample defines the universe of data to be evaluated, sampling during query execution relates to subsets of data within the universe of discourse.
Since we believe that sampling serves a wide class of applications, we demand that sampling
techniques be exploited much more in relational databases.
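As a rough illustration of sampling during query execution, the following sketch estimates a SUM from a uniform random sample and scales it by the inverse sampling rate. The function name, the data, and the sampling rate are assumptions chosen for the example; it does not reproduce the iterative scheme of [HeHW97].

```python
import random


def estimate_sum(values, fraction, seed=0):
    """Estimate sum(values) from a uniform random sample without replacement.

    The sample sum is scaled by the inverse sampling rate n / k.
    """
    n = len(values)
    k = max(1, round(n * fraction))
    sample = random.Random(seed).sample(values, k)
    return sum(sample) * n / k


# Hypothetical measure column of a fact table.
turnover = [100 + (i % 7) * 10 for i in range(10_000)]

exact = sum(turnover)
approx = estimate_sum(turnover, fraction=0.01)
print(exact, round(approx))
```

The estimate touches only one percent of the rows here. Raising `fraction` step by step trades response time for accuracy, which is the basic idea behind letting the user decide how exact the result needs to be.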
4.4 Materialized Views
Based on the concept of database snapshots ([AdLi80]), summary tables are a special
form of materialized views for queries that make heavy use of aggregation operations. To provide efficient access to highly aggregated data, using, maintaining, and selecting the appropriate set of summary tables in a data warehouse application is of paramount importance. Although precomputing summary data is a well-known technique in the SSDB
(‘Statistical & Scientific Databases’, [Shos82], [ChMc89]) area, adequate support of summary tables is a hot topic in relational database research and within the commercial database
community. Since summary tables are crucial to a successful data warehouse project, we
detail the requirements that come with this technique in Section 5.
4.5 Multiple Query Optimization
Current database technology is primarily designed to isolate single queries from one another
(Isolation as part of the ACID concept for transaction processing). In the context of data
warehousing, however, it is quite common to know a priori a set of queries whose results have to be computed on the same set of tables, i.e. a single star schema. This characteristic provides a sound basis for the application of multiple query optimization. Prior work
in this area ([Sell88]) has focused on finding common join operations and predicates within
a set of queries. This approach failed mainly due to the complexity of handling general predicates. From a data warehouse point of view, the focus of multiple query optimization is on
using common aggregation levels. The work of [YaKL97] reflects the current state of research
in this area. Reinforced by experiences from our concrete application
scenario, where thousands of predefined reports have to be computed in every reporting period, we believe that multiple query optimization in some kind of batch mode would provide an enormous optimization potential.
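A minimal sketch of sharing a common aggregation level across a batch of reports is given below. The fact rows, column positions, and report definitions are invented; the sketch relies on SUM being distributive, so that coarser totals can be computed from finer ones, and it is not the algorithm of [YaKL97].

```python
from collections import defaultdict

# Hypothetical fact rows: (period, shop, product_group, turnover).
FACTS = [
    ("1999-01", "shop_a", "tv", 100),
    ("1999-01", "shop_b", "tv", 150),
    ("1999-01", "shop_a", "radio", 40),
    ("1999-02", "shop_b", "radio", 60),
]


def aggregate(rows, key_positions):
    """Group rows by the given key positions and sum the last column."""
    result = defaultdict(int)
    for row in rows:
        result[tuple(row[i] for i in key_positions)] += row[-1]
    return dict(result)


# Step 1: scan the fact table once at the finest level all reports share.
common = aggregate(FACTS, key_positions=(0, 2))  # (period, product_group)
common_rows = [key + (value,) for key, value in common.items()]

# Step 2: answer each report of the batch from the shared intermediate result.
report_by_period = aggregate(common_rows, key_positions=(0,))
report_by_group = aggregate(common_rows, key_positions=(1,))
print(report_by_period, report_by_group)
```

Both reports are answered without a second scan of the fact table. With thousands of predefined reports per period, the saving grows with every report that can be derived from the same intermediate level.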
4.6 Handling Historical Data
The last point in our (by no means complete) list of optimization potentials for efficient
aggregation processing deals with historical data. According to the widespread definition of a ‘data warehouse’ ([Kimb96]), keeping historical data is one of the four distinctive
characteristics of a data warehouse. Unfortunately, as we had to learn in our application scenario, historical data causes problems in several respects, for which we demand extensive
support from the underlying database system. As outlined in the former section, the raw data
of the current period may slightly change during a reporting period, e.g. when some
shops delay the reporting of their sales figures or correct them afterwards. Therefore, the data
of the current period has to be physically stored in a different way from the data of older periods, e.g. with a higher RAID level. As the data becomes more and more stable, its physical
organization may be changed to less cost-intensive forms. Moving historical data
off a high RAID level is not only a matter of saving money: migrating stable data to cheaper formats, which are often optimized for read-only access, may even increase the data access
rate.
As we have seen in our application, very detailed tracking data from
former reporting periods often need not be kept online. It should be possible to migrate such data
transparently to a near-line tape archive, for example, whenever possible. However, the data still has to be accessible by the database system in an application-transparent
way. This migration policy has to be combined with comprehensive summary table management. Up to now, we see nobody investing effort in this area, neither on the commercial
side nor within the research community.
5 The Magic Triangle of Summary Tables
The concept of summary tables addresses three different functional perspectives which have
to be adequately supported within an efficient data warehouse system. The requirements of
using, maintaining, and selecting the ‘best’ set of summary tables define our so-called ‘Magic Triangle of Summary Tables’. In addition to these functional perspectives, we will address
the question of which architectural component has to provide a sophisticated management
for summary tables yielding the greatest benefit with the lowest additional cost.
5.1 Transparent Use of Summary Tables
Assuming that pre-computed summary tables already exist, the first step is to take advantage
of their existence and speed up the answering of incoming queries. This could be done
either at the query specification level, where the user has to be aware of existing summary
tables, or transparently within the ROLAP server or inside the relational database engine. An
explicit use of summary data in end-user queries pushes the dynamic management of the
set of summary tables to the application level and should not be taken into consideration. Internal
query re-routing at the level of the ROLAP server implies, on the one hand, that the server
has to have appropriate knowledge of the existence of summary tables. On the other hand,
the OLAP engine usually has detailed knowledge of the functional dependencies prevailing in
the dimensional structures (cf. the hierarchy of article, product group, product category, and product sector
mentioned earlier in this paper). As long as the underlying relational schema is not
normalized (the normal case in data warehouse applications), information about functional dependencies inside a table is not available within a relational engine. To overcome
these restrictions and to use that knowledge for testing the derivability of an incoming query,
SQL extensions like the ‘create hierarchy’ statement in RiSQL ([RBS98]) are required. Current derivability algorithms (e.g. [ScSV96]) are limited to ‘equal match’ or ‘query containment’ queries, where the query and the summary table must either match exactly, or the
query must be completely derivable from the underlying summary table. A first proposal that removes this limitation and derives a single query from a set of summary tables (‘set-derivability’) is presented in [AlGL99].
Since the transparent use of summary data is state-of-the-art in modern database systems, we
demand the implementation of extended derivability techniques and a more convenient way
to formulate user-defined aggregation operations to reflect the increasing need for complex
and application-oriented statistical analyses.
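To illustrate why knowledge of functional dependencies matters for the derivability test, the following sketch checks ‘query containment’ on grouping attributes only. The hierarchy levels of the time dimension, the dictionary layout, and the function names are assumptions; selection predicates and non-additive aggregation functions are deliberately ignored.

```python
# Hypothetical dimensional hierarchies, finest level first. Each level
# functionally determines all levels to its right.
HIERARCHIES = {
    "product": ["article", "product_group", "product_category", "product_sector"],
    "time": ["day", "month", "year"],
}


def level_rank(attribute):
    """Return (dimension, position) of an attribute within its hierarchy."""
    for dimension, levels in HIERARCHIES.items():
        if attribute in levels:
            return dimension, levels.index(attribute)
    raise KeyError(attribute)


def is_derivable(query_group_by, summary_group_by):
    """Check 'query containment' on grouping attributes only.

    The query is derivable if, for every attribute it groups by, the summary
    table holds the same dimension at an equal or finer level.
    """
    finest = {}
    for attribute in summary_group_by:
        dimension, rank = level_rank(attribute)
        finest[dimension] = min(rank, finest.get(dimension, rank))
    for attribute in query_group_by:
        dimension, rank = level_rank(attribute)
        if dimension not in finest or finest[dimension] > rank:
            return False
    return True


print(is_derivable(["product_category", "year"], ["product_group", "month"]))
print(is_derivable(["article"], ["product_group", "month"]))
```

Without the `HIERARCHIES` metadata, a relational engine working on a denormalized dimension table could not know that a summary table grouped by product group can answer a query grouped by product category.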
5.2 Maintaining Summary Data
As already outlined in the modeling section, tracking data in data warehouse applications is
only stable from a theoretical point of view. Most applications require changes to the fact data
after generic summary data or specific pre-defined end-user reports have been produced.
Derived data therefore has to be maintained whenever the base data changes. The research community has been tackling the aggregate maintenance problem for the last few
years, resulting in highly sophisticated algorithms (see [GuMa95] and [MuQM97] for an excellent overview). In contrast, commercial database systems have hardly even started to
implement those features. Although a relational database system seems to be the right place
for such ‘repair’ operations (the changes are made to the base data residing in the relational
engine), many derived data, like the results of complex distribution or derivation analyses, are
typically computed inside the OLAP server and therefore have to be maintained under direct
control of the OLAP server.
In general, we see that a lot of work still needs to be done in this important area. Especially
in the context of user-defined aggregation functions, it becomes crucial for relational database systems to fulfill the ever-increasing requirements with regard to complex aggregation
functions. We see the need for a way to define an aggregation function in combination
with a buddy function describing a maintenance algorithm (incremental as far as possible)
for the original aggregation function.
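One possible reading of the buddy-function idea is sketched below for the average: the aggregate is stored as a (sum, count) pair, and the maintenance methods repair that state from a delta instead of rescanning the base data. The class and method names are assumptions made for illustration.

```python
class IncrementalAverage:
    """AVG kept maintainable by storing (sum, count) instead of the quotient."""

    def __init__(self, values=()):
        self.total = 0.0
        self.count = 0
        for value in values:
            self.insert(value)

    # The following methods play the role of the 'buddy function': they
    # repair the stored state from a delta without rescanning the base data.
    def insert(self, value):
        self.total += value
        self.count += 1

    def delete(self, value):
        self.total -= value
        self.count -= 1

    def update(self, old, new):
        self.total += new - old

    def result(self):
        return self.total / self.count if self.count else None


avg = IncrementalAverage([10, 20, 30])
avg.update(old=30, new=60)   # a shop corrects its sales figure afterwards
avg.insert(20)               # a delayed report arrives
print(avg.result())
```

Not every function can be maintained this way: after deleting the current minimum, for example, a MIN aggregate generally has to consult the base data again, which is why the maintenance algorithm can only be incremental as far as possible.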
5.3 Selection of the ‘Best’ Set of Summary Tables
The third perspective in the context of summary data management is the question of which
attribute combination yields the best performance gains in query processing and should
therefore become a materialized summary table ([Pend98]). Optimality in this case is determined by the (estimated) size of the summary data, the reference frequency of the attribute
combination (either directly or indirectly from queries referring to a combination which is
derivable from that specific combination), and the potential savings in relation to the
raw data or the next summary table from which this specific combination is derivable. Current research proposals (e.g. [GHRU97], [BaPT97], and [Gupt97]) address that problem only with respect to attribute combinations. These algorithms therefore lack appropriate support for analyzing hot spots like ‘the last two periods’, because they do not take
data partitions into consideration. The only work known to us within this area is [DRSN98] and
[AlGL99]. This perspective comes with the requirement that the set of
summary tables adapts to the users’ reference behavior. Ideally, the database system
should automatically determine the best set of summary tables. The reality on the commercial side is that the database administrator has to explicitly define and populate summary tables. Current research has already picked up this topic in the context of automatically determining the set of appropriate indexes. Commercial products like Red Brick Vista ([RBS98])
or OLAP Services ([MSC99]) also provide a first solution to that problem.
From our data warehouse application point of view, content-based and dynamically organized sets of summary tables are of fundamental importance to ensure adequate query response times, and we demand further development within this area. The current state of the art is to provide an administrator with some hints about good summary tables ([MSI98]), which
is far from optimal.
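The three optimality criteria named above (estimated size, reference frequency, and savings relative to the current source) can be combined in a simple greedy ranking under a space budget. The sketch below is a deliberately simplified illustration with invented figures; it treats the candidates as independent and is not one of the cited selection algorithms.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    name: str          # attribute combination, e.g. "group x month"
    size: int          # estimated number of rows of the summary table
    frequency: int     # queries per period that could be answered from it
    source_size: int   # rows scanned today (raw data or next summary table)

    @property
    def saving(self):
        """Rows saved per period if this candidate is materialized."""
        return self.frequency * (self.source_size - self.size)


def select_summary_tables(candidates, space_budget):
    """Greedily pick candidates by saving per stored row within the budget."""
    chosen, used = [], 0
    ranked = sorted(candidates, key=lambda c: c.saving / c.size, reverse=True)
    for candidate in ranked:
        if used + candidate.size <= space_budget:
            chosen.append(candidate.name)
            used += candidate.size
    return chosen


# All figures below are invented for illustration.
CANDIDATES = [
    Candidate("article x month", size=500_000, frequency=20, source_size=10_000_000),
    Candidate("group x month", size=5_000, frequency=300, source_size=10_000_000),
    Candidate("sector x year", size=50, frequency=40, source_size=10_000_000),
]
print(select_summary_tables(CANDIDATES, space_budget=100_000))
```

Because the ranking works on whole attribute combinations, it shares the weakness criticized in the text: a hot spot such as ‘the last two periods’ would require candidates that describe data partitions rather than complete combinations.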
6 Summary and Conclusion
In this paper, we have presented our experiences from building a market research data warehouse for GfK Marketing Services. We have addressed the problems from a modeling point
of view (especially modeling a dimensional hierarchy like an inheritance hierarchy) and from a user’s
access point of view (management of report libraries and data access denial for tracker queries), and we have stated our requirements in combination with physical database design considerations in the context of data warehousing.
As far as we can see, existing products cover only some core data warehouse functions but
lack sophisticated modeling, administration, and operation support for many real-world
problems. Unfortunately, it seems that the vendors prefer to embrace new opportunities (e.g.
broadcasting services over the Web) rather than strengthening the core systems with badly
needed functionality. The worst thing to happen would be that the responsibility for finding
proper solutions to challenging requirements is shifted back to the end user.
References
AdLi80 Adiba, M.E.; Lindsay, B.G.: Database Snapshots. In: Proceedings of the 6th International Conference on Very Large Data Bases (VLDB’80, Montreal, Canada, Oct. 1-3), 1980, pp. 86-91
AlGL99 Albrecht, J.; Guenzel, H.; Lehner, W.: Foundations for the Derivability of Multidimensional Aggregates, submitted to DAWAK’99
BaPT97 Baralis, E.; Paraboschi, S.; Teniente, E.: Materialized Views Selection in a Multidimensional Database. In: Proceedings of the 23rd International Conference on Very Large Data Bases (VLDB’97, Athens, Greece, Aug. 25-29), 1997, pp. 156-165
ChMc89 Chen, M.C.; McNamee, L.P.: On the Data Model and Access Method of Summary Data Management. In: IEEE Transactions on Knowledge and Data Engineering 1(1989)4, pp. 519-529
CoCS93 Codd, E.F.; Codd, S.B.; Salley, C.T.: Providing OLAP (On-line Analytical Processing) to User Analysts: An IT Mandate, White Paper, Arbor Software Corporation, 1993
Coll96 Colliat, G.: OLAP, Relational, and Multidimensional Database Systems. In: ACM SIGMOD Record 25(1996)3, pp. 64-69
DRSN98 Deshpande, P.M.; Ramasamy, K.; Shukla, A.; Naughton, J.F.: Caching Multidimensional Queries Using Chunks. In: Proceedings of the 27th International Conference on Management of Data (SIGMOD’98, Seattle (WA), June 2-4), 1998, pp. 259-270
GHRU97 Gupta, H.; Harinarayan, V.; Rajaraman, A.; Ullman, J.D.: Index Selection for OLAP. In: Proceedings of the 13th International Conference on Data Engineering (ICDE’97, Birmingham, Great Britain, April 7-11), 1997, pp. 208-219
GuMa95 Gupta, A.; Mumick, I.: Maintenance of Materialized Views: Problems, Techniques, and Applications. In: IEEE Data Engineering Bulletin 18(1995)2, pp. 3-18
Gupt97 Gupta, H.: Selection of Views to Materialize in a Data Warehouse. In: Proceedings of the 6th International Conference on Database Theory (ICDT’97, Delphi, Greece, Jan. 8-10), 1997, pp. 98-112
HeHW97 Hellerstein, J.M.; Haas, P.J.; Wang, H.J.: Online Aggregation. In: Proceedings of the 26th International Conference on Management of Data (SIGMOD’97, Tucson (AZ), May 13-15), 1997, pp. 171-182
Kimb96 Kimball, R.: The Data Warehouse Toolkit, 2nd edition. New York, Chichester, Brisbane, Toronto, Singapore: John Wiley & Sons, Inc., 1996
LeRT95 Lehner, W.; Ruf, T.; Teschke, M.: Data Management in Scientific Computing: A Study in Market Research. In: Proceedings of the International Conference on Applications of Databases (ADB’95, Santa Clara (CA), Dec. 13-15), 1995, pp. 31-35
Lehn98 Lehner, W.: Modeling Large Scale OLAP Scenarios. In: Proceedings of the 6th International Conference on Extending Database Technology (EDBT’98, Valencia, Spain, March 23-27), 1998, pp. 153-167
Mich91 Michalewicz, Z. (Ed.): Statistical and Scientific Databases. Chichester, West Sussex, England: Ellis Horwood Limited, 1991
MSC98 Microsoft Corporation: OLE DB and OLE DB for OLAP specification, 1999 (http://www.microsoft.com/data/oledb/)
MSC99 Microsoft Corporation: SQL Server 7.0 OLAP Services, 1999
MSI98 MicroStrategy, Inc.: DSS Suite, 1998
MuQM97 Mumick, I.; Quass, D.; Mumick, B.: Maintenance of Data Cubes and Summary Tables in a Warehouse. In: Proceedings of the 26th International Conference on Management of Data (SIGMOD’97, Tucson (AZ), May 13-15), 1997, pp. 100-111
Olke93 Olken, F.: Random Sampling from Databases. Technical Report 32883, University of California Berkeley; Lawrence Berkeley Laboratory, Berkeley (CA), April 1993
ONei87 O’Neil, P.: Model 204: Architecture and Performance. In: Gawlick, D.; Haynie, M.; Reuter, A. (Ed.): High Performance Transaction Systems. Lecture Notes in Computer Science 359, Springer, 1987
ONQu97 O’Neil, P.; Quass, D.: Improved Query Performance with Variant Indexes. In: Proceedings of the 26th International Conference on Management of Data (SIGMOD’97, Tucson (AZ), May 13-15), 1997, pp. 38-49
Pend98 Pendse, N.: Database Explosion, Business Intelligence Ltd., 1998 (http://www.olapreport.com/DatabaseExplosion.htm)
RBS98 Red Brick Systems, Inc.: Red Brick Vista Aggregate Computation and Management. White Paper, 1998
RuGR99 Ruf, T.; Goerlich, J.; Reinfels, I.: Complex report support in data warehouse and OLAP environments, submitted to DAWAK’99
ScSV96 Scheuermann, P.; Shim, J.; Vingralek, R.: WATCHMAN: A Data Warehouse Intelligent Cache Manager. In: Proceedings of the 22nd International Conference on Very Large Data Bases (VLDB’96, Bombay, India, Sept. 3-6), 1996, pp. 51-62
Sell88 Sellis, T.: Multiple Query Optimization. In: Transactions on Database Systems 13(1988)1, pp. 23-51
Shos82 Shoshani, A.: Statistical Databases: Characteristics, Problems, and Some Solutions. In: Proceedings of the 8th International Conference on Very Large Data Bases (VLDB’82, Mexico City, Mexico, Sept. 8-10), 1982, pp. 208-222
YaKL97 Yang, J.; Karlapalem, K.; Li, Q.: Algorithms for Materialized View Design in Data Warehousing Environment. In: Proceedings of the 23rd International Conference on Very Large Data Bases (VLDB’97, Athens, Greece, Aug. 25-29), 1997, pp. 136-145
PUBLISH/SUBSCRIBE SYSTEMS IN DATA WAREHOUSING:
MORE THAN JUST A RENAISSANCE OF BATCH PROCESSING
W. Lehner, W. Hümmer, M. Redert, C. Reinhard
Universität Erlangen-Nürnberg - Lehrstuhl für Datenbanksysteme
Martensstr. 3, D-91058 Erlangen
{lehner, huemmer, mlredert, cnreinha}@immd6.informatik.uni-erlangen.de
Abstract
The success of data warehouse systems raises new requirements that future
data warehouse architectures must take into account. This paper motivates
extending the classical data warehouse architecture by a ‘publish/subscribe’ component in order to enable message-centric information supply and
supply-driven information provision in the context of an integrated information system. To make this claim concrete, the PubScribe project is presented as an example
of a ‘publish/subscribe’ system, covering its
logical architecture, its processing model, and its prototypical implementation.
It turns out that the PubScribe approach, based on a strict
role concept combined with complex processing logic, can be positioned as a solid
building block of an integrated information system.
1 Introduction
A data warehouse essentially represents an integrated and historized database
spanning a multitude of participating source systems. In contrast to the ‘query shipping’ approach in the area of federated database systems ([Conr97]), the data warehouse approach physically integrates and historizes local data
(Figure 1.1). From a functional perspective, a data warehouse system presents itself to a user from the following angles:
• System-centric information supply:
Deriving information from the data requires special analysis tools, for example for interactive data exploration (‘Online Analytical
Processing’, OLAP) or for uncovering regularities and irregularities in a data mining process.
• Demand-driven information provision:
Since the classical data warehouse concept provides for an explicit integration of source systems into a data warehouse database, new source systems are integrated only at the request or insistence of the users.
To come one step closer to the vision of comprehensive information supply in the context of next-generation
data warehouse systems, both perspectives are confronted with new
requirements and their consequences:
• Opening the data warehouse system:
While the original idea of data warehousing was to establish a consolidated database for decision making within an organization ([Inmo96]),
the service “data warehouse” is becoming increasingly interesting for broader user
groups, both within the organization and as an externally marketed product.
Opening the system, in particular to the Internet, will result in an explosion of user numbers,
all of whom must be served at a constant quality of service.
• Flexible integration of external data sources:
It is to be expected that complex analyses will increasingly require external information, for example to carry out application-specific comparisons. The ability to dynamically integrate external
data sources, in particular web pages, will therefore become a central requirement.
Fig. 1.1: Data warehouse architecture with a ‘publish/subscribe’ component (source systems and external data sources; data acquisition with extraction, transformation, ...; normalized base database with detailed data; denormalized, analysis-oriented database with summary data; ‘publish/subscribe’ component; users)
In this paper, we propose extending the classical data warehouse architecture, as sketched in Figure 1.1 following [BaGü00], by a ‘publish/subscribe’ component, and we make the proposal concrete using the PubScribe project as an example.
The ‘Publish/Subscribe’ Processing Concept
In general, the ‘publish/subscribe’ approach offers a robust and scalable mechanism for communication between loosely coupled systems ([Birm93], [Powe96]). ‘Publish/subscribe’
contrasts with the classical ‘request/response’ processing concept
(Figure 1.2a). In the ‘request/response’ approach, the user turns directly to a service provider to request a certain service. The service consumer is blocked while the request is processed, so the service provider is generally geared towards delivering the service efficiently.
In the ‘publish/subscribe’ approach, the direct communication between producer and consumer is relaxed by introducing a mediating component. Producers
(‘publishers’) publish information on their own initiative to the subscription management system. Consumers (‘subscribers’) register their requirements once in the form
of a subscription and are notified about new publications according to specified delivery conditions.
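The decoupling described above can be sketched with a minimal in-memory broker. The class name, method names, and message layout are assumptions made for illustration and are not taken from PubScribe; delivery conditions are reduced to immediate delivery of every matching message.

```python
from collections import defaultdict


class SubscriptionManager:
    """Broker that decouples publishers from subscribers."""

    def __init__(self):
        self._subscriptions = defaultdict(list)  # channel -> [(predicate, deliver)]

    def subscribe(self, channel, predicate, deliver):
        """Register a subscription once; it stays active for later messages."""
        self._subscriptions[channel].append((predicate, deliver))

    def publish(self, channel, message):
        """Called by a publisher on its own initiative; returns deliveries made."""
        delivered = 0
        for predicate, deliver in self._subscriptions[channel]:
            if predicate(message):
                deliver(message)
                delivered += 1
        return delivered


broker = SubscriptionManager()
inbox = []
broker.subscribe("sales", lambda m: m["turnover"] > 100, inbox.append)

broker.publish("sales", {"shop": "shop_a", "turnover": 80})    # filtered out
broker.publish("sales", {"shop": "shop_b", "turnover": 250})   # delivered
print(inbox)
```

In contrast to ‘request/response’, neither side blocks on the other: the publisher does not know who receives a message, and the subscriber never issues a query after registering.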
Enriching the classical data warehouse architecture with a ‘publish/subscribe’ component promises a solid basis for meeting the requirements stated above, complementing existing ‘request/response’ approaches, in the following way:
• Transition to message-centric information supply:
In the context of ‘publish/subscribe’, the focus of interest is no longer the analysis system but the
analyses themselves, in the form of notifications. Because
subscriptions are registered, the system can evaluate similar subscriptions jointly or maintain subscription states incrementally, which
enables far-reaching optimization of the processing.*
• Transition to supply-driven information provision:
Just as new messages arrive, it is inherent in the ‘publish/subscribe’ approach that new producers appear and participate in the system. The explicit integration of data sources in the classical way is thus extended by supply-driven data providers.
Fig. 1.2: Comparison of ‘request/response’ and ‘publish/subscribe’ (a: ‘request/response’ approach, with request and response between producers and consumers; b: ‘publish/subscribe’ approach, with publish, subscribe, and deliver between publisher, subscription management system, and subscriber)
‘Publish/subscribe’ can be used in a multitude of application scenarios at different levels of a service hierarchy (Section 5). In the context of database systems, ‘publish/subscribe’ has so far been clearly neglected, but it enjoys ever broader support both in research (Continual Query Project,
[PuLi98]) and in commercial systems ([Orac99], [IBM00]). For
the application area of data warehousing, introducing a subscription technology can already satisfy a large part of the information needs of
data warehouse users. Detailed and specific analyses would remain reserved for the tools for exploring databases.
The PubScribe System
Besides motivating ‘publish/subscribe’ systems in data warehousing in general, this paper deals in particular with the PubScribe system
as a concrete realization of the requirements stated above. PubScribe has the following properties, which distinguish it from related work and projects:
• Role model:
The PubScribe approach follows a strict role concept with regard to producers (‘publishers’) and subscribers. Every component participating in the system can take on a role depending on the situation, so that all communication between subcomponents can be
modeled and implemented on the basis of the ‘publish/subscribe’ approach.
• Complex processing logic and local storage:
Extending the pure mediation (and, where applicable, filtering) of messages between publisher and subscriber, the PubScribe system is equipped with complete processing logic, so that complex analyses can be
registered in the system as subscriptions.
• XML-based implementation:
The message exchange between components, and between the modules of individual components, is XML-based, which allows flexible extension and adaptation.
* The attentive reader will surely not have missed the analogy to batch processing invoked in the title. Although the basic ideas of the two approaches are quite similar at a high level of abstraction,
‘publish/subscribe’ differs considerably from simple batch processing in the quality of the
service and in the way the service is delivered.
Outline
The following section presents the logical architecture of the PubScribe system, covering the individual components, the different subscription types, and the role concept. Section 3 sketches the processing model of the PubScribe system. It introduces sequence operators, which are applied to incoming messages
to determine the respective instances of subscriptions. Section 4
then reflects, as briefly as possible, on the implementation architecture. Before the
paper closes with a summary and an account of current and planned work,
Section 5 gives a short characterization of related work and projects.
2 Logical PubScribe Architecture
As can be seen from Figure 2.1, the logical architecture of the PubScribe system presented in this paper
is based on the three-schema architecture model for database systems according to ANSI/SPARC ([TsKl78]), in which data independence and data neutrality between the internal and external schemas are achieved by introducing a single conceptual schema. The individual components, or rather the roles that components can take on (Section 2.4), are each assigned to a layer below
and described in detail.
[Figure: three layers connected via XML. External schemata (subscription-specific): subscriber components with registered subscriptions. Conceptual schema: brokering and processing component with broker core, subscription module, publication module, and delivery component. Internal schemata (channel-specific): publisher components with published messages (timer, data warehouse, base database, web site).]
Fig. 2.1: Logical architecture of the PubScribe system
2.1 Level of the Internal Schemata
The level of the internal schemata contains the producing units (publishers), each with a specific, freely chosen schema for the messages it publishes. Before its first publication, a data source must register with the brokering component and specify both the schema of its future publications and its maximum publication frequency. In addition, a publisher states whether it supports initial evaluations. Initial evaluations are required for ex-tunc subscriptions and are explained in Section 2.2 below. The registration itself is carried out by means of an XML document that is passed to the brokering component. The brokering component validates the request against the corresponding DTD and opens a new information channel for this publisher.
A publisher is then responsible for publishing messages on demand. To this end, the brokering component formulates a generic subscription to this producer on behalf of one or more user subscriptions it has received. This method, which is only made possible by the role concept introduced in Section 2.4, avoids wasting resources in the case, generally unfavorable in 'publish/subscribe' systems, that messages are published although no consumers for them exist. The general procedure of demand-driven publication is described in detail under the term 'communication patterns' in [LeHR00] and is not pursued further in this paper.
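The registration and the demand-driven publication described in this section can be illustrated with a minimal sketch. All class, method, and field names below are hypothetical and are not taken from PubScribe; the maximum publication frequency is modeled as a minimum interval in seconds, and the generic subscription is assumed to simply collect the distinct intervals requested by users.

```python
from dataclasses import dataclass, field


@dataclass
class Channel:
    name: str
    schema: dict                      # attribute name -> type name
    min_interval_s: int               # shortest allowed gap between publications
    supports_initial_evaluation: bool = False
    requested_intervals: list = field(default_factory=list)


class Broker:
    """Hypothetical brokering component; not the PubScribe API."""

    def __init__(self):
        self.channels = {}

    def register_publisher(self, name, schema, min_interval_s,
                           supports_initial_evaluation=False):
        # Registration opens a new information channel for the publisher.
        if name in self.channels:
            raise ValueError(f"channel {name!r} already registered")
        self.channels[name] = Channel(name, schema, min_interval_s,
                                      supports_initial_evaluation)

    def subscribe(self, channel, interval_s):
        ch = self.channels[channel]   # KeyError if the channel is unknown
        if interval_s < ch.min_interval_s:
            raise ValueError("requested faster than the publisher allows")
        ch.requested_intervals.append(interval_s)

    def generic_subscription(self, channel):
        """One request to the publisher that covers all user subscriptions."""
        ch = self.channels[channel]
        if not ch.requested_intervals:
            return None               # no consumers: the publisher stays silent
        return {"channel": channel,
                "intervals_s": sorted(set(ch.requested_intervals))}


broker = Broker()
broker.register_publisher("stocks", {"Kurs": "float"}, min_interval_s=60)
```

As long as `generic_subscription("stocks")` returns `None`, the publisher has no reason to publish anything, which is the resource saving the text refers to.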
2.2 Level of the External Schemata
The level of the external schemata reflects the (user) subscriptions registered in the system. A subscription consists of four operator graphs of arbitrary complexity, which represent the following subscription components:
• Body of the subscription ("subscription body")
• Delivery condition ("delivery condition")
• Stop condition ("closing condition")
• Start condition ("opening condition")
[Figure: state transition diagram with five numbered states (1 to 5); the transitions are labeled "start condition fulfilled", "delivery condition fulfilled / not fulfilled", and "stop condition fulfilled / not fulfilled".]
Fig. 2.2: States and state transitions of a subscription
An operator graph in PubScribe is an acyclic graph whose nodes represent operators and whose edges represent sequences of messages. The leaf nodes of an operator graph represent the information channels available in the system, which are fed by registered publisher components (Section 2.1).
The life cycle of a subscription is visualized in Figure 2.2 as a state transition diagram. When a subscription arrives, the operator graph for its start condition is added to the global operator graph of the system. Once the start condition is fulfilled, it is removed from the system and the delivery and stop conditions are instantiated. If newly arriving messages fulfill the delivery condition, the body of the subscription is evaluated; if the stop condition occurs, the subscription is removed from the system completely.
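The life cycle just described can be sketched as a small state machine. This is only an illustration under stated assumptions: the four subscription components are modeled as plain Python callables instead of operator graphs, the state names are invented, and the order in which the delivery and stop conditions are checked for the same message is a choice made here, not something the text specifies.

```python
from enum import Enum, auto


class State(Enum):
    WAITING_FOR_START = auto()   # only the start condition is registered
    ACTIVE = auto()              # delivery and stop conditions are instantiated
    REMOVED = auto()             # subscription has left the system


class Subscription:
    def __init__(self, start, deliver, stop, body):
        self.start, self.deliver, self.stop, self.body = start, deliver, stop, body
        self.state = State.WAITING_FOR_START
        self.deliveries = []

    def on_message(self, msg):
        if self.state is State.WAITING_FOR_START:
            if not self.start(msg):
                return
            # Start condition fulfilled: it is dropped, the others take over.
            self.state = State.ACTIVE
        if self.state is State.ACTIVE:
            if self.deliver(msg):
                self.deliveries.append(self.body(msg))   # evaluate the body
            if self.stop(msg):
                self.state = State.REMOVED


sub = Subscription(start=lambda m: m >= 2,
                   deliver=lambda m: m % 2 == 0,
                   stop=lambda m: m >= 6,
                   body=lambda m: m * 10)
for message in range(1, 9):
    sub.on_message(message)
```

In this run the messages 2, 4, and 6 trigger a delivery; message 6 also fulfills the stop condition, so messages 7 and 8 are ignored.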
A further aspect to be addressed for subscriptions at the level of the external schemata in the PubScribe project is their classification. PubScribe distinguishes three basic kinds of subscriptions:
• "Snapshot" subscriptions:
A snapshot subscription can be answered by referring only to the currently valid, i.e. the most recent, message of a publisher. The demands on the publisher are minimal for this kind of subscription.
• "Ex-nunc" subscriptions:
Like the ex-tunc subscriptions described next, subscriptions of type 'ex-nunc' refer to a set of messages of a channel. The collection of messages begins when the subscription starts. The initially empty "workspace" fills with arriving messages, over which the body of the subscription is then evaluated each time. Storing the collected information can be handled either by the publisher or by the subscription system.
• "Ex-tunc" subscriptions:
Ex-tunc subscriptions require that the set of messages needed to answer the subscription is already completely available once the start condition is fulfilled, i.e. the collection of information must have begun before the subscription started. To make this possible, the publisher must allow access to historical data. The demands on the publisher are therefore highest in this case.
For "ex-tunc" subscriptions, the brokering component formulates an initial evaluation in the form of a synchronization subscription to the corresponding publisher in order to obtain the historical data needed for the evaluation. A synchronization subscription is characterized by the fact that its start, stop, and delivery conditions are all 'true': the body is evaluated immediately after registration, the resulting message is delivered, and the subscription is removed from the system again.
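The difference between the three subscription types comes down to which messages form the workspace over which the body is evaluated. The following sketch makes that explicit; the function name and the split into `history` (published before the subscription started) and `arrived` (published afterwards) are assumptions introduced for illustration.

```python
def workspace(kind, history, arrived):
    """Return the messages a subscription body may see.

    history: messages published before the subscription started
    arrived: messages published since the subscription started
    """
    if kind == "snapshot":
        # Only the most recent message counts, wherever it came from.
        return (list(history) + list(arrived))[-1:]
    if kind == "ex-nunc":
        # Collection starts with the subscription; the workspace begins empty.
        return list(arrived)
    if kind == "ex-tunc":
        # History must be fetched from the publisher (initial evaluation).
        return list(history) + list(arrived)
    raise ValueError(f"unknown subscription type: {kind!r}")
```

Only the ex-tunc case needs the publisher to keep historical data, which matches the ranking of publisher requirements given above.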
2.3 Level of the Conceptual Schema
In the logical architecture of the PubScribe system, the equivalent of the conceptual schema in the three-schema architecture model according to ANSI/SPARC is the brokering and processing component. From a conceptual point of view, this component is the central unit that decouples producer (publisher) and consumer (subscriber). Analogous to logical and physical data independence, independence between data production and data consumption is achieved here. To be able to realize the role concept described below, this component is further divided into modules:
• Broker core ("core broker"):
The core broker is the central control and processing component of a PubScribe instance. When a new publication arrives, the core broker coordinates the evaluation of the dependent conditions of registered subscriptions. The physical implementation of the core broker, in particular the transformation steps from incoming XML documents to an SQL representation in a relational database system, is outlined in Section 4.
• Publication module ("publisher handler"):
The publication module accepts registrations and publications from publisher components, validates them against the prescribed DTD, converts the XML document into internal data structures, and forwards these to the core broker for further processing.
• Subscription module ("subscription handler"):
The subscription handler accepts new subscriptions as well as modification and deletion requests from the user, again in the form of XML documents. Incoming requests are checked for validity and feasibility, for example for the existence of the addressed publisher components and for compliance with their maximum available publication frequency.
• Delivery module ("delivery component"):
The delivery of instantiated subscriptions in the form of XML documents is initiated by the core broker and carried out by the delivery component. Different modules perform the delivery; the delivery modes currently implemented are sending as an e-mail attachment, depositing on a web server, and writing directly to a TCP socket.
2.4 The Strict Role and Channel Concept
A fundamental extension of the PubScribe approach compared with classical 'publish/subscribe' approaches is that publisher and subscriber do not denote individual, distinct components of a system but roles. Every communication in PubScribe is thus modeled and implemented as a publication, i.e. as the answer to a previously formulated subscription. For example, the brokering component acts as a subscriber on the one hand by mapping (a large number of) incoming user subscriptions to a single generic subscription to a publisher component. On the other hand, the brokering component also takes on the role of a publisher by delivering, i.e. publishing, messages to the user after appropriate processing. Among other things, the role concept allows the following system-internal components to be integrated seamlessly:
• System time:
The system time is implemented as a publisher component of its own, which publishes the current time. As with regular publisher components, messages are only produced if subscriptions registered on them exist, i.e. the system time component only publishes the system time when the registered subscriptions require it (e.g. every 5 minutes). This procedure proves advantageous because event handling does not have to be done separately for time-driven and data-driven events (as is the case in the CQ project; Section 5) but can be done uniformly.
• Metadata:
Information about the registered publisher components and subscriptions is collected as metadata and likewise modeled as an information channel of its own. For example, when a new subscription arrives, the brokering component publishes into the metadata channel in order to update the metadata accordingly. Through a subscription to this metadata channel, every system component that needs knowledge about metadata (e.g. the subscription module, to check the existence of an information channel requested in a subscription) is informed about changes.
• System state data:
So that the system can resume after a restart with the most recent valid configuration of publishers and subscribers, every system component that depends on persistence publishes each state change into an explicit persistence channel. On restart, the system components formulate subscriptions to this channel and receive their most recently published state.
The role concept has proven to be a fundamental simplification in the implementation of the PubScribe system, since system-internal processes can be handled in the same way as processes at the application level and need no special treatment. The downside of this strictly maintained role concept, however, is that a single action entails a large number of implicit reactions. An incoming publication, for instance, results in a further publication to update the metadata. Registering a new subscription implies, in addition to the publication that updates the metadata, even further publications in which the system modules involved log their state changes by publishing into the persistence channel.
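How metadata and system state can be handled purely through publications and subscriptions may be easier to see in a small sketch. Everything here is hypothetical: the `Channel` and `SubscriptionModule` classes, the retained-message behavior of the channel, and the key names are assumptions chosen to mirror the description above, not the PubScribe implementation.

```python
class Channel:
    """Minimal information channel that retains the last message per key."""

    def __init__(self, name):
        self.name = name
        self.last = {}
        self.subscribers = []

    def publish(self, key, message):
        self.last[key] = message
        for callback in list(self.subscribers):
            callback(key, message)

    def subscribe(self, callback, replay=False):
        self.subscribers.append(callback)
        if replay:                       # hand over retained state on (re)start
            for key, message in self.last.items():
                callback(key, message)


class SubscriptionModule:
    KEY = "subscription-module"

    def __init__(self, metadata, persistence):
        self.known_channels = set()
        self.persistence = persistence
        # On start-up, subscribe to the persistence channel to recover state.
        persistence.subscribe(self._restore, replay=True)
        metadata.subscribe(self._on_metadata)

    def _restore(self, key, state):
        if key == self.KEY:
            self.known_channels = set(state)

    def _on_metadata(self, key, channel_name):
        self.known_channels.add(channel_name)
        # Every state change is itself a publication (the implicit reaction).
        self.persistence.publish(self.KEY, sorted(self.known_channels))


metadata, persistence = Channel("metadata"), Channel("persistence")
first = SubscriptionModule(metadata, persistence)
metadata.publish("broker", "stocks")
metadata.publish("broker", "time")
restarted = SubscriptionModule(metadata, persistence)   # simulated restart
```

The sketch also shows the cost mentioned in the text: each metadata publication immediately causes a second publication into the persistence channel.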
3 Processing Model
Besides the role concept, PubScribe differs from other 'publish/subscribe' approaches (Section 5) in that it offers the user not only brokering but also processing of the incoming messages. This section introduces the processing model, which is based on sequences of messages, by way of example; it does not give a formal description of the underlying algebra ([LeHR00]).
Messages and Sequences of Messages
As already mentioned, a publisher must communicate the schema of the messages it will publish to the publication module of the brokering component when it registers. Since the publication of messages implies an inherent temporal order, all data streams in the system are modeled as sequences of messages. The schema of a message, or of a sequence of messages Q, consists of two attribute sets that describe the identification part and the information part (Q = (ID, C) = ([ID1, ..., IDn], [C1, ..., Cm])).
As an example, consider a producer of stock market information that extracts data either directly from the Internet (Section 4.2) or from an upstream data warehouse. The identification part consists of the name of the security, the stock exchange, and the time. The information part contains the respective price and the trading volume:
    ([(Wertpapier, string),     // name of the security
      (Handelsplatz, string),   // name of the stock exchange, e.g. FWB
      (Zeit, datetime)          // date and time
     ],
     [(Kurs, float),            // price of the security
      (Volumen, integer)        // number of shares traded
     ])
Sample instance:
    [([‘ORACLE’, ‘FWB’, ‘29.8.2000 9:30’], [‘ 96,70’, ‘53’]),
     ([‘ORACLE’, ‘BER’, ‘29.8.2000 9:33’], [‘ 96,50’, ‘21’]),
     ([‘ORACLE’, ‘HAM’, ‘29.8.2000 9:07’], [‘ 97,00’, ‘ 8’]),
     ([‘ORACLE’, ‘FWB’, ‘29.8.2000 9:58’], [‘ 97,10’, ‘23’]),
     ([‘IBM’, ‘BER’, ‘29.8.2000 9:17’], [‘145,30’, ‘37’]),
     ([‘IBM’, ‘STU’, ‘29.8.2000 9:30’], [‘144,20’, ‘12’])
    ]
Note that the identification part may also comprise an empty set of attributes (e.g. to represent the current system time). By definition, only a single message is allowed in that case. Furthermore, the identification part has the primary key property, with the restriction that minimality is not required.
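The schema Q = (ID, C) and the key property of the identification part can be sketched as follows. The class name `MessageSequence` and its methods are invented for illustration. The sketch also shows why an empty identification part admits only one message: every message then has the same (empty) identification.

```python
class MessageSequence:
    """Sequence of messages with schema Q = (ID, C); illustrative only."""

    def __init__(self, id_attrs, content_attrs):
        self.id_attrs = tuple(id_attrs)
        self.content_attrs = tuple(content_attrs)
        self.messages = []          # kept in publication order
        self._seen = set()          # identification values already used

    def append(self, ident, content):
        ident, content = tuple(ident), tuple(content)
        if (len(ident) != len(self.id_attrs)
                or len(content) != len(self.content_attrs)):
            raise ValueError("message does not match the schema")
        if ident in self._seen:
            raise ValueError(f"identification {ident!r} is not unique")
        self._seen.add(ident)
        self.messages.append((ident, content))


stocks = MessageSequence(["Wertpapier", "Handelsplatz", "Zeit"],
                         ["Kurs", "Volumen"])
stocks.append(["ORACLE", "FWB", "29.8.2000 9:30"], [96.70, 53])
stocks.append(["ORACLE", "BER", "29.8.2000 9:33"], [96.50, 21])

system_time = MessageSequence([], ["Zeit"])   # empty identification part
system_time.append([], ["29.8.2000 9:30"])
```

A second `append` to `system_time` raises `ValueError`, because the empty identification is already taken.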
Operators on Sequences of Messages
The PubScribe system provides a variety of operators on sequences of messages. The operators are briefly outlined below and illustrated by examples. A formal description is again omitted.
• Selection:
The selection operator returns only those messages that satisfy a given selection condition. A selection is only allowed on attributes of the identification part†.
• Join operation:
To combine messages from different channels, a natural join over the intersection of the attribute sets of the two join partners is used.
• Scalar operator:
Applying a scalar operator allows the information part of a sequence to be modified by scalar functions. A scalar function receives the attributes of both the information part and the identification part as parameters. Besides arithmetic and Boolean functions, the set of scalar functions also includes calendar functions for processing time values.
Example: The schema
([ID], [Tag=day(Zeit), Kurs=id(Kurs), Umsatz=mult(Kurs, Volumen)])
shows the application of three scalar functions: computing the day (Tag), carrying over the share price (Kurs) unchanged, and computing the turnover (Umsatz). The information part is thus extended by an attribute for the current day.
• Shift operator:
The shift operator moves an attribute from the information part into the identification part. This does not impair the uniqueness of the identification.
Example: Continuing the previous example, the day is taken over into the identification part (in addition to the time):
([ID ∪ {Tag}], [Kurs, Umsatz])
• Grouping:
The grouping operator builds partitions with equal attribute values for explicitly specified identification attributes and computes either the sum, the minimum, or the maximum over each partition. Identification attributes that are aggregated over are dropped; uniqueness is preserved.
Example: The following schema reflects the result of a grouping operator that sums the turnover per day (Aktie denotes the security, Gesamtumsatz the total turnover):
([{Aktie, Tag}], [Gesamtumsatz=SUM(Umsatz)])
† This restriction can, however, be relaxed by combining the scalar operator and the shift operator.
• Windowing operator:
In contrast to the grouping operator, the windowing operator determines for each message an explicitly specified partition according to a given order, to which aggregation functions are then applied.
Example: The following schema shows the result of a windowing operator that computes, for each day, the three-day total turnover from the preceding, the current, and the following turnover value, indicated by (-1,1) in the operator parameters:
([ID], [ DreiTagesUmsatz=SUM(Umsatz:(-1,1)) ])
Note that, unlike the grouping operator, the windowing operator does not condense the data but produces one result message for each incoming message.
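The selection, scalar, shift, and grouping operators from the list above can be sketched on the stock example as follows. This is a simplified illustration, not the PubScribe algebra: a message is modeled as a pair of dictionaries (identification part, information part), the function names are invented, times are written as ISO strings so that the day is a simple prefix, and the security attribute is called Wertpapier as in the schema of the stock channel.

```python
def select(seq, predicate):
    """Selection: the predicate may only look at the identification part."""
    return [(i, c) for i, c in seq if predicate(i)]


def scalar(seq, **functions):
    """Scalar operator: each function sees ID and information attributes."""
    return [(i, {name: f({**i, **c}) for name, f in functions.items()})
            for i, c in seq]


def shift(seq, attr):
    """Shift: move one attribute from the information part into the ID."""
    result = []
    for i, c in seq:
        rest = dict(c)
        value = rest.pop(attr)
        result.append(({**i, attr: value}, rest))
    return result


def group(seq, keep, out, attr, agg=sum):
    """Grouping: partition by the kept ID attributes and aggregate attr."""
    partitions = {}
    for i, c in seq:
        key = tuple(i[k] for k in keep)
        partitions.setdefault(key, []).append(c[attr])
    return [(dict(zip(keep, key)), {out: agg(values)})
            for key, values in partitions.items()]


raw = [("ORACLE", "FWB", "2000-08-29 09:30", 96.70, 53),
       ("ORACLE", "BER", "2000-08-29 09:33", 96.50, 21),
       ("ORACLE", "HAM", "2000-08-29 09:07", 97.00, 8),
       ("ORACLE", "FWB", "2000-08-29 09:58", 97.10, 23),
       ("IBM", "BER", "2000-08-29 09:17", 145.30, 37),
       ("IBM", "STU", "2000-08-29 09:30", 144.20, 12)]
stocks = [({"Wertpapier": w, "Handelsplatz": h, "Zeit": z},
           {"Kurs": k, "Volumen": v}) for w, h, z, k, v in raw]

step1 = scalar(stocks,
               Tag=lambda a: a["Zeit"][:10],              # day(Zeit)
               Kurs=lambda a: a["Kurs"],                  # id(Kurs)
               Umsatz=lambda a: a["Kurs"] * a["Volumen"])  # mult(Kurs, Volumen)
step2 = shift(step1, "Tag")
daily = group(step2, keep=("Wertpapier", "Tag"),
              out="Gesamtumsatz", attr="Umsatz")
```

The chain `scalar`, `shift`, `group` also illustrates the footnote: an information attribute becomes usable for selection or grouping once it has been shifted into the identification part.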
The set of operators is closed and preserves unique identifiability. Besides classical operators such as selection and join, the grouping and windowing operators in particular provide a powerful means of meeting requirements from the data warehouse area.
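The windowing operator with the parameters (-1,1) can be sketched as a sliding sum that yields one result per input value. The function name is hypothetical, and truncating the window at the beginning and end of the sequence is an assumption; the text does not say how boundaries are treated.

```python
def window_sum(values, before, after):
    """Sum over positions k+before .. k+after for every position k.

    The window is cut off at the sequence boundaries (an assumption).
    """
    results = []
    for k in range(len(values)):
        lo = max(0, k + before)
        hi = min(len(values), k + after + 1)
        results.append(sum(values[lo:hi]))
    return results


daily_turnover = [10, 20, 30, 40]
three_day = window_sum(daily_turnover, -1, 1)   # one result per input value
```

Unlike grouping, the output has the same length as the input, which is the "no condensation" property noted for the windowing operator.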
4 PubScribe Implementation
This section takes a closer look at the prototype implementation of PubScribe. PubScribe is based on Oracle8i and is implemented entirely in Java as an additional layer on top of the database system. The current prototype already provides the full functionality of a brokering component. For lack of a suitable user interface for freely specifying subscriptions, Section 4.2 presents a restricted demonstration scenario based on subscription templates.
4.1 Layered Architecture of the Core System
As already described in Section 2.3, a brokering component consists of several modules, such as the subscription, publication, and delivery modules. The central component of the core system manages the entire life cycle of a subscription. Figure 4.1 gives an overview of the core system from Figure 2.1, subdivided into the layers described below, which map the XML messages and the subscriptions specified in XML onto a relational database system.
Data Processing Unit
When a subscriber registers a new subscription with the subscription module, the core system causes a corresponding operator tree for the start condition, as explained in Section 3, to be set up in the Data Processing Unit (DPU) and integrated into the existing global subscription DAG. The DPU tries to integrate the individual subscription trees as efficiently as possible, i.e. without redundancy and with a high degree of reuse. Currently, it detects both direct matches of subtrees and situations in which compensation operations are needed to achieve full substitutability ([ZCL+00]).
As the guardian of the global operator graph of the system, the DPU also decides which nodes are physically realized, i.e. materialized (e.g. those with the largest fan-out), and which are implemented only virtually. By definition, every target node that reflects the result of a subscription body is materialized, so that, for example, it can be delivered to the subscriber.
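The reuse of identical subtrees in a global operator DAG can be illustrated with a structural lookup table. This sketch covers only the direct-match case; the compensation operations mentioned above are not modeled. The class `OperatorDag`, the tuple encoding of operators, and the use of fan-out as the only materialization criterion are assumptions made for illustration.

```python
class OperatorDag:
    """Global operator graph that shares structurally identical subtrees."""

    def __init__(self):
        self._nodes = {}     # (operator, child ids) -> node id
        self.fanout = {}     # node id -> number of parent nodes

    def add(self, op, *children):
        """Return the id of an existing identical node or create a new one."""
        key = (op, children)
        if key in self._nodes:
            return self._nodes[key]          # direct match: reuse the subtree
        node_id = len(self._nodes)
        self._nodes[key] = node_id
        self.fanout[node_id] = 0
        for child in children:
            self.fanout[child] += 1
        return node_id

    def __len__(self):
        return len(self._nodes)


dag = OperatorDag()
channel = dag.add(("channel", "stocks"))

# Subscription 1: grouping over the turnover per message.
turnover_1 = dag.add(("scalar", "Umsatz=mult(Kurs, Volumen)"), channel)
grouped = dag.add(("group", "Wertpapier, Tag"), turnover_1)

# Subscription 2: windowing over the same turnover expression.
turnover_2 = dag.add(("scalar", "Umsatz=mult(Kurs, Volumen)"), channel)
windowed = dag.add(("window", "(-1,1)"), turnover_2)

# Candidate for materialization: the node with the largest fan-out.
candidate = max(dag.fanout, key=dag.fanout.get)
```

Both subscriptions end up sharing one scalar node, which then has the largest fan-out in this small graph.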
Operation Clustering Unit and Staging Area
The DPU passes every operator node to be materialized, together with its dependent subtree, to the Operation Clustering Unit (OCU) below it. This layer combines as many of the virtual nodes as possible into a single operator that can be mapped directly to SQL. For each of these composite operators, the subsequent staging area creates exactly one view in the database system.
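Combining several virtual operators into one SQL statement and creating one view for it might look like the following sketch. It uses SQLite from the Python standard library as a stand-in for Oracle8i, and the helper `fuse_to_view`, the table name `stocks`, and the view name `tagesumsatz` are hypothetical. The view fuses the scalar, shift, and grouping steps of the turnover example into a single query.

```python
import sqlite3


def fuse_to_view(name, source, columns, where=None, group_by=()):
    """Collapse a chain of virtual operators into one CREATE VIEW statement."""
    sql = f"CREATE VIEW {name} AS SELECT {', '.join(columns)} FROM {source}"
    if where:
        sql += f" WHERE {where}"
    if group_by:
        sql += f" GROUP BY {', '.join(group_by)}"
    return sql


con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE stocks (Wertpapier TEXT, Handelsplatz TEXT, "
            "Zeit TEXT, Kurs REAL, Volumen INTEGER)")
con.executemany("INSERT INTO stocks VALUES (?, ?, ?, ?, ?)", [
    ("ORACLE", "FWB", "2000-08-29 09:30", 96.70, 53),
    ("ORACLE", "BER", "2000-08-29 09:33", 96.50, 21),
    ("IBM", "BER", "2000-08-29 09:17", 145.30, 37),
    ("IBM", "STU", "2000-08-29 09:30", 144.20, 12),
])

view_sql = fuse_to_view(
    "tagesumsatz", "stocks",
    columns=["Wertpapier",
             "substr(Zeit, 1, 10) AS Tag",             # scalar + shift
             "SUM(Kurs * Volumen) AS Gesamtumsatz"],   # scalar + grouping
    group_by=["Wertpapier", "substr(Zeit, 1, 10)"])
con.execute(view_sql)

rows = con.execute("SELECT Wertpapier, Gesamtumsatz FROM tagesumsatz "
                   "ORDER BY Wertpapier").fetchall()
```

The helper builds SQL by string concatenation, which is acceptable for a sketch with fixed inputs but would need proper identifier handling in a real system.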
[Figure: publication module, delivery component, and subscription module attached to the broker core, which is layered into Data Processing Unit, Operation Clustering Unit, and Staging Area on top of the RDBMS.]
Fig. 4.1: Layered architecture of the PubScribe core system
When the broker core learns of a new publication via the publication module, it causes the new information to be inserted into the corresponding relations of the staging area. The messages recorded in this way are propagated upwards in the operator graph at the database level by a trigger mechanism. Primarily, those subtrees of the subscriptions are evaluated that represent subscription conditions (Section 3). If, for example, the delivery condition is fulfilled, the associated data part of the subscription is evaluated as well by updating the corresponding tables.
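Trigger-based propagation from a staging relation into a materialized node can be sketched as follows, again with SQLite standing in for Oracle8i. The table names, the trigger name, and the incremental update of a daily turnover table are illustrative assumptions; the actual PubScribe triggers are not described in the text.

```python
import sqlite3


def open_staging_area():
    con = sqlite3.connect(":memory:")
    con.executescript("""
        -- Staging relation for one information channel.
        CREATE TABLE stocks (
            Wertpapier TEXT, Handelsplatz TEXT, Zeit TEXT,
            Kurs REAL, Volumen INTEGER,
            PRIMARY KEY (Wertpapier, Handelsplatz, Zeit));

        -- Materialized node of the operator graph.
        CREATE TABLE tagesumsatz (
            Wertpapier TEXT, Tag TEXT, Gesamtumsatz REAL,
            PRIMARY KEY (Wertpapier, Tag));

        -- Propagate each new message upwards.
        CREATE TRIGGER propagate AFTER INSERT ON stocks
        BEGIN
            INSERT OR IGNORE INTO tagesumsatz
                VALUES (NEW.Wertpapier, substr(NEW.Zeit, 1, 10), 0);
            UPDATE tagesumsatz
               SET Gesamtumsatz = Gesamtumsatz + NEW.Kurs * NEW.Volumen
             WHERE Wertpapier = NEW.Wertpapier
               AND Tag = substr(NEW.Zeit, 1, 10);
        END;
    """)
    return con


con = open_staging_area()
con.executemany("INSERT INTO stocks VALUES (?, ?, ?, ?, ?)", [
    ("IBM", "BER", "2000-08-29 09:17", 145.30, 37),
    ("IBM", "STU", "2000-08-29 09:30", 144.20, 12),
])
total = con.execute("SELECT Gesamtumsatz FROM tagesumsatz "
                    "WHERE Wertpapier = 'IBM'").fetchone()[0]
```

The broker only inserts into the staging relation; keeping the dependent table current is left entirely to the database.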
Communication between the Components and their Modules
Information is exchanged between the individual PubScribe components, and between the modules described above, exclusively via the Simple Object Access Protocol (SOAP, [BEK+00]). This allows the system to be distributed at both the component and the module level‡. SOAP provides a way to perform remote calls based on XML ([Tolk99]) and HTTP without having to open special ports, as is necessary with other transport mechanisms such as CORBA. In particular, SOAP messages are therefore not stopped by firewalls. In PubScribe, the payload itself is also formulated in XML, which permits detailed semantics and greatly simplifies the operation of a subscription system. There is also a well-founded hope that XML will become widespread in industry and thus serve, as it were, as the Esperanto of the information industry.
‡ In the actual implementation, a compiler option allows SOAP to be dispensed with between the modules of a component in favor of direct procedure calls.
Fig. 4.2: Registration of the Stockwatch subscription
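The idea of carrying an XML payload inside a SOAP message can be sketched with the Python standard library. The envelope namespace is the one defined by SOAP 1.1; the payload element names (`publication`, `message`) and the helper `wrap_in_soap` are hypothetical, since the PubScribe DTDs are not shown here. Sending the document over HTTP is left out.

```python
import xml.etree.ElementTree as ET

SOAP_ENV = "http://schemas.xmlsoap.org/soap/envelope/"   # SOAP 1.1 envelope
ET.register_namespace("SOAP-ENV", SOAP_ENV)


def wrap_in_soap(payload):
    """Put an XML payload element into a SOAP envelope and serialize it."""
    envelope = ET.Element(f"{{{SOAP_ENV}}}Envelope")
    body = ET.SubElement(envelope, f"{{{SOAP_ENV}}}Body")
    body.append(payload)
    return ET.tostring(envelope, encoding="unicode")


# Hypothetical publication payload; element names are not from PubScribe.
publication = ET.Element("publication", {"channel": "stocks"})
message = ET.SubElement(publication, "message")
ET.SubElement(message, "Wertpapier").text = "IBM"
ET.SubElement(message, "Kurs").text = "145.30"

document = wrap_in_soap(publication)
```

Because the result is an ordinary XML document, it can be transported in the body of an HTTP POST, which is why no additional ports are needed.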
4.2 "Stockwatch" Demonstration
The concepts and the architecture presented in this paper are implemented in a prototype that offers several possible scenarios at http://www.pubscribe.org/.
As an example of a subscription on a database, the log entries of a web server are recorded in an Oracle8i system. Subscribers can thus be informed about how frequently certain web pages are accessed.
A more complex demonstration is the Stockwatch subscription, whose setup is shown in Figure 4.2: in the background window, a token identifying the user must first be entered**. The start condition is not set, so the subscription takes effect immediately. The stop condition is set to a date at the end of September. In the foreground window, the delivery condition is set to every 24 hours. Of the three possible query templates, the middle one, which contains a window query over the IBM share, has been selected. The user can submit the subscription created in this form from any web browser and is then informed by e-mail as requested. Output to (WAP) mobile phones, i.e. a conversion from XML to WML, is easy to imagine.
** Such tokens are available free of charge on the web site and serve only to protect the user.
5 Related Work and Projects
This section gives a brief overview of notification and information dissemination services that are related to the system presented here and connected with 'publish/subscribe'. Only content-based systems are deliberately considered, which is why in particular the topic-oriented services popular on the Internet, such as PointCast, Marimba, or BackWeb, are omitted. For a more detailed overview, see for example [Hack97]. The approaches presented are divided into system-level and application-level approaches.
5.1 System-Level Approaches
The Event Service is one of many CORBA services (COSS, [OMG00]). As is usual for CORBA specifications, its definition is very general and therefore flexible. Suppliers and consumers exchange events with one another following push, pull, or mixed semantics ([AcFZ97]). Depending on how the event messages are implemented, CORBA allows both content-based and topic-based subscriptions.
The Scalable Internet Event Notification Architecture (SIENA, [Birm93]) realizes a scalable event service for highly distributed systems. Notifications are sets of attributes, each consisting of a name, a data type, and the actual value. In SIENA, subscriptions are combinations of predicates (event filters) on attributes.
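The notion of a notification as a set of typed attributes and a subscription as a combination of predicates can be illustrated as follows. This is not SIENA's API or filter syntax; the function `matches`, the triple format, and the restriction to a conjunction of comparison predicates are assumptions made for the sketch.

```python
import operator

OPERATORS = {"=": operator.eq, "<": operator.lt, ">": operator.gt,
             "<=": operator.le, ">=": operator.ge}


def matches(notification, filters):
    """Check a notification against a conjunction of attribute predicates.

    notification: attribute name -> (type, value)
    filters: iterable of (name, operator symbol, constant) triples
    """
    for name, op, constant in filters:
        if name not in notification:
            return False
        attr_type, value = notification[name]
        # The constant must have the attribute's declared type.
        if not isinstance(constant, attr_type):
            return False
        if not OPERATORS[op](value, constant):
            return False
    return True


notification = {"Wertpapier": (str, "IBM"), "Kurs": (float, 145.30)}
wanted = [("Wertpapier", "=", "IBM"), ("Kurs", ">", 140.0)]
```

A filter of this kind only selects whole notifications; it does not combine or aggregate them, which is the difference to the processing model of Section 3.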
The Elvin system ([FMK+99]) consists only of servers: publishers are special servers that do not use the subscription machinery, while subscribers are servers that ignore any subscription requests. Subscriptions are formulated as Boolean expressions in a specially developed language that supports, among other things, regular expressions. Example applications of Elvin are Tickertape, a configurable news ticker, and Breeze, an event-driven workflow management system.
5.2 Application-Level Approaches
In contrast to the system-level approaches just described, full-fledged 'publish/subscribe' applications are more readily comparable to the PubScribe system presented here. The Stanford Information Filtering Tool (SIFT, [YaGa95]), for example, is a wide-area information delivery service that supports full-text filtering and positions itself in the area of information retrieval ([FoDu92]). Subscriptions are specified by defining an information profile with various parameters concerning repetition frequency or amount of information. Two different filter models compute the relevance value of new messages and deliver them if a defined (relevance) threshold is exceeded.
Gryphon ([SBC98]) is a scalable message distribution system from IBM that, like PubScribe, evaluates subscriptions by means of the information flow in a filter graph. The Gryphon project concentrates in particular on optimizing this graph and on distributing the central broker over several instances.
The Continual Query project (CQ, [PuLi98]) is a 'publish/subscribe' system for processing distributed, heterogeneous information sources. A subscription consists of an SQL query, enriched with a time-driven or event-driven delivery condition and a stop condition. Subscriptions over several sources are possible, which is exploited by a number of associated projects.
6 Summary and Outlook
The aim of this paper is, on the one hand, to show the need for a subscription system in the context of a data warehouse system by addressing future requirements and pointing out possible solutions based on the 'publish/subscribe' approach. On the other hand, the paper presents the PubScribe project as an example of a 'publish/subscribe' system. Its logical architecture is based on the three-schema architecture model. The strictly maintained role model, in which physical components take on different roles depending on the situation, allows application-oriented as well as system-internal processes to be modeled and implemented uniformly. The PubScribe system is implemented as a prototype.
Current work on the PubScribe core deals with extending the role concept to a fully distributed environment. Further work concerns the data provision side and the development of a user tool for specifying arbitrary subscriptions.
References
AcFZ97 Acharya, S.; Franklin, M.; Zdonik, S.: Balancing Push and Pull for Data Broadcast. In: SIGMOD Conference 1997, pp. 183-194
AHLS00 Albrecht, J.; Hümmer, W.; Lehner, W.; Schlesinger, L.: Adaptive Praeaggregation in multidimensionalen Datenbanksystemen. In: 8. GI-Fachtagung Datenbanksysteme in Büro, Technik und Wissenschaft (BTW'99, Freiburg, 1.-3. März 2000), pp. 97-114
BaGü00 Bauer, A.; Günzel, H.: Data Warehouse Systeme. dPunkt-Verlag, Heidelberg, 2000
Birm93 Birman, K.P.: The Process Group Approach to Reliable Distributed Computing. In: Communications of the ACM, 36(1993)12, pp. 36-53
BEK+00 Box, D.; Ehnebuske, D.; Kakivaya, G.; Layman, A.; Mendelsohn, N.; Nielsen, H.F.; Thatte, S.; Winer, D.: SOAP: Simple Object Access Protocol, 2000 (available electronically at: http://msdn.microsoft.com/workshop/xml/general/soapspec.asp)
Conr97 Conrad, S.: Föderierte Datenbanksysteme. Konzepte der Datenintegration. Springer Verlag, Berlin, Heidelberg, 1997
FMK+99 Fitzpatrick, G.; Mansfield, T.; Kaplan, S.; Arnold, D.; Phelps, T.; Segall, B.: Instrumenting and [...] Workshop on Community Knowledge, 1999
FoDu92 Foltz, P.W.; Dumais, S.T.: Personalized Information Delivery: An Analysis of Information Filtering Methods. In: Communications of the ACM, 35(1992)12, pp. 51-60
Hack97 Hackathorn, R.: Publish or Perish. In: Byte Magazine, September 1997 (available electronically at: http://www.byte.com/art9709/sec6/art1.htm)
Inmo96 Inmon, W.H.: Building the Data Warehouse, 2nd edition. John Wiley & Sons, New York, Chichester, Brisbane, Toronto, Singapore, 1996
OMG00 N.N.: Event Service, Version 1.0. In: Corba Services Specifications, Object Management Group, 2000
IBM00 N.N.: MQSeries: Message Oriented Middleware. Whitepaper (available electronically at: http://www.ibm.com/software/ts/mqseries/library/whitepapers/mqover/)
Orac99 N.N.: Oracle 8i Application Developer’s Guide - Advanced Queuing. In: Oracle 8i Server Documentation, 1999
Powe96 Powell, D.: Group Communication. In: Communications of the ACM 39(1996)4, pp. 59-97
PuLi98 Pu, C.; Liu, L.: Update Monitoring: The CQ Project. In: The 2nd International Conference on Worldwide Computing and Its Applications, 1998, pp. 396-411
SBC98 Strom, R.; Banavar, G.; Chandra, T.; Kaplan, M.; Miller, K.; Mukherjee, B.; Sturman, D.; Ward, M.: Gryphon: An Information Flow Based Approach to Message Brokering. In: International Symposium on Software Reliability Engineering '98, Fast Abstract, 1998
TGNO92 Terry, D.B.; Goldberg, D.; Nichols, D.; Oki, B.M.: Continuous Queries over Append-Only Databases. In: SIGMOD Conference 1992, pp. 321-330
Tolk99 Tolksdorf, R.: XML und darauf basierende Standards. Die neue Auszeichnungssprache des Web. In: Informatik Spektrum, 22(1999)6, pp. 407-421
TsKl78 Tsichritzis, D.C.; Klug, A.: The ANSI/X3/SPARC DBMS framework report of the study group on database management systems. In: Information Systems 3(1978)3, pp. 173-191
LeHR00 Lehner, W.; Hümmer, W.; Redert, M.: Building An Information Marketplace using a Content and Memory based Publish/Subscribe System. Technical report, Institut für Informatik, Universität Erlangen-Nürnberg, submitted for publication, 2000
YaGa95 Yan, T.W.; Garcia-Molina, H.: SIFT - a Tool for Wide-Area Information Dissemination. In: USENIX Winter 1995, pp. 177-186
ZCL+00 Zaharioudakis, M.; Cochrane, R.; Lapis, G.; Pirahesh, H.; Urata, M.: Answering Complex SQL Queries Using Automatic Summary Tables. In: SIGMOD Conference 2000, pp. 105-116
ON THE INITIATION AND NEGOTIATION OF CONTRACTS IN EBUSINESS
H. Wedekind, W. Hümmer, W. Lehner
{wedekind, huemmer, lehner}@informatik.uni-erlangen.de
Abstract. Model and system support for electronic commerce, usually referred to as eBusiness, must be seen as a central challenge both in research and from a commercial perspective. This paper focuses on the two phases of contract initiation and contract negotiation. It discusses how an N:M relationship in the partner-finding phase can be turned seamlessly into a 1:1 relationship for the concrete negotiation of a contract without losing the contents of the contract. The reconstruction of a contract is carried out in the context of mereology (the logic of the part-whole relation). Nested structured parts of the future contract from the initiation phase are regarded as the subject of a dialogical negotiation. For the negotiation phase, contract structures are extended by a negotiation sequence and an implication.
1 Introduction
Electronic commerce has been a central topic in computer science for years, although it has only recently been carried over from economics into core computer science. Several classes of eBusiness are commonly distinguished ([Merz99a]). “Business-To-Consumer” (B2C) primarily addresses the reproduction of classical retail sales by electronic means; the bookseller Amazon is the example par excellence. A special case of B2C is the situation in which the customer is the public sector, usually called B2A (“Business-To-Administration”). Of interest in our context is the category in which companies trade with one another (B2B, “Business-To-Business”). Although B2B has already been pushed forward vigorously, the techniques used so far are open to criticism.
The first point of criticism is that B2B currently concentrates on carrying out the actual business transaction, while the opening and conduct of contract negotiations are neglected. Merz et al. ([Merz99b], p. 340) do not treat contract negotiations in terms of content but only as the “joint editing of a contract as a structured document”. When companies cooperate “in the large”, framework contracts are usually negotiated in the traditional way and then implemented in the context of eBusiness. A contract negotiation, if still needed at all, is reduced to setting the parameters of variables in the previously negotiated framework contract. Auctions are an example of contract negotiations “in the small”. Here, too, a contract exists between buyer and seller, but it is fixed in advance by the auction house. A specific configuration of the contract is not possible (or only informally).
The second point of criticism, which is to be mitigated together with support for conducting contract negotiations, is the lack of support for finding contract partners with a direct transition to a contract specification. Classical electronic marketplaces are far from sufficient for this.
The aim of the approach sketched in this paper is to provide a model for describing contracts that can be used equally for initiation and for negotiation. We regard a content-based merging of the phase of finding business partners with the negotiation of concrete contracts as fundamental for B2B. Finding partners dynamically and conducting negotiations is particularly important for creating virtual enterprises, that is, enterprises that exist only for the duration of an order or a project.
[Figure labels: mediation, contract, negotiation scope; information phase, negotiation phase, settlement phase]
Fig. 2.1: Process-oriented perspective of contract negotiations in eBusiness (after [Merz99a])
Outline
The following section examines eBusiness from several perspectives: process-oriented, system-level, and argumentation-theoretic. Section 3 addresses the phase of initiating contract negotiations, based on the ‘Publish/Subscribe’ processing paradigm. Contract offers are disseminated as publications, and requests are modelled as subscriptions. Both sides refer to the same contract pattern, defined over mereological structures (part-whole relationships). Section 4 extends this view: during contract negotiation the contract parts are extended by sequence and implication, which logically reconstructs a dialogue. The paper closes with a summary and an outlook on how the proposed framework can be implemented and released for use.
2 Aspects of eBusiness
As already indicated, eBusiness is not a narrow field; it has to be considered from several points of view. Three perspectives (process-oriented, system-level, and argumentation-theoretic) are introduced and discussed below.
Process-Oriented Aspects
The first perspective is the process-oriented one. Work on eBusiness ([Merz99a], [Merz99b]) usually identifies three phases in carrying out a business transaction (Figure 2.1). In the first phase, the information phase, the product catalogue of a supplier is at the centre of the interaction. Within an electronic marketplace the business partners are then brought together in a targeted manner, followed by the negotiation phase, at the end of which stands the contract. The third phase serves the actual business, i.e. its settlement.
It becomes apparent that a seamless transition from initiation, represented in Figure 2.1 by the information phase and the mediation step, to contract negotiation, as proposed here, is essential in order to avoid conversions at both the schema and the instance level in every phase.
System-Level Aspect
The central question from the system-level point of view is how an N:M relationship between suppliers and customers in the contract initiation phase (‘match making’) can be reduced to (potentially very many) 1:1 relationships between contract partners. In other words: how is the contract partner found, and how is a seamless transition between the phases achieved? This paper pursues a two-track approach.
Figure 2.2 sketches the layers that have to be considered from the system-level perspective. Layers 1 and 2 reflect different communication primitives and are not discussed further. It should merely be noted that higher layers can be realized by different services. For example, multicast* connections are a sensible alternative to unicast† connections as a communication service for ‘Publish/Subscribe’ systems. The discussion below concerns the two “towers” of layers 3 and 4 for the phases of contract initiation and contract negotiation, which represent the two tracks mentioned above.
On the one hand, the processing paradigm changes from ‘Publish/Subscribe’ to the classical ‘Request/Response’ paradigm, which is standardized, among others, in the “Information and Content Exchange” (ICE) protocol ([Brod00], layer 3 in Figure 2.2). As explained in detail in Section 3, in the initiation phase suppliers publish their products in a catalogue, which is registered as a channel or topic at a logically centralized “mediation point” and released for subscription‡. Interested parties in turn specify a subscription on these catalogues and are notified whenever the electronic catalogue changes or gains new entries. The negotiation of concrete contracts then takes place on the basis of the classical ‘Request/Response’ processing paradigm.
* Multicast: communication between one sender and several defined receivers
† Unicast: communication between one sender and one receiver
‡ to subscribe (German “subskribieren”, according to the Duden dictionary): to commit oneself to purchasing, at a later date, a printed work that has not yet been published in full. “Printed work” has to be understood more broadly here, in the sense of any media representation.
[Figure labels: initiation and negotiation of contracts; layer 4: structural content description (left tower), dialogical content processing (right tower); layer 3: publish / subscribe, request / response; layers 2 and 1: send / receive; broadcast, multicast, unicast]
Fig. 2.2: System-level perspective of contract negotiations in eBusiness
In the right tower of layer 4, the purely structural representation of parts of potential contracts (left tower) is enriched by dialogical components for processing the content of a contract. Section 3.4 presents the “Information and Content Exchange” (ICE) protocol as an example of content handling and discusses its strengths and weaknesses.
Argumentation-Theoretic Aspect
A further interesting perspective arises when the phases of contract initiation and negotiation are viewed from the standpoint of argumentation theory. The Aristotelian theory of argumentation, for example, is already divided into three levels (Figure 2.3):
• The lowest level, rhetoric, is party- and context-variant ([Geth80]). Rhetoric is the art of speaking or, more generally, the art of presentation. Advertising and lavish self-presentation belong to this category. Rhetoric therefore plays no role in our discussion.
• The middle level, topics, is already party-invariant but still context-variant or, put differently, situation-variant. Besides the subject, the shared horizon of understanding also plays a role in topics. The initiation of a contract negotiation takes place on this level, since the subject is known but the parties are not yet.
• The top level, logic, is finally both party- and context-invariant. As explained in Section 4, the material (content-related) implication belongs to this level, whereas the sequence of processing still belongs to the level of topics, since negotiations require that an order be observed for the thematic understanding and memorization (topics**) of the individual parts.
Level 3, Logic: party-invariant, context-invariant (negotiation)
Level 2, Topics: party-invariant, context-variant (initiation/negotiation)
Level 1, Rhetoric: party-variant, context-variant (not of interest here)
Fig. 2.3: Argumentation-theoretic aspect of contract negotiations in eBusiness
3 Initiation Based on ‘Publish/Subscribe’
This section explains the phase of initiating contract negotiations on the basis of ‘Publish/Subscribe’ systems. It becomes clear that only structural elements are assumed as the subject of initiation in this phase. In Section 4 these building blocks are then supplemented by the two dialogical components, sequence and implication, to make contract processing possible.
3.1 The Basic Idea of ‘Publish/Subscribe’
With Publish/Subscribe, producers and consumers are decoupled from each other by an intermediate subscription system. Producers can publish freely and deliver their results to the subscription system. Consumers, in turn, need to subscribe only once and are then regularly supplied with new information (Figure 3.1). Communication between producers and consumers is thus only indirect and asynchronous.
Another communication paradigm that is widespread in data processing is Request/Response: a client sends its request directly to a selected server, which performs the corresponding processing and returns the result to the waiting client. Here a consumer must therefore know the producer of the desired information and contact it directly. The coupling between the two parties is very tight and synchronous in this case, i.e. the client is usually blocked until the server has completed its processing.
[Figure labels: Publisher (editor/reporter), publish; broker / subscription management system (publishing house); subscribe, delivery; Subscriber (reader/subscriber)]
Fig. 3.1: Simplified Publish/Subscribe architecture
** Topics (Greek topos = place, object, subject) always has a mnemonic (memory-supporting) component as well. “To memorize a chain of thoughts, the mnemonist mentally associates its individual points with particular places arranged in a fixed sequence; afterwards he only needs to walk through the places in his mind in order to reproduce the chain of thoughts attached to them.” (Historisches Wörterbuch der Philosophie, vol. 10, entry “Topik”)
Publish/Subscribe can be divided into subject-based and content-based variants. In subject-based subscription systems, incoming publications are classified by subject, and subscribers choose from a predefined set of subject channels; any further selection has to be made by the subscriber. In content-based systems, the consumer can state his wishes more precisely by means of predicates. The subscription system applies these predicates to the content of every incoming publication and notifies the subscriber if they are satisfied. Naturally, content-based subscription systems are considerably more complex to process.
A hybrid approach seems most promising in practice: subscribers select the desired information from predefined channels and define content-based conditions within such a subject. The PubScribe system ([LeHR00]), for example, follows exactly this adaptive approach.
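The hybrid scheme can be illustrated with a minimal Python sketch. It is not the PubScribe implementation; the class HybridBroker, its methods, and the sample publications are hypothetical names chosen only for illustration. A subscriber picks a channel (subject-based) and may attach a predicate that is applied to each publication on that channel (content-based).

```python
from collections import defaultdict
from typing import Callable, Dict, List, Tuple

Publication = dict
Predicate = Callable[[Publication], bool]


class HybridBroker:
    """Toy subscription system: channels plus optional content predicates."""

    def __init__(self) -> None:
        # channel name -> list of (subscriber name, predicate)
        self._subs: Dict[str, List[Tuple[str, Predicate]]] = defaultdict(list)
        # subscriber name -> publications delivered so far
        self.inbox: Dict[str, List[Publication]] = defaultdict(list)

    def subscribe(self, subscriber: str, channel: str,
                  predicate: Predicate = lambda pub: True) -> None:
        """Register once; the default predicate accepts everything on the channel."""
        self._subs[channel].append((subscriber, predicate))

    def publish(self, channel: str, publication: Publication) -> None:
        """Deliver the publication to every subscriber whose predicate holds."""
        for subscriber, predicate in self._subs[channel]:
            if predicate(publication):
                self.inbox[subscriber].append(publication)


broker = HybridBroker()
broker.subscribe("alice", "gearboxes")                     # subject-based only
broker.subscribe("bob", "gearboxes",
                 lambda pub: pub["type"] == "automatic")   # with content predicate
broker.publish("gearboxes", {"type": "manual", "price": 900})
broker.publish("gearboxes", {"type": "automatic", "price": 1400})
```

The publisher never addresses a subscriber directly, which mirrors the indirect, asynchronous coupling described above.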
3.2 Electronic Marketplaces for Contract Initiation
In the Business-to-Business (B2B) area, marketplaces are becoming increasingly attractive. Their task is to bring potential producers and their possible consumers together with the aim of concluding a contract. As described in Section 2, the point is to move from an N:M relationship to a 1:1 relationship. To do so, the participants have to state, in the form of contract schemas, what they expect of each other and what each has to offer. This initiation phase can adequately be realized as a ‘Publish/Subscribe’ system, as explained below. Two essential assumptions are made:
• The contract schemas are configurable, i.e. (potential) contract partners can shape them in a dialogue until a version accepted by both sides is reached.
• The “Closed-World Assumption” holds, i.e. the contract partners operate in a closed system to which nothing new is added. This constraint is discussed in more detail in the following section.
K1: Technischer Teil (technical part)
    ts1: Gehäuseoberteil (upper housing part)
    ts2: Gehäuseunterteil (lower housing part)
    A1: Alternative Getriebetyp (alternative gearbox type)
        ts3: Automatisches Getriebe (automatic gearbox)
        ts4: Manuelles Getriebe (manual gearbox)
    A2: Alternative Ölpumpentyp (alternative oil pump type)
        ts5: Ölpumpentyp (oil pump type)
        Null: keine Ölpumpe (no oil pump)
Fig. 3.3: Technical part of a contract schema
Offer Catalogue of a Publisher
A manufacturer may, for example, offer various types of mechanical gearboxes. It publishes this capability on the corresponding channel of a generally known marketplace and thus acts as the publisher in Figure 3.1. In doing so it issues detailed commercial data (price, quantity, etc.) as well as technical data (exact design of the gearboxes, variants, etc.) in the form of a configurable contract schema. This schema can be reconstructed in a natural way by mereological means.
The basic granules of contract schemas are text pieces (ts), represented graphically by rectangles. They hold arbitrary content (e.g. text, images, ...) that is not structured any further. Individual text pieces are linked by connectors (Figure 3.2): conjunctions K (∧) and alternatives A (exclusive or, ∇). Alternatives are divided into mandatory and optional alternatives. Forming an optional alternative, i.e. an optional “extra”, requires an empty text piece, the null schema. A1 in Figure 3.3 is an example of a mandatory alternative, whereas A2 is an optional alternative. Arrows entering a connector express a part-whole relationship; ts1, ..., tsn are therefore subschemas.
The technical part of the gearbox manufacturer’s offer could thus look roughly as shown in Figure 3.3. The gearbox manufacturer publishes all of its offers, each of which is a contract schema of the kind just introduced, as a catalogue on a marketplace, the broker in Figure 3.1.
[Figure labels: connector K (∧) with incoming parts ts1 … tsn, K = conjunction; connector A (∇) with incoming parts ts1 … tsn, A = alternative]
Fig. 3.2: Connectors for contract composition
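The building blocks just introduced can be sketched as a small data model. The following Python fragment is an illustration only; the class and function names (TextPiece, NullSchema, Conjunction, Alternative, is_optional, text_piece_ids) are assumptions made for this sketch and do not stem from the paper. It rebuilds the technical part from Figure 3.3 and distinguishes mandatory from optional alternatives by the presence of a null schema.

```python
from dataclasses import dataclass, field
from typing import List, Union


@dataclass
class TextPiece:          # basic granule "ts"; its content stays uninterpreted
    id: str
    name: str = ""


@dataclass
class NullSchema:         # empty text piece that makes an alternative optional
    id: str = "Null"


@dataclass
class Conjunction:        # connector K: all parts belong to the whole
    name: str
    parts: List["Node"] = field(default_factory=list)


@dataclass
class Alternative:        # connector A: exclusive choice among the parts
    name: str
    parts: List["Node"] = field(default_factory=list)


Node = Union[TextPiece, NullSchema, Conjunction, Alternative]


def is_optional(alt: Alternative) -> bool:
    """An alternative is optional if one of its parts is the null schema."""
    return any(isinstance(p, NullSchema) for p in alt.parts)


def text_piece_ids(node: Node) -> List[str]:
    """Collect the ids of all text pieces that are parts of the given node."""
    if isinstance(node, TextPiece):
        return [node.id]
    if isinstance(node, NullSchema):
        return []
    return [i for part in node.parts for i in text_piece_ids(part)]


a1 = Alternative("Alternative Getriebetyp",
                 [TextPiece("ts3", "Automatisches Getriebe"),
                  TextPiece("ts4", "Manuelles Getriebe")])
a2 = Alternative("Alternative Ölpumpentyp",
                 [TextPiece("ts5", "Ölpumpentyp"), NullSchema()])
k1 = Conjunction("Technischer Teil",
                 [TextPiece("ts1", "Gehäuseoberteil"),
                  TextPiece("ts2", "Gehäuseunterteil"), a1, a2])
```

Here is_optional(a1) is false and is_optional(a2) is true, matching the mandatory alternative A1 and the optional alternative A2 in the text.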
The catalogue of each manufacturer or publisher can be understood as one channel in the subscription system, although other assignments of an individual offer to a particular channel are also conceivable. Other manufacturers that need certain parts for their production submit a corresponding subscription to the marketplace system. The central subscription management system mediates the matching offers and thus serves as a kind of contact exchange.
Subscription of a Consumer
Because the publishers formulate their offer schemas in the detail just described, subscribers can state their wishes very precisely by means of content-based subscriptions and thus keep the number of “uninteresting offers” very low; the information passed on to the subscriber by the marketplace system then requires little post-processing. A content-based subscription that the offer schema shown above would satisfy asks, for example, for all offers that contain an automatic gearbox.
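Such a content-based subscription can be sketched as a predicate over a nested schema. The tuple encoding below, the function contains, and the second offer are assumptions made for this illustration; the paper does not prescribe a matching algorithm.

```python
# Encoding assumed for this sketch:
#   ("ts", id, name) | ("null", id) | ("K", name, parts) | ("A", name, parts)

def contains(schema, wanted_name):
    """True if some text piece anywhere in the nested schema has the wanted name."""
    kind = schema[0]
    if kind == "ts":
        return schema[2] == wanted_name
    if kind == "null":
        return False
    return any(contains(part, wanted_name) for part in schema[2])


gearbox_offer = ("K", "Technischer Teil", [
    ("ts", "ts1", "Gehäuseoberteil"),
    ("ts", "ts2", "Gehäuseunterteil"),
    ("A", "Alternative Getriebetyp", [
        ("ts", "ts3", "Automatisches Getriebe"),
        ("ts", "ts4", "Manuelles Getriebe"),
    ]),
    ("A", "Alternative Ölpumpentyp", [
        ("ts", "ts5", "Ölpumpentyp"),
        ("null", "n1"),
    ]),
])

# Hypothetical second offer without an automatic gearbox.
manual_only_offer = ("K", "Technischer Teil", [
    ("ts", "t1", "Gehäuseoberteil"),
    ("ts", "t2", "Manuelles Getriebe"),
])


def subscription(schema):
    """Content-based subscription: offers that contain an automatic gearbox."""
    return contains(schema, "Automatisches Getriebe")


matching = [o for o in (gearbox_offer, manual_only_offer) if subscription(o)]
```

Only the first offer is forwarded to the subscriber, so the second one never needs to be inspected manually.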
The simpler case of a subject-oriented subscription is possible as well: a subscriber may want to be notified whenever a new entry appears on a channel. If each channel is the catalogue of a single manufacturer or supplier, the subscription is thereby restricted to the offers of this one supplier. Once two parties have come together via the ‘Publish/Subscribe’ system, they have to negotiate, in a dialogue, an assignment of the contract schema that is acceptable to both sides.
A B2B system should make it possible to switch to a conventional “Person-To-Person” (P2P) mode. To achieve this, a generic operator getAgent() was specified and implemented in [JaSW00], with which a negotiation partner can be determined within an organizational structure. Contract initiation and negotiation are a matter of trust, so recourse to a P2P mode should always be available. Through direct communication between the two parties in the P2P system, the Closed-World Assumption can be relaxed or even dropped entirely: direct interaction allows the participants, for instance, to give the hitherto uninterpreted text pieces their own desired logic and structure. Completely new aspects can thus be included in a particular contract.
3.3 Mapping Contract Schemas to XML
The mereological reconstruction of individual offers, and also of entire catalogues, publications, and subscriptions, presented in the previous section maps very naturally to the Extensible Markup Language (XML, [Tolk99]). The following sketches part of a corresponding Document Type Definition (DTD) that allows the partial schema shown in Figure 3.3 to be formulated:
...
<!ELEMENT Text_Stueck ANY>
<!ATTLIST Text_Stueck
    Name  CDATA  #IMPLIED
    id    ID     #REQUIRED>
<!ELEMENT Null_schema EMPTY>
<!ATTLIST Null_schema
    Name  CDATA  #IMPLIED
    id    ID     #REQUIRED>
<!ELEMENT Alternative
    ((Text_Stueck|Alternative|Konjunktion|Null_schema),
     (Text_Stueck|Alternative|Konjunktion)+)>
<!ATTLIST Alternative
    Name  CDATA  #IMPLIED>
<!ELEMENT Konjunktion
    ((Text_Stueck|Alternative|Konjunktion),
     (Text_Stueck|Alternative|Konjunktion)+)>
<!ATTLIST Konjunktion
    Name  CDATA  #IMPLIED>
...
Fig. 3.4: Document Type Definition for a contract schema
The individual elements are highlighted for better readability. All elements have an optional name attribute. At least two elements must enter an alternative or a conjunction, and an alternative may contain at most one null schema. Offers and requests specified in this form can serve, on the one hand, as the subject of initiation and, on the other hand, once the means are extended by sequence and implication, for negotiating a contract. Using this XML grammar, the technical part of a contract shown in Figure 3.3 can be represented as follows:
...
<Konjunktion Name="Technischer_Teil">
  <Text_Stueck Name="Gehäuseoberteil" id="ts1"/>
  <Text_Stueck Name="Gehäuseunterteil" id="ts2"/>
  <Alternative Name="Alternative Getriebetyp">
    <Text_Stueck Name="Automatisches Getriebe" id="ts3"/>
    <Text_Stueck Name="Manuelles Getriebe" id="ts4"/>
  </Alternative>
  <Alternative Name="Alternative Ölpumpentyp">
    <Null_schema id="n1"/>
    <Text_Stueck Name="Ölpumpentyp" id="ts5"/>
  </Alternative>
</Konjunktion>
...
Fig. 3.5: Technical part of a contract schema in XML
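The connector rules stated in the text (at least two parts per connector, at most one null schema per alternative) can be checked mechanically. The sketch below uses Python's standard xml.etree.ElementTree, which parses XML but does not validate against a DTD, so the rules are re-implemented by hand; the function name rule_violations and its messages are assumptions of this sketch. The XML literal repeats the example, with the null schema placed first as the content model of Alternative requires.

```python
import xml.etree.ElementTree as ET

SCHEMA_XML = """
<Konjunktion Name="Technischer_Teil">
  <Text_Stueck Name="Gehäuseoberteil" id="ts1"/>
  <Text_Stueck Name="Gehäuseunterteil" id="ts2"/>
  <Alternative Name="Alternative Getriebetyp">
    <Text_Stueck Name="Automatisches Getriebe" id="ts3"/>
    <Text_Stueck Name="Manuelles Getriebe" id="ts4"/>
  </Alternative>
  <Alternative Name="Alternative Ölpumpentyp">
    <Null_schema id="n1"/>
    <Text_Stueck Name="Ölpumpentyp" id="ts5"/>
  </Alternative>
</Konjunktion>
"""

CONNECTORS = ("Konjunktion", "Alternative")


def rule_violations(element):
    """Return messages for every connector that breaks the stated rules."""
    problems = []
    if element.tag in CONNECTORS:
        children = list(element)
        label = element.get("Name", element.tag)
        if len(children) < 2:
            problems.append(f"{label}: fewer than two parts")
        nulls = sum(child.tag == "Null_schema" for child in children)
        if element.tag == "Konjunktion" and nulls:
            problems.append(f"{label}: null schema inside a conjunction")
        if nulls > 1:
            problems.append(f"{label}: more than one null schema")
        for child in children:          # recurse into nested connectors
            problems.extend(rule_violations(child))
    return problems


root = ET.fromstring(SCHEMA_XML.strip())
print(rule_violations(root))
```

A validating parser with the DTD from Figure 3.4 would additionally enforce the required id attributes, which this sketch does not check.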
It should be pointed out explicitly that considerably more complex contract schemas can be modelled with this grammar. For instance, conjunctions and alternatives can be nested to arbitrary depth, as indicated in Figure 3.6.
...
<Konjunktion Name="K1">
  <Text_Stueck id="ts1"/>
  <Alternative Name="A1">
    <Konjunktion Name="K2">
      <Alternative Name="A2">
        <Text_Stueck id="ts5"/>
        ...
      </Alternative>
    </Konjunktion>
  </Alternative>
</Konjunktion>
...
[Tree diagram labels: K1, A1, ts1, ts2, ts3, K2, ts4, A2, ts5]
Fig. 3.6: Nested contract structure
This metaschema makes it possible to create highly flexible, configurable contract schemas. A producer can send appropriately instantiated contract proposals to a marketplace/subscription system. It is then up to the broker to bring suitable subscribers together with this manufacturer, or to satisfy the wishes of a subscriber by combining the offers of several producers. With that, the next phase, negotiation, can begin.
3.4 Content Description Using ICE as an Example
The “Information and Content Exchange” protocol (ICE, [Brod00]) serves the automated exchange of information between business partners that have already entered into a business relationship. ICE is defined entirely in XML and is intended to extend an industry-specific vocabulary in a domain language (ideally also in XML) by a uniform, shared protocol and management model.
Contract schemas and contract negotiations are not the principal concern of ICE. Rather, it is assumed that the two participants already have an established business relationship. Version 1.1 of the ICE standard explicitly excludes the corresponding negotiations; they have to be settled beforehand.
Communication via ICE proceeds in two phases: in the first step, supplier and customer configure an offer††. The actual data are delivered afterwards. Setting up an offer usually begins with the customer requesting the supplier’s offer catalogue (ice-catalog). The catalogue essentially consists of individual offers (ice-offers), which may be combined into thematically related groups (ice-offer-groups). The offer catalogue can thus be structured recursively (ice-offer-groups may in turn contain ice-offer-groups). The customer now decides on particular offers, each of which can then be negotiated (individually!) with the supplier. These are, however, not negotiations about content of the kind treated in this paper; they concern general conditions such as the start of the subscription, the frequency, or the communication mode. ICE therefore does not provide any structuring, e.g. in the mereological sense shown above, of the information actually to be transferred. Consequently, the contract schemas shown in Figure 3.3 and Figure 3.6 cannot be represented in ICE, although the user is free to extend the DTD with corresponding constructs. In eBusiness, ICE is thus suitable only for selling fully configured goods (books, CDs, ...). For goods that have to be configured, such as cars, furnishings, or contracts, every possible variant must be offered as a separate ice-offer, which can lead to an exponentially growing number of offers.
The completion of the ICE specification was accompanied by high expectations. According to [Vald00], however, a certain disillusionment has set in, on the one hand because of the extensive specification, which is too far-reaching for many tasks, and on the other hand because the release of ICE took so long.
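The growth in the number of separate offers can be made concrete by enumerating every fully configured variant of a schema. The Python sketch below is an illustration under assumptions: the tuple encoding and the helper names variants and independent_options are invented here, and nothing in it uses the ICE vocabulary itself. A conjunction multiplies the variants of its parts, an alternative adds them up.

```python
from itertools import product

# Encoding assumed for this sketch:
#   ("ts", id) | ("null", id) | ("K", name, parts) | ("A", name, parts)


def variants(node):
    """Return every fully configured variant as a tuple of selected text-piece ids."""
    kind = node[0]
    if kind == "ts":
        return [(node[1],)]
    if kind == "null":
        return [()]                      # choosing the null schema selects nothing
    branches = [variants(part) for part in node[2]]
    if kind == "A":                      # exclusive or: exactly one branch is taken
        return [v for branch in branches for v in branch]
    # conjunction: one variant of every part, concatenated
    return [sum(combo, ()) for combo in product(*branches)]


technical_part = ("K", "K1", [
    ("ts", "ts1"), ("ts", "ts2"),
    ("A", "A1", [("ts", "ts3"), ("ts", "ts4")]),
    ("A", "A2", [("ts", "ts5"), ("null", "Null")]),
])


def independent_options(n):
    """Hypothetical schema with n optional extras, each an optional alternative."""
    return ("K", "options",
            [("A", f"A{i}", [("ts", f"opt{i}"), ("null", f"none{i}")])
             for i in range(n)])


print(len(variants(technical_part)))           # 4 variants for Figure 3.3
print(len(variants(independent_options(10))))  # 1024 variants for 10 optional extras
```

One configurable schema therefore stands in for a number of flat offers that doubles with every additional optional alternative.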
4 Contract Negotiations as Dialogical Content Processing
The following section examines the process of contract negotiation from a language-critical perspective. Starting from the simple ‘request/response’ pattern, more complex constructs are introduced step by step.
4.1 ‘Request/Response’ as a Speech Act
“A dialogue is a verbally conducted dispute between two or more persons, characterized by speech and counter-speech in the forms: question and answer (for the purpose of clarifying concepts), assertion and denial (for the purpose of securing judgements), proof and refutation (for disclosing the modes of inference).”‡‡ All three forms can occur in the dialogues of a contract negotiation. To arrive at a schematization of contract dialogues, it is useful to consider the dialogue partners in their roles from a neutral standpoint in order to establish, in Austin’s sense ([Aust75]), that with linguistic utterances or speech acts something is primarily done and only secondarily something is said. Since Austin (1911-1960), an utterance has been decomposed into a performative part (π) (“What does he do?”) and a propositional part (p) (“What does he state?”), so that the structure of a speech act can in general be brought into the standard form π(p). In contract dialogues there are two performatives for speech and counter-speech: “Request” and “Response”. “Request” creates context, “Response” takes up context. The performative sign π takes the form ’?’ for “Request” and ’ ’ for “Response”.
†† The ICE specification uses the terms syndicator instead of supplier, subscriber instead of customer, and subscription instead of offer. These terms are, however, not used in the sense in which they are understood in this paper. Despite these names, ICE follows the ‘Request/Response’ model.
‡‡ Entry “Dialog”. In: Enzyklopädie Philosophie und Wissenschaftstheorie (ed.: J. Mittelstraß).
4.2 Schematizing Contract Negotiations
In this paper, a contract negotiation is a dialogue between two parties about a contract schema, which provides the propositional core (content) of the verbal exchange. Each party is assigned a role; we call one role “buyer” (subscriber, consumer, customer, opponent) and the other “seller” (publisher, producer, supplier, proponent). To record a contract negotiation, a two-column table is set up.
Following a tradition in logic that goes back to Beth (1908-1964), the left column of the table is reserved for the buyer, the opponent, and the right column for the seller, the proponent.*** To continue the example, we take the contract schema of the technical part K1 as the propositional core of a contract offer.
Buyer   | Seller
?( )    |  (K1)
Fig. 4.1: Initial situation
It is up to the seller to propose for K1 a negotiation order, an agenda or, more generally, a topical order. Modelled on one of the many tree traversals, a proposal by the seller might be
K1: ts1; ts2; A1; A2
where ';' is the sign for an order relation (reflexive, transitive, antisymmetric). In the following negotiation dialogue about K1 we first assume that the buyer accepts the proposed order. If the buyer does not agree and a dialogue about orders (a “debate on the agenda”) is permitted, we enter an entirely different dialogue dimension, that of meta-dialogues, which will be addressed later.
*** A humorous mnemonic has developed over time (Inhetveen): “The opposition is on the left.” In contrast, the buyer is on the right-hand side in Figure 3.1, since that figure follows the flow of data and of reading.
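The properties claimed for the topical order can be checked on the proposed agenda. The following Python sketch is illustrative only; the names agenda, precedes, and is_order_relation are assumptions, and the order is read as "is dealt with no later than", which is one way to obtain a reflexive relation from a list.

```python
from itertools import product

# Direct parts of K1 in the order proposed by the seller: "K1: ts1; ts2; A1; A2".
agenda = ["ts1", "ts2", "A1", "A2"]


def precedes(a, b):
    """a ; b holds if a is dealt with no later than b on the agenda."""
    return agenda.index(a) <= agenda.index(b)


def is_order_relation(items):
    """Brute-force check of reflexivity, antisymmetry, and transitivity."""
    reflexive = all(precedes(a, a) for a in items)
    antisymmetric = all(a == b or not (precedes(a, b) and precedes(b, a))
                        for a, b in product(items, repeat=2))
    transitive = all(precedes(a, c) or not (precedes(a, b) and precedes(b, c))
                     for a, b, c in product(items, repeat=3))
    return reflexive and antisymmetric and transitive


print(is_order_relation(agenda))
```

A different tree traversal would simply yield a different list; the check itself does not depend on which agenda the seller proposes.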
Buyer and seller move alternately. With ’?’ in the second row of Figure 4.2 the buyer establishes the context frame within which the seller has to answer. A repetition of a response by the buyer counts as agreement. Figure 4.2 describes the dialogue about K1 only cursorily, which is indicated by dots. When alternative A1 is treated in Figure 4.2, the seller first proposes ts3, which the buyer rejects ((¬ts3)). With the proposal ts4 the seller then finally reaches his goal.
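The exchange about alternative A1 described above can be replayed with a small Python sketch. The function negotiate_alternative, the role labels, and the string rendering of the moves are assumptions made for illustration; the response sign is omitted, and the buyer's preferences are modelled as a simple predicate.

```python
def negotiate_alternative(name, options, buyer_accepts):
    """Replay request/response moves until the buyer repeats a proposal (agreement)."""
    log = [("buyer", f"?({name})")]            # request: opens the context
    for option in options:
        log.append(("seller", f"({option})"))  # response: seller proposes a part
        if buyer_accepts(option):
            log.append(("buyer", f"({option})"))   # repetition signals agreement
            return option, log
        log.append(("buyer", f"(¬{option})"))      # rejection of the proposal
    return None, log                           # no option was acceptable


chosen, log = negotiate_alternative(
    "A1", ["ts3", "ts4"], buyer_accepts=lambda option: option == "ts4")

for role, move in log:
    print(f"{role:6} {move}")
```

With the inputs shown, the seller first proposes ts3, the buyer rejects it, and the dialogue ends with both sides stating ts4.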
Von einer topischen Reihenfolge (;) in einem Dialog ist eine logische Reihenfolge ( ) streng
zu unterscheiden. Eine logische Reihenfolge ( ) wird auf eine materielle (inhaltliche) Implikation 'a b' (implicare(lat): etwas mit etwas verbinden) zurückgeführt, die in unserem
Beispiel aus Gründen der Vereinfachung noch nicht behandelt wurde. Im Falle einer logischen Reihenfolge wird ein Wahrheitswert vom Antezedens ’a’ auf die Konsequens ’b’ vererbt. ’ ’ ist keine Ordnungsrelation, sondern es gilt: a b oder a impliziert b, wenn die logische Verknüpfung mit der Operation der Subjunktion (→), also ’a → b’, wahr ist. Subjunktionen werden grammatisch durch einen Konditionalsatz wiedergegeben. Eine Reihenfolge
kommt im Sinne der konstruktiven Logik (Lorenzen 1915 - 1994) insofern zustande, als zunächt das Antezedens als Aufgabe mit Erfolg behandelt werden muss, bevor man mit der Behandlung der Konsequens beginnt†††. Wir wollen in unserem Beispiel die folgende technisch
bestimmte Implikation annehmen:
"Wenn ein automatisches Getriebe ausgewählt wird, dann muss eine Ölpumpe für einen Zwangsumlauf vorgesehen werden".
Written formally: ts3 ≤ A1 ≺ ts5 ≤ A2.
Fig. 4.2: Negotiation dialogue about K1 (columns: buyer, seller). Agenda: (ts1 ; ts2 ; A1 ; A2). Moves for ts1: ?(ts1), (ts1), (ts1). Moves for A1: ?(A1), (ts3), (¬ts3), (ts4), (ts4), (ts4).
†††Classical logic does not know this ordering. It simply replaces 'a → b' by '¬a ∨ b'.
Fig. 4.3: Extension of the schema of Figure 6 by an implication (I: ts3 ≺ ts5; A1 with ts3 and ts4; A2 with ts5 and Null)
'≤' is the sign for a part-whole relation. ts3 ≤ A1 and ts5 ≤ A2 express that ts3 and ts5, respectively, have been selected. Implications are part of a contract schema. The schema in Figure 6 must therefore be extended by the implication schema in Figure 4.3.
Figure 4.4 shows the extension of the DTD from Figure 3.4 by the implication:
...
<!ELEMENT Implikation EMPTY>
<!ATTLIST Implikation
    Name        CDATA  #IMPLIED
    Antezedens  IDREF  #REQUIRED
    Konsequens  IDREF  #REQUIRED>
...
Fig. 4.4: DTD code for the implication
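The two IDREF attributes only make sense if they point at identifiers that exist in the contract document. The following Python sketch checks this for a small instance. Only the element Implikation and its attributes come from the DTD fragment; the element names Vertrag and Textstueck and the attribute id are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical instance document: only <Implikation> and its attributes are
# taken from the DTD fragment; "Vertrag", "Textstueck" and "id" are assumed.
SAMPLE = """
<Vertrag>
  <Textstueck id="ts3"/>
  <Textstueck id="ts5"/>
  <Implikation Name="I" Antezedens="ts3" Konsequens="ts5"/>
</Vertrag>
"""

def unresolved_references(xml_text):
    """Return (attribute, value) pairs of Implikation IDREFs that match no id."""
    root = ET.fromstring(xml_text)
    known_ids = {el.get("id") for el in root.iter() if el.get("id") is not None}
    missing = []
    for impl in root.iter("Implikation"):
        for attr in ("Antezedens", "Konsequens"):
            ref = impl.get(attr)
            if ref not in known_ids:
                missing.append((attr, ref))
    return missing

print(unresolved_references(SAMPLE))  # an empty list means all references resolve
```

A validating XML parser would perform the same check from the DTD itself; the sketch only makes the constraint explicit.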
The logical dialogue about the subjunction (→) belonging to the implication (≺) is shown for our example in Figure 4.5.
Fig. 4.5: Negotiation dialogue about a subjunction (columns: buyer, seller). Moves shown: ?(ts3 ≤ A1); (ts3 ≤ A1 → ts5 ≤ A2); (ts3 ≤ A1); (ts5 ≤ A2); (ts5 ≤ A2). Annotation in the figure: a dialogue about the antecedent now takes place. If the buyer agrees to it, the seller must agree as well.
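The constructive reading of a subjunction, in which the consequent is only opened once the antecedent has been settled, can be sketched in a few lines of Python. The function name and the representation of statements as strings are assumptions for illustration and not part of the proposed schema.

```python
def subjunction_dialogue(antecedent, consequent, accepted):
    """Treat 'antecedent -> consequent' constructively.

    accepted: set of statements the buyer agrees to (assumed input).
    Returns the statements that are actually negotiated, in order.
    """
    negotiated = [antecedent]          # the antecedent is always treated first
    if antecedent in accepted:
        negotiated.append(consequent)  # only then is the consequent opened
    return negotiated

print(subjunction_dialogue("ts3 <= A1", "ts5 <= A2", {"ts3 <= A1"}))
```

If the buyer does not accept the antecedent, the consequent never enters the dialogue, which is the ordering that the classical rewriting '¬a ∨ b' does not express.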
4.3 Separation of topical and logical negotiation dialogues
In a schematization of negotiation dialogues, determining the ordering is a central problem. Orderings are treated as a formal, schematic problem, yet the contents (the material) depend on the form (the schema). The dilemma is that there are two ordering categories determined in entirely different ways: a topical one, symbolized by ';', and a logical one, represented by '≺' and its constructive-logical treatment. The two categories pursue different goals.
In topical contexts the question is how, in analogy to teaching and learning situations, the individual pieces can be brought into a methodical sequence so as to promote a quick understanding of the contents. In some situations a procedure 'from the general to the particular' is preferable. Other matters require the reverse approach. Still others can only be mastered ad hoc, on the spur of the moment. For these reasons every negotiation process must provide for a meta-dialogue that clarifies whether 'a ; b' or 'b ; a' is to hold.
In a buyer's market there is an unambiguous meta-rule: "In case of conflict the buyer determines the ordering". Other market forms, for instance a monopolistic supply situation, have other meta-rules. Market forms determine the topical meta-rules.
In logical dialogues, conditionings are under discussion, where antecedent and consequent of an implication arise purely from the content and not from the method of its presentation. Logic and topics, and also the rhetoric lying beneath them, belong to different contexts of argumentation, with the usually weak rhetoric always keen to settle in the higher regions.
The differences between logic and topics motivate the proposal to provide two separate passes for the two parts of a contract negotiation. In the topical part, which comes first, it must be noted that text pieces contain the antecedent or consequent of an implication, which will only be decided in a second pass. In Figure 4.2, for instance, the decision to reject ts3 would have to be postponed and negotiated only later in a logical context (Figure 4.5).
A considerable difficulty arises when implications occur in tightly meshed nests. The systematic treatment of such nests, as well as the embedding of contract negotiations as activities in a workflow, cannot be discussed within the scope of this work.
5
With eBusiness spreading ever further, it is necessary to formalize the processes involved. This contribution examines in particular the two phases of initiating and negotiating contracts, since precisely these are becoming increasingly difficult to see through because of the anonymity and the confusingly large supply on the Internet. Process-oriented, system-technical and argumentation-theoretical aspects are taken into account.
The goal of the initiation phase is to get as efficiently as possible from an N:M relationship, i.e. between many suppliers and many buyers, to a suitable 1:1 relationship. The 'Publish/Subscribe' paradigm, whether content-based or merely subject-based, is excellently suited to this search for a possible contract partner. While suppliers register their offers in mereological form with a (logically) central marketplace system, buyers set up a subscription on it. If two "matching" partners exist, the subscription system brings them together and they can move on to the direct, personal contract negotiation phase (P2P).
For this negotiation phase the classical 'Request/Response' procedure model is better suited, because the negotiation (including that of individual clauses) can be represented schematically as a dialogue. Topical and logical negotiations must be distinguished here. The topical ordering concerns the methodical sequence (agenda), whereas the logical ordering follows from the material implication.
For both phases this paper sketches a universal XML grammar that can easily be adapted to specific needs. Besides this proposal, the ICE protocol is also examined; it is less flexible, however, because it does not support subscriptions on complex, nested structures.
Current work is mainly concerned with realizing the initiation phase with the help of the PubScribe system ([LeHR00]), which is currently being implemented as a prototype. Building on this, the development of a dialogue monitor for the negotiation phase is planned.
References
AcFZ97 Acharya, S.; Franklin, M.; Zdonik, S.: Balancing Push and Pull for Data Broadcast. In: Proceedings of the International Conference on Management of Data (SIGMOD’97, Tucson (AZ), U.S.A., May 13-15), 1997, pp. 183-194
AAB+99 Altinel, M.; Aksoy, D.; Baby, T.; Franklin, M.; Shapiro, W.; Zdonik, S.: DBIS-Toolkit: Adaptable Middleware For Large Scale Data Delivery. In: Proceedings of the 28th International Conference on Management of Data (SIGMOD’99, Philadelphia (PA), U.S.A., May 31-June 3), 1999, pp. 544-546
Aust75 Austin, J.L.: How to do things with words. Harvard University Press, 1975
Bart00 Bartsch, M.: Qualitätssicherung für Software durch Vertragsgestaltung und Vertragsmanagement. In: Informatik Spektrum, 23(2000)1, pp. 3-10
Birm93 Birman, K.P.: The Process Group Approach to Reliable Distributed Computing. In: Communications of the ACM, 36(1993)12, pp. 36-53
Brod00 Brodsky, J.; Hunt, B.; Khoury, S.; Popkin, L.: The Information and Content Exchange (ICE) Protocol, AC Review Version 1.1, Revision R, 2000 (available electronically at: http://www.icestandard.org)
Geth80 Gethmann, C.F.: Die Logik der Wissenschaftstheorie. In: Gethmann, C.F. (ed.): Theorie des wissenschaftlichen Argumentierens, Suhrkamp Verlag, Frankfurt, 1980, pp. 15-42
Hack97 Hackathorn, R.: Publish or Perish. In: Byte Magazine, September 1997
JaSW00 Jablonski, S.; Schlundt, M.; Wedekind, H.: Eine generische Komponente zur rechnergestützten Nutzung von Aufbauorganisationen. In: Informatik Forschung und Entwicklung, 15(2000)3
KeWo97 Kelly, K.; Wolf, G.: PUSH!. In: Wired Magazine, March 1997 (available electronically at: http://www.wired.com/wired/archive/5.033/ff_push_pr.html)
Memory based Publish/Subscribe System. Technical report, Institut für Informatik, Universität Erlangen-Nürnberg, submitted for publication, 2000
Lore87 Lorenzen, P.: Constructive Philosophy. The University of Massachusetts Press, Amherst, 1987
Merz99a Merz, M.: Electronic Commerce. Marktmodelle, Anwendungen und Technologien, dpunkt Verlag, Heidelberg, 1999
Merz99b Merz, M.; Tu, T.; Lamersdorf, W.: Electronic Commerce, Technologische und organisatorische Grundlagen. In: Informatik Spektrum, 22(1999)5, pp. 328-343
Merz98 Merz, M., et al.: Supporting Electronic Commerce Transactions with Contracting Services. In: International Journal of Cooperative Information Systems. World Scientific Publishing Company, 1998
MiBo96 Milosevic, Z.; Bond, A.: Electronic Commerce on the Internet. What is still missing? (available electronically at: http://www.isoc.org/HMP/PAPER/096/html/096.html)
Powe96 Powell, D.: Group Communication. In: Communications of the ACM, 39(1996)4, pp. 59-97
Tolk99 Tolksdorf, R.: XML und darauf basierende Standards. Die neue Auszeichnungssprache des Web. In: Informatik Spektrum, 22(1999)6, December 1999, pp. 407-421
Vald00 Valdés, R.: Content Aggregation Is Hot; Too Bad ICE Is Melting. In: Webtechniques 7/2000 (available electronically at: http://www.webtechniques.com/archieves/2000/07/plat/)
Wede89 Wedekind, H.: Konstruktionserklären und Konstruktionsverstehen. In: Zeitschrift für wirtschaftliche Fertigung, 84(1989)11, pp. 623-629
Part C: Technological Issues
MAINTENANCE OF AUTOMATIC SUMMARY TABLES IN IBM DB2/UDB
Wolfgang Lehner*, Bobbie Cochrane, Richard Sidle, Hamid Pirahesh, Markos Zaharioudakis
IBM Almaden Research Center
650 Harry Road, San Jose CA, 95120, U.S.A.
[email protected],{bobbiec, rsidle, pirahesh, markos}@almaden.ibm.com
Abstract
Materialized views are commonly used to improve the performance of aggregation queries by orders of magnitude. In contrast to regular tables, materialized views are not
directly updateable by the user, but are indirectly synchronized by the database system
itself. In this paper we present an overview of the maintenance strategies for ‘Automatic
Summary Tables’, the materialized view implementation in IBM’s DB2/UDB database
system. In the first part, we focus on the incremental maintenance method, providing a
way to synchronize materialized views based on the joins of only the changes of the base
tables (deltas) with all other tables of the view definition. The second part of the paper
outlines optimization techniques to improve the full recomputation of a set of materialized views.
* current address is: University of Erlangen-Nuremberg; Martensstr. 3; D-91058 Erlangen, Germany
1 Introduction
Materialized views are a well-known technique for improving the performance of aggregation queries that access a large amount of data while performing multiple joins in the context
of a typical data warehouse star schema. Fully exploiting the power of the materialized view
technique requires support from the database system in (a) picking the optimal set of materialized views for a specific application scenario and workload ([HaRU96]), (b) transparently
rerouting user queries originally referencing base tables to those views ([ZCPL00]), and (c)
maintaining materialized views, i.e. synchronizing them with the base tables ([MuQM97]).
This paper focuses on the current maintenance strategies for ‘Automatic Summary Tables’ (ASTs), DB2’s implementation of materialized views, and outlines ongoing research to improve the simultaneous refresh of multiple summary tables.
Example
To illustrate the maintenance algorithms by means of an example, consider the database schema in Figure 1.1. This schema reflects a classical snowflake schema with a fact table recording single business transactions (trans). A single transaction refers to multiple transaction items (transitem). All items are classified into product groups (pgroup), and these in turn are grouped into product lines (prodline), thus forming a hierarchical product dimension. In analogy to the product dimension, all locations (loc) where transactions happen are classified by city, state, and country. Furthermore, each customer in this scenario may have multiple accounts (acct), which in turn may be used for multiple transactions. Finally, the time dimension reflects the natural classification by day, month, and year. As a variant of the classical conceptual star schema pattern in the context of a data warehouse, the numeric value (amount), which is the subject of aggregation, is not a member of the fact table but belongs to the dimension table transitem†.
2 Definition of Automatic Summary Tables
Similar to a regular view, the content of an Automatic Summary Table (AST) is defined by a SELECT expression. The sample expression from Figure 2.1 defines a five-dimensional hierarchical data cube for the location (city→state→country), product (group→line), and time (month→year) dimension hierarchies using the ROLLUP()-expression. The summary data is further categorized according to marital status and income range of the customer
† We would like to mention here that from a physical point of view, the location dimension (loc) is
denormalized so that all the data is stored within a single table. Moreover, the time hierarchy is not
directly reflected in tabular structures but implemented via built-in functions like day(), month(), and
year().
Fig. 1.1: Schema of the sample scenario (entities: location with country, state, city; prodline and pgroup; time with year(), month(), day(); customer, account, transaction, and transitem with amount; connected by 1:N relationships)
using the CUBE()-expression. This example provides data for 4*3*3*4 = 144 grouping combinations, demonstrating that a complete OLAP scenario, providing data for different levels
of aggregations, can be specified using a single summary table.
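The count of 144 follows from multiplying the number of grouping combinations that each ROLLUP() and CUBE() term contributes. A minimal Python sketch of this arithmetic, with helper names chosen here for illustration:

```python
from math import prod

def rollup_count(n_columns):
    """ROLLUP over n columns yields n + 1 grouping combinations."""
    return n_columns + 1

def cube_count(n_columns):
    """CUBE over n columns yields 2**n grouping combinations."""
    return 2 ** n_columns

# location (3 columns), product (2), time (2), customer attributes (CUBE over 2)
factors = [rollup_count(3), rollup_count(2), rollup_count(2), cube_count(2)]
print(factors, prod(factors))  # [4, 3, 3, 4] 144
```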
Additionally, an AST definition may contain an explicit specification of its physical layout
similar to a regular base table, i.e. it can be partitioned, replicated, indexed, etc. Finally, a
refresh mode must be assigned to an AST. Declaring an AST ‘REFRESH IMMEDIATE’ implies that when a base table is modified, all dependent ASTs are automatically synchronized
within the context of the modifying statement. This is done by applying the incremental
maintenance strategy outlined in section 3. If an AST is declared ‘REFRESH DEFERRED’
then no base table changes are propagated when a base table is modified. Using the available
technology, the ASTs have to be fully recomputed to reflect a consistent state again.
CREATE SUMMARY TABLE ast_demo AS (
  SELECT loc.country, loc.state, loc.city,
         pg.lineid, pg.pgid,
         c.marital_status, c.income_range,
         YEAR(t.pdate) AS year, MONTH(t.pdate) AS month,
         SUM(ti.amount) AS amount,
         COUNT(*) AS count,
         GROUPING(c.marital_status) AS grp_mstatus,
         GROUPING(c.income_range) AS grp_income_range
  FROM   transitem AS ti, transaction AS t, location AS loc,
         pgroup AS pg, account AS a, customer AS c
  WHERE  ti.transid = t.transid AND ti.pgid = pg.pgid
    AND  t.locid = loc.locid
    AND  t.acctid = a.acctid
    AND  a.custid = c.custid
  GROUP BY ROLLUP(loc.country, loc.state, loc.city),
           ROLLUP(pg.lineid, pg.pgid),
           ROLLUP(YEAR(t.pdate), MONTH(t.pdate)),
           CUBE(c.marital_status, c.income_range)
) DATA INITIALLY DEFERRED REFRESH IMMEDIATE;
Fig. 2.1: Sample ‘Automatic Summary Table’
Section 5 sketches some issues connected to this strategy. While ‘DEFERRED’ ASTs may
be defined by any SELECT expression, the specification of an incrementally maintainable
AST must exhibit the following characteristics:
• grouping expression:
The grouping expression may consist of single grouping columns or any valid combination of complex grouping expressions like CUBE(), ROLLUP(), or GROUPING
SETS(). The evaluation of the grouping expression must not result in duplicate grouping combinations. For example, “ROLLUP(a,b),a” is not allowed, since it evaluates to
((a,b),a) = (a,b); ((a),a) = (a); and ((),a) = (a), resulting in the combination (a) appearing
twice.
• aggregate functions:
The set of aggregate functions is restricted to SUM and COUNT and must contain a named COUNT(*) column. If a column X is nullable and is the argument of an aggregate function in that AST, a named COUNT(X) column is also required.
• grouping functions:
A GROUPING() function expression is required for nullable grouping columns that are
involved in a complex grouping expression. This allows the system to differentiate between naturally occurring and system generated NULL-values denoting (sub-) totals. In
the sample AST, grouping columns of the customer dimension may contain NULL values and thus require a corresponding GROUPING() function column in the AST definition.
3 Incremental Maintenance of ASTs
The advantage of an incremental maintenance strategy is that the changes in the AST are
computed directly from the changes of the base table. Consider an AST containing a join
over several tables. Incremental maintenance can compute the changes to the AST using the
joins of only the changes of the base tables (deltas) with all other tables of the AST definition.
Note that even complex ASTs like hierarchical data cubes over a set of tables are incrementally maintainable in DB2 by only applying the base table changes to the AST. The following
description sketches the single steps necessary to perform the incremental maintenance of an
AST.
STEP I: Building the Raw Delta
In a first step all local deltas, i.e. the inserted, updated or deleted rows of all base tables are
combined to generate the global raw delta stream. Multiple local deltas might be caused
within the context of a single statement while maintaining database semantics, such as enforcing referential integrity constraints using ‘ON DELETE CASCADE’ .
To synchronize an AST with an underlying update operation, the delta consists of the rows
before and after the update extended with a numeric tag column holding ‘-1’ for the old and
‘1’ for the new values.
STEP II: Aggregating the Delta
In a second step, the delta stream is aggregated. If the underlying modification is an insertion
or deletion, then the grouping specification contains all the combinations specified by the
AST. For ASTs with complex grouping expressions, e.g. CUBE(), this step results in a complete “delta cube” with ‘higher’ delta aggregate values for all original delta changes. If the
modification is an update, then the grouping specification contains all the combinations specified by the AST amended to include the tag column.
For updates, the resulting aggregate values are multiplied with the value of the tag column
(resulting in negative old values and positive new values), and a second delta aggregation
step consisting of a simple aggregation over all grouping columns plus all grouping function
columns is added to eliminate the tag column and compute the net aggregate changes (i.e.
delta value) from the old to the new base table values.
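For an update, Steps I and II can be traced on a small in-memory example. The following Python sketch handles a single grouping combination with SUM and COUNT(*) only; the function name and the row layout (group key, amount) are assumptions for illustration, not DB2 internals.

```python
from collections import defaultdict

def net_delta_for_update(old_rows, new_rows):
    """old_rows / new_rows: lists of (group_key, amount) before and after the update.

    Returns {group_key: (delta_sum, delta_count)}.
    """
    # Step I: raw delta with a tag column, -1 for old and +1 for new values.
    tagged = [(g, amt, -1) for g, amt in old_rows] + \
             [(g, amt, +1) for g, amt in new_rows]

    # Step II, first aggregation: group by the AST grouping columns plus the tag.
    per_tag = defaultdict(lambda: [0, 0])          # (group, tag) -> [sum, count]
    for g, amt, tag in tagged:
        per_tag[(g, tag)][0] += amt
        per_tag[(g, tag)][1] += 1

    # Multiply by the tag, then aggregate again to eliminate the tag column.
    net = defaultdict(lambda: [0, 0])              # group -> [sum, count]
    for (g, tag), (s, c) in per_tag.items():
        net[g][0] += tag * s
        net[g][1] += tag * c
    return {g: tuple(v) for g, v in net.items()}

# An amount changes from 10 to 15 within the same group.
print(net_delta_for_update([("USA", 10)], [("USA", 15)]))
```

When the update moves a row from one group to another, the old group receives a negative and the new group a positive delta, which is what later steps turn into an AST update, delete, or insert.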
STEP III: Pairing the Delta with the AST
After aggregation, the rows in the delta are paired with the current content of the AST using a left outer join (with the delta on the left) over the grouping and grouping function columns of the AST. Thus a delta group matches either a single group of the summary table or no group at all. Matching delta groups cause the corresponding row in the AST to be modified; those that do not have matches are later added to / deleted from the AST.
STEP IV: Aggregate Value Compensation
When a delta group, δ.g, has a corresponding group, τ.g, in the AST, then the new value for
δ.g must be computed based on the value of δ.g and the current value of τ.g. Since the AVG
aggregation function can be mapped to an equivalent SUM/COUNT expression, ‘+’ is the
only aggregation value compensation function, required to support SUM, COUNT, and AVG.
Based on the value of the delta δ and the AST τ, the new value for each aggregation column
(δ.c’ for COUNT and δ.s’ for SUM(x)) is computed as follows:
δ.c’ = δ.c + τ.c
δ.s’ = CASE WHEN (δ.c + τ.c == 0) AND (PGRANDTOTAL) THEN NULL
WHEN ISNULL(δ.s) THEN τ.s
WHEN ISNULL(τ.s) THEN δ.s
ELSE (τ.s + δ.s) END
For ASTs with complex grouping expressions (like CUBE(), ...), the overall summary value,
or grand total, evaluates to NULL even if the number of contributing rows is zero. The necessary predicate PGRANDTOTAL in the first CASE condition consists of a conjunction of either
(ISNULL(Ai)) or, if specified, (GROUPING(Ai) == 1) for each grouping column Ai (1≤i≤n) of
the AST. For a SUM over non-nullable columns, the new cardinality is derived from the required COUNT(*) column. If, however, the parameter column of the aggregate function is nullable, then the new cardinality is derived from the COUNT-values ranging over that nullable column.
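The CASE expression of Step IV translates directly into a small function. In this Python sketch, None stands for SQL NULL and the boolean is_grand_total stands for the predicate PGRANDTOTAL; the function name and parameter names are chosen here for illustration.

```python
def compensate(delta_c, delta_s, ast_c, ast_s, is_grand_total):
    """Compute the new COUNT and SUM of a matched group (Step IV).

    None represents SQL NULL; is_grand_total plays the role of PGRANDTOTAL.
    """
    new_c = delta_c + ast_c
    if new_c == 0 and is_grand_total:
        new_s = None                  # empty grand total: SUM becomes NULL
    elif delta_s is None:
        new_s = ast_s                 # the delta contributes no value
    elif ast_s is None:
        new_s = delta_s               # the AST had no value so far
    else:
        new_s = ast_s + delta_s       # '+' is the only compensation function
    return new_c, new_s

print(compensate(2, 30, 5, 100, False))    # (7, 130)
print(compensate(-5, -100, 5, 100, True))  # (0, None)
```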
STEP V: Applying the Delta to the AST
Depending on the underlying base table operation, the delta stream is applied to the AST using the following operations:
• base table insert:
Already existing groups in the AST are updated, new groups are inserted into the AST.
if (ISNULL(τ.c))                  // no matching AST groups after the outer join
    INSERT INTO τ SET τ.c = δ.c, τ.s = δ.s
else
    UPDATE τ SET τ.c = δ.c’, τ.s = δ.s’ WHERE δ.c’ > 0
• base table delete:
Groups of the delta with a new cardinality of zero are deleted, the remaining rows are
updated with the new values of the delta stream. Note that in the case of ASTs with
complex grouping expressions (like CUBE()), the grand total row may never be deleted.
if (NOT ISNULL(τ.c))              // only matching AST groups after outer join
    if (δ.c’ == 0)
        DELETE FROM τ WHERE NOT(PGRANDTOTAL)
    else
        UPDATE τ SET τ.c = δ.c’, τ.s = δ.s’
• base table update:
This case may be considered a combination of base table insertion and deletion resulting in a sequence of AST update, delete, and insert operation as described above.
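The insert and delete cases of Step V can be sketched on a dictionary that maps each group to its (COUNT, SUM) pair. This Python sketch rests on several assumptions that are not stated in the text: delete deltas carry negative counts and sums, SUM columns are non-nullable, the grand total is the group with the empty key (), and an emptied grand total row is kept with a NULL sum as described in Step IV.

```python
def apply_insert(ast, delta):
    """Base table insert: update existing groups, insert new ones."""
    for group, (dc, ds) in delta.items():
        if group not in ast:                 # no matching AST group after the outer join
            ast[group] = (dc, ds)
        else:
            c, s = ast[group]
            ast[group] = (c + dc, s + ds)

def apply_delete(ast, delta, grand_total=()):
    """Base table delete: drop groups whose new cardinality is zero."""
    for group, (dc, ds) in delta.items():
        if group not in ast:                 # only matching groups are affected
            continue
        c, s = ast[group]
        new_c = c + dc                       # dc is negative for deletions (assumption)
        if new_c == 0:
            if group == grand_total:
                ast[group] = (0, None)       # the grand total row is never deleted
            else:
                del ast[group]
        else:
            ast[group] = (new_c, s + ds)

ast = {("USA",): (2, 30), (): (2, 30)}
apply_insert(ast, {("DE",): (1, 5), (): (1, 5)})
apply_delete(ast, {("DE",): (-1, -5), (): (-1, -5)})
print(ast)
```

A base table update would run both functions on the net delta of the old and new values.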
4 Correctness of the ‘Cube Delta’ Approach
In this section, we show the correctness of our incremental maintenance approach for summary tables with complex grouping expressions. Before going into detail, we discuss some
necessary preconditions:
General Grouping Expressions: grouping sets
To avoid discussing the individual cases ‘CUBE()’, ‘ROLLUP()’, and ‘GROUPING SETS()’ separately, we assume a general grouping expression (grouping set) ranging over n grouping combinations. The corresponding grouping set is denoted by gs[grp1, ..., grpn], where each grpi (1≤i≤n) is a single grouping combination. Thus, for n=1 the grouping set degenerates to a simple group-by expression, so that gs[grp1] equals grp1.
The restriction to grouping sets is feasible, because each ‘CUBE()’ or ‘ROLLUP()’ operator
may be seen as an abbreviation for an explicit list of grouping combinations. In general the
following rules apply for the mapping of ‘CUBE()’ and ‘ROLLUP()’ expressions to a corresponding ‘GROUPING SETS’ expression:
• With Ai (1≤i≤n) as grouping attributes, the expression CUBE(A1, ..., An) corresponds to a grouping sets expression over all groups from the power set of the grouping attributes, i.e. GROUPING SETS( (A1, ..., An), (A2, ..., An), ..., (A1, ..., An-1), ..., (A1), ..., (An), ()).
• With Ai (1≤i≤n) as grouping attributes, the expression ROLLUP(A1, ..., An) corresponds
to the expression GROUPING SETS( (A1, ..., An), (A1, ..., An-1), (A1, ..., An-2), ...,(A1),
()), resulting in n+1 different grouping combinations.
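The two mapping rules can be made concrete by expanding them programmatically. A minimal Python sketch, in which grouping combinations are represented as tuples of column names (a representation chosen here for illustration):

```python
from itertools import combinations

def rollup(*cols):
    """ROLLUP(A1, ..., An): all prefixes, from the full list down to ()."""
    return [tuple(cols[:k]) for k in range(len(cols), -1, -1)]

def cube(*cols):
    """CUBE(A1, ..., An): every subset of the grouping attributes."""
    return [combo
            for k in range(len(cols), -1, -1)
            for combo in combinations(cols, k)]

print(rollup("A1", "A2", "A3"))
print(len(cube("A1", "A2", "A3")))  # 8, i.e. 2**3 combinations
```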
Distinctiveness of Grouping Expression
Our algorithm requires that the grouping expression of the materialized view, i.e. the AST, is distinct. This means that no grouping combination is produced more than once. Distinctiveness can be tested, for example, by expanding the grouping specification to grouping sets and checking for duplicates. Note that not only explicitly listed columns can cause duplicates: the grand total (grouping over all columns) may also appear only once.
For example, the expression ‘GROUPING SETS( ROLLUP(a,b), ROLLUP(a,c) )’, which is expanded to ‘GROUPING SETS( (), (a), (a,b), (), (a), (a,c) )’, is rejected because both ‘ROLLUP()’ expressions produce the grouping combinations (a) and (), so each of them appears twice.
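The duplicate test on an already expanded grouping specification can be sketched as follows in Python. Each grouping combination is normalized to a set of column names, so that a repeated column inside a combination, as in ((a),a) = (a), is also detected; the function name is an assumption for illustration.

```python
from collections import Counter

def duplicate_combinations(grouping_sets):
    """Return the grouping combinations that occur more than once.

    grouping_sets: iterable of tuples of column names (already expanded).
    """
    counts = Counter(frozenset(g) for g in grouping_sets)
    return {g for g, n in counts.items() if n > 1}

# GROUPING SETS( ROLLUP(a,b), ROLLUP(a,c) ) after expansion
expanded = [(), ("a",), ("a", "b"), (), ("a",), ("a", "c")]
print(duplicate_combinations(expanded))
```

An AST definition would be accepted for incremental maintenance only if this function returns an empty set.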
Nullable Columns and Grouping()-Function Columns
In the context of queries containing complex grouping expressions, NULL values are used
to identify super aggregate values, like (sub-) totals. To distinguish between naturally occurring NULL values and systematically generated NULL values in a result set, the current SQL
standard introduced a special ‘GROUPING()’ function. A ‘GROUPING()’ function column returns ‘1’ if the corresponding value of that column denotes a super-aggregate; otherwise, it
returns ‘0’. Our algorithm requires the existence of a ‘GROUPING()’ function column if the
corresponding column is nullable. Thus, for the remainder of the section and formal description, we implicitly refer to a surrogate of a grouping column and the corresponding grouping
function column, if that column is nullable. This trick enables us to consider only non-nullable columns from a conceptual point of view.
Correctness of the ‘Cube Delta’ Approach
The general idea of the ‘Cube Delta’ approach is that the result of a complex grouping expression can be seen as a union of the results of single grouping expressions. Since all grouping combinations are distinct and the set of all attributes involved in the complex grouping
expression (potentially ‘extended’ by their ‘GROUPING()’ function columns) defines the
primary key of the result, we can deduce that each delta stream grouped by a specific grouping combination affects only that part of the summary table partitioned by the same grouping
combination. In other words, since modifications are local to grouping combinations, the delta stream itself may consist of a concatenation of partitions produced through different
grouping combinations. To formally describe this locality property of an update operation of
a specific grouping combination, we introduce the following notation of data streams:
• τ stands for the summary table stream; δ reflects the original delta stream, i.e. the result
of the original insert, update or delete operation on the base table.
• τi denotes the partition of the summary table stream that corresponds to the grouping combination grpi. Analogously, δi represents the specific partition of the delta stream according to the grouping combination grpi.
Lemma 1:
• The summary table stream τ is defined as the union of partitions τi according to the set of grouping combinations‡: $\bigcup_{i=1}^{n} \tau_i = \tau$.
• In analogy to the summary table stream τ, the delta stream δ consists of the union of partitions δi, $\delta = \bigcup_{i=1}^{n} \delta_i$, after applying the complex grouping operation.
Proof of Lemma 1:
Lemma 1 directly reflects the definition (and basic idea) of complex grouping expressions
([GBLP96]).
A second formal prerequisite is the modeling of the UDI-logic, which performs the actual
changes to the summary table. Therefore we introduce a single operation UDI() and a sequence of UDI-operations.
Definition: UDI-operation
The operation UDI ( δ → τ ) denotes the update of the τ stream by values of the δ stream resulting in a modified stream τ’. In the case of an empty input stream (δ = ε), the UDI-operation does not perform any modifications on the output stream τ, thus UDI ( ε → τ ) = τ .
Lemma 2:
The application of an UDI operation on a partitioned data stream is equal to a union of modified partitions, updated by multiple UDI operations with the same delta, i.e.:
‡ The values of those attributes, which are an element of the overall grouping attributes of τ but not an
element of the current τi, are NULL.
$$\mathrm{UDI}\Big(\delta \to \bigcup_{i=1}^{n} \tau_i\Big) = \bigcup_{i=1}^{n} \mathrm{UDI}(\delta \to \tau_i)$$
Proof of Lemma 2:
It is obvious that applying modifications to a whole stream leads to the same set of rows as applying the same modifications to each single partition of that stream and concatenating the modified partitions to yield the output stream. Of course, the necessary prerequisite is that the partitions are disjoint.
Definition: Sequence of UDI-operations
The sequence operator » concatenates multiple UDI-operations so that the output of a former UDI-operation becomes one of the input streams of the directly following UDI-operation. The order of the single UDI operations is arbitrary but fixed.
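The following example shows the specification of applying multiple delta streams (or partitions of one single delta stream) to the same data stream τ. The resulting stream τ’ of the sequence of operations is denoted as:
$$\tau' = \mathrm{UDI}(\delta_1 \to \tau) \mathbin{\text{»}} \mathrm{UDI}(\delta_2 \to \tau) \mathbin{\text{»}} \dots \mathbin{\text{»}} \mathrm{UDI}(\delta_n \to \tau) = \mathop{\text{»}}_{i=1}^{n} \mathrm{UDI}(\delta_i \to \tau)$$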
Lemma 3:
A UDI operation of a partitioned delta stream applied to a data stream τ is equal to a sequence of UDI operations where each single UDI operation applies a single partition of the delta stream to the data stream, i.e.:
$$\mathrm{UDI}\Big(\bigcup_{i=1}^{n} \delta_i \to \tau\Big) = \mathop{\text{»}}_{i=1}^{n} \mathrm{UDI}(\delta_i \to \tau)$$
Proof of Lemma 3:
Again, it is obvious that applying a whole delta stream to a data stream τ is the same as repeatedly applying a partition of the delta stream to the data stream.
Using the notion of a UDI-operation and a ‘sequential execution’ of a set of UDI-operations, we are able to prove the correctness of the Cube-Delta Approach.
Theorem: Correctness of the Cube-Delta Approach
The Cube-Delta-Approach is correct if the result of applying the complete delta stream after
performing the complex grouping operation to the summary table data stream according to a
set of grouping combinations is equal to independently applying each partition of the delta
stream to the corresponding partition of the summary table data stream for a specific grouping combination, i.e.:
$$\mathrm{UDI}(\delta \to \tau) = \bigcup_{i=1}^{n} \mathrm{UDI}(\delta_i \to \tau_i)$$
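We show the equality by applying Lemmata 1, 2, and 3:
$$\begin{aligned}
\mathrm{UDI}(\delta \to \tau) &= \mathrm{UDI}\Big(\bigcup_{j=1}^{n} \delta_j \to \bigcup_{i=1}^{n} \tau_i\Big) && \text{(Lemma 1)} \\
&= \bigcup_{i=1}^{n} \mathrm{UDI}\Big(\bigcup_{j=1}^{n} \delta_j \to \tau_i\Big) && \text{(Lemma 2)} \\
&= \bigcup_{i=1}^{n} \mathop{\text{»}}_{j=1}^{n} \mathrm{UDI}(\delta_j \to \tau_i) && \text{(Lemma 3)} \\
&= \bigcup_{i=1}^{n} \big( \mathrm{UDI}(\delta_1 \to \tau_i) \mathbin{\text{»}} \dots \mathbin{\text{»}} \mathrm{UDI}(\delta_i \to \tau_i) \mathbin{\text{»}} \dots \mathbin{\text{»}} \mathrm{UDI}(\delta_n \to \tau_i) \big) \\
&= \bigcup_{i=1}^{n} \mathrm{UDI}(\delta_i \to \tau_i)
\end{aligned}$$
Recall for the last step that the UDI-operation does not modify the data stream for the cases 1≤j≠i≤n for any given i. The UDI-operation only shows an effect when the grouping combinations of the delta and the data stream are the same, i.e.
$$\mathop{\text{»}}_{j=1,\, j \neq i}^{n} \mathrm{UDI}(\delta_j \to \tau_i) = \tau_i \quad \text{and} \quad \mathrm{UDI}(\delta_i \to \tau_i) = \tau_i'$$
Thus, the UDI-operations without any effect can be eliminated and the final result can be derived. □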
5 Full Refresh of ASTs
Although the incremental maintenance strategy automatically synchronizes ASTs when the underlying base tables change, there are two scenarios in which a ‘DEFERRED’ refresh is justified:
• When incremental maintenance is too expensive due to a high update frequency of the
base tables and/or a high number of incrementally maintainable summary tables.
• When the aggregation functions of the AST are not incrementally maintainable. ASTs
with complex aggregation functions like statistical functions or scalar aggregate functions mostly require a full recomputation.
The current version of IBM DB2/UDB allows the full refresh of multiple ASTs within a single statement:
REFRESH TABLE ast1, ast2, ..., astn;
During the computation of a set of ASTs, DB2 tries to share as much work as possible by
detecting and utilizing common local predicates and - if possible - common joins. Moreover
aggregation dependencies between the individual ASTs are also exploited so that even partial
aggregation operations are shared during the computation. In general, the two major query rewrite techniques ‘Query Stacking’ and ‘Query Sharing’, which are outlined in the remainder of this section, may be distinguished.
Sequential Summary Table Refresh
Recall that an AST definition may be visualized as a mixture of a table and a view definition.
To populate or recompute a summary table, the current content of the table is deleted in a first
step. In a second step, the associated full-select statement is executed and the resulting set of
tuples is finally inserted into the summary table; consider the following AST definition:
CREATE TABLE AST1 AS (
  SELECT   p.pname, SUM(amount) AS sum_amt, COUNT(*) AS cnt
  FROM     trans t, transitem ti, location loc, pgroup p
  WHERE    t.transid = ti.transid
    AND    ti.pgid = p.pgid
    AND    t.locid = loc.locid
    AND    loc.country = 'USA'
  GROUP BY p.pname )
DATA INITIALLY DEFERRED REFRESH DEFERRED;
The refresh command ‘REFRESH TABLE AST1’ may then be expressed in SQL as a sequence of deletion and insertion:
DELETE FROM AST1;
INSERT INTO AST1 (pname, sum_amt, cnt)
  SELECT   p.pname, SUM(amount), COUNT(*)
  FROM     trans t, transitem ti, location loc, pgroup p
  WHERE    t.transid = ti.transid
    AND    ti.pgid = p.pgid
    AND    t.locid = loc.locid
    AND    loc.country = 'USA'
  GROUP BY p.pname;
Refreshing multiple ASTs simultaneously without further optimization results therefore in a
sequential execution of the associated SELECT-statements.
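The delete-then-insert refresh can be tried out with a minimal Python sketch. It uses SQLite as a stand-in for DB2 (SQLite has no REFRESH TABLE, so AST1 is an ordinary table here), and the sample rows, the column types and the placement of amount in transitem are assumptions made only for illustration.

```python
import sqlite3

def make_db():
    """Build a tiny in-memory stand-in for the sample schema (illustrative data)."""
    con = sqlite3.connect(":memory:")
    con.executescript("""
        CREATE TABLE location (locid INTEGER, state TEXT, country TEXT);
        CREATE TABLE pgroup (pgid INTEGER, pname TEXT);
        CREATE TABLE trans (transid INTEGER, locid INTEGER);
        CREATE TABLE transitem (transid INTEGER, pgid INTEGER, amount REAL);
        CREATE TABLE ast1 (pname TEXT, sum_amt REAL, cnt INTEGER);
        INSERT INTO location VALUES (1, 'CA', 'USA'), (2, 'BY', 'Germany');
        INSERT INTO pgroup VALUES (10, 'TV'), (20, 'Radio');
        INSERT INTO trans VALUES (100, 1), (101, 1), (102, 2);
        INSERT INTO transitem VALUES (100, 10, 5.0), (101, 10, 7.0),
                                     (101, 20, 3.0), (102, 10, 9.0);
    """)
    return con

# The full-select of the AST definition, reused unchanged by the refresh.
FULL_SELECT = """
    SELECT p.pname, SUM(ti.amount), COUNT(*)
    FROM trans t, transitem ti, location loc, pgroup p
    WHERE t.transid = ti.transid AND ti.pgid = p.pgid
      AND t.locid = loc.locid AND loc.country = 'USA'
    GROUP BY p.pname
"""

def refresh_ast1(con):
    """Full refresh: empty the summary table, then re-run its full-select."""
    with con:  # both statements are committed together
        con.execute("DELETE FROM ast1")
        con.execute("INSERT INTO ast1 (pname, sum_amt, cnt) " + FULL_SELECT)

con = make_db()
refresh_ast1(con)
rows = con.execute("SELECT * FROM ast1 ORDER BY pname").fetchall()
```

Calling refresh_ast1 once per summary table corresponds to the unoptimized sequential case: every call scans and joins the base tables again.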
Query Stacking
The ‘Query Stacking’ technique basically corresponds to the detection of common subexpressions ([Sell88]) in the query graph refreshing multiple summary tables. Considering two
subgraphs of the complete query graph, the following conditions must be satisfied so that the
result of one subgraph (called ‘subsumee’) can be derived from the result of the other subgraph (called ‘subsumer’):
• set of output columns:
The subsumer must produce all columns, which are needed to evaluate the subsumee.
• local predicate subsumption:
The local predicates of the subsumee subgraph must be subsumed by the predicates of
the subsumer.
• join predicates:
In general both subqueries must refer to the same set of base tables. Lossless RI-joins
realizing a 1:n relationship are an exception to this rule: the subsumer may exhibit joins
to tables which are not referenced by the subsumee; analogously, a subsumee may have
an extra child connected by an RI-join which is not referenced by the subsumer.
• group-by expression:
Existing group-by expressions of the subsumee must be derivable from the subsumer.
Derivability in this context encompasses either subsetting in the case of simple
grouping expressions consisting of a set of grouping columns, or slicing in the presence
of complex grouping expressions based on CUBE(), ROLLUP(), or GROUPING
SETS().
If all these conditions are fulfilled, the system generates a compensation graph. This query
graph basically corresponds to the original subsumee subgraph, but includes the adjustments necessary to produce the same result from the output of the subsumer instead of from
the base tables. Once the compensation graph is built, it can be stacked
on top of the unmodified subsumer subgraph (Figure 5.1),
thus avoiding accessing the raw data and evaluating the join operations twice. For further details concerning these techniques, we refer the reader to [ZCPL00].
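The effect of a compensation can be checked on a small example. The following Python sketch with SQLite is only an illustration, not the DB2 matching algorithm: the denormalized sales table, its rows and the name shared_tmp for the subsumer result are assumptions. The subsumer groups at a finer level and keeps locid, so the subsumee can be derived from it by re-aggregation.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
    CREATE TABLE location (locid INTEGER, country TEXT);
    CREATE TABLE sales (pname TEXT, locid INTEGER, amount REAL);
    INSERT INTO location VALUES (1, 'USA'), (2, 'USA'), (3, 'Germany');
    INSERT INTO sales VALUES ('TV', 1, 5.0), ('TV', 2, 7.0), ('TV', 3, 9.0),
                             ('Radio', 1, 3.0), ('Radio', 1, 4.0);
    -- Subsumer: finer grouping; locid is kept so that the join and the
    -- local predicate of the subsumee can still be applied afterwards.
    CREATE TABLE shared_tmp AS
        SELECT pname, locid, SUM(amount) AS sum_amt, COUNT(*) AS cnt
        FROM sales GROUP BY pname, locid;
""")

# Subsumee evaluated directly on the base data.
direct = con.execute("""
    SELECT s.pname, SUM(s.amount), COUNT(*)
    FROM sales s, location loc
    WHERE s.locid = loc.locid AND loc.country = 'USA'
    GROUP BY s.pname ORDER BY s.pname
""").fetchall()

# Compensation stacked on the subsumer: re-aggregate the partial results.
# SUM(amount) becomes SUM(sum_amt), COUNT(*) becomes SUM(cnt).
stacked = con.execute("""
    SELECT t.pname, SUM(t.sum_amt), SUM(t.cnt)
    FROM shared_tmp t, location loc
    WHERE t.locid = loc.locid AND loc.country = 'USA'
    GROUP BY t.pname ORDER BY t.pname
""").fetchall()
```

Both queries return the same rows, but the stacked variant reads only the pre-aggregated intermediate result instead of the base data.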
Query Sharing
The general idea of ‘Query Sharing’ is that, if the query stacking technique introduced before
is not applicable for two given query subgraphs, an artificially constructed common subexpression is injected into the global query graph. This subexpression is customized in such a
way that it can be exploited by both queries. Referring to the sample database scenario, Figure 5.2 shows an example of two summary table specifications (queries), which are not suitable for query stacking, because one query computes the sum of transaction amount, whereas
the other query computes the average transactional amount. Furthermore, one query is grouping by product groups and states within USA, which is not compatible with the grouping
combination of product group and customer name of the other query.
When constructing a common subexpression for two given query graphs, the following components of a query specification have to be considered and adjusted adequately in order to
come up with a suitable solution:
[Figure: query graph over the base tables trans, transitem, location, pgroup, acct and cust. The statement for AST2 is shown before and after stacking; after stacking it reads from $temp, the intermediate result that also populates AST1.]

AST2, evaluated on the base tables:
  insert into AST2
  SELECT   p.pname, SUM(amount) AS sum_amt, COUNT(*) AS cnt
  FROM     trans t, transitem ti, location loc, pgroup p
  WHERE    ...
    AND    ti.pgid = p.pgid
    AND    t.locid = loc.locid
    AND    ...
  GROUP BY p.pname

AST2 after stacking:
  insert into AST2
  SELECT   p.pname, SUM(sum_amt) AS sum_amt,
           SUM(cnt) AS cnt                    // ==> COUNT(*)
  FROM     $temp t, location loc
  WHERE    t.locid = loc.locid
    AND    ...
  GROUP BY p.pname

AST1 ($temp):
  insert into AST1
  SELECT   p.pname, c.cname, t.locid,
           SUM(amount) AS sum_amt, COUNT(*) AS cnt
  FROM     trans t, transitem ti, acct a, cust c, pgroup p
  WHERE    ...
    AND    ti.acctid = a.acctid
    AND    a.custid = c.custid
  GROUP BY p.pname, c.cname, t.locid

Fig. 5.1: Sample Query Graph for ‘Query Stacking’
• column output list adjustment:
The common subsumer has to provide all columns which serve as input columns of
either query. The set of columns therefore encompasses all columns that are aggregated, used in join or local predicates, or used in the GROUP BY
expressions of both query subgraphs.
• child adjustment:
In general, both queries defining the common subexpression have to exhibit the same
set of children with the same join predicates. This restriction is relaxed if the extra children
of both queries implement a lossless join, i.e. the tables which are not referenced
by the other query are the 1-side of a 1:N relationship.
• predicate adjustment:
All local predicates of the individual queries referring to the same table are combined by disjunction
and form the local predicate of the common subexpression. Local predicates on tables
which are not shared by the two subexpressions are ‘deferred’ to the individual query.
[Figure: query graph over the base tables trans, transitem, location, pgroup, acct and cust. The statements for AST3 and AST4 both read from the artificially constructed common subexpression $temp.]

AST3:
  insert into AST3
  SELECT   pname, state,
           SUM($1) AS sum_amt,
           SUM($2) AS cnt
  FROM     $temp
  WHERE    country = 'USA'
  GROUP BY pname, state

AST4:
  insert into AST4
  SELECT   pname, cname,
           SUM($1) / SUM($2) AS avg_amt,      // ==> AVG(amount)
           SUM($2) AS cnt
  FROM     $temp
  GROUP BY p.pname, c.cname

$temp:
  SELECT   p.pname, c.cname,
           loc.state, loc.country,
           SUM(amount) AS $1,
           COUNT(*) AS $2
  FROM     trans t, transitem ti, acct a,
           cust c, pgroup p, location loc
  WHERE    ...
    AND    ti.acctid = a.acctid
    AND    a.custid = c.custid
    AND    t.locid = loc.locid
  GROUP BY p.pname, c.cname,
           loc.state, loc.country

Fig. 5.2: Sample Query Graph for ‘Query Sharing’
• grouping column adjustment:
The set of grouping columns in the common subexpression obviously consists of the
union of all grouping columns of the two queries. Moreover, all columns, which appear
as local predicates in the original subexpressions must be added to the set of grouping
columns in order to ‘survive’ the common subexpression.
Figure 5.2 shows an example of an artificially constructed common subexpression for
two query subgraphs. Since the left query has a local predicate on country, this column is
added to the GROUP BY list of the common subexpression. Moreover, it is worth mentioning
that adjustments are again necessary in the original query graphs to produce the correct result. Specifically, the COUNT()-values are computed by a summation over the COUNT()-values in the common subexpression. A detailed discussion of the technique may be found in
[LZCP01].
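The adjustments listed above can be illustrated with a small Python sketch using SQLite. It is not the DB2 implementation: the joins of the sample schema are collapsed into one denormalized sales table, and the table name shared, the column names s and c, and the rows are assumptions for the example. The common subexpression groups by the union of both grouping lists plus the predicate column country and stores SUM and COUNT, from which both the sum query and the average query are derived.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
    CREATE TABLE sales (pname TEXT, cname TEXT, state TEXT, country TEXT, amount REAL);
    INSERT INTO sales VALUES
        ('TV', 'Ann', 'CA', 'USA', 4.0), ('TV', 'Ann', 'CA', 'USA', 6.0),
        ('TV', 'Bob', 'NY', 'USA', 8.0), ('TV', 'Ann', 'BY', 'Germany', 2.0);
    -- Common subexpression: union of both grouping lists plus the
    -- predicate column country; AVG is split into SUM and COUNT.
    CREATE TABLE shared AS
        SELECT pname, cname, state, country,
               SUM(amount) AS s, COUNT(*) AS c
        FROM sales GROUP BY pname, cname, state, country;
""")

# First query: sum per product group and state, restricted to the USA.
by_state = con.execute("""
    SELECT pname, state, SUM(s), SUM(c) FROM shared
    WHERE country = 'USA' GROUP BY pname, state ORDER BY pname, state
""").fetchall()

# Second query: AVG(amount) per product group and customer as SUM(s) / SUM(c).
by_cust = con.execute("""
    SELECT pname, cname, SUM(s) / SUM(c), SUM(c) FROM shared
    GROUP BY pname, cname ORDER BY pname, cname
""").fetchall()

# Reference result computed directly on the base data.
avg_direct = con.execute("""
    SELECT pname, cname, AVG(amount), COUNT(*) FROM sales
    GROUP BY pname, cname ORDER BY pname, cname
""").fetchall()
```

Storing the average itself in the common subexpression would not work, since averages of groups cannot be combined without their counts; this is why the sketch carries SUM and COUNT separately.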
6 Summary and Future Work
This paper outlines the current state of the art and ongoing research work in maintaining
ASTs in the IBM DB2/UDB database system. The incremental maintenance technique,
which is part of DB2 V7.1, allows the specification of a complete OLAP scenario using a single summary table and maintains the precomputed data incrementally, i.e. it avoids potentially expensive full recomputations. If, however, a full recomputation is required, the research
work outlined here shows how to speed up that process. The ‘Query Stacking’ technique detects and
uses existing common subexpressions in a query graph needed to recompute multiple ASTs.
If stacking is not possible, the ‘Query Sharing’ approach tries to build a generic common subexpression which can be exploited by the individual queries.
In summary, the discussed maintenance strategies provide a sound basis for a powerful data
warehouse infrastructure based on DB2. Ongoing work is extending this infrastructure in several directions ([BCLS00], [BCP+99]).
References
BCLS00  Beyer, K.; Cochrane, B.; Lindsay, B.; Salem, K.: How To Roll a Join. In: SIGMOD’2000
BCP+99  Beyer, K.; Cochrane, B.; Pirahesh, R.; Sidle, R.; Shanmugasundaram, J.; Mohan, C.; Salem, K.:
        Intermediate Propagate Deferred Apply for Incremental Maintenance of Materialized Views. IBM
        research paper, 1999
GBLP96  Gray, J.; Bosworth, A.; Layman, A.; Pirahesh, H.: Data Cube: A Relational Aggregation Operator
        Generalizing Group-By, Cross-Tab, and Sub-Total. In: ICDE’96, pp. 152-159
HaRU96  Harinarayan, V.; Rajaraman, A.; Ullman, J.: Implementing Data Cubes Efficiently. In: SIGMOD’96,
        pp. 205-216
LZCP01  Lehner, W.; Zaharioudakis, M.; Cochrane, R.; Pirahesh, H.: Active Query Matching for
        Optimization of Multiple Summary Tables. In: ICDE’2001
MuQM97  Mumick, I.; Quass, D.; Mumick, B.: Maintenance of Data Cubes and Summary Tables in a
        Warehouse. In: SIGMOD’97, pp. 100-111
Sell88  Sellis, T.K.: Multiple Query Optimization. In: ACM Transactions on Database Systems, 13(1988)1,
        pp. 51
ZCPL00  Zaharioudakis, M.; Cochrane, R.; Pirahesh, H.; Lapis, G.: Answering Complex SQL Queries Using
        Summary Tables. In: SIGMOD’2000
SERVER SIDE WEBSITE WRAPPING:
THE XWEB APPROACH
Jürgen Lukasczyk, Wolfgang Hümmer
EMail: {jnlukasc, huemmer}@immd6.informatik.uni-erlangen.de
Abstract
The Internet offers a vast pool of information. This information is primarily presented in a format for human readers. There are several good reasons why this huge
amount of data should be integrated into applications instead, e.g. for processing it or
comparing it with other sources/experiences. Extracting this information from web sites
is not trivial, because what human eyes perceive quite easily is most of the time rather
complicated HTML code. The main problems are, first, that HTML offers only few
semantic tags from which a program could infer the logical structure of a document and, second,
that HTML is often abused on web sites for the sake of a beautiful
layout.
The task of extracting information from web sites is in general left to so-called wrappers:
they have to communicate with a site, retrieve the information from it and transform it
into the format a higher-level application requires. In this paper, we give an overview of the
general concepts that should be kept in mind when building wrappers. Furthermore, we
present our XWeb* wrapper framework. XWeb relies completely on open standards like
XML and XSL. The benefit is twofold: the full expressive power of XSL can
be used for the extraction process itself, and no additional query language has to be learnt.
* XWeb stands for eXtract Web sites
1 Introduction
Nowadays a vast pool of information is accessible via the World Wide Web (WWW) through
various sources, most of which offer their service for free. With the sources being that manifold, it is nearly impossible, or at least very time-consuming, to monitor each and every web
site of interest, because the amount of information that is offered and of value at the
same time might be very small for a given site.
Obtaining, grouping and selecting information by relevance has always been the task of
editorial staff. The so-called “portals” seen throughout the Internet these days are truly signs
of a flood of information sources. A more cost-effective, flexible and elegant solution than
“hand-made” portals would surely be software that automatically retrieves all new information
from various sources as soon as it becomes available, groups related information together and
finally informs all users about the availability of new information matching their specified profile.
Such a system for information integration needs one thing foremost: source data. Unfortunately, raw web sites can hardly be used directly as input sources, as they deliver their data
packaged into HTML [W3C99c] markup, combined with all sorts of (at least for the given
use case) irrelevant data. Web sites are created for human readers or potential customers
(think of all the ads), but not at all for computer programs. The fact that all data in an HTML
page is already "marked up" does not make things considerably easier for programs, as the
markup elements (commonly referred to as "tags") describe the page layout, not the meaning
of the contained data, and (which is the dilemma of HTML) are abused for various purposes
not intended by the creators of HTML. An arbitrary news page, for example, might
come with the headings in bold letters, but this visual emphasis could be achieved through a
variety of tags like <b> (for bold), <strong> or some heading level (<h1>), and none of
these would clearly state that the enclosed text actually is the (news) headline.
As there is no easy-to-learn and widely used query tool for HTML like there is SQL for databases, tapping the Web needs some work. In principle it is possible to write wrappers for
web pages by hand. Programming languages like Perl offer all necessary capabilities. But
this manual process is rather error-prone and time-consuming, and if the source changes, the whole wrapper has to be reimplemented. So the idea of separating the whole wrapping
task into smaller steps suggests itself: fetching data from the source and generating the desired output format are rather constant steps that wrappers for all kinds of pages have in common.
The only differing (and difficult) step is extracting the right information from the sources.
In this paper we describe our XWeb system, a framework for wrapping arbitrary web sites.
XWeb is heavily based on the open standards XML and XSL, thus not creating another web
query language. All configuration files for XWeb are formulated in XML. Furthermore, this approach allows us to choose from a number of implementations to construct our wrapper
framework. In addition, nearly every application able to produce or read XML can take
part in our system, and the number of such applications is growing quickly.
Terminology
Throughout this paper, the term web wrapper will refer to a single program that is able to
extract a given set of information from the contents of a given web site. A
wrapper framework is a set of modules and usage instructions that facilitate the execution of
a wrapper. Finally, a web source denotes an information source that is reachable (e.g. via a
network) from the machine executing the wrapper. The web source publishes information in
a data format† which can be processed by the wrapper.
A web source’s scheme describes how information is embodied within a web page, e.g. how
the semantic elements are placed on the page and by which syntactic tokens they are delimited. These tokens could be HTML element tags or special text strings. For example, in a news
article, the author’s name could be surrounded by a certain tag, say <b>, thus making extraction very simple, but it could also be marked through a string like "written by". The scheme
has to be known at wrapper creation time and must not change while the wrapper is executing; otherwise data would be interpreted incorrectly and the wrapper would export no or even
wrong data.
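The two kinds of delimiting tokens mentioned above can be made concrete with a short Python sketch. The sample page and the function names author_by_tag and author_by_marker are invented for this illustration and are not part of XWeb.

```python
import re

page = "<p><b>Jane Doe</b> reports on the launch.</p><p>written by John Roe</p>"

def author_by_tag(html, tag="b"):
    """Scheme A: the author is delimited by a start/end tag pair."""
    m = re.search(rf"<{tag}>(.*?)</{tag}>", html, re.S)
    return m.group(1).strip() if m else None

def author_by_marker(html, marker="written by"):
    """Scheme B: the author follows a literal text marker, up to the next tag."""
    m = re.search(re.escape(marker) + r"\s*([^<]+)", html)
    return m.group(1).strip() if m else None

tag_author = author_by_tag(page)
marker_author = author_by_marker(page)
# A scheme change (here: <b> replaced by <strong>) silently yields no data.
after_change = author_by_tag(page.replace("b>", "strong>"))
```

The last line shows why the scheme must stay constant: after the source switches tags, the unchanged rule no longer matches and the wrapper exports nothing.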
The demand for a constant scheme is truly a hard one, and one that cannot be fulfilled easily,
as web sites are subject to many changes within short time frames. Changes can affect URL
addresses, layout, content, etc. Coping with changes is not easy, since a change in the web
site can invalidate the rules set up for information extraction (see below). Thus, for a resilient
wrapper framework it is essential to support extraction rules based on content rather than
physical data structures. In addition, sophisticated error reporting is needed to detect changes
in the source that break the wrapper itself and make a wrapper rewrite necessary.
Outline
The paper is organized as follows: Section 2 presents some other projects concentrating on
the wrapping of web pages or flat text files, an even harder task to be done. After that we will
present some general observations about wrapping. We will identify three main processes,
fetching, extracting and output, and derive from that a simple reference architecture.
Section 4 describes the XWeb framework, our implementation of the concepts explained before, in detail. In section 5 we will clarify these concepts by describing how XWeb can be
used for wrapping news.com. The paper will close with a short conclusion.
† At the time of this writing, this format usually is HTML transmitted via HTTP. In the near future this could
also be XML. A good wrapper framework should be able to cope with both.
2 Related Work
Combining information from many different sources has always been an area of great interest, and
with the huge success of WWW technologies throughout the world, research has intensified further. We will now briefly discuss some other projects for wrapper construction. Nearly
all of them work in a semi-automatic fashion; some of the approaches concentrate on HTML
sources, others on pure text files, thus having less information about the structure of the pages
to exploit.
XWrap
XWrap [LiPH99] automatically generates Java code from XML files specifying the extraction rules in a proprietary, problem-specific, high-level programming language using tree-based rules. Extraction is split into three stages: First, the HTML source is cleaned, then the
"interesting region" (called Area in our terminology) is extracted and finally a list of comma-separated
values (CSV) consisting of all "semantic tokens" (similar to our Data items) is created. In the following output step, a hierarchical data structure is produced from the CSV, following rules from a user-defined template, and written out as an XML file.
The aim in this project is to create XML files from a source web site, similar to our approach.
Yet the basis for the result is always a single HTML page and the focus lies on easy creation
of wrappers and not on the description of a whole extraction process. XWrap features a rich
set of tools to support both semi-automatic rule creation as well as output mapping.
TSIMMIS
"The Stanford-IBM Manager of Multiple Information Sources" (TSIMMIS, [CGH+94])
tries to integrate different heterogeneous information sources by presenting the user a uniform query API. It is therefore not a wrapper system like XWeb or the other approaches presented in this section but a more general mediator system using wrappers to connect different
kinds of information sources to the global information system. Unlike XWeb, source information is not converted and stored for query execution, but every user query is passed on to
the wrapper for query rewrite and execution. This way it is possible to not only integrate
WWW sources that have little or no built-in query capabilities, but also legacy and database
systems [HGC+97]. A similar approach was used in management information systems before data warehouses became popular.
The WWW wrapper [HGN+97] used in TSIMMIS uses fast RegExp matching for content
extraction. Every command binds a result in string format to a name. A command can either
be a RegExp or a retrieval function that actually fetches raw data from the web. Thus it is possible to refine the RegExp searches consecutively (as the matched string is assigned to a variable that can be used as input for another RegExp) as well as to include content from more than
one source.
After all commands have been executed, the intermediate result is exported as OEM data that
can be further processed (e.g. restricted, sorted, ...) by the wrapper framework. OEM
[PaGW95] stands for Object Exchange Model, a schemeless data format used inside TSIMMIS that is well suited for semistructured data and in fact quite similar to XML, yet not as easy to use.
The result’s hierarchical structure is implicitly defined by the position at which the wrapper commands appear in its definition file.
XWeb itself does not feature a mediator facility, as it relies on superordinate programs to
integrate the XML content it creates. While the RegExp search approach taken in TSIMMIS
wrappers is quite fast (reported 10 MB/s) and can surely achieve the same functionality as
tree-based languages, it is not as intuitive, because every match rule has to include a part to
provide for proper element nesting. The reason is that RegExps have no notion of the
element nesting inside HTML, so an expression to match the text inside an element
(meaning between the start and end tag) would fail if the element is contained within another
element of the same type. In this case, the text between the outer start and inner end tag would
be returned.
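The nesting problem is easy to reproduce. The following sketch uses Python's re module instead of the TSIMMIS wrapper language, and the sample markup is invented for the illustration.

```python
import re

html = "<div>outer <div>inner</div> tail</div>"

# Non-greedy match: stops at the first </div>, which closes the inner element,
# so the result runs from the outer start tag to the inner end tag.
lazy = re.search(r"<div>(.*?)</div>", html).group(1)

# Greedy match: spans to the last </div>. This happens to be right here, but
# over-matches as soon as two sibling elements follow each other.
greedy = re.search(r"<div>(.*)</div>", html).group(1)
siblings = re.search(r"<div>(.*)</div>", "<div>a</div><div>b</div>").group(1)
```

Neither variant tracks which end tag belongs to which start tag, which is why each match rule would need extra logic to handle nesting.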
Nevertheless, RegExp searches could easily be integrated into XWeb by simply writing a
step module that treats its XML document input as a string and then executes RegExp
searches on it. The output document construction could well follow the TSIMMIS way,
though a more flexible way to structure the result document might be desirable.
As our main purpose in XWeb is to periodically execute tasks that update the storage of a data
warehouse-like integration system, raw processing speed is not as much of the essence as in
TSIMMIS, where wrapping has to take place every time a user issues a query. In addition, the
need to include navigational information with correct element nesting in already complex
RegExps means that the wrapper author has to be a sed-hardened RegExpert, that wrapper
definitions are hard to read, and that wrapper creation takes even more work.
W4F
The "WWW Wrapper Factory" (W4F, [SaAz00]) creates Java source code from an abstract
wrapper definition, very much like XWrap. The result can be compiled into a single Java
class offering functions to retrieve the current information from the site. This class can be
incorporated into user applications, thereby hiding how the data is obtained and wrapped, yet
making access to its content as simple as a method call. The resulting Java class uses
a shared library of basic utility functions, like an HTML parser and an XML writer.
W4F uses a high-level description for the wrapper, consisting of a retrieval clause that basically specifies the source URL, extraction rules and an output mapping scheme. A very sophisticated path-expression-based query language called HEL (HTML Extraction Language)
is used to specify the extraction rules. The extraction phase yields a Nested String List (NSL),
a Lisp-like data structure consisting of lists that can contain other lists or strings. The nesting
of the NSL output depends on the formulation of the extraction rules.
The result data is returned to the calling application either as generic Java string objects
or as custom user-defined objects. For example, suppose a wrapper reads data from a web site
with movie information. Movies have a title and a list of cast members. It is now possible to
have the generated wrapper return the extracted information as simple strings, or as an object
of type Movie that is constructed from the data contained in the NSL. In the latter case, the
conversion code that initializes the class from the NSL has to be written by the user.
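As an illustration of the movie example, the following sketch models a nested string list and a user-written conversion into a Movie object. It is written in Python rather than Java and does not use the W4F API; the type alias NSL, the function movie_from_nsl and the assumed layout [title, [cast members]] are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Union

# A nested string list: every item is either a string or another such list.
NSL = List[Union[str, "NSL"]]

@dataclass
class Movie:
    title: str
    cast: List[str]

def movie_from_nsl(nsl: NSL) -> Movie:
    """User-written conversion; assumes the layout [title, [actor, actor, ...]]."""
    title, cast = nsl  # raises ValueError if the list does not have two items
    if not isinstance(title, str) or not isinstance(cast, list):
        raise ValueError("unexpected NSL layout")
    return Movie(title=title, cast=[str(name) for name in cast])

movie = movie_from_nsl(["Casablanca", ["Humphrey Bogart", "Ingrid Bergman"]])
```

Since the nesting of the NSL depends on how the extraction rules are formulated, such a conversion is tied to one particular set of rules and has to change with them.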
It is also possible to specify an XML mapping, using a template file that is filled at runtime
with the data from the NSL. The structure of this template has to follow that of the NSL, so
this process could simply be viewed as a "labeling" of the NSL structure. Values cannot be
exported as attributes or other XML structures and always appear enclosed by the element tags
specified in the template.
W4F relies on the calling application to follow links and integrate data from more than one
site. It does not feature a control flow description; the whole extraction process is encapsulated inside the generated class. The extraction language, though well thought out, appears
to be less intuitive than XSLT, for example, as it uses index variables to connect a rule expression (WHERE-clause) with the extraction path expression. XSLT can incorporate these expressions directly into its path expressions.
As XSLT and XPath do not appear to be less powerful than W4F’s HEL but are more commonly used, it appears to be a good idea to base the whole extraction on these standards, as
XWeb does. When the result is to be written out as XML, the use of XML standards is preferable to proprietary solutions anyway.
Wrapper Induction
The authors of [KuDW97] present an approach for wrapping arbitrary text files, i.e. they do
not depend on the additional structural clues that pure HTML wrappers need. The key contribution
is induction, which means that the wrapper is generated by learning from a set of example
queries and responses. The quality of the wrapper is assessed using the so-called PAC model
(probably approximately correct). The authors claim that the class their wrappers can effectively learn corresponds to HLRT, which in turn corresponds to a class of finite state machines
(FSM): the parameters necessary for identifying the interesting part of a page are the
head (H) of the area, the left (L) and the right (R) delimiter of each data set and the tail (T),
i.e. the beginning of the area the wrapper is not interested in. In the case of an HTML table,
for example, one could use <table> as head, </table> as tail and <tr> and </tr> as left and
right delimiters.
The authors tested their approach on 100 Internet sources and found that nearly half of them
could be wrapped using wrapper induction. At present their implementation can only wrap
information arranged in tables.
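How a wrapper of this class is executed, once its four delimiters are known, can be sketched in a few lines of Python. This shows only the extraction side, not the induction (learning) step, and it is simplified to a single left/right pair as in the table example above; the function name hlrt_extract and the sample page are assumptions.

```python
def hlrt_extract(page, head, left, right, tail):
    """Return every string found between `left` and `right` inside the
    region that starts after `head` and ends before `tail`."""
    start = page.find(head)
    if start < 0:
        return []
    start += len(head)
    end = page.find(tail, start)
    region = page[start:end] if end >= 0 else page[start:]
    items, pos = [], 0
    while True:
        begin = region.find(left, pos)
        if begin < 0:
            break
        begin += len(left)
        stop = region.find(right, begin)
        if stop < 0:
            break
        items.append(region[begin:stop])
        pos = stop + len(right)
    return items

page = "<h1>Codes</h1><table><tr>Congo 242</tr><tr>Egypt 20</tr></table><tr>ad</tr>"
rows = hlrt_extract(page, "<table>", "<tr>", "</tr>", "</table>")
```

The head and tail delimiters keep the loop from picking up look-alike rows outside the area of interest, such as the trailing <tr>ad</tr> in the sample page.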
The Wrapper Generation Toolkit
[AsKn97] presents a wrapper generation toolkit concentrating on web pages by exploiting
the additional semantics of HTML tags. Once more a semi-automatic system is presented,
thus allowing the user to correct errors produced by the system. The authors concentrate on
semi-structured web sources having multiple instances, like the CIA World Fact Book containing information about 267 countries in the world.
Wrapper generation is done in three steps. First of all, the source has to be structured, i.e. sections and subsections of interest have to be identified. Sections are found by identifying tokens, which in turn are identified by certain presentational features like bold fonts or upper-case
letters using a lexical analyser, i.e. by regular expressions. After that, the nesting hierarchy of the sections found before has to be determined. This structure is determined by following two heuristics, one based on the font size, the other on indentation. Both token and
hierarchy identification are done in a semi-automatic fashion, thus allowing the user to intervene and correct the structuring process at any time.
The second step is building a parser for the source pages. With the grammar produced in the
step before this parser can be built fully automatically by using a parser generator like YACC
together with the LEX tool.
The final step consists in adding communication capabilities. The authors see their wrappers
as part of a larger mediator system. So every wrapper has to be able to accept requests from
the mediator system, to identify the necessary network locations of the desired page(s) for
satisfying the request and finally to transfer data over the network. Obviously this step needs
user interaction.
The authors tested their wrapper generation toolkit on several multiple and single instance
sources. The necessary amount of time as well as the number of manual corrections are rather
low thus making the toolkit very useful. The authors identify the structuring step (step 1) as
the most difficult one.
NoDoSE
Brad Adelberg presents in [Adel98] the NoDoSE (Northwestern Document Structure Extractor) system for wrapping sets of similarly structured or semi-structured text files. The extraction
process is performed in three steps. After deciding on how to model the data in the documents, the user hierarchically decomposes them into the regions of interest. Finally an
adequate output format has to be chosen.
The data model developed in the first step serves as a conceptual model of the input files. NoDoSE offers several data structures for this purpose: atomic types like integer, float or string
and collections (set, bag, list and record). For semi-structured data the members of the collections are not restricted to a single type. The decomposition step is started by the user
graphically processing the first document by mapping it to the individual structures of the data
model created in the previous step. The user need not map every single element of the document
but can let the system infer the rules for similar parts. Thus an inferred grammar is produced which is applied to the remaining documents. Two kinds of errors can happen: either
the file contains a (typographical) error or it contains a new structural element that has to be
integrated into the grammar. The third step is converting the parsed files to a certain output
format. The author mentions output for database systems, interactive reporting tools or
spreadsheet programs.
The NoDoSE system is implemented using Java Beans, so single components can be exchanged easily. It is meant as test bed for different structure mining algorithms implementing
the grammar inference in step 2; so far a mining component for HTML and plain text files is
implemented.
3 General Aspects of Wrapping
Before describing our XWeb framework in detail, this section examines general concepts
of wrapping web sources. Most concepts presented here are also applicable to wrapping arbitrary sources. The reason for developing wrappers is to reduce the number of changes in applications utilizing external sources: the web nowadays is changing at a fast pace, so
if an application wants to integrate weather information from www.weather.com and
includes the extraction logic within itself, this application will have to be recompiled every
time the web master of weather.com decides to change the site’s layout. To avoid this, a level
of indirection is introduced between the information sources and the applications, which can
be realized by wrappers. Wrappers have to extract the information from the sources and convert it into the output format the application on top needs. So if weather.com changes, only the
corresponding wrapper has to be adapted, not the main application.
But having to change even only the wrappers is not too satisfactory. Wrappers are often hand-coded
in processing languages like Perl, which are rather hard to read and adjust. So it is reasonable to examine the individual steps a wrapper executes in more detail and then to create a
corresponding modular architecture that allows a wrapper to be adjusted to a changing source simply by exchanging or adjusting a single module.
3.1 The Process of Wrapping
Let us now examine the single processes to be executed when extracting information. These
processes can be classified into three areas which will make up our reference architecture in
section 3.2. Consider the basic wrapping process depicted in Figure 3.1. We will now discuss
every step in detail.
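[Figure: a wrapper system with the three stages Fetch, Extract and Output. Fetch receives a URL and retrieves HTML from the source (website); Extract applies the extraction rules and produces an internal data representation; Output applies the formatting rules and produces the result document.]
Fig. 3.1: Schematic overview of a basic wrapping process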
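The three-stage structure of Fig. 3.1 can be expressed as a small Python sketch in which each stage is an exchangeable function. This is not the XWeb implementation; the type aliases, the function run_wrapper and the stand-in stages (including the fixed sample page returned by fake_fetch) are assumptions for the illustration.

```python
from typing import Callable, Dict, List

Record = Dict[str, str]
Fetch = Callable[[str], str]              # URL -> raw source text
Extract = Callable[[str], List[Record]]   # raw text -> internal representation
Output = Callable[[List[Record]], str]    # internal representation -> result document

def run_wrapper(url: str, fetch: Fetch, extract: Extract, output: Output) -> str:
    """Chain the three stages; each one can be exchanged independently."""
    return output(extract(fetch(url)))

# Stand-in stages so that the sketch runs without network access.
def fake_fetch(url: str) -> str:
    return "<html><b>Headline one</b><b>Headline two</b></html>"

def bold_extract(raw: str) -> List[Record]:
    parts = raw.split("<b>")[1:]
    return [{"headline": part.split("</b>")[0]} for part in parts]

def xml_output(records: List[Record]) -> str:
    items = "".join(f"<item>{r['headline']}</item>" for r in records)
    return f"<result>{items}</result>"

doc = run_wrapper("http://example.org/news", fake_fetch, bold_extract, xml_output)
```

If the source changes its layout, only the extract function has to be replaced; the fetch and output stages stay untouched.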
Fetch Process
First of all, the raw source data has to be fetched by the application, bringing it into the address space of the machine executing the wrapper. A flexible implementation should allow
the use of many protocols and variants, as well as an extension for future protocols.
The most common and convenient way to specify the source is to use URL references. This
allows for simple cut&paste address transfer from standard applications like web browsers
into the wrapper.
A single site can deliver a huge amount of data adhering to the same scheme but under different addresses. For example, a web site that delivers stock information might use one address per stock, though the schemes are always the same, so the same wrapper can be used.
To avoid creating more than one instance of a wrapper, the wrapper (or better its framework)
has to support parameterizable location templates. The wrapper will then try to fetch any
source data corresponding to the template.
The specification of these templates has to be protocol neutral, hiding the protocol peculiarities. The HTTP protocol, for example, has two “dialects” for passing parameters to sites: GET
transfers the parameters in the URL, POST sends them in the body of the HTTP request. A
wrapper writer should not have to care about the method of transportation except for specifying which of them shall be used.
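A parameterizable location template with a protocol-neutral parameter list might look as follows. This is a minimal sketch in Python, not XWeb's actual interface: the helper build_request, the host stocks.example.com and the parameter names are hypothetical and only illustrate that the wrapper writer names the method while the framework decides where the parameters go.

```python
from string import Template
from urllib.parse import urlencode
from urllib.request import Request


def build_request(url_template, method, params, **location):
    """Build an HTTP request from a location template (hypothetical helper)."""
    url = Template(url_template).substitute(location)
    query = urlencode(params)
    if method == "GET":
        # GET: the parameters travel in the URL itself.
        full_url = f"{url}?{query}" if query else url
        return Request(full_url, method="GET")
    if method == "POST":
        # POST: the parameters are sent in the request body.
        return Request(url, data=query.encode("ascii"), method="POST")
    raise ValueError(f"unsupported method: {method}")


# One template serves every stock symbol; the host is a placeholder.
TEMPLATE = "http://stocks.example.com/quote/$symbol"
for symbol in ("IBM", "SAP"):
    request = build_request(TEMPLATE, "GET", {"range": "1d"}, symbol=symbol)
    print(request.get_method(), request.full_url)
```

The request objects are only constructed here, not sent, so the sketch runs without network access.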
During execution this phase is characterized by low main memory consumption, as data is
stored on mass storage, low CPU usage and long idle periods caused by network delays. It
would seem beneficial to initiate multiple page downloads at once in order to saturate a fast
internet connection.
Extraction Process
After the data has been made available, the actual extraction step can take place. This step uses
the data delivered by the preceding fetch process as its input and produces a set of extracted information
elements in an arbitrary (possibly proprietary) format as output. A set of rules, specified in the
matching language used by the system, is applied to the source data. Only elements matching the rules are transferred to the result document, thereby implicitly stripping unwanted
content from the data.
Before we discuss the options for rules languages in greater detail, it is necessary to examine
in which form extractable data appears in web pages. Web pages can be decomposed into the
following hierarchically ordered levels:
(1) Area of Interest:
It is quite obvious that not every piece of data displayed on a web page is of value in a
given application context. The area of interest is simply the superset of all single data
items that are to be extracted.
The Area does not have to be contiguous in any form, neither visually nor physically. It
might be, for example, that the user is interested in all headlines from a news web site
that follow a certain scheme. These headlines can appear on different positions on the
screen or within different markup tags in the document source. In Figure 3.2, the top
news articles are divided by images (visual) and they are child elements of different
element tags in the HTML code (physical).
The notion of an Area, being solely based on a binary isInteresting relation may
seem trivial at this point. It should be pointed out that the Area serves as the root for all
Information records (see below) within a web page.
(2) Information Record:
The Area consists of at least one Information Record (IR). Every IR is a container consisting of either IRs or Data Items (see below) combined with arbitrary data not to be
included in the result. All IRs are uniformly accessed from the Area (serving as a root
element), i.e. there is some sort of “path” from the Area to each IR.
The notion of IRs and their uniform indexing property is very useful in that it enables
uniform treatment (read: extraction) of every IR. In the example page in figure 3.2, the
Area consists of the entries of the top news, complete with headline, date of entry and
so on.
Fig. 3.2: A sample news webpage showing information containment relation (marked regions: Area of interest, Record, Data item)
(3) Singular Data Item:
These are the atomic elements that contain the information. Which pieces are data items
and which are not is determined by the wrapper generator. Although data items are atomic, this does not imply that they are simply copied into the result; it just means
that the bare information can be retrieved from them. Further transformations might be
necessary to bring the data into a format that can be used for further processing by other
software.
Consider the example of figure 3.2 and assume that the headline, abstract and date of
entry shall be used for a customized information portal. Then the wrapper generator
would define these to be the data items in the wrapper program. While the headline and
abstract can possibly be used “as is”, the date information might need some special
transformation, e.g. a conversion into UTC.
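Such a transformation rule for a date item could be sketched as follows in Python. The sketch makes two assumptions that are not part of the source: the page does not state a year, so the caller has to supply one, and "PT" is treated as a fixed offset of UTC-8 (Pacific Standard Time), ignoring daylight saving time. The function name to_utc is hypothetical.

```python
from datetime import datetime, timedelta, timezone

# Assumption: "PT" is a fixed UTC-8 offset; daylight saving time is ignored.
PACIFIC = timezone(timedelta(hours=-8))


def to_utc(raw, year):
    """Convert a string such as 'January 20, 6:00 a.m. PT' to a UTC datetime."""
    text = raw.strip()
    if not text.endswith(" PT"):
        raise ValueError(f"unexpected time zone in {raw!r}")
    # Rewrite 'a.m.'/'p.m.' so that strptime's %p directive can read them.
    text = text[:-3].replace("a.m.", "AM").replace("p.m.", "PM")
    local = datetime.strptime(f"{text} {year}", "%B %d, %I:%M %p %Y")
    return local.replace(tzinfo=PACIFIC).astimezone(timezone.utc)


# The year 2001 is an assumed value for illustration only.
print(to_utc("January 20, 6:00 a.m. PT", 2001).isoformat())
```

The headline and abstract need no such rule; for them the transformation is the identity.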
Now that we have functionally decomposed our page, it is easy to give a generalized
scheme for the extraction process itself. Every step is executed following a rule formulated
in the system’s language.
(1) Identify the Area of interest.
(2) Iterate over all Information Records in the Area.
If the children of a record are themselves IRs, apply step (2) to them recursively.
(3) For every data item, apply its transformation rule and output the result.
Note that every rule has a context (the item selected in the preceding step) in which it is being
evaluated, so all addresses used during execution should be expressed relative to this context.
This view is consistent with the standardized Document Object Model (DOM) [W3C98b]
specified by the W3C consortium for XML and HTML, the languages web pages are made
of. This fact suggests that languages used for extraction should honour this relative addressing scheme, in order to supply a natural view onto the source and to increase performance by
avoiding frequent searches for the root.
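The generalized scheme and its relative addressing can be sketched in a few lines of Python using the limited path support of the standard xml.etree.ElementTree module. The page, the three rule constants and the function extract are made up for illustration; nested Information Records are left out to keep the sketch short.

```python
import xml.etree.ElementTree as ET

# A made-up, already well-formed page used only for this illustration.
PAGE = """
<html><body>
  <div id="nav">Home | Sports</div>
  <div id="news">
    <div class="item"><h2> First headline </h2><span>Jan 20</span></div>
    <div class="item"><h2>Second headline</h2><span>Jan 19</span></div>
  </div>
</body></html>
"""

AREA_RULE = "./body/div[@id='news']"      # evaluated relative to the root
RECORD_RULE = "./div[@class='item']"      # evaluated relative to the Area
ITEM_RULES = {                            # evaluated relative to a Record
    "title": ("./h2", str.strip),
    "date": ("./span", str.strip),
}


def extract(root):
    """Area -> Information Records -> Data Items, each rule in its context."""
    area = root.find(AREA_RULE)
    results = []
    for record in area.findall(RECORD_RULE):
        row = {}
        for name, (path, transform) in ITEM_RULES.items():
            row[name] = transform(record.find(path).text)
        results.append(row)
    return results


records = extract(ET.fromstring(PAGE))
print(records)
```

Because every rule is evaluated against the node selected in the preceding step, the navigation bar outside the Area is never visited by the record or item rules.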
The rule language used is dependent upon the system’s or implementor’s choice and the
number of applicable languages to choose from is huge. We can divide the available languages into two categories.
• Regular Expression Languages:
These languages use Finite State Machines (FSM) to search for matching string areas.
Every single information unit from the source data is passed to the FSM which searches for an applicable transition to use. When a transition is found, the FSM’s internal
state changes to that specified by the transition. If the resulting state is an “end”, then
the current string has matched the RegExp. When no suitable transition is found, then
the current string has not matched and processing restarts with the next character. This
procedure is repeated for every possible substring from the source.
RegExps are well known and have many applications‡ in Computer Science,
thus good and efficient implementations are available (e.g. OroMatcher). As RegExps
require no parsing and tree creation prior to the actual matching process, they are also
very fast and memory efficient compared to tree based languages**.
Unfortunately, RegExps do not appear to be well suited for extraction of information
from SGML like data formats. RegExps have no notion of the hierarchical structure of
the source and relative addressing can only be achieved by executing the RegExp search multiple times.
• Tree based languages:
These languages use path expressions that are matched against a given tree of objects.
Starting from the root, every child object is tested against the first part of the path expression. If there is no match the whole sub tree is skipped and processing resumes
with the next sibling. If a match is found the process is recursively applied to every
child of the matching node, using the next part of the expression for testing. Examples
of this type of language are XPath [W3C99a] from the W3C consortium or HEL
used in W4F [SaAz00].
These languages honour the hierarchical structure of the physical web page formats
and addressing is relative to the context node of the preceding step. While this eases
wrapper generation for the user, creating the tree from the source requires parsing it,
leading to increased usage of CPU time and memory resources.
‡ Of which the programming language syntax highlighting in Emacs undoubtedly is the most remarkable and
useful.
** One can expect RegExps to be about 5000 times faster than tree based matching for a single execution.
Consecutive searches on the result are faster with trees, because no parsing has to be done (the result tree from
the first run does not have to be rebuilt).
Most wrapper frameworks use proprietary tree based languages for matching expressions,
because they make it easier to exploit the hierarchical structure of HTML files than regular
expression languages do. Some wrapper frameworks offer both kinds of languages.
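The difference between the two families can be illustrated on a toy document. The following Python sketch does not use any of the languages named above: the standard re module stands in for a regular expression engine and the path support of xml.etree.ElementTree for a tree based language. The document DOC and the pattern are made up.

```python
import re
import xml.etree.ElementTree as ET

# Made-up sample source with two records.
DOC = ("<news><item><a href='/a1'>Alpha</a></item>"
       "<item><a href='/b2'>Beta</a></item></news>")

# Regular expression: scans the raw string, no parsing, no notion of nesting.
LINK_RE = re.compile(r"<a href='([^']*)'>([^<]*)</a>")
regex_hits = LINK_RE.findall(DOC)

# Tree based: parse once, then evaluate each path relative to a context node.
root = ET.fromstring(DOC)
tree_hits = [
    (link.get("href"), link.text)
    for item in root.findall("./item")   # context: the root element
    for link in item.findall("./a")      # context: the current item
]

print(regex_hits)
print(tree_hits)
```

Both variants return the same pairs here. The regular expression, however, depends on the exact spelling of the markup (quote style, attribute order), while the path expressions depend only on the element structure.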
A wrapper needs a fair amount of CPU time and memory (especially with tree-based expressions) during the execution of the extraction phase. As there is no I/O or network access involved, idle times are minimal. It is therefore not advisable to start too many threads for this
phase at once, yet extracting data while doing I/O intensive work could actually lead to better system utilization.
Output Process
After the result has been created, the wrapper has to prepare it for use in other applications.
Most wrappers internally use a different data format than the application they send the data
to. It is thus necessary to transform the result into the expected format, either application specific memory objects or semi-structured data formats like OEM [PaGW95] or XML
[W3C98a]. As web site content is typically semi-structured, it is convenient though not required to use semi-structured data formats, but relational models or simple text files are also
possible. A flexible wrapper should allow the user great freedom in specifying how the result
shall be formatted.
Passing the result to the target application is the final step in the wrapper. Possibilities here
range from simple file persistency (writing output to a file on disk) to sophisticated network
transport mechanisms like SOAP or RPC.
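The separation between the internal representation and the output format can be sketched as follows. In this Python sketch the internal representation is assumed to be a list of dictionaries, and the two formatter functions to_xml and to_csv as well as the element names are hypothetical; a real wrapper would let formatting rules decide the target structure.

```python
import csv
import io
import xml.etree.ElementTree as ET


def to_xml(records, root_tag="TOP-NEWS", record_tag="NEWS"):
    """Semi-structured output: one element per record, one child per item."""
    root = ET.Element(root_tag)
    for record in records:
        node = ET.SubElement(root, record_tag)
        for key, value in record.items():
            ET.SubElement(node, key.upper()).text = value
    return ET.tostring(root, encoding="unicode")


def to_csv(records):
    """Flat output: the same records as a simple text table."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=list(records[0]),
                            lineterminator="\n")
    writer.writeheader()
    writer.writerows(records)
    return buffer.getvalue()


sample = [{"title": "A & B", "date": "Jan 20"}]
print(to_xml(sample))
print(to_csv(sample))
```

Either string could then be written to a file or handed to a transport mechanism; the extraction phase does not need to know which formatter is in use.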
3.2 The Architecture of a Wrapper
From the ideas about the individual processes gathered in the previous section, we will now derive a simple reference architecture, primarily meant to explain the design of different wrappers and especially XWeb. As figure 3.3 shows, our architecture consists
of one component per process step plus an additional orchestration element that determines
the sequence in which the processes run, which modules are executed and so on. Furthermore the wrapper comprises an internal storage facility, called cache in the figure.
This architecture is very flexible: it allows the use of multiple modules for every phase, and
data computed during execution can be used as input for subsequent steps.
This is needed, for example, to fetch pages that are connected by hyperlinks. Not all wrapper
frameworks actually feature the notion of control flow; those that do not are either restricted to the
static phase sequence from section 3.1 or rely on some other program to control the execution
flow.
The wrapper definition holds all metadata needed during module execution, such as which
site to fetch, the extraction rules and the control flow description. Definition and framework
together form a single wrapper for one source.
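The interplay of definition, orchestrator, modules and cache can be sketched as follows. This Python sketch is an illustration under stated assumptions, not the design of any particular framework: the class Wrapper, the three stub modules and the keys of the definition dictionary are hypothetical, and the control flow is hard-coded to the static phase sequence of section 3.1.

```python
class Wrapper:
    """Orchestrator that runs the fetch, extract and output modules in order."""

    def __init__(self, definition, fetch, extract, output):
        self.definition = definition      # metadata: site, rules, ...
        self.fetch = fetch
        self.extract = extract
        self.output = output
        self.cache = {}                   # internal storage facility

    def run(self):
        url = self.definition["url"]
        if url not in self.cache:         # fetch only what is not cached yet
            self.cache[url] = self.fetch(url)
        data = self.extract(self.cache[url], self.definition["rules"])
        return self.output(data)


# Stub modules standing in for real fetch, extract and output components.
def fake_fetch(url):
    fake_fetch.calls += 1
    return "headline: Alpha\nad: buy now\nheadline: Beta"


fake_fetch.calls = 0


def prefix_extract(source, rules):
    prefix = rules["prefix"] + ": "
    return [line[len(prefix):] for line in source.splitlines()
            if line.startswith(prefix)]


def join_output(items):
    return "; ".join(items)


wrapper = Wrapper(
    {"url": "http://example.com/news", "rules": {"prefix": "headline"}},
    fake_fetch, prefix_extract, join_output,
)
result = wrapper.run()
print(result)
```

Adapting the wrapper to a changed source then means exchanging one module or one entry of the definition, while the orchestrator stays untouched.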
Fig. 3.3: Reference architecture for wrappers. A higher level application calls the wrapper and receives its results; inside the wrapper, the orchestrator coordinates the fetch module, the extractor, the output formatter and the cache; the fetch module accesses the source systems (site1, site2, ..., siten).
4 The XWeb Framework
Unlike traditional wrappers that fetch and transform the data when it is needed by the information system, XWeb acts more like a loading tool in a data warehouse environment. As such
it does not offer a special API for accessing the wrapped information directly from another
program but rather generates an XML representation. As XML allows flexible access to hierarchical data through the DOM and SAX standards and nearly every product offers the ability to import XML files, no new API is needed; data access can be based on well-known
standards.
To allow for retrieval of complex information that is spread across many hyperlinked pages
while not requiring the wrapper author to use a programming language, it is necessary to describe the whole process of "how to obtain the information from the site" in an abstract way.
For example, in the news site scenario depicted in Figure 3.2, it would be desirable to construct a record containing all the data items from the start page together with the complete article text that the headline link points to. This can be achieved by following the link, transforming the target page (which of course needs its own wrapper) and
combining the article’s text with its headline and other information from the start page. This
means that within the extraction process a number of child fetch processes have to be initiated,
indicated by the gray arrow in figure 3.3.
Most conventional tools do not allow this process to be formulated directly in the wrapper
itself and rather rely on superordinate application logic to perform this "information join".
For our purpose it seemed more adequate to describe the whole wrapping process to the
system, thereby enabling it to select the best execution path.
4.1 Overview
The XWeb concept puts XML and associated standard technologies like XPath or XSLT to
use in its knowledge extraction capabilities, acting as a descriptive framework for “what is to
be done”. Web sites, usually coded in HTML, are first converted into XML and then subjected to a multi-stage extraction process using existing implementations of the previously
mentioned standards. No assumptions are made about which implementations are used, nor
is the approach limited to existing and already defined standards.
This approach offers many improvements over legacy methods, most of which derive from
the use of open, well-known standards. The main benefits are:
• Multi-stage process description using XML files for persistency:
The extraction process in XWeb is not a single step, but comprises many consecutive steps. Every step can be formulated in any language deemed appropriate for getting to the information contained within the web page. The whole execution process
(the sequence in which modules have to be executed) is formulated through an XML
description, making platform independence possible.
• Usage of XML standards for information extraction:
XML standards like XPath or XSLT relieve IT staff of learning yet another
problem-specific language. The learning curve is flattened, leading to quicker results
and lower costs.
• Modular architecture:
The proposed modular architecture of the framework allows for extensibility and best-of-breed selection of tools. For example, it would be possible to use an XSLT script to extract data from one web site and XML-QL for another, or even a future standard yet
to be invented.
• Performance through inherent parallelism:
XWeb was designed for concurrent download and information extraction, appropriate
for a scenario in which a large number of pages have to be downloaded and converted. As
long I/O wait times are associated with internet data transfers, substantial speed-ups can
be expected from using otherwise wasted CPU cycles for processing already downloaded files.
4.2 XWeb at Work
As stated earlier, the whole extraction process is formulated through an XML document. We
will now explain the basic structure of this process definition and what happens when it is
executed.
The XWeb Task
A task describes how information has to be obtained from a source system. Tasks consist of
an ordered sequence of steps describing the control flow between the modules from our reference architecture.
A step is a single execution of a module using data produced by preceding steps as input. The
term “step” was deliberately chosen to leave room for interpretation, as a step can be anything from a simple HTTP web page download or an XSLT transformation to an SQL query
execution. Tasks are also steps and can be nested to become sub-tasks, which means that the
steps in the sub-task are executed for every instance (see below) of the enclosing task.
Steps can read and write data bound to an identifier during execution. Within a task, all steps
share the same namespace, but the actual value is different for every instance of a task. As
XWeb is modular and extensible, a step is allowed to support multiple different data formats.
There is no limitation on the type of data a step creates during its execution other than that
the following step must be able to use it and vice versa.
The XWeb framework automatically selects the best fitting format to be used by a step and
inserts bridge steps into its execution plan to convert between incompatible data types. As an
example, consider the DOM generated by the JTidy HTML parser used in XWeb. Although
DOM is a potentially vendor neutral standard, the Xalan XSLT processor cannot use it directly, as it relies on special features of its own parser implementation. When these two steps
are combined within a task, a simple "DOM copy" bridge step is automatically added by the
framework.
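The automatic insertion of bridge steps can be sketched as a simple planning pass over the step sequence. The following Python sketch is not XWeb's implementation: the Step class, the BRIDGES registry and the format names "tidy-dom" and "xalan-dom" are hypothetical stand-ins for the JTidy and Xalan example above, and the step bodies are trivial placeholders.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Step:
    name: str
    consumes: str    # data format expected as input
    produces: str    # data format delivered as output
    run: Callable


# Hypothetical registry of converters, keyed by (from_format, to_format).
BRIDGES = {
    ("tidy-dom", "xalan-dom"): Step("DOM copy", "tidy-dom", "xalan-dom",
                                    lambda dom: dict(dom)),
}


def build_plan(steps):
    """Insert a bridge step wherever two neighbouring formats do not match."""
    plan = []
    for step in steps:
        if plan and plan[-1].produces != step.consumes:
            key = (plan[-1].produces, step.consumes)
            if key not in BRIDGES:
                raise TypeError(f"no bridge from {key[0]} to {key[1]}")
            plan.append(BRIDGES[key])
        plan.append(step)
    return plan


parse = Step("parse HTML", "html-text", "tidy-dom", lambda text: {"doc": text})
transform = Step("XSLT", "xalan-dom", "xml-text",
                 lambda dom: f"<r>{dom['doc']}</r>")

plan = build_plan([parse, transform])
print([step.name for step in plan])
```

The wrapper author only lists the two steps; the bridge appears in the execution plan, not in the task definition.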
When a task is to be executed, it is instantiated with initial data coming from a
start step. For every instance of a task, every step is applied in order, one after the other. The
last step in a task can be a specially designated end step that can export data to the enclosing
task. A task ends when all instances have been processed.
Figure 4.1 shows an example of a simple task scheme that fetches a page, does some extracting and writes the result to a file in XML format.
Fig. 4.1: Example of an XWeb task that uses the GET method to retrieve a webpage, extracts
information and writes the result to a file (Start, Fetch (HTTP get), Extract (XSLT), Extract (other tool), Output (File write), End).
One can think of a task as an assembly line at Ford, with the steps as workers and the data
being passed between them as car parts. When the first part (possibly the URL of the source
site) enters the assembly line, it is raw, but we know it leaves the line as a beautiful state-of-the-art Model T (the extracted data).
Task Execution
Prior to execution, an execution plan is generated from the abstract task definition. This plan
contains executable steps that are configured to use certain data formats for input and output,
chosen by the framework to be compatible with the following step.
When a task is run, its start step creates a series of initial data elements. Such an element can be a URL, a
string or other data. For each of these initial data elements, a task instance is created. An instance carries all data needed during processing, such as the next step to execute and the data
created by previously executed steps. When an instance has been processed by its task’s end step,
it is considered finished and can be disposed of. When no more instances can be created by the
start step and all task instances have been processed, the whole task execution has finished.
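The relation between start step, steps and instances can be sketched as follows. In this Python sketch a task instance is assumed to be a plain dictionary that maps identifiers to the data produced so far; the function run_task, the identifiers and the two placeholder steps are hypothetical, and sub-tasks and end steps are left out.

```python
def run_task(start, steps):
    """Apply every step, in order, to each instance created by the start step."""
    finished = []
    for initial in start():
        instance = {"input": initial}        # per-instance data namespace
        for identifier, step in steps:
            # Each step reads earlier results and binds its own result.
            instance[identifier] = step(instance)
        finished.append(instance)
    return finished


def start():
    # The start step yields the initial data elements, here two placeholder URLs.
    yield from ["http://example.com/a", "http://example.com/b"]


steps = [
    # Stands in for a fetch step; no network access takes place.
    ("page", lambda inst: f"<html>{inst['input']}</html>"),
    # Stands in for an extraction step working on the fetched page.
    ("length", lambda inst: len(inst["page"])),
]

instances = run_task(start, steps)
for instance in instances:
    print(instance["input"], instance["length"])
```

All instances share the same identifiers ("input", "page", "length"), but each one holds its own values.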
Execution of steps may fail at arbitrary points. Changing URLs, wrapper-breaking layout
changes or internal errors such as memory overflow are possible error sources. XWeb can recover from some of these errors due to its pipeline architecture.
Although the steps for a given instance are processed sequentially, one after the other,
and thus are not subject to multithreading (MT), different instances can be executed in parallel. As instances share no data that has to be written concurrently, large speed-ups can be
gained through MT, and idle times caused by I/O activity with exceptionally high delays
(such as internet data transfers) can be avoided.
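Parallel execution of independent instances, with a failure confined to the instance in which it occurs, might be sketched like this in Python. The functions process_instance and run_parallel are hypothetical, the simulated failure is triggered by a made-up URL, and recording the error per instance is only one possible reading of how a pipeline could recover.

```python
from concurrent.futures import ThreadPoolExecutor


def process_instance(url):
    """Runs all steps of one instance sequentially (placeholder logic)."""
    if "broken" in url:
        raise RuntimeError(f"cannot fetch {url}")
    return url.upper()


def run_parallel(urls, workers=4):
    """Process instances concurrently; a failed instance does not stop the rest."""
    done, failed = {}, {}
    with ThreadPoolExecutor(max_workers=workers) as pool:
        futures = {url: pool.submit(process_instance, url) for url in urls}
        for url, future in futures.items():
            try:
                done[url] = future.result()
            except RuntimeError as error:
                failed[url] = str(error)
    return done, failed


done, failed = run_parallel([
    "http://example.com/a",
    "http://example.com/broken",
    "http://example.com/b",
])
print(sorted(done), sorted(failed))
```

Threads suit this workload because most of the time per instance is spent waiting for I/O rather than computing.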
Fig. 4.2: A possible instance view of the previous task. Note that some instances produced
failures at different stages (in the figure, one instance fails in the XSLT extraction step, another in the second extraction step, and the remaining instances reach the output step).
4.3 Implementational Remarks
The current prototypical implementation comes with a small set of modules for every phase.
More modules will be added when the need for them arises or when new standards or better implementations become available. As mentioned before, one major goal of XWeb is to use common standards extensively. Implementation should therefore not be understood in the classical sense; it is
more a matter of "gluing" together standard components. Besides rather rapid development, this solution allows single modules to be exchanged without having to redesign the whole wrapper application. Furthermore, it does not seem sensible to reinvent the wheel, e.g. by developing another DOM
implementation. XML seems to have a natural affinity to Java, so we used Java as the gluing
language; its network capabilities are also very useful.
In the fetch phase XWeb supports standard HTTP transfers using get and post methods.
These modules are sufficient to retrieve pages from the majority of freely available sites. Support for authentication by username and password or cookies can easily be added if necessary.
For the extraction phase XWeb comes with an XSLT [W3C99b] module that uses the free
Xalan processor from the Apache project. Xalan is also used for evaluating simple XPath
[W3C99a] expressions. We plan to include other standards like XML-QL [DFF+98] as well
in our portfolio in the future.
At present the wrapper author has to create an XSLT stylesheet with any tool he desires.
There are currently no tools from our project to support the creation of such stylesheets directly, but standard XML tools, especially the "Visual XML Query" from IBM, have proven to
be of great help. A stylesheet normally consists of an XSLT template to match the Area of
interest. The template’s body contains commands to transfer the Data items into the result
document.
As XSLT combines the power of XPath search expressions with strong transformation functions, the user has total control over the result creation. The extraction process is nevertheless
quite fast and is expected to get even faster in the future through improvements made to Xalan and XSLT.
In the output phase XWeb currently supports a direct file system write of the extracted document and transfer into the higher level application system using SOAP. Here again further
modules could be added in the future.
Fig. 5.1: Sample page from news.com
5 An Example: XWeb wraps news.com
We will now illustrate how to wrap a sample news page using our XWeb framework. The
script will extract the top news headlines from the opening page of C|net’s news.com site
(http://news.cnet.com) that is depicted in Figure 5.1. This site has a rather complex markup
structure, so it will serve well to demonstrate the power of XPath and XSLT for information
extraction. A snippet of the source code showing one of the records is depicted in Figure 5.2.
The start page shows the headlines of recent news, a short abstract, the date and the category
this news belongs to. If the article was updated, the abstract is marked with a red label "update". The headline itself is actually a link to a different page containing the whole article
text. These items are nicely grouped together in an easy to perceive block structure. Judging
from the visual appearance it should be quite easy to extract the information we want. Yet the
HTML code in this particular page is highly structured and complex in order to allow the nice
layout.
...
<font size='+1'><b>
<a href="..." >Stock tickers, sports scores and Yeats?</a>
</b></font><br>
<font size="-1">Just as the early days of the Web abounded with sites full of culture, artists with a
penchant for cutting-edge technology are now going mobile with their messages.
<br>January 20, 6:00 a.m. PT <b>in</b>
<a href="...">Communications</a>
</font>
...
Fig. 5.2: HTML source code snippet from news.cnet
The records are contained within two tables. Between these tables lie the rows with pictures
that also lead to news articles, yet they have a different structure, as they lack some of the information
of the other records (most notably the abstract) and are somewhat resistant to extraction
because the news headline is incorporated into the image data. We thus concentrate on the textual
headlines.
5.1 Converting HTML to XML
HTML sources cannot be parsed directly by XML tools, because HTML does not enforce
well-formedness as XML does. For example, HTML allows the end tags of many
elements (like <p> or <li>) to be omitted and does not require empty elements to carry the ending slash in
their tags (HTML allows <br>, XML syntax would require <br />). In addition, tags of certain elements (like <html> or <body>) can be omitted completely [W3C99c], while the
HTML parser has to treat the document as if they were present.
The good news is that XML files can easily be created from an HTML source by using a parser that creates a correct tree
representation (including elements actually missing in the source document) from the source. This tree can then be used to create an XML DOM tree that
can be written out as a well-formed XML document which other tools can read directly. For our prototype, we used JTidy [Quic00], a Java implementation of Dave Raggett’s Tidy [Ragg99] program for cleaning HTML code, for parsing.
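The idea of repairing HTML while building a tree can be sketched with Python's standard html.parser and xml.etree.ElementTree modules. This is a deliberately small illustration and not a replacement for Tidy or JTidy: the class TreeFixer and the two tag sets are assumptions of this sketch, only a few void elements and the implied end tags of <p> and <li> are handled, and an omitted <body> is not synthesized.

```python
from html.parser import HTMLParser
import xml.etree.ElementTree as ET

# Elements that never have content or an end tag (subset, for illustration).
VOID = {"br", "hr", "img", "input", "meta", "link"}
# Elements whose end tag may be omitted; a new one closes the open one.
SELF_CLOSING_ON_REPEAT = {"p", "li"}


class TreeFixer(HTMLParser):
    """Builds a well-formed element tree from sloppy HTML (minimal sketch)."""

    def __init__(self):
        super().__init__()
        self.root = ET.Element("html")
        self.stack = [self.root]

    def handle_starttag(self, tag, attrs):
        if tag == "html":
            return                                   # root already exists
        if tag in SELF_CLOSING_ON_REPEAT and self.stack[-1].tag == tag:
            self.stack.pop()                         # implied end tag
        attributes = {name: value or "" for name, value in attrs}
        node = ET.SubElement(self.stack[-1], tag, attributes)
        if tag not in VOID:
            self.stack.append(node)

    def handle_endtag(self, tag):
        # Close up to the nearest matching open element; ignore stray end tags.
        for index in range(len(self.stack) - 1, 0, -1):
            if self.stack[index].tag == tag:
                del self.stack[index:]
                break

    def handle_data(self, data):
        node = self.stack[-1]
        if len(node):                                # text after a child element
            node[-1].tail = (node[-1].tail or "") + data
        else:
            node.text = (node.text or "") + data


def html_to_xml(html):
    fixer = TreeFixer()
    fixer.feed(html)
    fixer.close()
    return ET.tostring(fixer.root, encoding="unicode")


print(html_to_xml("<p>one<br>two<p>three"))
```

The output of html_to_xml can be parsed again by any XML tool, which is the property the subsequent extraction steps rely on.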
5.2 Generating Extraction Rules
Now that we have a standard XML file, we can use arbitrary XML tools to create XPath expressions and XSLT stylesheets. While it would be possible to use a simple viewing tool like
XML TreeView [IBM00] for displaying the tree structure and generating the respective search
expressions, it is far more convenient to employ a tool that allows for expression execution and refinement.
We used IBM’s Visual XML Query, which is
available from the alphaworks web site, for interactively creating XPath expressions. The expression generation for the news.cnet site is
captured in Figure 5.3. Visual Query displays a tree view (1) of the source document. Expressions can be entered directly into window (3) or constructed by the wizard functions accessible through the buttons under the expression window. The expression can be evaluated directly within the tool; matching results are listed in a separate window (4).
Fig. 5.3: Interactive XPath Expression Generation using Visual XML Query.
(1: Tree View, 2: Search Bar, 3: Expression, 4: Result Window)
The first thing to do when building a wrapper is to locate the various semantic elements
inside the source document. The search function is a great help here, as it allows the
whole document to be searched for a matching string. This way Data items can be located, and starting
from here the nesting structure of the Record can be analysed. Once the Record structure has
been understood, it is easy to come up with a match pattern for the Area.
The next step is then to come up with XPath expressions that match the different semantic hierarchies in reverse order, starting with the Area. The news.com example is quite a
challenge in this respect, as the interesting area spreads across two tables. Fortunately, XPath’s
powerful expression language makes it easy to cope with this situation by using the path
/html/body/table[position()=8 or position()=9]
meaning "match the 8th and 9th table directly below the body element of the document".
The pictures also leading to news articles present the next hurdle. From the HTML source
view, links to articles with an abstract following are very similar to those labelled only with
pictures, the only difference being that links with images have no text. Thus we can use the
following rule to match the Records relative to our identified Area:
tr[1]/td[1]/p/font [font[1]//a/text()]
This rule effectively matches only those <font> elements that contain a hypertext reference
labelled with arbitrary text.
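The effect of the Area and Record rules can be reproduced on a small synthetic page. Python's standard xml.etree.ElementTree supports only a subset of XPath (no "or", no text() test in predicates), so the following sketch emulates the two predicates in Python code instead of evaluating the expressions themselves. The functions select_area, select_records and make_table as well as the test page are made up for illustration.

```python
import xml.etree.ElementTree as ET


def has_text_node(element):
    """True if the element has at least one text node as a direct child."""
    return bool(element.text) or any(child.tail for child in element)


def select_area(root, positions=(8, 9)):
    """Emulates /html/body/table[position()=8 or position()=9]."""
    tables = root.findall("./body/table")
    return [t for index, t in enumerate(tables, start=1) if index in positions]


def select_records(table):
    """Emulates tr[1]/td[1]/p/font[font[1]//a/text()] relative to a table."""
    records = []
    for font in table.findall("./tr[1]/td[1]/p/font"):
        inner = font.find("./font[1]")
        if inner is not None and any(has_text_node(a) for a in inner.iter("a")):
            records.append(font)
    return records


def make_table(link_markup):
    # Synthetic record skeleton; the nesting loosely follows the rules above.
    return ("<table><tr><td><p><font><font><b>" + link_markup +
            "</b></font></font></p></td></tr></table>")


page = ET.fromstring(
    "<html><body>"
    + "<table/>" * 7                                     # tables 1 to 7
    + make_table("<a href='/t'>Text headline</a>")       # table 8: text link
    + make_table("<a href='/i'><img src='x.gif'/></a>")  # table 9: image link
    + "</body></html>"
)

hits = [rec for table in select_area(page) for rec in select_records(table)]
print([rec.find(".//a").get("href") for rec in hits])
```

Only the record whose link carries text survives; the picture link in the ninth table is skipped because its <a> element has no text node.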
XPath expressions for Data items can be deduced in similar ways, but are omitted here for
brevity. Figure 5.3 actually shows the combined Area and Record matching expressions in
action, with the result shown in the appropriate window. While rule creation is quite easy using Visual XML Query, it is not an ideal tool as it does not offer direct support for writing
wrappers or a notion of our semantic hierarchy. Yet the possibility to execute expressions is
very helpful in the authoring process.
<?xml version = "1.0"?>
<xsl:transform version = "1.0" xmlns:xsl = "http://www.w3.org/1999/XSL/Transform">

  <!-- Override the built-in rule that would copy all text into the result -->
  <xsl:template match='text()' />

  <xsl:template match = "/html/body">
    <TOP-NEWS>
      <!-- Area and Record expressions combined: one NEWS element per record -->
      <xsl:for-each select="table[position()=8 or position()=9]/tr[1]/td[1]/p/font [font[1]//a/text()]" >
        <xsl:variable name="aElement" select="font[1]/b/a"/>
        <xsl:variable name = "textBody" select = "font[last()]"/>

        <NEWS isUpdate="{count(font) = 3}" url="{$aElement/@href}"
              time="{normalize-space($textBody/br/following-sibling::text()[1])}">
          <TITLE><xsl:copy-of select="$aElement/text()" /></TITLE>
          <ABSTRACT><xsl:copy-of select="$textBody/br/preceding-sibling::text()" /></ABSTRACT>
        </NEWS>
      </xsl:for-each>
    </TOP-NEWS>
  </xsl:template>
</xsl:transform>
Fig. 5.4: XSLT Transformation Script for news.com
5.3 Producing the Output
Once the expressions have been finalized, it is necessary to create an XSLT stylesheet that actually transforms the source document into our desired result document. Our
transformation differs from normal stylesheets only in that it contains a single
active template whose match pattern is matched by exactly one node set, namely
the Area in our hierarchy. As no other elements can match the template, extraction is
achieved this way.
The wrapper author can freely determine the structure of the result being produced. He
is not bound by any constraints other than that the target application has to be able to make
use of the created XML files. Figure 5.4 shows a sample script an author might write. Note
that the XSLT built-in template that matches all text nodes has to be overridden, as it
normally copies all text from the source into the result document, which is unwanted for our purpose.
While requiring the author to write a file that actually specifies how to extract the data and
produce the output burdens the wrapper author with a bit more work, it also widens the application field of XWeb, as it does not restrict the user to a certain structure. The result produced is always an XML file, complete with elements to describe the contained data, suitable
for processing in other applications. Figure 5.5 shows the result created by applying the
script from Figure 5.4 to an actual newsfeed from news.com.
<?xml version="1.0" encoding="UTF-8"?>
<TOP-NEWS>
<NEWS time="January 20, 6:00 a.m. PT"
url="/news/0-1004-201-4540042-0.html?tag=st.ne.1002.thed.sf"
isUpdate="false">
<TITLE>Stock tickers, sports scores and Yeats?</TITLE>
<ABSTRACT>Just as the early days of the Web abounded with sites full of culture, artists with a penchant for cutting-edge technology are now going mobile with their messages.</ABSTRACT>
</NEWS>
<NEWS time="January 19, 4:50 p.m. PT"
url="/news/0-1007-200-4540228.html?tag=st.ne.1002.thed.ni"
isUpdate="false">
<TITLE>Study: More Net merchants need anti-fraud technology</TITLE>
<ABSTRACT>Credit card and debit card fraud could cost online merchants billions of dollars during the next five years
unless they implement the technology to detect it, new research says.</ABSTRACT>
</NEWS>
<NEWS time="January 19, 3:35 p.m. PT"
url="/news/0-1003-200-4539749.html?tag=st.ne.1002.thed.ni"
isUpdate="false">
<TITLE>Stopping light could lead to quantum advance in computing</TITLE>
<ABSTRACT>Two teams of scientists have accomplished the seemingly impossible feat of trapping and stopping light-an achievement that could lead to major advances in quantum computing.</ABSTRACT>
</NEWS>
...
</TOP-NEWS>
Fig. 5.5: Resulting XML from Wrapping news.com
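As an illustration of how another application might consume such a result file, the following minimal Python sketch parses a shortened copy of the XML above with the standard library. It is not part of XWeb; the function name list_headlines and the dictionary keys are chosen for this example only, and the abstracts are omitted for brevity.

```python
import xml.etree.ElementTree as ET

# Shortened copy of the wrapper output from Fig. 5.5 (abstracts omitted).
RESULT_XML = """<TOP-NEWS>
  <NEWS time="January 20, 6:00 a.m. PT"
        url="/news/0-1004-201-4540042-0.html?tag=st.ne.1002.thed.sf"
        isUpdate="false">
    <TITLE>Stock tickers, sports scores and Yeats?</TITLE>
  </NEWS>
  <NEWS time="January 19, 4:50 p.m. PT"
        url="/news/0-1007-200-4540228.html?tag=st.ne.1002.thed.ni"
        isUpdate="false">
    <TITLE>Study: More Net merchants need anti-fraud technology</TITLE>
  </NEWS>
</TOP-NEWS>"""


def list_headlines(xml_text):
    """Return one dictionary per NEWS element (hypothetical helper)."""
    root = ET.fromstring(xml_text)
    return [
        {
            "time": news.get("time"),
            "url": news.get("url"),
            "is_update": news.get("isUpdate") == "true",
            "title": news.findtext("TITLE"),
        }
        for news in root.findall("NEWS")
    ]


headlines = list_headlines(RESULT_XML)
for item in headlines:
    print(item["time"], "-", item["title"])
```

Because the element and attribute names describe the data, the consumer needs no knowledge of the HTML layout the wrapper started from.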
6 Conclusion
In this paper we have summarized the general concepts behind wrapping Internet sources.
The reasons why wrappers are necessary are manifold: the Internet is still growing so fast that it is impossible to keep track of the changes on new and existing web sites. Furthermore, it is desirable to exploit the information offered by these sites
in a larger context by feeding it into an information system such as a data warehouse, where it can be processed or compared. The general problem of wrapping is that an automaton has no notion of the semantic structure of a document, whereas a human reader perceives it rather easily.
Wrapping takes place in three steps. First, the raw data has to be fetched from the desired
web site. Second, the interesting information has to be extracted, which means that parts of the
HTML code have to be transformed into a more semantic structure; the extraction rules have
to be defined by the user. Finally, the extracted information has to be converted to a desired
output format, which again has to be specified by the user. From this process model it follows
immediately that wrapping has to be a semi-automatic task.
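The three steps can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the XWeb implementation (which relies on DOM, XPath, and XSLT): the fetch step is replaced by a fixed HTML string so that the sketch runs without network access, the extraction rule "every link with class headline" is invented for the example, and all function and class names are hypothetical.

```python
from html.parser import HTMLParser
import xml.etree.ElementTree as ET


def fetch(url):
    """Step 1 (fetch): stand-in that returns a fixed page instead of downloading."""
    return ('<html><body><h1>Top News</h1>'
            '<a class="headline" href="/news/1.html">First story</a>'
            '<p>Advertisement</p>'
            '<a class="headline" href="/news/2.html">Second story</a>'
            '</body></html>')


class HeadlineExtractor(HTMLParser):
    """Step 2 (extract): user-defined rule 'every <a class="headline">'."""

    def __init__(self):
        super().__init__()
        self.items = []
        self._current = None

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "a" and attrs.get("class") == "headline":
            self._current = {"url": attrs.get("href", ""), "title": ""}

    def handle_data(self, data):
        if self._current is not None:
            self._current["title"] += data

    def handle_endtag(self, tag):
        if tag == "a" and self._current is not None:
            self.items.append(self._current)
            self._current = None


def convert(items):
    """Step 3 (convert): emit the user-chosen output format, here XML."""
    root = ET.Element("TOP-NEWS")
    for item in items:
        news = ET.SubElement(root, "NEWS", url=item["url"])
        ET.SubElement(news, "TITLE").text = item["title"]
    return ET.tostring(root, encoding="unicode")


def wrap(url):
    extractor = HeadlineExtractor()
    extractor.feed(fetch(url))
    extractor.close()
    return convert(extractor.items)


result = wrap("http://example.org/news")
print(result)
```

Both the extraction rule and the output structure are supplied by a person, which is exactly why the process is semi-automatic: only the fetching and the mechanical application of the rules are automated.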
XWeb is our implementation framework for wrappers following the process described
above. It relies completely on open standards, especially XML and XSL, because robust
implementations of these standards exist and because XML is extremely well suited for
storing and processing semi-structured data as found in web sources. The fetched document can thus be held in a DOM structure, and queries on this DOM can be performed using either
XSL/XPath expressions or future XML query languages like XML-QL or XQL. The result
of this filtering process finally has to be transformed to the desired output format, once more
using XSL transformations. In that way XWeb is a component-based wrapping system that glues
third-party components together. This approach allows single components, such as
the XSL processor, to be exchanged if a better implementation becomes available. Furthermore, XML is well on its
way to becoming a widely accepted standard, so nearly every application program offers
XML import and/or export, and users can configure our wrapper from the environment they are used to. Still, an integrated user interface is one of our next tasks.
A benefit of our task-oriented approach, not mentioned so far, is that pages containing links to child pages,
which is very common in HTML, can also be retrieved.
In principle an arbitrarily deep recursion is possible, but it is far from feasible in practice.
Possible directions for future research include the use of machine learning and artificial intelligence methods to create wrappers more automatically or at least with less effort.
This approach would also fit nicely into our architecture, as it only involves changes to the
rule creation phase; for example, the extraction could still use XSLT stylesheets, but these
could be created by a machine.
PLATFORM INVARIANT DATABASE DESIGN FOR
INFORMATION APPLIANCES
Ulrich Grießer*, Wolfgang Hümmer, Wolfgang Lehner
EMail: {uhgriess, huemmer, lehner}@immd6.informatik.uni-erlangen.de
Abstract
The idea of smaller, lighter, and handier devices compared to huge desktop personal computers or the still big and heavy notebooks already came up at the end of the
1980s and the beginning of the 1990s. Many applications are being ported to those platforms. This
paper describes the experiences in developing a full-fledged database system and porting it to the Palm, EPOC, and Windows CE platforms. In the first part of this paper, those operating systems are explained from a developer's point of view. This discussion
exposes many differing properties which should be considered in the development phase
of applications running on all of those devices. As an example of platform-invariant
development, this paper details the resulting challenges and some solutions in the context of a platform-invariant design of the IBM DB2 Everyplace 7.1 database system.
* Work was performed while author was visiting scientist at the IBM Silicon Valley Lab, San Jose
(CA).
1 Introduction
The idea of a smaller, lighter, and handier device compared to the huge desktop personal
computers or the still big and heavy notebooks already came up at the end of the 1980s and the
beginning of the 1990s. One of the first products available on the market was Apple's Newton MessagePad in 1993, but new inventions in the majority of cases exhibit a lot of difficulties in
their early state. Apple's Newton, for example, never reached significant sales
because the market was not yet ready for this device and the handwriting recognition
software was not good enough.
Only some years later, Palm Computing picked up the idea of handwriting recognition
software again, simplified it, and created a fast and accurate way to input data in a natural handwriting style, called Graffiti ([RhMc99]). After the success of the first handheld devices in the beginning of the 1990s, the market for these small and handy devices increased dramatically. A market share analysis by IDC expects handheld companion shipments to increase by an average of 44.9% each year during the next four years. Not only will the market share
of handheld devices increase in the next years; the market for smart phones is also expected
to reach around 70% of the handheld companion market ([IDC00]).
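To make the quoted growth rate concrete, the following short calculation compounds it over the forecast period. It assumes that the 44.9% average applies as a constant annual rate, which is a simplification of the IDC forecast; the function name is chosen for this example only.

```python
ANNUAL_GROWTH = 0.449  # average yearly shipment increase quoted from [IDC00]
YEARS = 4              # forecast horizon quoted in the text


def shipment_factor(annual_growth, years):
    """Factor by which yearly shipments grow under a constant annual rate."""
    return (1.0 + annual_growth) ** years


factor = shipment_factor(ANNUAL_GROWTH, YEARS)
print(f"Shipments after {YEARS} years: {factor:.2f} times today's volume")
```

Under this assumption yearly shipments would grow to roughly 4.4 times the current volume within four years.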
The market growth and the increasingly widespread usage of these devices in many
different fields, such as health care, sales automation, insurance, and education, will make
it more and more important to access vital company data. Simple read access is most of
the time not enough. A sales representative would like to download the current sales prices
onto his information appliance. Out in the field, he will then create many new orders
directly on his device; back in his office, he can synchronize the created orders with the
back-end database management system (DBMS). The complete data transfer has to be consistent and secure.
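The scenario can be sketched with two small databases, one standing in for the device and one for the back end. The sketch below uses Python's sqlite3 module purely for illustration; it is not the synchronization mechanism of DB2 Everyplace, the table layout and function names are assumptions, and the security aspect (authentication, encryption) is left out entirely.

```python
import sqlite3


def open_db():
    """Create an in-memory database with the (assumed) schema of the example."""
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE prices (item TEXT PRIMARY KEY, price REAL)")
    db.execute("CREATE TABLE orders (id TEXT PRIMARY KEY, item TEXT, "
               "qty INTEGER, synced INTEGER DEFAULT 0)")
    return db


def download_prices(backend, device):
    """Replace the price list on the device with the current back-end prices."""
    rows = backend.execute("SELECT item, price FROM prices").fetchall()
    with device:  # one transaction: old list out, new list in
        device.execute("DELETE FROM prices")
        device.executemany("INSERT INTO prices VALUES (?, ?)", rows)


def create_order(device, order_id, item, qty):
    """Record an order locally while the device is offline."""
    with device:
        device.execute("INSERT INTO orders (id, item, qty) VALUES (?, ?, ?)",
                       (order_id, item, qty))


def synchronize(device, backend):
    """Transfer all unsynchronized orders; returns the number transferred."""
    pending = device.execute(
        "SELECT id, item, qty FROM orders WHERE synced = 0").fetchall()
    with backend:  # all pending orders arrive, or none of them
        backend.executemany(
            "INSERT INTO orders (id, item, qty, synced) VALUES (?, ?, ?, 1)",
            pending)
    with device:   # mark as transferred only after the back end committed
        device.executemany("UPDATE orders SET synced = 1 WHERE id = ?",
                           [(row[0],) for row in pending])
    return len(pending)


backend, device = open_db(), open_db()
with backend:
    backend.execute("INSERT INTO prices VALUES ('widget', 9.99)")
download_prices(backend, device)
create_order(device, "rep7-001", "widget", 3)
create_order(device, "rep7-002", "widget", 5)
transferred = synchronize(device, backend)
print("orders transferred:", transferred)
```

Running the back-end insert inside a single transaction is what keeps the transfer consistent in this sketch: if the connection broke halfway, no partial set of orders would become visible on the back end, and the device would retry all of them later.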
Using a database management system as back end suggests using a DBMS on the handheld device as well. However, simply porting a back-end database system to the device is not feasible because of the application size and the limitations of the devices.
The following sections focus on the design of multi-platform applications for information
appliances and take these limitations into account. A broad overview of different handheld devices
is followed by an overview of the most commonly used operating systems for PDAs. The major operating systems are described in detail, their strengths and weaknesses
discussed, and their differences exposed. Thereafter, the paper addresses the platform-invariant design of handheld applications using a database system as an example. To this end,
IBM's database management system DB2 Everyplace, its design, the problems of porting it to the
EPOC operating system, and their solutions are discussed. The paper closes with a
summary in the last section.
2 Overview of Handheld Device Operating Systems
After several years of invisibility, handheld devices have now developed a real market share.
During the last years a huge variety of new devices, and with them new operating systems,
have tried to gain a part of this growing market. Not all of them were or will be successful.
Good operating systems and devices will not necessarily survive in this hard-fought market.
Only those with the right technical and marketing strategy will stay and may expand their
position in the handheld device market and in the new smart phone market. This contribution
does not discuss whether a device or an operating system has the right strategy to
survive. Instead, this chapter explains the three major operating systems ([IDC00]) in detail:
• Symbian Ltd. – EPOC Release 5
• Palm Inc. – Palm OS Version 3.x
• Microsoft Corp. – Windows CE Version 3.0, "PocketPC"
Table 2.1 lists only a small number of the myriad of available devices, but
nevertheless gives an overview of the big variety of handheld devices, cellular phones, and
operating systems, including newer or minor operating systems which do not yet have a significant market share.
The table shows the huge variety of personal digital assistant (PDA) operating systems and,
especially for Windows CE, an enormous number of different processor types with different clock speeds. Even if most of the devices provide a Compact Flash memory extension,
the internal memory is always limited, currently to at most 32 MB RAM and 24 MB Flash ROM.
Many devices on the market have a lot less memory, such as an average
Palm OS device with 4 MB RAM and 2 MB ROM. Apart from differences in memory size, processor, and processor speed, the most important distinction is the device type. Personal digital
assistants can be categorized into three major types:
• Palm-size PCs (P/PC):
These devices are characterized by their small screen size, 2–16 MB of memory, and handwriting recognition software for entering data with a pen.
• Handheld PCs (H/PC):
Compared to Palm-size PCs, Handheld PCs have a keyboard suitable for touch typing to enter
larger amounts of data, more memory, and a bigger screen. These devices are placed
in the market between Palm-size PCs and traditional notebooks.
• Smart phones (SP):
These are mobile phones with extended applications like mail functionality, an address
book, and others.
The distinction between these three device types is not very precise, so any kind of mixture between them is possible; for example, some so-called Communicators are handheld
devices with an integrated mobile phone.
Device | Processor Type | CPU Speed | RAM | ROM | Flash | Type
EPOC Release 5
Ericsson Mobile Companion MC218 | ARM710T RISC, 32-bit | 36 MHz | 16 MB | 12 MB | yes | SP
Ericsson R380 | n.a. | n.a. | 1.2 MB | n.a. | no | SP
Psion Series 7 | Intel StrongARM SA-1100 RISC, 32-bit | 133 MHz | 32 MB | n.a. | yes | H/PC
Psion revo | ARM710T RISC, 32-bit | 36 MHz | 8 MB | 8 MB | no | H/PC
Palm OS Version 3.x
Handspring Visor Deluxe | Motorola DragonBall MC68328 | 16 MHz | 8 MB | 2 MB | yes | P/PC
IBM WorkPad c3 PC Companion | Motorola DragonBall MC68EZ328 | 16 MHz | 2 MB | 2 MB | no | P/PC
PalmVII | Motorola DragonBall MC68EZ328 | 16 MHz | 2 MB | 2 MB | no | P/PC
TRGpro | Motorola DragonBall MC68EZ328 | 16 MHz | 8 MB | 2 MB | yes | P/PC
QualComm pdQ | Motorola DragonBall MC68EZ328 | 16 MHz | n.a. | n.a. | no | SP
Windows CE Version 3.0
Casio E-125 Cassiopeia | NEC VR4122 MIPS RISC, 64-bit | 150 MHz | 32 MB | 16 MB | yes | P/PC
Compaq iPAQ H3600 Series | RISC, 32-bit | 206 MHz | 32 MB | 16 MB | yes | P/PC
Hewlett Packard Jornada 548 | Hitachi SuperH SH-3 RISC, 32-bit | 133 MHz | 32 MB | 16 MB | no | P/PC
Symbol PPT 2700 Series | NEC VR4181 MIPS RISC, 64-bit | 131 MHz | 16 MB | 12 MB | no | P/PC
JBlend
BossaNova | RISC, 32-bit | 133 MHz | 16 MB | 8 MB | no | P/PC
Neutrino
µ ForCE | AMD ÉlanSC400, 32-bit | 100 MHz | 16 MB | 4 MB | yes | Demo
VTech OS
VTech Helio | RISC, 32-bit | 75 MHz | 8 MB | 2 MB | no | P/PC
Zaurus
Zaurus Icruise (MI-EX1) | Hitachi SuperH SH-3 RISC, 32-bit | 100 MHz | 8 MB | 16 MB | no | P/PC
Zaurus Igetty (MI-P2) | Hitachi SuperH SH-3 RISC, 32-bit | 100 MHz | 2 MB | 8 MB | yes | P/PC
Tab. 2.1: Overview of the variety of handheld devices and their operating systems
Handheld Device versus Desktop Personal Computer
Desktop personal computers and palm-size or handheld devices do not only differ in
screen size and processor computing power. There are also some less obvious differences, like the start-up and loading time of applications or the instant power on and off
compared to the boot phase of a desktop computer. Desktop PC users do not mind waiting
several seconds for an application to start because they want to use it for an extended amount of
time. In contrast, a handheld or palm-size device user often switches between applications and uses each application for a shorter period of time. This implies that applications must be easy
to use and must minimize the commands needed to navigate, select, and execute. The high frequency of application usage also makes it necessary that each application takes care of its memory consumption even after it has been closed, because there is no reboot phase
which could be used to clean up the memory. The memory available to an application during its execution is always limited. To save memory, all applications should be optimized
for minimal memory consumption. In the Palm OS Programmer's Companion, the authors
go even further and say: "Because of the limited space and power, optimization is critical.
To make your application as fast and efficient as possible, optimize for heap space first, speed
second, code size third." ([PPC99])
Desktop computers normally have a mouse and a keyboard to input data. Palm-size devices have a touch screen, and users enter characters and symbols with a pen. Applications for palm-size devices therefore have to be designed for minimized data input and fast and
easy use. The user interface also has to be designed for easy use, be efficient in its navigation, and have an unambiguous and straightforward screen layout. The screen size differs between
most of the devices.
The most interesting point for a database management system is that, compared to the desktop
PC, handheld or palm-size devices have only memory and no external or internal disk drives.
That means that all applications are stored and executed in memory. Only the operating system is persistently stored in a flash ROM. A database system on a PDA therefore does not have to
manage persistent storage of its information both in memory and on a hard drive. The only exception right now is IBM's ultra-small hard drive called Microdrive, which fits into a
Compact Flash card or PCMCIA socket ([PPC99]).
 | PC | PDA
Data input | keyboard + mouse | small keyboard or pen
Screen | 19" monitor | small color or black and white LCD
GUI | complex and complete | minimized navigation for efficient use
Applications | long program start up | instant on/off for frequent short time use
Device on/off | long reboot | instant power on/off
RAM memory | more than 64 MB | limited to 2-32 MB
Hard drive | multiple GB | none
Tab. 2.2: Major differences between desktop PCs and PDAs.
Table 2.2 outlines the major differences between desktop PCs and PDAs. These differences
determine the graphical user interface and the application design of software for handheld or
palm–size devices.
Technical Comparison of the Major Handheld Operating Systems
Before describing the major handheld operating systems in detail, table 2.3 gives an overview of handheld operating systems compared to desktop operating systems like Unix and Windows
NT from a technical perspective.
Operating
system
EPOC
Palm OS
Windows
CE
Neutrino
VT-OS
Unix
Windows
NT
Multitasking
ü
ü
ü
ü
ü
ü
ü
ü
Multithreading
ü
ü
Multiapplication
ü
û
ü
ü
ü
ü
ü
ü
ü
ü
ü
û
ü
Event driven
û
ü
ü
û
ü
ü
ü
û
û
Type
VFAT
Record,
database
FAT
CD-ROM,
FAT, NFS,
QNX, etc.
?
VFS, FFS,
EXT2, etc.
NTFS,
FAT
Name
EIKON
JBlend
Kernel
File system
?
User interface
GUI builder
Windows
AWT, Swing
Photon
microGUI
Helio GUI
X Windows,
etc.
Windows
?
û
?
ü
ü
û
ü
ü
Programming
language
C/C++,
Java
Assembler,
Basic,
C/C++,
Java
C/C++,
Java, Basic
Java
C
C
C/C++, Java,
etc.
C/C++,
Java, etc.
Development
platforms
Windows
Mac, Unix,
Windows
Windows
?
Windows,
Neutrino
Windows
Unix
Windows
Development
Tab. 2.3: Comparison of the major Handheld Operating Systems, Unix, and Windows NT
When comparing Palm OS or VT-OS with all the other operating systems, one may note that
the restriction of Palm OS to single, event-driven applications is an important characteristic. Therefore,
Palm OS will always play a special role in platform-invariant application development.
JBlend, Neutrino, and Windows CE are interesting because of their design, which is mostly
focused on portability to different hardware platforms, such as handheld devices or desktop
PCs.
3 EPOC Release 5
The EPOC operating system has been specifically designed for use on memory-limited
and battery-powered handheld or palm-size devices. To save battery power, EPOC integrates power management into the operating system; for example, if the battery power is low,
the user cannot turn on the backlight of the device.
In general EPOC is a multi-threaded, fully preemptive multitasking, multi-platform operating system. Besides that, EPOC is component based to facilitate design across multiple
platforms and resources, such as different screen sizes, colors, and resolutions, keyboard and
non-keyboard designs, and touch screen and non-touch screen designs. To accomplish this
cross-platform design and other features, the EPOC operating system follows an object-oriented
approach; nearly all components are written from the ground up in C++.
The most important goal of the EPOC operating system, however, is very high
reliability in all operating conditions, because the operating system and major applications
may run for several years without being closed or reset in any way [Symb99b].
The six major parts of the EPOC operating system, visualized in figure 3.1, are ([Symb99c], [TAD+00]):
• The base, which includes the kernel, file system,
device drivers, and the user library.
• The engine support, which provides fundamental
APIs for data management, text, clipboard, graphics, sound, alarms, locale-specific information, etc.
• The graphics module, which together
with the GUI provides an input/output interface for the user.
• The graphical user interface (GUI), which forms
the basis for all EPOC Release 5 application GUIs.
• The communications module, which provides any
kind of communication with the outside, like TCP/IP, serial ports, and dial-up.
• The PC connect module, which provides the connection to the desktop PC to synchronize and back up the data of the device.
[Figure: blocks labeled Development Support, GUI, PC Connect, Graphics, Communications, Engine Support, and Base]
Figure 3.1: Structure of the EPOC Operating System
The Base
A closer look at the base of the EPOC operating system shows that it can again be
split up into four smaller parts. The base consists of the E32 kernel, the E32 user library, device
drivers, and a file server; all of these parts are shown in figure 3.2.
[Figure: blocks labeled E32 User Library, File Server, E32 Kernel, and Device Drivers]
Figure 3.2: Constitution of the EPOC Base
The EPOC core is a 32-bit portable, multi-threaded, fully preemptive multitasking,
multi-platform micro kernel. That means that only a minimal portion of the whole system runs
with kernel privilege, while many system components actually run as servers on the user side
with user privilege. One of the kernel-side responsibilities is thread marshaling based upon
priorities and interrupts; consequently, any thread may be preempted at any time.
Every application in EPOC runs in its own process and sees only its own memory
space. Each process has one or more conceptually concurrent threads of execution. Every
time a process is initialized, EPOC creates a primary thread, which is the only thread
for many applications. However, processes can create additional threads.
Threads are executed individually and are unaware of other threads in a process. The scheduling of threads is preemptive. Each thread is assigned a priority; at any time, the running
thread is the highest-priority thread that is ready to run. Threads with equal priority are
time-sliced on a round-robin basis. There is also a special process, the kernel server process,
whose threads run at supervisor privilege level. This process normally contains two threads
([Symb99a]):
• The kernel server thread, which is the initial thread whose execution begins at the reset
vector, and which is used to implement all kernel functions requiring allocation or deallocation on the kernel heap. This is the highest-priority thread in the system.
• The null thread, which runs only when no other threads are ready to run. The null
thread places the processor into idle mode to save power and sets an inactivity timer,
which will turn off the entire machine if no user activity occurs within a given interval.
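The scheduling rule described above (highest ready priority wins, equal priorities take turns, the null thread runs when nothing else is ready) can be illustrated with a small simulation. This is a sketch of the rule only, not of the E32 kernel; the class and method names are hypothetical, and it assumes that a larger number means a higher priority.

```python
from collections import deque

NULL_THREAD = "null"  # stands in for the thread that idles the processor


class Scheduler:
    """Toy model: picks which ready thread receives the next time slice."""

    def __init__(self):
        self._ready = {}  # priority -> queue of ready thread names

    def make_ready(self, name, priority):
        self._ready.setdefault(priority, deque()).append(name)

    def block(self, name):
        """Remove a thread from the ready set (e.g. it waits for an event)."""
        for queue in self._ready.values():
            if name in queue:
                queue.remove(name)

    def next_slice(self):
        candidates = [prio for prio, queue in self._ready.items() if queue]
        if not candidates:
            return NULL_THREAD              # nothing ready: idle
        queue = self._ready[max(candidates)]
        name = queue.popleft()
        queue.append(name)                  # round-robin within one priority
        return name


sched = Scheduler()
sched.make_ready("editor", 10)
sched.make_ready("agenda", 10)
sched.make_ready("backup", 5)

trace = [sched.next_slice() for _ in range(4)]  # only priority 10 runs
sched.block("editor")
sched.block("agenda")
trace.append(sched.next_slice())                # now the lower priority runs
sched.block("backup")
trace.append(sched.next_slice())                # nothing ready: null thread
print(trace)
```

The trace shows that the lower-priority thread makes no progress while higher-priority threads are ready, which is the consequence of a strict priority scheme.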
Kernel resources (shown in figure 3.3), such as the kernel heap, may be used by the kernel
executive, operating in the context of any thread, or by other kernel-side threads. The kernel
executive and the user library can be thought of largely as collections of functions.
[Figure: blocks labeled User Process (four instances), User Library, Kernel Executive, Kernel Server Process, Kernel Heap, and Device Driver]
Figure 3.3: EPOC Kernel Structure
A user thread may use services provided by the kernel, by I/O devices, or by other threads
which function as servers. User threads request kernel services through the user library API.
These services, such as extending the heap, timers, semaphores, creating processes and threads,
in fact anything fundamental to the operation of the machine, are carried out by the kernel
server thread. User threads can request services from I/O devices, like keyboard, digitizer,
screen, sound, RS232, infrared, and others, by using an API provided by a user-side device
driver library. The kernel-side device driver handles the device request itself.
As already mentioned above, EPOC has specialized memory management to cope with the
constraints of handheld devices. EPOC supports a conventional two-level memory management unit, but unlike other operating systems EPOC uses only a single page directory,
with each process represented by as many page directory entries as are needed to hold the
relevant page tables. This approach saves RAM and is possible because there is no requirement to support very large virtual address spaces. Memory in EPOC is allocated in consecutive chunks in the virtual address space. A thread typically uses a single chunk, which may
expand because it includes a stack at the bottom and a heap on top ([Task99]). The most important point in the memory management of EPOC is that every application is intended to run for
several years without being closed and without the machine being reset. Therefore it is essential that all
memory leaks are eliminated, because even a very small memory leak will accumulate
over time.†
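A short calculation shows why even a small leak matters over such run times. The numbers are assumptions for illustration only: 16 MB of RAM corresponds to some of the devices in table 2.1, and the leak rate of 1 KB per hour is invented.

```python
RAM_BYTES = 16 * 1024 * 1024   # assumed device RAM: 16 MB
LEAK_BYTES_PER_HOUR = 1024     # hypothetical leak: 1 KB per hour of uptime
HOURS_PER_YEAR = 24 * 365


def years_until_exhausted(ram_bytes, leak_per_hour):
    """Uptime in years after which the leak alone has consumed all RAM."""
    return ram_bytes / leak_per_hour / HOURS_PER_YEAR


years = years_until_exhausted(RAM_BYTES, LEAK_BYTES_PER_HOUR)
print(f"RAM exhausted after about {years:.2f} years of uptime")
```

Under these assumptions the leak alone would consume the entire RAM in under two years, and the device would run short of memory much earlier because applications and data need RAM as well.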
The File Server
The EPOC file server provides file systems for ROM, RAM, and removable media, as well as an
interface for dynamically installable file systems, such as those required for communicating with remote disks over a network. The drive, directory, and file hierarchy is VFAT, which
makes the file system naturally compatible with desktop PCs running Windows, OS/2, or
DOS.
The file server implements a program loader, which supports both executables and dynamic
link libraries (DLL), which are executed in–place from ROM, or loaded as needed from
RAM or from removable media. A distinctive aspect of the EPOC file system is the use of
32–bit unique identifiers (UIDs), which allow the type of every executable to be identified.
This serves as a form of identification, used among other things for associating an application
file with its owning application. It also protects against accidental loading of a DLL which
is not the version or type required. UIDs are normally checked by the EPOC native loader,
but in the EPOC emulation mode on Windows, program loading is handled by the running
Windows platform, so UIDs will not be checked ([Task99]).
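The effect of the UID check can be sketched as follows. The sketch does not reproduce the EPOC loader or its actual UID layout: the split into a "type" and a "version" identifier, the UID values, and all names are hypothetical and serve only to show how a mismatch is rejected by a native loader and passes unnoticed when the check is skipped, as in the emulator.

```python
from dataclasses import dataclass


class UidMismatch(Exception):
    """Raised when a binary does not carry the UIDs the caller requires."""


@dataclass(frozen=True)
class Binary:
    name: str
    type_uid: int      # hypothetical 32-bit identifier for the kind of DLL
    version_uid: int   # hypothetical 32-bit identifier for its version


def load_dll(binary, required_type_uid, required_version_uid, check_uids=True):
    """Return the name of the loaded DLL, or raise if a checked UID differs."""
    if check_uids:
        if binary.type_uid != required_type_uid:
            raise UidMismatch(f"{binary.name}: wrong type UID")
        if binary.version_uid != required_version_uid:
            raise UidMismatch(f"{binary.name}: wrong version UID")
    return binary.name


# Made-up UID values for the example.
ENGINE_TYPE = 0x0000A001
ENGINE_V1 = 0x0000B001
ENGINE_V2 = 0x0000B002
old_engine = Binary("engine.dll", ENGINE_TYPE, ENGINE_V1)

native_ok = load_dll(old_engine, ENGINE_TYPE, ENGINE_V1)
try:
    load_dll(old_engine, ENGINE_TYPE, ENGINE_V2)   # caller needs version 2
    native_rejected = False
except UidMismatch:
    native_rejected = True
# Without the check the outdated DLL is loaded anyway.
emulator_ok = load_dll(old_engine, ENGINE_TYPE, ENGINE_V2, check_uids=False)
print(native_ok, native_rejected, emulator_ok)
```

The last call illustrates the practical consequence for developers: an application that loads the wrong DLL may appear to work in the emulator and fail only on the device.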
The Graphical User Interface
The EPOC graphical user interface consists of the following four parts ([Symb99b]), which
are also shown in figure 3.4:
• EIKON, which provides the particular look-and-feel of the user interface.
• The CONtrol Environment (CONE), which provides a simplified interface to the window
server.
• The window server, which controls the access of applications to the machine's interaction devices,
like screen, keyboard, and pen.
• The Graphics Device Interface (GDI), which specifies the EPOC support for graphics.
† This would be too good to be true; in reality there are always memory leaks. The EPOC development environment offers some features in the debugging mode of the emulator to check for memory
leaks. The EPOC emulator exits with a warning if the developer tries to close an application
that has memory leaks.
To implement a user interface, the application code in an EIKON application will typically
use not just the EIKON API but also the CONE and GDI APIs, as shown in figure 3.4.
[Figure: layered blocks labeled Application Code, EIKON, CONE, Window Server, GDI, and Base]
Figure 3.4: Setup of the EPOC Graphical User Interface
Because EIKON has been designed for devices with small screens and a pen instead of a mouse,
there are some notable differences between EIKON and desktop GUIs. Instead of double clicking, which is very difficult with a pen, items are activated by first selecting them and then
clicking. The menu bar is not displayed until the user activates it, to make the most of the
limited screen space. The dialog programming framework forces the developer to create a
simple dialog layout suitable for small screens. Applications normally occupy the whole
screen.
The CONtrol Environment (CONE) provides simplified access to the window
server and sets a recommended framework for all user interface libraries without imposing a
particular user interface policy. The window server is responsible for the control of the machine's interaction devices. It is a process which is started when the system is booted and remains running all the time. Because the window server process is shared
between all running applications, the applications are clients of the window server.
The Graphics Device Interface (GDI) specifies the support for graphics. Its main purpose is to
provide a rich set of drawing primitives, which include points, lines, rectangles, ellipses,
arcs, polylines, device-independent bitmaps, and text.
Programming Tools and Languages
For EPOC Release 5 Symbian provides three different software development kits for various
languages. EPOC currently supports C/C++, Java, and OPL (Organizer Programming Language)‡, which is a proprietary Basic dialect. The C/C++ development process is based on
Microsoft Visual C++. First the developer writes the application in MS Visual C++ and uses its
debugger to remove the errors from the code while running it in the EPOC emulator. After this step the code has to be recompiled for the target machine, which is currently an ARM
processor device. Then the code can be downloaded to the device and executed.
‡ Some EPOC devices (e.g. Psion 5) already include an OPL programming environment directly on
the device.
Symbian's Java software development kit is based on JDK 1.1.4. It also provides AWT as
graphical user interface and JNI for writing native methods. The Java SDK is normally just used
to test Java programs in the emulator before downloading them onto the device,
or to write an interface between Java and the native C++ code of the EPOC operating system. Developing in OPL mostly means writing the code either in any available editor or directly in the EPOC emulator. After compilation the code can be executed directly
on the device or in the emulator. OPL is very restricted in functionality compared to C/C++,
but it is easy to learn.
4 Palm OS Version 3.x
Unlike for the other handheld operating systems, the hardware for the Palm Operating System
(Palm OS) was designed after the software. The reason for that was Graffiti, a third-party
handwriting recognition software originally developed for Apple's Newton and other
PDAs. Palm Computing improved the Graffiti software so that it became easy to use,
accurate, and fast, requiring only a little effort to learn the different strokes ([RhMc99]). The pen,
used to write the Graffiti strokes or to type on a small virtual keyboard, is still the only
input mechanism for Palm devices.**
** External keyboards already exist, but are not part of the Palm product.
The Palm operating system runs on top of a preemptive multitasking, event-driven, and
multi-threaded micro kernel. One task runs the user interface; others handle things like monitoring input. Unlike other handheld operating systems, Palm OS allows only one application
to be open at a time, because applications run in the single user interface thread and therefore
cannot be multi-threaded. The fact that the device was designed for the software and its
operating system results in a very tight relationship between them. It goes so far that the device is understood as an "extension of the desktop" ([RhMc99]). This understanding is important for Palm
users and software developers.
[Figure 4.1: labels recoverable from the figure: Applications; User Interface (Forms, Controls,
Fonts, Dialogs, Menus, Drawing); Memory Management (Databases, Runtime Space, System Space,
Globals); System Management (Events, Strings, Intl. Text, Time, Alarm, Sounds); Comms (Exchange,
Serial, TCP/IP, IrDA); Microkernel; Licensee Additions; Hardware.]
Figure 4.1: Architecture of the Palm Operating System
The following sections give a more detailed overview of the Palm operating system,
looking at the kernel with its thread and memory handling, the file system, the optimized
graphical user interface, and Palm OS development in general (figure 4.1).
Kernel
As mentioned above, Palm OS is a preemptive multitasking, multi-threaded,
event-driven operating system. In contrast to other handheld operating systems, the internals
of the Palm OS kernel are not documented. As far as is known, the Palm operating system supports
between 4 and 6 threads, depending on the operating system version, all but one of which are
used for system-internal functions. The one free thread is used by the user application, which
is why only one application runs at a time. In contrast to the internals of the microkernel,
the memory management is precisely documented.
The memory in Palm OS is handled in an unusual fashion. Like every other operating system,
Palm OS uses both ROM and RAM. The flash ROM contains the operating system itself, which
can be updated with a newer Palm OS ROM image. The interesting part of the memory
management, however, is the RAM, which is used for both dynamic memory allocation and persistent
storage. The dynamic memory is used by applications and the system while they are running. It
holds the operating system globals, the dynamic allocations of the OS and the applications,
the application globals, and the application stack ([RhMc99]). The persistent storage holds
all uploaded applications as well as any data the user creates, views, or edits, such as
names and phone numbers, to-dos, memos, and data created by any other application.
Memory of both types is allocated in chunks; for persistent storage, these chunks are
collected in so-called databases.
Unlike in traditional desktop operating systems, data and code are not copied from persistent
storage, such as a hard drive, to dynamic memory but are used in place. The system executes
any application in place, because the persistent storage itself is RAM and can therefore be
read like dynamic memory. This execute-in-place concept makes it possible to run an
operating system and applications in a memory-limited environment ([RhMc99]).
File System
Palm OS does not use a traditional file system, both because of its limited memory and to
allow efficient synchronization with a desktop computer. Instead, data is stored in memory chunks
called records, which are grouped into databases that are similar to files. Because of this
design, files are not stored as one contiguous piece but are broken down into multiple records.
To save space, a database is edited in place instead of being copied from persistent storage
into dynamic memory ([PPC99]).
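The record model described above can be illustrated with a minimal sketch. It is not the Palm OS API: the class `RecordDatabase` and its methods are hypothetical names invented here, and the sketch only shows the idea of a "file" that is kept as separate records and modified in place.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class RecordDatabase:
    """A named collection of independently stored byte records (hypothetical model)."""
    name: str
    records: list[bytearray] = field(default_factory=list)

    def append(self, payload: bytes) -> int:
        """Store a new record and return its index."""
        self.records.append(bytearray(payload))
        return len(self.records) - 1

    def edit_in_place(self, index: int, offset: int, data: bytes) -> None:
        """Overwrite part of one record without copying the whole database."""
        record = self.records[index]
        if offset < 0 or offset + len(data) > len(record):
            raise ValueError("edit would exceed the record boundary")
        record[offset:offset + len(data)] = data

    def as_file(self) -> bytes:
        """Concatenate all records, which is how a file-like view could be rebuilt."""
        return b"".join(self.records)


memo_db = RecordDatabase("MemoDB")
memo_db.append(b"buy milk")
memo_db.append(b"call Bob")
memo_db.edit_in_place(1, 5, b"Rob")   # touches only the second record
```

Only the affected record is modified; the other records, and the database as a whole, are never copied.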
Graphical User Interface
Since Palm OS is an event-driven operating system, the whole graphical user interface is
based on this principle. The GUI consists of different forms, but only one at a time has the
focus and is active. Only this form reacts to events such as pen down, pen up, system
key down, and system key up. Not only the form itself reacts to events; all the objects
inside the form, such as buttons, list boxes, tables, and fields, respond to events as well.
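The dispatching principle can be sketched as follows. This is an illustration only, not Palm OS code: the classes `Form` and `Button`, the event dictionaries, and the event names are assumptions made for this example.

```python
from collections import deque


class Button:
    def __init__(self, label, x0, y0, x1, y1):
        self.label = label
        self.bounds = (x0, y0, x1, y1)
        self.taps = 0

    def handle(self, event):
        """Consume a pen-down event that falls inside the button's bounds."""
        x0, y0, x1, y1 = self.bounds
        if event["type"] == "penDown" and x0 <= event["x"] <= x1 and y0 <= event["y"] <= y1:
            self.taps += 1
            return True
        return False


class Form:
    def __init__(self, name, controls):
        self.name = name
        self.controls = controls
        self.unhandled = []

    def handle(self, event):
        """Offer the event to each control first; the form keeps what is left."""
        for control in self.controls:
            if control.handle(event):
                return
        self.unhandled.append(event["type"])


def run_event_loop(queue, active_form):
    """Dispatch every queued event to the single active form."""
    while queue:
        active_form.handle(queue.popleft())


ok = Button("OK", 10, 10, 50, 30)
cancel = Button("Cancel", 10, 10, 50, 30)
main_form = Form("main", [ok])
hidden_form = Form("hidden", [cancel])   # exists, but does not have the focus

events = deque([
    {"type": "penDown", "x": 20, "y": 20},
    {"type": "penUp", "x": 20, "y": 20},
    {"type": "penDown", "x": 100, "y": 100},
])
run_event_loop(events, main_form)
```

Because only `main_form` is passed to the loop, the inactive form and its button never see an event, even though the button occupies the same screen area.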
Programming Tools and Languages
Software development for Palm OS is currently based on C/C++, with many different
development environments available, such as Metrowerks CodeWarrior or simply the GNU C (GCC)
environment. Compared to all other handheld platforms, Palm OS has the most and the best
graphical user interface editors, such as Satellite Forms from Puma Inc. or IBM’s Personal
Application Builder ([RhMc99]).
Besides C/C++, the developer may also develop in assembler or in CASL, a proprietary
BASIC-like language. Some months ago Sun provided the Java 2 Platform, Micro Edition with a
K Virtual Machine (KVM) for the Palm operating system. Palm OS is so far the only platform
for which development in a Macintosh or Unix environment is possible.
5 Windows CE Version 3.0 and “PocketPC”
Microsoft Windows CE was designed as a multithreaded, fully preemptive, multitasking,
multi-platform operating system for devices with limited resources. Compared to older versions
of Windows CE, it now includes certain real-time operating system features, such as support
for nested interrupts, better thread response, additional task priorities, and semaphores.
The design has still been kept modular so that it can be customized for products ranging from
consumer electronic devices to specialized industrial controllers. The Windows CE operating
system consists of four primary groups of modules (Figure 5.1).
• The kernel supports basic services, such as process and thread handling and memory management.
• The file system provides persistent storage of information.
• The Graphics, Windowing, and Event Subsystem (GWES) controls graphics and window-related features.
• The communications interface supports the exchange of information with other devices.
Besides these four main modules, Windows CE also contains a number of additional modules that
support tasks such as COM/OLE, communication, PCMCIA support, RSA encryption, multimedia
support, and the management of installable device drivers. The following illustration
(figure 5.1) shows how these features fit into the overall structure of the Windows CE
operating system.
[Figure 5.1: labels recoverable from the figure: Windows CE Applications; Development Tools;
Shell; Persistent Storage; Kernel; GWES; Communications; Built-in Drivers; Installable Drivers;
Hardware.]
Figure 5.1: Architecture of the Windows CE Operating System
The Kernel
The Windows CE kernel contains the core operating system functionality that must be
present on all Windows CE-based platforms. It includes support for process management,
exception handling, multitasking, multithreading, and memory management.
To maximize the available memory, the Windows CE kernel uses dynamic link libraries, which
are written as reentrant code so that applications can share common routines simultaneously.
This approach minimizes the amount of code that has to stay in memory for a long time in
order to execute applications.
As a multitasking operating system, Windows CE supports up to 32 simultaneous processes,
each of which is a single instance of an application. Additional multithreading support
allows each process to create multiple threads of execution. A thread is a part of a process that
runs concurrently with other parts. Threads operate independently, but each one belongs to a
particular process and shares that process’s memory space. The total number of threads is limited
only by the available physical memory. Although threads can operate independently, they are
managed by the process that owns them. One thread may, for example, depend on another
for information while running concurrently with others. Thread synchronization suspends a thread’s
execution until the thread receives the notification to proceed. Windows CE provides several
synchronization objects that enable you to synchronize a thread’s actions with those of another
thread. These objects include critical sections, mutexes, events, and semaphores. Additionally,
you can use interlocked functions or wait functions to synchronize a thread.
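The idea of suspending a thread until it is notified can be shown with a small sketch. It uses Python's `threading.Event` as a stand-in for an event synchronization object; it is not the Win32 API, and the variable and function names are chosen for this example only.

```python
import threading

results = []
shared = {}
data_ready = threading.Event()   # plays the role of an "event" synchronization object


def consumer():
    # Blocks without polling until the producer signals the event.
    data_ready.wait()
    results.append(shared["value"] * 2)


def producer():
    shared["value"] = 21
    data_ready.set()             # the notification to proceed


worker = threading.Thread(target=consumer)
worker.start()
producer()
worker.join()
```

The consumer thread depends on the producer for its input, yet it never spins in a loop checking for it; the wait call suspends the thread until the event is set.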
Windows CE implements thread synchronization with minimum processor resources, which
is an important feature for battery-powered devices. Windows CE uses the kernel to handle
thread-related tasks, such as scheduling, synchronization, and resource management, so that
an application need not poll for process or thread completion or perform other
thread-management functions. As a preemptive operating system, Windows CE allows the execution
of a process or thread to be preempted by any other with a higher priority. For thread
scheduling it uses a priority-based, time-slice algorithm with 256 levels of thread priority.
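A strongly simplified model of priority-based time slicing is sketched below. It is not the Windows CE scheduler: all threads are assumed to be ready from the start, so preemption reduces to always serving the most urgent non-empty level, and treating level 0 as the most urgent is an assumption of this sketch. The function name `schedule` is hypothetical.

```python
from collections import deque

NUM_PRIORITIES = 256  # from the text; treating 0 as the most urgent level is an assumption


def schedule(threads, time_slice):
    """Simulate a run; threads is a list of (name, priority, work_units).

    Returns the order in which time slices were handed out. The most urgent
    non-empty level always runs first; threads on the same level take turns.
    """
    levels = [deque() for _ in range(NUM_PRIORITIES)]
    for name, priority, work in threads:
        levels[priority].append([name, work])

    trace = []
    for queue in levels:                 # walk levels from most to least urgent
        while queue:
            thread = queue.popleft()
            trace.append(thread[0])
            thread[1] -= time_slice      # consume one quantum
            if thread[1] > 0:
                queue.append(thread)     # not finished: back of its own level
    return trace


trace = schedule([("ui", 10, 2), ("bg", 200, 1), ("net", 10, 1)], time_slice=1)
```

The two threads on level 10 alternate until both are done, and the background thread on level 200 only runs afterwards.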
The Windows CE kernel supports a single, flat (unsegmented) virtual address space of 4 GB
that all processes share. Instead of assigning each process a different address
space, Windows CE protects process memory by altering page protections. Approximately 1
GB of the total virtual memory, divided into 33 memory slots of 32 MB each, is reserved
for process execution. The kernel protects each process by assigning it to a uniquely reserved
memory slot, which results in the limit of 32 processes.
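The arithmetic behind these figures can be checked quickly. The sketch below assumes that the slots are laid out contiguously starting at address 0, which the text does not state, and the function names are made up for illustration. The text also does not say why 33 slots yield only 32 processes; a plausible reading is that one slot is set aside for the process that is currently running.

```python
SLOT_SIZE = 32 * 1024 * 1024   # 32 MB per slot, as stated in the text
SLOT_COUNT = 33                # 33 slots, as stated in the text


def slot_range(slot):
    """Return the (first, last) virtual address covered by a slot."""
    if not 0 <= slot < SLOT_COUNT:
        raise ValueError("slot out of range")
    base = slot * SLOT_SIZE
    return base, base + SLOT_SIZE - 1


def slot_of(address):
    """Return the slot that contains a virtual address in the slot region."""
    slot = address // SLOT_SIZE
    if not 0 <= slot < SLOT_COUNT:
        raise ValueError("address lies outside the slot region")
    return slot


total_mb = SLOT_COUNT * SLOT_SIZE // (1024 * 1024)   # size of the whole slot region
```

With these numbers the slot region covers 1056 MB, which matches the "approximately 1 GB" in the text.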
Because handheld or palm-size devices usually have no disk drive, physical memory, a
combination of ROM and RAM, plays a substantially different role on these platforms
than it does on a desktop computer. The unmodifiable ROM is used for permanent storage
and includes the Windows CE operating system and any built-in applications that the
manufacturer provides.
To compensate for the lack of a disk drive, Windows CE maintains the contents of RAM
continuously, which allows applications to use RAM for persistent storage as well as for
application execution. To serve these two purposes, RAM is divided into storage, also known
as the object store, and program memory. Program memory is used for application execution,
while the object store is used for persistent storage of data and of any executable code not
stored in ROM. To minimize RAM requirements on Windows CE-based devices, executable code
stored in ROM usually executes in place, not in RAM. Because of this, the operating system
needs only a small amount of RAM for purposes such as stack and heap storage.
Third-party applications are commonly stored and executed in RAM. These RAM-based
applications are stored in compressed form, so they must be uncompressed and loaded into
program memory for execution. To increase the performance of application software and reduce
RAM use, Windows CE supports on-demand paging. With it, the operating system needs
to uncompress and load only the memory page containing the portion of the application that
is currently executing. When execution of that portion is finished, the page can be swapped
out and the next page can be loaded.
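The following sketch illustrates on-demand paging of a compressed image. It is a toy model, not Windows CE behaviour in detail: the page size, the use of zlib, the one-page residency policy, and the class name `CompressedImage` are all assumptions of this example.

```python
import zlib

PAGE_SIZE = 4096  # hypothetical page size chosen for the sketch


class CompressedImage:
    """An application image stored as independently compressed pages."""

    def __init__(self, image: bytes):
        self.pages = [
            zlib.compress(image[i:i + PAGE_SIZE])
            for i in range(0, len(image), PAGE_SIZE)
        ]
        self.resident = {}     # page number -> uncompressed bytes
        self.page_faults = 0

    def read(self, address: int) -> int:
        """Return the byte at an address, loading its page only when needed."""
        page_no, offset = divmod(address, PAGE_SIZE)
        if page_no not in self.resident:
            self.page_faults += 1
            self.resident.clear()                      # swap out the previous page
            self.resident[page_no] = zlib.decompress(self.pages[page_no])
        return self.resident[page_no][offset]


image = bytes(range(256)) * 48          # 12288 bytes, i.e. three pages
app = CompressedImage(image)
first = app.read(0)
second = app.read(10)                   # same page, no additional load
third = app.read(5000)                  # different page, triggers a load
```

Only the page that is actually touched is ever uncompressed, so the memory needed at any time is one page rather than the whole image.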
The Persistent Storage
The persistent storage portion of RAM is called the object store. It includes the file
system for the storage of application and data files, the database, which provides structured
storage as an alternative to keeping application data in files or in the registry, and the
Windows CE system registry, which is used to store the system configuration and any other
information that an application must access.
The Windows CE file system holds executable files and data files that the user installs or
creates. It supports up to nine FAT volumes, which are treated as storage cards. If a storage
card has multiple partitions, each partition is treated as a separate volume. Files in the FAT
file system are typically stored in compressed form and are accessible with standard
Win32 file system functions. To reduce data loss during a critical failure, such as a loss of
power, the Windows CE file system is transactional. In addition, the file system implements
a transactional mirroring scheme to track FAT file system operations that are not
transactional. The mirroring scheme restores the FAT volume if power is lost while a critical
operation is being performed.
The Windows CE database provides general-purpose, structured storage of data, but it is not
a full-fledged database. In particular, Windows CE databases have only one level of
hierarchy: records cannot contain other records, nor can they be shared between databases.
The Graphics, Windowing, and Event Subsystem
The Graphics, Windowing, and Event Subsystem (GWES) is the graphical user interface of
the Windows CE operating system. User input is handled by translating keystrokes, stylus
movements, and control selections into messages that convey information to applications
and the operating system. Output to the user is created as windows, graphics, or
text that are displayed on display devices and printers.
GWES supports all the windows, dialog boxes, controls, menus, and resources that make
up the Windows CE user interface. This interface allows users to control applications by
choosing menu commands, pushing buttons, checking and unchecking boxes, and manipulating a
variety of other controls. GWES also provides the user with information
in the form of bitmaps, carets, cursors, text, and icons.
Even Windows CE-based platforms that lack a graphical user interface use the basic
windowing and messaging capabilities of GWES. These provide the means for communication
between the user, the application, and the operating system. As part of GWES, Windows CE
provides support for active power management to extend the limited lifetime of
battery-operated devices. The operating system automatically determines a power consumption
level to match the state of operation of the device.
Programming Tools and Languages
Windows CE “PocketPC” applications can be written in Basic or C/C++ using the Microsoft
eMbedded Visual Tools 3.0 ([MCE00]), which are used to develop, compile, and debug the
applications.
A Java Virtual Machine from Microsoft does not yet exist. Right now only other companies
provide a Java Virtual Machine (JVM) for Windows CE. Sun, for example, provides the
PersonalJava Application Environment, and NSIcom offers CrEme, an augmented Java Virtual
Machine (cf. Appendix ’Internet Links’).
6 Other Handheld Operating Systems
This section provides a brief overview of some other operating systems, which are not yet
popular in Europe or the US, but have for example a significant market share in Japan. However, this does not imply that these operating systems are less interesting. References to Internet pages can be found in the Appendix ’Internet Links’.
JBlend
JBlend, developed by the Aplix Corporation, is an operating system that adheres to the
JTRON architecture specification. JTRON, announced by the TRON Project, is a real-time
operating system architecture based on ITRON, a high-performance real-time operating system
specification, and the Java runtime environment ([JBLEND]). JBlend has been implemented on
several systems, such as car navigation systems, panel computers, and Personal
Digital Assistant (PDA) devices.
[Figure 6.1: labels recoverable from the figure: JavaOS; PAL; processor dependent portion
(thread, monitor, interruption); ITRON specification kernel (task manipulation, system calls,
semaphore manipulation); CPU.]
Figure 6.1: Architecture of the JBlend Operating System
To merge Java and ITRON, Aplix decided to let both coexist in JBlend in order to minimize the
loss of performance. To harmonize the interaction between both parts, an intermediary layer
called the Processor Abstraction Layer (PAL) has been introduced, as shown in figure 6.1.
Neutrino
QNX Neutrino 2.0 is a scalable (from a tiny, resource-constrained system up to high-end
distributed computing environments), event-driven, real-time operating system built following
the POSIX application programming interface. The main part of Neutrino is a true
microkernel design that provides multitasking, threads, priority-driven preemptive scheduling,
and fast context switching. The microkernel can be extended by a pool of operating system
modules that can be plugged in and out dynamically, such as a process manager, different file
managers for DOS, QNX, CD-ROM, NFS, and other file systems, a TCP/IP manager, a character
manager, a network manager, and the Photon GUI manager ([QNX99], [QNX01]).
VTech OS
VTech’s VT-OS Version 1.1 is an open source operating system, which has mainly been
designed for the Helio PDA. The operating system consists of three layers: the device driver,
the kernel, and the user interface. The device driver supports pen, display, sound, I/O keys,
serial port, interrupt, and hardware module functions. The kernel includes memory, database,
and power management as well as a queue, a scheduler, and a timer. The user interface has
functions for strings, alarms, resources, a clipboard, and a display.
VT-OS contains a simple single-process kernel that provides basic process management.
An application is generally a single-threaded, event-driven program. The three layers
communicate through a message and event queue ([VT99]).
Zaurus
Zaurus, developed by Sharp, is a very popular operating system in Japan. It is very hard to
find any information about the structure of this operating system, because unfortunately most
information is only available in Japanese (cf. Appendix ’Internet Links’). Nevertheless this
operating system should be mentioned here.
Summary
There is no ultimate operating system that can fit every demand, like hardware platform flexibility, easy customer usage, customer acceptance, big variety of development tools and platforms, and clear and flexible programming APIs. Every operating system has a different way
to achieve most of these targets, but the future will show which operating system will succeed.
7 Platform Invariant Design of Applications
Platform invariant design is an old and long-known problem, but nowadays it is
gaining a new direction and new momentum. The reason for this development is the myriad of
handheld devices with their diversity of operating systems, which differ more than any two
desktop PC operating systems. This high diversity and the new and different constraints call
for new solutions in application design. This section describes how to successfully
develop multi-platform applications for PDAs, using the IBM satellite database system DB2
Everyplace as an example.
DB2 Everyplace is a relational database management system developed by IBM for handheld
devices running Palm OS, Windows CE, EPOC, embedded Linux, or QNX Neutrino.
The database system is smaller than 150 kBytes and allows the synchronization of relational
data from other data sources such as DB2 Universal Database for UNIX, OS/2 and Windows NT,
DB2 for OS/390, and DB2 for AS/400. DB2 Everyplace Version 7.1 supports a subset of the
SQL 99 standard:
• CREATE TABLE with up to eight columns in a primary key, referential and check constraints. Only the data types INTEGER, INT, SMALLINT, DECIMAL, CHAR, CHARACTER, VARCHAR, BLOB, DATE, and TIME are supported.
• CREATE INDEX with up to eight columns for the index key.
• DELETE one or more rows from a table.
• DROP tables or indexes.
• INSERT a single row into a table.
• SELECT statement supporting DISTINCT, GROUP BY, ORDER BY, and LIMIT.
(no support of nested statements)
• UPDATE the values of specified columns in rows of a table.
DB2 Everyplace also supports a subset of three interfaces: DB2 Call Level Interface, Open
Database Connectivity (ODBC), and Java Database Connectivity (JDBC).
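To give a feeling for what this subset allows, the following sketch runs one statement of each listed kind. SQLite (through Python's `sqlite3` module) is used purely as a stand-in engine, since it accepts the same statement forms; it is not DB2 Everyplace, and the table, index, and column names are invented for this example. The use of `SUM` is also an assumption, as the list above does not enumerate the supported aggregate functions.

```python
import sqlite3

# SQLite is only a stand-in engine here; it is not DB2 Everyplace.
conn = sqlite3.connect(":memory:")
cur = conn.cursor()

cur.execute(
    "CREATE TABLE visit ("
    " customer CHAR(20) NOT NULL,"
    " day DATE NOT NULL,"
    " amount INTEGER CHECK (amount >= 0),"
    " PRIMARY KEY (customer, day))"
)
cur.execute("CREATE INDEX visit_amount ON visit (amount)")

# One row per INSERT, matching the single-row restriction in the list.
for row in [("Meier", "2000-05-02", 40), ("Meier", "2000-05-03", 10),
            ("Schmidt", "2000-05-02", 25)]:
    cur.execute("INSERT INTO visit VALUES (?, ?, ?)", row)

cur.execute("UPDATE visit SET amount = 30 WHERE customer = 'Schmidt'")
cur.execute("DELETE FROM visit WHERE amount < 20")

# No nested statements: a single flat SELECT with GROUP BY, ORDER BY and LIMIT.
totals = cur.execute(
    "SELECT customer, SUM(amount) FROM visit"
    " GROUP BY customer ORDER BY customer LIMIT 10"
).fetchall()

cur.execute("DROP INDEX visit_amount")
cur.execute("DROP TABLE visit")
conn.close()
```

Everything a typical small data-entry application needs fits into these statement forms; what is missing are nested queries and the richer data types of a server database.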
The following subsections discuss the design of the first version of IBM’s database
management system for handheld devices, named DB2 Everywhere. This discussion is followed by
a presentation of the problems of porting this first database management system to the EPOC
platform. The problems encountered there led to a design turnaround and a reimplementation of
the DBMS. These major changes, together with the marketing requirement of a registered
trademark, resulted in a new database system that is only slightly changed in functionality,
named DB2 Everyplace Version 7.1. Therefore, the two product names are used explicitly to
distinguish the two versions.
7.1 General Design of a DBMS for Handheld Devices
Due to the limited resources on these handheld devices, the general textbook six-layer
approach is doomed to fail because of the large overhead created by the different design
layers. The best approach is to eliminate as many layers as possible in order to make the
database management system very compact. Meeting the special requirements of handheld devices
calls for a reduced architecture ([HäRa99]), as shown in figure 7.1.
[Figure 7.1: labels recoverable from the figure: Data management; Meta-data; Access management;
Transactions; Memory management.]
Figure 7.1: The Architecture of a DBMS for handheld devices
The data management layer deals with the translation and optimization of requests. The
management of physical records and access paths is done in the access management layer.
Buffer and memory management is handled by the memory management layer. The
transactions layer ensures consistent data. The meta-data management stores and handles all
internal information about system tables, indexes, user tables, columns, and records. This
reduced design makes it possible to fit a full database management system into only
a few kilobytes.
7.2 Design of DB2 Everywhere Version 1.x
DB2 Everywhere is designed as a small footprint [Footnote: Small footprint means minimal
memory consumption for storage and during runtime.] relational database, which should offer
users the functionality they know from DB2 or any other database management system on the
back-end server or on their desktop PC. With a target library code size of about 100 kB,
it is impossible to provide the whole functionality of a mainframe or desktop database
system. To achieve the small footprint, DB2 Everywhere has been designed with limited
functionality, reduced to a core of necessary functions (cf. [DB2e99],
Appendix A and B). The Call Level Interface (CLI) and Open DataBase Connectivity
(ODBC) are the supported interfaces for DB2 Everywhere. In the next version a JDBC interface
will also be available for certain platforms.
The CLI/ODBC interface is the topmost layer of any database management system and provides
the users with a standardized interface to the whole functionality of the database system.
The limitation of the CLI/ODBC interface functionality and the implementation of only a
subset of SQL make it possible to build a small-footprint database system. Figure 7.2 shows
the internal design of DB2 Everywhere Version 1.x with its six main parts: CLI/ODBC, Runtime,
Compiler, Catalog, Data Management Services, and Operating System Services.
[Figure 7.2: labels recoverable from the figure: CLI/ODBC; Runtime (RDS, Interpreter); Compiler
(Parser, Semantic, Optimizer); Catalog; Data Management Services (DMS) for Palm OS and
Windows CE; Operating System Services (OSS) for Palm OS and Windows CE; Operating System.]
Figure 7.2: Architecture of DB2 Everywhere Version 1.x.
After passing the CLI layer, any command that has not already been executed reaches the
Runtime part, which consists of the relational data services and the interpreter. The
relational data services (RDS) work as a dispatcher for all database commands. This module
decides whether the compiler needs to be called or whether the command can be passed
directly to the interpreter. The interpreter executes the SQL commands and evaluates
any aggregate functions and predicates.
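The dispatcher role of the RDS can be sketched as follows. The text does not say by which criterion the RDS decides whether the compiler is needed; this sketch assumes, purely for illustration, that a statement whose text has been compiled before can reuse its plan. All class and function names are hypothetical, and the compiler and interpreter are trivial stand-ins.

```python
class RelationalDataServices:
    """Hypothetical dispatcher: compile a statement once, then reuse the plan."""

    def __init__(self, compiler, interpreter):
        self.compiler = compiler
        self.interpreter = interpreter
        self.plans = {}            # normalised SQL text -> compiled plan
        self.compile_calls = 0

    def dispatch(self, sql, params=()):
        # Toy normalisation; a real engine must not upper-case string literals.
        key = " ".join(sql.split()).upper()
        if key not in self.plans:               # the compiler is needed
            self.plans[key] = self.compiler(key)
            self.compile_calls += 1
        return self.interpreter(self.plans[key], params)


def toy_compiler(sql):
    # Stand-in for parsing, semantic checking and optimising: just tokenise.
    return tuple(sql.split())


def toy_interpreter(plan, params):
    # Stand-in for execution: report the verb and the bound parameters.
    return plan[0], tuple(params)


rds = RelationalDataServices(toy_compiler, toy_interpreter)
first = rds.dispatch("select name from  person where id = ?", (1,))
second = rds.dispatch("SELECT name FROM person WHERE id = ?", (2,))
```

The second call goes straight to the interpreter, because the normalised statement text is already known to the dispatcher.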
The Compiler, as in any other database system, consists of a parser, a semantic analyzer, and
an optimizer. It is called whenever an SQL statement has to be parsed, checked, and
optimized. The parser is responsible for splitting any entered SQL statement into its tokens
and for checking whether the statement is grammatically correct. After the syntax check, the
semantic analyzer tests the query for its semantic correctness, for example by checking
whether the table and column names are correct. If the statement has passed these two steps
and has been accepted as correct, the optimizer improves it to
achieve the best execution performance.
Information about all databases and tables is stored in the catalog. DB2 Everywhere
does not support multiple databases, so the catalog only stores meta-information about
tables, their columns, the columns’ data types, and indexes.
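The first two compiler stages and their use of the catalog can be illustrated with a deliberately tiny sketch. It accepts only statements of the form SELECT column list FROM table, the catalog content is invented, and none of the names reflect the actual DB2 Everywhere implementation.

```python
import re

# Hypothetical catalog: table name -> {column name: data type}
CATALOG = {"PERSON": {"ID": "INTEGER", "NAME": "VARCHAR"}}

TOKEN = re.compile(r"\s*(\*|,|[A-Za-z_][A-Za-z_0-9]*)")


def tokenize(sql):
    """Split a statement into upper-case tokens; reject unknown characters."""
    tokens, pos = [], 0
    sql = sql.strip()
    while pos < len(sql):
        match = TOKEN.match(sql, pos)
        if not match:
            raise SyntaxError(f"unexpected character at position {pos}")
        tokens.append(match.group(1).upper())
        pos = match.end()
    return tokens


def parse_select(tokens):
    """Syntax check; accepts only: SELECT col [, col]... FROM table."""
    if len(tokens) < 4 or tokens[0] != "SELECT" or tokens[-2] != "FROM":
        raise SyntaxError("expected SELECT <columns> FROM <table>")
    column_part = tokens[1:-2]
    # Columns and commas must alternate, starting and ending with a column.
    if len(column_part) % 2 == 0 or any(t != "," for t in column_part[1::2]):
        raise SyntaxError("malformed column list")
    columns = column_part[0::2]
    if "," in columns:
        raise SyntaxError("malformed column list")
    return columns, tokens[-1]


def check_semantics(columns, table):
    """Semantic check: the table and every column must exist in the catalog."""
    if table not in CATALOG:
        raise LookupError(f"unknown table {table}")
    for column in columns:
        if column != "*" and column not in CATALOG[table]:
            raise LookupError(f"unknown column {column}")
    return columns, table


checked = check_semantics(*parse_select(tokenize("select id, name from person")))
```

A statement such as `select id name from person` fails in the parser, while `select age from person` is grammatically fine and is only rejected by the semantic check against the catalog.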
The Data Management Services (DMS) are responsible for direct database access at the table,
record, or column level. To speed up prototyping in DB2 Everywhere Version 1.x, this
layer has been designed as an interface to the native database functions of each operating
system. The internal data representation of DB2 Everywhere has been mapped to the databases of
Palm OS or Windows CE.
The lowest layer is the Operating System Services (OSS) layer, which encapsulates all
operating system dependencies in order to hide them from the layers above it. The interfaces
it typically provides include memory access, character and string manipulation, file system
access, and other functionality. When DB2 Everywhere is ported to another hardware and software
platform, only the OSS layer should have to be customized, and afterwards the engine should run
on the new operating system immediately.
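The intent of such a layer can be sketched as an abstract interface with one implementation per platform. The interface below is an illustration only: its name, its three methods, and the `HostOSS` port are assumptions of this example and do not describe the real OSS layer.

```python
from abc import ABC, abstractmethod
import os
import tempfile


class OperatingSystemServices(ABC):
    """Hypothetical OSS interface: everything platform-specific hides behind it."""

    @abstractmethod
    def read_file(self, name: str) -> bytes: ...

    @abstractmethod
    def write_file(self, name: str, data: bytes) -> None: ...

    @abstractmethod
    def compare_strings(self, left: str, right: str) -> int: ...


class HostOSS(OperatingSystemServices):
    """One concrete port, here for whatever host runs this sketch."""

    def __init__(self, root: str):
        self.root = root

    def read_file(self, name):
        with open(os.path.join(self.root, name), "rb") as handle:
            return handle.read()

    def write_file(self, name, data):
        with open(os.path.join(self.root, name), "wb") as handle:
            handle.write(data)

    def compare_strings(self, left, right):
        # Code point order, so results do not depend on the platform locale.
        return (left > right) - (left < right)


def store_and_reload(oss: OperatingSystemServices, name: str, data: bytes) -> bytes:
    """Engine-side code: it only ever talks to the OSS interface."""
    oss.write_file(name, data)
    return oss.read_file(name)


with tempfile.TemporaryDirectory() as root:
    host = HostOSS(root)
    echoed = store_and_reload(host, "table.dat", b"row-1")
    order = host.compare_strings("a", "b")
```

Porting then means writing another subclass; `store_and_reload` and all other engine-side code stay unchanged as long as nothing bypasses the interface.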
This statement is only partially correct. Looking inside DB2 Everywhere, one
discovers a lot of platform-dependent implementation in layers other than the DMS and the
OSS layer. But why are there so many platform-dependent implementations outside
the DMS and the operating system services layer?
Platform-dependent implementations above the DMS layer are caused by the design of the
DMS layer. The decision to use the native database management systems (NDBMS) of Palm
OS and Windows CE helped to speed up the development process but caused a lot of problems
in the implementation of DB2 Everywhere. As already shown, the Palm OS and Windows CE
platforms are totally different in their structure. Not only are the operating systems
different, but the implementations of the NDBMS are not compatible either. Windows CE uses
its object store to implement the database system; Palm OS uses “records” and
“databases” in place of a file system.
Implementing the data management services on the basis of two such different operating systems
brings all their differences with it. There are, for example, the problems of big- and
little-endian formats for numeric operations, or differences in the sorting of character
strings. In addition, there are the dissimilar implementations of the NDBMS. For example,
after a sequential insertion into a table, Windows CE stores the records in the opposite
order to Palm OS. This is only one of the problems caused by the different designs
of the NDBMS on the two platforms; others are the incompatible
NDBMS interfaces and the constraints of each NDBMS, such as maximum table size,
number of records, size of keys, and size limits of data types.
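The byte-order problem mentioned above is easy to demonstrate. The sketch uses Python's `struct` module; the helper names `encode_key` and `decode_key` are invented, and fixing one byte order for stored data is shown as one common remedy, not as the approach actually taken in DB2 Everywhere.

```python
import struct

value = 0x12345678

big = struct.pack(">I", value)      # most significant byte first
little = struct.pack("<I", value)   # least significant byte first

# Reading bytes with the wrong byte order silently yields a different number.
misread = struct.unpack("<I", big)[0]


def encode_key(number: int) -> bytes:
    """Store unsigned integers in one fixed byte order, regardless of the host CPU."""
    return struct.pack(">I", number)


def decode_key(raw: bytes) -> int:
    return struct.unpack(">I", raw)[0]


# Big-endian unsigned keys also compare correctly as raw byte strings.
sorted_numbers = [decode_key(k) for k in sorted(encode_key(n) for n in (300, 2, 70000))]
```

Data written on one platform and read on another with the opposite byte order would be misinterpreted in exactly the way `misread` shows, unless the storage format is fixed independently of the processor.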
These differing behaviors of the NDBMSs also require major customization of the Compiler
and the Runtime part of DB2 Everywhere. Porting this version of DB2 Everywhere to
the EPOC platform will add more specialized implementations in all these parts. Having
more and more specialized implementations increases the problems of porting DB2
Everywhere to other platforms. Each further platform again increases the complexity of the
code and raises more porting problems. The only way out of this vicious circle is to
implement a separate data management service that is no longer based on the native
databases of each platform but only on the file system. But even that will cause
problems, because Palm OS does not have a file system like any other operating system.
Therefore a solution customized for Palm OS would always be needed.
7.3 Porting to EPOC
This section explains how DB2 Everywhere Version 1.x was ported to the EPOC platform. It
focuses not only on the problems already mentioned, but also on problems that are typical and
characteristic of the EPOC platform.
DB2 Everywhere Version 1.x was designed to use the NDBMS of the Palm and the
Windows CE operating systems. The first aim was to realize the same approach for the
EPOC platform. But before starting to port the database system engine to the EPOC platform,
it is important to understand the development environment and especially those parts that are
used in the engine. The main parts to know about are memory handling, string manipulation,
and of course EPOC’s native database system.
First it is necessary to discuss one of the major differences between the EPOC platform and
Palm OS and Windows CE. As already mentioned, EPOC is a truly object-oriented
operating system and is therefore mostly written in C++. This creates conflicts with the
DB2 Everywhere engine, which is written in C, because any C++ method that is needed must have
a C wrapper. In particular, every C++ NDBMS access function requires a C wrapper, which
increases the code size without increasing the functionality.
A look at the NDBMS of EPOC quickly made it obvious that there are some
restrictions that limit the present and future functionality of DB2 Everywhere. For
example, the native database only supports a maximum record size of 8200 bytes, and therefore
the maximum number of columns is limited by this record size. Each table can have only 32
indexes, which support multi-column keys, but the key size must not exceed 244 bytes.
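Such limits translate directly into checks that the engine has to make for every table definition. The sketch below encodes the three limits quoted above; the function `check_table` is hypothetical, and it assumes for simplicity that the record size is just the sum of the column sizes, ignoring any per-record overhead.

```python
MAX_RECORD_BYTES = 8200   # limits quoted in the text
MAX_INDEXES = 32
MAX_KEY_BYTES = 244


def check_table(columns, indexes):
    """Return a list of limit violations for a table definition.

    columns: {column name: size in bytes}; indexes: list of column-name tuples.
    """
    problems = []
    record_size = sum(columns.values())
    if record_size > MAX_RECORD_BYTES:
        problems.append(f"record size {record_size} exceeds {MAX_RECORD_BYTES}")
    if len(indexes) > MAX_INDEXES:
        problems.append(f"{len(indexes)} indexes exceed {MAX_INDEXES}")
    for key in indexes:
        key_size = sum(columns[name] for name in key)
        if key_size > MAX_KEY_BYTES:
            problems.append(f"key {key} needs {key_size} bytes, limit {MAX_KEY_BYTES}")
    return problems


too_wide = check_table({"id": 4, "name": 200, "note": 8000}, [("id",), ("id", "name")])
long_key = check_table({"a": 4, "b": 250}, [("a", "b")])
fine = check_table({"id": 4, "name": 40}, [("id",)])
```

A definition that is perfectly legal on another platform, such as the wide table in `too_wide`, would have to be rejected or restructured on EPOC.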
While trying to implement the DMS interface we faced a lot of problems, which were very hard
to solve. We produced a lot of system crashes that we were not able to understand. The
EPOC documentation was not helpful at all in these cases. Overall, the biggest struggle
was with the interface and the calling conventions of the NDBMS, because its
implementation was not at all compatible with the DB2 Everywhere logic. Because no fast
solution was possible, IBM decided to drop the idea of using the NDBMS. This started the
implementation of a separate data management structure based on the file system.
7.4 The New DB2 Everyplace Version 7.1
The decision not to use the functionality of the NDBMS anymore marks the starting point
of DB2 Everyplace Version 7.1. This version mainly changed the DMS layer, which is now
fully platform independent. The biggest goal achieved by introducing the new DMS layer is
that all platform dependencies can be isolated in the OSS or in the file system layer. This
is a major improvement in terms of portability and made it much easier to port
DB2 Everywhere to the EPOC platform.
[Figure 7.3: labels recoverable from the figure: CLI/ODBC; Runtime (RDS, Interpreter); Compiler
(Parser, Semantic, Optimizer); Catalog; Data Management Services (DMS); File System; Operating
System Services (OSS); EPOC, Palm OS, and Windows CE operating systems.]
Figure 7.3: Internal Structure of DB2 Everyplace Version 7.1.
For this version of DB2 Everyplace it was no longer necessary to deal with the NDBMS;
instead, one has to work with the file system. Implementing the new DMS layer
is easier, because the EPOC development environment includes a C standard library, which
provides all the functions needed for the database system, e.g. memory management, string
manipulation, and file access.
With this library all the problems encountered so far seemed to be solved, but the approach raised some new ones, such as the internal memory handling of the C standard library and writable static data in a dynamic link library. The lack of support for writable static data (global variables) is a very specific problem of EPOC applications ([TAD+00]). In C++ programs this constraint does not really hurt, but in C code global variables are sometimes necessary and cannot be avoided. The only solution to this problem is to collect all global variables in one global structure. When the engine starts, memory is allocated for this structure and the values are initialized. A pointer to this memory is then stored in thread-local storage, which can be accessed from any file of the engine. Closing the engine releases the memory of the global structure. In addition, memory used inside the C standard library has to be released explicitly through an EPOC system call.
Once all the problems mentioned above had been solved, the effort and time needed for the port to EPOC were much smaller than they would have been with the approach used in DB2 Everywhere 1.x.
7.5 Porting Remarks
With the new data management services layer in place, porting DB2 Everyplace Version 7.1 to Embedded Linux, Neutrino, or any other operating system has become a fairly easy and quick task. Being independent of platform-specific database management implementations greatly improved the portability of the code. Now only the operating system services layer and the file system support are platform dependent. Providing the same file system behavior across all platforms is still not easy, because Palm OS, for example, does not have a real file system, which makes a custom solution necessary. But these problems are minor compared to those of using the NDBMS.
Because only the OSS layer and the file system are platform dependent, only these layers may still cause trouble, owing to differences in file system behavior (e.g. EPOC allows at most 32 files to be open at a time). Keeping the platform-dependent code as small as possible will definitely pay off in the future.
8
As seen in the introduction, the handheld market clearly has great potential, so many software companies are now developing software for it. Most of the time these companies start with a prototype developed for only one or two platforms (e.g. Palm OS and Windows CE). Later, when they see that there is market demand, they add further platforms. If the initial application design is not well structured, adding other platforms requires a partial rewrite of the application. Developers can minimize or even avoid this additional work by pursuing platform-invariant application development.
The first section of this paper introduced the handheld device market from a marketing perspective. Afterwards, the differences between PDAs and desktop computers were described to give an introduction to the handheld device world. This world is made up of many devices and operating systems, which were presented in more detail in chapter two. After that general introduction, the paper concentrated on the business case of DB2 Everyplace and on design guidelines and implementation rules for handheld applications.
Finally, this paper has aimed to give the reader the knowledge needed to understand the past, present, and future of the PDA world. The past was Apple’s Newton, the present is dominated by Palm with its devices and operating system, and the future might be any operating system and device running Java.
Whatever happens in the future, every application that is to run on several operating system platforms has to be properly designed for this multi-platform use. Only with a good multi-platform design, which minimizes porting effort, time, and cost, will a company be able to react quickly and successfully to current market trends.
References
[IBM99] IBM Corporation, Administration and Application Programming Guide, DB2 Everywhere for Windows CE and Palm OS Software, Version 1.1, August 1999, Publication No. SC26-9675-00
[Symb99a] Symbian Ltd., EPOC Release 5 Software Development Kit for C++, SDK documentation, 1999
[Symb99b] Symbian Ltd., EPOC Developers Course, 1999
[Symb99c] Symbian Ltd., EPOC Technical Paper: EPOC Overview: Summary, June 1999; http://www.symbian.com/technology/papers/e5oall.html
[Task99] Martin Tasker, EPOC Technical Paper: EPOC Overview: Core, June 1999; http://www.symbian.com/technology/papers/e5ocore.html
[HäRa99] Theo Härder, Erhard Rahm, Datenbanksysteme: Konzepte und Techniken der Implementierung, Springer Verlag, Berlin, Heidelberg, New York, 1999
[IDC00] Jill House, Market Mayhem: The Smart Handheld Devices Market Analysis and Forecast, 1999-2004, IDC Report #W22430, June 2000; http://www.idc.com
[JBLEND] Aplix Corporation, JBlend operating system online documentation; http://www.jblend.com/en/
[MCE00] Microsoft eMbedded Visual Tools 3.0, Online documentation for the SDK, 2000; http://www.microsoft.com/pocketpc/developer.asp
[PPC99] Palm Computing Platform, Palm OS Software Development Kit: Palm OS Programmer’s Companion, 9/1999; http://www.palm.com/devzone/docs.html
[TAD+00] Martin Tasker, Jonathan Allin, Jonathan Dixon, John Forrest, Mark Heath, Tim Richardson, Mark Shackman, Professional Symbian Programming: Mobile Solutions on the EPOC Platform, Wrox Press Ltd. & Symbian Ltd., 2000
[QNX99] QNX Software Systems Ltd., QNX System Architecture, Online Documentation, 1999; http://www.qnx.com/literature/qnx_sysarch/index.html
[QNX01] QNX Software Systems Ltd., Handheld Computers for Mission Critical Applications: A WinCE Alternative is "In Hand"; http://www.qnx.com/literature/whitepapers/handheld.html
[RhMc99] Neil Rhodes, Julie McKeehan, Palm Programming: The Developer’s Guide, O’Reilly & Associates, Inc., 1st edition, January 1999, USA
[VT99] VT-OS V1.1 Overview (included in the SDK document); http://www.pdabuzz.com/vtech/
Internet Links
Apple Newton Message-Pad: http://www.pdastreet.com/newton.html
EPOC, Symbian developer network: http://www.symbiandevnet.com/
EPOC, Symbian technical papers: http://www.symbian.com/technology/papers/papers.html
Helio, device: http://www.myhelio.com/cgi-bin/vtechhelio.storefront
Helio, VT-OS: http://www.pdabuzz.com/vtech/
JBlend: http://www.jblend.com/en/
JTRON specification overview: http://www.jblend.com/en/jtron/3-e.html
JTRON specification: http://tron.um.u-tokyo.ac.jp/TRON/ITRON/SPEC/FILE/jtron-200e.pdf
Microsoft Windows CE "PocketPC": http://www.microsoft.com/pocketpc/default1.asp
Palm OS: http://www.palm.com/
Palm OS developer network: http://www.palmos.com/dev/
TRON Project: http://tron.um.u-tokyo.ac.jp/TRON/
Part D: PubScribe
PUBSCRIBE MICRO- AND MACRO-ARCHITECTURE
M. Redert, C. Reinhard, W. Lehner, W. Hümmer
{mlredert, cnreinha, lehner, huemmer}@immd6.informatik.uni-erlangen.de
Abstract
This paper examines the architecture of the PubScribe system from two different perspectives. On the one hand, it presents the macro-architecture, which results from a federation of broker components with different strategies for the distributed processing of individual subscription requests. On the other hand, it examines the architecture microscopically and shows which modules make up a single broker component. From an operational point of view, particular emphasis is placed on the role model.
1
Introduction
The basic idea of a subscription system, which is described in detail in [LeHü01] and [LeHR01] and therefore only sketched here, is that a user issues a subscription in the sense of a permanent query (standing query) and is informed about relevant events when matching information arrives, with data being returned proactively by means of various delivery mechanisms. The goal is to design a scalable and efficient database-backed subscription system. This article describes the architectural features of the PubScribe system from two different perspectives. From the macroscopic point of view, Section 4 explains a distributed architecture in which the PubScribe system is described as a network of distributed servers with different strategies for the cooperative evaluation of subscriptions. In addition, a microscopic view is taken by explaining the individual modules of a PubScribe broker component. In the course of this discussion, the following sections also explain metadata management and persistence.
2
Architectural Design
The PubScribe system is based on a three-layer architecture consisting of several subscribers as clients, a central broker on the server layer, and data sources encapsulated by wrappers (Figure 2.1). Information flows from the sources through the system, where it is processed, to the respective subscribers.
Fig. 2.1: Micro-architecture of the PubScribe system. Client layer: subscribers (S). Server layer: Deliverer (mail, SMS, web, ..., port), Core Broker, and Publisher Handler. Wrapper layer: wrappers for a web page, a warehouse, and the system time. The information flow runs from the wrapper layer up to the client layer.
2.1 Client Layer
The client layer comprises various user interfaces for creating, editing, and manually deleting subscriptions. In the simplest case this is a web page on which the user can set the parameters of a predefined subscription. The subscription is then generated by a CGI program (‘Common Gateway Interface’) and transmitted to the server. Far more complex is a graphical tool that allows an arbitrary operator tree to be built interactively (e.g. by drag and drop). It is also conceivable to integrate the interface into existing software (e.g. a data warehouse application or a monitoring system for stock market information). Regardless of how the user interface is implemented, communication with the server always takes place via XML: the client generates the subscription definition as an XML document and sends it to the server.
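As an illustration of the XML-based client interface, the following minimal Python sketch builds a subscription definition as an XML document. The text does not specify the document format, so all element and attribute names (subscription, recipient, body, selection, channel, predicate) are assumptions made for this example only.

```python
import xml.etree.ElementTree as ET


def build_subscription_xml(recipient_type: str, address: str,
                           channel: str, predicate: str) -> str:
    """Serialize a hypothetical subscription definition as XML text."""
    root = ET.Element("subscription")
    # Recipient: delivery medium and address (names are assumptions).
    ET.SubElement(root, "recipient", {"type": recipient_type}).text = address
    # Body: a single selection over one information channel.
    body = ET.SubElement(root, "body")
    selection = ET.SubElement(body, "selection", {"predicate": predicate})
    ET.SubElement(selection, "channel", {"name": channel})
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    print(build_subscription_xml("mail", "user@example.org",
                                 "stocks", "price > 100"))
```

A web form, a graphical editor, or an embedded application could all produce the same document, which is what keeps the server interface independent of the user interface.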
2.2 Wrapper Layer
Every data source that is to be integrated into the PubScribe system has to be encapsulated by a wrapper, which detects changes to the data and publishes them in a uniform schema specific to the data source, its export schema.
Depending on the data source, changes are detected either synchronously (e.g. by a trigger in a database system) or asynchronously, by polling the source at regular intervals and comparing it with its last known state (e.g. for web pages). To reduce the number of publications, the wrapper can accumulate changes and send them to the broker in batches. In doing so, however, it must ensure that no change is older than the minimum freshness specified at registration. With asynchronous change detection, the exact time at which data were modified often cannot be determined; by convention, a substitute time must therefore be assumed. Obvious choices are the earliest possible time of the change, the time of its detection, or the mean of these two extremes.
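The asynchronous case can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the class and method names are hypothetical, the source is modeled as a function returning a string, and the freshness limit is checked only when flush_if_due is called, so a real wrapper would have to call it often enough to honor the limit.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class Change:
    value: str
    assumed_time: float  # timestamp assigned by convention, not measured


class PollingWrapper:
    """Detects changes by comparing each poll result with the last state."""

    def __init__(self, read_source: Callable[[], str], policy: str = "midpoint"):
        self.read_source = read_source
        self.policy = policy              # "earliest", "detection" or "midpoint"
        self.last_state: Optional[str] = None
        self.last_poll: Optional[float] = None
        self.pending: List[Change] = []   # accumulated, not yet published

    def poll(self, now: float) -> None:
        state = self.read_source()
        if self.last_state is not None and state != self.last_state:
            earliest = self.last_poll     # the change happened after this poll
            if self.policy == "earliest":
                t = earliest
            elif self.policy == "detection":
                t = now
            else:
                t = (earliest + now) / 2  # mean of the two extremes
            self.pending.append(Change(state, t))
        self.last_state, self.last_poll = state, now

    def flush_if_due(self, now: float, max_age: float) -> List[Change]:
        """Hand over the batch once its oldest change has reached max_age."""
        if self.pending and now - self.pending[0].assumed_time >= max_age:
            batch, self.pending = self.pending, []
            return batch
        return []
```

The policy parameter makes the convention for the assumed modification time explicit, so that it can be chosen per data source.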
Detected changes are transferred to the broker in the form of a queue that is represented as an XML document. While the data in database systems are available in structured form, this is not the case for web pages, for example. A generic wrapper could publish the entire HTML document in a single data field. It is then almost impossible, however, for a subscription to access the contents (e.g. the title or the individual pieces of information it contains). Although any extraction is possible in principle with sufficiently powerful scalar functions, it is very costly and user-unfriendly. Individually adapted wrappers therefore publish the relevant information of the source in a suitable schema. Data considered unnecessary in the context of a particular information channel, such as advertising banners or links to other pages, are ignored. For the automatic generation of such wrappers, see for example the XWRAP technology described in [LiPH00].
2.3 Server Layer: Mediation/Broker
Subscriptions and publications converge in the server. This is where subscriptions are evaluated on the data delivered by publications. To ensure scalability within this layer, the server is divided into smaller functional units that can be distributed across several machines in any combination as required. Specifically, these are the Publisher Handler, the Subscription Handler, the Deliverer, and the Core Broker. They are shown schematically in Figure 2.1 and are described in the following sections.
Subscription Handler
The Subscription Handler accepts new subscriptions as well as modification and deletion requests from the user. Subscriptions are first checked for validity and executability in cooperation with the other broker components. For example, the Core Broker checks whether all data sources exist, whether the freshness condition can be met, and whether the operators and operands are type compatible. The Deliverer checks whether it is able to deliver messages via the requested medium. If the subscription is executable, it is assigned a unique identifier (sID) and forwarded to the Core Broker, which prepares it for processing.
Publisher Handler
The Publisher Handler is responsible for communication with the wrappers of the data sources. It accepts registrations and publications and validates them. For processing, they are forwarded to the Core Broker. When a new producer registers, a unique identifier (pID) is generated and returned to the wrapper. The wrapper uses it as identification to mark a publication as belonging to the corresponding information channel.
Deliverer
The transmission of query results to the user is initiated by the Core Broker and carried out by the Deliverer. In principle, a distinction must be made between delivery of the results and notification. With delivery the user receives the entire result; with notification the user is merely sent a reference to it and can retrieve it when needed. The Deliverer contains subcomponents that implement transmission over different media. Obvious options are delivery of query results by e-mail, fax, or SMS. For notification, it is conceivable to store the results on a web server and send the corresponding link as a message over one of the media mentioned. If the data are to be post-processed on the user side (e.g. in the case of a web page with a generic wrapper), it appears helpful to transmit them directly to the corresponding program over a network connection. This mechanism is also needed for internal communication within the server (Section 3.3).
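The difference between delivery and notification can be made concrete with a small Python sketch. It is not the PubScribe implementation: the class layout, the sender callbacks, the in-memory result store standing in for a web server, and the URL pattern are all assumptions chosen for illustration.

```python
from typing import Callable, Dict

Sender = Callable[[str, str], None]  # (address, message) -> None


class Deliverer:
    """Routes results to media-specific senders (all names illustrative)."""

    def __init__(self, base_url: str):
        self.base_url = base_url
        self.senders: Dict[str, Sender] = {}
        self.stored: Dict[int, str] = {}  # stands in for a web server

    def register_medium(self, medium: str, sender: Sender) -> None:
        self.senders[medium] = sender

    def supports(self, medium: str) -> bool:
        # Used when a subscription is checked for executability.
        return medium in self.senders

    def deliver(self, medium: str, address: str, result: str) -> None:
        # Delivery: the recipient receives the entire result.
        self.senders[medium](address, result)

    def notify(self, medium: str, address: str, result: str) -> str:
        # Notification: store the result and send only a reference to it.
        key = len(self.stored) + 1
        self.stored[key] = result
        link = f"{self.base_url}/results/{key}"
        self.senders[medium](address, link)
        return link
```

Registering one sender per medium mirrors the subcomponents mentioned in the text; a direct network connection would simply be one more registered medium.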
Core Broker
The Core Broker is the control and processing component of the PubScribe server. This is where publications and subscriptions meet and where the actual processing of the data takes place. When a new publication arrives, the Core Broker coordinates the evaluation of the conditions. Depending on the result and on its state, a subscription is activated, removed from the system, or has the evaluation and delivery of its body triggered. To perform the initial evaluation of a subscription, the Core Broker communicates with the wrappers of the data sources. So that a broker can process as many subscriptions and publications as possible without being overloaded, the conditions and bodies of subscriptions are optimized in a multi-stage process before they are executed.
2.4 Role Model of the PubScribe System
In the PubScribe system, the terms ‘producer’ and ‘subscriber’ are names for different roles. A component plays the role of a producer when it actively makes data available. The role of a subscriber is characterized by wanting to be informed about certain things and communicating this through a subscription.
The system components ‘Publisher’ and ‘Subscriber’ are named after the roles they play in the overall system. As the mediator between the two, the broker has a dual role: towards the subscribers it behaves like a producer, and towards a producer it behaves like a subscriber. The only difference is that when a subscription is registered the activity originates from the subscriber, whereas when an information channel is registered it originates from the producer. The registration process can, however, also be interpreted such that the producer informs the broker of the existence of a new channel, whereupon the broker (implicitly) subscribes to the entire content of the channel.
3
Information Channels of the Broker
In addition to the information channels created for the registered producers, every broker has several channels that contain the current system time as well as metadata about producers and subscriptions. Likewise, the ability to restore the system state after a restart is ensured by storing the state of all components in information channels.
The uniform integration of system time, metadata, and persistence through information channels makes the design of the overall system simpler and clearer. Data are stored by saving the channel states and accessed through subscriptions. This uniform representation of system data comes at the price of increased overhead, however. For example, every publication generates a further one that updates the metadata. Inserting a subscription generates, in addition to the publication of its metadata, several further publications that announce the updated state of the modified system components. The following sections describe these information channels in more detail, present their schemas, and illustrate how they work.
3.1 System Time
The system time is made available for use in subscriptions through an information channel named Zeit (time). Because this channel contains only one valid value at any point in time, no identification is required; it has the following schema:
([], [(zeit, datetime)])
When the broker starts, a producer is created that registers itself and then publishes the current system time at regular intervals. The frequency of the publications, and thus the temporal resolution, depends on the demands placed on the system.
This view considerably simplifies event handling, e.g. compared with the OpenCQ project ([LiPT99]), because the distinction between time-dependent and data-dependent conditions disappears. The concept of the overall system thereby becomes more uniform.
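A minimal Python sketch of such a time producer follows. The class, the tick method, and the publish callback are hypothetical; only the attribute name zeit and the idea of publishing at a fixed resolution come from the text. The helper deadline_reached illustrates how a time-dependent condition becomes an ordinary data condition on the channel content.

```python
from datetime import datetime, timedelta
from typing import Callable, Dict, Optional

Publication = Dict[str, datetime]


class TimeProducer:
    """Publishes the system time into a channel at a fixed resolution."""

    def __init__(self, publish: Callable[[Publication], None],
                 resolution: timedelta):
        self.publish = publish
        self.resolution = resolution
        self.next_due: Optional[datetime] = None

    def tick(self, now: datetime) -> bool:
        """Call frequently; publishes only when the next interval is due."""
        if self.next_due is None or now >= self.next_due:
            self.publish({"zeit": now})  # schema ([], [(zeit, datetime)])
            self.next_due = now + self.resolution
            return True
        return False


def deadline_reached(publication: Publication, deadline: datetime) -> bool:
    """A time condition is evaluated like any other data condition."""
    return publication["zeit"] >= deadline
```

A coarser resolution reduces the number of publications at the cost of later detection of time conditions, which is the trade-off the text refers to.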
3.2 Metadata
Information about the registered producers and about the subscriptions present in the system is managed as metadata in dedicated information channels. It is thus treated exactly like the data of ‘normal’ information channels and, like them, can be accessed through subscriptions.
Metadata of the Producers
The metadata of a producer contain, in addition to the registration data, information about its previous publications. The schema of the channel is listed below together with the meaning of the attributes:
([(pID, integer)               // ID of the producer
 ],
 [(channel, string),           // name of the information channel
  (registration, string),      // XML document of the registration
  (regTime, datetime),         // time of registration
  (publCount, integer),        // number of publications
  (publTupleCount, integer),   // number of published tuples
  (lastPublTime, datetime),    // time of the last publication
  (lastPubl, string)           // XML document of the last publication
 ]
)
Most of the attributes are self-explanatory. The difference between publCount and publTupleCount arises from the possibility of combining several tuples in one publication: the former denotes the number of publications, the latter the number of published tuples. The use of the metadata is illustrated by the example at the end of this section.
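The relation between the two counters can be illustrated with a short Python sketch. The dataclass mirrors the attribute names of the schema above; the function record_publication and the tuple representation are assumptions made for this example.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import List, Optional


@dataclass
class ProducerMeta:
    pID: int
    channel: str
    registration: str
    regTime: datetime
    publCount: int = 0
    publTupleCount: int = 0
    lastPublTime: Optional[datetime] = None
    lastPubl: Optional[str] = None


def record_publication(meta: ProducerMeta, tuples: List[tuple],
                       xml_doc: str, now: datetime) -> None:
    """Update the producer metadata for one incoming publication."""
    meta.publCount += 1                 # one publication ...
    meta.publTupleCount += len(tuples)  # ... may carry several tuples
    meta.lastPublTime = now
    meta.lastPubl = xml_doc
```

After two publications carrying two tuples and one tuple respectively, publCount is 2 while publTupleCount is 3.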
Metadata of the Subscriptions
Analogously to the producers, the system also maintains a metadata record for every subscription. Its schema is divided into data contained in the subscription definition on the one hand and information about the history of the subscription so far on the other. The schema of the meta-information channel and the meaning of the attributes are shown below:
([(sID, integer)               // ID of the subscription
 ],
 [(recipient, string),         // type and address of the recipient
  (subscription, string),      // XML document of the subscription
  (state, string),             // state of the subscription
                               // (not activated, activated)
  (regTime, datetime),         // time of registration
  (actTime, datetime),         // time of activation
  (delivCount, integer),       // number of deliveries
  (lastDelivTime, datetime),   // time of the last delivery
  (lastDeliv, string)          // XML document of the last delivery
 ]
)
Publishing metadata into dedicated information channels fits seamlessly into the overall concept. Subscriptions have access to metadata just as they do to any other data source. Special treatment is necessary for the subscription metadata, however: for reasons of data protection, subscriptions should be able to access only their own metadata. To guarantee this, when building the operator tree the Core Broker implicitly inserts, in front of every resource node that denotes the metadata, a selection that filters out foreign metadata.
The use of metadata in subscriptions is illustrated by the following example. Figure 3.1 shows the body of a subscription that delivers the names of all information channels registered since its last delivery, together with their registration times. To this end, the metadata of the producers (P-Meta) are joined with those of the subscription (S-Meta). This Cartesian product, however, contains only as many tuples as the P-Meta channel, because the right branch of the join delivers only one record. The subsequent scalar operator assigns to the new attribute regLater the result of comparing the time of registration with the time of the last delivery. The attribute is then shifted into the identification part so that the tuples for which it is true can be selected. The implicit selection mentioned above, which filters out the metadata of foreign subscriptions, is shaded gray in the figure. SID is the unique identifier of the subscription assigned by the Subscription Handler.
Fig. 3.1: Body of a subscription with metadata. Operator tree from the root down to the leaves, each operator followed by its output schema:
Selection  regLater=true  ->  ([pID, sID, regLater], [channel, regTime])
Shift  regLater  ->  ([pID, sID, regLater], [channel, regTime])
Scalar  channel=id(channel), regTime=id(regTime), regLater=greater(regTime, lastDelivTime)  ->  ([pID, sID], [channel, regTime, regLater])
Join  ->  ([pID, sID], [channel, regTime, ..., lastDelivTime, ...])
  left input: P-Meta  ->  ([pID], [channel, regTime, ...])
  right input: Selection  sID=SID  (implicit) on S-Meta  ->  ([sID], [lastDelivTime, ...])
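The evaluation of this subscription body can be traced with a small Python sketch over in-memory rows. It is a simplification under stated assumptions: rows are plain dictionaries, the split into identification and information attributes is not modeled (so the shift operator degenerates to a filter), and the S-Meta rows are projected to sID and lastDelivTime before the join to avoid a name clash on regTime. The function name is hypothetical.

```python
from typing import Any, Dict, List

Row = Dict[str, Any]


def new_channels_since_last_delivery(p_meta: List[Row], s_meta: List[Row],
                                     own_sid: int) -> List[Row]:
    # Implicit selection sID=SID: a subscription sees only its own metadata.
    own = [{"sID": s["sID"], "lastDelivTime": s["lastDelivTime"]}
           for s in s_meta if s["sID"] == own_sid]
    # Join (Cartesian product); the right input holds a single record.
    joined = [{**p, **s} for p in p_meta for s in own]
    # Scalar operator: keep channel and regTime, derive regLater.
    scalar = [{"pID": r["pID"], "sID": r["sID"],
               "channel": r["channel"], "regTime": r["regTime"],
               "regLater": r["regTime"] > r["lastDelivTime"]}
              for r in joined]
    # Shift plus selection regLater=true, modeled here as a plain filter.
    return [r for r in scalar if r["regLater"]]
```

Because the implicit selection is applied before the join, metadata of other subscriptions never enter the intermediate results.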
3.3 Persistence of the PubScribe System
So that a broker can resume after a restart at the point where it was shut down (or crashed), the persistence of its processing state must be ensured. For this purpose, every system component publishes its state changes into one or more information channels provided for that purpose. On restart, the component obtains its state information through a subscription to the entire channel content and reconstructs its last state from it. For such a subscription to be feasible, the information channel must be queryable, i.e. there must be a component that saves its state. To keep the overhead low, this task is taken on by the broker itself. It must be ensured that the state change, its publication, and the saving are carried out under transactional protection. Various mechanisms of distributed and nested transactions are available for this purpose, but they are not discussed further here.
The subscription used to restore the state must embed the asynchronous delivery of results into synchronous further processing. To do so, it makes use of direct delivery of the results over a network connection. The component opens a communication endpoint (‘port’), sends the subscription, and waits for the results. Because this is a synchronous subscription, the results are sent immediately after evaluation.
It should also be noted that the state information of the components can be used for administering the system and for troubleshooting. If, for example, the operator trees contained in the Data Processing Unit are to be monitored, a tool can set up a subscription to the corresponding information channel. It is then informed immediately of every change and can react, e.g. by updating its display. In a real system, however, access protection should be provided for such information so that not every user can gain insight into the internal structures of the broker.
The saving of a component's state is explained using the example of the Publisher Handler and an incoming publication. When a new publication reaches the Publisher Handler, it is responsible for verifying it, updating the metadata, and forwarding it to the Core Broker (Section 2.3). Its state comprises the publications not yet forwarded and the status of their processing, and it is stored in three information channels. The publications required for saving the state are shown schematically in chronological order in Figure 3.2 and are explained below.
Fig. 3.2: Sequence of state saving. Participants: producer, Publisher Handler, Core Broker. Steps shown: publication; save publication and assign ID; acknowledgment; verification; state update; save BI of the metadata; metadata update; state update; delete BI of the metadata; forwarding; persistence saving; acknowledgment; delete publication.
When a new publication arrives, it is first assigned a unique identifier and saved in the channel that contains incompletely processed publications. Its persistence is thereby guaranteed, and receipt of the message can be acknowledged to the producer. Once the publication has been verified, its processing state is set to unprocessed by a publication into a further channel. Because verification can be repeated at any time without side effects, its completion does not have to be logged. Then the metadata of the producer are read and saved as a copy (‘before-image’, BI) in a third channel. Now the updated metadata can be published and the backup copy deleted. To prevent inconsistencies, the metadata update must not be performed for several publications of one channel at the same time. The processing state is then set to metadata updated by a corresponding publication.
In the last step the publication is passed on to the Core Broker. The Core Broker publishes it into one of its state channels for persistence and sends an acknowledgment to the Publisher Handler, which can now delete the publication from its channel of publications to be processed and remove the associated state information. For this mechanism to work, the state of the Publisher Handler must not be saved while state publications are being processed, because otherwise an endless chain of such publications would arise. It is thereby accepted that the correctness of the metadata of the state channels can no longer be guaranteed in the event of a system crash. Since these metadata are not needed for operation, however, this is unproblematic.
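The sequence of state-saving steps can be sketched in Python as follows. This is a strongly simplified illustration, not the actual implementation: three dictionaries stand in for the three state channels, the metadata are reduced to a single counter, verification is reduced to a non-empty check, and the transactional protection required by the text is not modeled. All names are hypothetical.

```python
from typing import Callable, Dict


class PublisherHandler:
    """Walks one publication through the logged processing steps."""

    def __init__(self, forward: Callable[[int, str], bool]):
        self.forward = forward                   # hands over to the Core Broker
        self.pending: Dict[int, str] = {}        # incompletely processed publications
        self.status: Dict[int, str] = {}         # processing state per publication
        self.before_image: Dict[int, dict] = {}  # BI of the producer metadata
        self.meta: Dict[int, dict] = {}          # producer metadata by pID
        self._next_id = 1

    def handle(self, pid: int, publication: str) -> int:
        # 1. Assign an ID and save; from here on receipt can be acknowledged.
        pub_id = self._next_id
        self._next_id += 1
        self.pending[pub_id] = publication
        # 2. Verification (repeatable, so its completion is not logged).
        if not publication:
            raise ValueError("empty publication")
        self.status[pub_id] = "unprocessed"
        # 3. Save the before-image, update the metadata, drop the before-image.
        meta = self.meta.setdefault(pid, {"publCount": 0})
        self.before_image[pid] = dict(meta)
        meta["publCount"] += 1
        del self.before_image[pid]
        self.status[pub_id] = "metadata updated"
        # 4. Forward; clean up only after the Core Broker has acknowledged.
        if self.forward(pub_id, publication):
            del self.pending[pub_id]
            del self.status[pub_id]
        return pub_id
```

After a crash, the contents of the three stores indicate how far each publication had progressed: an entry in pending without a status must be verified again, and a remaining before-image marks an interrupted metadata update.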
4
Distributed Architectural Design
To achieve better scalability, several local instances of the PubScribe system are distributed over a network. As shown in Figure 4.1, subscriptions and publishers are each assigned to one of the instances, as in the non-distributed case.
Towards both the user and the producers, this federation should behave like a non-distributed system (distribution transparency). This means in particular that a subscription must always deliver the same result, regardless of the broker at which it is registered and regardless of the broker at which the referenced data sources are registered. To make this possible, communication between the federation partners is required. It is assumed that the interconnection network allows messages to be exchanged between arbitrary components.
Communication between brokers uses mechanisms already known from the local architecture, namely information channels and subscriptions, and thus fits seamlessly into the overall concept of the system. If, for example, a broker needs information that it does not have itself in order to answer a query, it subscribes to that information at another broker. It thereby takes on the role of a subscriber, and the other broker the role of a producer. As indicated by the connecting arrows in Figure 4.1, many such dependencies can exist between the broker instances of an overall system, so that a broker can also hold both roles at the same time. The lower left broker in the figure, for example, is a publisher for the two upper ones and a subscriber of the lower right one. The resulting network of dependencies between brokers can become arbitrarily complex, but no cycles may occur. In this way, the role concept presented in Section 2.4 for a non-distributed system is also maintained in a distributed architecture.
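The requirement that the dependency network remain free of cycles can be checked before a new broker-to-broker subscription is accepted. The text does not say how this is enforced, so the following Python sketch is only one possible approach; the class, its method names, and the broker names in the usage example are hypothetical.

```python
from typing import Dict, Set


class BrokerNetwork:
    """Tracks 'subscriber -> producer' dependencies between brokers."""

    def __init__(self) -> None:
        self.depends_on: Dict[str, Set[str]] = {}

    def _reaches(self, start: str, target: str) -> bool:
        # Iterative depth-first search along existing dependencies.
        stack, seen = [start], set()
        while stack:
            node = stack.pop()
            if node == target:
                return True
            if node in seen:
                continue
            seen.add(node)
            stack.extend(self.depends_on.get(node, ()))
        return False

    def subscribe(self, subscriber: str, producer: str) -> bool:
        """Add the dependency unless it would close a cycle."""
        if self._reaches(producer, subscriber):
            return False
        self.depends_on.setdefault(subscriber, set()).add(producer)
        return True
```

With the configuration described for Figure 4.1 (the two upper brokers subscribing at the lower left one, which in turn subscribes at the lower right one), a subscription of the lower right broker at one of the upper brokers would be rejected.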
A list of the brokers belonging to the overall system is made available through a global metadata channel, as is information about which broker holds which information channels. The schemas of the two channels are described in Section 4.1. A distinguished broker (shown shaded at the top left of Figure 4.1) takes over the management of the channels, which does not, however, rule out replication. To keep the data up to date, brokers must register with it, and it must be informed about new and removed information channels or replicas.
4.1 The Global Metadata Channel
In a distributed environment, the global metadata serve two purposes: they are used to manage the brokers participating in the overall system, and they record which data sources are available at which broker with which freshness. They are realized by two information channels. The first channel contains all registered brokers and has the following schema:
([(bID, integer)            // ID of the broker
 ],
 [(hostname, string),       // host name
  (port, integer)           // port number
 ]
)
[Fig. 4.1: Macro architecture of the PubScribe system. Four cooperating brokers, built from Core Broker, Publisher Handler, Deliverer, and Wrapper components, serve subscribers (S) and wrap sources such as a warehouse, web pages, the system time, and the metadata.]
The attribute bID identifies a broker uniquely and is assigned by the designated broker when the broker registers. The two information attributes hostname and port give the address through which a subscription can be submitted to the system at this broker.
The second metadata channel records which channel (name) is maintained at which broker (bID) with which freshness (qos). The attribute master indicates whether the information channel is the primary channel or a replica (Section 4.2). The metadata have the following schema:
([(name, string),           // name of the information channel
  (bID, integer)            // ID of the broker
 ],
 [(qos, integer),           // freshness of the data
  (master, boolean)         // primary channel or replica
 ]
)
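As an illustration only, the two metadata schemas can be written down as plain record types. The sketch below is an assumption-laden Python rendering: the class names `BrokerEntry` and `ChannelEntry`, the helper `holders`, and all host names and ports are hypothetical and not part of PubScribe; only the attribute names bID, hostname, port, name, qos, and master come from the schemas above.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class BrokerEntry:
    bID: int          # identification part
    hostname: str     # information part
    port: int         # information part

@dataclass(frozen=True)
class ChannelEntry:
    name: str         # identification part
    bID: int          # identification part
    qos: int          # information part: freshness of the data
    master: bool      # information part: True = primary channel, False = replica

# Hypothetical sample data (host names and ports are invented).
# Dictionaries keyed by the identification part keep each entry unique.
brokers = {
    1: BrokerEntry(1, "broker1.example.org", 4001),
    3: BrokerEntry(3, "broker3.example.org", 4003),
}
channels = {
    ("Aktien", 1): ChannelEntry("Aktien", 1, 5, True),
    ("Aktien", 3): ChannelEntry("Aktien", 3, 10, False),
}

def holders(name):
    """Return the broker entries of all brokers that hold channel `name`."""
    return [brokers[c.bID] for c in channels.values() if c.name == name]
```

Keying both dictionaries by the identification attributes mirrors the rule that the identification part determines an entry uniquely.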
Global metadata must be queryable, both to give access to data published in the past and to survive a restart of the designated broker. Because the metadata accumulate at the designated broker, it is also responsible for making them queryable.
4.2 Bringing Publications and Subscriptions Together
If a subscription references an information channel that is not available at the local broker, there are in principle two options: either the broker that received the subscription obtains the data ('data shipping'), or the subscription is executed at a broker that holds the corresponding data ('query shipping'). In the context of the PubScribe system, the first variant is called channel replication; the second is called remote evaluation or subscription forwarding, depending on its form. The three procedures are described below.
Channel Replication
In channel replication, the subscription is executed by the broker that received it. Because the required data are not available there, they have to be obtained. Figure 4.2 shows this procedure schematically. Using a synchronous subscription, the broker first determines from the global metadata those brokers that hold the required data with sufficient freshness (steps 1 and 2). One of the candidate channels is selected as the source of the replica to be created. At that broker (lower left in the figure), the broker then subscribes to all publications of the channel and thus creates a local copy of the information (step 3). Through the delivery condition it can control the desired freshness of the replica. The broker now holds the necessary data and can evaluate the subscription locally. By publishing into the metadata channel, it announces that it now also holds a local copy of the information channel (step 4). Regular deliveries of the subscription results keep the replica up to date (step 5). The following example illustrates channel replication.
[Fig. 4.3: Subscription for querying the global metadata. Operator tree, from the leaves to the root: the channel Kanäle ([name, bID], [qos, master]) passes through a shift of qos, yielding ([name, bID, qos], [master]), followed by a selection with name = ‘Aktien’ and qos ≤ 10; the result is joined (bID = bID) with the channel Broker ([bID], [hostname, port]), yielding ([name, bID, qos, bID2], [master, hostname, port]).]
[Fig. 4.2: Channel replication procedure (steps 1 to 5)]
As an example of channel replication, consider the following scenario: to evaluate a subscription, a broker has to create a replica of the information channel Aktien that has the freshness of ‘10’ required by the subscription. The body of the synchronous subscription that determines the brokers eligible as a source is shown in Figure 4.3. First, all Aktien channels with sufficient freshness are selected. To obtain the addresses of the corresponding brokers, these are then joined with the corresponding metadata, and one broker is chosen as the source of the copy.
[Fig. 4.4: Subscription for replicating the Aktien channel. Body: the channel Aktien ([wp, hp, zeit], [kurs, vol]). Delivery condition: a selection id = SID on the subscription metadata is joined with the channel Zeit ([], [uhrzeit]), yielding ([sID], [uhrzeit, lastDelivTime, ...]); a scalar operator computes differenz = minus(uhrzeit, last), a second one computes bedingungOK = greater(differenz, 10), and the result ([sID], [bedingungOK]) controls the switch that delivers ([wp, hp, zeit], [kurs, vol]).]
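The metadata query of Figure 4.3 (selection on name and qos, then a join with the broker channel on bID) can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions: the sample contents of both channels, the host names, and the function name `candidate_sources` are hypothetical, a smaller qos value is taken to mean fresher data, and the duplicate join column bID2 of the figure is omitted.

```python
# Hypothetical contents of the channel metadata: (name, bID, qos, master).
KANAELE = [
    ("Aktien", 1, 5, True),
    ("Aktien", 2, 30, False),
    ("Aktien", 4, 10, False),
    ("Indizes", 1, 60, True),
]
# Hypothetical contents of the broker metadata: bID -> (hostname, port).
BROKER = {
    1: ("host-a.example.org", 4001),
    2: ("host-b.example.org", 4002),
    4: ("host-d.example.org", 4004),
}

def candidate_sources(name, max_qos):
    """Select channels by name and freshness, then join with the broker data."""
    selected = [(n, b, q, m) for (n, b, q, m) in KANAELE
                if n == name and q <= max_qos]
    # Join on bID: append hostname and port of the broker holding the channel.
    return [(n, b, q, m, *BROKER[b]) for (n, b, q, m) in selected if b in BROKER]

sources = candidate_sources("Aktien", 10)
```

With the sample data, brokers 1 and 4 qualify as sources for a replica of freshness 10; how one of them is finally chosen is not specified in the text.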
The replica is established by the subscription shown in Figure 4.4 (body and delivery condition). Using the subscription metadata, the condition triggers a delivery every 10 minutes. The scalar operator with the condition differenz ≥ 10 ensures that the time interval is observed. Since the broker is only interested in changes, the body is evaluated partially, i.e. data that have remained unchanged between two notifications are not delivered again. The start condition of the subscription is constantly true, so that the copy is created with immediate effect. How long the replica will actually be needed is still unclear when it is established, because it may also be used by subscriptions that do not yet exist. The stop condition is therefore chosen such that it never becomes true. The replica is removed by means of the delete function of the administration interface. As the last step of the procedure, the existence of the new replica has to be announced to all brokers by a publication into the global metadata channel, which looks as follows:
([’Aktien’, 3], [10, false])
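The delivery condition and the partial evaluation described above can be illustrated as follows. This is a sketch, not the PubScribe implementation: the function names `delivery_due` and `changed_tuples` are hypothetical, the condition is written as differenz ≥ 10 as in the text, and a channel state is modeled as a dictionary from identification part to information part.

```python
from datetime import datetime, timedelta

def delivery_due(uhrzeit, last_delivery, interval_minutes=10):
    """True once at least `interval_minutes` have passed since the last delivery."""
    differenz = (uhrzeit - last_delivery) / timedelta(minutes=1)
    return differenz >= interval_minutes

def changed_tuples(previous, current):
    """Partial evaluation: keep only tuples that are new or whose content changed."""
    return {ident: content for ident, content in current.items()
            if previous.get(ident) != content}

last = datetime(2000, 8, 29, 9, 30)
early = delivery_due(datetime(2000, 8, 29, 9, 39), last)
on_time = delivery_due(datetime(2000, 8, 29, 9, 40), last)
delta = changed_tuples(
    {("IBM", "BER"): 96.7},
    {("IBM", "BER"): 96.7, ("IBM", "STU"): 144.2},
)
```

Only the tuple for (IBM, STU) is delivered in the example, because the tuple for (IBM, BER) did not change between the two notifications.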
For the initial evaluation of a subscription, the information channel concerned must be queryable. If the broker holding the replica already has the complete state of the channel, the initial evaluation can take place directly at that broker. Otherwise, the corresponding operator tree is forwarded, in the form of a subscription, to the broker that feeds the replica. If that broker makes the state queryable, the initial evaluation takes place there. If that is not the case either, the operator tree is forwarded to the publisher itself. Provided the information channel is queryable at all, an initial evaluation is possible here at the latest.
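The fallback chain for the initial evaluation (replica holder, then the feeding broker, then the publisher) amounts to walking upstream until a node can answer queries on the channel state. The sketch below assumes a hypothetical `Node` type with a `queryable` flag and an `upstream` link; neither is taken from PubScribe.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class Node:
    name: str
    queryable: bool                      # can this node answer queries on the channel state?
    upstream: Optional["Node"] = None    # the node that feeds this one, if any

def initial_evaluation_site(replica_holder):
    """Walk upstream and return the first node that can evaluate the operator tree."""
    node = replica_holder
    while node is not None:
        if node.queryable:
            return node
        node = node.upstream
    return None  # the channel is not queryable anywhere

publisher = Node("publisher", queryable=True)
feeder = Node("feeding broker", queryable=False, upstream=publisher)
holder = Node("replica holder", queryable=False, upstream=feeder)
site = initial_evaluation_site(holder)
```

In this example neither broker holds the full channel state, so the operator tree ends up at the publisher.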
In principle, the mechanism of channel replication can be applied to all information channels. For reasons of data protection, however, replication of the subscription metadata should be prohibited, so that they must always be accessed at the broker coordinating the subscription. Since a great many subscriptions need the system time, it is advisable not to make it available through replication but to generate it locally at each broker. This reduces the overhead and minimizes the network load.
Remote Evaluation
Remote evaluation is a generalization of channel replication. Whereas channel replication always replicates complete sources, remote evaluation can be regarded as a subscription to a channel that has already been preprocessed.
After the DPU has analyzed the operator tree, a part that depends only on a single data source not available locally is cut off. It is then sent to a broker in the form of a subscription. This creates a new virtual information channel that provides preprocessed data. The subscription references this channel in place of the subtree. Each time the forwarded subscription is evaluated, the results reach the virtual channel as publications and are thus integrated into the evaluation of the operator tree. The procedure is therefore exactly the one shown in Figure 4.2. The only difference from channel replication is that a different subscription is registered in step 3. The following example illustrates the procedure.
To answer a subscription, suppose for example that the section of an operator tree shown in Figure 4.5 has to be evaluated. The data of the information channels corresponding to the nodes Ressource2 and Ressource3 are available locally, those of Ressource1 are not. To transfer as little data as possible, the largest subtree that depends only on this channel (shaded gray in the figure) is transferred to the broker determined beforehand. The subscription used for this corresponds to the one in Figure 4.4, but its body is replaced by the gray subtree. The delivery frequency in the right-hand part is replaced by the freshness condition required by the subscription. At the local broker, the subtree is replaced by the resource node of the virtual information channel created by the subscription.
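Cutting off the largest subtree that depends only on the non-local channel can be sketched as a recursive tree rewrite. The code below is an illustration under assumptions: the `Op` node type, the function names, and the small example tree are hypothetical (the exact shape of the tree in Figure 4.5 is not reproduced here), and the sketch assumes that only one such maximal subtree exists.

```python
from dataclasses import dataclass, field

@dataclass
class Op:
    label: str                                   # operator name or, for leaves, resource name
    children: list = field(default_factory=list)

def resources(node):
    """Set of resource (leaf) names a subtree depends on."""
    if not node.children:
        return {node.label}
    return set().union(*(resources(c) for c in node.children))

def split_remote(node, remote, virtual):
    """Replace the maximal subtree depending only on `remote` by a leaf `virtual`.

    Returns the rewritten local tree and the list of subtrees to ship."""
    if resources(node) == {remote}:
        return Op(virtual), [node]
    shipped, new_children = [], []
    for child in node.children:
        new_child, sub = split_remote(child, remote, virtual)
        new_children.append(new_child)
        shipped.extend(sub)
    return Op(node.label, new_children), shipped

# Hypothetical operator tree: a join over a Ressource1 branch and a union branch.
tree = Op("Verbund", [
    Op("Selektion", [Op("Gruppierung", [Op("Ressource1")])]),
    Op("Vereinigung", [Op("Ressource2"), Op("Ressource3")]),
])
local_tree, shipped = split_remote(tree, "Ressource1", "virtuell1")
```

The check starts at the root and descends only when a subtree mixes resources, so the first match on each path is the largest subtree that can be shipped.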
If the operator tree to be evaluated remotely belongs to the active part of a switch, i.e. to the condition, the subscription for remote evaluation is activated when the tree is registered and removed when the tree is deleted. When it delivers results, that is, when publications arrive in the virtual channel, this fires triggers in the operator tree of the user subscription. If, on the other hand, the tree to be evaluated remotely is part of the passive branch, its result has to be fetched by a synchronous subscription on every evaluation request.
[Fig. 4.5: Remote evaluation of an operator tree. The tree is built from scalar, grouping, join, selection, union, and shift operators over the channels Ressource1, Ressource2, and Ressource3; the subtree that depends only on Ressource1 is shaded gray.]
Subscription Forwarding
In contrast to remote evaluation, where only parts of the operator tree are evaluated at another broker while control remains with the receiver of the subscription, subscription forwarding transfers the entire processing to another broker. So that the subscription can be evaluated with fewer resources, this broker should hold as many of the required data sources as possible with sufficient freshness. A suitable broker is selected on the basis of the metadata (steps 1 and 2 in Figure 4.6). The subscription is then forwarded to this broker (step 3) and integrated there as usual. Results are delivered directly from this broker to the subscriber (step 4). The procedure for selecting the executing broker must ensure that this broker does not forward the subscription again and thereby make the sequence of forwardings cyclic.
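One conceivable way to select the executing broker is to count, per broker, how many of the required channels it holds with sufficient freshness, and to exclude every broker the subscription has already passed through. The text only demands that cycles be prevented; the `visited` set used here, the function name `choose_executor`, and the tie-breaking by lowest bID are assumptions of this sketch.

```python
def choose_executor(required, channels, visited):
    """Pick the broker holding the most required channels at sufficient freshness.

    required: dict channel name -> maximal acceptable qos value
    channels: iterable of (name, bID, qos) rows from the channel metadata
    visited:  set of bIDs the subscription has already been forwarded through
    """
    score = {}
    for name, bid, qos in channels:
        if bid in visited:
            continue  # never forward back to a broker already on the path
        if name in required and qos <= required[name]:
            score.setdefault(bid, set()).add(name)
    if not score:
        return None
    # Ties are broken in favor of the lowest bID (an arbitrary choice).
    return max(sorted(score), key=lambda b: len(score[b]))

required = {"Aktien": 10, "Indizes": 60}
channels = [("Aktien", 1, 5), ("Indizes", 1, 60), ("Aktien", 2, 10), ("Aktien", 3, 30)]
first_choice = choose_executor(required, channels, visited=set())
second_choice = choose_executor(required, channels, visited={1})
```

Passing the growing `visited` set along with each forwarded subscription guarantees that the chain of forwardings terminates.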
[Fig. 4.6: Subscription forwarding procedure (steps 1 to 4)]
5
To make the PubScribe system scalable on a large scale as well, this contribution designs a distributed architecture of cooperating brokers. To this end, the first section of this contribution presents the three-tier architecture of a non-distributed PubScribe system. For the client tier, several implementation alternatives are shown, all of which communicate via XML documents with the broker located on the central middle tier. The data sources integrated into the system are each encapsulated by individual wrappers and likewise interact with the broker via XML documents. To achieve a limited degree of scalability, the broker itself is divided into several modules that can be distributed across different machines. Information channels are used to manage system data and metadata, so that subscriptions see no difference between the data of an external producer and this internal information. The state of the broker is also saved through information channels, so that it survives a restart.
From a macroscopic point of view, distribution transparency is achieved by having producers and subscribers register with a broker just as in the centralized case. To bring data and queries together for evaluation, three mechanisms are presented: channel replication, remote evaluation, and subscription forwarding.
References
BCM+99 Banavar, G.; Chandra, T.; Mukherjee, B.; Nagarajarao, J.; Strom, R.E.; Sturman, D.C.: An Efficient
Multicast Protocol for Content-Based Publish-Subscribe Systems. In: Proceedings of the 19th
International Conference on Distributed Computing Systems (ICDCS'99, Austin (TX), USA,
May 31 - June 4), 1999, pp. 262-272
BEK+00 Box, D.; Ehnebuske, D.; Kakivaya, G.; Layman, A.; Mendelsohn, N.; Nielsen, H.F.; Thatte, S.;
Winer, D.: SOAP: Simple Object Access Protocol, 2000
CaRW00 Carzaniga, A.; Rosenblum, D.S.; Wolf, A.L.: Achieving Scalability and Expressiveness in an
Internet-Scale Event Notification Service. In: Proceedings of the 19th ACM Symposium on
Principles of Distributed Computing (PODC 2000, Portland (OR), USA, July 16-19), 2000, pp. 219-227
Memory based Publish/Subscribe System. In: Lehner, W. (Ed.): Advanced Techniques in
Personalized Information Delivery. Arbeitsbericht des Instituts für Informatik, Friedrich-Alexander
Universität Erlangen-Nürnberg, 2001
LeHü01 Lehner, W.; Hümmer, W.: The Revolution Ahead: Publish/Subscribe meets Database Systems. In:
Lehner, W. (Ed.): Advanced Techniques in Personalized Information Delivery. Arbeitsbericht des
Instituts für Informatik, Friedrich-Alexander Universität Erlangen-Nürnberg, 2001
LiPH00 Liu, L.; Pu, C.; Han, W.: XWRAP: An XML-enabled Wrapper Construction System for Web
Information Sources. In: Proceedings of the 16th International Conference on Data Engineering
(ICDE 2000, San Diego (CA), USA, February 28 - March 3), 2000, pp. 611-621
(Available electronically at: http://www.cse.ogi.edu/~lingliu/Papers/xwrapTechRep.ps)
LiPT99 Liu, L.; Pu, C.; Tang, W.: Continual Queries for Internet Scale Event-Driven Information Delivery.
In: IEEE Transactions on Knowledge and Data Engineering (vol. 11, no. 1), 1999,
pp. 610-628
SAB+00 Segall, B.; Arnold, D.; Boot, J.; Henderson, M.; Phelps, T.: Content Based Routing with Elvin4. In:
Proceedings of the AUUG 2000 Technical Conference (AUUG 2000, Canberra, Australia,
June 25-30), 2000
W3C99 N.N.: Extensible Markup Language (XML) 1.0. World Wide Web Consortium, 1999
(Available electronically at: http://www.w3c.org/TR/REC-xml)
YaGa99 Yan, T.W.; Garcia-Molina, H.: The SIFT Information Dissemination System. In: ACM Transactions
on Database Systems (TODS, Jahrgang 24, Nu
STRUCTURAL AND OPERATIONAL
MODELING ASPECTS IN PUBSCRIBE
M. Redert, W. Lehner, W. Hümmer
{mlredert, lehner, huemmer}@immd6.informatik.uni-erlangen.de
Abstract
A defining property of the field of 'Personalized Information Delivery' is that information enters a subscription system sequentially in the form of messages, is processed there, and is again delivered sequentially to the user. This property is also reflected at the modeling level of the PubScribe system. This contribution therefore examines modeling from a structural and an operational perspective. The structural unit is the data structure of a "queue". Based on these units, this contribution defines a variety of unary and binary operators that are closed with respect to this data structure. Besides the fundamentally important filter operation, there are scalar, join, and union operators. The central extension compared with other subscription approaches is the introduction of grouping and partitioning operators.
1
Introduction
This contribution discusses the theoretical foundations of the subscription system PubScribe ([LeHR01], [LeHü01]). The presentation follows the property, inherent to subscription systems, of sequence-based communication and processing (Figure 1.1). Producers generate messages in temporal order; within the system these are possibly combined with other messages and delivered to subscribers.
[Fig. 1.1: Principle of the publish/subscribe mechanism. (1) Subscribers subscribe at the subscription system, (2) producers publish, (3) the system notifies the subscribers.]
In the PubScribe system, this sequence-based property is reflected at the model level, in particular from a structural point of view. After the running example scenario has been introduced, Section 3 defines the "queue" as the central data structure and introduces it by way of the example. Building on this structural unit, Section 4 then explains a variety of operators. The set of operators ranges from simple filter operators through join and union operators to complex grouping and partitioning/windowing operators.
2
Example Scenario
The example scenario described below serves as the basis for illustrations in this and the following contribution ([RRHL01]). The stock market was chosen because the price information arising there is highly dynamic and must be made available to a user as promptly as possible so that he or she can act quickly. Besides the individual price data, computed values such as the 30-day moving average of a stock are also important. Stock prices are thus an ideal application area for subscription systems.
A producer provides the price information in the following schema:
(Wertpapier, Handelsplatz, Uhrzeit, Kurs, Volumen)
Here, Wertpapier is the name of the stock or fund (e.g. Oracle or IBM); Handelsplatz denotes the exchange (e.g. BER for Berlin or HAM for Hamburg) for which the price information is valid at the given Uhrzeit. These three attributes describe a tuple uniquely and thus form its primary key in the relational sense. The attribute Kurs gives the determined value of the security; Volumen is the number of shares traded at the given price. The tuple
(‘IBM’, ‘BER’, ‘29.8.2000 9:30’, ‘96,70’, ‘53’)
thus states that on 29 August 2000 at 9:30, 53 IBM shares were traded in Berlin at a price of 96,70. A further publisher provides information about which security is included in which stock index:
(Wertpapier, Index, Beginn)
The primary key comprises the name of the stock and of the index; the attribute Beginn states since when the security has belonged to the index. Possible indices are, for example, DAX (Deutscher Aktien Index) or NASDAQ (‘National Association of Securities Dealers Automated Quotation’).
By combining the data of both publishers, a subscription can, for example, determine the daily highs of all stocks contained in the DAX. Another subscription could compare different indices with respect to the average price increase of the stocks they contain.
3
Structural Aspect
The central structure of the subscription system described here is the so-called queue. It represents all data streams that flow from the producer to the subscriber and within the system. To separate the intension and the extension of a queue, its definition distinguishes between schema and instance.
3.1 Schema of a Queue
The schema of a queue is composed of attributes. The term attribute is used here with the same meaning as in the context of relational databases and is defined as follows:
Definition: The schema $A_i$ of an attribute is defined by an identifier $N$ and a type $T$: $A_i = (N, T)$ with $T \in \{$boolean, integer, float, string, datetime, lob$\}$. The type boolean denotes truth values, integer whole numbers, float rational numbers, and string character strings. Attributes of type datetime can hold a date or a time interval; the data type lob (‘Large OBject’) denotes uninterpreted data. The name $N_i$ of the attribute $A_i$ is returned by the function $N(A_i)$, its type by $T(A_i)$.
The schema of a queue consists of two sets of attributes, called the identification part (‘id’) and the information part (‘content’). The identifiers of the attributes must be unique across both sets, and the information part must contain at least one attribute.
Definition: The schema $Q$ of a queue is defined by two attribute sets $ID$ and $C$: $Q = (ID, C) = ([ID_1, \ldots, ID_n], [C_1, \ldots, C_m])$ with $n \ge 0$, $m > 0$. The union $ID \cup C$ of the attribute sets is denoted by $A$. The identifiers $N_k$ of the attributes of a queue must satisfy: $\forall i, j \le |A|,\ i \ne j:\ N_i \ne N_j$.
The schema of a queue describes the structure of its data. The identification part and the information part are separated in order to distinguish identifying from non-identifying attributes; the reason becomes clear in the course of the next section.
The data arising in the stock market scenario described in Section 2 can be represented in a queue with the following schema, in which the attribute names are abbreviated with respect to the identifiers used there:
([(wp, string),             // name of the security
  (hp, string),             // name of the exchange, e.g. FWB
  (zeit, datetime)          // date and time
 ],
 [(kurs, float),            // price of the security
  (vol, integer)            // number of shares traded
 ]
)
In this schema, the identification part consists of the security name (wp), the trading place (hp), and the time (zeit). The information part contains the price matching these values (kurs) and the trading volume (vol).
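The constraints of the schema definition (admissible types, unique attribute names across both parts, at least one information attribute) are easy to check mechanically. The following Python sketch is an illustration; the function name `validate_schema` and the representation of an attribute as a (name, type) pair are assumptions.

```python
TYPES = {"boolean", "integer", "float", "string", "datetime", "lob"}

def validate_schema(id_part, content_part):
    """Check a queue schema ([ID_1..ID_n], [C_1..C_m]) against the definition."""
    if len(content_part) == 0:
        raise ValueError("the information part needs at least one attribute (m > 0)")
    attributes = list(id_part) + list(content_part)
    names = [name for name, _ in attributes]
    if len(set(names)) != len(names):
        raise ValueError("attribute names must be unique across both parts")
    for name, typ in attributes:
        if typ not in TYPES:
            raise ValueError(f"unknown type {typ!r} for attribute {name!r}")
    return True

stock_schema = (
    [("wp", "string"), ("hp", "string"), ("zeit", "datetime")],
    [("kurs", "float"), ("vol", "integer")],
)
ok = validate_schema(*stock_schema)
```

An empty identification part is accepted, in line with $n \ge 0$ in the definition.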
3.2 Instance of a Queue
The instance of a queue is an ordered set of tuples that conform to the schema of the queue. The identification attributes identify a tuple uniquely within an instance. A new tuple is appended to the end of an existing queue. If the queue already contains a tuple with the identification part of the new tuple, that tuple is replaced.
Definition: The instance $q$ of a queue described by the associated schema $Q = (ID, C) = ([ID_1, \ldots, ID_n], [C_1, \ldots, C_m])$ is defined as an ordered set of tuples: $q = [(id, c)] = [([id_1, \ldots, id_n], [c_1, \ldots, c_m])]$. The data types of the $id_i$ and $c_i$ must match those of the schema. The identification attributes identify a tuple uniquely within an instance. $q_i$ denotes the $i$-th tuple of the queue $q$, where the insertion order is taken as the natural order of the tuples. The function $pos(q_i, q)$ returns the position of the tuple $q_i$ in $q$. The number of tuples contained in $q$ is denoted by $|q|$.
For the special case of a queue without identification attributes, note that it can contain at most one tuple because the ID identifies tuples uniquely. The following instance of the queue defined in the preceding section contains arbitrarily chosen stock market information:
[([‘ORACLE’, ‘HAM’, ‘29.8.2000 9:07’], [‘ 97,00’, ‘ 8’]),
 ([‘IBM’,    ‘BER’, ‘29.8.2000 9:17’], [‘145,30’, ‘37’]),
 ([‘ORACLE’, ‘FWB’, ‘29.8.2000 9:30’], [‘ 96,70’, ‘53’]),
 ([‘IBM’,    ‘STU’, ‘29.8.2000 9:30’], [‘144,20’, ‘12’]),
 ([‘ORACLE’, ‘BER’, ‘29.8.2000 9:33’], [‘ 96,50’, ‘21’]),
 ([‘ORACLE’, ‘FWB’, ‘29.8.2000 9:58’], [‘ 97,10’, ‘23’])
]
Since a stock can have only one valid value at one place at one point in time, the uniqueness of the identification part is always guaranteed for consistent data.
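The append-or-replace behaviour of a queue instance can be sketched with an insertion-ordered dictionary keyed by the identification part. The class below is a hypothetical illustration, not the PubScribe data structure. One point is an explicit assumption: the text does not say whether a replaced tuple keeps its old position or moves to the end, and this sketch moves it to the end.

```python
class Queue:
    """Minimal queue instance: tuples ordered by insertion, unique by id part."""

    def __init__(self):
        self._tuples = {}  # Python dicts preserve insertion order

    def append(self, ident, content):
        key = tuple(ident)
        # Assumption: a tuple with the same identification part is dropped
        # and the new tuple is placed at the end of the queue.
        self._tuples.pop(key, None)
        self._tuples[key] = tuple(content)

    def __len__(self):
        return len(self._tuples)

    def tuples(self):
        return [(list(k), list(v)) for k, v in self._tuples.items()]

q = Queue()
q.append(["ORACLE", "HAM", "29.8.2000 9:07"], [97.00, 8])
q.append(["IBM", "BER", "29.8.2000 9:17"], [145.30, 37])
q.append(["ORACLE", "HAM", "29.8.2000 9:07"], [97.05, 9])   # same id part: replaces

clock = Queue()                      # queue without identification attributes
clock.append([], ["9:07"])
clock.append([], ["9:08"])
```

The `clock` queue shows the special case mentioned above: with an empty identification part, every new tuple replaces the previous one, so the queue never holds more than one tuple.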
4
Operational Aspect
The queues explained in the last section can be modified and combined by operators. In general, such an operator acts on a set of queues as operands and produces a new queue as its result. According to the number of operands, the operators can be divided into unary, binary, and general n-ary operators.
The following sections present operators that are suitable for processing data in a subscription system. A distinction is made between the unary operators selection, scalar, sort, grouping, shift, and window and the binary operators union and join. Because the purpose here, too, is the processing and modification of data, the similarity to SQL operators, which is not limited to the names, is no coincidence. There are, however, also operators that have no counterpart (shift operator) or no counterpart yet (window operator) in the SQL standard ([SQL99]).
4.1 The Selection Operator
The selection operator filters particular tuples out of a queue. Only those tuples of the operand that satisfy a selection condition are included in the result. The condition is composed of nested and logically combined partial conditions, each of which compares an attribute value with a constant of the same type or with another attribute value.
Definition: Every valid selection condition Bed can be derived by the following rules in Backus-Naur Form (BNF):
   Bed          ::= ‘(’ Bed ‘)’ LogOp ‘(’ Bed ‘)’ | AtomBed
   AtomBed      ::= <IDname> VergleichsOp RechteSeite |
                    ‘not’ <IDname> VergleichsOp RechteSeite
   VergleichsOp ::= ‘=’ | ‘<>’ | ‘<’ | ‘>’ | ‘<=’ | ‘>=’ | ‘substring’
   LogOp        ::= ‘AND’ | ‘OR’
   RechteSeite  ::= <Wert> | <IDname>
<IDname> is the name of an identification attribute and <Wert> a value from the associated domain. Bed(M) denotes a condition in which the variables <IDname> are bound only to elements of the set M.
Because the condition is defined recursively, parenthesized subexpressions can be nested arbitrarily. With this definition, the selection operator can be formalized as follows:
Definition: Applied to the queue $q_{in} = [(id, c)]$, the selection operator $select(q_{in}, Bed(id))$ produces the result $q_{out} = [\,(id, c) \in q_{in} \mid Bed(id)\,]$. $Bed(id)$ is a selection condition in the sense of the preceding definition that depends only on the identification attributes $id$ of the operand.
The effect of the selection operator is illustrated by determining all prices of Oracle stocks at the Frankfurt stock exchange, which is achieved by the selection condition "(wp = ‘Oracle’) AND (hp = ‘FWB’)".
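A minimal Python sketch of the selection operator follows. It is an illustration under assumptions: tuples are modeled as pairs of dictionaries (identification part, information part), the condition is passed as a Python predicate instead of being parsed from the BNF, prices are stored as floats, and the constant is written ‘ORACLE’ so that it matches the spelling used in the example instance.

```python
ROWS = [
    ("ORACLE", "HAM", "29.8.2000 9:07", 97.00, 8),
    ("IBM", "BER", "29.8.2000 9:17", 145.30, 37),
    ("ORACLE", "FWB", "29.8.2000 9:30", 96.70, 53),
    ("IBM", "STU", "29.8.2000 9:30", 144.20, 12),
    ("ORACLE", "BER", "29.8.2000 9:33", 96.50, 21),
    ("ORACLE", "FWB", "29.8.2000 9:58", 97.10, 23),
]
# Each tuple is a pair (identification part, information part).
Q = [({"wp": w, "hp": h, "zeit": z}, {"kurs": k, "vol": v}) for w, h, z, k, v in ROWS]

def select(q_in, bed):
    """Keep the tuples whose identification part satisfies the condition `bed`."""
    return [(ident, content) for ident, content in q_in if bed(ident)]

# (wp = 'ORACLE') AND (hp = 'FWB')
oracle_fwb = select(Q, lambda ident: ident["wp"] == "ORACLE" and ident["hp"] == "FWB")
```

The predicate receives only the identification part, which enforces the restriction that a selection condition must not depend on information attributes.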
4.2 The Scalar Operator
The scalar operator allows the information part of a queue to be modified by so-called scalar functions. In the result, the information attributes of the operand are replaced by the results of the scalar functions.
Definition: The scalar operator $scalar(q_{in}, [(N_1, f_1(id, c)), \ldots, (N_k, f_k(id, c))])$ produces from $q_{in} = [(id, c)]$ the queue $q_{out} = [\,(id, [f_1(id, c), \ldots, f_k(id, c)]) \mid (id, c) \in q_{in}\,]$ with the schema $Q_{out} = (ID, [N_1, \ldots, N_k])$ as its result. The functions $f_i$ are called scalar functions.
To allow an attribute to be carried over from the operand, the identity function Id exists. Further conceivable scalar functions are arithmetic functions (e.g. plus, minus, mult, div) and Boolean functions (e.g. greater, smaller, equal), as well as special functions that, for example, compute the month or the calendar week from a date (month, week). Because some functions can only be applied to data of a particular type, type conformance between the expected and the actual data type of the operands of a scalar function has to be ensured.
In the running stock scenario, the capital turned over per security can be computed as the product of the price and the trading volume. This is done by the scalar function kapital=mult(kurs, vol). To carry the price over unchanged into the result, the identity function id(kurs) is applied to the attribute kurs, which yields the following result:
([wp, hp, zeit], [kapital, kurs])
The new attribute kapital is of type float; the types of the other attributes remain unchanged with respect to the operand.
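The scalar operator can be sketched as a function that rebuilds the information part of every tuple from a list of (name, function) pairs. The representation of tuples as pairs of dictionaries and the use of Python lambdas for mult and the identity function are assumptions of this sketch; the sample tuples are taken from the example instance, with prices as floats.

```python
Q = [
    ({"wp": "ORACLE", "hp": "HAM", "zeit": "29.8.2000 9:07"}, {"kurs": 97.00, "vol": 8}),
    ({"wp": "IBM", "hp": "BER", "zeit": "29.8.2000 9:17"}, {"kurs": 145.30, "vol": 37}),
    ({"wp": "ORACLE", "hp": "FWB", "zeit": "29.8.2000 9:30"}, {"kurs": 96.70, "vol": 53}),
]

def scalar(q_in, functions):
    """Replace the information part by the results of the scalar functions.

    functions: list of (N_i, f_i) where f_i takes (id, c) and returns one value."""
    return [(ident, {name: f(ident, content) for name, f in functions})
            for ident, content in q_in]

q_out = scalar(Q, [
    ("kapital", lambda ident, c: c["kurs"] * c["vol"]),   # mult(kurs, vol)
    ("kurs", lambda ident, c: c["kurs"]),                 # identity function on kurs
])
```

The attribute vol no longer appears in the result because no scalar function carries it over, which matches the result schema ([wp, hp, zeit], [kapital, kurs]).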
4.3 The Sort Operator
The sort operator sorts the tuples of a queue by one or more identification attributes in ascending or descending order. The schema of the queue remains unchanged.
Definition: The sort operator $order(q_{in}, [s_1, \ldots, s_r])$ with $s_k = (A_{i_k}, o_k)$ and the sort order $o_k \in \{ASC, DESC\}$ produces from the queue $q_{in} = [([id_1, \ldots, id_n], [c_1, \ldots, c_m])]$ a queue $q_{out}$ of the same schema. The order of the tuples satisfies the following conditions:
$r = 0 \Rightarrow \forall q_i \in q_{out}: pos(q_i, q_{out}) = pos(q_i, q_{in})$
$r > 0 \Rightarrow \forall q_i, q_j \in q_{out}: i < j \Leftarrow (A_{i_1}(q_i) < A_{i_1}(q_j) \wedge o_1 = ASC) \vee (A_{i_1}(q_i) > A_{i_1}(q_j) \wedge o_1 = DESC) \vee ((A_{i_1}(q_i) = A_{i_1}(q_j)) \wedge pos(q_i, order(q_{in}, [s_2, \ldots, s_r])) < pos(q_j, order(q_{in}, [s_2, \ldots, s_r])))$
Since the other operators do not preserve the order of the tuples, sorting affects the final result only when it is the last operator in a chain.
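The recursive tie-breaking rule of the definition (ties on $s_1$ are resolved by the order that $[s_2, \ldots, s_r]$ would produce, and finally by the input order) corresponds to applying a stable sort once per criterion, from the least to the most significant one. The sketch below relies on the stability of Python's `list.sort`; the function name and the tuple representation are assumptions.

```python
Q = [
    ({"wp": "ORACLE", "hp": "HAM", "zeit": "29.8.2000 9:07"}, {"kurs": 97.00, "vol": 8}),
    ({"wp": "IBM", "hp": "BER", "zeit": "29.8.2000 9:17"}, {"kurs": 145.30, "vol": 37}),
    ({"wp": "ORACLE", "hp": "FWB", "zeit": "29.8.2000 9:30"}, {"kurs": 96.70, "vol": 53}),
    ({"wp": "IBM", "hp": "STU", "zeit": "29.8.2000 9:30"}, {"kurs": 144.20, "vol": 12}),
    ({"wp": "ORACLE", "hp": "BER", "zeit": "29.8.2000 9:33"}, {"kurs": 96.50, "vol": 21}),
    ({"wp": "ORACLE", "hp": "FWB", "zeit": "29.8.2000 9:58"}, {"kurs": 97.10, "vol": 23}),
]

def order(q_in, spec):
    """Sort by identification attributes; spec is a list of (attribute, 'ASC'|'DESC')."""
    q_out = list(q_in)
    # Stable sorts applied from the last criterion to the first give the
    # lexicographic order; equal tuples keep their input order.
    for attr, direction in reversed(spec):
        q_out.sort(key=lambda t: t[0][attr], reverse=(direction == "DESC"))
    return q_out

sorted_q = order(Q, [("wp", "ASC"), ("hp", "DESC")])
pairs = [(i["wp"], i["hp"]) for i, _ in sorted_q]
```

With an empty criteria list the loop does not run, so the input order is returned unchanged, which is the case $r = 0$ of the definition.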
4.4 The Grouping Operator
The grouping operator works analogously to that of the SQL standard. First, a queue is decomposed into disjoint subsets. This partitioning is specified by identification attributes that define equivalence classes.
Definition: Die Partitionierung der Queue q in = [ ( [ id 1, …, id n ], c ) ] nach der Attributmenge S = { ID p1, …, ID p j } mit p k ≤ n ist durch die Menge P von Queues gegeben:
P = ∪ [ x ∈ q in ( ID pi ( x ) = ID pi ( q k ) ) ∀i ∈ [ 1, j ] ] .
k
Ein Element von P wird Partition genannt.
Eine Partitionierung besteht also aus mehreren Queues, von denen jede alle die Tupel der
Ausgangs-Queue enthält, die in den Werten der Partitionierungsattribute übereinstimmen.
Aus den Tupeln einer Partition leitet sich dann ein Tupel der Ergebnis-Queue ab. Dazu werden alle Werte, die diese in einem Attribut annehmen, durch eine sogenannte Aggregatfunktion zu einem Wert zusammengefasst. Mögliche Aggregatfunktionen sind dabei Anzahl
(count), Summe (sum), Minimum (min) und Maximum (max)*. Die Partitionierungsattribute
nehmen den gemeinsamen Wert aller Tupel der Partition an. Attribute, nach denen weder partitioniert noch aggregiert wird, fallen im Ergebnis weg.
Definition: Sei q in = [ id , c ] = [ ( [ id 1, …, id n ], [ c 1, …, c m ] ) ] der Operand und P eine
Partitionierung nach den Attributen S = { ID p1, …, ID p j } . Der Gruppierungsoperator groupBy ( q in, ( { ID p1, …, ID p j }, { ( A h1, f 1, N 1 ), …, ( A hk, f k , N k ) } ) ) erzeugt
die
Ergebnis-Queue
q out = ∪ ( ( [ id p1, …, id p j ], [ f 1 ( a h1 ), …, f k ( a hk ) ] ) a h j = { A h j ( x ) }, x ∈ r ) mit
r∈P
dem
Schema Q out = ( ID, [ N 1, …, N k ] ) .
The following example illustrates how the grouping operator works. To compute the total number of shares traded per security name, independent of trading venue and time, the grouping operator groupBy(q_in, ({wp}, {(vol, sum, sVol)})) is used. Applied to the queue of the running example, it yields the following two tuples:
[([‘ORACLE’], [‘105’]),
 ([‘IBM’], [‘49’])
]
The schema of the new queue is ([wp], [sVol]).
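The grouping steps (partition, aggregate, drop the remaining attributes) can be sketched as follows. This is a minimal illustration, not the PubScribe implementation: the function name group_by, the representation of a tuple as a pair of dictionaries, and the input values are assumptions made for this sketch, and the input queue is hypothetical rather than the queue of the running example.

```python
# Aggregate functions permitted by the text: count, sum, min, max.
AGGREGATES = {"count": len, "sum": sum, "min": min, "max": max}


def group_by(queue, part_attrs, aggregations):
    """Group a queue of (identification, information) dict pairs.

    part_attrs   -- identification attributes to partition by
    aggregations -- list of (attribute, function name, result name)
    """
    partitions = {}  # partition key -> list of merged tuples
    for ident, info in queue:
        key = tuple(ident[a] for a in part_attrs)
        partitions.setdefault(key, []).append({**ident, **info})

    result = []
    for key, rows in partitions.items():
        # Partitioning attributes keep the value shared by the partition.
        out_ident = dict(zip(part_attrs, key))
        # Every other attribute survives only through an aggregation.
        out_info = {
            name: AGGREGATES[func]([row[attr] for row in rows])
            for attr, func, name in aggregations
        }
        result.append((out_ident, out_info))
    return result


# Hypothetical queue with schema ([wp, hp, zeit], [kurs, vol]).
queue = [
    ({"wp": "ORACLE", "hp": "FWB", "zeit": 1}, {"kurs": 98.1, "vol": 60}),
    ({"wp": "ORACLE", "hp": "BER", "zeit": 1}, {"kurs": 98.3, "vol": 40}),
    ({"wp": "IBM", "hp": "FWB", "zeit": 1}, {"kurs": 145.5, "vol": 15}),
    ({"wp": "IBM", "hp": "STU", "zeit": 2}, {"kurs": 146.5, "vol": 25}),
]

totals = group_by(queue, ["wp"], [("vol", "sum", "sVol")])
print(totals)
```

The attributes hp, zeit, and kurs do not appear in the output because they are neither partitioned on nor aggregated.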
4.5 The Shift Operator
The shift operator is, in a sense, the counterpart of grouping. While grouping shrinks the identification part of a queue, the shift operator enlarges it by moving information attributes into the identification part.
* The average is not permitted as an aggregate function because its result cannot be updated incrementally. It can, however, be computed from count and sum by a subsequent scalar operation.
Definition: From the queue $q_{in} = [id, c] = [([id_1, \ldots, id_n], [c_1, \ldots, c_m])]$, the shift operator $shift(q_{in}, a)$ with $a = \{c_{i_1}, \ldots, c_{i_k}\}$, $c_{i_j} \in \{c_1, \ldots, c_m\}$, generates the queue $q_{out} = [([id_1, \ldots, id_n, c_{i_1}, \ldots, c_{i_k}], c') \mid c' = c \setminus a]$.
The shift operator is particularly important in connection with date operations. For example, to group by calendar week rather than by time of day, the scalar operation KW must first be applied to the time, and the new attribute must then be moved into the identification part. Grouping by calendar week is then possible.
Applying the shift operator shift(q_in, {kurs}) to a queue with the schema of the running example yields a queue that contains all tuples of the operand but has the following modified schema:
([wp, hp, zeit, kurs], [vol])
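A minimal sketch of the shift operator is given below. The function name shift, the dictionary-based tuple representation, and the sample tuple are assumptions for illustration only. The check for a remaining information attribute reflects the requirement, stated elsewhere in the chapter, that a queue has at least one information attribute.

```python
def shift(schema, queue, attrs):
    """Move the information attributes in `attrs` into the identification part.

    schema -- (list of identification names, list of information names)
    queue  -- list of (identification dict, information dict) pairs
    """
    id_names, info_names = schema
    moved = [a for a in info_names if a in attrs]
    kept = [a for a in info_names if a not in attrs]
    if not kept:
        # A queue needs at least one information attribute.
        raise ValueError("shift would leave the information part empty")

    new_queue = [
        ({**ident, **{a: info[a] for a in moved}}, {a: info[a] for a in kept})
        for ident, info in queue
    ]
    return (id_names + moved, kept), new_queue


# Schema of the running example; the tuple values are hypothetical.
schema = (["wp", "hp", "zeit"], ["kurs", "vol"])
queue = [({"wp": "ORACLE", "hp": "FWB", "zeit": "t1"}, {"kurs": 98.1, "vol": 60})]

new_schema, new_queue = shift(schema, queue, {"kurs"})
print(new_schema)
```

The number of tuples is unchanged; only the boundary between identification and information part moves.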
4.6 The Window Operator
The window operator closely resembles grouping, but it does not reduce the number of tuples. Its operation can be divided into four phases. In the first phase, partitions are formed as in grouping. In the second phase, the tuples of each partition are sorted according to a given criterion.
Definition: Let $q_{in} = [id, c] = [([id_1, \ldots, id_n], [c_1, \ldots, c_m])]$ be the operand and $P = \{p_i\}$ a partitioning of it by $S = \{ID_{p_1}, \ldots, ID_{p_j}\}$. The ordered partitions $O$ are then given by $O = \{order(p_i, [s_1, \ldots, s_i]) \mid p_i \in P\}$.
In the third phase, each tuple is assigned a section of the partition to which it belongs. The size of such a window can be determined by a logical or a physical interval. A logical interval refers to the values of a particular attribute and their natural order (e.g. from two days before to three days after the timestamp of the tuple), a physical interval to a number of tuples (e.g. the two preceding and the three following tuples of the ordered partition). The reference point of the interval is always the respective tuple. If such a window extends beyond the boundaries of the partition, it is ‘cut off’ there. Missing values are thus not replaced by null values or supplied by other mechanisms.
Definition: Let $O_{q_i}$ be the ordered partition that contains the tuple $q_i$, and $[l, r]$ an interval. The logical window $W^l_{q_i}$ of $q_i$ is defined by the interval as follows:
$W^l_{q_i} = [\, x \in O_{q_i} \mid A(x) \in [l, r] \,]$.
The physical window $W^p_{q_i}$ of the tuple $q_i$ is defined by the interval $[l, r]$ as:
$W^p_{q_i} = [\, q_L, \ldots, q_R \mid L = \max(i - l, 0),\; R = \min(i + r, |O_{q_i}|) \,]$.
Only the first sort attribute can serve as the basis for a logical window, because the tuples of the partition are ordered by it and only then is it guaranteed that the window is formed by a contiguous range of tuples.
In the last phase, the window operator aggregates analogously to grouping, but separately for each window. In this way, one result tuple is produced for every window and thus for every tuple of the operand.
Definition: Applied to the operand $q_{in}$, the window operator
$window(q_{in}, (\{ID_{p_1}, \ldots, ID_{p_j}\}, [s_1, \ldots, s_i], (T, [l, r]), \{(A_{h_1}, f_1, N_1), \ldots, (A_{h_k}, f_k, N_k)\}))$
first partitions the queue by $\{ID_{p_1}, \ldots, ID_{p_j}\}$ and then sorts the partitions by $[s_1, \ldots, s_i]$. Next, a window $W_{q_i}$ is formed for every tuple $q_i \in q_{in}$; $T \in \{l, p\}$ specifies its type and $[l, r]$ is the interval. The result $q_{out}$ of the operator is then composed of the aggregation of each window $W_{q_i}$ according to $(A_{h_1}, f_1, N_1), \ldots, (A_{h_k}, f_k, N_k)$:
$q_{out} = \bigcup_{q_i \in q_{in}} groupBy(W_{q_i}, (\{ID_{p_1}, \ldots, ID_{p_j}\}, \{(A_{h_1}, f_1), \ldots, (A_{h_k}, f_k)\}))$.
It has the schema $Q_{out} = (ID, [N_1, \ldots, N_k])$.
The following example illustrates how the operator works. Moving averages and moving sums are particularly important for stock prices, but also for sales figures. With the window operator, the trading volume and the average share price can be computed for a day and the two adjacent trading days.
[Fig. 4.1: Window operator. Partitioning: wp, hp; ordering: zeit, ASC; window: p, [-1, 1]; aggregations: sumKurs=SUM(kurs), cntKurs=COUNT(kurs), sumVol=SUM(vol)]
The data basis for demonstrating the window operator is the following queue, which contains the closing prices and the number of shares traded for the Oracle and IBM stocks over a period of one week:
[([‘ORACLE’, ‘FWB’, ‘11.9.2000 19:30’], [‘ 98,10’, ‘23108’]),
 ([‘ORACLE’, ‘FWB’, ‘12.9.2000 19:30’], [‘ 95,90’, ‘19657’]),
 ([‘ORACLE’, ‘FWB’, ‘13.9.2000 19:30’], [‘ 94,00’, ‘57843’]),
 ([‘ORACLE’, ‘FWB’, ‘14.9.2000 19:30’], [‘ 99,00’, ‘40435’]),
 ([‘ORACLE’, ‘FWB’, ‘15.9.2000 19:30’], [‘ 93,80’, ‘83129’]),
 ([‘IBM’,    ‘FWB’, ‘11.9.2000 19:30’], [‘145,50’, ‘24184’]),
 ([‘IBM’,    ‘FWB’, ‘12.9.2000 19:30’], [‘146,50’, ‘41416’]),
 ([‘IBM’,    ‘FWB’, ‘13.9.2000 19:30’], [‘146,50’, ‘18388’]),
 ([‘IBM’,    ‘FWB’, ‘14.9.2000 19:30’], [‘149,00’, ‘13359’]),
 ([‘IBM’,    ‘FWB’, ‘15.9.2000 19:30’], [‘146,00’, ‘10151’])
]
Based on this queue, the following windowing operation is to be executed:
window(Q, ({wp, hp}, [zeit], (p, [-1, 1]),
  {(kurs, sum, sumKurs), (kurs, count, cntKurs), (vol, sum, sumVol)}))
In a first step, this queue is partitioned by security and trading venue. The result is two queues, one with the tuples of the IBM stock and one with those of the Oracle stock. Both are then sorted by date and time. After that, the physical window of ± 1 tuple can be created for each tuple and the aggregations can be executed over it. Logical windows cannot be used here because no price information exists for weekends, so the windows of the adjacent days would be incomplete.
For the first tuple of the Oracle queue, this window comprises the first two tuples because no predecessor exists. Applying the aggregate functions yields the result tuple:
([‘ORACLE’, ‘FWB’, ‘11.9.2000 19:30’],
 [‘194,00’, ‘2’, ‘ 42765’])
For the next tuple, the window contains the first three tuples of the queue. The result therefore contains the following tuple for it:
([‘ORACLE’, ‘FWB’, ‘12.9.2000 19:30’],
 [‘288,00’, ‘3’, ‘100608’])
Carrying out the calculations analogously for all tuples of the input queue above yields a result queue that likewise contains ten tuples. The average price is computed by a subsequent scalar operation. The result is shown below:
[([‘ORACLE’, ‘FWB’, ‘11.9.2000 19:30’], [‘ 97,00’, ‘42765’]),
 ([‘ORACLE’, ‘FWB’, ‘12.9.2000 19:30’], [‘ 96,00’, ‘100608’]),
 ([‘ORACLE’, ‘FWB’, ‘13.9.2000 19:30’], [‘ 96,30’, ‘117935’]),
 ([‘ORACLE’, ‘FWB’, ‘14.9.2000 19:30’], [‘ 95,60’, ‘181407’]),
 ([‘ORACLE’, ‘FWB’, ‘15.9.2000 19:30’], [‘ 96,40’, ‘123564’]),
 ([‘IBM’,    ‘FWB’, ‘11.9.2000 19:30’], [‘146,00’, ‘65600’]),
 ([‘IBM’,    ‘FWB’, ‘12.9.2000 19:30’], [‘146,20’, ‘83988’]),
 ([‘IBM’,    ‘FWB’, ‘13.9.2000 19:30’], [‘147,40’, ‘73163’]),
 ([‘IBM’,    ‘FWB’, ‘14.9.2000 19:30’], [‘147,20’, ‘41898’]),
 ([‘IBM’,    ‘FWB’, ‘15.9.2000 19:30’], [‘147,50’, ‘23510’])
]
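The four phases of the window operator (partition, order, build the physical window, aggregate) can be retraced with a short sketch over the input queue of this example. The function name window, the flat row layout, and the ISO date strings are assumptions made for illustration; prices are held as Decimal values so that the sums are exact. The sketch stops before the scalar operation that derives the average price.

```python
from decimal import Decimal

# Input queue of the example; dates rewritten as ISO strings so that
# string order equals chronological order (an assumption of this sketch).
rows = [
    ("ORACLE", "FWB", "2000-09-11 19:30", Decimal("98.10"), 23108),
    ("ORACLE", "FWB", "2000-09-12 19:30", Decimal("95.90"), 19657),
    ("ORACLE", "FWB", "2000-09-13 19:30", Decimal("94.00"), 57843),
    ("ORACLE", "FWB", "2000-09-14 19:30", Decimal("99.00"), 40435),
    ("ORACLE", "FWB", "2000-09-15 19:30", Decimal("93.80"), 83129),
    ("IBM", "FWB", "2000-09-11 19:30", Decimal("145.50"), 24184),
    ("IBM", "FWB", "2000-09-12 19:30", Decimal("146.50"), 41416),
    ("IBM", "FWB", "2000-09-13 19:30", Decimal("146.50"), 18388),
    ("IBM", "FWB", "2000-09-14 19:30", Decimal("149.00"), 13359),
    ("IBM", "FWB", "2000-09-15 19:30", Decimal("146.00"), 10151),
]


def window(rows, before, after):
    """Physical window of `before` predecessors and `after` successors."""
    partitions = {}  # phase 1: partition by (wp, hp)
    for wp, hp, zeit, kurs, vol in rows:
        partitions.setdefault((wp, hp), []).append((zeit, kurs, vol))

    result = []
    for (wp, hp), part in partitions.items():
        part.sort(key=lambda t: t[0])  # phase 2: order by zeit ASC
        for i, (zeit, _, _) in enumerate(part):
            # phase 3: window is cut off at the partition boundaries
            win = part[max(i - before, 0): i + after + 1]
            # phase 4: aggregate each window separately
            sum_kurs = sum(k for _, k, _ in win)
            sum_vol = sum(v for _, _, v in win)
            result.append(((wp, hp, zeit), (sum_kurs, len(win), sum_vol)))
    return result


result = window(rows, before=1, after=1)
for ident, info in result[:2]:
    print(ident, info)
```

The first two result tuples carry the sums 194.00 and 288.00 with counts 2 and 3, matching the two worked tuples of the Oracle partition.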
4.7 The Union Operator
The union operator merges two queues with identical schema into one result queue. To guarantee the uniqueness of the identification part in the result here as well, a new attribute is added to it whose value depends on the source from which a tuple originates.
Definition: The union operator $union(q^1_{in}, q^2_{in}, (ID, s_1, s_2))$ with $s_1 \ne s_2$ produces from the operands $q^1_{in} = [([id^1_1, \ldots, id^1_n], [c^1_1, \ldots, c^1_m])]$ and $q^2_{in} = [([id^2_1, \ldots, id^2_n], [c^2_1, \ldots, c^2_m])]$ with the common schema $Q = ([ID_1, \ldots, ID_n], [C_1, \ldots, C_m])$ the result
$q_{out} = [\, ([id, s_1], c) \mid (id, c) \in q^1_{in} \,] \cup [\, ([id, s_2], c) \mid (id, c) \in q^2_{in} \,]$.
The schema of the result is given by $Q_{out} = ([ID_1, \ldots, ID_n, ID], C)$.
The value of the attribute ID indicates the source from which a result tuple originates; following [SmSm77], it is therefore called a discriminating attribute.
Suppose there are two queues $q^1$ and $q^2$ with identical schema, where the first contains stock prices and the second prices of investment funds. To allow other operators to process them together, they are combined into one queue with the union operator union(q^1, q^2, ((typ, string), ′A′, ′F′)). The common schema is extended by an identification attribute typ of data type string, which is assigned the value A for stocks and the value F for funds.
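The role of the discriminating attribute can be seen in a small sketch. The function name union, the dictionary-based tuples, and the sample values are assumptions for illustration; the two input tuples deliberately share the same identification to show that the added attribute keeps them apart.

```python
def union(q1, q2, disc_name, s1, s2):
    """Union of two queues with identical schema.

    Every tuple receives an extra identification attribute `disc_name`
    whose value (s1 or s2) records the operand it came from.
    """
    if s1 == s2:
        raise ValueError("the discriminating values s1 and s2 must differ")
    tagged1 = [({**ident, disc_name: s1}, info) for ident, info in q1]
    tagged2 = [({**ident, disc_name: s2}, info) for ident, info in q2]
    return tagged1 + tagged2


# Hypothetical queues with schema ([wp, hp], [kurs]).
aktien = [({"wp": "X", "hp": "FWB"}, {"kurs": 145.5})]
fonds = [({"wp": "X", "hp": "FWB"}, {"kurs": 12.3})]

merged = union(aktien, fonds, "typ", "A", "F")
print(merged)
```

Without the attribute typ, both result tuples would carry the identification (X, FWB) and the identification part would no longer be unique.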
4.8 The Join Operator
As with the union, the join merges two queues into one. Here, however, a result tuple is formed from one tuple of each operand. The join condition specifies which tuples are combined. It has the same structure as the selection condition (Section 4.1), except that the value is replaced by the name of an identification attribute of the second operand. The only comparison operator permitted is ‘=’. Two tuples are joined if and only if they satisfy the condition.
To combine them, the identification attributes and the information attributes of both tuples are each concatenated. The new tuple is again uniquely identified within the result by this composite identification.
Definition: The join operator $join(q^1_{in}, q^2_{in}, Bed(id^1, id^2))$ produces from the operands $q^1_{in} = [([id^1_1, \ldots, id^1_{\tilde{n}}], [c^1_1, \ldots, c^1_{\tilde{m}}])]$ and $q^2_{in} = [([id^2_1, \ldots, id^2_{\hat{n}}], [c^2_1, \ldots, c^2_{\hat{m}}])]$ the result
$q_{out} = [\, ([id^1_1, \ldots, id^1_{\tilde{n}}, id^2_1, \ldots, id^2_{\hat{n}}], [c^1_1, \ldots, c^1_{\tilde{m}}, c^2_1, \ldots, c^2_{\hat{m}}]) \mid Bed(id^1, id^2) \,]$.
For the condition Bed to be evaluable, and thus for the join to be feasible, the attributes compared in it must be of the same type. If attributes of the identification parts have the same name, one of them must be renamed to keep attribute names unique in the result. A convention must be established for this, e.g. the attribute name of the second operand can be extended by the suffix ‘2’. Identical attribute names in the information parts of the join partners are not permitted.
To assign stocks to the indices whose calculation they enter into, the information channel described in Section 2 with the schema ([wp, idx], [beginn]) is used; it represents the M:N relationship between stocks and indices. To determine all stocks of the DAX, a join with the queue of the running example, which contains the stock values, is needed first. The join condition is wp=wp. As its result, the operator delivers a queue with the following schema:
([wp, hp, zeit, wp2, idx], [kurs, vol, beginn])
A selection by index can then be applied to this queue. Since both operands have an attribute wp, that of the second is renamed to wp2 in the result, as described above.
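The join with its renaming convention can be sketched as a nested loop. The function name join, the tuple representation, and all sample values (securities, index names, dates) are hypothetical and serve only to reproduce the result schema of the example.

```python
def join(q1, q2, condition):
    """Equi-join of two queues of (identification dict, information dict).

    condition -- list of (id attribute of q1, id attribute of q2) pairs,
                 all compared with '=' as required by the text
    """
    result = []
    for id1, info1 in q1:
        for id2, info2 in q2:
            if set(info1) & set(info2):
                raise ValueError("equal names in the information parts")
            if all(id1[a] == id2[b] for a, b in condition):
                ident = dict(id1)
                for name, value in id2.items():
                    # Convention from the text: suffix '2' on a name clash.
                    ident[name + "2" if name in id1 else name] = value
                result.append((ident, {**info1, **info2}))
    return result


# Hypothetical data; schemas ([wp, hp, zeit], [kurs, vol]) and ([wp, idx], [beginn]).
aktien = [
    ({"wp": "S1", "hp": "FWB", "zeit": "t1"}, {"kurs": 100.0, "vol": 10}),
    ({"wp": "S2", "hp": "FWB", "zeit": "t1"}, {"kurs": 145.5, "vol": 20}),
]
indizes = [
    ({"wp": "S1", "idx": "DAX"}, {"beginn": "d1"}),
    ({"wp": "S2", "idx": "OTHER"}, {"beginn": "d2"}),
]

joined = join(aktien, indizes, [("wp", "wp")])
dax = [t for t in joined if t[0]["idx"] == "DAX"]  # subsequent selection
print(list(joined[0][0]), list(joined[0][1]))
```

A nested loop is chosen here for readability only; it makes no claim about how the join is evaluated in the system described.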
5 Modeling Producers
Producers are those components of the subscription system that introduce new data. The act of introducing data is called publishing, and the data introduced is called a publication. A publication is always made into an information channel; the system is therefore channel-based.
Information channel
Information channels serve to structure the data within the publish/subscribe system. The data published into a channel must therefore conform to a common schema, the schema of the channel.
Definition: An information channel (‘channel’) K is identified by a unique name N and described by the schema of a queue Q. It is written as K = (N, Q). Q is also called the schema of the channel.
Each producer is assigned to exactly one information channel. From the producer's point of view, the schema of this channel is called the export schema because it determines the form of the published data.
Normally each producer has its own information channel, which gives a 1:1 relation between producers and channels. Under certain circumstances, however, it is useful for several producers to share a channel. For example, separate producers could exist for different stock exchanges and publish their data in a common schema into a single channel. Applications can also be found for the opposite case: a producer could publish the price of the IBM stock into a general stock channel, but also into a channel with DAX stocks and into one that contains every kind of information about the company IBM. Physically this is a single producer; at the logical level, however, it appears as separate producers, so the 1:1 relation is preserved.
The state of a producer is given by its current data that is relevant to the overall system. Within the subscription system, it is reflected in the state of the corresponding information channel.
Definition: The state of an information channel at a time t is a queue that contains all tuples of the channel valid at that time. A tuple is valid if it is the tuple most recently published into this channel with a given identification. A tuple becomes invalid when all attributes of its information part are null.
The state of a channel thus contains, for each identification, the most recently published tuple. Publishing a tuple whose information part is empty (all attributes null) removes it from the state. This is why at least one information attribute is required (Section 3.1); otherwise tuples could not be deleted.
Publication
A publication serves to inform the system about a state change of the producer. It contains the name of the publication channel and a queue. The tuples of the queue must contain enough information to bring the state of the information channel in line with that of the producer.
Definition: A publication P is given by a tuple P = (N, q). N is the name of the information channel into which the publication is made, and q is an instance of the associated channel schema that contains the data to be published.
To update the state of the information channel, the publication is merged with it as follows: if the state contains a tuple with the identification of a published tuple, that tuple is replaced by the new one. If it contains no such tuple, the state is extended by the new tuple, which is appended to the end of the corresponding queue. If all information attributes of a published tuple are null, the corresponding tuple of the state becomes invalid and is removed from it.
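The three merge rules (replace, append, remove) can be written down compactly. The sketch below is an illustration under assumptions: the function name apply_publication and the tuple layout are invented for this purpose, null is modeled as Python's None, and the tuples are those of the simplified stock-price channel ([Wertpapier, Handelsplatz], [Kurs]) used in this section.

```python
def apply_publication(state, publication):
    """Merge a publication into a channel state.

    Both arguments are lists of (identification tuple, information tuple).
    The state is modified in place and returned.
    """
    for ident, info in publication:
        index = next(
            (i for i, (sid, _) in enumerate(state) if sid == ident), None
        )
        if all(value is None for value in info):
            if index is not None:
                del state[index]          # all-null information: remove
        elif index is not None:
            state[index] = (ident, info)  # known identification: replace
        else:
            state.append((ident, info))   # new identification: append
    return state


state = [
    (("ORACLE", "FWB"), ("96,70",)),
    (("IBM", "BER"), ("145,30",)),
    (("IBM", "STU"), ("144,20",)),
]

# A changed price replaces the tuple with the same identification.
apply_publication(state, [(("IBM", "BER"), ("146,50",))])
# A tuple whose information part is entirely null deletes the entry.
apply_publication(state, [(("ORACLE", "FWB"), (None,))])
print(state)
```

The linear search keeps the sketch short; an index keyed by identification would be the obvious refinement for large states.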
Producers
A producer is characterized by the information channel into which it publishes and by the timeliness it guarantees for the data it publishes (‘Quality of Service’, QoS). Timeliness is defined as the maximum time span between a datum becoming valid and its publication. For a so-called synchronous producer (e.g. a database with triggers), this interval can be zero because changes are detected immediately; for an asynchronous producer (e.g. of web pages), it is larger, depending on the polling interval of the data source.
A further characteristic of a producer is the way in which it reports state changes to the subscription system. A snapshot producer reports a change of its state from Z to Z' by publishing its new state Z'. A delta producer exploits the fact that the system knows the old state and transmits only the data changes. The subscription system can derive the new state Z' from the old state Z if the added, deleted, and changed tuples are known. These are published as described above.
Example of the ways of transmitting state changes
Let the information channel for stock prices have the following schema (simplified compared with the example from Chapter 1):
([Wertpapier, Handelsplatz], [Kurs])
Suppose that the following tuples have already been published into the information channel in preceding publications:
[([‘ORACLE’, ‘FWB’], [‘ 96,70’]),
 ([‘IBM’,    ‘BER’], [‘145,30’]),
 ([‘IBM’,    ‘STU’], [‘144,20’])
]
Subsequently, the price of the IBM stock in Berlin rises to ‘146,50’, i.e. the state of the producer changes. If the producer is a snapshot producer, it must publish its new state, that is, the following tuples:
[([‘ORACLE’, ‘FWB’], [‘ 96,70’]),
 ([‘IBM’,    ‘BER’], [‘146,50’]),
 ([‘IBM’,    ‘STU’], [‘144,20’])
]
If it is a delta producer, it publishes only the changed tuple
[([‘IBM’, ‘BER’], [‘146,50’])]
The publish/subscribe system derives the new state from the publication and the old state by replacing the existing tuple with the identification [‘IBM’, ‘BER’]. If the price of the IBM stock in Frankfurt now becomes available as well, a snapshot producer must from then on publish four tuples, whereas the delta producer again publishes only one tuple with the new price information.
The state of a producer, and thus that of an information channel, can grow large (especially when historical data is retained). It therefore does not always make sense to duplicate it in the subscription system. Instead, the system is given the option of accessing the state of the producer on demand through a suitable query.
Definition: If a producer allows the subscription system to access its state, it is called queryable, otherwise non-queryable. An information channel is called queryable if either the associated producer is queryable or its entire state is duplicated in the system.
Now that all components that describe a producer have been presented, the producer itself can be defined.
Definition: A producer R introduces data into the subscription system in the form of publications. It is described by the tuple R = (N, QoS, Z, A), where N is the name of the information channel into which it publishes and QoS is the maximum age of the published data. Z ∈ {snapshot, delta} specifies how state changes are transmitted, and A ∈ {true, false} specifies whether the producer is queryable.
The transmission mode and the queryability of a producer are orthogonal properties, i.e. they can occur in any combination (Figure 5.1). Typically, however, data sources whose state comprises very many tuples (e.g. a data warehouse) are implemented as queryable delta producers. Simple data sources whose state contains only a few tuples and changes relatively rarely (e.g. web pages) are, by contrast, often implemented as snapshot producers.
For queryable producers, a distinction is made between tight and loose coupling with the publish/subscribe system. With tight coupling, the system can directly access the state of the producer, which is available as a relation in a database system. With loose coupling, it must query the state from the producer by means of a query language.
[Fig. 5.1: Classification of producers. Axes: transmission mode (delta, snapshot) and queryability (queryable, non-queryable). Delta, queryable: state comprises many tuples, e.g. data warehouse. Delta, non-queryable: state comprises many tuples, e.g. stock prices with timestamp but without access to historical data. Snapshot, queryable: state comprises few tuples, e.g. web page with query facility. Snapshot, non-queryable: state comprises few tuples, e.g. web page with current stock prices.]
If a channel schema contains a timestamp as an identification attribute, this indicates a data source comprising many tuples, because historical data in the state is not overwritten by more recent data. In principle, a delta producer is advisable whenever the state of the data source comprises many tuples; if it contains only a few tuples, a snapshot producer is also conceivable.
If the subscription system is not responsible for persisting the state of an information channel, it discards publications once all structures have been updated and the data is no longer needed. As a consequence, a subscription that asks for the state of a channel as of the time the subscription was submitted cannot be answered ‘from within the system’ for a non-queryable information channel, not even with a snapshot producer. For evaluation, either the producer must be queried or the next publication must be awaited. The queryability of an information channel thus affects whether various classes of subscriptions can be executed. The corresponding dependencies are discussed in the next section in connection with the definition of subscriptions.
6 Modeling Subscriptions
Subscribers represent the consumers in a subscription system. Information flows from the producers through the subscription system to the subscribers. A subscription is a function that is evaluated repeatedly, at intervals, over data that may change. Boundary conditions control the time interval in which an evaluation is to take place. This section defines how such a subscription is structured. Subscriptions are then classified according to various criteria. The class of realizable subscriptions is further classified by several characteristics.
6.1 Structure of a Subscription
The body, or core, of a subscription is a query for data. Boundary conditions specify the period during which evaluation is desired and when it is to take place. In addition, a subscription contains a quality condition and a recipient. These components are discussed in detail below.
Body
The body of a subscription describes the data requested by the user in the form of an operator tree built from the operators defined in Section 4. The result of the operator at the root is the queue that is delivered to the user. The leaves of such a tree supply the data that is propagated through the tree. These can be either constants or resources.
A constant is a queue with only one tuple. It is specified by a schema and a tuple compatible with it. A resource represents an information channel. Each time data is published into the channel, it flows through the operator tree and ultimately changes the result of the topmost node. In addition to the name of the information channel, each resource contains a flag that indicates whether it is to be initialized. If so, a first result of the root operator is determined from the current state of the information channel, which, however, requires the channel to be queryable. If no initialization takes place, the current state is assumed to be empty. An initialization may then still take place on the basis of other resources of the operator tree.
Example of a subscription
The operator tree in Figure 6.1 shows the body of a subscription. Each edge is labeled with the schema of the information flowing along it; data types are omitted.
As its result, the operator tree delivers, for all stocks of the DAX, the average price of the last three trading days and the total number of shares traded in this period. To this end, the two information channels for stocks and indices are joined first. This produces a new queue that contains the index of the security for each price entry. The join gives rise to a name conflict, which is resolved by renaming an attribute. The next operation is a selection that filters the stocks belonging to the DAX out of the queue. The window operator then forms the moving sums and counts. The final scalar operation computes the average price. For a security, a trading venue, and a point in time, the result contains the moving average price and the moving total volume of the corresponding stock at the given time.
[Fig. 6.1: Example of the body of a subscription. Operator tree, from the leaves to the root:
 Resources: Aktien ([wp, hp, zeit], [kurs, vol]) and Indizes ([wp, idx], [beginn])
 Verbund (join): wp=wp
 Selektion (selection): idx=‘DAX’
 Window: partitioning wp, hp; ordering zeit ASC; window p, [-3, 0]; sumKurs=sum(kurs), countKurs=count(kurs), sumVol=sum(vol); output schema ([wp, hp, zeit], [sumKurs, countKurs, sumVol])
 Skalar (scalar): avgKurs=div(sumKurs, countKurs), sumVol=id(sumVol); output schema ([wp, hp, zeit], [avgKurs, sumVol])]
Control conditions
To control the evaluation of the body, there are a start condition, a stop condition, and a delivery condition. A new subscription is initially inactive. Once the start condition is satisfied, it is activated, which means that the user receives results from this point on. Delivery ceases and the subscription is removed from the system as soon as the stop condition becomes true. The delivery condition controls when and how often the user is notified. If it is satisfied and the subscription is active, the body is evaluated and the result is delivered to the user. The evaluation of the condition and of the body is transactionally protected, i.e. both operate on the same data set. In particular, a publication that arrives in the system between the two evaluation steps is not taken into account.
Quality condition
The quality condition reflects the user's requirement on the timeliness of the data referenced in the subscription.
Definition: The quality condition of a subscription is defined as a time span ∆t that may elapse at most between a datum becoming valid and its effect on the subscription.
This parameter corresponds directly to the timeliness of the publications of a producer. If all sources required by the body and the conditions deliver data with at least the timeliness demanded here, the subscription is accepted. If this is not the case for even one producer, the system rejects it.
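The acceptance rule amounts to comparing the required time span with the QoS of every referenced producer. The following sketch assumes a simple lookup table from channel name to the producer's maximum publication delay; the function name is_accepted, the channel names, and the delay values are hypothetical.

```python
from datetime import timedelta


def is_accepted(required, producer_qos, referenced_channels):
    """Accept a subscription only if every referenced source is timely enough.

    required            -- quality condition (maximum tolerated delay)
    producer_qos        -- channel name -> maximum delay of its producer
    referenced_channels -- channels used by the body and the conditions
    """
    return all(producer_qos[ch] <= required for ch in referenced_channels)


# Hypothetical producers: a synchronous one and one polled once per hour.
producer_qos = {
    "Aktien": timedelta(0),
    "Indizes": timedelta(hours=1),
}

strict = is_accepted(timedelta(minutes=15), producer_qos, ["Aktien", "Indizes"])
relaxed = is_accepted(timedelta(hours=2), producer_qos, ["Aktien", "Indizes"])
print(strict, relaxed)
```

A single producer that is too slow is enough to reject the subscription, which is why the check uses all() over the referenced channels.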
Recipient
This attribute determines the medium through which results are delivered and specifies the address of the user.
Definition: A recipient E is defined by a transmission medium M and an address Adr specific to it: E = (M, Adr).
There are no restrictions on the transmission medium. The spectrum ranges from e-mail, fax, and SMS through provision on a web server or upload via FTP to transmission over a network connection for direct electronic post-processing or further processing by a program of the user.
Subscription
Now that all individual components have been described, the subscription itself can be stated formally.
Definition: A subscription S is defined by the tuple S = (R, Start, Stop, A, QoS, E). R is the body, Start the start condition, Stop the stop condition, and A the delivery condition. QoS specifies the required minimum timeliness of the publications into referenced channels, and E is the recipient.
An important special case of this general subscription is the synchronous subscription. It is characterized by the fact that its body is evaluated immediately after its arrival, the result is sent once, and the subscription is then removed.
Definition: A synchronous subscription is a subscription whose start, stop, and delivery conditions are constantly ‘true’.
In contrast to the (asynchronous) subscription, data is delivered immediately in the synchronous case. It is therefore suitable for querying the state of an information channel and is of considerable importance for the internal workings of the PubScribe system ([RRHL01]).
6.2 Classification of Subscriptions
Realizable subscriptions can be classified by the data they require. Ex-tunc subscriptions access data published in the past, in general the entire state of an information channel. This access takes place through the initial evaluation of all resources used and strictly requires the associated information channels to be queryable. Ex-nunc subscriptions, by contrast, access only data published after the time of their submission. All of their resources forgo initialization; the state of the referenced channels need not be available.
Figure 6.2 shows schematically the data required by different subscription types. The solid vertical line marks the time at which a subscription is submitted, and the dashed lines mark the publication times. The top bar represents the ex-tunc subscription in its most general form. All data of the past and of the future can be used, provided, of course, that it is contained in the state of the information channel at evaluation time and is available through queryable channels. The second bar shows the ex-nunc subscription. Only data published after the submission time is used; in this case no queryable information channels, and hence no queryable producers, are required. The third bar represents a special case of the ex-tunc subscription and therefore requires queryable information channels. Here, however, only data published before the submission time is used. What is special about this case is that the result does not change over time, even under repeated evaluation.
A further classification of subscriptions, orthogonal to the division into ex-tunc and ex-nunc, can be made on the basis of the data to be delivered. If a subscription delivers as its result the evaluation of the query on the current state of the data sources, it is called complete. If, by contrast, it delivers only those tuples that have changed compared with the last delivery, it is called partial.
Suppose, for example, that a subscription refers to the information channels of the running example and selects the prices of the IBM stock. After the publication of the first three tuples of the example, it delivers, as expected, the following queue:
[([‘IBM’, ‘BER’], [‘145,30’]),
 ([‘IBM’, ‘STU’], [‘144,20’])
]
If the subscription requests complete delivery of the results, two tuples are again delivered after the publication of the new stock price, this time including the tuple with the updated price of the Berlin stock exchange:
[([‘IBM’, ‘BER’], [‘146,50’]),
([‘IBM’, ‘STU’], [‘144,20’])
]
With incremental (partial) evaluation, the differences between the last delivery and the current evaluation result are determined, and only the changed tuple is sent as the result:
[([‘IBM’, ‘BER’], [‘146,50’])]
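The difference between complete and partial delivery can be sketched in a few lines of Python. This is only an illustration, not the PubScribe implementation: the queue is modelled as a dictionary from the identification part to the information part, the function names are invented here, and deleted tuples are ignored.

```python
# Illustrative sketch: complete vs. partial delivery of a subscription result.
# A queue is modelled as a dict: identification part -> information part.

def complete_delivery(current):
    """Return the full evaluation result on the current state."""
    return dict(current)

def partial_delivery(previous, current):
    """Return only tuples that are new or changed since the last delivery."""
    return {key: value for key, value in current.items()
            if previous.get(key) != value}

# State after the first three publications of the running example.
first = {("IBM", "BER"): ("145,30",), ("IBM", "STU"): ("144,20",)}
# State after the new price from the Berlin stock exchange was published.
second = {("IBM", "BER"): ("146,50",), ("IBM", "STU"): ("144,20",)}

complete = complete_delivery(second)
partial = partial_delivery(first, second)
print(complete)
print(partial)
```

The partial variant needs the previous delivery as additional input, which is why a system offering it has to keep the last delivered result per subscription.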
Fig. 6.2: Data required by different subscription types (time axis with the publication times and the registration time of the subscription; bars for general ex-tunc, general ex-nunc, and special ex-tunc)
The notions of complete and partial evaluation are tied to operator trees and can therefore be specified not only for the body but also for the conditions. Since the body is the central component of a subscription, the subscription is said to be evaluated completely or partially if its body has the corresponding property.
7 Summary
This contribution describes and defines the theoretical structures underlying the PubScribe system. In a first step, the queue is presented as the structure for data management, together with a set of operators for modifying and combining queues. Building on this, information channels and their state are defined. This is followed by the definition of a producer and its classification with respect to queryability and the kind of publications.
References
ASS+99 Aguilera, M.; Strom, R.; Sturman, D.; Astley, M.; Chandra, T.: Matching Events in a Content-Based Subscription System. In: Proceedings of the 18th ACM Symposium on Principles of Distributed Computing (PODC '99, Atlanta (GA), USA, May 3-6), 1999, pp. 53-61
CaRW00 Carzaniga, A.; Rosenblum, D.S.; Wolf, A.L.: Achieving Scalability and Expressiveness in an Internet-Scale Event Notification Service. In: Proceedings of the 19th ACM Symposium on Principles of Distributed Computing (PODC 2000, Portland (OR), USA, July 16-19), 2000, pp. 219-227
RRHL01 Redert, M.; Reinhard, C.; Hümmer, W.; Lehner, W.: Ausgewählte Konzepte der Realisierung von PubScribe. In: Lehner, W. (Ed.): Advanced Techniques in Personalized Information Delivery. Arbeitsbericht des Instituts für Informatik, Friedrich-Alexander-Universität Erlangen-Nürnberg, 2001
LeHü01 Lehner, W.; Hümmer, W.: The Revolution Ahead: Publish/Subscribe meets Database Systems. In:
SAB+00 Segall, B.; Arnold, D.; Boot, J.; Henderson, M.; Phelps, T.: Content Based Routing with Elvin4. In: Proceedings of the AUUG 2000 Technical Conference (AUUG 2000, Canberra, Australia, June 25-30), 2000
SmSm77 Smith, J.M.; Smith, D.C.P.: Database Abstractions: Aggregation and Generalization. In: ACM Transactions on Database Systems (TODS, Vol. 2, No. 2), 1977, pp. 105-133
SQL99 N.N.: ISO/IEC 9075: Information technology - Database Languages - SQL, 1999
W3C99 N.N.: Extensible Markup Language (XML) 1.0. World Wide Web Consortium, 1999
AUSGEWÄHLTE KONZEPTE DER
REALISIERUNG VON PUBSCRIBE
M. Redert, C. Reinhard, W. Hümmer, W. Lehner
{mlredert, cnreinha, lehner, huemmer}@immd6.informatik.uni-erlangen.de
Abstract
Modeling aspects always require an adequate realization. This contribution takes up a variety of aspects of the realization of PubScribe and discusses them in detail. This is done in three steps. The first step describes how an incoming subscription modifies the global query graph and implies restructurings. The second step explains which dynamic processes take place when a producer is registered and a produced message is accepted. The main part of this contribution is devoted to multi-query optimization and restructuring. The basic procedures in this context, the necessary prerequisites, and the generation of compensations are presented. Finally, the technique in particular and the realization in general are illustrated by evaluation studies.
1 Introduction
This contribution explains the dynamic processes and realization aspects of the PubScribe system. The use case diagram ([Fowl00]) in Figure 1.1 gives an overview of the typical tasks of the system. As can be seen there, three so-called actors interact with the system: subscribers, publishers, and administrators. They use it by initiating the execution of use cases. The lines in the figure indicate through which use cases an actor communicates with the system. Directed edges between use cases indicate that one task is part of another. For example, deleting a subscription requires deleting at least one operator tree. The most important use cases are worked through and presented in detail in the remainder of this contribution. First, the registration of a subscription is covered. It is followed by a description of producer registration and of the processing of an incoming publication. Section 4 then discusses the multi-query optimization techniques used in PubScribe. The contribution closes with an evaluation of the PubScribe implementation, focusing in particular on multi-query optimization.
2 Registration of a Subscription
A subscription arrives at the subscription handler in the form of an XML document. It is first validated against the DTD. Since correctness with respect to the DTD does not yet guarantee executability, the latter is checked in a second step by the components of the broker. If a subscription is executable, a unique identifier is returned. With it, the user can modify the subscription or remove it from the system prematurely. Finally, the registration data are published to the meta channel and the subscription is forwarded to the Core Broker. There, flow control takes charge of it. The life cycle of a subscription, from its setup through the delivery of results to its removal, is complex and is therefore explained in more detail in the following sections.
Fig. 1.1: Use case diagram of the ways the PubScribe system is used (actors: Subscriber, Publisher, Administrator; use cases of the Broker: delete subscription, register subscription, control flow, delete operator tree, insert operator tree, evaluate operator tree, restore state, optimize operator tree, register publisher, publish, initial evaluation, start system, shut down system)
The insertion of an operator tree into the system, described in this section, is a central function of the Core Broker. Its goal is to represent a tree described by an XML document in such a way that it can be evaluated when publications arrive. To this end, the operator tree is optimized in a multi-stage process and finally mapped onto interdependent views and tables in a database. The individual stages required for this are shown in the detailed view of the Core Broker in Figure 2.1. An operator tree passes in turn through the Query Restructuring Unit (QRU), the Data Processing Unit (DPU), and the Operator Clustering Unit (OCU), until it is finally mapped by the Staging Area (STA) onto relational structures in a database system.
Fig. 2.1: Detailed view of the Core Broker component with adjacent modules (inside the Core Broker: flow control, QRU, DPU, OCU, STA; adjacent: Publisher Handler, RDBS, and Deliverer with the ports Mail, SMS, Web, ...)
A more complex example from the stock market scenario serves to illustrate how the individual optimization layers described below work. The user is informed about those DAX stocks for which the 200-day average lies below the 30-day average of the daily closing prices (at 19:30). This information is of great importance in technical stock analysis, since such a situation calls for particular attention. Starting on 1.10.2000, the notification is to be sent every weekday at 20:00. The body and the delivery condition of the subscription are shown in Figure 2.2, the start condition in Figure 2.3.
Fig. 2.2: Unoptimized operator tree (a switch connects the body, with result schema ([wp, hp], [avg30Kurs, avg200Kurs]), to the delivery condition, with result schema ([], [bedingungOK]). The body joins the channels Aktien ([wp, hp, zeit], [kurs, vol]) and Indizes on wp=wp in two branches, selects idx=’DAX’ and uhrzeit=’19:30’, computes avg30Kurs and avg200Kurs via window operators over [-30, 0] and [-200, 0], joins both branches on wp=wp AND hp=hp, and selects kursInvers=’true’. The condition reads the resource Zeit ([], [zeit]), selects uhrzeit=‘20:00’ AND dow<>‘Samstag’ AND dow<>‘Sonntag’, and computes bedingungOK=greater(anzahl, 0))
2.1 The "Switch" Meta-Operation
A switch (Schalter) forms the interface between operator trees and flow control. Logically it is an element of flow control that is represented as an operator. It takes over that part of controlling a subscription which involves neither inserting or deleting operator trees nor a delivery. Flow control becomes involved only when one of these operations is required.
The switch is a binary meta-operator whose left operand is the body of a subscription and whose right operand is the operator tree of a condition. If the condition is satisfied, the body is evaluated and passed to flow control as the result of the operator; if not, the left subtree is not evaluated and nothing is reported ‘upwards’. Figure 2.2 shows the body and the delivery condition of the example above, connected by a switch. Since the evaluation of the condition controls that of the body, the right subtree of a switch is called active and the left one passive.
To obtain a uniform representation of operator trees, start and stop conditions also communicate with flow control through a switch. For this purpose, the constant ‘wahr’ (true) is used as the left operand of the switch. Figure 2.3 shows the operator tree of the start condition extended in this way. When it is integrated into the system, the operator tree extended by a switch passes through several optimization stages whose goal is to map it onto structures of a relational database. They are described in the following sections.
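The behaviour of the switch can be illustrated with a small Python sketch. It is not taken from PubScribe; the class name `Switch`, the callables, and the `state` dictionary are assumptions made for illustration. The point is only that the passive left operand (the body) is evaluated solely when the active right operand (the condition) holds.

```python
# Illustrative sketch of the switch meta-operator (names are hypothetical).

class Switch:
    def __init__(self, body, condition):
        self.body = body            # passive left operand
        self.condition = condition  # active right operand

    def evaluate(self):
        """Return the body's result if the condition holds, else None."""
        if not self.condition():
            return None             # body is not evaluated, nothing is reported
        return self.body()

calls = {"body": 0}

def body():
    calls["body"] += 1              # count how often the body is evaluated
    return [(("IBM", "BER"), ("146,50",))]

state = {"uhrzeit": "19:45"}
switch = Switch(body, lambda: state["uhrzeit"] == "20:00")

early = switch.evaluate()           # condition false: body untouched
state["uhrzeit"] = "20:00"
on_time = switch.evaluate()         # condition true: body evaluated once

# Start and stop conditions use the constant 'wahr' as the left operand.
start_switch = Switch(lambda: [("wahr",)], lambda: True)
print(early, on_time, calls["body"], start_switch.evaluate())
```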
Fig. 2.3: Operator tree of the start condition (a switch with the constant ‘wahr’, schema ([], [wert]), as left operand and, as right operand, the scalar operator bedingungOK=greater(zeit, ‘1.10.2000 12:00’) over the resource Zeit ([], [zeit]))
2.2 Query Restructuring Unit
The Query Restructuring Unit receives from flow control an XML document representing an extended operator tree and performs query optimization at the algebraic level. This works as in relational database systems. The basic idea is to execute highly selective operators as early as possible so that each operator has to process only a small number of tuples. The restructuring of the operator tree required for this must of course preserve the semantics. The measures to be taken can become arbitrarily complex if heuristics are used to estimate the number of tuples in an intermediate result. However, some generally valid statements about sensible restructuring can be made. It is advisable to perform selections and groupings as early as possible, i.e. to move them towards the leaves of the tree, since they can only reduce the number of tuples. Unions and especially joins can greatly increase the number of tuples in the result and should therefore be executed as late as possible; they are moved towards the root of the operator tree. Since this optimization technique is common practice in all relevant database systems, it is not discussed further here. Details can be found in [ElNa99].
As its result, the QRU delivers the description of the optimized operator tree as an XML document, which is passed on to the next optimization layer.
Restructuring of an operator tree by the QRU
Figure 2.4 shows the operator tree from Figure 2.2 after its restructuring by the Query Restructuring Unit. The modifications are indicated by arrows.
First, the selection on the index DAX is moved into the right branch of the lower join. This is possible because the selection condition refers only to attributes of the channel Indizes. The number of tuples entering the join is thus reduced. The scalar operator, the shift operator, and the subsequent selection are pulled into the left branch of the join for the same reason, with their order remaining unchanged. Since the subtrees of the upper join are identical except for the window operator, these restructurings can be carried out in both of its subtrees alike. No optimizations are possible in the upper part of the body or in the condition, because successor operators always depend on the results of their direct predecessors.
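Moving a selection below a join can be sketched as follows. This is a simplified illustration, not the QRU itself: the node classes, the `used` attribute set, and the rule "push a selection into the join branch that provides all attributes it uses" are assumptions chosen to mirror the idx=’DAX’ case described above.

```python
# Illustrative sketch of selection push-down (all names are hypothetical).
from dataclasses import dataclass
from typing import FrozenSet, Union

@dataclass(frozen=True)
class Source:
    name: str
    attrs: FrozenSet[str]

@dataclass(frozen=True)
class Select:
    condition: str
    used: FrozenSet[str]      # attributes referenced by the condition
    child: "Node"

@dataclass(frozen=True)
class Join:
    left: "Node"
    right: "Node"

Node = Union[Source, Select, Join]

def attrs(node):
    """Attributes available in the output of a node."""
    if isinstance(node, Source):
        return node.attrs
    if isinstance(node, Select):
        return attrs(node.child)
    return attrs(node.left) | attrs(node.right)

def push_down(node):
    """Move selections towards the leaves where one join branch suffices."""
    if isinstance(node, Source):
        return node
    if isinstance(node, Join):
        return Join(push_down(node.left), push_down(node.right))
    child = push_down(node.child)
    if isinstance(child, Join):
        if node.used <= attrs(child.left):
            return Join(push_down(Select(node.condition, node.used, child.left)), child.right)
        if node.used <= attrs(child.right):
            return Join(child.left, push_down(Select(node.condition, node.used, child.right)))
    return Select(node.condition, node.used, child)

aktien = Source("Aktien", frozenset({"wp", "hp", "zeit", "kurs", "vol"}))
indizes = Source("Indizes", frozenset({"wp2", "idx"}))
before = Select("idx='DAX'", frozenset({"idx"}), Join(aktien, indizes))
after = push_down(before)
print(after)
```

Because the condition only uses `idx`, the selection ends up directly above Indizes in the right branch of the join, so fewer tuples enter the join.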
Fig. 2.4: Restructured operator tree (in both branches of the upper join, the selection idx=’DAX’ now sits directly above Indizes, and the scalar operator kurs=id(kurs), uhrzeit=time(zeit), the shift operator on uhrzeit, and the selection uhrzeit=’19:30’ sit directly above Aktien, below the join wp=wp; the upper part of the body and the condition are unchanged)
2.3 Data Processing Unit
The next optimization step takes place in the Data Processing Unit. From the textual description of the extended operator tree, a data structure is created that allows the tree to be handled easily in the program. Operators are represented by instances of different classes, and operator-operand relationships are represented by references. From a logical point of view this structure is a directed tree whose nodes are operators connected by edges. The direction of the edges is determined by the relation ‘is operand of’; they thus run from the leaves to the root.
If subscriptions are created, for example, by parameterizing a template, identical or largely equal operator trees result. To a lesser extent this also holds when several subscriptions refer to the same data sources, since different users are often interested in similar information. For optimized evaluation, the DPU therefore detects overlaps of the new operator tree with existing ones, so that the corresponding parts can be shared. Different operator trees are thus not managed separately but together form a directed acyclic graph (DAG). This optimization has the advantage that fewer operator nodes exist, which at the database level leads to a reduced number of views and thus saves resources.
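One common way to detect identical subtrees is to intern every node under a canonical key, so that equal subtrees map to the same node. The sketch below uses this technique as a stand-in; the source does not state how the DPU detects overlaps, and the class `OperatorGraph` and the tuple encoding of trees are assumptions.

```python
# Illustrative sketch: sharing identical subtrees so that operator trees form a DAG.
# A tree is a nested tuple (operator, parameter, child, child, ...).

class OperatorGraph:
    def __init__(self):
        self._ids = {}  # canonical key -> node id

    def insert(self, tree):
        """Insert a tree bottom-up and return the id of its root node."""
        op, params, *children = tree
        child_ids = tuple(self.insert(child) for child in children)
        key = (op, params, child_ids)
        if key not in self._ids:          # only create nodes that do not exist yet
            self._ids[key] = len(self._ids)
        return self._ids[key]

    def __len__(self):
        return len(self._ids)

# Subtree below both window operators of the running example.
shared = ("Join", "wp=wp",
          ("Select", "uhrzeit='19:30'", ("Source", "Aktien")),
          ("Select", "idx='DAX'", ("Source", "Indizes")))
tree = ("Join", "wp=wp AND hp=hp",
        ("Window", "p, [-30, 0]", shared),
        ("Window", "p, [-200, 0]", shared))

graph = OperatorGraph()
root = graph.insert(tree)
print(len(graph))  # number of distinct operator nodes
```

Written out as a tree, the example has 13 nodes; after interning, 8 distinct nodes remain because the subtree below the two window operators is stored once.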
A more far-reaching advantage of sharing subtrees arises in the evaluation of conditions. If several conditions depend on one resource, each of them must be evaluated whenever something is published to the associated information channel. Since each condition is realized at the database level as a set of cascaded views, all of these views must be computed for the evaluation of every condition. If, however, the topmost node of a shared subtree is represented not as a view but as a materialized view, it can be refreshed once when the first condition is evaluated and then provides its result to the subsequent evaluations without additional computation. To minimize evaluation time, the topmost node of every subtree used by at least two conditions should therefore be materialized. As will be shown in more detail later, materializing the predecessors of window operators, groupings, and joins is useful as well (Section ). Because storage space is generally limited, however, materialization is usually not possible for all candidate nodes. The decision which nodes should ideally be selected is also made by the DPU. Designing a suitable procedure for this would, however, have gone well beyond the scope of this work.
As its result, the DPU delivers a set of subtrees of the original operator tree whose leaves are resources, constants, or materialized operators. Their roots are nodes to be materialized. The topmost nodes of each operator tree, which directly follow the switch, are always materialized. Each subtree is passed to the layer below as an XML document.
Optimization of an operator tree by the DPU
Fig. 2.5: Operator tree after the optimization (the two window operators, p, [-30, 0] and p, [-200, 0], now share a single subtree: the join wp=wp over the selection uhrzeit=’19:30’, shift, and scalar operators on Aktien and the selection idx=’DAX’ on Indizes)
In this example it is assumed that no operator trees other than the one from Figure 2.4 have been created. An optimization by merging subtrees can therefore take place only within this tree. It is immediately apparent that the two subtrees below the window operators are identical. They are therefore created only once in the system and used by both operators. The resulting acyclic graph of operators is shown in Figure 2.5. The direction of the edges implicitly runs from bottom to top. The two scalar operators directly below the upper join cannot be merged, although they are identical, because the window operators supply different tuples as input. No optimization is possible in the condition part either, because no second node of the resource Zeit exists. It therefore remains unchanged compared with Figure 2.2.
The nodes marked gray in Figure 2.5 are candidates for materialization. The join can be chosen because it is the topmost node of the shared subtree and the predecessor of window operators. The selections and scalar operators qualify as direct predecessors of a join or a grouping, respectively. In the following it is assumed that only the join is selected for materialization.
Fig. 2.6: Decomposition of the operator tree into materializable subtrees (the shared subtree rooted at the join wp=wp is referenced as Handle1 below both window operators; the switch holds Handle2 for the upper part of the body and Handle3 for the condition)
Thus, as can be seen in the lower part of Figure 2.6, the shared subtree is cut off and forwarded to the processing layer below. That layer creates a materialized view for the root node and returns a reference to it (Handle1). This handle is inserted as a node into the remaining tree in place of the subtree. Handle2 is returned for the upper part of the body and Handle3 for the condition. Both are stored in the switch in order to initiate an evaluation of the trees and to access their results (Section ).
Operator Clustering Unit
From the layer above, the Operator Clustering Unit receives operator trees whose leaves are resources, constants, or handles of already materialized operators. Its task is to decide which subtrees the Staging Area should create as a single view in the database. Mapping each individual operator to a view of its own is possible, but every view consumes system resources and its generation is time-consuming. Creating a single view for the entire operator tree is not very practical either, since the view definitions would be deeply nested and hence very complex. The OCU must therefore find a middle course between these two extremes. It combines only those operators that produce a simple, preferably unnested, select expression.
Consecutive selections can be combined by conjoining their conditions. Likewise, a grouping can be combined with a preceding selection, and a join with a subsequent selection. The sort operator can be combined with any predecessor operator. Only the window operator is in itself already so complex that it is not combined with other operators. Further combinations are conceivable but are not enumerated here.
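The first of these rules, merging consecutive selections by conjunction, can be sketched as follows. The representation of an operator chain as a list of (operator, parameter) pairs ordered from leaf to root, the function name, and the condition `kurs>0` are assumptions made purely for illustration.

```python
# Illustrative sketch: merging consecutive selections into one by conjunction.
# A chain lists (operator, parameter) pairs from the leaf towards the root.

def merge_selections(chain):
    merged = []
    for op, param in chain:
        if op == "Select" and merged and merged[-1][0] == "Select":
            # Two selections in a row: AND their conditions together.
            merged[-1] = ("Select", f"({merged[-1][1]}) AND ({param})")
        else:
            # Any other operator (e.g. Window) starts a new entry.
            merged.append((op, param))
    return merged

chain = [
    ("Select", "idx='DAX'"),
    ("Select", "uhrzeit='19:30'"),
    ("Window", "p, [-30, 0]"),
    ("Select", "kurs>0"),        # hypothetical condition, not from the example
]
clustered = merge_selections(chain)
print(clustered)
```

The window operator stays on its own and also separates the selections before and after it, in line with the rule that it is not combined with other operators.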
The subtrees created by the OCU are forwarded as XML documents to the Staging Area, which triggers the creation of a corresponding view in the database.
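Fig. 2.7: Clusters of the lower subtree from Figure 2.6 (the join wp=wp forms a cluster of its own on top of HandleA and HandleB; one cluster comprises the scalar operator kurs=id(kurs), uhrzeit=time(zeit), the shift operator on uhrzeit, and the selection uhrzeit=’19:30’ over Aktien ([wp, hp, zeit], [kurs, vol]); the other comprises the selection idx=’DAX’ over Indizes)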
Combination of operators by the OCU
Consider first the lowest operator tree of the body from Figure 2.6. As shown in Figure 2.7, all operators in each of the two branches of the join can be bundled. The join itself remains on its own, since otherwise a nested SQL expression would result. As in the decomposition by the DPU, subtrees are replaced by handles here as well. These, however, are not passed on to the DPU; for the DPU, the partitioning of the operator trees is completely transparent.
The decomposition of the upper subtree from Figure 2.6 is shown in Figure 2.8. Here, clusters comprising only a single operator arise as well, since the window and join operators in particular produce complex SQL expressions and are therefore not combined with further operators. The subtrees are passed to the Staging Area from the bottom up, because the returned handles have to be inserted into the clusters above in place of the subtrees. The dependencies given by the handles are indicated by arrows in Figures 2.7 and 2.8. Handles with a numeric index are visible to the DPU; the others exist only within the OCU.
Fig. 2.8: Clusters of the upper subtree from Figure 2.6 (clusters for the grouping with selection, shift, and scalar operators, for the join wp=wp AND hp=hp, for the two scalar operators, and for the two window operators, connected by the handles HandleC to HandleG; both window operators read from Handle1)
Staging Area
The task of the Staging Area is to generate, from the definition of an operator subtree received from the OCU, an SQL statement that creates a corresponding view, and to execute it in the database. The SQL statement is generated starting at the leaves of a tree. If a leaf node is a constant, a table with the given schema is created and the corresponding tuple is inserted. A resource is represented by a staging table, which is created when the corresponding publisher is registered. If a leaf node is a handle, a view delivering the required data already exists. For each inner operator of the tree, an SQL expression is then generated according to its type. As data sources, the FROM clause lists the names of tables (for resources and constants) and views (for handles), or nested SQL expressions (for non-leaf nodes). In this way the operator tree yields an SQL expression that reflects the tree structure through expressions nested in the FROM clause. The parameterization of the operators is reflected in the other clauses. For example, the selection condition is represented by the WHERE clause, the grouping by the GROUP BY clause, and aggregate functions appear in the SELECT clause. Scalar functions are also specified in the SELECT clause, but user-defined functions must have been created for them in the database beforehand. The resulting SQL expression is used to create the view.
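The leaf-to-root generation of a view definition can be sketched in Python with the standard `sqlite3` module. This is a minimal sketch under several assumptions: the tuple encoding of operators, the functions `to_sql` and `from_item`, and the sample rows are invented here, only selection and grouping are covered, and SQLite stands in for whatever database system PubScribe actually uses.

```python
# Illustrative sketch: turning a small operator tree into a CREATE VIEW statement.
import sqlite3

def to_sql(node):
    """Translate a tiny operator tree (nested tuples) into one SQL SELECT."""
    kind = node[0]
    if kind in ("table", "handle"):
        # Leaves are referenced by the name of a table or an existing view.
        return node[1]
    if kind == "select":
        _, condition, child = node
        return f"SELECT * FROM {from_item(child)} WHERE {condition}"
    if kind == "group":
        _, keys, aggregates, child = node
        return f"SELECT {keys}, {aggregates} FROM {from_item(child)} GROUP BY {keys}"
    raise ValueError(f"unsupported operator: {kind}")

def from_item(child):
    """Leaves appear by name; inner operators become nested subqueries."""
    sql = to_sql(child)
    return sql if child[0] in ("table", "handle") else f"({sql}) AS sub"

tree = ("group", "wp", "min(kurs) AS min_kurs",
        ("select", "hp = 'BER'", ("table", "Aktien")))
view_sql = "CREATE VIEW V1 AS " + to_sql(tree)

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Aktien (wp TEXT, hp TEXT, kurs REAL)")  # stand-in staging table
conn.executemany("INSERT INTO Aktien VALUES (?, ?, ?)",
                 [("IBM", "BER", 145.3), ("IBM", "BER", 146.5), ("IBM", "STU", 144.2)])
conn.execute(view_sql)
rows = conn.execute("SELECT wp, min_kurs FROM V1").fetchall()
print(view_sql)
print(rows)
```

The nesting in the FROM clause mirrors the tree structure, which is also why deep trees lead to deeply nested view definitions and motivates the clustering performed by the OCU.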
Creation of views by the Staging Area
Figure 2.9 shows the generation of a view definition from an operator tree, using the lower left cluster from Figure 2.7 as an example. The staging table (Section 3) of the information channel Aktien forms the basis of the view; it appears in the FROM clause. The scalar function time is mapped to the user-defined attribute function of the same name; the identity function id needs no counterpart. The selection condition is reflected in the WHERE clause of the statement. The shift operator has no counterpart in the SQL statement, since whether an attribute belongs to the identification part or the information part of a queue has no equivalent at the database level*. After the view has been created, its name V1 is returned to the OCU as a handle.
CREATE VIEW V1 AS
SELECT wp, hp, zeit, time(zeit) AS uhrzeit, kurs
FROM Aktien
WHERE time(zeit)='19:30';
Fig. 2.9: Mapping of an operator tree to a view (cluster: selection uhrzeit=’19:30’ over the shift operator on uhrzeit over the scalar operator kurs=id(kurs), uhrzeit=time(zeit) over Aktien ([wp, hp, zeit], [kurs, vol]))
The dependency between operator trees, which is expressed by handles at the OCU level, is reflected in the Staging Area by dependencies between views. Figure 2.10 illustrates this relationship using the clusters from Figure 2.8. It can be seen that the handles on the one side and the dependency relationships of the FROM clauses on the other give rise to structurally identical trees.
CREATE VIEW V6 AS
SELECT wp, hp,
min(avg30Kurs) as avg30Kurs,
min(avg200Kurs) as avg200Kurs
FROM V5
WHERE avg30Kurs < avg200Kurs
GROUP BY wp, hp
CREATE VIEW V5 AS
SELECT V3.wp, V3.hp, ...,
V3.avg30Kurs as avg30Kurs,
V4.wp as wp2, V4.hp as hp2, ...,
V4.avg200Kurs as avg200Kurs
FROM V3, V4
WHERE V3.wp=V4.wp AND V3.hp=V4.hp
CREATE VIEW V3 AS
SELECT *, sumKurs/countKurs
as avg30Kurs
FROM V1
CREATE VIEW V4 AS
SELECT *, sumKurs/countKurs
as avg200Kurs
FROM V2
Fig. 2.10: Dependencies between subtrees and views (left: the clusters from Figure 2.8, with the grouping, selection, shift, and scalar operators on top of HandleG, the join wp=wp AND hp=hp on top of HandleE and HandleF, and the two scalar operators on top of HandleC and HandleD; right: the corresponding view definitions)
If a view is to be materialized, a table with the schema of its result must be created. In contrast to conventional views, a materialized view must be refreshed before every access. Here a distinction must be made between views that occur in the condition part of an operator tree and those from the query part. The former must always be up to date, since every change to one of the sources requires re-evaluating the condition. This requirement can be met in a simple way by using triggers. For every materialized view, change triggers are set up on all of its materialized predecessor views. If one of them changes, the trigger fires and the dependent view is refreshed. This may fire further triggers, which refresh the views that depend on that view. Thus all materialized views that are part of a condition are automatically brought up to date when an information channel changes. A materialized view from the query part of an operator tree, by contrast, need not be refreshed on every data change, since it is needed only when the associated condition is satisfied. For the passive branch of the extended operator tree, therefore, no triggers are created.
* Mapping the identification part to the primary key is feasible for materialized views, but not for non-materialized views.
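The cascading refresh of materialized views by triggers can be tried out with SQLite through Python's `sqlite3` module. The sketch is only an analogy to the mechanism described above: the table and trigger names, the aggregates, the threshold 145, and the SAP row are invented for illustration, materialized views are emulated by ordinary tables, and each refresh simply recomputes the affected rows.

```python
# Illustrative sketch: triggers keeping cascaded "materialized views" up to date.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE aktien   (wp TEXT, hp TEXT, kurs REAL);          -- staging table of a channel
CREATE TABLE mv_max   (wp TEXT PRIMARY KEY, max_kurs REAL);   -- materialized view on aktien
CREATE TABLE mv_count (anzahl INTEGER);                       -- materialized view on mv_max
INSERT INTO mv_count VALUES (0);

-- A publication into the channel refreshes the directly dependent view.
CREATE TRIGGER refresh_mv_max AFTER INSERT ON aktien
BEGIN
    DELETE FROM mv_max WHERE wp = NEW.wp;
    INSERT INTO mv_max
        SELECT wp, max(kurs) FROM aktien WHERE wp = NEW.wp GROUP BY wp;
END;

-- A change of mv_max in turn refreshes the view that is computed from it.
CREATE TRIGGER refresh_mv_count AFTER INSERT ON mv_max
BEGIN
    UPDATE mv_count SET anzahl = (SELECT count(*) FROM mv_max WHERE max_kurs > 145);
END;
""")

conn.executemany("INSERT INTO aktien VALUES (?, ?, ?)",
                 [("IBM", "BER", 145.3), ("SAP", "FRA", 120.0), ("IBM", "BER", 146.5)])

max_view = conn.execute("SELECT wp, max_kurs FROM mv_max ORDER BY wp").fetchall()
anzahl = conn.execute("SELECT anzahl FROM mv_count").fetchone()[0]
print(max_view, anzahl)
```

Each insert into the staging table fires the first trigger, whose insert into `mv_max` fires the second, so both tables are current as soon as the publication has been stored.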
Triggers in an operator tree
Figure 2.11 shows the active part of a hypothetical operator tree. The operators and the data flowing between them are irrelevant for determining the triggers and are therefore not shown. Materialized views are marked gray in the figure; the arrows show the dependencies between views established by triggers. Resources trigger the materialized views that depend on them directly. These in turn trigger the views for which they serve as the basis of computation. Since the switch holds the result of the entire operator tree as a materialized view, it must be triggered as well.
2.4 Special Case: Initial Evaluation
Beim Einfügen einer Anfrage mit Ex-tunc-Semantik muss ihr Ergebnis ein erstes Mal auf
Basis des momentanen Zustands der beteiligten Datenquellen ermittelt werden. Dieser Vorgang wird initiale Auswertung genannt. Um sie durchführen zu können, muss jeder beteiligte
Informationskanal abfragbar sein, d.h. den Zugriff auf seinen momentanen Zustand erlauben. Ist dies nicht der Fall, kann keine Initialauswertung stattfinden und die Ergebnismenge
bleibt bis zur nächsten Publikation eines der beteiligten Publisher leer.
Fig. 2.11: Triggers in an operator tree (top node: switch)
To evaluate an operator tree initially, it is first decomposed into subtrees that are as large as possible and each depend on only a single resource (which may, however, occur several times). For the lower part of the already optimized tree from Figure 2.5, this decomposition is indicated in Figure 2.12 by the gray-shaded areas. If the queryability of the channel is provided by the broker, the initial evaluation can be carried out by creating a view on the corresponding staging table. If it is not, and the publisher in question is tightly coupled to the system, the broker can evaluate the tree on the basis of the state relation. With loose coupling, the operator tree must be sent to the publisher, which evaluates it on its current state and returns the result to the broker. It would also suffice to have only the leaf node corresponding to the channel computed; this, however, would mean that the publisher had to transmit its complete state. Because this is often infeasible owing to the size of that state, the largest subtree is chosen, on the assumption that subsequent operators reduce the number of tuples. The evaluation results of the sources are stored in temporary relations. Once all results are available, the broker evaluates the upper part of the operator tree on top of them. The result is stored in the switch.
It should also be noted that an initial evaluation is required not only for operator trees with ex-tunc semantics, but also for all materialized views they contain that are to be maintained incrementally.
Fig. 2.12: Decomposition of an operator tree (labels recovered from the figure: join wp=wp; selection uhrzeit='19:30'; shift uhrzeit; selection idx='DAX'; Indizes; scalar kurs=id(kurs), uhrzeit=time(zeit); Aktien ([wp, hp, zeit], [kurs, vol]))

3 Producer registration and publication
Before a publisher may publish, it must register with the broker. To do so it sends a registration request describing its characteristics to the broker's Publisher Handler. The request contains the name of the information channel into which it wants to publish, its schema, the guaranteed freshness of the data, whether it is a snapshot or a delta publisher, and whether it is queryable. So that queryable publishers can be contacted, their registration additionally contains a host name and a port number.
The Publisher Handler validates the request against the corresponding DTD. Based on the metadata of the publishers already registered, it then determines a unique identification number as a surrogate, which is returned to the publisher. Every publication must carry this identifier so that it can be assigned to a channel. A publication into the metadata channel records the existence of the new publisher persistently. Finally, the registration is forwarded to the Core Broker, which creates a so-called staging table for the new information channel. If the queryability of the information channel is provided by the broker, the staging table contains the entire channel state, i.e. it is updated incrementally with every publication. If the publisher itself is queryable, the staging table merely buffers the tuples arriving through publications until all subscriptions have processed them.
Publication
Publishers send their publications to the PubScribe system as a queue represented as an XML document. The Publisher Handler receives them there and first checks their validity against the DTD. Using the unique identifier assigned to the publisher at registration, the Publisher Handler can identify the channel and update the metadata. It then passes the publication to the Core Broker, where it is processed directly by the Staging Area. The Staging Area is responsible for updating dependent conditions, so that the user can be notified of changed data where appropriate.
3.1 Evaluation of the operator trees
Using the information channel identifier contained in the publication, the Staging Area determines the associated staging table. If the staging table provides queryability, it holds the complete channel state and can be updated by means of the publication. If it stores only changes, the tuples of the publication are appended at its end. For a tightly coupled publisher the publication can be discarded, since the publisher has already applied the changes; the publication then merely announces the state change. In every case the state of the channel has changed, and all dependent conditions must be re-evaluated so that the system can determine which subscription conditions are satisfied. To this end, the materialized views of all dependent active trees are refreshed first. This happens automatically when a staging table changes, through the chaining of views by triggers shown in Figure 2.11. When the trigger of the materialized view assigned to a switch fires, that switch is informed. From the now current view it determines the result of the condition and, if the test is positive, initiates the evaluation of its passive branch. The result is passed on to the flow control, which can react appropriately depending on the state of the subscription.

Fig. 3.1: Evaluation of conditions (labels: switch, trigger message, refresh request, refresh of the views, publication)
Whereas evaluation in active branches proceeds from the leaves toward the root, the evaluation of passive branches runs in the opposite direction. The recomputation of the topmost materialized node of a passive branch is initiated by the switch when the condition is satisfied. Since the materialized views preceding that node are not necessarily current, it first requests their refresh. This process continues step by step down to the leaves of the tree. Starting from there, the views are then refreshed until finally the topmost view is up to date as well. By querying it, the switch obtains the current result of the body.
In abstract terms, a publication thus triggers three waves of messages that travel through the network of operator nodes. First, a wave of trigger messages spreads from the changed information channel and eventually reaches some switches. If their condition is satisfied, each of them starts a wave of refresh requests in its passive branch, which eventually reaches the leaf nodes. In a returning wave the materialized views are then refreshed. This process is shown schematically in Figure 3.1. There, the solid arrows are the trigger messages, the dotted arrows the refresh requests, and the dashed arrows represent the acknowledgment that the refresh has taken place.
If the staging table does not contain the complete channel state, the trigger mechanism immediately propagates a new tuple into all materialized views of active branches. Since passive branches are evaluated only on demand, tuples may be integrated into their materialized views only much later. The staging table must therefore retain publications until they have been applied to all dependent materialized views of passive branches. Only then can they be discarded. Since this period may be very long, it is conceivable to refresh all views at regular intervals so that the staging table becomes empty again.
Maintenance of materialized views
Ordinary views are recomputed automatically on access and are therefore always current. Materialized views, by contrast, must be refreshed explicitly so that they contain the result of evaluating the corresponding operator tree on the current channel states. Since every new publication changes the state of a channel, the contents of directly or indirectly dependent materialized views become stale and must be refreshed before the next access.
A materialized view can be refreshed by recomputing the operator tree. This, however, requires access to the states of all referenced information channels. If the staging table of every required channel contains its entire state, or if the producer is tightly coupled to the system, re-evaluation poses no problem. If, on the other hand, the queryability of a channel is provided by the producer, very large amounts of data may have to be transferred to the broker, which makes this approach impractical, particularly when evaluations are frequent. If the staging table contains changed tuples, dependent views can also be maintained incrementally: their new content is computed from the old content and the changes. Suitable techniques are known from data warehousing and are described in detail in, e.g., [Tesc99]. Only the problems that arise specifically in the context of the PubScribe system are therefore discussed here. The main problem is that certain operators need, besides the old state and the changes, further information to compute their new state, and this information has to be provided or obtained.
If a materialized join node is to be refreshed incrementally and a change occurs in one of its subtrees, the change can be applied to the result only if all join partners from the other subtree are known. This can be achieved by materializing the direct predecessor node in each branch of the join; the join partners of a new tuple can then be determined from that predecessor. Alternatively, they can be determined by a synchronous subscription. Its body contains the subtree of the join from which the join partners must come, followed by a selection node. The selection returns only those tuples that match the new tuple according to the join condition. The deletion or modification of a tuple entering the join can be handled analogously. This way of determining the join partners presupposes, however, that all data sources in the corresponding subtree are queryable. If they are not, the join operator can be used only with materialized predecessors.
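The first variant, looking up join partners in a materialized predecessor, might be sketched as follows. This is an illustration under assumptions, not PubScribe code: tuples are modeled as Python dicts, the function name `join_delta` and its parameters are hypothetical, only insertions into an equi-join are handled, and the sample values are made up.

```python
def join_delta(new_tuples, partner_view, new_key, partner_key):
    """Return the tuples to add to a materialized equi-join when
    `new_tuples` arrive on one side. `partner_view` is the materialized
    predecessor of the other branch."""
    # Index the materialized partner side by its join attribute.
    index = {}
    for row in partner_view:
        index.setdefault(row[partner_key], []).append(row)
    # Combine every new tuple with all of its join partners.
    return [
        {**new, **partner}
        for new in new_tuples
        for partner in index.get(new[new_key], [])
    ]


# Made-up sample data: one new price tuple, two materialized partner tuples.
new_prices = [{"wp": "IBM", "kurs": 101.5}]
index_members = [{"wp": "IBM", "idx": "X1"}, {"wp": "SAP", "idx": "X2"}]
print(join_delta(new_prices, index_members, "wp", "wp"))
```

Only the delta is joined against the stored partner side; the existing join result is left untouched and merely extended.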
A similar problem arises with the grouping and the window operator. They can be refreshed only if the result of their operands is known. As above, this can be achieved by materializing the operands or by an appropriate query at run time. If not all information channels required for this are queryable, materialization is unavoidable here as well.
In principle, the operators fall into two classes according to the data they need for incremental refresh. The result of a selection, union, shift, scalar or sort operation can be refreshed from the changes and its last result alone. Grouping, join and the window operator additionally require the result of their respective operands.
Ensuring the atomicity of event occurrence, condition evaluation and body evaluation
Just as with triggers in a relational database system, the PubScribe system must guarantee transactional protection spanning the occurrence of the triggering event, the evaluation of the condition, and the resulting action (the 'Event-Condition-Action' or ECA principle). In particular, the condition and, if it evaluates to 'true', the body must be evaluated on the same state of the participating information channels. This prevents a publication from changing the state of a channel between the computation of condition and body, in which case the condition, evaluated on the data underlying the evaluation of the body, might no longer be true. The condition and the body of an operator tree must therefore be computed atomically.
The mechanism of chained triggers satisfies this requirement automatically. A change to a channel state fires triggers. As a consequence, views of active subtrees are refreshed, which may in turn fire triggers. The last node in this chain to be triggered is the switch, which initiates the evaluation of the body if the condition is satisfied (Figure 2.11). Only when that evaluation has completed is the transaction of the insert operation ended.
4 Multiple query optimization
This section discusses the multiple query optimization (MQO) performed in the broker's Data Processing Unit. It addresses the following problem: subscriptions that are generated from parameterized templates, for example from web pages, are always very similar or even identical in structure. For instance, the excerpts of two subscriptions shown in Figure 4.1 might be generated. The left subscription computes the highest and lowest price of the IBM stock; the right subscription additionally determines its average price. Managing them separately in the subscription system wastes resources. Even the selection node, which occurs identically in both parts, is created twice and thus redundantly in the system, and consequently has to be evaluated twice for every publication into the stock channel.
An intuitive approach is to represent identical parts of different operator trees only once in the system and hence in the database. As a result, the selection operator from Figure 4.1 exists only once at run time and, provided it is materialized, needs to be evaluated only once to refresh both subscriptions. This section presents an idea that goes beyond this obvious approach. It is based on the method for maintaining materialized views in data warehouse systems described in [LCPZ01], but has to be adapted to the particular requirements of the PubScribe subscription system.
A closer look at the two operator trees in Figure 4.1 shows that the right grouping computes all the data of the left one (min(kurs), max(kurs)). This operator could therefore also be shared by both subtrees, provided that the surplus data computed by the right operator (sum(kurs), count(kurs)) can be removed by an additional operator. In the present case this is easily done with a scalar operator that projects out the surplus attributes. The two subtrees of Figure 4.1 can thus be merged into the network shown in Figure 4.2. If the result of the shared grouping is materialized, it needs to be computed only once to evaluate both trees. Compared with the unoptimized case, the duplicate computation of the selection and the grouping is eliminated; in return, only the added scalar operator has to be evaluated.
The algorithm described in this section finds common parts of two operator trees by traversing them starting from the leaves. If two operators, like the groupings in the example, are merely similar, an attempt is made to derive one from the other. So that the identical, and hence shared, subtree can become as large as possible, any necessary adjustments are carried out as late as possible.

Fig. 4.1: Excerpts of two subscriptions generated from one template. Left tree (bottom to top): Aktien ([wp, hp, zeit], [kurs, vol]); selection wp = 'IBM'; grouping by hp with minKurs = min(kurs), maxKurs = max(kurs). Right tree (bottom to top): Aktien ([wp, hp, zeit], [kurs, vol]); selection wp = 'IBM'; grouping by hp with minKurs = min(kurs), maxKurs = max(kurs), sumKurs = sum(kurs), countKurs = count(kurs); scalar minKurs = id(minKurs), maxKurs = id(maxKurs), avgKurs = div(sumKurs, countKurs).

Fig. 4.2: Merging of two subscriptions (labels recovered from the figure: two scalar operators, one with avgKurs = div(sumKurs, countKurs); grouping by hp with sumKurs = sum(kurs), countKurs = count(kurs); selection wp = 'IBM'; Aktien ([wp, hp, zeit], [kurs, vol]))
The next section explains under which circumstances two operators are regarded as similar and how their common parts are determined. After that, the method is presented by which the differences can be moved as far up the operator tree as possible, and some particularities of its application in the PubScribe system are addressed. The section concludes with the presentation and interpretation of measurements of the method's benefit.
4.1 Subsumption of operators and compensations
Fundamental to what follows is the subsumption relationship between two operators, defined below:
Definition: An operator subsumes another if the result of the latter can be produced from the result of the former by applying a further operator, called a compensation. The operator with the more general result is called the subsuming operator ('subsumer'), the other the subsumed operator ('subsumee').
The relationship between operators in these roles is illustrated in Figure 4.3. The match arrow indicates that the subsumer subsumes the subsumee. Executing the subsumer and then the compensation on the same data yields the result of the subsumee operator.
Fig. 4.3: Subsumption of operators (labels: same data, subsumer, match, subsumee, compensation, same result)
4.2 Subsumption and compensation of individual operators
For each of the operators of the PubScribe system described in [ReLW01], the following sections set out when a subsumption relationship holds and how a compensation is determined. Since no operator can in general be expressed as a combination of the others, a compensation is possible only between two operators of the same type.
4.2.1 The selection operator
A selection operator subsumes another if its selection condition selects at least the tuples selected by the subsumed operator.
To decide this, consider first only atomic selection conditions of the form 'attribute comparison-operator right-hand-side'. A necessary condition for subsumption is that the attribute names on the left-hand sides of both conditions agree. If their right-hand sides are attributes, these must be equal as well; whether subsumption holds then depends on the comparison operators, as for example with a < b and a ≤ b. If the right-hand sides are values and the same comparison operator is used, the values alone decide whether subsumption holds. For example, the condition a < 7 selects all tuples that a < 5 selects. If the operators differ but the values agree, the decision is equally simple: all tuples with a < 5 are also contained in a ≤ 5 and in a ≠ 5. In some cases a decision is possible even when both operator and value differ; for instance, a ≤ 7 subsumes a < 5.
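The case analysis for atomic conditions with a constant on the right-hand side can be written down compactly. The following is a minimal sketch, assuming conditions are given as hypothetical triples `(attribute, operator, value)` with comparable values; the function name `subsumes` is not taken from PubScribe. The check is conservative: it never reports a subsumption that does not hold, but over discrete domains it may miss some (e.g. for integers, a < 5 and a ≤ 4 are equivalent).

```python
import operator

OPS = {"<": operator.lt, "<=": operator.le, ">": operator.gt,
       ">=": operator.ge, "=": operator.eq, "!=": operator.ne}


def subsumes(general, special):
    """True if every tuple selected by `special` is also selected by `general`."""
    (g_attr, g_op, g_val), (s_attr, s_op, s_val) = general, special
    if g_attr != s_attr:          # attribute names must agree
        return False
    if s_op == "=":               # a single value: test it directly
        return OPS[g_op](s_val, g_val)
    if s_op == "!=":              # only an identical inequality covers it
        return g_op == "!=" and g_val == s_val
    if s_op in ("<", "<="):       # special selects a lower range
        if g_op == "<":
            return s_val < g_val or (s_val == g_val and s_op == "<")
        if g_op == "<=":
            return s_val <= g_val
        if g_op == "!=":
            return g_val > s_val or (g_val == s_val and s_op == "<")
        return False
    # s_op in (">", ">="): special selects an upper range
    if g_op == ">":
        return s_val > g_val or (s_val == g_val and s_op == ">")
    if g_op == ">=":
        return s_val >= g_val
    if g_op == "!=":
        return g_val < s_val or (g_val == s_val and s_op == ">")
    return False


print(subsumes(("a", "<", 7), ("a", "<", 5)))
print(subsumes(("a", "<", 5), ("a", "<", 7)))
```

The two calls reproduce the example from the text: a < 7 subsumes a < 5, but not the other way round.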
For complex conditions, the tree structures they form must be examined. Either the condition tree of the candidate subsumed operator must be contained in that of the subsuming operator, or vice versa. A tree is contained in another if corresponding leaves satisfy the subsumption criterion above and their connectives are identical. An operator is subsumed if its condition is contained in the condition tree of the subsuming operator and only disjunctions lie on the path between its root and the root of the subsuming condition tree (Figure 4.4a). Subsumption also holds if the containment of the trees is the other way round and their roots are connected only by conjunctions (Figure 4.4b).
As compensation, the subsumed operation must in general be carried out. A special case arises when all corresponding atomic conditions of both trees are identical and the subsuming condition is embedded as in Figure 4.4b. The subsuming condition can then be replaced within the subsumed condition by an expression that is always true.
Fig. 4.4: Subsumption for compound selection conditions. a) Case 1: the subsumed condition is embedded in the subsuming condition via OR nodes. b) Case 2: the subsuming condition is embedded in the subsumed condition via AND nodes.
Compensation of a selection operator
The subcondition a2 < 5 of the subsumed, left operator in Figure 4.5 corresponds to a2 < 7 of the subsuming operator. The latter excludes no tuple from the result that the former would not exclude as well. The same holds for a3 > 9 and a3 ≠ 3. All tuples selected by the subsumed condition are therefore also selected by the parenthesized condition of the subsuming operator. As in case 4.4a above, this condition is integrated into the overall condition, since it is connected to the other subcondition a1 = 5 only through an OR operator. A subsumption relationship therefore holds between the two operators. A selection with the subsumed condition serves as compensation.
Fig. 4.5: Compensation of a selection operator. Subsumee: selection a2 < 5 AND a3 > 9. Subsumer: selection a1 = 5 OR (a2 < 7 AND a3 ≠ 3). Compensation: selection a2 < 5 AND a3 > 9.
4.2.2 The scalar operator
The subsumption of two scalar operators can always be compensated by a further scalar operator. The decision about subsumption can be reduced to a decision for each individual scalar function. Two cases are possible. If a scalar function of the subsumed operator is present in the subsuming one, its compensation requires only an identity function, which adapts the attribute name of the result if necessary. A scalar function not contained in the subsuming operator can be compensated only if all of its operands are contained in it. In that case the function itself is applied, with the names of its arguments adapted to the attribute names of the subsuming operator.
Fig. 4.6: Compensation of a scalar operator. Subsumee: scalar r1 = greater(a2, a3), r2 = time(a1). Subsumer: scalar r1 = time(a1), r2 = id(a2), r3 = id(a3). Compensation: scalar r1 = greater(r2, r3), r2 = id(r1).
The function time in Figure 4.6 is an instance of the first case described above. It is present identically in both operators (time(a1)), so it is compensated by an identity function that adapts the attribute name (r2 = id(r1)). The comparison function greater of the subsumed operator is not contained in the subsuming operator, but both of its operands are. Here the function is applied in the compensation, and the attribute names of the operands must be changed from a2 and a3 to r2 and r3.
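The two cases can be turned into a short derivation routine. The sketch below assumes a hypothetical representation in which a scalar operator is a dict mapping each result attribute to a pair `(function name, argument tuple)`; the function `scalar_compensation` is illustrative only. It returns `None` when some scalar function of the subsumee cannot be compensated.

```python
def scalar_compensation(subsumer, subsumee):
    """Derive the compensating scalar operator, or None if none exists."""
    # Case 1 lookup: expression -> result name in the subsumer.
    by_expr = {expr: name for name, expr in subsumer.items()}
    # Case 2 lookup: input attribute -> result name, for attributes the
    # subsumer passes through unchanged via id(...).
    passthrough = {args[0]: name
                   for name, (func, args) in subsumer.items() if func == "id"}
    compensation = {}
    for name, (func, args) in subsumee.items():
        if (func, args) in by_expr:
            # Same function present: an identity that renames suffices.
            compensation[name] = ("id", (by_expr[(func, args)],))
        elif all(arg in passthrough for arg in args):
            # All operands available: apply the function with renamed arguments.
            compensation[name] = (func, tuple(passthrough[a] for a in args))
        else:
            return None
    return compensation


# Operators of Figure 4.6.
subsumer = {"r1": ("time", ("a1",)), "r2": ("id", ("a2",)), "r3": ("id", ("a3",))}
subsumee = {"r1": ("greater", ("a2", "a3")), "r2": ("time", ("a1",))}
print(scalar_compensation(subsumer, subsumee))
```

For the operators of Figure 4.6 this yields r1 = greater(r2, r3) and r2 = id(r1), matching the compensation shown there.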
4.2.3 The sort operator
A sort on a sequence of attribute names is always also a sort on any prefix of that sequence. A sort operator is therefore subsumed by another exactly when its sort order is a prefix of that of the subsuming operator. No compensation is required; the corresponding operator is omitted.
The sort order 'a1 ASC, a2 DESC, a3 ASC' of the right operator in Figure 4.7 is also a sort on the first two attributes (with the same sort direction). It therefore subsumes the left operator. No compensation is necessary; the corresponding operator remains empty.
Fig. 4.7: Compensation of a sort operator. Subsumee: sort a1 ASC, a2 DESC. Subsumer: sort a1 ASC, a2 DESC, a3 ASC. Compensation: empty.
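The prefix criterion for sort operators is simple enough to state as a one-line check. In this sketch a sort order is assumed to be a list of hypothetical `(attribute, direction)` pairs; the function name `sort_subsumes` is not from the source.

```python
def sort_subsumes(subsumer_order, subsumee_order):
    """True if `subsumee_order` is a prefix of `subsumer_order`."""
    subsumer_order = list(subsumer_order)
    subsumee_order = list(subsumee_order)
    return subsumer_order[:len(subsumee_order)] == subsumee_order


# Sort orders of Figure 4.7.
right = [("a1", "ASC"), ("a2", "DESC"), ("a3", "ASC")]
left = [("a1", "ASC"), ("a2", "DESC")]
print(sort_subsumes(right, left))
```

Both the attribute and its direction have to match position by position, which is why the pairs are compared as a whole.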
4.2.4 The grouping operator
If one grouping partitions by a superset of the attributes of another, finer partitions result. Each of the coarser partitions can, however, be assembled without gaps from finer ones. The aggregate functions sum, min and max are associative: their value for a coarser partition can be computed by applying the function to the results of the finer partitions it contains. The result of the non-associative function count can be obtained analogously by summing the partial counts.
A grouping therefore subsumes another if two conditions are met: first, its grouping attributes must be a superset of those of the candidate subsumed operator; second, it must contain that operator's aggregate functions. As compensation, the subsumed grouping is carried out, with attribute names adapted where necessary.
The left grouping in Figure 4.8, for example, is subsumed by the right grouping, because the latter contains its grouping attributes a1 and a2 as well as the aggregate functions sum(a5) and count(a5). As compensation the subordinate grouping is executed, where in the sum the attribute a5 must be replaced by r2. The result of the function count is computed as the sum over the corresponding partial counts (r3) of the superordinate operator.
Fig. 4.8: Compensation of a grouping operator. Subsumee: grouping by a1, a2 with r1 = sum(a5), r2 = count(a5). Subsumer: grouping by a1, a2, a3 with r1 = min(a4), r2 = sum(a5), r3 = count(a5). Compensation: grouping by a1, a2 with r1 = sum(r2), r2 = sum(r3).
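The compensation of Figure 4.8 amounts to re-aggregating the finer result. The following sketch assumes rows are Python dicts and that aggregates are given as a hypothetical mapping from output name to `(function, input attribute)`; the function `regroup` and the sample rows are made up for illustration.

```python
def regroup(fine_rows, group_attrs, aggregates):
    """Aggregate the result of a finer grouping into coarser partitions.
    `aggregates` maps an output name to (function name, input attribute)."""
    funcs = {"sum": sum, "min": min, "max": max}
    partitions = {}
    for row in fine_rows:
        key = tuple(row[attr] for attr in group_attrs)
        partitions.setdefault(key, []).append(row)
    result = []
    for key, rows in partitions.items():
        out = dict(zip(group_attrs, key))
        for name, (func, source) in aggregates.items():
            out[name] = funcs[func]([r[source] for r in rows])
        result.append(out)
    return result


# Made-up result of the subsuming grouping by a1, a2, a3
# (r1 = min(a4), r2 = sum(a5), r3 = count(a5)).
fine = [
    {"a1": 1, "a2": 1, "a3": "x", "r1": 3, "r2": 10, "r3": 2},
    {"a1": 1, "a2": 1, "a3": "y", "r1": 5, "r2": 20, "r3": 3},
    {"a1": 2, "a2": 1, "a3": "x", "r1": 1, "r2": 7, "r3": 1},
]
# Compensation: r1 = sum(r2) rebuilds sum(a5), r2 = sum(r3) rebuilds count(a5).
print(regroup(fine, ["a1", "a2"], {"r1": ("sum", "r2"), "r2": ("sum", "r3")}))
```

Note that count is reconstructed with sum over the partial counts, exactly as described in the text, rather than by counting the finer rows.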
4.2.5 The shift operator
For a shift operator to subsume another, the attributes it moves into the identification part must form a subset of the attributes moved by the subsumed operator. As compensation, a further shift operator moves the attributes that are still missing.
Fig. 4.9: Compensation of a shift operator. Subsumee: shift a1, a2, a3. Subsumer: shift a1, a2. Compensation: shift a3.
The operator shown on the right in Figure 4.9 moves the attributes a1 and a2 into the identification part. It therefore subsumes the left operator, which additionally moves the attribute a3. The compensation operator moves this attribute as well and thus delivers the result of the subsumed operator.
4.2.6 The window operator
Every tuple in the result of a window operator is produced by aggregation over a set of tuples of the operand, its window. Unlike the partitions of a grouping, these sets are not disjoint but overlap. A compensation based on the associativity of the aggregate functions is not possible, since tuples would then enter the result more than once. No compensation can therefore be given for a window operator. Sharing among several operator trees is possible only if the window operators are identical.
4.2.7 The union operator
Two union operators always subsume each other. The compensation consists of a scalar operation that adapts the name of the discriminating attribute and, where necessary, transforms the source-dependent values. It also contains an identity function for every information attribute so that its value is preserved in the result. Subsequently, the newly created discriminating attribute must be shifted into the identification part, and that of the subsuming operator removed by a grouping. Since it is questionable whether these three operators can really be computed more efficiently than a union, the compensation is not used. Unions are shared only if they are identical, so that no compensation is required.
4.2.8 The join operator
The join operator forms the Cartesian product of both data sources and then applies a selection to reduce the number of result tuples. A join therefore subsumes another if the tuples it selects are a superset of those selected by the subsumed operator. This decision can be made analogously to the procedure described for the selection operator. As compensation, a selection is carried out that removes the surplus tuples produced by the subsuming join.
This is also the reason for defining the join as an equi-join. Compensating a natural join would require, besides the selection, a grouping that removes superfluous attributes from the identification part of the result. Using several compensation operators is conceivable in principle, but is not permitted here for reasons of complexity.
Compensation of a join operator
If the value of attribute a1 of a tuple from the first data source equals the value of attribute b1 of a tuple from the second data source, the right join in Figure 4.10 combines them. The left join, by contrast, combines two tuples only if the attributes a2 and b2 agree as well. Its result can be produced from that of the right join by a selection that retains exactly the tuples satisfying the condition a2 = b2.

Fig. 4.10: Compensation of a join operator. Subsumee: join a1 = b1 AND a2 = b2. Subsumer: join a1 = b1. Compensation: selection a2 = b2.

4.3 Shifting compensations
The preceding section shows how a shared operator and a compensation can be produced from two operators related by subsumption. Applied repeatedly, this is meant to yield identical subtrees, as large as possible, in two separate operator trees. For these to be shared, they must have the same data as input. The procedure therefore starts at the leaves and proceeds upward toward the root. For identical trees to result, the compensations at the lowest level must be pushed upward past the operator that follows them. An even larger common tree can result if they are then pushed past the operator of the third level, and so on. The conditions under which one operator can be shifted past another are explained in the following section.
4.3.1 Criterion for shifting a compensation
To enlarge the identical part of two operator trees, it must be possible to move a compensation upward past its successor operator, i.e. to swap their order of execution. This is permitted only if the operators commute.
A shift is not possible if it leaves the compensation no longer executable. This is the case, for example, if its selection attribute is removed by a grouping. It may also happen that the operators can be evaluated in the reverse order, but with different semantics. For example, a grouping that determines the number of tuples in each partition cannot be shifted past another grouping that reduces the number of tuples.
Als notwendiges und hinreichendes Kriterium für die Verschiebbarkeit einer Kompensation
lässt sich angeben, dass der Operator, über den sie verschoben werden soll, keine von ihr benötigten Attribute ändern oder aus dem Schema entfernen darf. Die von einem Operator benötigten Attribute sind alle in seiner Parametrierung vorkommenden.
Kann eine Kompensation nicht weiter nach oben verschoben werden, so kann kein größerer
gemeinsamer Teilbaum gebildet werden; eine weitere Zusammenfassung ist dann nicht möglich.
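The criterion can be written down as a small check. The following Python sketch is an illustration only: the class `Operator`, its fields and the function `can_shift_over` are hypothetical names chosen here, and the attribute sets of the sample operators are assumptions modelled on the stock example used in this chapter.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Operator:
    name: str
    required: frozenset                # attributes occurring in the parameterization
    modified: frozenset = frozenset()  # attributes whose values the operator changes
    removed: frozenset = frozenset()   # attributes the operator drops from the schema

def can_shift_over(compensation: Operator, successor: Operator) -> bool:
    """A compensation may be moved past its successor only if the successor
    neither modifies nor removes an attribute the compensation requires."""
    blocked = successor.modified | successor.removed
    return compensation.required.isdisjoint(blocked)

# Assumed example: a grouping by wp, hp removes zeit, kurs and vol from the schema.
grouping = Operator(
    name="grouping by wp, hp",
    required=frozenset({"wp", "hp", "kurs", "vol"}),
    removed=frozenset({"zeit", "kurs", "vol"}),
)
select_hp = Operator(name="hp = 'XETRA'", required=frozenset({"hp"}))
select_zeit = Operator(name="zeit >= '16.10.2000'", required=frozenset({"zeit"}))

hp_shiftable = can_shift_over(select_hp, grouping)
zeit_shiftable = can_shift_over(select_zeit, grouping)
```

In this sketch the selection on hp can be shifted past the grouping, whereas the selection on zeit cannot, because zeit is no longer part of the grouping's result schema.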
Example of Shifting Compensations
The procedure for shifting operators is explained using the two operator trees shown in Figure 4.11. The tree on the left computes, for IBM shares traded on XETRA from 16.10.2000 onward, the difference between the maximum and the minimum price (maxKurs and minKurs) and delivers those tuples for which this difference (diffKurs) is greater than seven. The result also contains the average number of shares traded (avgVol). The tree on the right performs an analogous computation for the shares across all trading venues and delivers the tuples for which the price difference exceeds ten. In addition to the difference, it delivers the number of incoming price records and the total number of shares traded.
For each of the lower four operators, the left operator is included by its right counterpart. For the topmost selection, the inclusion relationship holds in the opposite direction. A basic prerequisite for integrating the two operator trees is that their information sources are identical. Since both are based on the Aktien (stocks) channel, this condition is met. The compensation for the lowest pair of operators is formed first. Since the time condition and the selection of the IBM share are already contained in the including operator, the compensation only needs to contain the condition hp = ‘XETRA’. As shown in Figure 4.12, it is chained in after the selection of the right operator tree and before the grouping of the left one. The grouping's original operand, the left selection, is thus unused and need not be present in the system. The overall data flow decreases, because the tuples of the Aktien channel have to be filtered only once. The compensation then operates on the already preselected tuples.

[Figure: two operator trees over the channel Aktien with the schema ([wp, hp, zeit], [kurs, vol]). Each tree applies, from bottom to top, a selection, a grouping, a scalar operator, a shift and a selection; corresponding operators are connected by Match edges.
Left tree: selection zeit ≥ ‘16.10.2000’ AND wp = ‘IBM’ AND hp = ‘XETRA’; grouping by wp, hp with result schema ([wp, hp], [minKurs, maxKurs, countKurs, sumVol]); scalar operator with diffKurs = minus(maxKurs, minKurs) and avgVol = div(sumVol, countKurs); shift of diffKurs; selection diffKurs > 7; result schema ([wp, hp, diffKurs], [avgVol]).
Right tree: selection zeit ≥ ‘16.10.2000’ AND wp = ‘IBM’; grouping by wp, hp with result schema ([wp, hp], [minKurs, maxKurs, sumVol, countKurs]); scalar operator with countKurs = id(countKurs) and sumVol = id(sumVol) in addition to diffKurs; shift of diffKurs; selection diffKurs > 10; result schema ([wp, hp, diffKurs], [countKurs, sumVol]).]
Fig. 4.11: Inclusion in two separate operator trees
[Figure: the compensating selection hp = ‘XETRA’ is inserted between the selection of the right tree (zeit ≥ ‘16.10.2000’ AND wp = ‘IBM’) and the grouping of the left tree (by wp, hp); the two selections are connected by a Match edge.]
Fig. 4.12: Compensation of the first operator

[Figure: the compensating selection hp = ‘XETRA’ now follows the grouping of the right tree; the compensation of the grouping is empty.]
Fig. 4.13: Shifting a compensation

An inclusion is likewise detected for the subsequent grouping. Since the two operators are identical, however, the compensation remains empty. To make the shared subtree as large as possible, the compensation of the preceding selection must be pushed past the grouping. This is possible because the attribute hp used by the selection is not modified by the grouping. If the selection contained a time-dependent condition, shifting it would not be permitted, because the attribute zeit is no longer contained in the result schema of the grouping. Figure 4.13 shows the chaining of the operators after the shift. In addition to the selection, the grouping in the left tree can now be removed as well.
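That the selection hp = ‘XETRA’ commutes with the grouping by wp, hp can be checked on a few tuples. The following Python sketch uses invented sample data (including the venue name NYSE) and hypothetical helper functions; it assumes that the grouping computes minKurs, maxKurs, countKurs and sumVol, as suggested by the schemas in Figure 4.11.

```python
from collections import defaultdict

def select(rows, predicate):
    """Selection: keep only the rows that satisfy the predicate."""
    return [r for r in rows if predicate(r)]

def group_by_wp_hp(rows):
    """Grouping by wp, hp with the aggregates assumed for the example."""
    groups = defaultdict(list)
    for r in rows:
        groups[(r["wp"], r["hp"])].append(r)
    return [
        {
            "wp": wp,
            "hp": hp,
            "minKurs": min(r["kurs"] for r in members),
            "maxKurs": max(r["kurs"] for r in members),
            "countKurs": len(members),
            "sumVol": sum(r["vol"] for r in members),
        }
        for (wp, hp), members in groups.items()
    ]

# Invented tuples of the Aktien channel: ([wp, hp, zeit], [kurs, vol]).
aktien = [
    {"wp": "IBM", "hp": "XETRA", "zeit": "2000-10-16", "kurs": 100, "vol": 10},
    {"wp": "IBM", "hp": "XETRA", "zeit": "2000-10-17", "kurs": 110, "vol": 20},
    {"wp": "IBM", "hp": "NYSE", "zeit": "2000-10-16", "kurs": 101, "vol": 30},
]

def is_xetra(row):
    return row["hp"] == "XETRA"

# Original order in the left tree: selection first, then grouping.
select_then_group = group_by_wp_hp(select(aktien, is_xetra))
# Order after the shift: shared grouping first, then the compensating selection.
group_then_select = select(group_by_wp_hp(aktien), is_xetra)
```

Both orders yield the same result here because hp is a grouping attribute and passes through the grouping unchanged. A predicate on zeit could not even be evaluated after the grouping, since the grouped rows no longer carry that attribute.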
The subsequent scalar and shift operators are handled analogously. The selection can be pushed past both, since the attribute hp remains unchanged throughout. The compensation of the scalar operation computes the average trading volume (avgVol) from the attributes sumVol and countKurs; the shift operator requires no compensation, because the attribute diffKurs is already located in the information part. For the topmost operator of the chain, the inclusion relationship holds in the opposite direction, so the compensations must be inserted before it. The final result of the optimization is shown in Figure 4.14. Not counting the empty compensations, four operators are eliminated and two are added. It is advisable to create the result of the shift operator as a materialized view, since it then needs to be computed only once to evaluate both operator trees.
4.3.2 Transformation of Attribute Names
The procedure presented in the previous section makes it possible to merge identical and similar parts of two operator trees and thus to use them multiple times. The approach becomes problematic when two operators are structurally identical but use different attribute names for intermediate results. In addition to the actual compensation, further operators would then be needed to rename the attributes. To reduce this overhead, the mappings from the attribute names of the including operator tree to those of the included one are accumulated across all compensations and finally inserted only once, before the compensations.
[Figure: the right tree over the channel Aktien ([wp, hp, zeit], [kurs, vol]) is unchanged: selection zeit ≥ ‘16.10.2000’ AND wp = ‘IBM’, grouping by wp, hp, scalar operator, shift of diffKurs, selection diffKurs > 10. Of the left tree only the topmost selection diffKurs > 7 remains; it is fed from the shift operator of the right tree via the compensating selection hp = ‘XETRA’ and a compensating scalar operator. The remaining compensations are empty.]
Fig. 4.14: Merged operator trees
Since the transformation may affect both information and identification attributes, carrying it out generally requires three consecutive operators. To create the attributes under their new names, a scalar operator is used that applies an id function to each attribute to be transformed. So that the remaining information attributes are not lost, they are likewise carried over into the result by an id function. For information attributes the transformation is thereby complete. The identification attributes are now present under their new names in the information part and are moved into the identification part by a shift. To then eliminate them under their old names, a grouping by the remaining attributes of the identification part is performed. Since the mapping from an old to a new attribute name is one-to-one, each group contains exactly one tuple. To retain the information attributes, an aggregate function (e.g. min) is applied to them.
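The three steps can be traced on a single tuple. The following Python sketch is a simplified model, not the PubScribe implementation: a relation is assumed to be a triple of identification attributes, information attributes and rows, and the functions `scalar`, `shift`, `group` and `id_` are hypothetical stand-ins for the operators of the same name. The attribute names follow the example of Figure 4.15; the values are invented.

```python
def id_(name):
    """Id function of the scalar operator: pass the named attribute through unchanged."""
    return lambda row: row[name]

def scalar(rel, exprs):
    """Scalar operator: the new information part consists of the given expressions."""
    ident, _info, rows = rel
    out = [
        {**{a: r[a] for a in ident}, **{n: f(r) for n, f in exprs.items()}}
        for r in rows
    ]
    return ident, list(exprs), out

def shift(rel, attr):
    """Shift: move an attribute from the information part to the identification part."""
    ident, info, rows = rel
    return ident + [attr], [a for a in info if a != attr], rows

def group(rel, keys, aggs):
    """Grouping by keys; aggs maps output name -> (aggregate function, input attribute)."""
    _ident, _info, rows = rel
    buckets = {}
    for r in rows:
        buckets.setdefault(tuple(r[k] for k in keys), []).append(r)
    out = []
    for key, members in buckets.items():
        row = dict(zip(keys, key))
        for name, (fn, src) in aggs.items():
            row[name] = fn(m[src] for m in members)
        out.append(row)
    return list(keys), list(aggs), out

# Result of the including tree: ([wp, hp, diffKurs], [countKurs, sumVol]).
right_result = (
    ["wp", "hp", "diffKurs"],
    ["countKurs", "sumVol"],
    [{"wp": "IBM", "hp": "XETRA", "diffKurs": 12, "countKurs": 4, "sumVol": 900}],
)

# Step 1: create all attributes under their new names with id functions.
renamed = scalar(right_result, {
    "kursDiff": id_("diffKurs"),
    "kursCount": id_("countKurs"),
    "volSum": id_("sumVol"),
})
# Step 2: move the renamed identification attribute into the identification part.
shifted = shift(renamed, "kursDiff")
# Step 3: group by the remaining identification attributes to drop the old name.
transformed = group(
    shifted,
    ["wp", "hp", "kursDiff"],
    {"kursCount": (min, "kursCount"), "volSum": (min, "volSum")},
)
```

Because every group contains exactly one tuple, the choice of min as aggregate function does not alter the values; it merely carries the information attributes through the grouping.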
Example of Transforming Attribute Names
Figure 4.15 shows the merged operator trees of the preceding example, except that the names of the derived attributes differ between the two parts. Differing attribute names are highlighted in the figure. So that, in particular, the topmost selection operator of the left tree can still be applied after merging with the other tree, the attribute names must be transformed. This transformation is carried out directly before the compensations and is represented in the figure by shaded operators. The scalar operator performs the actual transformation of the attribute names. It maps the attributes diffKurs, countKurs and sumVol of the right tree to the corresponding attributes kursDiff, kursCount and volSum of the left tree. Since the attribute kursDiff must be contained in the identification part of the schema for the subsequent selection to be carried out, it is moved there by the shift operator.
The subsequent grouping removes the attribute diffKurs, which is no longer needed, from the identification part. An aggregation is applied to the required information attributes (kursCount, volSum) so that they are preserved in the result. Unneeded information attributes of the source schema (countKurs, sumVol) are lost through the grouping. Since the transformation is carried out before the compensations, the compensations must be formulated in the naming scheme of the included operator tree.
5 Evaluation of Cross-Query Optimization Methods
In principle, cross-query optimization aims at two improvements. First, the number of nodes in the operator network is to be reduced, so that less memory is required at run time. Second, building on this, the evaluation time of operator trees can be shortened by materializing nodes that are used multiple times. Since the prototype of the PubScribe system was not yet able to materialize operator nodes at the time this work was written, only the benefit of the method with respect to the first point could be determined. After a description of the test scenario, the results are presented and discussed in detail in this section.

[Figure: the right tree over the channel Aktien ([wp, hp, zeit], [kurs, vol]) is unchanged and ends with the shift of diffKurs and the selection diffKurs > 10. Its shift result is passed through the shaded transformation operators: a scalar operator with kursDiff = id(diffKurs), kursCount = id(countKurs), volSum = id(sumVol); a shift of kursDiff; and a grouping by wp, hp, kursDiff with kursCount = min(kursCount), volSum = min(volSum). These are followed by the compensations, namely the selection hp = ‘XETRA’ and the scalar operator volAvg = div(volSum, kursCount), and by the topmost selection of the left tree, kursDiff > 7.]
Fig. 4.15: Merged operator trees with transformation of attribute names
5.1 Setup of the Test Scenario
For the measurements, a total of 100 subscriptions are generated from one fixed prototype by random parameterization and inserted into the PubScribe system one after another. Because no start condition is specified, each subscription is activated immediately; a suitable stop condition ensures that no subscription is removed from the system before the test run ends. For each insertion, its duration and the new total number of operator nodes in the network of the Data Processing Unit are recorded. Each test run is carried out once with optimization disabled and once with it enabled. In addition, the optimization algorithm is simplified so that only identical operator nodes are merged, and a test run is carried out in this mode as well.
The subscription used delivers the high, low and average price of a share at a trading venue. When the test subscriptions are generated, the combination of these three values that is requested is determined at random. The name of the share and the trading venue are variable as well. If the trading venue is omitted entirely, aggregation is performed over all existing venues. In the delivery condition, one of several intervals is chosen at random. The body of each test subscription contains ten operators, the delivery condition eight. Because an always identical stop condition is unrealistic in practice and would therefore distort the results, its operators are disregarded in the evaluation of the results.
The number of possible values for trading venue, security and delivery interval is chosen differently in the various scenarios. The following table gives an overview of the varied parameters in the four scenarios used. ‘1(5)’ means that one value for the parameter is chosen at random from five possible values. ‘0/1(4)’ means that the parameter is either omitted entirely or set to one of four values. The last column of the table gives the number of distinct parameter combinations that results from the choice of value ranges. This count takes into account that every subscription must contain at least one of the first three parameters.
             High price   Low price   Average price   Trading venue   Security   Delivery interval   Total number of combinations
Scenario 1   yes/no       yes/no      yes/no          0/1(4)          1(2)       1(2)                140
Scenario 2   yes/no       yes/no      yes/no          0/1(4)          1(5)       1(5)                875
Scenario 3   yes/no       yes/no      yes/no          0/1(4)          1(10)      1(5)                1750
Scenario 4   yes/no       yes/no      yes/no          0/1(8)          1(10)      1(5)                3150
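The totals in the last column can be reproduced with a short calculation. The following Python sketch assumes the reading of the notation given above: three yes/no parameters of which at least one must be set, a trading venue that is either omitted or takes one of n values, and one value each for security and delivery interval. The function name `count_combinations` is chosen for this illustration.

```python
def count_combinations(venues: int, securities: int, intervals: int) -> int:
    """Number of distinct parameter combinations of one scenario."""
    value_choices = 2 ** 3 - 1   # non-empty subsets of {high, low, average price}
    venue_choices = venues + 1   # '0/1(n)': omitted, or one of n trading venues
    return value_choices * venue_choices * securities * intervals

# (trading venues, securities, delivery intervals) per scenario, as in the table.
scenarios = {1: (4, 2, 2), 2: (4, 5, 5), 3: (4, 10, 5), 4: (8, 10, 5)}
totals = {number: count_combinations(*params) for number, params in scenarios.items()}
```

For the four scenarios this gives 140, 875, 1750 and 3150 combinations, in agreement with the table.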
5.2 Results of the Measurements
The measurements of node counts and insertion times show essentially the same behaviour in all scenarios. This behaviour is discussed and assessed below using the fourth scenario. Afterwards, the expected and observed differences between the individual scenarios are addressed.
5.2.1 Discussion of the Scenario-Independent Results
Figure 5.1a plots the number of operator nodes in the system against the number of inserted subscriptions for the fourth scenario. One curve is drawn for each of the three modes: full optimization, merging of identical nodes, and no optimization.

[Figure: (a) total number of operator nodes (0 to 2000) versus number of subscriptions (0 to 100) for no optimization, merging of identical nodes and full optimization; (b) insertion time of the body in seconds (0 to 160) versus number of subscriptions (0 to 100).]
Fig. 5.1: Measurement results from Scenario 4

The solid curve for the case without optimization is linear, as expected. Since every subscription consists of the same number of operators, the same number of new nodes is added for each additional subscription. The dashed curve represents the merging of identical nodes and is likewise approximately linear. In theory, the probability that an identical node already exists in the system for a new node is low when there are few subscriptions and rises as their number grows. Once a subscription exists for every parameter combination, an identical node can always be found for a new node. The curve should therefore have a steeper slope at the beginning, then bend increasingly, and finally turn into a straight line with a small slope. The measurements do not confirm this shape as clearly as expected, however; in the diagram it can at best be guessed at. Based on analogous considerations, a similar curve is expected for full optimization. It can indeed be discerned in the diagram, but here too the effect is considerably weaker than expected.
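The theoretical argument can be made concrete with a deliberately simplified model that looks at complete parameter combinations rather than at individual operator nodes. Assuming that each subscription draws one of K equally likely combinations, the expected number of distinct combinations after n subscriptions is K(1 - (1 - 1/K)^n). The function name and the model itself are assumptions of this sketch, not part of the original evaluation.

```python
def expected_distinct(combinations: int, subscriptions: int) -> float:
    """Expected number of distinct parameter combinations after drawing
    `subscriptions` times uniformly at random from `combinations` values."""
    miss_probability = (1.0 - 1.0 / combinations) ** subscriptions
    return combinations * (1.0 - miss_probability)

# Scenario 4 (3150 combinations) and Scenario 1 (140 combinations), 100 subscriptions each.
distinct_large = expected_distinct(3150, 100)
distinct_small = expected_distinct(140, 100)
```

Under this model, 100 subscriptions drawn from 3150 combinations are expected to contain about 98 distinct combinations, so complete repetitions are rare, whereas with 140 combinations only about 72 distinct ones are expected. Sharing between operator trees takes place at the level of individual nodes, so the sketch only illustrates the trend described in the text.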
Merging identical operators saves 492 of the 2000 nodes of the unoptimized case, a reduction of 24.6%. Full optimization saves 933 nodes, which corresponds to 46.7%. It is encouraging that the simplified algorithm alone already achieves such large savings. In view of this, it is all the more remarkable that full optimization improves the result by more than a further 20 percentage points.
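The percentages follow directly from the reported node counts, as the following short Python calculation shows. The variable and function names are chosen for this illustration; only the three node counts are taken from the text.

```python
TOTAL_NODES = 2000       # nodes without optimization
SAVED_IDENTICAL = 492    # nodes saved by merging identical operators
SAVED_FULL = 933         # nodes saved by full optimization

def reduction_percent(saved: int, total: int) -> float:
    """Share of saved nodes relative to the unoptimized total, in percent."""
    return 100.0 * saved / total

identical = reduction_percent(SAVED_IDENTICAL, TOTAL_NODES)
full = reduction_percent(SAVED_FULL, TOTAL_NODES)
additional_points = full - identical  # difference in percentage points
```

The exact value for full optimization is 46.65%, which the text rounds to 46.7%; the difference between the two modes is about 22 percentage points.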
Looking at the time needed to insert the subscriptions in Figure 5.1b, it is noticeable that for no optimization and for the merging of identical nodes it remains approximately constant at about 18 seconds. With full optimization, by contrast, it rises steadily from this value to roughly 140 seconds. The constant insertion time without optimization appears plausible, since the steps to be performed are always the same. For the merging of identical nodes, theory would instead predict a linear increase. Its relative constancy in the measurements can be explained by the fact that the time needed to test for identical nodes is very short. The increase in insertion time caused by the growing number of comparisons is therefore very small and lies below the measurement error due to external factors. With full optimization, inserting a new operator tree requires more inclusion tests the more trees the system already contains. Since these tests take far more time than comparisons, the expected linear growth of the insertion time can be demonstrated here.
5.2.2 Discussion of the Scenario-Dependent Results
The probability that two randomly generated subscriptions have the same value for a parameter is inversely proportional to the number of its possible values. In general, therefore, for the same number of subscriptions a smaller scenario allows better optimization than a larger one. Figures 5.2a and 5.2b show the node counts for all four scenarios in the two optimization modes. When only identical nodes are merged, the differences between the scenarios are marginal. This result, surprising at first glance, can be explained as follows: if an operator depends on a parameter, then owing to the assumed uniform distribution a corresponding node can be expected to exist in the system once for each possible value. While enlarging the parameter's value range from two to eight values quadruples the number of possible combinations and thus the size of the scenario, only eight nodes are created instead of two. Even if several nodes depend on parameters, the resulting changes in the node count are negligible compared with the total number of nodes. With full optimization (Figure 5.2b), the maximum difference in total node count between the smallest and the largest scenario is as much as about 160 nodes, which at least partially confirms the expectation from the theoretical considerations.

[Figure: number of operator nodes versus number of subscriptions (0 to 100) for Scenarios 1 to 4; (a) merging of identical operators, (b) full optimization.]
Fig. 5.2: Node savings in the different scenarios
Figure 5.3 shows the insertion times for the test subscriptions in the first scenario. A comparison with Figure 5.1b makes clear that their dependence on the size of the scenario lies below the measurement inaccuracy. Given the very small differences in node counts (Figure 5.2), however, this result is to be expected.
[Figure: insertion time of the body in seconds (0 to 160) versus number of subscriptions (0 to 100), without optimization and with full optimization.]
Fig. 5.3: Insertion times of the subscriptions in Scenario 1
6 Summary
This contribution describes the realization of the PubScribe project in detail. As a first step, the path of a new subscription through the system is used to show which phases of query processing perform which restructurings in order to map the subscription request, specified in XML, onto relational database structures. Since a distinctive feature of subscription systems is that virtually all queries are registered in advance, the goal of an optimization strategy is to minimize the number of nodes to be managed in the system. The technique of cross-query optimization introduced in Section 4 achieves this on the basis of the concept of compensation. An evaluation of these optimization techniques as used in the PubScribe system is given in the last section. The savings achieved, close to 50%, are to be rated as remarkably high.