General overview of XMIDDLE

This material is extracted from Mirco Musolesi's Master degree thesis.

 

 

In this document  we present XMIDDLE, a data sharing middleware for an ad hoc network environment, describing its most interesting and innovative features. The essential motivation of this project is the need of sharing structured data among mobile hosts in this very dynamic context, where disconnections are the norm rather than the exception. XMIDDLE addresses this issue providing efficient mechanisms for automatic data replication and synchronization and hiding the complexity due to intermittent communication. Firstly, we analyse its architecture with a systematic approach; afterwards, we discuss some general issues related to networking and data replication.

 

Introduction

XMIDDLE [MasCZE02] is a middleware for mobile computing that allows mobile hosts (for example PDAs, mobile phones, laptop computers and other wireless devices) to share information with each other, without assuming the existence of any fixed network infrastructure. Furthermore, it specifically addresses ad-hoc networking scenario. In XMIDDLE, connection is symmetric but not transitive; for example, we can consider the following situation: a host Hi is connected to host Hj and this to host Hk; in this case, Hi and Hk may be not connected to each other. In other words, we do not consider a multi-hop scenario, where routing through mobile nodes is allowed.

Firstly, we assume that mobile hosts store their data in tree structures, which allow sophisticated manipulations due to the different node levels, hierarchy and relationships between elements that can be expressed.  XMIDDLE uses XML technologies in order to store, access and modify the data; we will analyse this aspect later, when we describe the implementation of our system.

Data sharing is possible only after a connection between hosts. Host HA is connected with host HB when it is in reach of this. Since we do not consider a particular network infrastructure, the definition of in reach varies depending on different protocols and transmission devices that are used; if we consider a wireless network, this expression means in radio range.

As we have discussed before, in a mobile scenario we have to address the problem of frequent disconnections; in other words, we must provide mechanisms in order to access and modify data in different working conditions, also in the case of disconnections; for this reason, XMIDDLE also allows off-line data manipulation, providing synchronization functionalities through a complex and efficient application dependent reconciliation process.

On each host, a set of possible access points for the owned data tree are defined in order to allow other nodes to link them. This operation that we call link is very similar to the remote directory mounting, a typical of distributed file systems. The access points in a host are grouped in an ExportLink set. For example, we can consider the following scenario: host Hi exports the branch A and hosts Hj and Hk link to this; thus, the owner of the branch A is still host Hi, but data now are shared by the three hosts, in other words these can access and modify them. When a host performs the link operation, a download of the linked branch is started.

After a branch modification, hosts have to notify the others that share it about all the changes introduced (immediately, if these are connected, otherwise, after a reconnection), in order to guarantee data consistency reconciling the copies stored in the different hosts running XMIDDLE applications.

A host is also allowed to make off-line changes to its replica of shared data; when it gets in reach again with other nodes, a reconciliation process of inconsistent replicas starts.

Each host uses two sets in order to store information about the nodes that share their data. These sets are called LinkedBy and LinkedFrom and contain respectively the hosts that are linking to one of its branches and from whom a part of its tree is linked; they are structured as lists of tuples (host, branch). Clearly, the LinkedFrom set does not mirror the current connection configuration; in other words, a host can be in the set of another one even if the two are disconnected; only specific primitives for linking and unlinking modify this set. The LinkedBy set is updated by connection and disconnection operations that can be explicit or implicit (if two hosts get out of reach); it is used to know to whom to notify the eventual changes of parts of the tree. Applications can also explicitly disconnect from the network; this feature allows a host to save battery power or to perform changes without receiving updates from other nodes.

 

 

XMIDDLE architecture

The architecture of XMIDDLE is that typical of a middleware system; we present it according the ISO/OSI reference model. As shown in the following figure, XMIDDLE implements the session and presentation layers on a standard transport protocol, such as UDP, that is provided in mobile networks on top of, for instance, a Bluetooth or a 802.11 data-link (i.e., Logical Link Control and Adaptation Protocol), MAC and physical layer.

 

 

The ISO-OSI protocol stack for mobile environments using XMIDDLE

 

With respect to the presentation layer, XMIDDLE maps XML documents to DOM trees and provides the mobile applications with the primitives to manipulate and to share data; the session layer implementation manages connections and disconnections.

The essential component of the architecture of our system is the XMIDDLE Controller, a concurrent thread that is able to communicate with the underlying transport protocol and to handle new connections and reconnections. Moreover, it can also trigger the reconciliation process of replicas according user-specific policies. Another fundamental module is the XMIDDLE Manager that provides the API in order to develop applications using this middleware; we underline that programs only have access to this component, which abstracts the rest of the middleware, providing a unified and high-level view of the system.

XMIDDLE is entirely implemented in Java and so it relies on a Java Virtual Machine; it is also worth noting that Java provides classes that allow to implement easily applications using TCP/UDP transport protocols.  

 

 

Networking issues

The current implementation of the prototype of our system relies on UDP over IP. UDP was chosen over TCP, since the second is connection-oriented and so it introduces additional overhead. Moreover, TCP is inappropriate for situations where bandwidth is limited and where the network does not have a fixed structure, as in a mobile ad-hoc network setting.

On the other hand, UDP does not provide guaranteed packet delivery as TCP does; it is connectionless and, for this reason, it is more suitable for a mobile environment. XMIDDLE also exploits IP multicast in order to detect the presence of hosts that are in reach (i.e., in the same radio coverage) and to negotiate how to establish private communications between the nodes. We will discuss these issues in details in section 5.7.

Therefore, XMIDDLE can run on any hardware on which UDP over IP can be im-plemented, such as Bluetooth, IrDA, 802.11, Ethernet-based LAN. Consequently, XMIDDLE can also be used in a fixed environment, even if it addresses particularly mobile setting issues, like frequent disconnections. However, it might be exploited in some situations such as the case of a user without a permanent network access and who uses a dial-up connection: in fact, in this case, the user might exploit XMIDDLE in order to work off-line.

We have been testing our system using HP iPAQ PDAs provided with 802.11 wireless LAN support and PCs connected by an Ethernet-based LAN.

 

 

Data manipulation and XML technologies in XMIDDLE

As we have discussed before, XMIDDLE stores data in XML documents that can be semantically associated to trees. Applications are enabled to manipulate them through the DOM API [W3C02b], which we present in Appendix A.

Other XML technologies are exploited in the design of this platform; in particular LinkedBy, LinkedFrom, ExportLink sets are formatted using XPath syntax [ClaD99]. Furthermore, the reconciliation of data trees is based on this class of technologies.

In order to understand how XMIDDLE represents and manipulates the shared data, we present an example from a possible collaborative e-commerce application.

For example, we consider a family, composed of three members, that usually does its weekly shopping electronically; it has a personal computer, where a replica of shop catalogue is maintained and each member owns a PDA, where a subset of products is replicated according his particular interest; every device has an embedded technology to establish an ad-hoc network, such as 802.11 connection. Furthermore, each family member has a replica of the shared shopping basket; they can buy products that are present in the catalogue, which is daily updated (for example by downloading it from an Internet site). Reconciliation of product catalogues and shopping baskets happens whenever the PDAs establish connection to each other or to the PC. This is a possible simplified representation of the catalogue:

 

<?xml version=”1.0”?>

<shop>

  <category name=”Hardware”>

    <item>

      <name>Drill</name>

      <price>50</price>

    </item>

    <item>

      <name>Hammer</name>

      <price>5.50</price>

    </item>

    <item>

      <name>Screwdriver</name>

      <price>2.50</price>

    </item>

  </category>

  <category name=”Dairy”>

    <item>

      <name>Milk</name>

      <price>1.30</price>

    </item>

    <item>

      <name>Cheddar cheese</name>

      <price>3</price>

    </item>

    <item>

      <name>Parmesan</name>

      <price>8</price>

    </item>

  </category>

</shop>

 

In this example we have omitted some attributes that XMIDDLE uses to deal with data conflicts, since now we analyse only some general aspects of our system.

The family members can link the catalogue and the basket that are present in the personal computer. At the beginning, obviously, they share an empty basket. A member may decide to link only a branch of the tree representing the catalogue, for example because he is interested only on hardware products. To link only to the hardware category, he has to specify the corresponding path of the DOM tree. For example, MemberA may link to the hardware category, MemberB to the dairy products and MemberC to the whole catalogue. They also link to the basket. In order to do this, they have to use the following instructions:

 

//MemberA’s link requests

link(“mypc.home.net”, “/shop/category[@name=“Hardware”]”,/);

link(“mypc.home.net”, “/basket”,/);

 

 

//MemberB’s link requests

link(“mypc.home.net”, “/shop/category[@name=“Dairy”]”,/);

link(“mypc.home.net”, “/basket”,/);

 

 

//MemberC’s link requests

link(“mypc.home.net”, “/shop”,/);

link(“mypc.home.net”, “/basket”,/);

 

The first parameter of link() operation is the server host name, in this case the personal computer (we discuss the problem of naming later); the second is the XPath [ClaD99] expression representing the root of the branch to be linked; the third is the “mounting point” on the local host.

The resulting XML tree that is maintained by MemberA is showed in the following figure:

   

 

The tree representation of the data structure stored on Member A’s PDA

 

After these linking operations, the application on each PDA can traverse and manipulate this data tree structure using DOM primitives, for example, in order to display the available products in the catalogue.

We may consider the common case of the purchase of a product; let us suppose that MemberA buys a product (for example one hammer). The modifications on MemberA’s tree are showed in the following figure.

  

 

The tree representation of the data structure stored on MemberA’s PDA after an order has been entered

Moreover, let us suppose that MemberA’s PDA is either out of reach or disconnected, while this update happens. When the device establishes a connection with the personal computer, the reconciliation protocol will reconcile any conflicts due to modifications of the catalogue (for instance, new products or discounted prices); likewise, any update that members have introduced to their shopping basket will be incorporated in the data structure on the PC. In our case, a new sub-branch describing the product that MemberA has bought will be added in the branch with the shop element as root. Thus, the shopping basket on the PC is gradually filled through synchronization with the PDAs owned by the different members of the family.

Every user can update his product basket during a disconnection allowing for possible inconsistent replicas of the data tree. For example, in the application that we are considering two members of the family might buy the same product in different quantities, when they are disconnected. The reconciliation algorithm identifies that a conflict occurred, as the quantities are different; in other words, we have two different values in the same position. It is worth noting that we cannot solve this conflict without using application-specific knowledge (for example, in this case, the highest value has to be chosen) and without considering the “history” of the previous operations performed by users on the data tree. For example, another common case may be the addition of a new product by a member during a disconnection; after a re-connection, the replicas stored in the other PDAs has to be re-synchronized accordingly.

Let us consider a simplified situation in order to better understand this issue: two members of the family buy a product when they are disconnected (or not in reach), having their baskets initially empty. For example MemberB buys a bottle of milk and MemberC buys a drill. The replicas of the documents are clearly inconsistent. For this reason, when they get in reach, a reconciliation process between these different copies has to be performed.

 

XMIDDLE primitives

XMIDDLE provides a simple API that essentially allows applications to connect to and disconnect from the network and manage the shared XML data structures. Now we describe the most important XMIDDLE primitives from an application programmer perspective.

 

The connect() and disconnect() primitives

As already discussed, each entry in the LinkedFrom and LinkedBy tables identifies a remote host and one of its branches and the ExportLink set maintains a list of the branches of the local tree that can be linked from remote hosts. Another fundamental table is InReach that stores the list of the hosts that are in reach.

The connect() primitive allows applications to notify their (re)connection to these nodes; upon this operation, a reconciliation process starts with all the hosts that are linking (or are linked) to a branch of the re-connected node and, at the same time, are in reach. In fact, the LinkedBy set is used to notify the changes to its tree to the other hosts that are included in this list. This is performed after a reconnection or during a connection (we refer to this case as on-line reconciliation). It is worth noticing that a host may be in the connected state but not in reach. In this case the reconciliation process is performed only when it gets in reach again. We can also reformulate in a more precise way the definition of in reach given before: two hosts are connected if they are in the same radio coverage and they have performed the connect() operation.

The disconnect() primitive allows a host to perform an explicit disconnection; this can also be implicit, when the host moves in an area where it is out of reach. The disconnection process modifies the content of the InReach table; we developed a specific protocol to handle this possible and frequent situation, which we will describe later.

 

The link() and unlink() primitives

The XMIDDLE link() primitive allows an application to link an exported tree (or a sub-tree) from a remote host; this is very similar to a mounting operation; in fact, the parameters are the server host, the path in the remote XML document and the position in its data tree where the branch has to be appended. XMIDDLE records the details of this process in the LinkedFrom table. 

The corresponding unlink() primitive modifies the local LinkedFrom table “unmounting” a particular branch of the shared tree.

 

The DOM primitives

XMIDDLE allows application programmers to access XML documents according DOM Level 1 Recommendation, using a standard W3C implementation. It is also possible to retrieve particular sub-branches of the tree, for example the ones that are exported by the host. Essentially, the middleware provides two fundamental primitives to access the data: the open() primitive returns the sub-tree specified by using an XPath expression, whereas the close() primitive “commits” the changes. After this operation the local copy is modified and a synchronization process with the remote replicas stored in the other hosts is performed, in order to guarantee data consistency.

It is worth underlining that all these concurrent operations on the data structure stored in each host are synchronized; monitor and resource sharing concepts and techniques are applied in order to avoid accesses to inconsistent information.

 

XMIDDLE protocols

Another interesting feature of XMIDDLE is its sophisticated protocol execution mechanism. Protocols can be plugged into the middleware; the system allows their dynamic negotiation between hosts in order to perform the requested operation in an efficient way. Moreover, if a host does not have a specific protocol needed for a particular service, it transfers it from another node transparently.

Furthermore, an application can add a new protocol, which can be accessed and used by the middleware. Each protocol must inherit from a specific Java class (edu.UCL.xmiddle.framework.lib.protocols.Protocol) and must follow determinate specifications. For example, our implementation based on UDP includes the reconciliation and linking protocol.

Another important aspect to point out is the concept of XMIDDLE protocol session; we define it as a structured networked communication between two or more XMIDDLE hosts; to start it, all the hosts that are involved needs to receive a request for the session. This can be automatically sent by the middleware in the case of an automatic reconciliation process or applications can explicitly send it, as in the case of linking protocol. 

 

 

Host identification

A fundamental problem in distributed systems design is naming; in our case we have to distinguish different hosts in the same radio coverage, which form an ad-hoc network. We cannot use IP addresses, because they might change in different environments; for instance, this is the case of dynamic IP addresses.

Our solution is to use two types of identification credentials, the Primary ID and the Secondary ID. The first is assumed to be unique, in other words the existence of two different hosts with the same Primary ID is not allowed; it can be an Ethernet address, a Bluetooth’s BD_ADDR, a number, etc. In the current implementation of XMIDDLE we simply use a large random integer, generated by java.util.random class.

The Secondary ID is used to identify hosts in a user-friendly manner; we do not assume that it has to be unique. Possible options are hostnames, usernames, etc.

Furthermore, it is worth noticing that the Secondary ID is not used in order to identify hosts during communication protocols; its purpose is only to allow applications to present to end users an understandable representation of the hosts that are currently in reach.The current implementation of XMIDDLE uses hostnames as Secondary IDs.

 

 

The target application domains

XMIDDLE addresses the needs of all developers that have to design applications based on data sharing in a mobile environment.

The range of possible applications is large, covering various fields from collaborative work to health-care service, from m-commerce to disaster management.

We underline that XMIDDLE is suitable both for an ad-hoc network setting, where hosts are peers and usually characterised by scarcity of resources, and for a nomadic setting where mobile devices interacts with more powerful machines, such as database services. However, XMIDDLE targets specifically the needs of developers that have to design ad-hoc networking applications; since the constraints imposed by this scenario are very restrictive, our system can also be used in situations where resource-rich devices are available, such as in a nomadic setting.

It is worth noticing that data structures must not be complex, due to memory and computational capability limitations. For example, users may link only to a portion of data in XML exported by a fixed database service host, in order to be able to store and manipulate them efficiently.

We have developed some applications based on the XMIDDLE platform in order to test and evaluate the performance of our middleware, a collaborative e-commerce application and a meeting organizer.

Moreover, XMIDDLE is currently used in a project developed jointly by University College of London and Elan Technologies in the field of health care data management.

   

 

Documentation Main Page

References

 

Last update: January 2003 by Mirco Musolesi