Modeling for cloud part1

From Knoesis wiki
Revision as of 17:30, 13 July 2010 by Michaelcooney (Talk | contribs)

Semantic Modeling for Cloud Computing, Part 2


Amit Sheth and Ajith Ranabahu • Wright State University

Part 1 of this two-part article discussed challenges related to cloud computing, cloud interoperability, and multidimensional analysis of cloud-modeling requirements (see the May/June issue). Here, we look more specifically at areas in which semantic models can support cloud computing.

Opportunities for Semantic Models in Cloud Computing
Semantic models are helpful in three aspects of cloud computing.

The first is functional and nonfunctional definitions. The ability to define application functionality and quality-of-service details in a platform-agnostic manner can immensely benefit the cloud community. This is particularly important for porting application code horizontally (that is, across silos). Lightweight semantics, which we describe in detail later, are particularly applicable.

The second aspect is data modeling. A crucial difficulty developers face is porting data horizontally across clouds. For example, moving data from a schema-less data store (such as Google Bigtable [1]) to a schema-driven data store such as a relational database presents a significant challenge. For a good overview of this concern, see the discussion of customer scenarios in the Cloud Computing Use Cases White Paper (www.scribd.com/doc/18172802/Cloud-Computing-Use-Cases-Whitepaper). The root of this difficulty is the lack of a platform-agnostic data model. Semantic modeling of data to provide a platform-independent data representation would be a major advantage in the cloud space.

The third aspect is service description enhancement. Clouds expose their operations via Web services, but these service interfaces differ between vendors. The operations’ semantics, however, are similar. Metadata added through annotations pointing to generic operational models would play a key role in consolidating these APIs and enable interoperability among the heterogeneous cloud environments.
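To make the third aspect concrete, the following is a minimal sketch of how annotations pointing to a generic operational model could let a client address differently named vendor operations through one shared concept. Everything in it is hypothetical: the vendors (`vendor_a`, `vendor_b`), the operation names, and the `cloud:` concept identifiers are invented for illustration and do not correspond to any real cloud API or ontology.

```python
from dataclasses import dataclass

# Hypothetical generic operational model: the concept names are illustrative only.
GENERIC_OPERATIONS = {"cloud:StartInstance", "cloud:StopInstance"}


@dataclass(frozen=True)
class ServiceOperation:
    vendor: str      # hypothetical vendor identifier
    name: str        # vendor-specific operation name (made up)
    model_ref: str   # annotation pointing into the generic model


# Each vendor names its operations differently, but the annotations agree.
OPERATIONS = [
    ServiceOperation("vendor_a", "RunMachine", "cloud:StartInstance"),
    ServiceOperation("vendor_a", "HaltMachine", "cloud:StopInstance"),
    ServiceOperation("vendor_b", "boot_server", "cloud:StartInstance"),
    ServiceOperation("vendor_b", "shutdown_server", "cloud:StopInstance"),
]


def resolve(vendor: str, concept: str) -> str:
    """Return the vendor-specific operation annotated with a generic concept."""
    if concept not in GENERIC_OPERATIONS:
        raise KeyError(f"unknown generic operation: {concept}")
    for op in OPERATIONS:
        if op.vendor == vendor and op.model_ref == concept:
            return op.name
    raise LookupError(f"{vendor} has no operation annotated with {concept}")


def implementations(concept: str) -> dict:
    """Map every vendor to its operation for the given generic concept."""
    return {op.vendor: op.name for op in OPERATIONS if op.model_ref == concept}


if __name__ == "__main__":
    print(resolve("vendor_b", "cloud:StartInstance"))
    print(implementations("cloud:StopInstance"))
```

The client code only ever mentions the generic concept; the annotation table carries the vendor differences, which is the consolidation role the text attributes to such metadata.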

Functional Portability
From a perspective of the cloud landscape based on the language abstraction and type of semantics (that is, viewing the cube in Figure 1 from the top), we see that opportunities exist to use semantic models to define applications’ functional aspects in a platform-agnostic manner. In most cases, however, converting a high-level model directly to executable artifacts pollutes both representations. Intermediate representations are important to provide a convenient conversion.

Applying high-level modeling to describe an application’s functional aspects isn’t new. Many software development companies have been using UML to model application functionality at a high level and use artifacts derived from these models to drive development. This process is commonly called model-driven development. This is an example of using high-level models to derive fine-grain artifacts. UML models usually don’t include code, so you can use them only to generate a skeletal application. That is, low-level details are deliberately kept away from the high-level models.

UML models, however, are inherently bound with object-oriented languages, and UML-driven development processes depend heavily on advanced tools (for example, IBM’s Rational Rose; www-01.ibm.com/software/awdtools/developer/rose). This limits UML’s applicability.

A popular alternative to such tool-dependent heavy upfront models is domain-specific languages (DSLs). Their popularity is due partly to the availability of extensible interpreted programming languages such as Ruby and Python. Unlike UML, a DSL is applicable only in a given domain but enables a lightweight model in that domain, often without requiring proprietary tools. For example, you can use IBM’s Sharable Code DSL (http://services.alphaworks.ibm.com/isc), which is a mashup generator, with a basic text editor. (However, providing graphical abstractions and specialized tooling would be more convenient for users.) “Lightweight” signifies that these models don’t use rich knowledge representation languages and so have limited reasoning capabilities.

Our Cirrocumulus project for cloud interoperability (http://knoesis.org/research/srl/projects/cirrocumulus) uses DSLs to bridge the gap between executable artifacts and high-level semantic models. A DSL, although domain specific, can provide a more programmer-oriented representation of functional, nonfunctional, or even data descriptions.

A best-of-both-worlds approach is to use annotations to link models, which provides the convenience of lightweight models while supporting high-level operations when required. Figure 2 shows an annotation referring to an ontology from a fictitious DSL script for configuration. The script is more programmer-oriented (in fact, it’s derived from Ruby) but lacks an ontology’s richness. However, the annotation links the relevant components between the different levels, providing a way to facilitate high-level operations while maintaining a simpler representation.

From the perspective based on the type of semantics and software lifecycle stage (that is, looking at the cube in Figure 1 from the front), you can see the modeling coverage for software deployment and management. Elastra’s Elastic Computing Modeling Language (ECML), Elastic Deployment Modeling Language (EDML), and Elastic Management Modeling Language (EMML) ontologies cover many system aspects and some nonfunctional aspects in all stages. We’re pleased to see such industry initiative in adopting semantic technologies.

Chart1.png

Chart2.png

Data Modeling

Another opportunity for semantic models for clouds lies in Resource Description Framework (RDF) data modeling. As we discussed in part 1, a major concern plaguing cloud computing’s adoption is data lock-in (that is, the inability to port data horizontally). Many vendors designed schema-less, distributed data stores with relaxed consistency models to provide high availability and elasticity to suit clouds’ needs. However, exploiting these data stores requires substantial redesign of many data-driven applications and often makes porting data to a traditional relational database extremely difficult. The current practice is to address such transitions case by case. A better approach is to model the data in RDF and generate the specific target representations, and in some cases even the code for the application’s data access layer. This method can formulate transformations from one representation to another using the lifting-lowering mechanism. Semantic Annotations for WSDL and XML Schema (SAWSDL; www.w3.org/TR/sawsdl) demonstrated this mechanism’s use for data mediation [2].

Chart3.png

Lightweight modeling in terms of DSLs also applies here. For example, the Web services community has long used XML Schema definitions as platform-agnostic data definitions. Schema definitions serve as inputs to code generation tools that generate platform-specific data definitions.

From the perspective of the type of semantics and software lifecycle stage, most of this data modeling applies during application development. Concrete artifacts generated from these high-level models would be used mostly during subsequent lifecycle stages.
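The lifting-lowering idea described in this section can be sketched in a few lines. The example below is only an illustration under stated assumptions: plain `(subject, predicate, object)` tuples stand in for RDF triples, the `ex:` prefix, the `users` table, and its columns are hypothetical, and no real RDF or SAWSDL tooling is involved. "Lifting" turns a schema-less record into the neutral triple form; "lowering" turns those triples into a row for a fixed-schema relational table.

```python
import sqlite3


def lift_from_key_value(entity_id, record):
    """Lift a schema-less record into neutral triples (stand-ins for RDF)."""
    subject = f"ex:{entity_id}"
    # Sort keys so the output order is deterministic.
    return [(subject, f"ex:{key}", value) for key, value in sorted(record.items())]


def lower_to_relational(triples, table, columns):
    """Lower triples into a parameterized SQL INSERT for a fixed-column table.

    Attributes missing from the triples become None (SQL NULL); attributes
    that have no matching column are dropped.
    """
    by_predicate = {pred.split(":", 1)[1]: obj for _, pred, obj in triples}
    values = tuple(by_predicate.get(col) for col in columns)
    placeholders = ", ".join("?" for _ in columns)
    sql = f"INSERT INTO {table} ({', '.join(columns)}) VALUES ({placeholders})"
    return sql, values


if __name__ == "__main__":
    # Hypothetical record as it might come out of a schema-less store.
    record = {"name": "Ada", "email": "ada@example.org"}
    triples = lift_from_key_value("user42", record)

    # Hypothetical relational target with one column the record lacks.
    columns = ["name", "email", "phone"]
    sql, values = lower_to_relational(triples, "users", columns)

    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE users (name TEXT, email TEXT, phone TEXT)")
    conn.execute(sql, values)
    print(conn.execute("SELECT name, email, phone FROM users").fetchall())
```

Because both directions go through the neutral form, adding a new source or target store means writing one lift or one lower function rather than a pairwise converter for every combination, which is the advantage over handling each transition case by case.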

Service Enrichment
One feature differentiating the cloud from other distributed environments is the availability of Web services to manipulate resources. Availability of the service APIs lets you programmatically manage the cloud resources, even from within the same cloud. These capabilities have revolutionized application deployment and management. For example, you can compose or mash up well-defined services to facilitate elaborate workflows.

Service definitions are usually syntactic, and many researchers have focused on embedding rich metadata in formal service descriptions. One result of this research is SAWSDL. A growing trend is to annotate HTML descriptions to embed richer, machine-readable semantic metadata. One reason for this method’s popularity is search engines’ use of metadata to display results in customized formats. Yahoo’s SearchMonkey and Google Rich Snippets are two such microformat-driven schemes. These annotations, unlike the DSL annotations in Figure 3, might not always point to ontologies. Their structure can be based on a vocabulary or taxonomy, a lower-grade nonsemantic model. For example, the popular hCalendar microformat is part of the “lowercase semantic web” movement, which emphasizes lightweight models.

Embedding rich semantic metadata in cloud service descriptions has three main benefits that go beyond customized search capabilities.

The first benefit deals with Representational State Transfer (REST) style services. Many cloud service providers adopt REST-style Web services that don’t advocate a formal service description. These services are described using HTML pages. WSDL 2.0, the latest specification, explicitly supports formal description of “RESTful” services but hasn’t seen quick adoption. Alternative approaches such as SA-REST [3] (SA stands for semantic annotation), a generic annotation scheme that follows microformat design principles, are becoming more applicable in this space. These annotations enable the seamless, flexible integration of formalizations into RESTful service descriptions. This opens the door to many exciting avenues such as faceted search to identify relevant reusable services and semiautomated service compositions.

The second benefit deals with handling change. The cloud space is still evolving. If the history of software or component interoperability