Difference between revisions of "Modeling for cloud part1"

 +
This is the Wiki version of the article published in IEEE Internet Computing May/June 2010 Issue.
 +
[http://www.computer.org/portal/web/csdl/doi/10.1109/MIC.2010.77 External Reference]
 +
 
<table border="1" width="650" cellpadding="10">
 
<table border="1" width="650" cellpadding="10">
 
<tr>
 
<tr>
 
<td width="650">
 
<td width="650">
<span style="font-size:28pt;color:purple">Semantic Modeling<br /><br />for Cloud Computing, Part 2</span><br /><br />
+
<span style="font-size:28pt;color:purple">Semantic Modeling<br /><br />for Cloud Computing, Part 1</span><br /><br />
 
Amit Sheth and Ajith Ranabahu • <i>Wright State University</i><br /><br />
 
Amit Sheth and Ajith Ranabahu • <i>Wright State University</i><br /><br />
 
<p style="float:left;width:300px">
 
<p style="float:left;width:300px">
Part 1 of this two-part article discussed
+
<span style="font-size:16pt;color:purple">
challenges related to cloud computing,  
+
C</span>loud computing has lately become the
cloud interoperability, and multidimen-
+
attention grabber in both academia and
sional analysis of cloud-modeling requirements
+
industry. The promise of seemingly unlimited, readily available utility-type computing
(see the May/June issue). Here, we look more
+
has opened many doors previously considered
specifically at areas in which semantic models
+
difficult, if not impossible, to open. The cloud-
can support cloud computing.<br /><br />
+
computing landscape, however, is still evolving,  
 +
and we must overcome many challenges to foster widespread adoption of clouds.
 +
The main challenge is interoperability.
 +
Numerous vendors have introduced paradigms
 +
and services, making the cloud landscape
 +
diverse and heterogeneous. Just as in the computer hardware industry’s early days, when each
 +
vendor made and marketed its own version of  
 +
(incompatible) computer equipment, clouds are
 +
diverse and vendor-locked. Although many
 +
efforts are under way to standardize clouds’
 +
important technical aspects, notably from the  
 +
US National Institute of Standards and Technology (NIST), consolidation and standardization
 +
are still far from reality. In this two-part article,  
 +
we discuss how a little bit of semantics can help
 +
address clouds’ key interoperability and porta-
 +
bility issues.<br /><br />
 
<span style="font-size:12pt;color:purple">
 
<span style="font-size:12pt;color:purple">
Opportunities for Semantic Models in Cloud Computing</span><br />
+
Cloud-Related Challenges</span><br />
Semantic models are helpful in three aspects of  
+
Figure 1 shows the three main flavors of clouds
cloud computing.
+
as outlined by NIST (http://csrc.nist.gov/groups/SNS/cloud-computing). Infrastructure-as-a-­
The first is functional and nonfunctional
+
service (IaaS) clouds have the largest gap (a
definitions. The ability to define application
+
high workload but little automation) in terms of
functionality and quality-of-service details in
+
deploying and managing an application. Platform-as-a-service (PaaS) or software-as-a-service
a platform-agnostic manner can immensely
+
(SaaS) clouds have substantially lower workloads,
benefit the cloud community. This is particu-
+
but at the expense of flexibility and portability.
larly important for porting application code
+
Given this diverse environment, a cloud service consumer faces four challenges.
horizontally—that is, across silos. Lightweight
+
First, depending on the application’s requirements, legal issues, and other possible consid</p>
semantics, which we describe in detail later, are
+
<p style="float:right;width:300px">erations, the consumer must select a cloud to  
particularly applicable.
+
use. Each cloud vendor exposes these details in  
The second aspect is data modeling. A crucial
+
different formats and at different granularity
difficulty developers face is porting data hori-
+
levels.
zontally across clouds. For example, moving data
+
Second, the consumer must learn about the  
from a schema-less data store (such as Google
+
vendor’s technical aspects (service interface,
Bigtable1) to a schema-driven data store such
+
scaling configuration, and so on) and workflow.
as a relational database presents a significant
+
Third, the consumer must then develop an
challenge. For a good overview of this concern,
+
application or customize the vendor-provided
see the discussion of customer scenarios in the
+
multitenant application to fulfill his or her
Cloud Computing User Cases White Paper (www.scribd.com/doc/18172802/Cloud-Computing-Use-Cases-Whitepaper). The root of this dif-
+
requirements. When doing this, the consumer
ficulty is the lack of a platform-agnostic data
+
must take into account various technical details
model. Semantic modeling of data to provide a
+
such as the choice of programming language
platform-independent data representation would
+
and limitations in the application runtime,
be a major advantage in the cloud space.
+
which will all be vendor-specific.
The third aspect is service description
+
Finally, after deploying the application, if
enhancement. Clouds expose their operations
+
the consumer must change the service provider
via Web services, but these service interfaces
+
(which happens surprisingly often), at least two
differ between vendors. The operations’ seman-</p>
+
major considerations arise. First, the consumer
<p style="float:right;width:300px">tics, however, are similar. Metadata added
+
might need to rewrite or modify the application
through annotations pointing to generic opera-
+
code to suit the new provider’s environment. For
tional models would play a key role in consoli-
+
some clouds (such as IaaS), this is minimal, but
dating these APIs and enable interoperability
+
porting the code in PaaS and SaaS clouds will
among the heterogeneous cloud environments.<br /><br />
+
likely require more effort.
<span style="font-size:12pt;color:purple">Functional Portability</span><br />
+
The second consideration is that data col-
From a perspective of the cloud landscape
+
lected for the application might need transfor-
based on the language abstraction and type
+
mation. Data is the most important asset the
of semantics (that is, viewing the cube in Fig-
+
application generates over time and is essential
ure 1 from the top), we see that opportunities
+
for continued functioning. The transformation
exist to use semantic models to define applica-
+
might even need to carry across different data
tions’ functional aspects in a platform-agnostic
+
models. The industry practice is to address such
manner. In most cases, however, converting a
+
transformations case-by-case.
high-level model directly to executable arti-
+
To overcome these challenges and provide
facts pollutes both representations. Intermedi-
+
better insight into the aspects requiring attention, proper modeling in this space is essential.
ate representations are important to provide a
+
Semantic modeling can help with this.<br /><br />
convenient conversion.
+
<span style="font-size:12pt;color:purple">
Applying high-level modeling to describe an
+
Cloud Interoperability?</span><br />
application’s functional aspects isn’t new. Many
+
First, a word about cloud interoperability is in
software development companies have been
+
order. Interoperability requirements in the cloud </p>
using UML to model application functionality
+
at a high level and use artifacts derived from
+
these models to drive development. This process
+
is commonly called model-driven development.
+
This is an example of using high-level models to
+
derive fine-grain artifacts. UML models usually
+
don’t include code, so you can use them only
+
to generate a skeletal application. That is, low-
+
level details are deliberately kept away from the
+
high-level models.
+
UML models, however, are inherently bound
+
with object-oriented languages, and UML-
+
driven development processes depend heavily
+
on advanced tools (for example, IBM’s Ratio-
+
nal Rose; www-01.ibm.com/software/awdtools/
+
developer/rose). This limits UML’s applicability.
+
A popular alternative to such tool-dependent
+
heavy upfront models is domain-specific lan-
+
guages (DSLs). Their popularity is due partly </p>
+
 
</td></tr>
 
</td></tr>
 
<tr><td>
 
<tr><td>
<p style="float:left;width:250px">to the availability of extensible
+
<p style="float:left;width:200px">landscape in Figure 1 arise owing to  
interpreted programming languages
+
two types of heterogeneities.
such as Ruby and Python. Unlike
+
The first is vertical heterogeneity — that is, within a single silo. We
UML, a DSL is applicable only in a
+
can address this by using middleware to homogenize the API and
given domain but enables a light-
+
sometimes by enforcing standardization. For example, the Open Virtualization Format (OVF; www.dmtf.org/
weight model in that domain, often
+
vman) is an emerging standard that
without requiring proprietary tools.  
+
allows migration of virtual-machine
For example, you can use IBM’s
+
snapshots across IaaS clouds.
Sharable Code DSL (http://services.alphaworks.ibm.com/isc), which is a
+
The second type is horizontal
mashup generator, with a basic text
+
heterogeneity—that is, across silos.
editor. (However, providing graphical
+
Overcoming this is fundamentally
abstractions and specialized tooling
+
more difficult. Each silo provides
would be more convenient for users.)
+
different abstraction levels and services. High-level modeling pays off,
“Lightweight” signifies that these
+
especially when you must move an
models don’t use rich knowledge rep-
+
application and code horizontally
resentation languages and so have
+
across these silos.
limited reasoning capabilities.
+
Surprisingly, many small and
Our Cirrocumulus project for  
+
medium businesses make horizontal
cloud interoperability (http://knoesis.org/research/srl/projects/cirrocumulus) uses DSLs to bridge the  
+
transitions. PaaS clouds offer faster
gap between executable artifacts
+
setup for applications, and many
and high-level semantic models. A
+
exploit the free hosting opportunities of some platform cloud providers
DSL, although domain specific, can
+
(for example, Google’s App Engine).  
provide a more programmer-oriented
+
When the application grows in scope
representation of functional, non-
+
and criticality, however, an IaaS
functional, or even data descriptions.
+
cloud might prove cheaper, more  
A best-of-both-worlds approach
+
flexible, and more reliable, prompting a transition.
is to use annotations to link mod-
+
The following discussion on
els, which provides the convenience
+
semantic models applies to both ver-
of lightweight models while sup-
+
tical and horizontal interoperability.  
porting high-level operations when
+
The key in addressing both types is  
required. Figure 2 shows an annota-
+
that many of the core data and services causing them follow the same
tion referring to an ontology from
+
semantic concepts. For example,  
a fictitious DSL script for configu-
+
almost all IaaS clouds follow conceptually similar workflows when allocating resources, although the actual
ration. The script is more program-
+
service implementations and tools differ significantly. Similarly, the PaaS
mer-oriented (in fact, it’s derived
+
modeling space is a subset of that for
from Ruby) but lacks an ontology’s
+
IaaS, from a semantic perspective.
richness. However, the annota-
+
These observations prompt us to
tion links the relevant components
+
argue for semantic models’ applicability, especially to supplement
between the different levels, pro-
+
interoperability in the cloud space.
viding a way to facilitate high-level
+
Some parts of the scientific and  
operations while maintaining a  
+
engineering community weren’t </p>
simpler representation.
+
<p style="float:right;width:400px">[[File:Chart4.png|380px]]<br /><br />
From the perspective based on
+
<table width="400" border="0" cellpadding="5">
the type of semantics and software
+
lifecycle stage—that is, looking at the
+
cube in Figure 1 from the front—you
+
can see the modeling coverage for
+
software deployment and manage-
+
ment. Elastra’s Elastic Computing
+
Modeling Language (ECML), Elas- </p>
+
<p style="float:right;width:350px">[[File:Chart1.png|350px]]<br /><br />[[File:Chart2.png|350px]]<br /><br />
+
<table width="350" border="0" cellpadding="5">
+
 
<tr>
 
<tr>
<td width="175"><p style="float:left">tic Deployment Modeling Language
+
<td width="200"><p style="float:left">impressed by early semantic-modeling approaches, especially ones that
(EDML), and Elastic Management
+
required large up-front investment.
Modeling Language (EMML) ontolo-
+
This perception, however, is changing rapidly with the influx of new
gies cover many system aspects
+
applications and technologies that
and some nonfunctional aspects
+
exploit detailed semantic models to
in all stages. We’re pleased to see
+
provide improved functionality (for
such industry initiative in adopting
+
example, biomedical ontologies cataloged at the US National Center for
semantic technologies.
+
Biomedical Ontology; www.bioon-
Data Modeling
+
tology.org). Semantic models such
Another opportunity for semantic </p></td>
+
as ontologies can formalize more
<td width="175"><p style="float:left">models for clouds lies in Resource
+
details than other traditional modeling techniques. They also enable
Description Framework (RDF) data
+
reasoning, a way to make inferences and gain new knowledge. Such
modeling. As we discussed in part
+
rich models can improve traditional
1, a major concern plaguing cloud
+
functions far beyond what was once
computing’s adoption is data lock-
+
thought possible. For example, Powerset (now part of Microsoft; www.
in—that is, the inability to port data
+
powerset.com), a search service
horizontally. Many vendors designed
+
provider, added considerable value
schema-less, distributed data stores
+
to search results by incorporating
with relaxed consistency models to
+
semantic models and thus enabling
provide high availability and elas-
+
fact discovery. These capabilities are
ticity to suit clouds’ needs. However, </p></td>
+
being complemented by the ability
 +
to more rapidly create domain models, often by mining crowd knowledge represented, for example, in  
 +
Wikipedia1 or shared data, exemplified by the linked-object data cloud.  
 +
The technologies showcased at the
 +
Semantic Technology Conference </p></td>
 +
<td width="200"><p style="float:left">(www.semant ic-conference.com)  
 +
also provide plenty of evidence of
 +
semantic-empowered commercial
 +
and scientific applications.<br /><br />
 +
<span style="font-size:12pt;color:purple">
 +
Multidimensional Analysis
 +
of Cloud-Modeling
 +
Requirements</span><br />
 +
We suggest a 3D slicing of the modeling requirements along the following dimensions (see Figure 2).
 +
The­ types­ of­ semantics that are
 +
useful for porting or interoperabil-
 +
ity in cloud computing are similar to
 +
those we introduced in 2003 for Web
 +
services.
 +
2 This is natural because the
 +
primary means of interacting with a
 +
cloud environment is through Web
 +
services. The four types of semantics—data, functional, nonfunctional, and system—are based on the  
 +
different semantic aspects a model
 +
must cover.
 +
The­ language­ abstraction­ level
 +
indicates the modeling’s granularity and specificity. Although ontological modeling is preferable at
 +
a higher level, developers prefer
 +
detailed, concrete syntactic representations. These representations
 +
of different granularities might
 +
need to be related, often through
 +
explicit annotations. For example,  
 +
although service developers happily </p></td>
 
</tr>
 
</tr>
 
</table>
 
</table>
 
<tr>
 
<tr>
 
<td>
 
<td>
[[File:Chart3.png|600px]]
+
[[File:Chart1.png|380px]]
<p style="float:left;width:200px">exploiting these data stores requires
+
<p style="float:left;width:200px">use Web Services Description Lan-
substantial redesign of many data-
+
guage (WSDL) descriptions in their
driven applications and often makes
+
Web services, these descriptions
porting data to a traditional rela-
+
are syntactic and can’t provide useful semantic details. To overcome
tional database extremely difficult.
+
this deficiency, SAWSDL (Semantic
The current practice is to address
+
Annotations for WSDL and XML  
such transitions case-by-case.
+
Schema; www.w3.org/TR/sawsdl)  
A better approach is to model the
+
attaches semantic-model details to
data in RDF and generate the specific
+
WSDL documents.
target representations, and in some
+
The­ software­ lifecycle­ stage is
cases even the code for the applica-
+
important in determining the modeling requirements. For example, some
tion’s data access layer. This method
+
nonfunctional and system require-
can formulate transformations from
+
ments might not be modeled during
one representation to another using
+
development but will be taken into
the lifting-lowering mechanism.
+
account only during deployment. A
Semantic Annotations for WSDL  
+
different team handles each of these
and XML Schema (SAWSDL; www.
+
lifecycle stages; this separation is
w3.org/TR/sawsdl) demonstrated this
+
important so that one team doesn’t
mechanism’s use for data mediation.
+
step on another’s toes. This separation aims to focus the modeling
2
+
effort on the correct time and people.
Lightweight modeling in terms of
+
Some cloud models fall under
DSLs also applies here. For example,  
+
the nonfunctional/system/  ontology
the Web services community has
+
space in Figure 2. Such models include
long used XML Schema definitions
+
the Elastic Computing Modeling
as platform-agnostic data definitions.  
+
Language (ECML), Elastic Deployment Modeling Language (EDML), </p>
Schema definitions serve as inputs to
+
code generation tools that generate
+
platform-specific data definitions.
+
From the perspective of the type
+
of semantics and software lifecycle
+
stage, most of this data modeling
+
applies during application develop-
+
ment. Concrete artifacts generated
+
from these high-level models would
+
be used mostly during subsequent
+
lifecycle stages.</p>
+
 
<p style="float:left;width:10px"> </p>
 
<p style="float:left;width:10px"> </p>
<p style="float:left;width:200px">Service Enrichment
+
<p style="float:left;width:200px">
One feature differentiating the cloud
+
and Elastic Management Modeling
from other distributed environments
+
Language (EMML), all based on OWL
is the availability of Web services
+
and published by Elastra ([http://www.elastra.com/technology/languages Elastra Languages]).
to manipulate resources. Avail-
+
However, some aspects of cloud  
ability of the service APIs lets you
+
modeling have received little or no
programmatically manage the cloud  
+
attention. For example, there’s no
resources, even from within the
+
comprehensive higher-level modeling in the data and functional spaces.  
same cloud. These capabilities have  
+
Lessons learned during large-scale
revolutionized application deploy-
+
ontological modeling in the Semantic Web and Semantic Web services,
ment and management. For example,  
+
biology, and many other domains are  
you can compose or mash up well-
+
readily applicable here and would
defined services to facilitate elabo-
+
help address some of the challenges
rate workflows.
+
in the cloud space.<br />
Service definitions are usually
+
<span style="font-size:16pt;color:purple">
syntactic, and many researchers
+
I</span>n part 2, we’ll look at opportunities
have focused on embedding rich
+
for semantic models in cloud computing. Particular areas in which
metadata in formal service descrip-
+
these models can help are functional
tions. One result of this research
+
portability, data modeling, and ser-
is SAWSDL. A growing trend is
+
vice enrichment. <br /><br />
to annotate HTML descriptions to
+
<span style="font-size:12pt;color:purple">
embed richer, machine-readable
+
References</span><br />
semantic metadata. One reason for
+
1C. Thomas et al., “Growing Fields of  
this method’s popularity is search
+
Interest—Using an Expand and Reduce
engines’ use of metadata to dis-
+
Strategy for Domain Model Extraction,</p>
play results in customized formats.
+
<p style="width:200px;float:right">Proc.­ 2008­ Int’l­ Conf.­ Web­ Intelligence­
Yahoo’s SearchMonkey and Google
+
and­ Intelligent­ Agent­ Technology (WI-
Rich Snippets are two such micro-
+
IAT 08), vol. 1, IEEE CS Press, 2008, pp.  
format-driven schemes. These anno-
+
496–502.
tations, unlike the DSL annotations
+
2. K. Sivashanmugam et al., “Adding
in Figure 3, might not always point
+
Semantics to Web Services Standards,”  
to ontologies. Their structure can be
+
Proc.­Int’l­Conf.­Web­Services (ICWS 03),  
based on a vocabulary or taxonomy—
+
CSREA Press, 2003, pp. 395–401.
a lower-grade nonsemantic model.
+
Amit Sheth is the director of Kno.e.sis—the
For example, the popular hCalendar </p>
+
Center of Excellence on Knowledge-
<p style="width:200px;float:right">microformat is part of the “lowercase
+
Enabled Human-Centered Computing at  
semantic web” movement, which
+
Wright State University. He’s also the  
emphasizes lightweight models.
+
university’s LexisNexis Ohio Eminent  
Embedding rich semantic meta-
+
Scholar and an IEEE Fellow. He’s on the  
data in cloud service descriptions has
+
Web at [http://knoesis.org/amit http://knoesis.org/amit].
three main benefits that go beyond
+
customized search capabilities.
+
The first benefit deals with Rep-
+
resentational State Transfer (REST)
+
style services. Many cloud service
+
providers adopt REST-style Web ser-
+
vices that don’t advocate a formal
+
service description. These services
+
are described using HTML pages.
+
WSDL 2.0, the latest specification,  
+
explicitly supports formal descrip-
+
tion of “RESTful” services but hasn’t
+
seen quick adoption. Alternative
+
approaches such as SA-REST3 (SA
+
stands for semantic annotation), a
+
generic annotation scheme that fol-
+
lows microformat design principles,
+
are becoming more applicable in
+
this space. These annotations enable
+
the seamless, flexible integration of
+
formalizations into RESTful service
+
descriptions. This opens the door
+
to many exciting avenues such as
+
faceted search to identify relevant
+
reusable services and semiautomated
+
service compositions.
+
The second benefit deals with
+
handling change. The cloud space is
+
still evolving. If the history of soft-
+
ware or component interoperability </p>
+
</td>
+
</tr>
+
<tr>
+
<td>
+
<p style="float:left;width:250px">is any guide, achieving consensus in
+
the cloud space will be difficult and
+
won’t likely happen soon. Attaching
+
formalizations via annotations, how-
+
ever, is flexible enough to accom-
+
modate an evolving model. This is
+
especially attractive to vendors who
+
aren’t willing to invest heavily in
+
interim standards.
+
The third benefit is that the for-
+
malizations apply not only to ser-
+
vice descriptions but also to many
+
other aspects such as service-level
+
agreements (SLAs) and software
+
licenses. You can use annotations
+
to embed formalizations even for
+
these documents, facilitating more
+
automation in the cloud space. For
+
example, Web Service- Level Agree-
+
ment (WSLA; www.research.ibm.
+
com/wsla) specification provides a
+
way to formalize SLAs, but creating
+
and maintaining these formaliza-
+
tions is time-consuming.
+
Figure 3 illustrates using SA-
+
REST annotations on the Amazon
+
Elastic Compute Cloud (EC2) SLA
+
document. It shows how a capable
+
processor could use these annota-
+
tions to extract a WSLA equivalent
+
of the human-readable SLA.
+
These benefits’ importance comes
+
into perspective when you consider
+
the enormous body of research on
+
standard-driven service compo-
+
sitions and agreement matching.
+
The informal, non-standard-driven
+
nature of many cloud services made
+
most of the previous research inap-
+
plicable. However, being able to
+
glean formalizations from existing
+
documents opens the doors to apply
+
many well-researched techniques.
+
The cloud space presents many
+
opportunities for researchers,  
+
and we see a plethora of applica-
+
tions that use semantic modeling.
+
Issues such as interoperability and
+
data portability, which the cloud
+
community is facing right now, are
+
the very issues for which semantic </p>
+
<p style="float:left;width:10px"> </p>
+
<p style="float:left;width:250px">models excel in providing solutions.  
+
However, learning from the past,
+
we advocate a multilevel modeling
+
strategy to provide smooth tran-
+
sitions into different granularity
+
levels. We also think that DSLs can
+
play an important role in the cloud
+
space to provide lightweight model-
+
ing in an appealing manner to the
+
software engineering community.  
+
References
+
1.  F. Chang et al., “Bigtable: A Distributed
+
Storage System for Structured Data,” in
+
Proc. Usenix Symp. Operating Systems
+
Design and Implementation, Usenix
+
Assoc., 2006, p. 15.
+
2. M. Nagarajan et al., “Semantic Interoper-
+
ability of Web Services—Challenges and
+
Experiences,” Proc. 2006 IEEE Int’l Conf.  
+
Web Services (ICWS 06), IEEE CS Press,  
+
2006, p. 373–382.
+
3.  A.P. Sheth, K. Gomadam, and J. Lathem,
+
“SA-REST: Semantically Interoperable
+
and Easier-to-Use Services and Mash-
+
ups,”  IEEE Internet Computing, vol. 11,
+
no. 6, 2007, pp. 91–94.
+
Amit Sheth is the director of the Ohio Cen-
+
ter of Excellence on Knowledge-Enabled
+
Computing (Kno.e.sis) at Wright State  
+
University. He’s also the university’s
+
LexisNexis Ohio Eminent Scholar. He’s  
+
on the Web at http://knoesis.org/amit/.
+
 
Ajith Ranabahu is pursuing a PhD in cloud-
 
Ajith Ranabahu is pursuing a PhD in cloud-
 
computing interoperability at Wright  
 
computing interoperability at Wright  
 
State University. He worked with IBM  
 
State University. He worked with IBM  
 
on Sharable Code and its Altocumulus  
 
on Sharable Code and its Altocumulus  
project, and he coordinates the Cirrocu-
+
project, and he coordinates the [http://knoesis.wright.edu/research/srl/projects/cirrocumulus Cirrocumulus project]. Contact him at ajith.ranabahu@gmail.com.
mulus project (http://knoesis.wright.edu/research/srl/projects/cirrocumulus). Con-
+
</p>
tact him at ajith@knoesis.org
+
Selected CS articles and columns
+
are also available for free at http://ComputingNow.computer.org/</p>
+
 
</td>
 
</td>
 
</tr>
 
</tr>
 
</table>
 
</table>

Latest revision as of 15:43, 30 July 2010

This is the Wiki version of the article published in the May/June 2010 issue of IEEE Internet Computing (external reference: http://www.computer.org/portal/web/csdl/doi/10.1109/MIC.2010.77).

Semantic Modeling for Cloud Computing, Part 1


Amit Sheth and Ajith Ranabahu • Wright State University

Cloud computing has lately become the attention grabber in both academia and industry. The promise of seemingly unlimited, readily available utility-type computing has opened many doors previously considered difficult, if not impossible, to open. The cloud-computing landscape, however, is still evolving, and we must overcome many challenges to foster widespread adoption of clouds. The main challenge is interoperability. Numerous vendors have introduced paradigms and services, making the cloud landscape diverse and heterogeneous. Just as in the computer hardware industry’s early days, when each vendor made and marketed its own version of (incompatible) computer equipment, clouds are diverse and vendor-locked. Although many efforts are under way to standardize clouds’ important technical aspects, notably from the US National Institute of Standards and Technology (NIST), consolidation and standardization are still far from reality. In this two-part article, we discuss how a little bit of semantics can help address clouds’ key interoperability and portability issues.

Cloud-Related Challenges
Figure 1 shows the three main flavors of clouds as outlined by NIST (http://csrc.nist.gov/groups/SNS/cloud-computing). Infrastructure-as-a-service (IaaS) clouds have the largest gap (a high workload but little automation) in terms of deploying and managing an application. Platform-as-a-service (PaaS) or software-as-a-service (SaaS) clouds have substantially lower workloads, but at the expense of flexibility and portability. Given this diverse environment, a cloud service consumer faces four challenges.

First, depending on the application’s requirements, legal issues, and other possible considerations, the consumer must select a cloud to use. Each cloud vendor exposes these details in different formats and at different granularity levels.

Second, the consumer must learn about the vendor’s technical aspects (service interface, scaling configuration, and so on) and workflow.

Third, the consumer must then develop an application or customize the vendor-provided multitenant application to fulfill his or her requirements. When doing this, the consumer must take into account various technical details such as the choice of programming language and limitations in the application runtime, which will all be vendor-specific.

Finally, after deploying the application, if the consumer must change the service provider (which happens surprisingly often), at least two major considerations arise. First, the consumer might need to rewrite or modify the application code to suit the new provider’s environment. For some clouds (such as IaaS), this is minimal, but porting the code in PaaS and SaaS clouds will likely require more effort. The second consideration is that data collected for the application might need transformation. Data is the most important asset the application generates over time and is essential for continued functioning. The transformation might even need to carry across different data models. The industry practice is to address such transformations case-by-case.

To overcome these challenges and provide better insight into the aspects requiring attention, proper modeling in this space is essential. Semantic modeling can help with this.
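
To make the data-transformation concern concrete, here is a minimal Python sketch of one such case-by-case conversion: records from a schema-less store are flattened into rows for a relational table. The record fields, the `users` table, and the helper names `to_rows` and `load_into_sqlite` are assumptions made purely for illustration; they do not come from any particular cloud vendor's API.

```python
import sqlite3

# Hypothetical schema-less records, as a document or key-value store
# might return them: each record carries only the fields it happens to have.
records = [
    {"id": "u1", "name": "Ada", "email": "ada@example.org"},
    {"id": "u2", "name": "Grace", "phone": "555-0100"},
]

# The fixed schema of the (assumed) relational target.
COLUMNS = ("id", "name", "email", "phone")


def to_rows(records, columns=COLUMNS):
    """Flatten schema-less records into fixed-width tuples.

    Fields missing from a record become None (SQL NULL).
    """
    return [tuple(rec.get(col) for col in columns) for rec in records]


def load_into_sqlite(rows):
    """Create an in-memory relational table and insert the flattened rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE users (id TEXT PRIMARY KEY, name TEXT, email TEXT, phone TEXT)"
    )
    conn.executemany("INSERT INTO users VALUES (?, ?, ?, ?)", rows)
    conn.commit()
    return conn


conn = load_into_sqlite(to_rows(records))
result = conn.execute("SELECT id, email FROM users ORDER BY id").fetchall()
```

Each new source or target store would need another hand-written mapping like `COLUMNS`, which is the kind of repeated effort that the case-by-case practice implies.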

Cloud Interoperability?
First, a word about cloud interoperability is in order. Interoperability requirements in the cloud landscape in Figure 1 arise owing to two types of heterogeneities.

The first is vertical heterogeneity, that is, heterogeneity within a single silo. We can address this by using middleware to homogenize the API and sometimes by enforcing standardization. For example, the Open Virtualization Format (OVF; www.dmtf.org/vman) is an emerging standard that allows migration of virtual-machine snapshots across IaaS clouds.

The second type is horizontal heterogeneity, that is, heterogeneity across silos. Overcoming this is fundamentally more difficult. Each silo provides different abstraction levels and services. High-level modeling pays off, especially when you must move an application and code horizontally across these silos. Surprisingly, many small and medium businesses make horizontal transitions. PaaS clouds offer faster setup for applications, and many exploit the free hosting opportunities of some platform cloud providers (for example, Google’s App Engine). When the application grows in scope and criticality, however, an IaaS cloud might prove cheaper, more flexible, and more reliable, prompting a transition.

The following discussion on semantic models applies to both vertical and horizontal interoperability. The key in addressing both types is that many of the core data and services causing them follow the same semantic concepts. For example, almost all IaaS clouds follow conceptually similar workflows when allocating resources, although the actual service implementations and tools differ significantly. Similarly, the PaaS modeling space is a subset of that for IaaS, from a semantic perspective. These observations prompt us to argue for semantic models’ applicability, especially to supplement interoperability in the cloud space.

Chart4.png

Some parts of the scientific and engineering community weren’t impressed by early semantic-modeling approaches, especially ones that required large up-front investment. This perception, however, is changing rapidly with the influx of new applications and technologies that exploit detailed semantic models to provide improved functionality (for example, biomedical ontologies cataloged at the US National Center for Biomedical Ontology; www.bioontology.org). Semantic models such as ontologies can formalize more details than other traditional modeling techniques. They also enable reasoning, a way to make inferences and gain new knowledge. Such rich models can improve traditional functions far beyond what was once thought possible. For example, Powerset (now part of Microsoft; www.powerset.com), a search service provider, added considerable value to search results by incorporating semantic models and thus enabling fact discovery. These capabilities are being complemented by the ability to more rapidly create domain models, often by mining crowd knowledge represented, for example, in Wikipedia [1] or shared data, exemplified by the Linked Open Data cloud. The technologies showcased at the Semantic Technology Conference (www.semantic-conference.com) also provide plenty of evidence of semantic-empowered commercial and scientific applications.
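The middleware approach to vertical heterogeneity mentioned above can be illustrated with a small adapter sketch. It assumes two IaaS vendors whose allocation calls differ in name and parameters but follow the same conceptual workflow. Every class, method, and size name here (`VendorAClient`, `run_instance`, `VendorBClient`, `create_server`, and so on) is hypothetical and stands in for a real vendor SDK; this is one possible shape for such middleware, not a description of any particular product.

```python
from abc import ABC, abstractmethod
from typing import List


class ComputeCloud(ABC):
    """Common interface the middleware exposes to the cloud consumer."""

    @abstractmethod
    def allocate(self, cpu_cores: int, memory_gb: int) -> str:
        """Allocate one virtual machine and return its identifier."""


# Hypothetical vendor clients: stand-ins for real SDKs with differing APIs.
class VendorAClient:
    def run_instance(self, instance_type: str) -> dict:
        return {"instance_id": f"a-{instance_type}"}


class VendorBClient:
    def create_server(self, vcpus: int, ram_mb: int) -> str:
        return f"b-{vcpus}x{ram_mb}"


class VendorAAdapter(ComputeCloud):
    # Assumed mapping from (cores, GB) to vendor A's named instance sizes.
    _SIZES = {(1, 2): "small", (2, 8): "large"}

    def __init__(self, client: VendorAClient) -> None:
        self._client = client

    def allocate(self, cpu_cores: int, memory_gb: int) -> str:
        size = self._SIZES[(cpu_cores, memory_gb)]
        return self._client.run_instance(size)["instance_id"]


class VendorBAdapter(ComputeCloud):
    def __init__(self, client: VendorBClient) -> None:
        self._client = client

    def allocate(self, cpu_cores: int, memory_gb: int) -> str:
        # Vendor B expects memory in megabytes rather than gigabytes.
        return self._client.create_server(vcpus=cpu_cores, ram_mb=memory_gb * 1024)


def allocate_everywhere(clouds: List[ComputeCloud], cpu_cores: int, memory_gb: int) -> List[str]:
    """Run the same conceptual allocation request against every cloud."""
    return [cloud.allocate(cpu_cores, memory_gb) for cloud in clouds]


if __name__ == "__main__":
    clouds = [VendorAAdapter(VendorAClient()), VendorBAdapter(VendorBClient())]
    print(allocate_everywhere(clouds, cpu_cores=2, memory_gb=8))
```

The consumer code depends only on `ComputeCloud`, so switching vendors within the same silo means swapping an adapter rather than rewriting the application. An adapter of this kind does not help with horizontal moves across silos, where the abstraction levels themselves differ.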

Multidimensional Analysis of Cloud-Modeling Requirements
We suggest a 3D slicing of the modeling requirements along the following dimensions (see Figure 2).

The types of semantics that are useful for porting or interoperability in cloud computing are similar to those we introduced in 2003 for Web services [2]. This is natural because the primary means of interacting with a cloud environment is through Web services. The four types of semantics (data, functional, nonfunctional, and system) are based on the different semantic aspects a model must cover.

The language abstraction level indicates the modeling’s granularity and specificity. Although ontological modeling is preferable at a higher level, developers prefer detailed, concrete syntactic representations. These representations of different granularities might need to be related, often through explicit annotations. For example, although service developers happily use Web Services Description Language (WSDL) descriptions in their Web services, these descriptions are syntactic and can’t provide useful semantic details. To overcome this deficiency, SAWSDL (Semantic Annotations for WSDL and XML Schema; www.w3.org/TR/sawsdl) attaches semantic-model details to WSDL documents.

Chart1.png

The software lifecycle stage is important in determining the modeling requirements. For example, some nonfunctional and system requirements might not be modeled during development but will be taken into account only during deployment. A different team handles each of these lifecycle stages; this separation is important so that one team doesn’t step on another’s toes. It also focuses the modeling effort on the right time and the right people.

Some cloud models fall under the nonfunctional/system/ontology space in Figure 2. Such models include the Elastic Computing Modeling Language (ECML), Elastic Deployment Modeling Language (EDML), and Elastic Management Modeling Language (EMML), all based on OWL and published by Elastra (Elastra Languages). However, some aspects of cloud modeling have received little or no attention. For example, there’s no comprehensive higher-level modeling in the data and functional spaces. Lessons learned during large-scale ontological modeling in the Semantic Web and Semantic Web services, biology, and many other domains are readily applicable here and would help address some of the challenges in the cloud space.
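To make the SAWSDL point concrete, the following minimal sketch reads `sawsdl:modelReference` annotations from an XML Schema fragment of the kind embedded in a WSDL document. The attribute name and its namespace (`http://www.w3.org/ns/sawsdl`) come from the SAWSDL recommendation; the schema fragment, the element names, and the `example.org` ontology URIs are invented placeholders for illustration only.

```python
import xml.etree.ElementTree as ET

XSD_NS = "http://www.w3.org/2001/XMLSchema"
SAWSDL_NS = "http://www.w3.org/ns/sawsdl"

# Hypothetical schema fragment; the ontology URIs are placeholders.
ANNOTATED_SCHEMA = """
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
           xmlns:sawsdl="http://www.w3.org/ns/sawsdl">
  <xs:element name="InstanceRequest"
              sawsdl:modelReference="http://example.org/cloud#ResourceAllocation"/>
  <xs:element name="MemorySize"
              sawsdl:modelReference="http://example.org/cloud#Memory http://example.org/units#Gigabyte"/>
  <xs:element name="Comment"/>
</xs:schema>
"""


def model_references(schema_xml: str) -> dict:
    """Map each annotated element name to its list of ontology concept URIs."""
    root = ET.fromstring(schema_xml)
    attribute = f"{{{SAWSDL_NS}}}modelReference"
    references = {}
    for element in root.iter(f"{{{XSD_NS}}}element"):
        value = element.get(attribute)
        if value:
            # A modelReference value is a whitespace-separated list of URIs.
            references[element.get("name")] = value.split()
    return references


if __name__ == "__main__":
    for name, concepts in model_references(ANNOTATED_SCHEMA).items():
        print(name, "->", concepts)
```

The syntactic description stays exactly as developers wrote it; the annotation only adds a pointer from each element to a concept in a higher-level model, which is how representations at different abstraction levels are related.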
In part 2, we’ll look at opportunities for semantic models in cloud computing. Particular areas in which these models can help are functional portability, data modeling, and service enrichment.

References
1. C. Thomas et al., “Growing Fields of Interest - Using an Expand and Reduce Strategy for Domain Model Extraction,” Proc. 2008 Int’l Conf. Web Intelligence and Intelligent Agent Technology (WI-IAT 08), vol. 1, IEEE CS Press, 2008, pp. 496–502.
2. K. Sivashanmugam et al., “Adding Semantics to Web Services Standards,” Proc. Int’l Conf. Web Services (ICWS 03), CSREA Press, 2003, pp. 395–401.

Amit Sheth is the director of Kno.e.sis, the Center of Excellence on Knowledge-Enabled Human-Centered Computing at Wright State University. He’s also the university’s LexisNexis Ohio Eminent Scholar and an IEEE Fellow. He’s on the Web at http://knoesis.org/amit.

Ajith Ranabahu is pursuing a PhD in cloud-computing interoperability at Wright State University. He worked with IBM on Sharable Code and its Altocumulus project, and he coordinates the Cirrocumulus project. Contact him at ajith.ranabahu@gmail.com.