Difference between revisions of "Modeling for cloud"

From Knoesis wiki
(First draft)
(More changes)
 
(4 intermediate revisions by the same user not shown)
Line 1: Line 1:
<table border="1" width="650" cellpadding="10">
+
<span style="font-size:28pt;color:purple">Semantic Modeling for Cloud Computing</span><br /><br />
<tr>
+
Amit Sheth and Ajith Ranabahu • <i>Wright State University</i>
<td width="650">
+
 
<span style="font-size:28pt;color:purple">Semantic Modeling<br /><br />for Cloud Computing, Part 1</span><br /><br />
+
Cloud computing has lately become the  
Amit Sheth and Ajith Ranabahu • <i>Wright State University</i><br /><br />
+
<p style="float:left;width:300px">
+
<span style="font-size:16pt;color:purple">
+
C</span>loud computing has lately become the  
+
 
attention grabber in both academia and  
 
attention grabber in both academia and  
 
industry. The promise of seemingly unlimited, readily available utility-type computing  
 
industry. The promise of seemingly unlimited, readily available utility-type computing  
Line 25: Line 21:
 
are still far from reality. In this two-part article,  
 
are still far from reality. In this two-part article,  
 
we discuss how a little bit of semantics can help  
 
we discuss how a little bit of semantics can help  
address clouds’ key interoperability and porta-
+
address clouds’ key interoperability and portability issues.
bility issues.<br /><br />
+
 
<span style="font-size:12pt;color:purple">
+
<span style="font-size:12pt;color:purple">Cloud-Related Challenges</span><br />
Cloud-Related Challenges</span><br />
+
 
 
Figure 1 shows the three main flavors of clouds  
 
Figure 1 shows the three main flavors of clouds  
 
as outlined by NIST (http://csrc.nist.gov/groups/SNS/cloud-computing). Infrastructure-as-a-
 
as outlined by NIST (http://csrc.nist.gov/groups/SNS/cloud-computing). Infrastructure-as-a-
Line 37: Line 33:
 
but at the expense of flexibility and portability.
 
but at the expense of flexibility and portability.
 
Given this diverse environment, a cloud service consumer faces four challenges.
 
Given this diverse environment, a cloud service consumer faces four challenges.
First, depending on the application’s requirements, legal issues, and other possible consid</p>
+
First, depending on the application’s requirements, legal issues, and other possible considerations, the consumer must select a cloud to  
<p style="float:right;width:300px">erations, the consumer must select a cloud to  
+
 
use. Each cloud vendor exposes these details in  
 
use. Each cloud vendor exposes these details in  
 
different formats and at different granularity  
 
different formats and at different granularity  
Line 62: Line 57:
 
porting the code in PaaS and SaaS clouds will  
 
porting the code in PaaS and SaaS clouds will  
 
likely require more effort.
 
likely require more effort.
The second consideration is that data col-
+
The second consideration is that data collected for the application might need transformation. Data is the most important asset the  
lected for the application might need transfor-
+
mation. Data is the most important asset the  
+
 
application generates over time and is essential  
 
application generates over time and is essential  
 
for continued functioning. The transformation  
 
for continued functioning. The transformation  
Line 73: Line 66:
 
better insight into the aspects requiring attention, proper modeling in this space is essential.  
 
better insight into the aspects requiring attention, proper modeling in this space is essential.  
 
Semantic modeling can help with this.<br /><br />
 
Semantic modeling can help with this.<br /><br />
<span style="font-size:12pt;color:purple">
+
 
Cloud Interoperability?</span><br />
+
[[File:Chart4.png]]
 +
 
 +
<span style="font-size:12pt;color:purple">Cloud Interoperability?</span><br />
 
First, a word about cloud interoperability is in  
 
First, a word about cloud interoperability is in  
order. Interoperability requirements in the cloud </p>
+
order. Interoperability requirements in the cloud landscape in Figure 1 arise owing to  
</td></tr>
+
two types of heterogeneities. The first is vertical heterogeneity, that is, within a single silo. We
<tr><td>
+
<p style="float:left;width:200px">landscape in Figure 1 arise owing to  
+
two types of heterogeneities.
+
The first is vertical heterogeneity — that is, within a single silo. We  
+
 
can address this by using middleware to homogenize the API and  
 
can address this by using middleware to homogenize the API and  
sometimes by enforcing standardization. For example, the Open Virtualization Format (OVF; www.dmtf.org/
+
sometimes by enforcing standardization. For example, the Open Virtualization Format ([http://www.dmtf.org/vman OVF] ) is an emerging standard that allows migration of virtual-machine  
vman) is an emerging standard that  
+
allows migration of virtual-machine  
+
 
snapshots across IaaS clouds.
 
snapshots across IaaS clouds.
 
The second type is horizontal  
 
The second type is horizontal  
Line 106: Line 95:
 
flexible, and more reliable, prompting a transition.
 
flexible, and more reliable, prompting a transition.
 
The following discussion on  
 
The following discussion on  
semantic models applies to both ver-
+
semantic models applies to both vertical and horizontal interoperability.  
tical and horizontal interoperability.  
+
 
The key in addressing both types is  
 
The key in addressing both types is  
 
that many of the core data and services causing them follow the same  
 
that many of the core data and services causing them follow the same  
Line 119: Line 107:
 
interoperability in the cloud space.  
 
interoperability in the cloud space.  
 
Some parts of the scientific and  
 
Some parts of the scientific and  
engineering community weren’t </p>
+
engineering community weren’t impressed by early semantic-modeling approaches, especially ones that  
<p style="float:right;width:400px">[[File:Chart4.png|380px]]<br /><br />
+
<table width="400" border="0" cellpadding="5">
+
<tr>
+
<td width="200"><p style="float:left">impressed by early semantic-modeling approaches, especially ones that  
+
 
required large up-front investment.  
 
required large up-front investment.  
 
This perception, however, is changing rapidly with the influx of new  
 
This perception, however, is changing rapidly with the influx of new  
Line 129: Line 113:
 
exploit detailed semantic models to  
 
exploit detailed semantic models to  
 
provide improved functionality (for  
 
provide improved functionality (for  
example, biomedical ontologies cataloged at the US National Center for  
+
example, biomedical ontologies cataloged at the [http://www.bioontology.org US National Center for Biomedical Ontology] ). Semantic models such  
Biomedical Ontology; www.bioon-
+
tology.org). Semantic models such  
+
 
as ontologies can formalize more  
 
as ontologies can formalize more  
 
details than other traditional modeling techniques. They also enable  
 
details than other traditional modeling techniques. They also enable  
Line 137: Line 119:
 
rich models can improve traditional  
 
rich models can improve traditional  
 
functions far beyond what was once  
 
functions far beyond what was once  
thought possible. For example, Powerset (now part of Microsoft; www.
+
thought possible. For example, [http://www.powerset.com Powerset] (now part of Microsoft), a search service  
powerset.com), a search service  
+
 
provider, added considerable value  
 
provider, added considerable value  
 
to search results by incorporating  
 
to search results by incorporating  
Line 146: Line 127:
 
to more rapidly create domain models, often by mining crowd knowledge represented, for example, in  
 
to more rapidly create domain models, often by mining crowd knowledge represented, for example, in  
 
Wikipedia1 or shared data, exemplified by the linked-object data cloud.  
 
Wikipedia1 or shared data, exemplified by the linked-object data cloud.  
The technologies showcased at the  
+
The technologies showcased at the [http://www.semantic-conference.com Semantic Technology Conference]
Semantic Technology Conference </p></td>
+
<td width="200"><p style="float:left">(www.semantic-conference.com)
+
 
also provide plenty of evidence of  
 
also provide plenty of evidence of  
 
semantic-empowered commercial  
 
semantic-empowered commercial  
 
and scientific applications.<br /><br />
 
and scientific applications.<br /><br />
<span style="font-size:12pt;color:purple">
+
 
Multidimensional Analysis  
+
<span style="font-size:12pt;color:purple">Multidimensional Analysis of Cloud-Modeling Requirements</span><br />
of Cloud-Modeling  
+
 
Requirements</span><br />
+
 
We suggest a 3D slicing of the modeling requirements along the following dimensions (see Figure 2).
 
We suggest a 3D slicing of the modeling requirements along the following dimensions (see Figure 2).
 
The types of semantics that are
 
The types of semantics that are
useful for porting or interoperabil-
+
useful for porting or interoperability in cloud computing are similar to  
ity in cloud computing are similar to  
+
 
those we introduced in 2003 for Web  
 
those we introduced in 2003 for Web  
 
services.
 
services.
2 This is natural because the  
+
This is natural because the  
 
primary means of interacting with a  
 
primary means of interacting with a  
 
cloud environment is through Web  
 
cloud environment is through Web  
Line 168: Line 145:
 
different semantic aspects a model  
 
different semantic aspects a model  
 
must cover.
 
must cover.
 +
 +
[[File:Chart1.png]]
 +
 
The language abstraction level
 
The language abstraction level
 
indicates the modeling’s granularity and specificity. Although ontological modeling is preferable at  
 
indicates the modeling’s granularity and specificity. Although ontological modeling is preferable at  
Line 175: Line 155:
 
need to be related, often through  
 
need to be related, often through  
 
explicit annotations. For example,  
 
explicit annotations. For example,  
although service developers happily </p></td>
+
although service developers happily  
</tr>
+
use Web Services Description Language (WSDL) descriptions in their  
</table>
+
</p>
+
</td>
+
</tr>
+
<tr>
+
<td>
+
[[File:Chart1.png|380px]]
+
<p style="float:left;width:200px">use Web Services Description Lan-
+
guage (WSDL) descriptions in their  
+
 
Web services, these descriptions  
 
Web services, these descriptions  
 
are syntactic and can’t provide useful semantic details. To overcome  
 
are syntactic and can’t provide useful semantic details. To overcome  
Line 195: Line 166:
 
The software lifecycle stage is
 
The software lifecycle stage is
 
important in determining the modeling requirements. For example, some  
 
important in determining the modeling requirements. For example, some  
nonfunctional and system require-
+
nonfunctional and system requirements might not be modeled during  
ments might not be modeled during  
+
 
development but will be taken into  
 
development but will be taken into  
 
account only during deployment. A  
 
account only during deployment. A  
Line 208: Line 178:
 
space in Figure 2. Such models include  
 
space in Figure 2. Such models include  
 
the Elastic Computing Modeling  
 
the Elastic Computing Modeling  
Language (ECML), Elastic Deployment Modeling Language (EDML), </p>
+
Language (ECML), Elastic Deployment Modeling Language (EDML),  
<p style="float:left;width:10px"> </p>
+
<p style="float:left;width:200px">
+
 
and Elastic Management Modeling  
 
and Elastic Management Modeling  
 
Language (EMML), all based on OWL  
 
Language (EMML), all based on OWL  
Line 223: Line 191:
 
readily applicable here and would  
 
readily applicable here and would  
 
help address some of the challenges  
 
help address some of the challenges  
in the cloud space.<br />
+
in the cloud space.
<span style="font-size:16pt;color:purple">
+
 
I</span>n part 2, we’ll look at opportunities
+
 
for semantic models in cloud computing. Particular areas in which
+
<span style="font-size:12pt;color:purple">Opportunities for Semantic Models in Cloud Computing</span><br />
these models can help are functional
+
portability, data modeling, and ser-
+
vice enrichment. <br /><br />
+
<span style="font-size:12pt;color:purple">
+
References</span><br />
+
1.  C. Thomas et al., “Growing Fields of
+
Interest—Using an Expand and Reduce
+
Strategy for Domain Model Extraction,”</p>
+
<p style="width:200px;float:right">Proc.­ 2008­ Int’l­ Conf.­ Web­ Intelligence­
+
and­ Intelligent­ Agent­ Technology (WI-
+
IAT 08), vol. 1, IEEE CS Press, 2008, pp.
+
496–502.
+
2.  K. Sivashanmugam et al., “Adding
+
Semantics to Web Services Standards,”
+
Proc.­Int’l­Conf.­Web­Services (ICWS 03),
+
CSREA Press, 2003, pp. 395–401.
+
Amit Sheth is the director of Kno.e.sis—the
+
Center of Excellence on Knowledge-
+
Enabled Human-Centered Computing at
+
Wright State University. He’s also the
+
university’s LexisNexis Ohio Eminent
+
Scholar and an IEEE Fellow. He’s on the
+
Web at [http://knoesis.org/amit http://knoesis.org/amit].
+
Ajith Ranabahu is pursuing a PhD in cloud-
+
computing interoperability at Wright
+
State University. He worked with IBM
+
on Sharable Code and its Altocumulus
+
project, and he coordinates the [http://knoesis.wright.edu/research/srl/projects/cirrocumulus Cirrocumulus project]. Contact him at ajith.ranabahu@gmail.com.
+
</p>
+
</td>
+
</tr>
+
<tr>
+
<td width="650">
+
<span style="font-size:28pt;color:purple">Semantic Modeling<br /><br />for Cloud Computing, Part 2</span><br /><br />
+
Amit Sheth and Ajith Ranabahu • <i>Wright State University</i><br /><br />
+
<p style="float:left;width:300px">
+
Part 1 of this two-part article discussed
+
challenges related to cloud computing,
+
cloud interoperability, and multidimensional analysis of cloud-modeling requirements
+
(see the May/June issue). Here, we look more
+
specifically at areas in which semantic models
+
can support cloud computing.<br /><br />
+
<span style="font-size:12pt;color:purple">
+
Opportunities for Semantic Models in Cloud Computing</span><br />
+
 
Semantic models are helpful in three aspects of  
 
Semantic models are helpful in three aspects of  
cloud computing.
+
cloud computing. The first is functional and nonfunctional
The first is functional and nonfunctional  
+
 
definitions. The ability to define application  
 
definitions. The ability to define application  
 
functionality and quality-of-service details in  
 
functionality and quality-of-service details in  
Line 282: Line 205:
 
particularly applicable.
 
particularly applicable.
 
The second aspect is data modeling. A crucial  
 
The second aspect is data modeling. A crucial  
difficulty developers face is porting data hori-
+
difficulty developers face is porting data horizontally across clouds. For example, moving data  
zontally across clouds. For example, moving data  
+
 
from a schema-less data store (such as Google  
 
from a schema-less data store (such as Google  
 
Bigtable1) to a schema-driven data store such  
 
Bigtable1) to a schema-driven data store such  
Line 289: Line 211:
 
challenge. For a good overview of this concern,  
 
challenge. For a good overview of this concern,  
 
see the discussion of customer scenarios in the  
 
see the discussion of customer scenarios in the  
[http://www.scribd.com/doc/18172802/Cloud-Computing-Use-Cases-Whitepaper Cloud Computing User Cases White Paper]. The root of this difficulty is the lack of a platform-agnostic data  
+
[http://www.scribd.com/doc/18172802/Cloud-Computing-Use-Cases-Whitepaper Cloud Computing User Cases White Paper].  
 +
The root of this difficulty is the lack of a platform-agnostic data  
 
model. Semantic modeling of data to provide a  
 
model. Semantic modeling of data to provide a  
 
platform-independent data representation would  
 
platform-independent data representation would  
Line 296: Line 219:
 
enhancement. Clouds expose their operations  
 
enhancement. Clouds expose their operations  
 
via Web services, but these service interfaces  
 
via Web services, but these service interfaces  
differ between vendors. The operations’ </p>
+
differ between vendors. The operations’  
<p style="float:right;width:300px">semantics, however, are similar. Metadata added  
+
semantics, however, are similar. Metadata added  
 
through annotations pointing to generic operational models would play a key role in consolidating these APIs and enable interoperability  
 
through annotations pointing to generic operational models would play a key role in consolidating these APIs and enable interoperability  
 
among the heterogeneous cloud environments.<br /><br />
 
among the heterogeneous cloud environments.<br /><br />
 +
 
<span style="font-size:12pt;color:purple">Functional Portability</span><br />
 
<span style="font-size:12pt;color:purple">Functional Portability</span><br />
 
From a perspective of the cloud landscape  
 
From a perspective of the cloud landscape  
 
based on the language abstraction and type  
 
based on the language abstraction and type  
of semantics (that is, viewing the cube in Figure 1 from the top), we see that opportunities  
+
of semantics (that is, viewing the cube in Figure 2 from the top), we see that opportunities  
 
exist to use semantic models to define applications’ functional aspects in a platform-agnostic  
 
exist to use semantic models to define applications’ functional aspects in a platform-agnostic  
 
manner. In most cases, however, converting a  
 
manner. In most cases, however, converting a  
Line 325: Line 249:
 
on advanced tools (for example, [http://www-01.ibm.com/software/awdtools/developer/rose IBM’s Rational Rose]). This limits UML’s applicability.
 
on advanced tools (for example, [http://www-01.ibm.com/software/awdtools/developer/rose IBM’s Rational Rose]). This limits UML’s applicability.
 
A popular alternative to such tool-dependent  
 
A popular alternative to such tool-dependent  
heavy upfront models is domain-specific languages (DSLs). Their popularity is due partly </p>
+
heavy upfront models is domain-specific languages (DSLs). Their popularity is due partly to the availability of extensible  
</td></tr>
+
<tr><td>
+
<p style="float:left;width:250px">to the availability of extensible  
+
 
interpreted programming languages  
 
interpreted programming languages  
 
such as Ruby and Python. Unlike  
 
such as Ruby and Python. Unlike  
Line 354: Line 275:
 
of lightweight models while supporting high-level operations when  
 
of lightweight models while supporting high-level operations when  
 
required. Figure 2 shows an annotation referring to an ontology from  
 
required. Figure 2 shows an annotation referring to an ontology from  
a fictitious DSL script for configuration. The script is more program-
+
a fictitious DSL script for configuration. The script is more programmer-oriented (in fact, it’s derived  
mer-oriented (in fact, it’s derived  
+
 
from Ruby) but lacks an ontology’s  
 
from Ruby) but lacks an ontology’s  
 
richness. However, the annotation links the relevant components  
 
richness. However, the annotation links the relevant components  
Line 367: Line 287:
 
can see the modeling coverage for  
 
can see the modeling coverage for  
 
software deployment and management. Elastra’s Elastic Computing  
 
software deployment and management. Elastra’s Elastic Computing  
Modeling Language (ECML),  </p>
+
Modeling Language (ECML),   
<p style="float:right;width:350px">[[File:Chart1.png|350px]]<br /><br />[[File:Chart2.png|350px]]<br /><br />
+
 
<table width="350" border="0" cellpadding="5">
+
[[File:Chart2.png]]
<tr>
+
 
<td width="175"><p style="float:left">Elastic Deployment Modeling Language  
+
Elastic Deployment Modeling Language  
 
(EDML), and Elastic Management  
 
(EDML), and Elastic Management  
 
Modeling Language (EMML) ontologies cover many system aspects  
 
Modeling Language (EMML) ontologies cover many system aspects  
Line 378: Line 298:
 
such industry initiative in adopting  
 
such industry initiative in adopting  
 
semantic technologies.<br /><br />
 
semantic technologies.<br /><br />
<span style="font-size:12pt;color:purple">
+
 
Data Modeling</span><br />
+
<span style="font-size:12pt;color:purple">Data Modeling</span><br />
Another opportunity for semantic </p></td>
+
 
<td width="175"><p style="float:left">models for clouds lies in Resource  
+
Another opportunity for semantic  
 +
models for clouds lies in Resource  
 
Description Framework (RDF) data  
 
Description Framework (RDF) data  
 
modeling. As we discussed in part  
 
modeling. As we discussed in part  
Line 389: Line 310:
 
schema-less, distributed data stores  
 
schema-less, distributed data stores  
 
with relaxed consistency models to  
 
with relaxed consistency models to  
provide high availability and elas-
+
provide high availability and elasticity to suit clouds’ needs. However,
ticity to suit clouds’ needs. However, </p></td>
+
 
</tr>
+
[[File:Chart3.png]]
</table>
+
 
</p>
+
exploiting these data stores requires  
</td>
+
</tr>
+
<tr>
+
<td>
+
[[File:Chart3.png|600px]]
+
<p style="float:left;width:200px">exploiting these data stores requires  
+
 
substantial redesign of many data-driven applications and often makes  
 
substantial redesign of many data-driven applications and often makes  
 
porting data to a traditional relational database extremely difficult.  
 
porting data to a traditional relational database extremely difficult.  
Line 429: Line 344:
 
from these high-level models would  
 
from these high-level models would  
 
be used mostly during subsequent  
 
be used mostly during subsequent  
lifecycle stages.</p>
+
lifecycle stages.
<p style="float:left;width:10px"> </p>
+
<p style="float:left;width:200px">
+
<span style="font-size:12pt;color:purple">
+
Service Enrichment</span><br />
+
 
One feature differentiating the cloud  
 
One feature differentiating the cloud  
 
from other distributed environments  
 
from other distributed environments  
Line 454: Line 365:
 
engines’ use of metadata to display results in customized formats.  
 
engines’ use of metadata to display results in customized formats.  
 
Yahoo’s SearchMonkey and Google  
 
Yahoo’s SearchMonkey and Google  
Rich Snippets are two such microformat-driven schemes. These anno-
+
Rich Snippets are two such microformat-driven schemes. These annotations, unlike the DSL annotations  
tations, unlike the DSL annotations  
+
 
in Figure 3, might not always point  
 
in Figure 3, might not always point  
 
to ontologies. Their structure can be  
 
to ontologies. Their structure can be  
 
based on a vocabulary or taxonomy—
 
based on a vocabulary or taxonomy—
 
a lower-grade nonsemantic model.  
 
a lower-grade nonsemantic model.  
For example, the popular hCalendar </p>
+
For example, the popular hCalendar microformat is part of the “lowercase  
<p style="width:200px;float:right">microformat is part of the “lowercase  
+
 
semantic web” movement, which  
 
semantic web” movement, which  
 
emphasizes lightweight models.
 
emphasizes lightweight models.
Embedding rich semantic meta-
+
Embedding rich semantic metadata in cloud service descriptions has
data in cloud service descriptions has  
+
 
three main benefits that go beyond  
 
three main benefits that go beyond  
 
customized search capabilities.
 
customized search capabilities.
Line 490: Line 398:
 
The second benefit deals with  
 
The second benefit deals with  
 
handling change. The cloud space is  
 
handling change. The cloud space is  
still evolving. If the history of software or component interoperability </p>
+
still evolving. If the history of software or component interoperability is any guide, achieving consensus in  
</td>
+
</tr>
+
<tr>
+
<td>
+
<p style="float:left;width:250px">is any guide, achieving consensus in  
+
 
the cloud space will be difficult and  
 
the cloud space will be difficult and  
 
won’t likely happen soon. Attaching  
 
won’t likely happen soon. Attaching  
Line 503: Line 406:
 
interim standards.
 
interim standards.
 
The third benefit is that the formalizations apply not only to service descriptions but also to many  
 
The third benefit is that the formalizations apply not only to service descriptions but also to many  
other aspects such as service-level  
+
other aspects such as service-level
 
agreements (SLAs) and software  
 
agreements (SLAs) and software  
 
licenses. You can use annotations  
 
licenses. You can use annotations  
Line 509: Line 412:
 
these documents, facilitating more  
 
these documents, facilitating more  
 
automation in the cloud space. For  
 
automation in the cloud space. For  
example, Web ServiceLevel Agreement ([http://www.research.ibm.com/wsla WSLA]) specification provides a  
+
example, the Web Service Level Agreement ([http://www.research.ibm.com/wsla WSLA]) specification provides a
 
way to formalize SLAs, but creating  
 
way to formalize SLAs, but creating  
and maintaining these formalizations is time-consuming.
+
and maintaining these formalizations is time-consuming.
 
Figure 3 illustrates using SA-REST annotations on the Amazon  
 
Figure 3 illustrates using SA-REST annotations on the Amazon  
 
Elastic Compute Cloud (EC2) SLA  
 
Elastic Compute Cloud (EC2) SLA  
 
document. It shows how a capable  
 
document. It shows how a capable  
 
processor could use these annotations to extract a WSLA equivalent  
 
processor could use these annotations to extract a WSLA equivalent  
of the human-readable SLA.
+
of the human-readable SLA.
 
These benefits’ importance comes  
 
These benefits’ importance comes  
 
into perspective when you consider  
 
into perspective when you consider  
Line 533: Line 436:
 
data portability, which the cloud  
 
data portability, which the cloud  
 
community is facing right now, are  
 
community is facing right now, are  
the very issues for which semantic </p>
+
the very issues for which semantic  
<p style="width:10px;float:left"> </p>
+
models excel in providing solutions.  
<p style="float:left;width:250px">models excel in providing solutions.  
+
 
However, learning from the past,  
 
However, learning from the past,  
 
we advocate a multilevel modeling  
 
we advocate a multilevel modeling  
Line 543: Line 445:
 
space to provide lightweight modeling in an appealing manner to the  
 
space to provide lightweight modeling in an appealing manner to the  
 
software engineering community.  
 
software engineering community.  
<br /><br /><span style="font-size:12pt;color:purple">References</span><br />
+
<br /><br />
 +
 
 +
<span style="font-size:12pt;color:purple">References</span><br />
 
<span style="font-size:9pt">
 
<span style="font-size:9pt">
1.  F. Chang et al., “Bigtable: A Distributed<br />
+
1.  C. Thomas et al., “Growing Fields of
Storage System for Structured Data,” in<br />
+
Interest—Using an Expand and Reduce
Proc. Usenix Symp. Operating Systems<br />
+
Strategy for Domain Model Extraction,” Proc. 2008 Int’l Conf. Web Intelligence
Design and Implementation, Usenix<br />
+
and Intelligent Agent Technology (WI-IAT 08), vol. 1, IEEE CS Press, 2008, pp.
Assoc., 2006, p. 15.<br />
+
496–502.
2. M. Nagarajan et al., “Semantic Interoper-<br />
+
 
ability of Web Services—Challenges and<br />
+
2.  K. Sivashanmugam et al., “Adding Semantics to Web Services Standards,”
Experiences,” Proc. 2006 IEEE Int’l Conf.<br />
+
Proc. Int’l Conf. Web Services (ICWS 03),
Web Services (ICWS 06), IEEE CS Press,<br />
+
CSREA Press, 2003, pp. 395–401.
2006, p. 373–382.<br />
+
 
3.  A.P. Sheth, K. Gomadam, and J. Lathem,<br />
+
3.  F. Chang et al., “Bigtable: A Distributed
“SA-REST: Semantically Interoperable<br />
+
Storage System for Structured Data,” in Proc. Usenix Symp. Operating Systems
and Easier-to-Use Services and Mashups,”  IEEE Internet Computing, vol. 11,<br />
+
Design and Implementation, Usenix
no. 6, 2007, pp. 91–94.</span><br /><br />
+
Assoc., 2006, p. 15.
Amit Sheth is the director of the Ohio Center of Excellence on Knowledge-Enabled<br />
+
 
Computing (Kno.e.sis) at Wright State<br />
+
4. M. Nagarajan et al., “Semantic Interoper-
University. He’s also the university’s<br />
+
ability of Web Services—Challenges and
LexisNexis Ohio Eminent Scholar. He’s<br />
+
Experiences,” Proc. 2006 IEEE Int’l Conf.
on the Web at http://knoesis.org/amit/.<br /><br />
+
Web Services (ICWS 06), IEEE CS Press,
Ajith Ranabahu is pursuing a PhD in cloud-<br />
+
2006, pp. 373–382.
computing interoperability at Wright<br />
+
 
State University. He worked with IBM<br />
+
5.  A.P. Sheth, K. Gomadam, and J. Lathem,
on Sharable Code and its Altocumulus<br />
+
“SA-REST: Semantically Interoperable
project, and he coordinates the Cirrocu-<br />
+
and Easier-to-Use Services and Mashups,”  IEEE Internet Computing, vol. 11,
mulus project ([http://knoesis.org/research/srl/projects/cirrocumulus Cirrocumulus]). Contact him at ajith@knoesis.org
+
no. 6, 2007, pp. 91–94.
/</p>
+
 
</td>
+
Amit Sheth is the director of the Ohio Center of Excellence on Knowledge-Enabled
</tr>
+
Computing (Kno.e.sis) at Wright State
</table>
+
University. He’s also the university’s
 +
LexisNexis Ohio Eminent Scholar. He’s
 +
on the Web at http://knoesis.org/amit/.
 +
 
 +
Ajith Ranabahu is pursuing a PhD in cloud-
 +
computing interoperability at Wright
 +
State University. He worked with IBM
 +
on Sharable Code and its Altocumulus
 +
project, and he coordinates the Cirrocumulus project ([http://knoesis.org/research/srl/projects/cirrocumulus Cirrocumulus]). Contact him at ajith@knoesis.org

Latest revision as of 15:34, 3 August 2010

Semantic Modeling for Cloud Computing

Amit Sheth and Ajith Ranabahu • Wright State University

Cloud computing has lately become the attention grabber in both academia and industry. The promise of seemingly unlimited, readily available utility-type computing has opened many doors previously considered difficult, if not impossible, to open. The cloud-computing landscape, however, is still evolving, and we must overcome many challenges to foster widespread adoption of clouds. The main challenge is interoperability. Numerous vendors have introduced paradigms and services, making the cloud landscape diverse and heterogeneous. Just as in the computer hardware industry’s early days, when each vendor made and marketed its own version of (incompatible) computer equipment, clouds are diverse and vendor-locked. Although many efforts are under way to standardize clouds’ important technical aspects, notably from the US National Institute of Standards and Technology (NIST), consolidation and standardization are still far from reality. In this two-part article, we discuss how a little bit of semantics can help address clouds’ key interoperability and portability issues.

Cloud-Related Challenges

Figure 1 shows the three main flavors of clouds as outlined by NIST (http://csrc.nist.gov/groups/SNS/cloud-computing). Infrastructure-as-a-service (IaaS) clouds have the largest gap (a high workload but little automation) in terms of deploying and managing an application. Platform-as-a-service (PaaS) or software-as-a-service (SaaS) clouds have substantially lower workloads, but at the expense of flexibility and portability.

Given this diverse environment, a cloud service consumer faces four challenges. First, depending on the application’s requirements, legal issues, and other possible considerations, the consumer must select a cloud to use. Each cloud vendor exposes these details in different formats and at different granularity levels. Second, the consumer must learn about the vendor’s technical aspects (service interface, scaling configuration, and so on) and workflow. Third, the consumer must then develop an application or customize the vendor-provided multitenant application to fulfill his or her requirements. When doing this, the consumer must take into account various technical details such as the choice of programming language and limitations in the application runtime, which will all be vendor-specific.

Finally, after deploying the application, if the consumer must change the service provider (which happens surprisingly often), at least two major considerations arise. First, the consumer might need to rewrite or modify the application code to suit the new provider’s environment. For some clouds (such as IaaS), this is minimal, but porting the code in PaaS and SaaS clouds will likely require more effort. The second consideration is that data collected for the application might need transformation. Data is the most important asset the application generates over time and is essential for continued functioning. The transformation might even need to carry across different data models. The industry practice is to address such transformations case by case.
To overcome these challenges and provide better insight into the aspects requiring attention, proper modeling in this space is essential. Semantic modeling can help with this.

Chart4.png

Cloud Interoperability?
First, a word about cloud interoperability is in order. Interoperability requirements in the cloud landscape in Figure 1 arise owing to two types of heterogeneities.

The first is vertical heterogeneity, that is, heterogeneity within a single silo. We can address this by using middleware to homogenize the API and sometimes by enforcing standardization. For example, the Open Virtualization Format (OVF) is an emerging standard that allows migration of virtual-machine snapshots across IaaS clouds.

The second type is horizontal heterogeneity, that is, heterogeneity across silos. Overcoming this is fundamentally more difficult. Each silo provides different abstraction levels and services. High-level modeling pays off, especially when you must move an application and code horizontally across these silos. Surprisingly, many small and medium businesses make horizontal transitions. PaaS clouds offer faster setup for applications, and many exploit the free hosting opportunities of some platform cloud providers (for example, Google’s App Engine). When the application grows in scope and criticality, however, an IaaS cloud might prove cheaper, more flexible, and more reliable, prompting a transition.

The following discussion on semantic models applies to both vertical and horizontal interoperability. The key in addressing both types is that many of the core data and services causing them follow the same semantic concepts. For example, almost all IaaS clouds follow conceptually similar workflows when allocating resources, although the actual service implementations and tools differ significantly. Similarly, the PaaS modeling space is a subset of that for IaaS, from a semantic perspective. These observations prompt us to argue for semantic models’ applicability, especially to supplement interoperability in the cloud space.

Some parts of the scientific and engineering community weren’t impressed by early semantic-modeling approaches, especially ones that required large up-front investment. This perception, however, is changing rapidly with the influx of new applications and technologies that exploit detailed semantic models to provide improved functionality (for example, biomedical ontologies cataloged at the US National Center for Biomedical Ontology). Semantic models such as ontologies can formalize more details than other traditional modeling techniques. They also enable reasoning, a way to make inferences and gain new knowledge. Such rich models can improve traditional functions far beyond what was once thought possible. For example, Powerset (now part of Microsoft), a search service provider, added considerable value to search results by incorporating semantic models and thus enabling fact discovery. These capabilities are being complemented by the ability to more rapidly create domain models, often by mining crowd knowledge represented, for example, in Wikipedia [1] or shared data, exemplified by the linked-object data cloud. The technologies showcased at the Semantic Technology Conference also provide plenty of evidence of semantic-empowered commercial and scientific applications.

Multidimensional Analysis of Cloud-Modeling Requirements

We suggest a 3D slicing of the modeling requirements along the following dimensions (see Figure 2).

The types of semantics that are useful for porting or interoperability in cloud computing are similar to those we introduced in 2003 for Web services. This is natural because the primary means of interacting with a cloud environment is through Web services. The four types of semantics (data, functional, nonfunctional, and system) are based on the different semantic aspects a model must cover.

Chart1.png

The language abstraction level indicates the modeling’s granularity and specificity. Although ontological modeling is preferable at a higher level, developers prefer detailed, concrete syntactic representations. These representations of different granularities might need to be related, often through explicit annotations. For example, although service developers happily use Web Services Description Language (WSDL) descriptions in their Web services, these descriptions are syntactic and can’t provide useful semantic details. To overcome this deficiency, SAWSDL (Semantic Annotations for WSDL and XML Schema; www.w3.org/TR/sawsdl) attaches semantic-model details to WSDL documents.

The software lifecycle stage is important in determining the modeling requirements. For example, some nonfunctional and system requirements might not be modeled during development but will be taken into account only during deployment. A different team handles each of these lifecycle stages, and keeping the stages separate ensures that one team doesn’t step on another’s toes. The separation also focuses the modeling effort on the right time and the right people.

Some cloud models fall under the nonfunctional/system/ontology space in Figure 2. Such models include the Elastic Computing Modeling Language (ECML), Elastic Deployment Modeling Language (EDML), and Elastic Management Modeling Language (EMML), all based on OWL and published by Elastra (Elastra Languages). However, some aspects of cloud modeling have received little or no attention. For example, there’s no comprehensive higher-level modeling in the data and functional spaces. Lessons learned during large-scale ontological modeling in the Semantic Web and Semantic Web services, biology, and many other domains are readily applicable here and would help address some of the challenges in the cloud space.
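To make the SAWSDL idea concrete, the following minimal Python sketch reads the `sawsdl:modelReference` attribute from an XML Schema fragment and lists the ontology concepts each element points to. The attribute and its namespace come from the SAWSDL recommendation; the schema fragment, element names, and `example.org` ontology URIs are hypothetical placeholders chosen only for illustration.

```python
import xml.etree.ElementTree as ET

SAWSDL_NS = "http://www.w3.org/ns/sawsdl"
XSD_NS = "http://www.w3.org/2001/XMLSchema"

# Hypothetical schema fragment; the ontology URIs are placeholders.
SCHEMA = """
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
           xmlns:sawsdl="http://www.w3.org/ns/sawsdl">
  <xs:element name="InstanceRequest" type="xs:string"
      sawsdl:modelReference="http://example.org/cloud#ComputeInstance"/>
  <xs:element name="Region" type="xs:string"
      sawsdl:modelReference="http://example.org/cloud#Location http://example.org/geo#Region"/>
  <xs:element name="Comment" type="xs:string"/>
</xs:schema>
"""


def model_references(schema_text):
    """Map each annotated element name to its list of concept URIs."""
    root = ET.fromstring(schema_text)
    refs = {}
    for elem in root.iter(f"{{{XSD_NS}}}element"):
        value = elem.get(f"{{{SAWSDL_NS}}}modelReference")
        if value:
            # A modelReference value is a whitespace-separated list of URIs.
            refs[elem.get("name")] = value.split()
    return refs


references = model_references(SCHEMA)
for name, uris in references.items():
    print(name, "->", uris)
```

The syntactic description stays untouched for developers who only need WSDL and XML Schema, while a semantics-aware tool can follow the references to the richer model.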


Opportunities for Semantic Models in Cloud Computing
Semantic models are helpful in three aspects of cloud computing.

The first is functional and nonfunctional definitions. The ability to define application functionality and quality-of-service details in a platform-agnostic manner can immensely benefit the cloud community. This is particularly important for porting application code horizontally, that is, across silos. Lightweight semantics, which we describe in detail later, are particularly applicable.

The second aspect is data modeling. A crucial difficulty developers face is porting data horizontally across clouds. For example, moving data from a schema-less data store (such as Google Bigtable [3]) to a schema-driven data store such as a relational database presents a significant challenge. For a good overview of this concern, see the discussion of customer scenarios in the Cloud Computing User Cases White Paper. The root of this difficulty is the lack of a platform-agnostic data model. Semantic modeling of data to provide a platform-independent data representation would be a major advantage in the cloud space.

The third aspect is service description enhancement. Clouds expose their operations via Web services, but these service interfaces differ between vendors. The operations’ semantics, however, are similar. Metadata added through annotations pointing to generic operational models would play a key role in consolidating these APIs and enable interoperability among the heterogeneous cloud environments.
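As a rough illustration of the third aspect, the Python sketch below annotates vendor-specific operation names with concepts from a shared operational model and uses those annotations to translate a call from one vendor to another. The vendors, operation names, and concept URIs are all hypothetical; no real cloud API is being described.

```python
# Shared, vendor-neutral operation concepts (hypothetical URIs).
ALLOCATE = "http://example.org/cloud-ops#AllocateInstance"
RELEASE = "http://example.org/cloud-ops#ReleaseInstance"

# Annotations: (vendor, vendor-specific operation) -> shared concept.
# Vendor and operation names are invented for this sketch.
ANNOTATIONS = {
    ("vendor_a", "launch_server"): ALLOCATE,
    ("vendor_a", "destroy_server"): RELEASE,
    ("vendor_b", "CreateNode"): ALLOCATE,
    ("vendor_b", "DeleteNode"): RELEASE,
}


def find_operation(vendor, concept):
    """Return the vendor's operation annotated with the given concept."""
    for (candidate_vendor, operation), annotated in ANNOTATIONS.items():
        if candidate_vendor == vendor and annotated == concept:
            return operation
    raise LookupError(f"{vendor} has no operation for {concept}")


def translate(source_vendor, operation, target_vendor):
    """Translate an operation name by going through the shared concept."""
    concept = ANNOTATIONS[(source_vendor, operation)]
    return find_operation(target_vendor, concept)


print(translate("vendor_a", "launch_server", "vendor_b"))
```

Because every vendor maps to the shared concepts rather than to every other vendor, adding a new provider requires one set of annotations instead of a pairwise mapping for each existing provider.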

Functional Portability
From a perspective of the cloud landscape based on the language abstraction and type of semantics (that is, viewing the cube in Figure 2 from the top), we see that opportunities exist to use semantic models to define applications’ functional aspects in a platform-agnostic manner. In most cases, however, converting a high-level model directly to executable artifacts pollutes both representations. Intermediate representations are important to provide a convenient conversion.

Applying high-level modeling to describe an application’s functional aspects isn’t new. Many software development companies have been using UML to model application functionality at a high level and use artifacts derived from these models to drive development. This process, commonly called model-driven development, is an example of using high-level models to derive fine-grain artifacts. UML models usually don’t include code, so you can use them only to generate a skeletal application. That is, low-level details are deliberately kept away from the high-level models. UML models, however, are inherently bound with object-oriented languages, and UML-driven development processes depend heavily on advanced tools (for example, IBM’s Rational Rose). This limits UML’s applicability.

A popular alternative to such tool-dependent heavy upfront models is domain-specific languages (DSLs). Their popularity is due partly to the availability of extensible interpreted programming languages such as Ruby and Python. Unlike UML, a DSL is applicable only in a given domain but enables a lightweight model in that domain, often without requiring proprietary tools. For example, you can use IBM’s Sharable Code DSL (ISC), which is a mashup generator, with a basic text editor. (However, providing graphical abstractions and specialized tooling would be more convenient for users.) “Lightweight” signifies that these models don’t use rich knowledge representation languages and so have limited reasoning capabilities.
Our Cirrocumulus project for cloud interoperability (Cirrocumulus) uses DSLs to bridge the gap between executable artifacts and high-level semantic models. A DSL, although domain specific, can provide a more programmer-oriented representation of functional, nonfunctional, or even data descriptions. A best-of-both-worlds approach is to use annotations to link models, which provides the convenience of lightweight models while supporting high-level operations when required. Figure 2 shows an annotation referring to an ontology from a fictitious DSL script for configuration. The script is more programmer-oriented (in fact, it’s derived from Ruby) but lacks an ontology’s richness. However, the annotation links the relevant components between the different levels, providing a way to facilitate high-level operations while maintaining a simpler representation.

From the perspective based on the type of semantics and software lifecycle stage (that is, looking at the cube in Figure 1 from the front), you can see the modeling coverage for software deployment and management. Elastra’s Elastic Computing Modeling Language (ECML), Elastic Deployment Modeling Language (EDML), and Elastic Management Modeling Language (EMML) ontologies cover many system aspects and some nonfunctional aspects in all stages. We’re pleased to see such industry initiative in adopting semantic technologies.

Chart2.png
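The annotation-based link between a DSL script and an ontology, discussed above, could look roughly like the following Python sketch. It is not the Cirrocumulus DSL: the Ruby-like script, the `# @concept` annotation syntax, and the `example.org` ontology URIs are all assumptions made up for illustration.

```python
import re

# Hypothetical Ruby-flavored configuration script. A "# @concept <URI>"
# comment annotates the statement that follows it.
SCRIPT = '''\
# @concept http://example.org/cloud#LoadBalancer
balancer "front", :port => 80
# @concept http://example.org/cloud#ComputeInstance
instance "web", :count => 2
log_level :info
'''

ANNOTATION = re.compile(r"^#\s*@concept\s+(\S+)$")


def link_statements(script):
    """Pair every statement with the ontology concept annotated above it."""
    links = []
    pending = None
    for raw_line in script.splitlines():
        line = raw_line.strip()
        match = ANNOTATION.match(line)
        if match:
            pending = match.group(1)  # remember the concept for the next statement
        elif line:
            links.append((line, pending))  # unannotated statements get None
            pending = None
    return links


links = link_statements(SCRIPT)
for statement, concept in links:
    print(statement, "->", concept)
```

The script remains readable and executable by a plain interpreter that ignores comments, while a semantics-aware tool can follow the annotations to the ontology when high-level operations are needed.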

Data Modeling

Another opportunity for semantic models for clouds lies in Resource Description Framework (RDF) data modeling. As we discussed in part 1, a major concern plaguing cloud computing’s adoption is data lock-in, that is, the inability to port data horizontally. Many vendors designed schema-less, distributed data stores with relaxed consistency models to provide high availability and elasticity to suit clouds’ needs. However, exploiting these data stores requires substantial redesign of many data-driven applications and often makes porting data to a traditional relational database extremely difficult. The current practice is to address such transitions case-by-case. A better approach is to model the data in RDF and generate the specific target representations, and in some cases even the code for the application’s data access layer. This method can formulate transformations from one representation to another using the lifting-lowering mechanism. Semantic Annotations for WSDL and XML Schema (SAWSDL) demonstrated this mechanism’s use for data mediation.

Chart3.png
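The lifting-lowering idea can be sketched in a few lines of Python. In this simplified, assumption-laden version, a schema-less record (a plain dictionary) is lifted into subject-predicate-object triples, which stand in for RDF, and then lowered into a row for a relational table. The vocabulary URI, table, and column names are hypothetical, and plain tuples are used in place of a real RDF library.

```python
import sqlite3

BASE = "http://example.org/shop#"  # hypothetical vocabulary namespace


def lift(record_id, record):
    """Lift a schema-less record into (subject, predicate, object) triples."""
    subject = f"{BASE}customer/{record_id}"
    return [(subject, BASE + key, value) for key, value in record.items()]


def lower_to_row(triples, columns):
    """Lower triples into a fixed-column row; missing values become None."""
    values = {
        predicate[len(BASE):]: obj
        for _, predicate, obj in triples
        if predicate.startswith(BASE)
    }
    return tuple(values.get(column) for column in columns)


# A record as it might come from a schema-less store.
record = {"name": "Ada", "city": "Dayton", "nickname": "ada"}
triples = lift(42, record)

# The relational target has a fixed schema that differs from the record.
columns = ["name", "city", "email"]
row = lower_to_row(triples, columns)

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customer (name TEXT, city TEXT, email TEXT)")
conn.execute("INSERT INTO customer (name, city, email) VALUES (?, ?, ?)", row)
stored = conn.execute("SELECT name, city, email FROM customer").fetchone()
print(stored)
```

Note that lowering can be lossy: the `nickname` property has no column in the target table and is dropped, which is exactly the kind of mismatch a shared model makes explicit instead of leaving it to case-by-case scripts.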

Lightweight modeling in terms of DSLs also applies here. For example, the Web services community has long used XML Schema definitions as platform-agnostic data definitions. Schema definitions serve as inputs to code generation tools that generate platform-specific data definitions. From the perspective of the type of semantics and software lifecycle stage, most of this data modeling applies during application development. Concrete artifacts generated from these high-level models would be used mostly during subsequent lifecycle stages.

One feature differentiating the cloud from other distributed environments is the availability of Web services to manipulate resources. Availability of the service APIs lets you programmatically manage the cloud resources, even from within the same cloud. These capabilities have revolutionized application deployment and management. For example, you can compose or mash up well-defined services to facilitate elaborate workflows. Service definitions are usually syntactic, and many researchers have focused on embedding rich metadata in formal service descriptions. One result of this research is SAWSDL.

A growing trend is to annotate HTML descriptions to embed richer, machine-readable semantic metadata. One reason for this method’s popularity is search engines’ use of metadata to display results in customized formats. Yahoo’s SearchMonkey and Google Rich Snippets are two such microformat-driven schemes. These annotations, unlike the DSL annotations in Figure 3, might not always point to ontologies. Their structure can be based on a vocabulary or taxonomy, a lower-grade nonsemantic model. For example, the popular hCalendar microformat is part of the “lowercase semantic web” movement, which emphasizes lightweight models.

Embedding rich semantic metadata in cloud service descriptions has three main benefits that go beyond customized search capabilities.

The first benefit deals with Representational State Transfer (REST) style services. Many cloud service providers adopt REST-style Web services that don’t advocate a formal service description. These services are described using HTML pages. WSDL 2.0, the latest specification, explicitly supports formal description of “RESTful” services but hasn’t seen quick adoption. Alternative approaches such as SA-REST (SA stands for semantic annotation), a generic annotation scheme that follows microformat design principles, are becoming more applicable in this space. These annotations enable the seamless, flexible integration of formalizations into RESTful service descriptions. This opens the door to many exciting avenues such as faceted search to identify relevant reusable services and semiautomated service compositions.

The second benefit deals with handling change. The cloud space is still evolving. If the history of software or component interoperability is any guide, achieving consensus in the cloud space will be difficult and won’t likely happen soon. Attaching formalizations via annotations, however, is flexible enough to accommodate an evolving model. This is especially attractive to vendors who aren’t willing to invest heavily in interim standards.

The third benefit is that the formalizations apply not only to service descriptions but also to many other aspects such as service level agreements (SLAs) and software licenses. You can use annotations to embed formalizations even for these documents, facilitating more automation in the cloud space. For example, the Web Service Level Agreement (WSLA) specification provides a way to formalize SLAs, but creating and maintaining these formalizations is time-consuming. Figure 3 illustrates using SA-REST annotations on the Amazon Elastic Compute Cloud (EC2) SLA document. It shows how a capable processor could use these annotations to extract a WSLA equivalent of the human-readable SLA.
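A processor of the kind just described might work along the lines of this Python sketch, which pulls annotated parameters out of a human-readable SLA page. The markup convention used here (a `span` with class `sla-param` whose `title` holds a concept URI) is an assumption in the spirit of microformat-style annotation, not the actual SA-REST syntax, and the SLA text and figures are invented rather than taken from any real provider’s agreement.

```python
from html.parser import HTMLParser

# Invented SLA fragment with hypothetical microformat-style annotations.
SLA_HTML = """
<p>The provider commits to an uptime of
<span class="sla-param" title="http://example.org/sla#UptimePercentage">99.9</span>
percent, measured over
<span class="sla-param" title="http://example.org/sla#MeasurementPeriodDays">365</span>
days.</p>
"""


class AnnotationExtractor(HTMLParser):
    """Collect concept URI -> value pairs from annotated span elements."""

    def __init__(self):
        super().__init__()
        self.parameters = {}
        self._current = None  # concept URI of the span being read, if any

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        if tag == "span" and attributes.get("class") == "sla-param":
            self._current = attributes.get("title")

    def handle_data(self, data):
        if self._current is not None:
            self.parameters[self._current] = data.strip()

    def handle_endtag(self, tag):
        if tag == "span":
            self._current = None


extractor = AnnotationExtractor()
extractor.feed(SLA_HTML)
parameters = extractor.parameters
for concept, value in parameters.items():
    print(concept, "=", value)
```

The extracted concept-value pairs are the raw material from which a formal agreement document could be assembled, without anyone maintaining that formal document by hand.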
The importance of these benefits comes into perspective when you consider the enormous body of research on standard-driven service compositions and agreement matching. The informal, non-standard-driven nature of many cloud services made most of the previous research inapplicable. However, being able to glean formalizations from existing documents opens the door to applying many well-researched techniques.

The cloud space presents many opportunities for researchers, and we see a plethora of applications that use semantic modeling. Issues such as interoperability and data portability, which the cloud community is facing right now, are the very issues for which semantic models excel in providing solutions. However, learning from the past, we advocate a multilevel modeling strategy to provide smooth transitions into different granularity levels. We also think that DSLs can play an important role in the cloud space to provide lightweight modeling in an appealing manner to the software engineering community.

References
1. C. Thomas et al., “Growing Fields of Interest—Using an Expand and Reduce Strategy for Domain Model Extraction,” Proc. 2008 Int’l Conf. Web Intelligence and Intelligent Agent Technology (WI-IAT 08), vol. 1, IEEE CS Press, 2008, pp. 496–502.

2. K. Sivashanmugam et al., “Adding Semantics to Web Services Standards,” Proc. Int’l Conf. Web Services (ICWS 03), CSREA Press, 2003, pp. 395–401.

3. F. Chang et al., “Bigtable: A Distributed Storage System for Structured Data,” Proc. Usenix Symp. Operating Systems Design and Implementation, Usenix Assoc., 2006, p. 15.

4. M. Nagarajan et al., “Semantic Interoperability of Web Services—Challenges and Experiences,” Proc. 2006 IEEE Int’l Conf. Web Services (ICWS 06), IEEE CS Press, 2006, pp. 373–382.

5. A.P. Sheth, K. Gomadam, and J. Lathem, “SA-REST: Semantically Interoperable and Easier-to-Use Services and Mashups,” IEEE Internet Computing, vol. 11, no. 6, 2007, pp. 91–94.

Amit Sheth is the director of the Ohio Center of Excellence on Knowledge-Enabled Computing (Kno.e.sis) at Wright State University. He’s also the university’s LexisNexis Ohio Eminent Scholar. He’s on the Web at http://knoesis.org/amit/.

Ajith Ranabahu is pursuing a PhD in cloud-computing interoperability at Wright State University. He worked with IBM on Sharable Code and its Altocumulus project, and he coordinates the Cirrocumulus project (Cirrocumulus). Contact him at ajith@knoesis.org.