DOI: 10.1145/2487788.2487917
demonstration

Optimizing RDF(S) queries on cloud platforms

Published: 13 May 2013

Abstract

Scalable processing of Semantic Web queries has become a critical need given the rapid growth in the availability of Semantic Web data. The MapReduce paradigm is emerging as a platform of choice for large-scale data processing and analytics due to its ease of use, cost effectiveness, and potential for unlimited scaling. Processing queries on Semantic Web triple models is a challenge on the mainstream MapReduce platform, Apache Hadoop, and on its extensions such as Pig and Hive. This is because such queries require numerous joins, which lead to lengthy and expensive MapReduce workflows. Further, in this paradigm, cloud resources are acquired on demand, and traditional join optimization machinery such as statistics and indexes is often absent or not easily supported.
In this demonstration, we will present RAPID+, an extended Apache Pig system that uses an algebraic approach for optimizing queries on RDF data models, including queries involving inferencing. The basic idea is that by using logical and physical operators that are more natural to MapReduce processing, we can reinterpret such queries in a way that leads to more concise execution workflows and smaller intermediate data footprints, which minimize disk I/O and network transfer overhead. RAPID+ evaluates queries using the Nested TripleGroup Data Model and Algebra (NTGA). The demo will show the comparative performance of NTGA query plans versus the relational algebra-like query plans used by Apache Pig and Hive.
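As a rough illustration of the contrast drawn in the abstract, the toy sketch below evaluates a star pattern (several properties required on the same subject) in two ways: with one join per additional property, and with a single group-by on subject followed by a filter. This is not RAPID+ or NTGA code. The data, the function names `star_by_joins` and `star_by_grouping`, and the single-valued-property simplification are all assumptions made for the example, and how each plan maps onto actual MapReduce jobs depends on the engine and is not modeled here.

```python
from collections import defaultdict

# Hypothetical toy data: (subject, property, object) triples.
# Each subject has at most one value per property (a simplification).
TRIPLES = [
    ("p1", "name", "Widget"),
    ("p1", "price", "10"),
    ("p1", "vendor", "v1"),
    ("p2", "name", "Gadget"),
    ("p2", "vendor", "v1"),
    ("p3", "name", "Gizmo"),
    ("p3", "price", "7"),
]


def star_by_joins(triples, props):
    """Relational-style plan: one join on subject per additional property."""
    # Bindings for the first property, keyed by subject.
    rows = {s: {p: o} for s, p, o in triples if p == props[0]}
    joins = 0
    for prop in props[1:]:
        # Select the triples for this property, then join on subject.
        matches = {s: o for s, p, o in triples if p == prop}
        rows = {
            s: {**bindings, prop: matches[s]}
            for s, bindings in rows.items()
            if s in matches
        }
        joins += 1
    return rows, joins


def star_by_grouping(triples, props):
    """Grouping-style plan: one group-by on subject, then a filter."""
    groups = defaultdict(dict)
    for s, p, o in triples:
        if p in props:  # keep only triples relevant to the pattern
            groups[s][p] = o
    # Keep only groups that contain every required property.
    return {s: g for s, g in groups.items() if all(p in g for p in props)}
```

Calling both functions with `["name", "price", "vendor"]` yields the same bindings, but the first plan performs two joins and builds an intermediate result after each one, while the second makes a single pass over the triples.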

Information & Contributors

Information

Published In

WWW '13 Companion: Proceedings of the 22nd International Conference on World Wide Web
May 2013
1636 pages
ISBN: 9781450320382
DOI: 10.1145/2487788

Sponsors

  • NICBR: Núcleo de Informação e Coordenação do Ponto BR
  • CGIBR: Comitê Gestor da Internet no Brasil

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. hadoop
  2. mapreduce
  3. rdf(s)
  4. sparql

Qualifiers

  • Demonstration

Conference

WWW '13: 22nd International World Wide Web Conference
May 13 - 17, 2013
Rio de Janeiro, Brazil

Acceptance Rates

WWW '13 Companion Paper Acceptance Rate: 831 of 1,250 submissions, 66%
Overall Acceptance Rate: 1,899 of 8,196 submissions, 23%

Bibliometrics & Citations

Bibliometrics

Article Metrics

  • Downloads (Last 12 months): 2
  • Downloads (Last 6 weeks): 0
Reflects downloads up to 23 Feb 2025

Citations

Cited By

  • (2019) Towards Massive RDF Storage in NoSQL Databases. Emerging Technologies and Applications in Data Processing and Management, pp. 263-284. DOI: 10.4018/978-1-5225-8446-9.ch013. Online publication date: 2019
  • (2019) Framework-Based Scale-Out RDF Systems. Encyclopedia of Big Data Technologies, pp. 771-777. DOI: 10.1007/978-3-319-77525-8_225. Online publication date: 20-Feb-2019
  • (2018) RDF Data Storage and Query Processing Schemes. ACM Computing Surveys 51:4, pp. 1-36. DOI: 10.1145/3177850. Online publication date: 6-Sep-2018
  • (2018) Distributed RDF Query Processing. Linked Data, pp. 51-83. DOI: 10.1007/978-3-319-73515-3_4. Online publication date: 2-Mar-2018
  • (2018) Framework-Based Scale-Out RDF Systems. Encyclopedia of Big Data Technologies, pp. 1-7. DOI: 10.1007/978-3-319-63962-8_225-1. Online publication date: 24-Mar-2018
