Skyline Queries over Incomplete Data - Error Models for Focused Crowd-Sourcing

Lofi, Christoph; El Maarry, Kinda; Balke, Wolf-Tilo

doi:10.1007/978-3-642-41924-9_25

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 8217)

Included in the following conference series:

International Conference on Conceptual Modeling

Abstract

Skyline queries are a well-known technique for explorative retrieval, multi-objective optimization problems, and personalization tasks in databases. They are widely acclaimed for their intuitive query formulation mechanisms. However, when operating on incomplete datasets, skyline query processing is severely hampered and often has to resort to error-prone heuristics. Unfortunately, incomplete datasets are a frequent phenomenon due to the widespread use of automated information extraction and aggregation. In this paper, we evaluate and compare various established heuristics for adapting skylines to incomplete datasets, focusing specifically on the error they impose on the skyline result. Building upon these results, we argue for improving the skyline result quality by employing crowd-enabled databases. These allow dynamic outsourcing of some database operators to human workers, thereby enabling the elicitation of missing values at runtime. Unfortunately, each crowd-sourcing operation incurs monetary and query runtime costs. Therefore, our main contribution is a sophisticated error model that allows us to concentrate specifically on those tuples that are highly likely to be error-prone, while relying on established heuristics for safer tuples. This technique of focused crowd-sourcing allows us to strike a perfect balance between costs and result quality.
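To make the problem in the abstract concrete, the following is a minimal sketch of a skyline over tuples with a missing value. It is not the paper's error model or any of the heuristics it evaluates: the column-mean imputation, the "larger is better" convention, the function names, and the sample data are all assumptions chosen for illustration. The sketch only shows how a heuristic fill-in can change skyline membership compared with the true value, which is the kind of error the abstract proposes to reduce by asking the crowd for selected values.

```python
from typing import List, Optional, Tuple

Row = Tuple[Optional[float], ...]


def dominates(a: Tuple[float, ...], b: Tuple[float, ...]) -> bool:
    """a dominates b if a is at least as good in every attribute and
    strictly better in at least one (assumption: larger is better)."""
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))


def skyline(rows: List[Tuple[float, ...]]) -> List[Tuple[float, ...]]:
    """Naive quadratic skyline: keep every row no other row dominates."""
    return [r for r in rows if not any(dominates(o, r) for o in rows)]


def impute_with_column_mean(rows: List[Row]) -> List[Tuple[float, ...]]:
    """Illustrative heuristic (an assumption, not taken from the paper):
    replace each missing value by the mean of the known values in its
    column. Assumes every column has at least one known value."""
    n_cols = len(rows[0])
    means = []
    for c in range(n_cols):
        known = [r[c] for r in rows if r[c] is not None]
        means.append(sum(known) / len(known))
    return [
        tuple(means[c] if r[c] is None else r[c] for c in range(n_cols))
        for r in rows
    ]


# Hypothetical data: the fourth tuple is missing its second attribute.
rows: List[Row] = [
    (5.0, 1.0),
    (3.0, 3.0),
    (1.0, 5.0),
    (2.0, None),
    (2.0, 2.0),
]

heuristic_skyline = skyline(impute_with_column_mean(rows))

# Suppose a crowd worker reports that the missing value is actually 4.0.
elicited = [r if r[1] is not None else (r[0], 4.0) for r in rows]
elicited_skyline = skyline(elicited)

print(heuristic_skyline)
print(elicited_skyline)
```

In this toy case the imputed tuple (2.0, 2.75) is dominated by (3.0, 3.0) and drops out, whereas with the elicited value 4.0 the tuple belongs to the skyline. Deciding which missing values are worth the cost of eliciting is the question the paper's error model addresses; the sketch does not attempt that step.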

Author information

Authors and Affiliations

National Institute of Informatics, Tokyo, 101-8430, Japan
Christoph Lofi
Institut für Informationssysteme, Technische Universität Braunschweig, 38106, Braunschweig, Germany
Kinda El Maarry & Wolf-Tilo Balke

Editor information

Editors and Affiliations

Department of Computer Science and Engineering, The Hong Kong University of Science and Technology, Hong Kong, China
Wilfred Ng
Department of Computer Information Systems, J. Mack Robinson College of Business, Georgia State University, USA
Veda C. Storey
University of Alicante, 03690, Alicante, Spain
Juan C. Trujillo

About this paper

Cite this paper

Lofi, C., El Maarry, K., Balke, WT. (2013). Skyline Queries over Incomplete Data - Error Models for Focused Crowd-Sourcing. In: Ng, W., Storey, V.C., Trujillo, J.C. (eds) Conceptual Modeling. ER 2013. Lecture Notes in Computer Science, vol 8217. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-41924-9_25

DOI: https://doi.org/10.1007/978-3-642-41924-9_25
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-41923-2
Online ISBN: 978-3-642-41924-9
eBook Packages: Computer Science, Computer Science (R0)
