Computer Science (计算机科学), 2017, Vol. 44, Issue 3: 42-47. doi: 10.11896/j.issn.1002-137X.2017.03.011
许婧,任开军,李小勇
XU Jing, REN Kai-jun and LI Xiao-yong
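Abstract: As the skill and resolution of numerical weather prediction continue to improve, meteorological scientific data are growing massively, and the Meteorological Archival and Retrieval System (MARS) handles big-data service requests inefficiently as a result. To address this, we studied how to optimize the region query mode of MARS retrieval. Combining the mathematical idea of the complement set with the principle of multiway array aggregation, we propose an efficient complement-transformation region query method (CTRQ), which turns the "big data" computation of a large-area region query into a "small data" computation. The basic idea is to compare the size of the hypercube's aggregation dimension with the size of the attribute-value set in the region query request. When the request covers more than half of the dimension, the index computation is performed on the complement instead ("complement when over half"), and a second complement then recovers the physical storage information of the requested meteorological field data. Experiments show that, compared with the original index computation method, CTRQ effectively reduces the system overhead of metadata index computation during data retrieval. On this basis, we designed and implemented a parallel CTRQ algorithm using parallel processing. It achieves a speedup of up to 1.9x over the improved serial algorithm, further improving the retrieval efficiency of MARS.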
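The "complement when over half" step can be illustrated with a minimal Python sketch. This is not the authors' implementation: it assumes a single aggregation dimension with unique values, treats a position in the list as a stand-in for a physical storage offset, and uses a hypothetical function name `ctrq_offsets`. The second return value counts how many per-value index lookups were performed, which is the cost the abstract says CTRQ reduces.

```python
def ctrq_offsets(dim_values, requested):
    """Return (sorted offsets of the requested values, number of index lookups).

    dim_values: all attribute values along one aggregation dimension
                (assumed unique; list position stands in for a storage offset).
    requested:  the attribute values asked for by the region query.
    """
    # Hypothetical metadata index: attribute value -> storage offset.
    index = {value: pos for pos, value in enumerate(dim_values)}
    requested = set(requested)

    missing = requested.difference(index)
    if missing:
        raise KeyError(f"values not on this dimension: {sorted(missing)}")

    # Request covers at most half of the dimension: look up each value directly.
    if 2 * len(requested) <= len(index):
        offsets = sorted(index[v] for v in requested)
        return offsets, len(requested)

    # Request covers more than half: take the first complement and
    # look up only the (smaller) set of values that were NOT requested.
    complement = set(index).difference(requested)
    excluded = {index[v] for v in complement}

    # Second complement: every offset on the dimension except the excluded ones.
    offsets = [pos for pos in range(len(index)) if pos not in excluded]
    return offsets, len(complement)


# Illustrative data only: eight pressure levels, six of them requested.
levels = [1000, 925, 850, 700, 500, 300, 200, 100]
offsets, lookups = ctrq_offsets(levels, [1000, 925, 850, 700, 300, 200])
print(offsets, lookups)  # [0, 1, 2, 3, 5, 6] 2
```

In this toy case the query asks for six of eight values, so only the two excluded values go through the index lookup. How the real system charges for index computation and how the offsets map to stored fields are not described in the abstract, so those parts are assumptions here.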