R. K. Klassen, V. A. Raikhlin. Improving the Efficiency of Clusterix-Like DBMS for Big Data Analytical Processing

Abstract.

Commercial OLAP systems are unaffordable for organizations with limited budgets. Such organizations can carry out analytical processing of large data volumes using open source software systems on a cost-effective cluster platform. Previously developed Clusterix-like DBMSs were not efficient enough by the «performance/cost» criterion. To improve the efficiency of such systems, the article considers their further development, focusing on full loading of processor cores and on GPU acceleration (the Clusterix-N systems, where N stands for "New"), up to a system comparable in efficiency to the open source system Spark, which is currently considered the most promising. The development followed the constructive system modeling methodology.

Keywords:

analytical processing of significant data volumes, open source software systems on a cluster platform, increasing the efficiency of Clusterix-like DBMS, full load of processor cores, GPU acceleration, comparison with Spark, accepted methodology.

PP. 43-59.

DOI 10.14357/20718632190405

 

 
