Viscous gravitational algorithm for clustering inaccurate data

Keywords: data clustering, imprecise data, gravity algorithm, viscosity, Pauli repulsion

Abstract

Clustering is one of the basic problems of machine learning, along with pattern recognition, classification, and forecasting. It is especially important in the analysis of Big Data, which can only be processed by computer. However, the problem of automatically partitioning data into clusters while accounting for errors in the initial data still has no unambiguous solution, and more adequate approaches are needed, including ones that determine the number of clusters automatically. This paper proposes a new data clustering method based on a modification of the gravitational algorithm, which draws on an analogy with the formation of stellar clusters through the attraction of masses under the law of universal gravitation. In this approach, physical masses are replaced by points in a multidimensional data space, and the motion of these points under their mutual attraction leads to the formation of clusters. A disadvantage of the basic gravitational method is inertia, which can hinder the clustering process and eject accelerated particles from a cluster while it is forming. To exclude such effects, we use a model of viscous motion of the particles representing the data, together with a natural limit on cluster size that arises from the repulsion of particles. The repulsive force is modeled by the Pauli-type interaction between fermions with the same spin, assuming a Gaussian distribution of the error density. The basic equations describing the steps of the modified gravitational algorithm are given.
A numerical example demonstrates the features and advantages of the viscous gravitational algorithm in comparison with the k-means method and the density-based DBSCAN method, including automatic termination of the procedure once the main clustering process is complete. The results allow for blind clustering of Big Data and can be generalized to multidimensional optimization problems.
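
The idea summarized above can be illustrated with a minimal sketch. It is not the paper's implementation: the abstract does not give the equations, so the softened inverse-square attraction, the Gaussian short-range repulsion standing in for the Pauli-type term, the equal masses 1/N, and all names and parameter values (net_force, viscous_cluster, link_labels, g, soft, rep, sigma, dt, tol, radius) are assumptions chosen for illustration. Viscosity is modeled as overdamped motion, where each displacement is proportional to the current force, so no velocity accumulates and particles cannot be ejected by inertia.

```python
import numpy as np


def net_force(x, g=1.0, soft=0.5, rep=1.0, sigma=0.3):
    """Mean pairwise force on each point (equal masses 1/N; values illustrative)."""
    diff = x[None, :, :] - x[:, None, :]      # diff[i, j] = x[j] - x[i]
    r2 = np.sum(diff ** 2, axis=-1)           # squared pair distances
    attract = g / (r2 + soft ** 2) ** 1.5     # softened inverse-square attraction
    # Gaussian core: an assumed stand-in for the Pauli-type repulsion.
    repel = rep / sigma ** 2 * np.exp(-r2 / (2.0 * sigma ** 2))
    w = attract - repel                       # > 0 pulls i toward j, < 0 pushes away
    np.fill_diagonal(w, 0.0)                  # no self-interaction
    return np.einsum("ij,ijk->ik", w, diff) / len(x)


def viscous_cluster(x, dt=0.05, tol=1e-3, max_iter=2000, **force_kw):
    """Overdamped motion: displacement = dt * force, so no momentum builds up."""
    x = np.asarray(x, dtype=float).copy()
    n_steps = 0
    for _ in range(max_iter):
        step = dt * net_force(x, **force_kw)
        x += step
        n_steps += 1
        # Stop automatically once the largest displacement has died out.
        if np.max(np.linalg.norm(step, axis=1)) < tol:
            break
    return x, n_steps


def link_labels(x, radius=1.0):
    """Label connected components of the graph joining points closer than radius."""
    x = np.asarray(x, dtype=float)
    dist = np.linalg.norm(x[:, None, :] - x[None, :, :], axis=-1)
    labels = -np.ones(len(x), dtype=int)
    current = 0
    for seed in range(len(x)):
        if labels[seed] >= 0:
            continue
        labels[seed] = current
        stack = [seed]
        while stack:
            i = stack.pop()
            for j in np.flatnonzero((dist[i] < radius) & (labels < 0)):
                labels[j] = current
                stack.append(j)
        current += 1
    return labels


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Two synthetic blobs; the data and the linking radius are hypothetical.
    data = np.vstack([rng.normal(0.0, 0.5, (30, 2)), rng.normal(8.0, 0.5, (30, 2))])
    settled, n_steps = viscous_cluster(data)
    print(n_steps, np.unique(link_labels(settled, radius=1.0)).size)
```

With these illustrative parameters the net pair force is repulsive at short range and attractive at long range, which is what bounds the cluster size. The number of clusters is read off afterwards by linking settled points, not supplied in advance.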

Author Biography

Pavel A. Golovinski, Voronezh State Technical University

Dr. Phys.-Math. Sci., Professor of the Department of Innovation and Building Physics named after I. S. Surovtsev, Voronezh State Technical University

Published

2022-04-26

How to Cite

Golovinski, P. A. (2022). Viscous gravitational algorithm for clustering inaccurate data. Proceedings of Voronezh State University. Series: Systems Analysis and Information Technologies, (1), 79–89. https://doi.org/10.17308/sait.2022.1/9203

Section

Intelligent Information Systems, Data Analysis and Machine Learning