Detection of outliers in the data of practically homogeneous technical systems

Keywords: system, system analysis, time series, patterns, a priori information, almost identical technical systems, similarity measures, outliers, data, defects

Abstract

The article presents a method for detecting outliers in time-series data on a particular parameter of practically homogeneous technical systems (hereinafter PHTS) by comparing the series with one another, and notes the cases in which such a comparison cannot be carried out. The method is relevant because time series are widely used in modern technical systems. To develop it, a non-exhaustive review of methods for detecting defects in data was carried out. The review focuses on outlier detection, since a defect such as gaps in the data is almost always explicitly visible. The advantages and disadvantages of existing outlier detection methods were taken into account in developing the method. The method has broad prospects for further application: it can help identify outliers in the data of the systems under study, and where the data are similar, it gives the researcher additional confidence that the systems were in good condition and operating under nearly equal conditions during the period in question. A difference, in turn, may indicate outliers in the data, which may be caused by malfunctions in one of the systems or in the systems that collect and store their data, or by the influence of an unaccounted-for factor on one of the systems. Despite some subjectivity, the method is flexible, and it does not require building complex models of the reference behavior of the studied PHTS parameter against which the data of the systems would be compared. This suggests that the method is promising for data analysis even of complex PHTS.
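The comparative idea can be illustrated with a minimal sketch. The abstract does not state which similarity measure or decision rule the authors use, so everything here is an assumption made for illustration: the function name `flag_divergent_points`, the use of pointwise differences between two equally sampled series, and the modified z-score (median and median absolute deviation) with a cutoff of 3.5 are all hypothetical choices, not the published method.

```python
from statistics import median


def flag_divergent_points(series_a, series_b, threshold=3.5):
    """Return indices where two equally sampled series diverge unusually.

    Hypothetical helper for illustration only: it scores the pointwise
    difference between the two series with a modified z-score and flags
    points whose score exceeds `threshold`.
    """
    if len(series_a) != len(series_b):
        raise ValueError("series must have the same length")

    diffs = [a - b for a, b in zip(series_a, series_b)]
    center = median(diffs)
    # Median absolute deviation: a spread estimate that outliers barely affect.
    mad = median(abs(d - center) for d in diffs)

    if mad == 0:
        # No typical spread at all: flag every point that differs from the center.
        return [i for i, d in enumerate(diffs) if d != center]

    # 0.6745 rescales the MAD so the score is comparable to a standard z-score
    # for normally distributed differences.
    return [
        i for i, d in enumerate(diffs)
        if 0.6745 * abs(d - center) / mad > threshold
    ]


# Made-up readings of the same parameter from two near-identical systems.
system_1 = [10, 11, 10, 12, 11, 30, 10, 11]
system_2 = [10, 10, 11, 11, 11, 11, 10, 12]
print(flag_divergent_points(system_1, system_2))
```

With these made-up readings only index 5 is flagged. A flagged index says that the two systems disagree at that moment, not which of them is at fault; as the abstract notes, the cause may lie in either system, in data collection, or in an unaccounted-for factor.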

Author Biographies

Alexander S. Dulesov, Khakass State University named after N. F. Katanov

Doctor of Engineering Sciences, Professor of the Department of Digital Technologies and Design, Khakass State University named after N. F. Katanov

Anatoly V. Bayshev, Khakass State University named after N. F. Katanov

3rd-year postgraduate student of the Department of Digital Technologies and Design, Khakass State University named after N. F. Katanov

Published
2023-10-26
How to Cite
Dulesov, A. S., & Bayshev, A. V. (2023). Detection of outliers in the data is practically homogeneous technical systems. Proceedings of Voronezh State University. Series: Systems Analysis and Information Technologies, (3), 121-133. https://doi.org/10.17308/sait/1995-5499/2023/3/121-133
Section
Intelligent Information Systems, Data Analysis and Machine Learning