Multiple estimation of parameters of a linear regression model with an interval specified dependent variable

Authors

Sergey Ivanovich Noskov, Yurii Markovich Sapozhnikov

DOI:

https://doi.org/10.17308/sait/1995-5499/2024/2/71-79

Keywords:

regression model, parameter estimates, interval data assignment, polyhedron vertex, center of gravity, compromise solution, Pareto set, linear programming problem

Abstract

The paper formulates the problem of estimating the unknown parameters of a linear regression model when the sample data for the predictor variables are specified in the usual pointwise form, while the data for the dependent variable are specified as intervals. It is assumed that no information, probabilistic or otherwise, is available to locate the “true” values of the variable inside or on the boundaries of the specified intervals. In general, the set of admissible parameter estimates in this setting is described by a system of linear inequalities. When the system is compatible, the proposed solution is the vector of parameter estimates that provides the maximum resolution of the system, a technique often used in the theory of vector optimization. When the system is incompatible, the task is posed as finding a quasi-solution of a two-criteria linear programming problem, in which the first criterion is the loss function of the least modulus (least absolute deviations) method and the second is the loss function of the antirobust parameter estimation method. These methods treat outliers in the data differently: the first ignores them, while the second, on the contrary, gravitates strongly towards them. The problem is solved in three stages. First, by solving a series of linear programming problems, the set of Pareto vertices of the polyhedron that forms the domain of compatibility of the system of linear inequalities is found. Then the Pareto set is constructed as the union of the edges connecting neighboring vertices. Finally, one representative of this set, the so-called compromise solution, is selected so as to reflect the configuration of the set. A simple numerical example is solved, and the solution obtained is compared with the one given by the least modulus method applied to averaged data.
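The compatible case can be illustrated with a minimal sketch. It rests on assumptions that are not taken from the paper: a single predictor, so the model is y = a0 + a1*x; "maximum resolution" read as the largest common margin t by which every interval constraint lo_i <= a0 + a1*x_i <= hi_i can be tightened on both sides; and brute-force enumeration of polyhedron vertices in exact rational arithmetic standing in for a linear programming solver. The names `solve3` and `max_margin_estimate` are hypothetical.

```python
from fractions import Fraction
from itertools import combinations


def solve3(rows, rhs):
    """Solve a 3x3 linear system by Cramer's rule; return None if singular."""
    def det(m):
        return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
                - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
                + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))

    d = det(rows)
    if d == 0:
        return None
    solution = []
    for k in range(3):
        m = [list(r) for r in rows]
        for i in range(3):
            m[i][k] = rhs[i]          # replace column k with the right-hand side
        solution.append(Fraction(det(m)) / Fraction(d))
    return tuple(solution)


def max_margin_estimate(x, lo, hi):
    """Maximize t subject to lo_i + t <= a0 + a1*x_i <= hi_i - t.

    Returns (a0, a1, t). Assumes at least two distinct x values, so the
    constraint polyhedron has vertices and the optimum sits at one of them.
    A negative t means the original interval system is incompatible.
    """
    one = Fraction(1)
    cons = []                          # each constraint: g . (a0, a1, t) <= h
    for xi, l, u in zip(x, lo, hi):
        xi, l, u = Fraction(xi), Fraction(l), Fraction(u)
        cons.append(((-one, -xi, one), -l))   # a0 + a1*xi - t >= l
        cons.append(((one, xi, one), u))      # a0 + a1*xi + t <= u

    best = None
    for trio in combinations(cons, 3):
        z = solve3([g for g, _ in trio], [h for _, h in trio])
        if z is None:
            continue
        # keep the candidate only if it satisfies every constraint
        if all(sum(gi * zi for gi, zi in zip(g, z)) <= h for g, h in cons):
            if best is None or z[2] > best[2]:
                best = z
    return best


# Illustrative data (invented for this sketch, not the paper's example).
a0, a1, t = max_margin_estimate([0, 1, 2], lo=[0, 0, 2], hi=[2, 4, 4])
print(a0, a1, t)   # 1 1 1
```

With these illustrative intervals the fitted line y = 1 + x stays at least one unit away from every interval boundary. The enumeration grows combinatorially with the number of observations, so for realistic data a linear programming solver would be the natural choice; the sketch only makes the geometry of the vertex search explicit.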

Author Biographies

  • Sergey Ivanovich Noskov, Irkutsk State Transport University

    Doctor of Technical Sciences, Professor, Department of Information Systems and Information Security, Irkutsk State Transport University

  • Yurii Markovich Sapozhnikov, Irkutsk State Transport University

    Candidate of Chemical Sciences, Associate Professor, Department of Customs and Law, Irkutsk State Transport University

Published

2024-10-14

Section

Mathematical Methods of System Analysis, Management and Modelling

How to Cite

Multiple estimation of parameters of a linear regression model with an interval specified dependent variable. (2024). Proceedings of Voronezh State University. Series: Systems Analysis and Information Technologies, 2, 71-79. https://doi.org/10.17308/sait/1995-5499/2024/2/71-79