Abstract
The main mistakes that researchers make when predicting events with machine learning models are discussed. These mistakes are: losing the events themselves by constructing abstract features; training models on customers rather than on the events generated by customers; constructing artificial features; validating incorrectly and using erroneous model quality metrics; and using static parameters. The mistakes made in one example from Kaggle are analyzed. The area under the ROC curve reported for this example is very high, 0.88, but this quality metric is calculated incorrectly. After all errors are corrected, the metric is 0.599. A different approach to analyzing and predicting events is then presented, one that differs significantly from classical machine learning methods. The method considers the individual mechanisms of event formation for each customer. Models of these mechanisms are built, their parameters are recovered using mathematical methods, and the parameters are extrapolated into the future. The forecast of a future event is obtained by running the mechanism model with the established parameter values. The resulting area under the ROC curve is 0.615, slightly higher than the corrected value for the machine-learning-based Kaggle example. This shows that the proposed approach is competitive with advanced machine learning techniques.