This paper presents a clustering-based method for preparing machine learning data for the semantic segmentation of informative classes in images, intended for space monitoring of impact areas. Clustering methods are classified according to several criteria, and the choice of hierarchical clustering as the most effective approach for clusters of arbitrary structure and shape is justified. A general scheme for computing a clustering model is given; in addition to the clustering itself, it includes procedures for data tiling, estimation of the optimal clustering parameters, object registration, and quality assessment of the resulting data. A scheme for preparing data for machine learning is also presented, comprising construction of a reference markup, computation of a clustering model, markup correction, and testing of the resulting clustering models for different informative classes on new images.
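To make the tiling and hierarchical clustering steps concrete, the following is a minimal sketch and not the implementation used in this work. It assumes a single-band image stored as a list of rows, non-overlapping square tiles, single-linkage agglomerative clustering on pixel intensity, and a fixed number of clusters per tile; the scheme above instead estimates the optimal clustering parameters. The names `tile_image`, `single_linkage`, and `cluster_tile` are hypothetical.

```python
from itertools import product


def tile_image(image, tile):
    """Split a 2-D grid (list of rows) into non-overlapping tile x tile blocks.

    Returns a dict mapping the top-left (row, col) of each block to its pixels.
    """
    rows, cols = len(image), len(image[0])
    tiles = {}
    for r0, c0 in product(range(0, rows, tile), range(0, cols, tile)):
        tiles[(r0, c0)] = [row[c0:c0 + tile] for row in image[r0:r0 + tile]]
    return tiles


def single_linkage(values, n_clusters):
    """Naive agglomerative clustering of scalar values (assumed: single linkage).

    Starts with one cluster per value and repeatedly merges the two closest
    clusters until n_clusters remain. Returns one label per input value.
    """
    clusters = [[i] for i in range(len(values))]
    while len(clusters) > n_clusters:
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                # Single linkage: distance between the closest pair of members.
                d = min(abs(values[i] - values[j])
                        for i in clusters[a] for j in clusters[b])
                if best is None or d < best[0]:
                    best = (d, a, b)
        _, a, b = best
        clusters[a].extend(clusters[b])
        del clusters[b]  # b > a, so index a stays valid
    labels = [0] * len(values)
    for k, members in enumerate(clusters):
        for i in members:
            labels[i] = k
    return labels


def cluster_tile(tile_pixels, n_clusters=2):
    """Cluster the pixels of one tile and return a label map of the same shape."""
    flat = [v for row in tile_pixels for v in row]
    labels = single_linkage(flat, n_clusters)
    width = len(tile_pixels[0])
    return [labels[i:i + width] for i in range(0, len(labels), width)]


# Toy 4x4 single-band image (made-up intensities, for illustration only).
image = [
    [10, 12, 200, 205],
    [11, 13, 198, 202],
    [10, 200, 201, 12],
    [12, 199, 203, 11],
]
tiles = tile_image(image, 2)
label_map = cluster_tile(tiles[(2, 0)])  # [[0, 1], [0, 1]]
```

The pairwise search is cubic in the number of pixels per tile, which is one reason tiling matters in practice; a library implementation of hierarchical clustering would normally replace `single_linkage` on real imagery.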