||Ochotta, T., Gebhardt, C., Saupe, D., Wergen, W.
||Adaptive thinning of atmospheric observations in data assimilation with vector quantization and filtering methods
||In data assimilation for numerical weather prediction, measurements from various observation systems are combined with background data to define initial states for the forecasts. Current and future observation systems, in particular satellite instruments, produce huge amounts of measurements with high spatial and temporal density. Such data sets significantly increase the computational cost of the assimilation. Moreover, they can violate the assumption of spatially independent observation errors, so that more complex observation error statistics would be needed, which would increase the computational cost further. To ameliorate these problems, we propose two greedy thinning algorithms that reduce the number of assimilated observations while retaining the essential information content of the data. Our approach is inspired by simplification methods from geometry processing in computer graphics and by clustering algorithms in vector quantization. In the first method, we iteratively estimate the redundancy of each observation in the current point set and remove the most redundant one. The degree of redundancy of an observation is defined to be inversely proportional to the interpolation error of its reconstruction, obtained by applying an interpolation filter to a neighborhood from which the observation has been removed. In the second scheme, the number of points in the output set is increased iteratively. These points correspond to centers of clusters of observations. A distance measure that combines spatial distance with the difference in observation values defines an error measure for the overall quality of a clustering. We evaluated the proposed methods with respect to a geometrical error measure and compared them with a uniform sampling scheme. We also evaluated our thinnings of ATOVS satellite data using the assimilation system of the Deutscher Wetterdienst. The impact of the thinning on the analysed fields and on the subsequent forecasts is discussed. 
We obtain good representations of the original data with thinnings retaining only a small portion of observations.
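
The first (removal-based) scheme can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes an inverse-distance-weighted average over the `k` nearest remaining observations as the interpolation filter, planar Euclidean distances, and scalar observation values. The names `idw_estimate`, `greedy_thinning`, `n_keep` and `k` are hypothetical and chosen only for illustration.

```python
import math


def idw_estimate(i, points, values, alive, k=4, eps=1e-12):
    """Reconstruct values[i] from its k nearest remaining neighbours.

    Assumption: inverse-distance weighting stands in for the
    interpolation filter; the abstract does not specify the filter.
    """
    neighbours = sorted(
        (math.dist(points[i], points[j]), j) for j in alive if j != i
    )[:k]
    weights = [1.0 / (d + eps) for d, _ in neighbours]
    weighted_sum = sum(w * values[j] for w, (_, j) in zip(weights, neighbours))
    return weighted_sum / sum(weights)


def greedy_thinning(points, values, n_keep, k=4):
    """Repeatedly drop the observation that is reconstructed best.

    A small interpolation error means high redundancy, so the
    observation with the smallest error is removed in each pass.
    Returns the sorted indices of the retained observations.
    """
    if n_keep < 1:
        raise ValueError("n_keep must be at least 1")
    alive = set(range(len(points)))
    while len(alive) > n_keep:
        errors = {
            i: abs(values[i] - idw_estimate(i, points, values, alive, k))
            for i in alive
        }
        # Ties are broken by index so the result is deterministic.
        most_redundant = min(errors, key=lambda i: (errors[i], i))
        alive.remove(most_redundant)
    return sorted(alive)


if __name__ == "__main__":
    # Six observations on a line: a flat field with one isolated feature.
    pts = [(float(x), 0.0) for x in range(6)]
    vals = [0.0, 0.0, 0.0, 0.0, 5.0, 0.0]
    print(greedy_thinning(pts, vals, n_keep=3))
```

The sketch recomputes every redundancy estimate in each pass, which is quadratic or worse in the number of observations. For realistic data volumes one would presumably use a spatial index and update only the neighbourhoods affected by a removal.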