Och, Hannah; Strutz, Tilo; Kaup, André (2021)
VCIP 2021, Munich, 5-8 December 2021.
DOI: 10.1109/VCIP53242.2021.9675326
Probability distribution modeling is the basis for most competitive methods for lossless coding of screen content. One such state-of-the-art method is known as soft context formation (SCF). For each pixel to be encoded, a probability distribution is estimated based on the neighboring pattern and the occurrence of that pattern in the already encoded image. Using an arithmetic coder, the pixel color can thus be encoded very efficiently, provided that the current color has been observed before in association with a similar pattern. If this is not the case, the color is instead encoded using a color palette or, if it is still unknown, via residual coding. Both palette-based coding and residual coding have significantly worse compression efficiency than coding based on soft context formation. In this paper, the residual coding stage is improved by adaptively trimming the probability distributions for the residual error. Furthermore, an enhanced probability modeling for indicating a new color depending on the occurrence of new colors in the neighborhood is proposed. These modifications result in a bitrate reduction of up to 2.9% on average. Compared to HEVC (HM-16.21 + SCM-8.8) and FLIF, the improved SCF method saves on average about 11% and 18% rate, respectively.
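The pattern-based probability modelling that SCF builds on can be illustrated with a toy sketch. The code below is not the actual SCF algorithm (which uses larger patterns, pattern similarity, and palette/residual fallback stages); it is a heavily simplified illustration of the core idea that counting colours per neighbour pattern lets repeated patterns predict the next colour. All names are illustrative.

```python
from collections import Counter, defaultdict

def scf_sketch(pixels, width):
    """Toy illustration of pattern-based probability modelling.

    For each pixel, the colours observed so far under the same
    (left, top) neighbour pattern form a count table from which an
    arithmetic coder could derive a probability distribution.
    Returns, per pixel, how often its colour was already recorded
    for its pattern (0 = colour would be 'unknown' to the model).
    """
    history = defaultdict(Counter)
    known = []
    for i, colour in enumerate(pixels):
        left = pixels[i - 1] if i % width else None
        top = pixels[i - width] if i >= width else None
        pattern = (left, top)
        known.append(history[pattern][colour])
        history[pattern][colour] += 1
    return known

# three identical rows: the third row is fully predicted,
# because every (left, top) pattern there was seen once before
pixels = [1, 2, 1,
          1, 2, 1,
          1, 2, 1]
known = scf_sketch(pixels, width=3)
```

In this miniature example the first two rows produce only unseen pattern/colour combinations, while every pixel of the third row is predicted from history, which is the situation in which SCF codes most efficiently.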
Reißing, Ralf; Wilde, Mathias; Wige, E.; Abeler, L.; Lindner, P.; Piechaczyk, F. (2021)
Der Nahverkehr - Öffentlicher Personenverkehr in Stadt und Region (11), pp. 49-55.
Strutz, Tilo (2021)
Technical paper, June, 2021, TECH/2021/06, arxiv.org/abs/2106.03503v1.
DOI: 10.48550/arXiv.2106.03503
Distance transformation is an image processing technique used for many different applications. Applied to a binary image, the general idea is to determine the distance of all background points to the nearest object point (or vice versa). In this tutorial, different approaches are explained in detail and compared using examples. Corresponding source code is provided to facilitate the reader's own investigations. A particular objective of this tutorial is to clarify the difference between arbitrary distance transforms and exact Euclidean distance transforms.
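The exact Euclidean distance transform that the tutorial distinguishes from approximate variants can be stated very compactly as a brute-force reference implementation. The sketch below is for illustration only (quadratic complexity, not one of the efficient algorithms the tutorial compares):

```python
import math

def euclidean_dt(image):
    """Brute-force exact Euclidean distance transform.

    For every background pixel (0) the distance to the nearest
    object pixel (1) is returned; object pixels get distance 0.
    """
    h, w = len(image), len(image[0])
    objects = [(r, c) for r in range(h) for c in range(w) if image[r][c] == 1]
    dist = [[0.0] * w for _ in range(h)]
    for r in range(h):
        for c in range(w):
            if image[r][c] == 0:
                dist[r][c] = min(math.hypot(r - orow, c - ocol)
                                 for orow, ocol in objects)
    return dist

img = [
    [0, 0, 0, 0],
    [0, 1, 0, 0],
    [0, 0, 0, 0],
]
dt = euclidean_dt(img)
```

Efficient algorithms reach the same result in linear time per pixel; the point of a brute-force reference is to verify that a fast implementation is indeed *exact* rather than an approximation such as a chamfer distance.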
Strutz, Tilo; Leipnitz, Alexander; Jokisch, Oliver (2020)
5th Int. Conf. on Interactive Collaborative Robotics, ICR.
Advanced object detectors based on Convolutional Neural Networks (CNNs) offer high detection rates for many application scenarios, but only within their respective training, validation and test data. Recent studies show that such methods provide a limited generalization ability for unknown data, even for small image modifications, including a limited scale invariance. Reliable person detection with aerial robots (Unmanned Aerial Vehicles, UAVs) is an essential task for meeting high security requirements or for supporting robot control, communication, and human-robot interaction. Particularly in an agricultural context, persons need to be detected from a long distance and a high altitude to allow the UAV an adequate and timely response. While UAVs are able to produce high-resolution images that enable the detection of persons from a longer distance, typical CNN input-layer sizes are comparably small. The inevitable scaling of images to match the input-layer size can lead to a further reduction in person sizes. We investigate the reliability of different YOLOv3 architectures for person detection with regard to these input-scaling effects. The popular VisDrone data set, with its varying image resolutions and relatively small depiction of humans, is used as well as high-resolution UAV images from an agricultural data set. To overcome the scaling problem, an algorithm is presented for segmenting high-resolution images into overlapping tiles that match the input-layer size. The number and overlap of the tiles are dynamically determined based on the image resolution. It is shown that the detection rate of very small persons in high-resolution images can be improved using this tiling approach.
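The tiling idea described in the abstract can be sketched in a few lines. The paper's exact formula for determining tile count and overlap is not reproduced here; this is one plausible way to derive evenly spread tile offsets for a single image dimension from the image resolution, the input-layer (tile) size and a minimum overlap (all parameter names are illustrative):

```python
import math

def tile_layout(image_size, tile_size, min_overlap=0):
    """Start offsets of overlapping tiles along one image dimension.

    Tiles have fixed size `tile_size`, cover the whole dimension,
    overlap by at least `min_overlap` pixels, and the last tile ends
    exactly at the image border.  Sketch only; the paper's dynamic
    determination of tile number and overlap may differ.
    """
    if tile_size >= image_size:
        return [0]
    step = tile_size - min_overlap
    n = math.ceil((image_size - tile_size) / step) + 1
    # spread the n tiles evenly over the available range
    stride = (image_size - tile_size) / (n - 1)
    return [round(i * stride) for i in range(n)]

# e.g. a 4000-pixel image width, a 608-pixel input layer,
# and at least 100 pixels of overlap between neighbouring tiles
offsets = tile_layout(4000, 608, min_overlap=100)
```

Applying the same function to both image dimensions yields a grid of overlapping crops that can be fed to the detector at native resolution, so small persons are not shrunk further by input scaling; detections from the tiles then have to be merged (e.g. by non-maximum suppression over the overlaps).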
Strutz, Tilo; Möller, Phillip (2020)
IEEE Transactions on Multimedia 22 (5), pp. 1126-1138.
DOI: 10.1109/TMM.2019.2941270
The compression of screen content has attracted the interest of researchers in recent years, as the market for transferring data from computer displays is growing. It has already been shown that especially those methods which are able to predict the probability distribution of next pixel values can effectively compress screen content. This prediction is typically based on a kind of learning process: the predictor learns the relationship between probable pixel colours and the surrounding texture. Recently, an effective method called ‘soft context formation’ (SCF) has been proposed which achieves much lower bitrates than other state-of-the-art compression schemes for images with less than 8 000 colours.
This paper presents an enhanced version of SCF. The average lossless compression performance has increased by about 5% for images with less than 8 000 colours and by about 10% for images with up to 90 000 colours. In comparison to FLIF, FP8v3, and HEVC (HM-16.20 + SCM-8.8), it achieves savings of about 33%, 4%, and 11% on average. The improvements compared to the original version result from various modifications; the largest contribution is achieved by the local estimation of the probability distribution for unpredictable colours in stage II of the compression scheme.