We present a simple lossless compression method intended for thermal images. It scans the image in a raster pattern and uses already-scanned pixels to predict each new pixel's value, storing only the error between the predicted and actual values.
Compared to photographic images, thermal images have two distinctive properties: a higher dynamic range (16 bits) and pixel values that depend only on the temperature variations of self-radiating objects. Ambient temperature variations add to the pixel values rather than multiplying them, as happens in illuminated scenes.
We base our algorithm on the 4-neighbor method and use local context to switch between encoding tables, since the expected prediction error depends only on the differences between the known pixels, regardless of their average value.
This approach allows us to build a 2D histogram over the prediction error and the ``smoothness'' of the known pixels, and to use that histogram to construct the encoding tables.
Table selection depends only on the four neighboring pixel values (which are already available to the decoder) and therefore adds nothing to the compressed stream.
As a result, we were able to losslessly compress thermal images to less than 41\% of their original size.
The resulting files were approximately 34\% smaller than their equivalent PNGs, and 35\% smaller than TIFF files compressed with LZW.
\end{abstract}
\section{Introduction}
...
...
Like a cathode-ray tube in a television, the algorithm goes line by line, reading in pixel values.
Each pixel, provided it is not on the top or side boundaries, has four neighbors that have already been read in. Those neighboring values can be interpolated to predict the pixel's value.
A visual demonstration of this pattern is given in \figurename{\ref{fig:pixels}}.
The goal is to encode the error between the predicted value and the actual value, store it, and use those stored errors to compress and decompress the image.
Even though the error can occasionally require a larger integer than the raw value, the prediction is usually exact or off by only a small margin, so the distribution of errors is far more skewed and therefore better suited for compression.
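To make the scan concrete, the following minimal sketch (not the paper's implementation; the function names and the simplified boundary rule are our assumptions for illustration) computes the per-pixel prediction errors:
\begin{verbatim}
import numpy as np

def error_image(img, predict):
    # img: 2D integer array; predict: maps the four known
    # neighbors (top-left, top, top-right, left) to a guess.
    h, w = img.shape
    err = np.zeros((h, w), dtype=np.int32)
    for y in range(h):
        for x in range(w):
            if y == 0 or x == 0 or x == w - 1:
                # Placeholder boundary rule: store the raw value.
                err[y, x] = img[y, x]
            else:
                guess = predict(img[y-1, x-1], img[y-1, x],
                                img[y-1, x+1], img[y, x-1])
                err[y, x] = int(img[y, x]) - int(round(guess))
    return err
\end{verbatim}
Decompression runs the same raster scan, adding each stored error back to the prediction to recover the exact pixel values.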
The approach of using the neighboring pixels for compression is not new, as evidenced by its use in ISO/IEC14495-1:1999 \cite{ISO/IEC14495-1} and ``CALIC-a context based adaptive lossless image codec''\cite{544819}, which were both written more than 20 years before the publication of this paper.
Our final implementation differs from these methods in several ways that we found beneficial and that others may find useful as well.
\begin{figure}
\centering
\caption{\label{fig:pixels}The four previously read pixels are used to predict the value of the fifth.}
\end{figure}
\subsection{Background}
...
...
There are not many values here and the algorithm needs a place to start.
Alternative approaches could have been taken here, but they would have increased time complexity for marginal gain.
Once the middle points are reached, the pixel to the left, top left, directly above, and top right have already been read into the system.
Each of these values is given a point in the x-y plane, with the top left at $(-1,1)$, the top pixel at $(0,1)$, the top right pixel at $(1,1)$, and the middle left pixel at $(-1,0)$, giving the target pixel the coordinates $(0,0)$.
Using the formula for a plane in 3D ($ax + by + c = z$) we have the system of equations
$$-a + b + c = z_0$$
$$b + c = z_1$$
$$a + b + c = z_2$$
...
...
$$
The new matrix is full rank, and the system can therefore be solved using \verb|numpy.linalg.solve|\cite{Numpy}.
The resulting solution vector contains $a$ and $b$ followed by the original $c$ from the $ax+by+c=z$ form; since the target pixel sits at $(0,0)$, the plane evaluates to $z=c$ there, so $c$ is the predicted pixel value.
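As a concrete sketch of this step (the derivation above is abbreviated, so we assume the full-rank system is the normal-equations form $A^{T}A\,x = A^{T}z$ built from the four neighbor coordinates; \verb|predict| is a hypothetical name):
\begin{verbatim}
import numpy as np

# Rows are the neighbor coordinates in (x, y, 1) form:
# top-left (-1,1), top (0,1), top-right (1,1), left (-1,0).
A = np.array([[-1, 1, 1],
              [ 0, 1, 1],
              [ 1, 1, 1],
              [-1, 0, 1]])

def predict(z0, z1, z2, z3):
    # Least-squares fit of a*x + b*y + c = z through the four
    # neighbor values; at the target (0, 0) the plane equals c.
    z = np.array([z0, z1, z2, z3])
    a, b, c = np.linalg.solve(A.T @ A, A.T @ z)
    return c
\end{verbatim}
Solving the original $4 \times 3$ system with \verb|numpy.linalg.lstsq| would yield the same $c$, since the least-squares solution satisfies the normal equations.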
Huffman encoding performs well on data with a skewed frequency distribution \cite{Huffman}, making it a good candidate for storing the error values.
Figures \ref{fig:Uniform} and \ref{fig:Normal} illustrate why saving the error values is better than saving the actual pixel values.
Most pixels will deviate from their predicted values by only small amounts, since many objects have a nearly uniform surface temperature or a nearly uniform temperature gradient.
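For illustration, a Huffman table over the error values can be built with Python's standard \verb|heapq| as in the following sketch (\verb|huffman_code| is a hypothetical name, not the paper's implementation):
\begin{verbatim}
import heapq
from collections import Counter

def huffman_code(errors):
    # Map each error value to a bitstring; frequent values
    # (small errors) end up with the shortest codes.
    freq = Counter(errors)
    if len(freq) == 1:
        return {next(iter(freq)): "0"}
    # Heap entries: (count, tiebreaker, {symbol: code so far}).
    heap = [(n, i, {s: ""}) for i, (s, n) in enumerate(freq.items())]
    heapq.heapify(heap)
    tie = len(heap)
    while len(heap) > 1:
        n1, _, c1 = heapq.heappop(heap)
        n2, _, c2 = heapq.heappop(heap)
        merged = {s: "0" + code for s, code in c1.items()}
        merged.update({s: "1" + code for s, code in c2.items()})
        heapq.heappush(heap, (n1 + n2, tie, merged))
        tie += 1
    return heap[0][2]
\end{verbatim}
With a sharply peaked error distribution, the common values near zero receive codes only a few bits long, which is where the compression gain comes from.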