\title{Lossless Single Pass Image Compression by Interpolation with Efficient Error Handling}
\title{Lossless Single Pass Image Compression with Efficient Error Handling for Thermal Images}
\author{Andrey Filippov \qquad Nathaniel Callens Jr. \\
Kelly Chang \qquad Bryce Hepner \qquad Nikolai Masnev \\
...
...
@@ -104,7 +104,7 @@ The resulting files were approximately 34\% smaller than their equivalent PNGs,
\section{Introduction}
\subsection{Overview}
The base system is not new, but it will be explained here in order to keep consistant definitions and in case any reader is not familiar with the method.
The base system is not new, but it will be explained here in order to keep consistent definitions and in case any reader is not familiar with the method.
The idea is based on how images are scanned in originally.
Like a cathode-ray tube in a television, the algorithm goes line by line, reading/writing each pixel individually in a raster pattern.
...
...
@@ -181,77 +181,31 @@ The closest method is ``Near-lossless image compression by relaxation-labelled p
The algorithm detailed in the paper uses a clustering algorithm of the nearby points to create the interpolation, saving the errors to be used later in the reconstruction of the original image.
This method is much more complex, not using a direct interpolation method but instead using a clustering algorithm to find the next point.
This could potentially have an advantage over what we did by using more points in the process, but in proper implementation it may become too complicated and lose value.
The goal for us was to have a simple and efficient encoding operation, and this would have too many errors to process.
This could potentially have an advantage over what we did by using more points in the process, but in proper implementation it would become too complicated for our purposes.
The goal for us was to have a simple and efficient encoding operation, and this would have too much to process.
It also has a binning system like ours, with theirs based on the mean square prediction error.
The problem is that the bin a pixel is assigned to can shift during the classification process, which adds to the complexity of the algorithm.
%The use of more points could have been implemented into ours too but we chose not to due to the potential additional temporal complexity.
\section{The Approach}
To begin, the border values are encoded into the system, starting with the first value.
The values after that are just modifications from the first value.
There are not many values here and the algorithm needs a place to start.
Alternative schemes could have been used for the border, but they would have raised temporal complexity with marginal gain.
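The border step described above can be sketched as follows. This is a minimal illustration, not the paper's implementation: the function names \verb|encode_border| and \verb|decode_border| are hypothetical, and the border is assumed to be available as a flat sequence of integer pixel values.

```python
import numpy as np

def encode_border(border):
    """Keep the first border value as-is and store the rest as offsets from it."""
    border = np.asarray(border, dtype=np.int64)
    first = int(border[0])
    offsets = border[1:] - first  # differences to the first value
    return first, offsets

def decode_border(first, offsets):
    """Rebuild the border from the first value and the stored offsets."""
    offsets = np.asarray(offsets, dtype=np.int64)
    return np.concatenate(([first], first + offsets))

# Example with made-up raw values.
first, offsets = encode_border([22000, 22010, 21995, 22003])
restored = decode_border(first, offsets)
```

The offsets are small whenever the border pixels are close to the first value, which is what makes them cheap to store.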
%Once the middle points are reached, the pixel to the left, top left, directly above, and top right have already been read into the system.
%Each of these values is given a point in the x-y plane, with the top left at (-1,1), top pixel at (0,1), top right pixel at (1,1), and the middle left pixel at (-1,0), giving the target the coordinates (0,0).
Using the formula for a plane in 3D ($ax + by + c = z$) we have the system of equations
$$-a + b + c = z_0$$
$$b + c = z_1$$
$$a + b + c = z_2$$
$$-a + c = z_3$$
which can be written in the form $Ax = b$ with
$$A =
\begin{bmatrix}
-1&1&1\\
0&1&1\\
1&1&1\\
-1&0&1
\end{bmatrix}
$$
$$b =
\begin{bmatrix}
z_0\\
z_1\\
z_2\\
z_3
\end{bmatrix}
$$
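With 4 equations and only 3 unknowns ($a$, $b$, and $c$), this system is overdetermined and in general has no exact solution.
This can be corrected by replacing it with the least-squares normal equations (the original $A^{T}A$ and $A^{T}b$, with the last row negated), making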
$$A =
\begin{bmatrix}
3&0&-1\\
0&3&3\\
1&-3&-4
\end{bmatrix}
$$
and
$$b =
\begin{bmatrix}
-z_0+ z_2- z_3\\
z_0+ z_1+ z_2\\
-z_0- z_1- z_2- z_3
\end{bmatrix}.
$$
The new matrix is full rank and can therefore be solved using \verb|numpy.linalg.solve|~\cite{Numpy}.
The resulting $x$ holds $a$ and $b$ followed by the original $c$ from the $ax+by+c=z$ form; since the target pixel sits at the origin of the coordinates used above, $c$ is the predicted pixel value.
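As a rough sketch of this step, the reduced system can be solved directly with \verb|numpy.linalg.solve|. The helper name \verb|predict_pixel| and the example values are assumptions made for illustration; only the matrix and right-hand side come from the text.

```python
import numpy as np

# Reduced 3x3 matrix from the text (normal equations with the last row negated).
A_REDUCED = np.array([[3.0, 0.0, -1.0],
                      [0.0, 3.0, 3.0],
                      [1.0, -3.0, -4.0]])

def predict_pixel(z0, z1, z2, z3):
    """Return c from the plane ax + by + c = z, i.e. the value predicted at (0, 0).

    z0, z1, z2 are the top-left, top, and top-right neighbours and
    z3 is the left neighbour, matching the equations in the text.
    """
    rhs = np.array([-z0 + z2 - z3,
                    z0 + z1 + z2,
                    -z0 - z1 - z2 - z3], dtype=float)
    a, b, c = np.linalg.solve(A_REDUCED, rhs)
    return c

# Four neighbours that lie exactly on the plane z = 2x + 3y + 10.
prediction = predict_pixel(11, 13, 15, 8)
```

Because the matrix is the same for every pixel, a faster variant could invert it once and reuse the result, though that is a design choice outside what the text describes.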
Once that is saved, the rest of the values are just saved as the difference to the first.
This is not the most technical approach, but it reduces complexity, leaving room for the body of the system.
Huffman encoding performs well on data with varying frequency~\cite{Huffman}, making it a good candidate for saving the error numbers.
Figures \ref{fig:Uniform} and \ref{fig:Normal} give a representation of why saving the error numbers is better than saving the actual values.
Most pixels will be off the predicted values by low numbers since many objects have close to uniform surface temperature or have an almost uniform temperature gradient.
This is compounded by the additive nature of thermal images: since temperature values can range greatly, a system is needed that handles that range efficiently.
Most pixels will be off from the predicted values by low numbers since many objects have close to uniform surface temperature or have an almost uniform temperature gradient.
Planar interpolation between the 4 known points is done in order to predict the next pixel value.
Because this is an overdetermined system, the least-squares solve outputs not only the predicted pixel value but also the sum of squared residuals (the squared Euclidean norm of $b - Ax$)~\cite{Numpy}.
Other than in the title, we use the term ``error'' to describe the difference between the predicted pixel value and the actual value, and the term ``difference'' to describe the square root of the residuals.
This difference number is a valuable predictor, not of the original pixel value, but of the error that will be produced.
It is not good enough to predict the error outright, as there is too much noise and not enough direct correlation, but it can be used to create several different encoding tables that can aid in compression.
Another approach was also tested: using the difference between the maximum pixel value and the minimum.
This had similar results, but was not used in the final process since the residuals were already calculated automatically, while the min and max differencing would be an additional step that further complicates the process.
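The prediction and the difference number described above can be sketched with \verb|numpy.linalg.lstsq|, which returns the sum of squared residuals alongside the solution. The function names here are hypothetical. The \verb|range_difference| helper assumes the max-minus-min alternative is taken over the same 4 known pixels, which the text does not state explicitly.

```python
import numpy as np

# Rows are (x, y, 1) for the top-left, top, top-right, and left neighbours.
A = np.array([[-1.0, 1.0, 1.0],
              [0.0, 1.0, 1.0],
              [1.0, 1.0, 1.0],
              [-1.0, 0.0, 1.0]])

def predict_with_difference(z):
    """Fit a plane to the 4 known pixels; return (prediction, difference)."""
    z = np.asarray(z, dtype=float)
    coeffs, residuals, rank, _ = np.linalg.lstsq(A, z, rcond=None)
    prediction = coeffs[2]  # c is the plane's value at the target (0, 0)
    # residuals holds the squared Euclidean norm of b - Ax for a full-rank fit.
    difference = float(np.sqrt(residuals[0])) if residuals.size else 0.0
    return prediction, difference

def range_difference(z):
    """Alternative measure (assumed form): max minus min of the known pixels."""
    z = np.asarray(z, dtype=float)
    return float(z.max() - z.min())

prediction, difference = predict_with_difference([11, 14, 15, 8])
actual = 12                  # made-up pixel value for illustration
error = actual - prediction  # the "error" in the sense defined above
```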
\begin{figure}[h]
\centering
...
...
@@ -267,15 +221,25 @@ Most pixels will be off the predicted values by low numbers since many objects h
\end{figure}
In order to adjust for objects in images that are known to have an unpredictable temperature (fail the cases before), a bin system is used.
The residuals from \verb|numpy.linalg.lstsq|~\cite{Numpy} are used to determine the difference across the 4 known points, and this difference is then used to place the pixel in a category.
This number measures how far the 4 points are from the plane fitted through them.
In order to adjust for objects in images that are known to have an unpredictable temperature (have high difference values), a bin system is used.
If a plane is able to be drawn that contains all 4 points, it makes sense that the error will be much smaller than if the best-fitted plane was not very close to any of the points.
Something more certain is more likely to be correctly estimated.
Five bins were used, with splits chosen so that the difference numbers are distributed evenly across the bins.
The resulting bin boundaries varied between images, with the upper edge of the first category ranging from a difference of 11 to a difference of 30.
An average number between all of them was chosen since using the average for bin sizes versus specific bin sizes had an effect on compression of less than half a percent.
An average number between all of them was chosen in order to save space.
This makes the system much better adapted to larger ranges of error values, such as looking at grass or another high frequency surface.
The system performs better than a standard system without bins on this data since it is able to optimize better for these larger values.
As shown in Figure~\ref{fig:2DHist}, the average error gets worse as difference increases.
By focusing the root of the Huffman tree on the shift in error values, it is possible to get better compression.
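One way the bin system could be sketched is shown below. It is a sketch under stated assumptions, not the paper's code: the split points are taken as quantiles of the difference values (one reading of distributing them evenly), the Huffman construction is a textbook version, and all function names are hypothetical.

```python
import heapq
from collections import Counter
import numpy as np

def bin_edges_equal_count(differences, n_bins=5):
    """Interior split points so each bin receives about the same number of samples."""
    quantiles = np.linspace(0.0, 1.0, n_bins + 1)[1:-1]
    return np.quantile(differences, quantiles)

def huffman_code(symbols):
    """Return {symbol: bitstring} built from the symbol frequencies."""
    freq = Counter(symbols)
    if len(freq) == 1:
        return {next(iter(freq)): "0"}
    # Heap entries: (weight, unique tiebreak, {symbol: code so far}).
    heap = [(w, i, {s: ""}) for i, (s, w) in enumerate(freq.items())]
    heapq.heapify(heap)
    tiebreak = len(heap)
    while len(heap) > 1:
        w1, _, c1 = heapq.heappop(heap)
        w2, _, c2 = heapq.heappop(heap)
        merged = {s: "0" + code for s, code in c1.items()}
        merged.update({s: "1" + code for s, code in c2.items()})
        heapq.heappush(heap, (w1 + w2, tiebreak, merged))
        tiebreak += 1
    return heap[0][2]

def build_tables(errors, differences, n_bins=5):
    """Assign each error to a bin by its difference and build one table per bin."""
    errors = np.asarray(errors)
    differences = np.asarray(differences, dtype=float)
    edges = bin_edges_equal_count(differences, n_bins)
    bin_index = np.digitize(differences, edges)  # values in 0 .. n_bins - 1
    tables = {}
    for b in range(n_bins):
        members = errors[bin_index == b].tolist()
        if members:  # skip empty bins
            tables[b] = huffman_code(members)
    return edges, bin_index, tables
```

Each pixel's error would then be written with the table of the bin selected by its difference, which the decoder can recompute from pixels it has already reconstructed.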