Commit 4b2dff06 authored by Bryce Hepner

typo changes

parent 758a087d
Pipeline #2587 passed with stage in 7 seconds
......@@ -20,7 +20,7 @@
\citation{PNGdetails}
\citation{PNGdetails}
\@writefile{toc}{\contentsline {section}{\numberline {1}Introduction}{1}{section.1}\protected@file@percent }
\@writefile{toc}{\contentsline {subsection}{\numberline {1.1}Technical Overview}{1}{subsection.1.1}\protected@file@percent }
\@writefile{toc}{\contentsline {subsection}{\numberline {1.1}Overview}{1}{subsection.1.1}\protected@file@percent }
\@writefile{toc}{\contentsline {subsection}{\numberline {1.2}Background}{1}{subsection.1.2}\protected@file@percent }
\@writefile{lof}{\contentsline {figure}{\numberline {1}{\ignorespaces The other 4 pixels are used to find the value of the 5th.\relax }}{1}{figure.caption.1}\protected@file@percent }
\providecommand*\caption@xref[2]{\@setref\relax\@undefined{#1}}
......@@ -48,11 +48,11 @@
\citation{Numpy}
\@writefile{brf}{\backcite{Numpy}{{3}{3}{section.3}}}
\@writefile{brf}{\backcite{Huffman}{{3}{3}{section.3}}}
\@writefile{brf}{\backcite{Numpy}{{3}{3}{figure.caption.3}}}
\@writefile{lof}{\contentsline {figure}{\numberline {2}{\ignorespaces Encoding the Pixel Values\relax }}{3}{figure.caption.2}\protected@file@percent }
\newlabel{fig:sub1}{{2}{3}{Encoding the Pixel Values\relax }{figure.caption.2}{}}
\@writefile{lof}{\contentsline {figure}{\numberline {3}{\ignorespaces Encoding the Error Values\relax }}{3}{figure.caption.3}\protected@file@percent }
\newlabel{fig:sub2}{{3}{3}{Encoding the Error Values\relax }{figure.caption.3}{}}
\@writefile{brf}{\backcite{Numpy}{{3}{3}{figure.caption.3}}}
\@writefile{toc}{\contentsline {section}{\numberline {4}Results}{3}{section.4}\protected@file@percent }
\citation{LAPACKAlgorithms}
\citation{LeastSquaredProblem}
......
This is pdfTeX, Version 3.14159265-2.6-1.40.20 (TeX Live 2019/Debian) (preloaded format=pdflatex 2020.7.20) 28 JUN 2022 16:05
This is pdfTeX, Version 3.14159265-2.6-1.40.20 (TeX Live 2019/Debian) (preloaded format=pdflatex 2020.7.20) 11 JUL 2022 11:52
entering extended mode
restricted \write18 enabled.
%&-line parsing enabled.
......@@ -421,16 +421,13 @@ File: Uniform_No_Title.png Graphic file (type png)
<use Uniform_No_Title.png>
Package pdftex.def Info: Uniform_No_Title.png used on input line 239.
(pdftex.def) Requested size: 237.13594pt x 177.8515pt.
LaTeX Warning: `h' float specifier changed to `ht'.
<Normal_No_Title.png, id=85, 462.528pt x 346.896pt>
File: Normal_No_Title.png Graphic file (type png)
<use Normal_No_Title.png>
Package pdftex.def Info: Normal_No_Title.png used on input line 245.
(pdftex.def) Requested size: 237.13594pt x 177.8515pt.
LaTeX Warning: `h' float specifier changed to `ht'.
[3 <./Uniform_No_Title.png> <./Normal_No_Title.png>]
......@@ -465,19 +462,19 @@ Underfull \hbox (badness 7362) in paragraph at lines 26--26
[]
)
Package atveryend Info: Empty hook `BeforeClearDocument' on input line 315.
Package atveryend Info: Empty hook `BeforeClearDocument' on input line 316.
[4]
Package atveryend Info: Empty hook `AfterLastShipout' on input line 315.
Package atveryend Info: Empty hook `AfterLastShipout' on input line 316.
(./main.aux)
Package atveryend Info: Executing hook `AtVeryEndDocument' on input line 315.
Package atveryend Info: Executing hook `AtVeryEndDocument' on input line 316.
\snap@out=\write5
\openout5 = `main.dep'.
Dependency list written on main.dep.
Package atveryend Info: Executing hook `AtEndAfterFileList' on input line 315.
Package atveryend Info: Executing hook `AtEndAfterFileList' on input line 316.
Package rerunfilecheck Info: File `main.out' has not changed.
(rerunfilecheck) Checksum: 90A24BEB086706678095977998C56209;523.
(rerunfilecheck) Checksum: 32E97EDE93C04899CE7128EA0CB0D790;513.
Package rerunfilecheck Info: File `main.brf' has not changed.
(rerunfilecheck) Checksum: BB047529470216DFDC4D0933E0F06F40;613.
......@@ -506,7 +503,7 @@ y9.pfb></usr/share/texlive/texmf-dist/fonts/type1/public/amsfonts/cm/cmti10.pfb
></usr/share/texlive/texmf-dist/fonts/type1/public/amsfonts/cm/cmti9.pfb></usr/
share/texlive/texmf-dist/fonts/type1/public/amsfonts/cm/cmtt10.pfb></usr/share/
texlive/texmf-dist/fonts/type1/public/amsfonts/cm/cmtt9.pfb>
Output written on main.pdf (4 pages, 248231 bytes).
Output written on main.pdf (4 pages, 247963 bytes).
PDF statistics:
180 PDF objects out of 1000 (max. 8388607)
152 compressed objects within 2 object streams
......
\BOOKMARK [1][-]{section.1}{Introduction}{}% 1
\BOOKMARK [2][-]{subsection.1.1}{Technical Overview}{section.1}% 2
\BOOKMARK [2][-]{subsection.1.1}{Overview}{section.1}% 2
\BOOKMARK [2][-]{subsection.1.2}{Background}{section.1}% 3
\BOOKMARK [1][-]{section.2}{Related Work}{}% 4
\BOOKMARK [2][-]{subsection.2.1}{PNG}{section.2}% 5
......
......@@ -94,17 +94,17 @@ Elphel, Inc.\\
% The method described in this paper is a simple method that has intended use with thermal images.
This method operates by scanning through each pixel in a raster pattern, using already scanned pixels to decompress the next pixel's value.
By saving the error between the predicted pixel value and the actual value, we were able to losslessly compress thermal images to be less than 41\% of their original size.
The resulting files were approximately 34\% smaller than their equivalent PNGs, and 35\% smaller than LZW compression with TIFF files.
The resulting files were approximately 34\% smaller than their equivalent PNGs, and 35\% smaller than TIFF files compressed with LZW.
\end{abstract}
\section{Introduction}
\subsection{Technical Overview}
\subsection{Overview}
The idea is based on how images are originally scanned in.
Like a cathode-ray tube in a television, the algorithm goes line by line, reading/writing each pixel individually in a raster pattern.
Each pixel, as long as it is not on the top or side boundaries, will have 4 neighbors that have already been read into the machine.
Those points can be analyzed and interpolated to find the next pixel's value.
The goal is to encode the error between that value and the original value, save that, and use that to compress and decompress the image.
Even though a larger integer may occasionally need to be stored, the guess is more likely to be correct or off by only a small margin, which produces a distribution that compresses better.
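
As a rough sketch of this scanning idea (the simple four-neighbor average below is only a hypothetical stand-in for the plane interpolation described in the approach section, not the actual predictor):
\begin{verbatim}
import numpy as np

def raster_errors(img):
    """Scan in raster order and record, for each interior pixel, the
    error between a neighbor-based guess and the pixel's true value."""
    img = img.astype(np.int64)
    errors = np.zeros_like(img)
    rows, cols = img.shape
    for r in range(1, rows):
        for c in range(1, cols - 1):
            # left, top-left, top, and top-right were already scanned;
            # a plain average stands in for the plane interpolation here
            guess = (img[r, c - 1] + img[r - 1, c - 1]
                     + img[r - 1, c] + img[r - 1, c + 1]) // 4
            errors[r, c] = img[r, c] - guess
    return errors
\end{verbatim}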
......@@ -118,7 +118,7 @@ The images that were used in the development of this paper were all thermal imag
In the system, total possible values can range from 0 to 32,768.
Most images had ranges of at most 4,096 between the smallest and the largest pixel values.
The camera being used has 16 forward-facing thermal sensors, creating 16 similar thermal images every frame.
Everything detailed here can still apply to standard grayscale or RGB images, but for testing, only 16 bit thermal images were used.
Everything detailed here can still apply to standard grayscale or RGB images, but only 16 bit thermal images were used in testing.
\section{Related Work}
......@@ -147,42 +147,42 @@ Ours, similarly to PNG, only looks at a short portion of the data, which may hav
Images generally do not have the same patterns that text does, so it may be advantageous to not use the entire corpus in compressing an image and instead only evaluate it based on nearby objects.
The blue parts of the sky will be next to other blue parts of the sky, and in the realm of thermal images, temperatures will probably be most similar to nearby ones due to how heat flows.
\subsection{Similar Methods}
Our research did not find any very similar approaches, especially with 16-bit thermal images.
There are many papers however that may have influenced ours indirectly or come close to ours and need to be mentioned for both their similarities and differences.
Our prior searches did not find any very similar approaches, especially with 16-bit thermal images.
There are, however, many papers that may have influenced ours indirectly or are similar to ours, and they need to be mentioned for both their similarities and differences.
One paper that is close is ``Encoding-interleaved hierarchical interpolation for lossless image compression'' \cite{ABRARDO1997321}.
This method seems to operate with a similar end goal, to save the interpolation, but operates on a different system, including how it interpolates.
This method seems to operate with a similar end goal, to save the interpolation, but operates using a different system, including how it interpolates.
Instead of using neighboring pixels in a raster format, it uses vertical and horizontal ribbons, and a different way of interpolating.
The ribbons alternate, going between a row that is just saved and one that is not saved but is later interpolated.
The ribbons alternate, going between a row that is directly saved and one that is not saved but is later interpolated.
In this way it is filling in the gaps of an already robust image and saving the finer details.
This other method could possibly show an increase in speed but not likely in overall compression.
This will not have the same benefit as ours since ours uses interpolation on almost the entire image, instead of just parts, optimizing over a larger amount of data.
This will not have the same benefit as ours since ours uses interpolation on almost the entire image, instead of just parts, helping it optimize over a larger amount of data.
This paper is also similar to ``Iterative polynomial interpolation and data compression'' \cite{Dahlen1993}, where the researchers took a similar approach but with different shapes.
The error numbers were still saved, but they specifically used polynomial interpolation, which we did not see fit to use in ours.
The closest method is ``Near-lossless image compression by relaxation-labelled prediction'' \cite{AIAZZI20021619} which has similarity with the general principles of the interpolation and encoding.
The algorithm detailed in the paper uses a clustering algorithm of the nearby points to create the interpolation, saving the errors in order to retrieve the original later.
The closest method is ``Near-lossless image compression by relaxation-labelled prediction'' \cite{AIAZZI20021619} which is similar in the general principles of the interpolation and encoding.
The algorithm detailed in the paper uses a clustering algorithm on the nearby points to create the interpolation, saving the errors to be used later in the reconstruction of the original image.
This method is much more complex, not using a direct interpolation method but instead using a clustering algorithm to find the next point.
This could potentially have an advantage by using more points in the process, but the implementation becomes too complicated and may lose value.
The goal for us was to have a simple and efficient encoding operation, and this would have too many things to process.
This could potentially have an advantage over what we did by using more points in the process, but in implementation it may become too complicated and lose value.
The goal for us was to have a simple and efficient encoding operation, and this would have too many errors to process.
It also has a binning system like ours, with theirs based on the mean square prediction error.
The problem is that the bin a value falls into can shift over the course of the classification process, adding to the complexity of the algorithm.
The use of more points could have been implemented into ours too but we chose not to due to the potential additional temporal complexity.
%The use of more points could have been implemented into ours too but we chose not to due to the potential additional temporal complexity.
\section{The Approach}
To begin, the border values are encoded into the system starting with the first value.
To begin, the border values are encoded into the system, starting with the first value.
The values after that are just modifications of the first value.
There are not many values here, and the algorithm needs a place to start.
Alternate things could have been done but they would have raised temporal complexity with marginal gain.
Alternative approaches could have been used, but they would have raised temporal complexity with marginal gain.
Once the middle points are reached, the pixel to the left, top left, directly above, and top right have already been read in.
Each of these values is given a point in the x-y plane, with the top left at (-1,1), top pixel at (0,1), top right pixel at (1,1), and the middle left pixel at (-1,0), giving the target (0,0).
Each of these values is given a point in the x-y plane, with the top left at (-1,1), top pixel at (0,1), top right pixel at (1,1), and the middle left pixel at (-1,0), giving the target the coordinates (0,0).
Using the formula for a plane in 3D ($ax + by + c = z$) we get the system of equations
$$-a + b + c = z_0$$
$$b + c = z_1$$
$$a + b + c = z_2$$
$$-a + c = z_3$$.
These complete the form $Ax = b$ as
These equations take the form $Ax = b$ with
$$A =
\begin{bmatrix}
-1 & 1 & 1\\
......@@ -232,7 +232,7 @@ The new matrix is full rank and can therefore be solved using \verb|numpy.linalg
The $x$ that results contains the values of $a$ and $b$ followed by the original $c$ from the $ax+by+c=z$ form; at the target point $(0,0)$ the plane reduces to $c$, which is therefore the predicted pixel value.
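
A minimal sketch of this prediction step with \verb|numpy.linalg.lstsq|, assuming the four neighbor values are supplied in the order top left, top, top right, left (an illustration, not the project's exact code):
\begin{verbatim}
import numpy as np

# Neighbor coordinates: top left (-1,1), top (0,1), top right (1,1), left (-1,0)
A = np.array([[-1, 1, 1],
              [ 0, 1, 1],
              [ 1, 1, 1],
              [-1, 0, 1]], dtype=float)

def predict_pixel(z):
    """Least-squares fit of the plane a*x + b*y + c = z to the four
    neighbor values z = (z0, z1, z2, z3); c is the plane's value at (0,0)."""
    x, residual, rank, _ = np.linalg.lstsq(A, np.asarray(z, dtype=float),
                                           rcond=None)
    a, b, c = x
    return c, residual
\end{verbatim}
The returned residual is the same quantity used later to place the pixel into a bin.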
Huffman encoding performs well on data with varying frequency \cite{Huffman}, which makes it a good candidate for saving the error numbers.
Most pixels will be off by low numbers since many objects have close to uniform surface temperature or have an almost uniform temperature gradient.
Most pixels will be off from the predicted values by only small amounts, since many objects have a close to uniform surface temperature or an almost uniform temperature gradient.
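
As a small illustration of why this pairs well with Huffman coding, here is a standard textbook construction of a code table from the error frequencies (not the project's implementation):
\begin{verbatim}
import heapq
from collections import Counter

def huffman_code(errors):
    """Build a Huffman code table (symbol -> bitstring) from error values."""
    freq = Counter(errors)
    heap = [(f, i, {sym: ""}) for i, (sym, f) in enumerate(freq.items())]
    heapq.heapify(heap)
    if len(heap) == 1:
        (_, _, only), = heap
        return {sym: "0" for sym in only}
    tie = len(heap)
    while len(heap) > 1:
        f1, _, c1 = heapq.heappop(heap)
        f2, _, c2 = heapq.heappop(heap)
        merged = {s: "0" + code for s, code in c1.items()}
        merged.update({s: "1" + code for s, code in c2.items()})
        heapq.heappush(heap, (f1 + f2, tie, merged))
        tie += 1
    return heap[0][2]
\end{verbatim}
Frequent small errors receive short codes, while rare large errors receive long ones.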
\begin{figure}[h]
\centering
......@@ -252,7 +252,7 @@ In order to adjust for objects in images that are known to have an unpredictable
The residuals from \verb|numpy.linalg.lstsq| \cite{Numpy} are used to determine the difference across the 4 known points, which is then used to place the pixel in a category.
This number measures how far the 4 points are from lying on a single plane.
If a plane can be drawn that contains all 4 points, it makes sense that the error will be much smaller than if the best-fitted plane were not very close to any of the points.
Something more certain in this case is likely to be more correct.
Something more certain is more likely to be correctly estimated.
Five bins were used, with splits chosen by distributing the difference values into evenly sized bins.
The bin boundaries varied between images, with the cutoff for the first category ranging from a difference of 11 to a difference of 30.
An average of these values was chosen, since using the average rather than image-specific bins changed compression by less than half a percent.
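
A sketch of how this residual-based binning could look (the edge values below are placeholders, not the splits used in the paper):
\begin{verbatim}
import numpy as np

# Placeholder bin edges on the plane-fit residual; the actual splits were
# chosen from the data so the residuals fall into evenly sized bins.
BIN_EDGES = np.array([4.0, 16.0, 64.0, 256.0])

def choose_bin(residual):
    """Assign a least-squares residual to one of the 5 context bins."""
    return int(np.searchsorted(BIN_EDGES, residual))
\end{verbatim}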
......@@ -261,7 +261,7 @@ An average number between all of them was chosen, since using the average versus
We attained an average compression ratio of $0.4057$ on a set of 262 images, with compression ratios ranging from $0.3685$ to $0.4979$.
Because the system as it stands runs off of a saved dictionary, it is better to think of the system as a cross between an individual compression system and a larger archival tool.
Because the system runs off of a saved dictionary, it is better to think of it as a cross between an individual compression scheme and a larger archival tool.
This means that there are large changes in compression ratios depending on how many files are compressed at a time, despite the ability to decompress files individually.
When the size of the saved dictionary was included, the compression ratio on the entire set only changed from $0.4043$ to $0.4057$. However, when tested on just the first image in the set, it went from $0.3981$ to $0.7508$.
......@@ -287,9 +287,10 @@ Our method created files that are on average 33.7\% smaller than PNG and 34.5\%
\section{Discussion}
The files produced through this method are much smaller than the others, but this comes at great computational costs.
On the local machine, PNG compression was several orders of magnitude faster than the method used in this project.
Using a compiled language instead of python will increase the speed substantially, but there are other improvements that can be made.
Using a compiled language instead of Python will increase the speed, but there are other improvements that can be made.
The issue with \verb|numpy.linalg.solve| was later addressed to fix the potential slowdown, but calculating the inverse beforehand and using that in the system had marginal temporal benefit.
The issue with \verb|numpy.linalg.solve| was later addressed to fix the potential slowdown.
Calculating the inverse beforehand and using that in the system had marginal temporal benefit.
\verb|numpy.linalg.solve| runs in $O(N^3)$ for an $N\times N$ matrix, while the multiplication runs in a similar time \cite{LAPACKAlgorithms}.
The least squares method mentioned in this project also has a shortcoming, but this one cannot be solved as easily.
The pseudoinverse can be calculated beforehand, but the largest problem is that the system is solved, and the norm calculated, for every pixel individually.
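
As hinted above, the pseudoinverse of the fixed $4\times 3$ matrix could be cached once and reused for every pixel; a sketch under that assumption (the per-pixel residual, i.e.\ the norm, still has to be computed, which is the remaining cost):
\begin{verbatim}
import numpy as np

A = np.array([[-1, 1, 1],
              [ 0, 1, 1],
              [ 1, 1, 1],
              [-1, 0, 1]], dtype=float)
A_PINV = np.linalg.pinv(A)   # 3 x 4, computed once for the whole image

def predict_fast(z):
    """Same least-squares solution as lstsq(A, z), via the cached
    pseudoinverse."""
    z = np.asarray(z, dtype=float)
    a, b, c = A_PINV @ z
    residual = float(np.sum((A @ np.array([a, b, c]) - z) ** 2))
    return c, residual
\end{verbatim}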
......@@ -303,7 +304,7 @@ It was therefore not seen necessary to create a different system to compress ind
A potential workaround for this problem would be to code extraneous values into the image directly instead of adding them to the full dictionary.
This has the downside of not being able to integrate perfectly with Huffman encoding.
A leaf of the tree would have to be a trigger to not use Huffman encoding anymore and use an alternate system to read in the bits.
A leaf of the tree could be a trigger to not use Huffman encoding anymore and use an alternate system to read in the bits.
We did not do this, but it would be a simple change for someone with a different use case.
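
A sketch of what that escape leaf could look like on the decoding side, reusing the symbol-to-bitstring table format from the earlier sketch (the \verb|ESCAPE| marker and the raw 16-bit fallback are assumptions for illustration, not part of the implemented system):
\begin{verbatim}
ESCAPE = "ESC"  # hypothetical reserved symbol acting as the escape leaf

def decode_errors(bitstring, code_table):
    """Decode a '0'/'1' string with a Huffman table (symbol -> bitstring);
    the ESCAPE symbol switches to reading a raw 16-bit value instead."""
    decode_table = {code: sym for sym, code in code_table.items()}
    values, buf, i = [], "", 0
    while i < len(bitstring):
        buf += bitstring[i]
        i += 1
        if buf in decode_table:
            sym = decode_table[buf]
            buf = ""
            if sym == ESCAPE:
                values.append(int(bitstring[i:i + 16], 2))  # raw value
                i += 16
            else:
                values.append(sym)
    return values
\end{verbatim}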
{\small
......