Elphel / image-compression
Commit 328d8ed8, authored Mar 08, 2022 by Nathaniel Callens
Delete arithmetic.py
Parent: 1991ccfc
Showing 1 changed file with 0 additions and 167 deletions
arithmetic.py (deleted, 100644 → 0, as of 1991ccfc)

# Example implementation of simple arithmetic coding in Python (2.7+).
#
# USAGE
#
# python -i arithmetic.py
# >>> m = {'a': 1, 'b': 1, 'c': 1}
# >>> model = dirichlet(m)
# >>> encode(model, "aabbaacc")
# '00011110011110010'
#
# NOTES
#
# This implementation has many shortcomings, e.g.,
# - There are several inefficient tests, loops, and conversions
# - There are a few places where code is unnecessarily duplicated
# - It does not output the coded message as a stream
# - It can only code short messages due to machine precision
# - There is no defensive coding against errors (e.g., out-of-model symbols)
# - I've not implemented a decoder!
#
# The aim was to make the implementation here as close as possible to
# the algorithm described in lectures while giving some extra detail about
# routines such as finding extensions to binary intervals.
#
# For a more sophisticated implementation, please refer to:
#
# "Arithmetic Coding for Data Compression"
# I. H. Witten, R. M. Neal, and J. G. Cleary
# Communications of the ACM, Vol. 30 (6), 1987
#
# AUTHOR: Mark Reid
# CREATED: 2014-09-30

def encode(G, stream):
    '''
    Arithmetically encodes the given stream using the guesser function G
    which returns probabilities over symbols P(x|xs) given a sequence xs.
    '''
    u, v = 0.0, 1.0     # The interval [u, v) for the message
    xs, bs = "", ""     # The message xs, and binary code bs
    p = G(xs)           # Compute the initial distribution over symbols

    # Iterate through stream, repeatedly finding the longest binary code
    # that surrounds the interval for the message so far
    for x in stream:
        # Record the new symbol
        xs += x

        # Find the interval for the message so far
        F_lo, F_hi = cdf_interval(p, x)
        u, v = u + (v - u) * F_lo, u + (v - u) * F_hi

        # Find a binary code whose interval surrounds [u,v)
        bs = extend_around(bs, u, v)

        # Update the symbol probabilities
        p = G(xs)

    # Stream finished so find shortest extension of the code that sits inside
    # the top half of [u, v)
    bs = extend_inside(bs, u + (v - u) / 2, v)

    return bs

##############################################################################
# Models
def dirichlet(m):
    '''
    Returns a Dirichlet model (as a function) for probabilities with
    prior counts given by the symbol to count dictionary m.
    Probabilities returned by the returned functions are (symbol, prob)
    dictionaries.
    '''
    # Build a function that returns P(x|xs) based on the priors in m
    # and the counts of the symbols in xs
    def p(xs):
        counts = m.copy()
        for x in xs:
            counts[x] += 1

        total = sum(counts.values())
        return {a: float(c) / total for a, c in counts.items()}

    # Return the constructed function
    return p

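
Reviewer note: the model above is easiest to follow with concrete numbers. The sketch below repeats `dirichlet` so that it runs on its own, then evaluates the predictive distribution before and after observing a short prefix. The prior counts and the prefix "aab" are illustrative choices, not values taken from the project.

```python
def dirichlet(m):
    """Copy of the model above: prior counts m plus the counts seen in xs."""
    def p(xs):
        counts = m.copy()
        for x in xs:
            counts[x] += 1
        total = sum(counts.values())
        return {a: float(c) / total for a, c in counts.items()}
    return p

# Illustrative prior: one pseudo-count per symbol (an assumption for this demo)
model = dirichlet({'a': 1, 'b': 1, 'c': 1})

before = model("")     # no data yet: counts 1, 1, 1 out of 3
after = model("aab")   # counts become a=3, b=2, c=1 out of 6

print(before)
print(after)
```

Each observed symbol adds one to its count, so the probability of a repeated symbol grows while the distribution still sums to one. A symbol missing from the prior dictionary would raise a `KeyError`, which matches the "no defensive coding" caveat in the file header.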
##############################################################################
# Interval methods
def cdf_interval(p, a):
    '''
    Compute the cumulative distribution interval [F(a'), F(a)) for the
    probabilities p (represented as a (symbol,prob) dict) where
    F(a) = P(x <= a) and a' is the symbol preceding a.
    '''
    F_lo, F_hi = 0, 0
    A = sorted(p)
    for x in A:
        F_lo, F_hi = F_hi, F_hi + p[x]
        if x == a:
            break

    return F_lo, F_hi

def binary_interval(bs):
    '''
    Returns an interval [n, m) for n and m integers, and denominator d
    representing the interval [n/d, m/d) for the binary string bs.
    '''
    n, d = to_rational(bs)
    return n, n + 1, d

def to_rational(bs):
    '''Return numerator and denominator for ratio of 0.bs.'''
    n = 0
    for b in bs:
        n *= 2
        n += int(b)

    return n, 2 ** len(bs)

def around(bs, u, v):
    '''Tests whether [0.bs, 0.bs111...) contains [u, v).'''
    n, m, d = binary_interval(bs)
    return (n <= u * d) and (v * d <= m)

def extend_around(bs, u, v):
    '''Find the longest extension of the given binary string so its interval
    wraps around the interval [u, v).'''
    contained = True
    while contained:
        if around(bs + "0", u, v):
            bs += "0"
        elif around(bs + "1", u, v):
            bs += "1"
        else:
            contained = False

    return bs

def inside(bs, u, v):
    '''Tests whether [0.bs, 0.bs111...) is contained by [u, v).'''
    n, m, d = binary_interval(bs)
    return (u * d <= n) and (m <= v * d)

def extend_inside(bs, u, v):
    '''Find the shortest extension of the given binary string so its interval
    sits inside the interval [u, v).'''
    while not inside(bs, u, v):
        # Test whether gap between binary interval and [u,v) is bigger at the
        # bottom than at the top
        n, m, d = binary_interval(bs)
        if u * d - n > m - v * d:
            bs += "1"   # If so, move bottom up by halving
        else:
            bs += "0"   # If not, move top down by halving

    return bs
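
Reviewer note: a small worked example may help show how the binary-interval helpers decide which bits are safe to emit. The sketch below is self-contained and re-implements `to_rational`, `around`, and `extend_around` in condensed form. It uses `fractions.Fraction` for the endpoints instead of floats, which is a substitution made here to keep the arithmetic exact and is not what the deleted file does. The message intervals are illustrative, derived by hand from a Dirichlet model with prior counts of 1 for each of a, b, c.

```python
from fractions import Fraction

def to_rational(bs):
    """Numerator and denominator of the binary fraction 0.bs."""
    n = 0
    for b in bs:
        n = 2 * n + int(b)
    return n, 2 ** len(bs)

def around(bs, u, v):
    """True if the binary interval [n/d, (n+1)/d) for bs contains [u, v)."""
    n, d = to_rational(bs)
    return n <= u * d and v * d <= n + 1

def extend_around(bs, u, v):
    """Append bits for as long as the binary interval still contains [u, v)."""
    while True:
        if around(bs + "0", u, v):
            bs += "0"
        elif around(bs + "1", u, v):
            bs += "1"
        else:
            return bs

# "011" reads as 0.011 in binary, i.e. 3/8
print(to_rational("011"))

# Illustrative message intervals (hand-derived assumptions for this demo):
prefix_a = extend_around("", Fraction(0), Fraction(1, 3))      # message "a"
prefix_aa = extend_around("", Fraction(0), Fraction(1, 6))     # message "aa"
prefix_b = extend_around("", Fraction(1, 3), Fraction(2, 3))   # message "b"
print(repr(prefix_a), repr(prefix_aa), repr(prefix_b))
```

The interval for "b" straddles 1/2, so neither [0, 1/2) nor [1/2, 1) contains it and no bit can be committed yet. Bits only become available once the message interval has narrowed to one side of a dyadic boundary.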