" tiff_image_path (string): path to the tiff file\n",
" \n",
" Return:\n",
" image ndarray(512 X 640): original image \n",
" predict ndarray(325380,): predicted image excluding the boundary\n",
" diff. ndarray(325380,): IF difference = TRUE, difference between the min and max of four neighbors exclude the boundary\n",
" ELSE: the residuals of the four nearest pixels to a fitted hyperplane\n",
" error ndarray(325380,): difference between the original image and predicted image\n",
" A ndarray(3 X 3): system of equation\n",
" \"\"\"\n",
" image_obj = Image.open(tiff_image_path) #Open the image and read it as an Image object\n",
" image_array = np.array(image_obj)[1:,:].astype(int) #Convert to an array, leaving out the first row because the first row is just housekeeping data\n",
" # image_array = image_array.astype(int) \n",
" A = np.array([[3,0,-1],[0,3,3],[1,-3,-4]]) # the matrix for system of equation\n",
" tiff_image_path (string): path to the tiff file\n",
" \n",
" Return:\n",
" image ndarray(512 X 640): original image \n",
" predict ndarray(325380,): predicted image excluding the boundary\n",
" diff. ndarray(325380,): IF difference = TRUE, difference between the min and max of four neighbors exclude the boundary\n",
" ELSE: the residuals of the four nearest pixels to a fitted hyperplane\n",
" error ndarray(325380,): difference between the original image and predicted image\n",
" A ndarray(3 X 3): system of equation\n",
" \"\"\"\n",
" image_obj = Image.open(tiff_image_path) #Open the image and read it as an Image object\n",
" image_array = np.array(image_obj)[1:,:].astype(int) #Convert to an array, leaving out the first row because the first row is just housekeeping data\n",
" # image_array = image_array.astype(int) \n",
" A = np.array([[3,0,-1],[0,3,3],[1,-3,-4]]) # the matrix for system of equation\n",
"Round 1 : Using PIL to compress the tiff file.\n",
"\n",
"Conclusion : The best that I could get using the lzw compressor from PIL was 61.5%. This is worse than the other one at 40, which is likely because it is using multiple channels instead of just one."
]
},
{
"cell_type": "code",
"execution_count": 3,
"metadata": {},
"outputs": [],
"source": [
"def file_extractor(dirname=\"images\"):\n",
" files = os.listdir(dirname)\n",
" scenes = []\n",
" for file in files:\n",
" if file == '.DS_Store':\n",
" continue\n",
" else:\n",
" scenes.append(os.path.join(dirname, file))\n",
" return scenes\n",
"\n",
"def image_extractor(scenes):\n",
" image_folder = []\n",
" for scene in scenes:\n",
" files = os.listdir(scene)\n",
" for file in files:\n",
" if file[-5:] != \".tiff\" or file[-7:] == \"_6.tiff\":\n",
"Round 2: Trying the same thing but with tifffile and doing grayscale\n",
"\n",
"Conclusion: The documentation was terrible. It advertizes compression but doesn't say how it's done, and the specifics of it. Moving on to something else, but still worth a shot."
]
},
{
"cell_type": "code",
"execution_count": 17,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"(640, 513)\n"
]
}
],
"source": [
"scenes = file_extractor()\n",
"images = image_extractor(scenes)\n",
"first_source = images[0]\n",
"picture = Image.open(first_source)\n",
"print(picture.size)"
]
},
{
"cell_type": "code",
"execution_count": 18,
"metadata": {},
"outputs": [],
"source": [
"import tifffile"
]
},
{
"cell_type": "code",
"execution_count": 19,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"[[ 2 1 54668 ... 65535 65535 65535]\n",
" [22275 22292 22292 ... 22280 22212 22270]\n",
" [22303 22301 22298 ... 22254 22248 22262]\n",
" ...\n",
" [21832 21820 21844 ... 21892 21852 21845]\n",
" [21843 21821 21830 ... 21870 21865 21864]\n",
" [21836 21829 21840 ... 21858 21857 21860]]\n"
]
}
],
"source": [
"otherpic = tifffile.imread(first_source)\n",
"print(otherpic)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Round 3: Trying numcompress, something that compresses numbers. Should provide an ok benchmark, and is actually documented.\n",
"\n",
"Conclusion: The headline spoofed, it said I could get over 80%, I got under 30%. So that's worth looking into. Also this algorithm doesn't look into verticle changes, just horizontal, so I don't know why they advertized it to be as good as it is. No way it could be better than PNG, which doesn't advertize anything that high."