transpose_list()

kero.DataHandler.Generic.py

def transpose_list(input_list):
  return out

This is the usual matrix transpose operation, defined for a 2D matrix in the form of a list of lists.

input_list List of list. 2D matrix. Note that each item in the input_list is a row in the matrix.
out List of list. 2D matrix. The matrix transpose of input_list.
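
For reference, such a transpose can be written in one line; a minimal sketch, not necessarily kero's exact implementation:

def transpose_sketch(input_list):
    # zip(*input_list) pairs up the i-th elements of every row,
    # i.e. it yields the columns of the matrix as tuples.
    return [list(row) for row in zip(*input_list)]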

Example usage

import kero.DataHandler.Generic as kg

mat = [[1,2],[3,4]]
for x in mat:
	print(x)

print("\nTransposed:")
for x in kg.transpose_list(mat):
	print(x)

The output is:

[1, 2]
[3, 4]

Transposed:
[1, 3]
[2, 4]

kero version: 0.1 and above

clone_to_annotate()

Given a folder of images and a tag folder containing the same images, each of which is marked with white color, RGB=(255,255,255), this function creates

  1. a clone of the folder,
  2. a folder of its ground truth images based on the tag folder,
  3. a folder containing the images in the folder with bounding boxes drawn based on the ground truth images, along with text files containing
    image_label img-width img-height xmin ymin xmax ymax
    where (xmin, ymin) are the top-left coordinates of the bounding box and (xmax, ymax) the bottom-right (see the sketch below).
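
For illustration, one line of such a .txt file could be read back like this (a hypothetical helper, not part of kero):

def parse_annotation_line(line):
    # line format: image_label img-width img-height xmin ymin xmax ymax
    label, w, h, xmin, ymin, xmax, ymax = line.split()
    return label, int(w), int(h), int(xmin), int(ymin), int(xmax), int(ymax)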

Tips. Consider using clone_to_annotate_faster().

kero.ImageProcessing.photoBox.py

class GreyScaleWorkShop:
  def clone_to_annotate(self, this_folder, tag_folder, starting_label, annotation_name,
        order_name="img",
        tag_name="imggt",
        check_missing_mode=False,
        rotate_angle_set=None,
        skip_ground_truth=False,
        significant_fraction=0.01,
        thresh=254,
        scale="Auto"):
    return

 

this_folder String. The name of the folder with images.
tag_folder String. The name of the folder with tag images. If each image in this_folder has a corresponding image of the same name and type in the tag_folder, then the clone folder, clone tag folder, ground truth image folder and annotation folder will be created with their corresponding contents.
starting_label Integer. The sequence of numbers will start with this integer.
annotation_name String. The image label for the white region marked in the tag folder. This will be saved in annotation .txt files.

In example 1, this is “butterfly”. This means that we label the object marked with white a “butterfly”.

order_name String. The clone of this_folder will be relabelled with prefix specified by this string.

Default value =”img”

tag_name String. The clone of tag_folder will be relabelled with prefix specified by this string.

Default value =”imggt”

check_missing_mode Boolean. If True, the cloning process of the folders is not performed; instead, the file names of images in this_folder that do not have corresponding images in the tag_folder will be printed.

Default value =False

rotate_angle_set List of float. Each float is an angle in degrees by which an image is rotated.

Default value =None

skip_ground_truth Boolean. Set to True if the ground truth images have already been created in the manner spawn_ground_truth() spawns them; the function will then skip straight to the annotations.

Default value = False

significant_fraction Float. This argument takes values between 0 and 1.0. It specifies the fraction of area, relative to the area of the whole matrix, above which a connected region is considered a component; anything smaller is treated as noise. This is an argument to the function get_connected_components().

Default value = 0.01

thresh Integer, from 0 to 255. A pixel whose R, G and B values all exceed this threshold is converted to white, RGB=(255,255,255); any other pixel is converted to black, RGB=(0,0,0).

For example, if the value is set to 244, a pixel with (255,254,246) is converted to white since 255, 254 and 246 all exceed 244, while a pixel with (20,40,120) is converted to black.

Default value = 254

scale “Auto”, (Integer,Integer) or None. Annotations are computed from down-scaled images to improve the processing speed (unless scale=None). If set to “Auto”, the image will be downsized to (200,150) for processing.

This is an argument to the function multiple_scaled_box_positions().

Default value = “Auto”

Note: The actual image is not changed; only the annotation positions are computed on the down-scaled image and mapped back to the original, with a potential loss of accuracy.
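
To make this concrete, mapping a box computed on the down-scaled image back to the original amounts to something like the following (a hypothetical helper, not kero's internal code):

def rescale_box(box, scaled_size, original_size):
    # box: [xmin, ymin, xmax, ymax] found on the down-scaled image.
    # scaled_size, original_size: (width, height) in pixels.
    sw, sh = scaled_size
    ow, oh = original_size
    xmin, ymin, xmax, ymax = box
    return [round(xmin * ow / sw), round(ymin * oh / sh),
            round(xmax * ow / sw), round(ymax * oh / sh)]

# e.g. a box found on a (200,150) image, mapped back to a 576 by 480 original:
# rescale_box([50, 30, 80, 60], (200, 150), (576, 480)) -> [144, 96, 230, 192]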

Tips: If annotation fails for one reason or another after ground truth image generation is complete, then make sure to set skip_ground_truth=True before rerunning the function, so that we do not waste time re-spawning the ground truth images.

Tips: For images whose objects are very small, setting a small scale might be a bad choice, since the positions of the annotation boxes might lose a lot of accuracy during rescaling.

Example usage 1.

Download the example here and put them in the working directory under the folder /bb.

import kero.ImageProcessing.photoBox as kip

this_folder = "bb\\Butterflies"
tag_folder =  "bb\\Butterflies_canvas"

gsw=kip.GreyScaleWorkShop()
rotate_angle_set = [0,30,60,90,120,150,180] # None
annotation_name = "butterfly"
gsw.clone_to_annotate(this_folder, tag_folder,1,annotation_name, check_missing_mode=False,rotate_angle_set=rotate_angle_set)

This function calls spawn_ground_truth(), i.e. it creates the ground truth image (figure 1, top-right) of each image (figure 1, top-left) based on the corresponding image from the tag folder (figure 1, top-center), and furthermore creates annotations for each image. Samples of images from the folders are shown below.

Figure 1. [image: gtspawn]

A folder containing rotated figures (figure 2) will be created. Also, a txt file will be created for each rotated copy of the image. The bounding boxes are the thin green rectangles.

Figure 2. [image: annot]

kero version: 0.4.2 and above

spawn_ground_truth()

Given a folder of images and a tag folder containing the same images, marked with white color, RGB=(255,255,255), this function creates a clone of the folder and a folder of its ground truth images based on the tag folder.

kero.ImageProcessing.photoBox.py

class GreyScaleWorkShop:
  def spawn_ground_truth(self,this_folder, tag_folder,starting_label,
        order_name="img",
        tag_name="img",
        check_missing_mode=False,
        rotate_angle_set=None,
        thresh=254):
    return

 

this_folder String. The name of the folder with images.
tag_folder String. The name of the folder with tag images. If each image in this_folder has a corresponding image of the same name and type in the tag_folder, then the rotated image clones and ground truth images will be created and relabelled accordingly.
starting_label Integer. The sequence of numbers will start with this integer.
order_name String. The clone of this_folder will be relabelled with prefix specified by this string.

Default value =”img”

tag_name String. The clone of tag_folder will be relabelled with prefix specified by this string.

Default value =”img”

check_missing_mode Boolean. If True, the cloning process of the folders is not performed; instead, the file names of images in this_folder that do not have corresponding images in the tag_folder will be printed.

Default value =False

rotate_angle_set List of float. Each float is an angle in degrees by which an image is rotated.

Default value =None

thresh Integer, from 0 to 255. A pixel whose R, G and B values all exceed this threshold is converted to white, RGB=(255,255,255); any other pixel is converted to black, RGB=(0,0,0).

For example, if the value is set to 244, a pixel with (255,254,246) is converted to white since 255, 254 and 246 all exceed 244, while a pixel with (20,40,120) is converted to black.

Default value = 254
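
Conceptually, the ground truth generation from a tag image boils down to this thresholding step; a minimal sketch with OpenCV and NumPy, assuming the thresh semantics described above, not kero's actual code:

import cv2
import numpy as np

def make_ground_truth_sketch(tag_image_path, thresh=254):
    # Pixels whose B, G and R values all exceed thresh become white;
    # everything else becomes black.
    img = cv2.imread(tag_image_path)
    mask = np.all(img > thresh, axis=2).astype(np.uint8) * 255
    return cv2.merge([mask, mask, mask])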

Example usage 1

Download the example here and put them in the working directory under the folder /bb.

import kero.ImageProcessing.photoBox as kip

this_folder = "bb\\Butterflies"
tag_folder =  "bb\\Butterflies_canvas"

gsw=kip.GreyScaleWorkShop()
rotate_angle_set = [0,30,60,90,120,150,180] # None
gsw.spawn_ground_truth(this_folder, tag_folder,1, check_missing_mode=False,rotate_angle_set=rotate_angle_set)

The _canvas folder (top-middle) marks out the butterfly with a white marker. The ground truth is produced as shown in the top-right.

[image: gtspawn.JPG]


kero version: 0.5.1 and above

tag_rename_clone()

kero.ImageProcessing.photoBox.py

def tag_rename_clone(this_folder, tag_folder,starting_label,
    order_name="img",tag_name="imggt",filetype=".png",
    clone_name="_CLONE",tag_clone_name="_TAG",
    dump_folder="this_dump",
    check_missing_mode=False,
    rotate_angle_set=None):
  return

Description. Clones folders for image pre-processing. See example usage 1.

Rotation function. When the variable rotate_angle_set is set to a list of floats [a1, a2, …], each image in the folder will be cloned, and copies of the image rotated by angles a1, a2, … will be created in the clone folder and numbered along the same sequence.
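
Conceptually, the rotation step does something like the following for each image; a sketch with PIL under assumed naming, not the library's actual code:

from PIL import Image

def spawn_rotated_copies(img_path, rotate_angle_set, out_prefix, start_label):
    # Save one rotated copy per angle, numbered along the running sequence.
    img = Image.open(img_path)
    label = start_label
    for angle in rotate_angle_set:
        img.rotate(angle, expand=True).save("%s%d.png" % (out_prefix, label))
        label += 1
    return label  # next free label in the sequence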

this_folder String. The name of the folder with images.
tag_folder String. The name of the folder with tag images. If each image in this_folder has a corresponding image of the same name and type in the tag_folder, then the image will be cloned into a clone folder and tag clone folder and relabelled accordingly.
starting_label Integer. The sequence of numbers will start with this integer.
order_name String. The clone of this_folder will be relabelled with prefix specified by this string.

Default value =”img”

tag_name String. The clone of tag_folder will be relabelled with prefix specified by this string.

Default value =”imggt”

filetype String. Each image in both clone folders will have file type specified by this string.

Default value =”.png”

clone_name String. The clone folder of this_folder will be named this_folder+clone_name

Default value =”_CLONE”

tag_clone_name String. The clone folder of tag_folder will be named tag_folder+tag_clone_name

Default value =”_TAG”

dump_folder String. The name of the dump folder. The dump folder will be filled with images from this_folder that do not have corresponding images with the same name and type in the tag_folder. If this folder is empty, it will be deleted at the end.

Default value =”this_dump”

check_missing_mode Boolean. If True, the cloning process of the folders is not performed; instead, the file names of images in this_folder that do not have corresponding images in the tag_folder will be printed.

Default value =False

rotate_angle_set List of float. Each float is an angle in degrees by which an image is rotated.

Default value =None

 

Example usage 1

Given a folder of images and a tag folder, this function clones both folders and relabels the images in numerical sequence. Consider the folder shown below (left) and its tag folder (right), which is identical to the folder except that each corresponding butterfly object is marked white. Download the example here and put the files in the working directory under the folder /bb.

import kero.ImageProcessing.photoBox as kip

this_folder = "bb\\Butterflies"
tag_folder =  "bb\\Butterflies_canvas"

kip.tag_rename_clone(this_folder, tag_folder, 1, check_missing_mode=False)

 

[image: bbfolder.png]

The cloned and relabelled folders are shown below.

[image: bbfolder2.png]

kero version: 0.4.3 and above

multiple_scaled_box_positions()

This function returns the set of bounding boxes for all connected components in a scaled gray scale matrix. The outcome is optimal if the matrix is a zeros-and-ones matrix. See the image in example 1.

kero.ImageProcessing.photoBox.py

class GreyScaleWorkShop:
  def multiple_scaled_box_positions(self, gsimg, scale="Auto", significant_fraction=0.001):
    return rect_set
gsimg List of list. 2D matrix scaled to 0 and 1.

Note: For best bounding box results, use a zeros-and-ones 2D matrix.

scale “Auto”, (Integer, Integer) or None.

None: the input image is processed without scaling. This will be slow.

“Auto”: the input image is scaled to (200,150) pixels before being scanned.

(Integer, Integer): a 2-tuple of integers specifying the size in pixels to which the image is resized before being scanned.

The image will be fed into the function get_connected_components() as an argument. It is recommended to choose a small scale if processing speed is the priority.

Default value is “Auto”

significant_fraction Float, between 0 and 1.0. Argument to the function get_connected_components().

Default value = 0.001

return rect_set List of [xmin, ymin, xmax, ymax], each specifying the shape of the rectangle of a bounding box. (xmin, ymin) are the top-left coordinates and (xmax, ymax) the bottom-right.
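
Given a single connected-component matrix, its bounding box can be obtained along these lines (a NumPy sketch, not kero's internal code):

import numpy as np

def component_bounding_box(component):
    # component: 2D zeros-and-ones matrix holding one connected component.
    ys, xs = np.nonzero(np.array(component))
    # column indices give x, row indices give y
    return [int(xs.min()), int(ys.min()), int(xs.max()), int(ys.max())]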

Download the image from here and place the image in the working directory.

Example usage 1

Try the following with different values of scale as well.

import kero.ImageProcessing.photoBox as kip
import cv2, time

t=time.time()
gsw = kip.GreyScaleWorkShop()
fn= "pbtest3.png" 
s1 = cv2.imread(fn)
s1grey = cv2.cvtColor(s1, cv2.COLOR_BGR2GRAY)/255
# The time listed in the following applies for "pbtest3.png", 576 by 480 pixels
# rect_set = gsw.multiple_scaled_box_positions(s1grey, scale="Auto", significant_fraction=0.001) # 1.8397045135498047 s
rect_set = gsw.multiple_scaled_box_positions(s1grey, scale=(300,200), significant_fraction=0.001) # 4.6596386432647705 s
print("no of rects = ", len(rect_set))
print("rect_set=",rect_set)
for i in range(len(rect_set)):
	print("index of rect = ",i)
	rect = rect_set[i]
	xmin = rect[0]
	ymin = rect[1]
	xmax = rect[2]
	ymax = rect[3]
	cv2.rectangle(s1, (xmin, ymin), (xmax,ymax),(0,255,0),1)

	# cv2.imshow("grey", s1grey)
cv2.imshow("color", s1)
elapsed = time.time() - t
print("time taken (s) : ",elapsed)
cv2.waitKey(0)

The output is shown below, where the green boxes delineate each connected component. However, with scale set to “Auto”, i.e. (200,150), we can see a loss of precision. With the larger scale (300,200), the correct boxes are obtained.

[image: multibox.JPG]

kero version: 0.4.3 and above

get_connected_components()

kero.ImageProcessing.photoBox.py

def get_connected_components(gs_matrix, significant_fraction=0.001):
  return out

This function takes in a 2D matrix containing only zeros and ones, representing an image thresholded to black and white, i.e. RGB (0,0,0) and (255,255,255), converted to grey scale and scaled to 0 and 1. It then outputs a list of similar zero-and-one 2D matrices, each of which contains a single connected component. See the image shown in example 1.

Important: this function makes use of recursion and is thus slow for processing large images. The strategy is to shrink the image (for example to about 200 by 150), find the components there, and map them back to the original large image, although some precision will be lost.

gs_matrix List of integers, 2D matrix of zeros and ones.
significant_fraction Float. This argument takes values between 0 and 1.0. It specifies the fraction of area, relative to the area of the whole matrix, above which a connected region is considered a component; anything smaller is treated as noise.

The total area is computed by multiplying the number of rows by the number of columns. We can view the matrix as an image, where each entry corresponds to a pixel.

A pixel (i, j) is connected to a neighbouring pixel if and only if that pixel differs by exactly 1 in exactly one index, i.e. for (i, j), the entry (i, j+1) is a neighbour but (i-1, j+1), one pixel away diagonally, is not (4-connectivity; see the sketch below). The area of a component is the sum of all entries in the component matrix.

Default value is 0.001

return out List of 2D matrix. Each 2D matrix is a zero-and-one matrix containing a single connected component.
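
To illustrate the behaviour just described, here is a minimal iterative (BFS) sketch of the same idea, using the 4-connectivity rule above; the library itself makes use of recursion, per the note above, and this is not its actual code:

from collections import deque

def connected_components_sketch(gs_matrix, significant_fraction=0.001):
    rows, cols = len(gs_matrix), len(gs_matrix[0])
    seen = [[False] * cols for _ in range(rows)]
    out = []
    for si in range(rows):
        for sj in range(cols):
            if gs_matrix[si][sj] == 0 or seen[si][sj]:
                continue
            component = [[0] * cols for _ in range(rows)]
            area = 0
            queue = deque([(si, sj)])
            seen[si][sj] = True
            while queue:
                i, j = queue.popleft()
                component[i][j] = 1
                area += 1
                # 4-connectivity: up, down, left, right; diagonals excluded
                for di, dj in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                    ni, nj = i + di, j + dj
                    if 0 <= ni < rows and 0 <= nj < cols \
                            and gs_matrix[ni][nj] != 0 and not seen[ni][nj]:
                        seen[ni][nj] = True
                        queue.append((ni, nj))
            # components smaller than the fraction threshold are treated as noise
            if area >= significant_fraction * rows * cols:
                out.append(component)
    return out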

Example usage 1

Uncomment the commented parts of the code to try this function on a different image or a hand-written 2D matrix.

import cv2
import numpy as np
import kero.ImageProcessing.photoBox as kip

# # TEST
# gs_matrix = [[0, 0, 0, 0,0,1], [1, 0, 1,0,0,1], [1, 1, 1,0,0,1],
#              [0, 0, 0, 0, 0, 0],[0,0,0,0,0,0],[0,0,0,1,1,0],
#              [0, 0, 0, 0, 1, 0]]
# out = kip.get_connected_components(gs_matrix)
# cv2.imshow("init", np.array(gs_matrix)*255)
# for x in gs_matrix:
#     print(x)
# for i in range(len(out)):
#     print("__________________________")
#     for x in out[i]:
#         print(x)
#     # cv2.imshow(str(i), np.array(out[i])*255)
# # cv2.waitKey(0)

s1 = cv2.imread("pbtest3.png")
s1grey = cv2.cvtColor(s1, cv2.COLOR_BGR2GRAY)/255
cv2.imshow("grey", s1grey)
print(np.array(s1grey).shape)

s1grey_components = kip.get_connected_components(s1grey, significant_fraction=0.001)

print(np.array(s1grey_components).shape)
for i in range(len(s1grey_components)):
    cv2.imshow("grey" + str(i), 255*np.array(s1grey_components[i]))

cv2.waitKey(0)

The figure below is pbtest3.png. With the above settings, 3 connected components are obtained. Try increasing the argument significant_fraction to 0.01: the bottom-left component will then be ignored, since its area is not large enough relative to the entire image to qualify as a component. This is useful to prevent noise from being recognized as components.

[image: getcc]

Some other images that you can segment using this function are shown below, and can be downloaded here.

[image: thisimg.JPG]

kero version: 0.4 and above

get_segment_set()

This function deconstructs a matrix of zeros and ones into a list of its rows, and then each row is deconstructed into its contiguous non-zero segments.

kero.ImageProcessing.photoBox.py 

def get_segment_set(mat):
  return liber_mat_set, segment_set

 

mat 2D matrix. List of list or numpy array.
return liber_mat_set List of 2D matrices. Each matrix is a zero matrix, except that it contains a single segment from some row.
return segment_set List (3 levels deep). A collection, for each matrix row, of that row's segments (see the sketch below).
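
As a sketch of the per-row decomposition (not kero's actual code), splitting one row into its contiguous non-zero segments can be done like this:

def row_segments_sketch(row):
    # Each segment is returned as a full-length row that is zero
    # outside the segment, matching the segment view below.
    segments = []
    current = None
    for j, v in enumerate(row):
        if v != 0:
            if current is None:
                current = [0] * len(row)
                segments.append(current)
            current[j] = v
        else:
            current = None
    return segments

# row_segments_sketch([1, 0, 0, 0, 1]) -> [[1, 0, 0, 0, 0], [0, 0, 0, 0, 1]]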

Example usage 1

import kero.ImageProcessing.photoBox as kip
import numpy as np

# gs_matrix = np.random.randint(2, size=(5, 10))
gs_matrix=[[0,1,1,0,0],[1,0,0,0,1],[1,1,1,1,1]]

print("gs_matrix:\n")
for x in gs_matrix:
    print(x)

liber_mat_set, segment_set = kip.get_segment_set(gs_matrix)

print("\n**** liber mat view: ****")
seg = [[0]*len(gs_matrix[0]) for i in range(len(gs_matrix))]
for k in range(len(liber_mat_set)):
    print("---- k = ", k , " ----")
    for i in range(len(liber_mat_set[k])):
        print(liber_mat_set[k][i])
        seg = kip.list_matrix_add(seg, liber_mat_set[k][i])

print("\nmatrix reconstructed from sum of segments. They should be equal to gs_matrix")
for k in seg:
    print(k)


print("\n**** segment view: ****")
for k in range(len(segment_set)):
    print("---- k = ", k , " ----")
    for i in range(len(segment_set[k])):
        print(segment_set[k][i])

The output is shown below.

gs_matrix:

[0, 1, 1, 0, 0]
[1, 0, 0, 0, 1]
[1, 1, 1, 1, 1]

**** liber mat view: ****
---- k = 0 ----
[[0, 1, 1, 0, 0], [0, 0, 0, 0, 0], [0, 0, 0, 0, 0]]
---- k = 1 ----
[[0, 0, 0, 0, 0], [1, 0, 0, 0, 0], [0, 0, 0, 0, 0]]
[[0, 0, 0, 0, 0], [0, 0, 0, 0, 1], [0, 0, 0, 0, 0]]
---- k = 2 ----
[[0, 0, 0, 0, 0], [0, 0, 0, 0, 0], [1, 1, 1, 1, 1]]

matrix reconstructed from sum of segments. They should be equal to gs_matrix
[0, 1, 1, 0, 0]
[1, 0, 0, 0, 1]
[1, 1, 1, 1, 1]

**** segment view: ****
---- k = 0 ----
[0, 1, 1, 0, 0]
---- k = 1 ----
[1, 0, 0, 0, 0]
[0, 0, 0, 0, 1]
---- k = 2 ----
[1, 1, 1, 1, 1]

kero version: 0.4 and above

find_segment_index()

kero.ImageProcessing.photoBox.py

def find_segment_index(mat, start_entry):
  return row_index, column_index_list, out_mat, liber_mat
mat 2D matrix. List of list or numpy array.
start_entry [Integer, Integer]. The matrix index (zero-based); it must point to a non-zero element. The function then finds the contiguous segment in that row which contains this entry and is non-zero in all its entries.
return row_index Integer. The index of the row where the non-zero segment is located.
return column_index_list List of integer. The list of indices at row_index-th row that makes the non-zero segment.
return out_mat 2D matrix. This is the input mat, but with the segment removed.
return liber_mat 2D matrix. This is the matrix containing only the non-zero segment.
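
To spell out the behaviour (a sketch under the assumption that a segment is the maximal run of non-zero entries in start_entry's row, not kero's actual code):

def find_segment_index_sketch(mat, start_entry):
    i, j = start_entry
    row = list(mat[i])
    # expand left and right from (i, j) while entries stay non-zero
    lo, hi = j, j
    while lo > 0 and row[lo - 1] != 0:
        lo -= 1
    while hi < len(row) - 1 and row[hi + 1] != 0:
        hi += 1
    column_index_list = list(range(lo, hi + 1))
    out_mat = [list(r) for r in mat]
    liber_mat = [[0] * len(row) for _ in mat]
    for c in column_index_list:
        liber_mat[i][c] = out_mat[i][c]  # copy the segment out
        out_mat[i][c] = 0                # and remove it from the matrix
    return i, column_index_list, out_mat, liber_mat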

Example usage 1:

import kero.ImageProcessing.photoBox as kip
import numpy as np


gs_matrix = np.random.randint(2, size=(5, 10))
print(gs_matrix)

this_i, this_j = kip.find_first_non_zero_index(gs_matrix)
print("first non zero index: ", this_i, this_j)

start_entry = [this_i, this_j]
[row_index, column_index_list, out_mat, liber_mat] = kip.find_segment_index(gs_matrix, start_entry)
print("first segment index: ", row_index, " : ", column_index_list)

print("third output:\n", out_mat)
print("last output:")
for x in liber_mat:
	print(x)

As shown below, the first non-zero index and the first continuously non-zero segment are highlighted in blue. The red highlight shows the same matrix with this segment removed. The next blue highlight shows the extracted segment.

[[1 1 1 0 1 1 1 0 0 1]
 [1 1 0 0 1 0 0 1 0 1]
 [0 1 0 0 0 0 1 0 1 1]
 [1 0 1 1 1 1 0 0 0 0]
 [1 1 0 1 1 0 0 1 0 0]]
first non zero index:  0 0
first segment index:  0  :  [0, 1, 2]
third output:
 [[0 0 0 0 1 1 1 0 0 1]
 [1 1 0 0 1 0 0 1 0 1]
 [0 1 0 0 0 0 1 0 1 1]
 [1 0 1 1 1 1 0 0 0 0]
 [1 1 0 1 1 0 0 1 0 0]]
last output:
[1, 1, 1, 0, 0, 0, 0, 0, 0, 0]
[0, 0, 0, 0, 0, 0, 0, 0, 0, 0]
[0, 0, 0, 0, 0, 0, 0, 0, 0, 0]
[0, 0, 0, 0, 0, 0, 0, 0, 0, 0]
[0, 0, 0, 0, 0, 0, 0, 0, 0, 0]

kero version: 0.4 and above

find_first_non_zero_index()

kero.ImageProcessing.photoBox.py

def find_first_non_zero_index(mat, row_start=0):
  return i, j
mat 2D matrix. List of list or numpy array
row_start Integer. The index of the row from which this function starts looking for the first non-zero element; rows before index row_start are skipped.

Default to zero, i.e. start searching from the first row.

return i, j The index (i, j) of the first non-zero element found, scanning from row index row_start onwards.
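
The behaviour amounts to a straightforward row-major scan; a minimal sketch, not kero's actual code:

def find_first_non_zero_index_sketch(mat, row_start=0):
    for i in range(row_start, len(mat)):
        for j in range(len(mat[i])):
            if mat[i][j] != 0:
                return i, j
    return None  # no non-zero entry found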

Example Usage 1

import kero.ImageProcessing.photoBox as kip
import numpy as np

# gs_matrix = [
# [1,1,0,0,1 ],
# [1,0,0,1,1 ],
# [1,1,1,0,1 ]
# ]

gs_matrix = np.random.randint(2, size=(5, 10))
print(gs_matrix)

this_i, this_j = kip.find_first_non_zero_index(gs_matrix)
print("first non zero index: ", this_i, this_j)

start_entry = [this_i, this_j]
[row_index, column_index_list, out_mat, liber_mat] = kip.find_segment_index(gs_matrix, start_entry)
print("first segment index: ", row_index, " : ", column_index_list)

The output:

[[1 1 0 1 1 1 0 1 1 0]
 [0 1 1 1 1 1 1 0 1 1]
 [0 0 1 1 0 0 1 1 1 1]
 [0 0 1 0 1 0 1 0 0 0]
 [1 1 0 0 0 1 0 1 0 1]]
first non zero index: 0 0
first segment index: 0 : [0, 1]

The first non zero index is bolded in blue.

kero version: 0.4 and above

Object Detection using Tensorflow: bee and butterfly Part V

Object Detection using Tensorflow: bee and butterflies

  1. Part 1: set up tensorflow in a virtual environment
  2. adhoc functions
  3. Part 2: preparing annotation in PASCAL VOC format
  4. Part 3: preparing tfrecord files
  5. more scripts
  6. Part 4: start training our machine learning algorithm!
  7. COCO API for Windows
  8. Part 5: perform object detection

Tips. Instead of reading this post, read Object Detection using Tensorflow: bee and butterfly Part V, faster, where the object detection process is arranged much more efficiently. The code below re-runs the tensorflow session at each iteration over the images on which we want to perform object detection, which costs a large time overhead. In the new code, the session is only run once: the first image will take some time, but the subsequent images will be processed very quickly.

In part IV, we ended by completing the training of our faster R-CNN model. Since we ran 2000 training steps, the last produced model checkpoint will be model.ckpt-2000. We need to make a frozen graph out of it to be able to use it for prediction.

Freezing the graph

Let’s go into the command line, cmd.exe. Remember to activate the virtual environment if you started with one, as instructed earlier.

cd C:\Users\acer\Desktop\adhoc\myproject\Lib\site-packages\tensorflow\models\research
SET INPUT_TYPE=image_tensor
SET TRAINED_CKPT_PREFIX="C:\Users\acer\Desktop\adhoc\myproject\models\model\model.ckpt-2000"
SET PIPELINE_CONFIG_PATH="C:\Users\acer\Desktop\adhoc\myproject\models\model\faster_rcnn_resnet101_coco.config"
SET EXPORT_DIR="C:\Users\acer\Desktop\adhoc\myproject\models\export"
python object_detection/export_inference_graph.py --input_type=%INPUT_TYPE% --pipeline_config_path=%PIPELINE_CONFIG_PATH% --trained_checkpoint_prefix=%TRAINED_CKPT_PREFIX% --output_directory=%EXPORT_DIR%

Upon successful completion, the following will be produced in the export directory.

adhoc/myproject/models/export
+ saved_model
  + variables
  - saved_model.pb
+ checkpoint
+ frozen_inference_graph.pb
+ pipeline.config
+ model.ckpt.data-00000-of-00001
+ model.ckpt.index
+ model.ckpt.meta

Notice that three ckpt files are created. We can use these for further training by replacing the 3 ckpt files from part 4.

frozen_inference_graph.pb is the file we will be using for prediction. We just need to run the following python file with a suitable configuration. Create the following directory and put all the images containing butterflies or bees that you want the algorithm to detect into the folder img_predict. I will put in 4 images: 2 from images/test and 2 completely new pictures, from neither images/test nor images/train (but also from https://www.pexels.com/). These images are named predict1.png, predict2.png, predict3.png and predict4.png.

adhoc/myproject/
+ ...
+ img_predict

Finally, to perform prediction, just run the following using cmd.exe after moving into the adhoc/myproject folder where we placed our prediction.py (see the script below).

python prediction.py

and predict1_MARKED.png, for example, will be produced in img_predict, with boxes showing the detected object, either butterfly or bee.

Most of the configuration that needs to be done is in the path variables near the top of the script. The variable TEST_IMAGE_NAMES contains the names of the files we are going to predict on. You can rename the images or just change the variable. Note that in this code the variable filetype stores the file type of the images we are predicting, so each run can only handle images of one type. Of course we can do better; modify the script accordingly (one possible tweak is sketched after the output below).

prediction.py

# from distutils.version import StrictVersion
import os, sys,tarfile, zipfile
import numpy as np
import tensorflow as tf
import six.moves.urllib as urllib
from PIL import Image
from io import StringIO
from matplotlib import pyplot as plt
from collections import defaultdict
from object_detection.utils import ops as utils_ops

# Paths settings
THE_PATH = "C:/Users/acer/Desktop/adhoc/myproject/Lib/site-packages/tensorflow/models/research"
sys.path.append(THE_PATH)
sys.path.append(THE_PATH+"/object_detection")
PATH_TO_FROZEN_GRAPH = "C:/Users/acer/Desktop/adhoc/myproject/models/export/frozen_inference_graph.pb"
PATH_TO_LABELS = "C:/Users/acer/Desktop/adhoc/myproject/data/butterfly_bee_label_map.pbtxt"
PATH_TO_TEST_IMAGES_DIR = 'C:/Users/acer/Desktop/adhoc/myproject/img_predict'

TEST_IMAGE_NAMES = ["predict1","predict2","predict3","predict4"]
filetype = '.png'
TEST_IMAGE_PATHS = [''.join((PATH_TO_TEST_IMAGES_DIR, '\\', x, filetype)) for x in TEST_IMAGE_NAMES]
# print("test image path = ", TEST_IMAGE_PATHS)
IMAGE_SIZE = (12, 8) # Size, in inches, of the output images.
NUM_CLASSES = 90

from utils import label_map_util
from utils import visualization_utils as vis_util
sys.path.append("..")
# MODEL_NAME = 'faster_rcnn_resnet101_pets'

detection_graph = tf.Graph()
with detection_graph.as_default():
  od_graph_def = tf.GraphDef()
  with tf.gfile.GFile(PATH_TO_FROZEN_GRAPH, 'rb') as fid:
    serialized_graph = fid.read()
    od_graph_def.ParseFromString(serialized_graph)
    tf.import_graph_def(od_graph_def, name='')


label_map  = label_map_util.load_labelmap(PATH_TO_LABELS)
categories = label_map_util.convert_label_map_to_categories(label_map, max_num_classes=NUM_CLASSES, use_display_name=True)
category_index = label_map_util.create_category_index(categories)
print(category_index)

def load_image_into_numpy_array(image):
  (im_width, im_height) = image.size
  return np.array(image.getdata()).reshape(
      (im_height, im_width, 3)).astype(np.uint8)


def run_inference_for_single_image(image, graph):
  with graph.as_default():
    with tf.Session() as sess:
      # Get handles to input and output tensors
      ops = tf.get_default_graph().get_operations()
      all_tensor_names = {output.name for op in ops for output in op.outputs}
      tensor_dict = {}
      for key in [
          'num_detections', 'detection_boxes', 'detection_scores',
          'detection_classes', 'detection_masks'
      ]:
        tensor_name = key + ':0'
        if tensor_name in all_tensor_names:
          tensor_dict[key] = tf.get_default_graph().get_tensor_by_name(
              tensor_name)
      if 'detection_masks' in tensor_dict:
        # The following processing is only for single image
        detection_boxes = tf.squeeze(tensor_dict['detection_boxes'], [0])
        detection_masks = tf.squeeze(tensor_dict['detection_masks'], [0])
        # Reframe is required to translate mask from box coordinates to image coordinates and fit the image size.
        real_num_detection = tf.cast(tensor_dict['num_detections'][0], tf.int32)
        detection_boxes = tf.slice(detection_boxes, [0, 0], [real_num_detection, -1])
        detection_masks = tf.slice(detection_masks, [0, 0, 0], [real_num_detection, -1, -1])
        detection_masks_reframed = utils_ops.reframe_box_masks_to_image_masks(
            detection_masks, detection_boxes, image.shape[0], image.shape[1])
        detection_masks_reframed = tf.cast(
            tf.greater(detection_masks_reframed, 0.5), tf.uint8)
        # Follow the convention by adding back the batch dimension
        tensor_dict['detection_masks'] = tf.expand_dims(
            detection_masks_reframed, 0)
      image_tensor = tf.get_default_graph().get_tensor_by_name('image_tensor:0')

      # Run inference
      output_dict = sess.run(tensor_dict,
                             feed_dict={image_tensor: np.expand_dims(image, 0)})

      # all outputs are float32 numpy arrays, so convert types as appropriate
      output_dict['num_detections'] = int(output_dict['num_detections'][0])
      output_dict['detection_classes'] = output_dict[
          'detection_classes'][0].astype(np.uint8)
      output_dict['detection_boxes'] = output_dict['detection_boxes'][0]
      output_dict['detection_scores'] = output_dict['detection_scores'][0]
      if 'detection_masks' in output_dict:
        output_dict['detection_masks'] = output_dict['detection_masks'][0]
  return output_dict

for image_path, image_name in zip(TEST_IMAGE_PATHS, TEST_IMAGE_NAMES):
  image = Image.open(image_path)
  # the array based representation of the image will be used later in order to prepare the
  # result image with boxes and labels on it.
  image_np = load_image_into_numpy_array(image)
  # Expand dimensions since the model expects images to have shape: [1, None, None, 3]
  image_np_expanded = np.expand_dims(image_np, axis=0)
  # Actual detection.
  output_dict = run_inference_for_single_image(image_np, detection_graph)
  # Visualization of the results of a detection.
  vis_util.visualize_boxes_and_labels_on_image_array(
      image_np,
      output_dict['detection_boxes'],
      # each element in detection_boxes is [ymin, xmin, ymax, xmax], in
      # normalized coordinates; to recover pixel coordinates:
      #   im_width, im_height = image.size
      #   if use_normalized_coordinates:
      #     (left, right, top, bottom) = (xmin * im_width, xmax * im_width,
      #                                   ymin * im_height, ymax * im_height)
      output_dict['detection_classes'],
      output_dict['detection_scores'],
      category_index,
      instance_masks=output_dict.get('detection_masks'),
      use_normalized_coordinates=True,
      line_thickness=2,
      min_score_thresh = 0.4)
  # print("detection_boxes:")
  # print(output_dict['detection_boxes'])
  # print(type(output_dict['detection_boxes']),len(output_dict['detection_boxes']))
  # print('detection_classes')
  # print(output_dict['detection_classes'])
  # print(type(output_dict['detection_classes']),len(output_dict['detection_classes']))
  # print('detection_scores')
  # print(output_dict['detection_scores'], len(output_dict['detection_scores']))
  print('\n**************** detection_scores\n')
  print(output_dict['detection_scores'][:10])
  plt.figure(figsize=IMAGE_SIZE)
  # plt.imshow(image_np)
  plt.imsave(''.join((PATH_TO_TEST_IMAGES_DIR, '\\',image_name,"_MARKED", filetype)), image_np)

Here are the outputs; pretty good, I would say.

[image: predict.png]

Note: the commented code is there to assist you with the output of the prediction. For example, if you would like to extract the coordinates of the rectangles that show the positions of the butterflies or the bees, you can obtain them from output_dict['detection_boxes']. Other information is stored in the dictionary output_dict as well.
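
As one possible tweak, hinted at above and untested against this exact setup, the fixed TEST_IMAGE_NAMES and filetype pair could be replaced with a glob over img_predict so that mixed file types are handled in one run:

import glob, os

# Hypothetical replacement for the fixed TEST_IMAGE_NAMES/filetype pair in
# prediction.py above; collects every png/jpg in img_predict, skipping
# files already marked by a previous run.
TEST_IMAGE_PATHS = [
    p for p in glob.glob(os.path.join(PATH_TO_TEST_IMAGES_DIR, '*'))
    if p.lower().endswith(('.png', '.jpg', '.jpeg')) and '_MARKED' not in p
]
TEST_IMAGE_NAMES = [os.path.splitext(os.path.basename(p))[0] for p in TEST_IMAGE_PATHS]
# the plt.imsave call at the end would then need each file's own extension
# instead of the single filetype variable.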

You can play around with different models. But that’s it for now, cheers!