Object Detection using Tensorflow: bee and butterflies
- Part 1: set up tensorflow in a virtual environment
- adhoc functions
- Part 2: preparing annotation in PASCAL VOC format
- Part 3: preparing tfrecord files
- more scripts
- Part 4: start training our machine learning algorithm!
- COCO API for Windows
- Part 5: perform object detection
In part II we created a directory of butterfly and bee images, together with the annotations marking where in each image a butterfly or a bee appears. Now we convert them into tfrecord files, i.e. the format that the tensorflow algorithm we use can read.
You are encouraged to create an adhoc script to automate this whole part as well. Our demonstrations will be semi-manual. This part follows the steps recommended here.
Train and test split
Create the following empty folders:
    ~/adhoc/keropb/butterflies_and_bees
      + Butterflies_train
      + Butterflies_test
      + Bees_train
      + Bees_test
      + ...
From Butterflies_CLONE, copy all images to Butterflies_train. From Butterflies_ANNOT, copy all xml files to the same Butterflies_train folder. Do the same for the bees. Now run the script do_move_a_fraction.py in the folder adhoc/keropb (we again make use of adhoc_functions.py from here).
    import adhoc_functions as af

    src = "C:\\Users\\acer\\Desktop\\adhoc\\keropb\\butterflies_and_bees\\Butterflies_train"
    tgt = "C:\\Users\\acer\\Desktop\\adhoc\\keropb\\butterflies_and_bees\\Butterflies_test"
    af.move_some_percent(src, tgt)

    src = "C:\\Users\\acer\\Desktop\\adhoc\\keropb\\butterflies_and_bees\\Bees_train"
    tgt = "C:\\Users\\acer\\Desktop\\adhoc\\keropb\\butterflies_and_bees\\Bees_test"
    af.move_some_percent(src, tgt)
The script above moves 10% of the images and annotations from the train folders to the corresponding test folders. We hold out roughly 10% of all the images to test whether the model trained on the remaining 90% performs well. As of now, I have 291 images of bees and butterflies for training and 31 for testing (yes, ideally we should have more).
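The source of adhoc_functions.py is not reproduced in this part, so the following is only a sketch of what a helper like move_some_percent might do. The name move_fraction, the fraction and seed parameters, and the list of image extensions are all assumptions made for illustration. The point it shows is that each image must travel together with its same-named xml annotation.

```python
import random
import shutil
from pathlib import Path

# Assumption: images use one of these extensions.
IMAGE_EXTENSIONS = {".png", ".jpg", ".jpeg"}


def move_fraction(src, tgt, fraction=0.1, seed=0):
    """Move a random fraction of the images in src, with their .xml files, to tgt.

    Hypothetical stand-in for af.move_some_percent; returns the moved image names.
    """
    src, tgt = Path(src), Path(tgt)
    tgt.mkdir(parents=True, exist_ok=True)

    # Sort first so that a given seed always selects the same files.
    images = sorted(p for p in src.iterdir() if p.suffix.lower() in IMAGE_EXTENSIONS)
    n_move = round(len(images) * fraction)
    chosen = random.Random(seed).sample(images, n_move)

    moved = []
    for image in chosen:
        annotation = image.with_suffix(".xml")
        shutil.move(str(image), str(tgt / image.name))
        # Keep the annotation next to its image.
        if annotation.exists():
            shutil.move(str(annotation), str(tgt / annotation.name))
        moved.append(image.name)
    return moved
```

Calling move_fraction(src, tgt) with the src and tgt paths from the script above would move about one image in ten. Fixing the seed keeps the split reproducible between runs.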
Now create the following directory.
    C:\Users\acer\Desktop\adhoc\myproject\images
      + train
      + test
Put all files from Butterflies_train and Bees_train into images/train and all files from Butterflies_test and Bees_test into images/test.
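If you prefer to script this copying step, a minimal sketch could look like the following. The function name copy_all is hypothetical. It refuses to overwrite an existing file, on the assumption that a butterfly file and a bee file sharing a name would be a mistake worth catching early.

```python
import shutil
from pathlib import Path


def copy_all(sources, destination):
    """Copy every file from each source folder into destination.

    Raises FileExistsError instead of silently overwriting a same-named file.
    Returns the number of files copied.
    """
    destination = Path(destination)
    destination.mkdir(parents=True, exist_ok=True)
    copied = 0
    for source in sources:
        for path in sorted(Path(source).iterdir()):
            if not path.is_file():
                continue  # skip subfolders
            target = destination / path.name
            if target.exists():
                raise FileExistsError(f"{target} already exists")
            shutil.copy2(str(path), str(target))
            copied += 1
    return copied
```

For example, copy_all([butterflies_train, bees_train], images_train) would fill images/train, where the three variables stand for the corresponding folder paths on your machine.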
Conversion to tfrecords
The following step is quite wasteful of disk space, since it duplicates every file. Copy all files from images/train and images/test into the images folder itself. We will need them there.
Add the following file to the directory
    adhoc/myproject
      + ...
      + xml_to_csv.py
Note that this file, as shown here, needs to be configured. The variable image_path has to point to the train and test folders in the images folder in adhoc/myproject. See the instructions in the link. Now go to the command line cmd.exe.
    cd "C:\Users\acer\Desktop\adhoc\myproject"
    python xml_to_csv.py
Both test_labels.csv and train_labels.csv will be produced in adhoc/myproject/data if the process is successful. Check that the csv files contain something like this
    filename, width, height, class, xmin, ymin, xmax, ymax
    imgBEE_107.png, 524, 350, butterfly, 151, 9, 424, 224
    ...
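To make the conversion less of a black box, here is a rough sketch of how PASCAL VOC xml files can be flattened into rows with the columns shown above. It is not the xml_to_csv.py script from the link. The function names voc_rows and folder_to_csv are made up for illustration, and the sketch uses only the standard library.

```python
import csv
import xml.etree.ElementTree as ET
from pathlib import Path

COLUMNS = ["filename", "width", "height", "class", "xmin", "ymin", "xmax", "ymax"]


def voc_rows(xml_path):
    """Yield one row per <object> element in a PASCAL VOC annotation file."""
    root = ET.parse(str(xml_path)).getroot()
    filename = root.findtext("filename")
    width = int(root.findtext("size/width"))
    height = int(root.findtext("size/height"))
    for obj in root.iter("object"):
        box = obj.find("bndbox")
        yield [
            filename, width, height, obj.findtext("name"),
            int(box.findtext("xmin")), int(box.findtext("ymin")),
            int(box.findtext("xmax")), int(box.findtext("ymax")),
        ]


def folder_to_csv(xml_folder, csv_path):
    """Write all annotations found in xml_folder to csv_path; return the row count."""
    rows = []
    for xml_file in sorted(Path(xml_folder).glob("*.xml")):
        rows.extend(voc_rows(xml_file))
    with open(csv_path, "w", newline="") as handle:
        writer = csv.writer(handle)
        writer.writerow(COLUMNS)
        writer.writerows(rows)
    return len(rows)
```

An image with several annotated insects produces several rows that share the same filename, which is why the csv can have more rows than there are images.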
Also, add the following file to the directory
    adhoc\myproject\Lib\site-packages\tensorflow\models\research\object_detection
      + ...
      + generate_tf_records.py
This file is also shown here, and needs to be configured similarly. In main() of the script, adjust the variables path and output_path (see green highlight in the link) to the following.
    path = os.path.join("C:\\Users\\acer\\Desktop\\adhoc\\myproject\\", 'images')
    output_path = "C:\\Users\\acer\\Desktop\\adhoc\\myproject\\"
Also, edit the following function so that it corresponds to the label map, in case you want to add more types of insects (see the orange highlight in the link).
    def class_text_to_int(row_label):
        # The integer ids returned here must match the ids in the label map.
        if row_label == 'butterfly':
            return 1
        elif row_label == 'bee':
            return 2
        else:
            return None
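If you expect to add more insect types, one possible variation is to keep the classes in a dictionary so that adding a class is a one-line change. This is only a sketch: the name LABEL_MAP is an assumption, and the ids still have to agree with the label map used for training.

```python
# Assumption: these ids mirror the label map; extend the dict to add classes.
LABEL_MAP = {
    "butterfly": 1,
    "bee": 2,
}


def class_text_to_int(row_label):
    """Return the integer id for a class name, or None if the name is unknown."""
    return LABEL_MAP.get(row_label)
```

Like the if/elif version, this returns None for an unknown label, so a misspelled class name in the csv files is worth checking for before generating the tfrecord files.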
Now go to the command line cmd.exe and cd into the directory tensorflow\models\research\object_detection, where generate_tf_records.py was added.
Create the tfrecord files using the following. Of course the path arguments output_path and csv_input must be changed to match your own setup.
    python generate_tf_records.py --csv_input="C:/Users/acer/Desktop/adhoc/myproject/data/train_labels.csv" --output_path="C:/Users/acer/Desktop/adhoc/myproject/data/train.record"
    python generate_tf_records.py --csv_input="C:/Users/acer/Desktop/adhoc/myproject/data/test_labels.csv" --output_path="C:/Users/acer/Desktop/adhoc/myproject/data/test.record"
Both test.record and train.record will be produced in adhoc/myproject/data.
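As a quick sanity check, you can count how many records ended up in each file without importing tensorflow. The sketch below relies on the TFRecord framing, in which each record is an 8-byte little-endian length, a 4-byte checksum of the length, the payload, and a 4-byte checksum of the payload. The function name count_tfrecords is made up, and the checksums are skipped rather than verified.

```python
import struct


def count_tfrecords(path):
    """Count the records in a TFRecord file by walking its length-prefixed framing.

    Checksums are skipped, not verified; a truncated file raises ValueError.
    """
    count = 0
    with open(path, "rb") as handle:
        while True:
            header = handle.read(8)
            if not header:
                return count  # clean end of file
            if len(header) < 8:
                raise ValueError("truncated length header")
            (length,) = struct.unpack("<Q", header)
            # Skip the length checksum (4 bytes), the payload, and its checksum (4 bytes).
            skipped = handle.read(4 + length + 4)
            if len(skipped) != length + 8:
                raise ValueError("truncated record")
            count += 1
```

If generate_tf_records.py writes one record per image (an assumption about that script, which is not shown here), then count_tfrecords on train.record and test.record should return the number of training and test images respectively.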
See the next part, part IV, for training and prediction.