Object Detection using Tensorflow: bee and butterfly Part IV


Object Detection using Tensorflow: bee and butterflies

  1. Part 1: set up tensorflow in a virtual environment
  2. adhoc functions
  3. Part 2: preparing annotation in PASCAL VOC format
  4. Part 3: preparing tfrecord files
  5. more scripts
  6. Part 4: start training our machine learning algorithm!
  7. COCO API for Windows
  8. Part 5: perform object detection

We have prepared tfrecord files, which are basically just the images and annotations bundled into a format that we can feed into our tensorflow algorithm. Now we start the training.

Before proceeding, we need the COCO API for Python. It is given here, though the instructions given are for Linux. See here for the instructions to set it up on Windows. Once done, copy the entire pycocotools folder from inside PythonAPI into the following folder.
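As a quick sanity check after copying, the snippet below (our own helper, not part of the official instructions) verifies that Python can locate pycocotools:

```python
import importlib.util

def coco_api_available():
    """Return True if Python can find the pycocotools package."""
    return importlib.util.find_spec("pycocotools") is not None

# Run this from inside the virtual environment; True means the copy worked.
print(coco_api_available())
```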


Some recap: in Part I, we set up the label map but have not yet configured the model we will use.


item {
  id: 1
  name: 'butterfly'
}

item {
  id: 2
  name: 'bee'
}
Download the pre-trained model by going to this link and clicking “Downloading a COCO-pretrained Model for Transfer Learning”, or download it directly here. Unzip the .tar file (download WinRAR if your system cannot unzip it); among the unzipped items there will be three checkpoint files (.ckpt). Move them into our model directory so that we have:

+ faster_rcnn_resnet101_coco.config
+ model.ckpt.data-00000-of-00001
+ model.ckpt.index
+ model.ckpt.meta
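Before editing the config, it may help to confirm all four files are actually in place. A small sketch (MODEL_DIR is the path used throughout this post):

```python
import os

MODEL_DIR = r"C:\Users\acer\Desktop\adhoc\myproject\models\model"
EXPECTED = [
    "faster_rcnn_resnet101_coco.config",
    "model.ckpt.data-00000-of-00001",
    "model.ckpt.index",
    "model.ckpt.meta",
]

def missing_files(model_dir, expected=EXPECTED):
    """Return the expected files that are not present in model_dir."""
    return [f for f in expected
            if not os.path.isfile(os.path.join(model_dir, f))]

# An empty list means everything is in place.
print(missing_files(MODEL_DIR))
```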

We will now configure the PATH_TO_BE_CONFIGURED placeholders in faster_rcnn_resnet101_coco.config inside the folder adhoc/myproject/models/model/. As the name suggests, we are using Faster R-CNN, regions with convolutional neural network features, by Ross Girshick et al. There are five PATH_TO_BE_CONFIGURED placeholders, each pointing to a corresponding file.

train_config: {
  # fine_tune_checkpoint: "PATH_TO_BE_CONFIGURED/model.ckpt"
  fine_tune_checkpoint: "C:/Users/acer/Desktop/adhoc/myproject/models/model/model.ckpt"
  ...
}

train_input_reader: {
  tf_record_input_reader {
    # input_path: "PATH_TO_BE_CONFIGURED/mscoco_train.record-?????-of-00100"
    input_path: "C:/Users/acer/Desktop/adhoc/myproject/data/train.record"
  }
  # label_map_path: "PATH_TO_BE_CONFIGURED/mscoco_label_map.pbtxt"
  label_map_path: "C:/Users/acer/Desktop/adhoc/myproject/data/butterfly_bee_label_map.pbtxt"
}

eval_input_reader: {
  tf_record_input_reader {
    # input_path: "PATH_TO_BE_CONFIGURED/mscoco_val.record-?????-of-00010"
    input_path: "C:/Users/acer/Desktop/adhoc/myproject/data/test.record"
  }
  # label_map_path: "PATH_TO_BE_CONFIGURED/mscoco_label_map.pbtxt"
  label_map_path: "C:/Users/acer/Desktop/adhoc/myproject/data/butterfly_bee_label_map.pbtxt"
  shuffle: false
  num_readers: 1
}
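Editing the five placeholders by hand works fine, but it can also be scripted. A hedged sketch: `fill_placeholders` (our own helper, not part of the Object Detection API) substitutes each PATH_TO_BE_CONFIGURED occurrence, in order, with the directory you supply:

```python
def fill_placeholders(config_text, dirs):
    """Replace each PATH_TO_BE_CONFIGURED, in order, with the matching directory."""
    parts = config_text.split("PATH_TO_BE_CONFIGURED")
    if len(parts) != len(dirs) + 1:
        raise ValueError("expected %d placeholders" % len(dirs))
    out = parts[0]
    for directory, rest in zip(dirs, parts[1:]):
        out += directory + rest
    return out

# Example on a single line of the config:
line = 'fine_tune_checkpoint: "PATH_TO_BE_CONFIGURED/model.ckpt"'
print(fill_placeholders(line, ["C:/Users/acer/Desktop/adhoc/myproject/models/model"]))
# fine_tune_checkpoint: "C:/Users/acer/Desktop/adhoc/myproject/models/model/model.ckpt"
```

Note the forward slashes in the replacement paths; the reason is explained under the possible errors at the end of this post.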

Let us start training! Here we are using a small number of training and evaluation steps, just to finish the training fast, of course at the expense of accuracy: 2000 training steps instead of the NUM_TRAIN_STEPS=50000 and NUM_EVAL_STEPS=2000 that the official tutorial recommends. Go to the command line, cmd.exe, and make sure all four variables the command references are SET (pick your own small NUM_EVAL_STEPS; 50 below is just an example):

cd "C:\Users\acer\Desktop\adhoc\myproject\Lib\site-packages\tensorflow\models\research"

SET PIPELINE_CONFIG_PATH="C:\Users\acer\Desktop\adhoc\myproject\models\model\faster_rcnn_resnet101_coco.config"
SET MODEL_DIR="C:\Users\acer\Desktop\adhoc\myproject\models\model"
SET NUM_TRAIN_STEPS=2000
SET NUM_EVAL_STEPS=50
echo %MODEL_DIR%
python object_detection/model_main.py --pipeline_config_path=%PIPELINE_CONFIG_PATH% --model_dir=%MODEL_DIR% --num_train_steps=%NUM_TRAIN_STEPS% --num_eval_steps=%NUM_EVAL_STEPS% --alsologtostderr

Some possible errors that may arise are listed at the end of this post***.

Many warnings may pop up, but it will be fairly obvious whether the training is ongoing (and not terminating due to some error). If you train your model on a laptop like mine, with only a single NVIDIA GeForce GTX 1050, you might run out of memory as well. From the task manager, my utilization profile looks like this: the GPU is used in consistent large spikes (more than 25% each), and CPU resources are heavily consumed.


See the next part on how to use the trained model to perform object detection after the training is completed.

Note: I noticed some strange behaviour. Sometimes the training stalls (the task manager shows no resource consumption) and only proceeds when I press enter in the command line; this is likely the console's Quick Edit text selection pausing the output. To make sure this does not prevent us from completing the training, press enter several times in advance at the start of training.

Monitoring progress

We can monitor progress using tensorboard. Open another command line, cmd.exe, and enter the following:

SET MODEL_DIR="C:\Users\acer\Desktop\adhoc\myproject\models\model"
tensorboard --logdir=%MODEL_DIR%

then open your web browser and type in the URL, made up of your computer's name and port 6006. For mine, it is http://NITROGIGA:6006. About 15 minutes into the training, tensorboard shows the following:


Okay, it seems the algorithm is detecting some little yellow parts of the flowers as bees as well…

The following is an indication that the training is in progress.

creating index...
index created!
creating index...
index created!
Running per image evaluation...
Evaluate annotation type *bbox*
DONE (t=0.12s).
Accumulating evaluation results...
DONE (t=0.04s).
Average Precision (AP) @[ IoU=0.50:0.95 | area= all | maxDets=100 ] = 0.015
Average Precision (AP) @[ IoU=0.50 | area= all | maxDets=100 ] = 0.056
Average Precision (AP) @[ IoU=0.75 | area= all | maxDets=100 ] = 0.003
Average Precision (AP) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.000
Average Precision (AP) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.000
Average Precision (AP) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.052
Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets= 1 ] = 0.040
Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets= 10 ] = 0.087
Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets=100 ] = 0.093
Average Recall (AR) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.000
Average Recall (AR) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.000
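The IoU column in the output above is the intersection-over-union between a predicted box and a ground-truth box; AP@IoU=0.50, for example, counts a detection as correct when this overlap is at least 0.5. A minimal sketch of the computation for boxes given as (xmin, ymin, xmax, ymax):

```python
def iou(box_a, box_b):
    """Intersection-over-union of two axis-aligned boxes (xmin, ymin, xmax, ymax)."""
    ax0, ay0, ax1, ay1 = box_a
    bx0, by0, bx1, by1 = box_b
    # Overlap rectangle, clipped to zero when the boxes do not intersect.
    ix0, iy0 = max(ax0, bx0), max(ay0, by0)
    ix1, iy1 = min(ax1, bx1), min(ay1, by1)
    inter = max(0.0, ix1 - ix0) * max(0.0, iy1 - iy0)
    area_a = (ax1 - ax0) * (ay1 - ay0)
    area_b = (bx1 - bx0) * (by1 - by0)
    union = area_a + area_b - inter
    return inter / union if union else 0.0

print(iou((0, 0, 2, 2), (1, 1, 3, 3)))  # overlap 1 / union 7 ≈ 0.1428
```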

In the directory adhoc/myproject/models/model, more checkpoint files will be created in batches of three: data, index and meta. For example, at the 58th (out of the specified 2000) training steps, these are created:


Update: the training lasted about 4 hours.
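If you want to know programmatically how far the training has gone, the step number is embedded in the checkpoint filenames. A small sketch (the filenames are examples in the style TensorFlow writes):

```python
import re

def latest_step(filenames):
    """Return the highest step number among model.ckpt-<step>.index files."""
    steps = []
    for name in filenames:
        match = re.match(r"model\.ckpt-(\d+)\.index$", name)
        if match:
            steps.append(int(match.group(1)))
    return max(steps, default=None)

# In practice, pass os.listdir(MODEL_DIR) instead of this example list.
print(latest_step(["checkpoint", "model.ckpt-0.index", "model.ckpt-58.index"]))  # 58
```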

*** Possible errors

If you see errors like

(unicode error) 'unicodeescape' codec can't decode bytes in position 2-3: truncated \UXXXXXXXX escape

do check the .config files. When setting the PATH_TO_BE_CONFIGURED paths, use either double backslashes \\ or forward slashes /. Using single backslashes \ will give an error. This is just a problem of character escaping in strings.
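To see why the single backslash fails: in an escaped string, \U begins an 8-hex-digit Unicode escape, so a path like C:\Users truncates it. A quick reproduction using Python's unicode_escape codec (exactly how the config path gets decoded internally is an assumption; the escape rule itself is standard Python):

```python
import codecs

# "\U" must be followed by 8 hex digits; "sers..." is not, so decoding fails.
try:
    codecs.decode(r"C:\Users\acer\Desktop", "unicode_escape")
    failed = False
except UnicodeDecodeError as err:
    failed = "truncated" in str(err)
print(failed)  # True

# Forward slashes (or doubled backslashes) decode without trouble.
print(codecs.decode("C:/Users/acer/Desktop", "unicode_escape"))  # C:/Users/acer/Desktop
```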

*** Another possible error.

We successfully used tensorflow-gpu version 1.10.0. However, when trying version 1.11.0, our machine did not recognize the GPU.