【TensorFlow】Training with the TensorFlow Object Detection API


The TensorFlow Object Detection API has been released, so
I tried training it on the PASCAL Visual Object Classes dataset.

TensorFlow Object Detection API
https://github.com/tensorflow/models/tree/master/object_detection

The PASCAL Visual Object Classes Homepage
http://host.robots.ox.ac.uk/pascal/VOC/index.html

■Environment

Ubuntu 14.04
Python 2.7 (it did not run under Python 3, so I used Python 2)
TensorFlow 1.2.0

■Environment Setup

Install Anaconda

$ git clone git://github.com/yyuu/pyenv.git ~/.pyenv
$ git clone https://github.com/yyuu/pyenv-pip-rehash.git ~/.pyenv/plugins/pyenv-pip-rehash
$ echo 'export PYENV_ROOT="$HOME/.pyenv"' >> ~/.bashrc
$ echo 'export PATH="$PYENV_ROOT/bin:$PATH"' >> ~/.bashrc
$ echo 'eval "$(pyenv init -)"' >> ~/.bashrc
$ source ~/.bashrc
$ pyenv install anaconda3-4.3.0
$ pyenv global anaconda3-4.3.0
$ echo 'export PATH="$PYENV_ROOT/versions/anaconda3-4.3.0/bin:$PATH"' >> ~/.bashrc
$ source ~/.bashrc
$ python --version

Install the remaining dependencies

$ conda create -n py27-cpu python=2.7 anaconda
$ source activate py27-cpu

※CPU version
$ pip install --ignore-installed --upgrade https://storage.googleapis.com/tensorflow/linux/cpu/tensorflow-1.2.0-cp27-none-linux_x86_64.whl

※GPU version
$ pip install --ignore-installed --upgrade https://storage.googleapis.com/tensorflow/linux/gpu/tensorflow_gpu-1.2.0-cp27-none-linux_x86_64.whl

$ conda install -y -c vdbwrair imagemagick
$ conda install -y -c menpo opencv3
$ conda install -y matplotlib 
$ conda install -y pillow
$ conda install -y h5py

$ sudo apt-get install libprotobuf-dev protobuf-compiler python-pil python-lxml


Download tensorflow/models into a directory of your choice.
Replace the /path/to/ parts with paths from your own environment.


$ mkdir /path/to/tensorflow
$ cd /path/to/tensorflow
$ git clone https://github.com/tensorflow/models.git

Compile the proto files with the Google Protocol Buffers compiler.
protoc must be version 3 or later; Ubuntu 14.04 ships version 2, so download version 3 separately.

$ cd ~/
$ curl -OL https://github.com/google/protobuf/releases/download/v3.2.0/protoc-3.2.0-linux-x86_64.zip
$ unzip protoc-3.2.0-linux-x86_64.zip -d protoc
$ echo "alias protoc='~/protoc/bin/protoc'" >> ~/.bashrc
$ source ~/.bashrc

$ echo 'export PYTHONPATH=$PYTHONPATH:/path/to/tensorflow/models:/path/to/tensorflow/models/slim' >> ~/.bashrc
$ source ~/.bashrc

※Running source ~/.bashrc deactivates the conda environment, so activate it again:
$ source activate py27-cpu

$ cd /path/to/tensorflow/models
$ protoc object_detection/protos/*.proto --python_out=.
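As a quick sanity check (a minimal stdlib sketch of my own, not part of the official tooling), you can verify that every .proto under object_detection/protos now has a matching generated _pb2.py:

```python
import glob
import os

def missing_pb2(proto_dir):
    """Return the .proto files in proto_dir that lack a generated _pb2.py."""
    missing = []
    for proto in glob.glob(os.path.join(proto_dir, '*.proto')):
        pb2 = proto[:-len('.proto')] + '_pb2.py'
        if not os.path.exists(pb2):
            missing.append(proto)
    return missing

# Example: missing_pb2('object_detection/protos') should return [] after protoc runs.
```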

Verify the setup

$ cd /path/to/tensorflow
$ python models/object_detection/builders/model_builder_test.py 
※If the test prints OK, the environment is ready.

■Preparing the Training Data

Download the VOC data into a directory of your choice.
Replace the /path/to/ parts with paths from your own environment.

$ mkdir /path/to/dataset
$ cd /path/to/dataset
$ mkdir VOCtest
$ mkdir VOCtrain
$ cd VOCtest
$ wget http://host.robots.ox.ac.uk/pascal/VOC/voc2007/VOCtest_06-Nov-2007.tar
$ tar xf VOCtest_06-Nov-2007.tar

$ cd ../VOCtrain

$ wget http://host.robots.ox.ac.uk/pascal/VOC/voc2012/VOCtrainval_11-May-2012.tar
$ wget http://host.robots.ox.ac.uk/pascal/VOC/voc2007/VOCtrainval_06-Nov-2007.tar
$ tar xf VOCtrainval_11-May-2012.tar
$ tar xf VOCtrainval_06-Nov-2007.tar
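Each image in VOCdevkit comes with an XML annotation, which is what create_pascal_tf_record.py consumes in the next step. A minimal stdlib sketch of that layout (the helper name is my own, not from the API):

```python
import xml.etree.ElementTree as ET

def parse_voc_annotation(xml_text):
    """Extract (class_name, xmin, ymin, xmax, ymax) tuples from a
    PASCAL VOC annotation XML string."""
    root = ET.fromstring(xml_text)
    boxes = []
    for obj in root.findall('object'):
        name = obj.find('name').text
        b = obj.find('bndbox')
        boxes.append((name,
                      int(b.find('xmin').text), int(b.find('ymin').text),
                      int(b.find('xmax').text), int(b.find('ymax').text)))
    return boxes
```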

$ cd /path/to/tensorflow/models/object_detection/
$ mkdir data/pascal_voc2007test
$ mkdir data/pascal_voc2007-2012train

Convert the VOC XML annotations into TFRecord files.
A FutureWarning may appear, but it can be ignored.

$ python ./create_pascal_tf_record.py --data_dir=/path/to/dataset/VOCtest/VOCdevkit \
 --year=VOC2007 \
 --set=test \
 --output_path=/path/to/tensorflow/models/object_detection/data/pascal_voc2007test/pascal.record
$ python ./create_pascal_tf_record.py --data_dir=/path/to/dataset/VOCtrain/VOCdevkit \
 --year=merged \
 --output_path=/path/to/tensorflow/models/object_detection/data/pascal_voc2007-2012train/pascal.record
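To sanity-check the generated files, you can count the records by walking the TFRecord framing (an 8-byte little-endian length, a 4-byte length CRC, the payload, and a 4-byte payload CRC). A stdlib-only sketch that skips CRC validation; the function name is my own:

```python
import struct

def count_tfrecords(path):
    """Count records in a TFRecord file by walking its framing:
    [8-byte LE length][4-byte CRC][payload][4-byte CRC] per record."""
    count = 0
    with open(path, 'rb') as f:
        while True:
            header = f.read(8)
            if len(header) < 8:
                break
            length, = struct.unpack('<Q', header)
            f.seek(4 + length + 4, 1)  # skip length CRC, payload, payload CRC
            count += 1
    return count

# Example: count_tfrecords('data/pascal_voc2007test/pascal.record')
```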

■Creating the Training Config

Create a new config for VOC, using
object_detection/samples/configs/ssd_inception_v2_pets.config as a reference.

$ cd /path/to/tensorflow/models/object_detection
$ vi ssd_inception_v2_voc.config
# SSD with Inception v2 configured for the PASCAL VOC dataset.
# Users should configure the fine_tune_checkpoint field in the train config as
# well as the label_map_path and input_path fields in the train_input_reader and
# eval_input_reader. Search for "PATH_TO_BE_CONFIGURED" to find the fields that
# should be configured.

model {
  ssd {
    num_classes: 21
    box_coder {
      faster_rcnn_box_coder {
        y_scale: 10.0
        x_scale: 10.0
        height_scale: 5.0
        width_scale: 5.0
      }
    }
    matcher {
      argmax_matcher {
        matched_threshold: 0.5
        unmatched_threshold: 0.5
        ignore_thresholds: false
        negatives_lower_than_unmatched: true
        force_match_for_each_row: true
      }
    }
    similarity_calculator {
      iou_similarity {
      }
    }
    anchor_generator {
      ssd_anchor_generator {
        num_layers: 6
        min_scale: 0.2
        max_scale: 0.95
        aspect_ratios: 1.0
        aspect_ratios: 2.0
        aspect_ratios: 0.5
        aspect_ratios: 3.0
        aspect_ratios: 0.3333
        reduce_boxes_in_lowest_layer: true
      }
    }
    image_resizer {
      fixed_shape_resizer {
        height: 300
        width: 300
      }
    }
    box_predictor {
      convolutional_box_predictor {
        min_depth: 0
        max_depth: 0
        num_layers_before_predictor: 0
        use_dropout: false
        dropout_keep_probability: 0.8
        kernel_size: 3
        box_code_size: 4
        apply_sigmoid_to_scores: false
        conv_hyperparams {
          activation: RELU_6,
          regularizer {
            l2_regularizer {
              weight: 0.00004
            }
          }
          initializer {
            truncated_normal_initializer {
              stddev: 0.03
              mean: 0.0
            }
          }
        }
      }
    }
    feature_extractor {
      type: 'ssd_inception_v2'
      min_depth: 16
      depth_multiplier: 1.0
      conv_hyperparams {
        activation: RELU_6,
        regularizer {
          l2_regularizer {
            weight: 0.00004
          }
        }
        initializer {
          truncated_normal_initializer {
            stddev: 0.03
            mean: 0.0
          }
        }
        batch_norm {
          train: true,
          scale: true,
          center: true,
          decay: 0.9997,
          epsilon: 0.001,
        }
      }
    }
    loss {
      classification_loss {
        weighted_sigmoid {
          anchorwise_output: true
        }
      }
      localization_loss {
        weighted_smooth_l1 {
          anchorwise_output: true
        }
      }
      hard_example_miner {
        num_hard_examples: 3000
        iou_threshold: 0.99
        loss_type: CLASSIFICATION
        max_negatives_per_positive: 3
        min_negatives_per_image: 0
      }
      classification_weight: 1.0
      localization_weight: 1.0
    }
    normalize_loss_by_num_matches: true
    post_processing {
      batch_non_max_suppression {
        score_threshold: 1e-8
        iou_threshold: 0.6
        max_detections_per_class: 100
        max_total_detections: 100
      }
      score_converter: SIGMOID
    }
  }
}

train_config: {
  batch_size: 24
  optimizer {
    rms_prop_optimizer: {
      learning_rate: {
        exponential_decay_learning_rate {
          initial_learning_rate: 0.004
          decay_steps: 800720
          decay_factor: 0.95
        }
      }
      momentum_optimizer_value: 0.9
      decay: 0.9
      epsilon: 1.0
    }
  }
  # fine_tune_checkpoint: "/path/to/tensorflow/models/object_detection/models/model.ckpt-273067"
  from_detection_checkpoint: true
  data_augmentation_options {
    random_horizontal_flip {
    }
  }
  data_augmentation_options {
    ssd_random_crop {
    }
  }
}

train_input_reader: {
  tf_record_input_reader {
    input_path: "/path/to/tensorflow/models/object_detection/data/pascal_voc2007-2012train/pascal.record"
  }
  label_map_path: "/path/to/tensorflow/models/object_detection/data/pascal_label_map.pbtxt"
}

eval_config: {
  num_examples: 2000
}

eval_input_reader: {
  tf_record_input_reader {
    input_path: "/path/to/tensorflow/models/object_detection/data/pascal_voc2007test/pascal.record"
  }
  label_map_path: "/path/to/tensorflow/models/object_detection/data/pascal_label_map.pbtxt"
}

「num_classes: 21」 is the 20 VOC classes plus one background class.
Replace the /path/to/ parts with paths from your own environment.
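pascal_label_map.pbtxt maps those 20 class IDs to names. A rough stdlib parser for its simple item { id: N name: '...' } layout (a sketch assuming that layout; the function name is my own):

```python
import re

def load_label_map(text):
    """Parse a simple object_detection label map (.pbtxt) into {id: name}."""
    label_map = {}
    for body in re.findall(r"item\s*{([^}]*)}", text):
        id_m = re.search(r"id:\s*(\d+)", body)
        name_m = re.search(r"name:\s*'([^']*)'", body)
        if id_m and name_m:
            label_map[int(id_m.group(1))] = name_m.group(1)
    return label_map
```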

■Training

$ python ./train.py \
 --logtostderr \
 --train_dir=/path/to/tensorflow/models/object_detection/models \
 --pipeline_config_path=ssd_inception_v2_voc.config

■Monitoring Training

Check training progress with TensorBoard.

$ tensorboard --logdir=/path/to/tensorflow/models/object_detection/models/

With the process above running, open port 6006 in a browser to view the dashboards.

■Exporting the Trained Model

$ cd /path/to/tensorflow/models/object_detection
$ python ./export_inference_graph.py \
 --input_type image_tensor \
 --pipeline_config_path ssd_inception_v2_voc.config \
 --checkpoint_path models/model.ckpt-403 \
 --inference_graph_path ssd_inception_v2_voc.pb

※Replace models/model.ckpt-403 with the path to whichever checkpoint file you want to export.
The resulting ssd_inception_v2_voc.pb can then be used to run detection on arbitrary images.

■Object Detection with the Trained Model

Start Jupyter and open 「object_detection_tutorial.ipynb」 in the browser.

$ cd /path/to/tensorflow/models/object_detection
$ jupyter notebook --ip=0.0.0.0 

Rewrite the 「Variables」 in 「Model preparation」 to point at the newly trained files.
Three values need to change:
* PATH_TO_CKPT
* PATH_TO_LABELS
* NUM_CLASSES

# What model to download.
MODEL_NAME = 'ssd_mobilenet_v1_coco_11_06_2017'
MODEL_FILE = MODEL_NAME + '.tar.gz'
DOWNLOAD_BASE = 'http://download.tensorflow.org/models/object_detection/'

# Path to frozen detection graph. This is the actual model that is used for the object detection.
PATH_TO_CKPT = '/path/to/tensorflow/models/object_detection/ssd_inception_v2_voc.pb'

# List of the strings that is used to add correct label for each box.
PATH_TO_LABELS = os.path.join('data', 'pascal_label_map.pbtxt')

NUM_CLASSES = 20

Comment out everything in the 「Download Model」 cell:

# opener = urllib.request.URLopener()
# opener.retrieve(DOWNLOAD_BASE + MODEL_FILE, MODEL_FILE)
# tar_file = tarfile.open(MODEL_FILE)
# for file in tar_file.getmembers():
#     file_name = os.path.basename(file.name)
#     if 'frozen_inference_graph.pb' in file_name:
#         tar_file.extract(file, os.getcwd())

With these changes in place, running the notebook should detect the dog and other objects in the sample images.
