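Introduction

The Object Detection API, released by Google, identifies the classes of multiple objects in an image and draws a bounding box around each one. This makes its purpose different from image classification with a model such as VGG. Unfortunately, the pretrained COCO models used here recognize only 80 object classes (label ids run from 1 to 90, with gaps). There is no tiger class (a tiger gets labelled as a dog) and no rooster class (a rooster gets labelled as a bird). Still, for anyone who wants bounding boxes drawn around objects, it is an excellent first choice.

A missing class such as rooster can of course be added by training on your own data, but that is a major undertaking.

The figure below shows the example used in this article.

Google's Object Detection API is available for both TensorFlow 1.x and TensorFlow 2.x. This article uses the TensorFlow 2.x version.

Installing packages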

pip install tensorflow Pillow opencv-python tf2-tensorflow-object-detection-api PyQt5

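Images fail to display

cv2.imshow() displays images normally. With plt.imshow(), however, following the official instructions produces the warning below and no figure appears.
UserWarning: Matplotlib is currently using agg, which is a non-GUI backend, so cannot show the figure.

The cause is that importing object_detection automatically switches matplotlib to the agg backend. agg renders off-screen without any GUI, so plt.show() warns that it cannot show the figure. To display the processed image, first install PyQt5 and then configure matplotlib as follows: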

pip install PyQt5

# Then enable the qt5agg backend in your code
import matplotlib
matplotlib.use('qt5agg')
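
If PyQt5 may or may not be installed on the target machine, the backend choice can be made defensively. The following is a minimal sketch, not part of the original setup: the helper names pick_backend and apply_backend are hypothetical, and it assumes that falling back to agg (saving figures to disk instead of showing them) is acceptable.

```python
import importlib.util


def pick_backend(preferred="qt5agg", fallback="agg", required_module="PyQt5"):
    """Return `preferred` if its GUI toolkit can be imported, else `fallback`."""
    if importlib.util.find_spec(required_module) is not None:
        return preferred
    return fallback


def apply_backend():
    """Select the matplotlib backend (hypothetical helper).

    Intended to be called after importing object_detection,
    since that import is what switches matplotlib to agg.
    """
    import matplotlib  # imported lazily so pick_backend() works without it

    backend = pick_backend()
    matplotlib.use(backend)
    return backend
```

Calling apply_backend() once before importing matplotlib.pyplot should then be enough; with the agg fallback, plt.savefig() still works even though plt.show() does not.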

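Downloading and loading the model

First look up the URL of the model you want to download on the following page:
https://github.com/tensorflow/models/blob/master/research/object_detection/g3doc/tf2_detection_zoo.md

keras.utils.get_file() downloads the model over the network as a .tar.gz file, stores it locally under C:\Users\{user Name}\.keras\datasets, and automatically extracts it into the "<model name>/saved_model" directory. On later runs it first checks whether the model already exists at that path and skips the download if it does. To force a fresh download, delete the related files under datasets.

Do not use keras.models.load_model here, because this model was produced by tf.saved_model.save. The model is 14G in size and takes about 100 to 120 seconds to download.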

import os
os.environ['TF_CPP_MIN_LOG_LEVEL']='2'
import tensorflow as tf
import time
import keras
def download_model(model_name, model_date):
    base_url = 'http://download.tensorflow.org/models/object_detection/tf2'
    # Download the archive and extract it (untar=True)
    model_dir = keras.utils.get_file(fname=model_name,
                                        origin=f'{base_url}/{model_date}/{model_name}.tar.gz',
                                        untar=True)
    return f'{str(model_dir)}/saved_model'
MODEL_DATE = '20200711'
MODEL_NAME = 'centernet_hg104_1024x1024_coco17_tpu-32'
model_path = download_model(MODEL_NAME, MODEL_DATE)
print('Loading model...', end='')
t1 = time.time()
# Do not use keras.models.load_model: this model was produced by tf.saved_model.save
#model = keras.models.load_model(model_path)
# Load the model from local disk; this takes a while
model = tf.saved_model.load(model_path)

t2 = time.time()
print(f'took {t2-t1} seconds.')
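
To see in advance which URL will be fetched and whether a model is already cached, the path logic can be reproduced without TensorFlow. This is a sketch under the assumptions stated in this article (archives extracted under ~/.keras/datasets/<model name>/saved_model); the helper names are hypothetical, and other Keras versions may lay out the cache differently.

```python
from pathlib import Path

BASE_URL = "http://download.tensorflow.org/models/object_detection/tf2"


def model_url(model_name, model_date):
    """Build the download URL the same way download_model() does."""
    return f"{BASE_URL}/{model_date}/{model_name}.tar.gz"


def cached_model_dir(model_name, cache_root=None):
    """Expected location of the extracted SavedModel (assumed cache layout)."""
    root = Path(cache_root) if cache_root else Path.home() / ".keras" / "datasets"
    return root / model_name / "saved_model"


def is_cached(model_name, cache_root=None):
    """True if the extracted model directory already exists."""
    return cached_model_dir(model_name, cache_root).is_dir()


name = "centernet_hg104_1024x1024_coco17_tpu-32"
print(model_url(name, "20200711"))
print(cached_model_dir(name), "cached:", is_cached(name))
```

Deleting the directory reported by cached_model_dir() (and the matching .tar.gz next to it) is what forces a fresh download on the next run.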

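Downloading and loading the labels

In the same way, tf.keras.utils.get_file() fetches the label text file, and label_map_util.create_category_index_from_labelmap() turns it into a label dictionary. The code below prints the dictionary so that its format is visible.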

import os
os.environ['TF_CPP_MIN_LOG_LEVEL'] = '2'
import tensorflow as tf
import pathlib
from object_detection.utils import label_map_util
def download_labels(filename):
    base_url = 'https://raw.githubusercontent.com/tensorflow/models/master/research/object_detection/data'
    label_dir = tf.keras.utils.get_file(fname=filename,
                                        origin=f'{base_url}/{filename}',
                                        untar=False)
    label_dir = pathlib.Path(label_dir)
    return str(label_dir)

LABEL_FILENAME = 'mscoco_label_map.pbtxt'
label_path = download_labels(LABEL_FILENAME)
category_index = label_map_util.create_category_index_from_labelmap(label_path, use_display_name=True)
print(category_index)
for key in category_index.keys():
    print(category_index[key])

Output
{1: {'id': 1, 'name': 'person'}, 2: {'id': 2, 'name': 'bicycle'}, 3: {'id': 3, 'name': 'car'}, 4: {'id': 4, 'name': 'motorcycle'}, 5: {'id': 5, 'name': 'airplane'},......}

{'id': 1, 'name': 'person'}
{'id': 2, 'name': 'bicycle'}
{'id': 3, 'name': 'car'}
.
.
.
{'id': 87, 'name': 'scissors'}
{'id': 88, 'name': 'teddy bear'}
{'id': 89, 'name': 'hair drier'}
{'id': 90, 'name': 'toothbrush'}
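
Because the label ids have gaps (there is no id 12, for example) and the model reports class ids as floats, a small lookup helper avoids KeyError surprises. This is an illustrative sketch: class_name is a hypothetical helper, and sample_index is a hand-written subset in the same format as category_index.

```python
def class_name(category_index, class_id, default="N/A"):
    """Look up the label name for a class id, tolerating floats and missing ids."""
    entry = category_index.get(int(class_id))
    return entry["name"] if entry else default


# Hand-written subset in the same format as category_index
sample_index = {
    1: {"id": 1, "name": "person"},
    2: {"id": 2, "name": "bicycle"},
    3: {"id": 3, "name": "car"},
}

print(class_name(sample_index, 3.0))  # car
print(class_name(sample_index, 12))   # N/A
```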

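Loading the image

Here we load the image with cv2, resize it, convert it to a TensorFlow tensor with tf.convert_to_tensor(), and then expand it to four dimensions. Passing the result to model(input_tensor) starts the detection.

The model returns its results per batch. Since the batch contains a single image, we only take the first set of results (index 0).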
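
The shapes involved are easy to get wrong: cv2.resize() takes its target size as (width, height), while the resulting array is indexed as (height, width, channels). The sketch below uses a blank NumPy array as a stand-in for the decoded image to show what the extra batch dimension looks like.

```python
import numpy as np

# Stand-in for an image resized with cv2.resize(img, (1024, 768)):
# 768 rows (height), 1024 columns (width), 3 colour channels
image_np = np.zeros((768, 1024, 3), dtype=np.uint8)

# Add a leading batch axis: (batch, height, width, channels)
batch = np.expand_dims(image_np, 0)

print(image_np.shape)  # (768, 1024, 3)
print(batch.shape)     # (1, 768, 1024, 3)
```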

import cv2
import numpy as np
import tensorflow as tf

# Open an image; IMREAD_COLOR forces 3-channel BGR so the conversion below is valid
img_path='cat.jpg'
img=cv2.imdecode(np.fromfile(img_path, dtype=np.uint8), cv2.IMREAD_COLOR)
# cv2.resize takes the target size as (width, height)
img=cv2.resize(img, (1024,768), interpolation=cv2.INTER_LINEAR)
image_np=cv2.cvtColor(img, cv2.COLOR_BGR2RGB)
#image_np = np.array(Image.open('tiger.jpg'))

# Convert to a TensorFlow tensor
input_tensor = tf.convert_to_tensor(image_np)
# Add a leading batch dimension: (batch, height, width, channels)
input_tensor = tf.expand_dims(input_tensor, 0)

detections = model(input_tensor)
num_detections = int(detections.pop('num_detections'))  # number of detections

# detections: per-object information (box, class, score)
print(f'Number of objects: {num_detections}')
# Keep only the first batch element and convert each tensor to NumPy
detections = {key: value[0].numpy()
               for key, value in detections.items()}

detections['num_detections'] = num_detections

print('Object information (box, class, score %):')
for detection_boxes, detection_classes, detection_scores in \
    zip(detections['detection_boxes'], detections['detection_classes'], detections['detection_scores']):
    print(np.around(detection_boxes,4), detection_classes, round(detection_scores*100, 2))
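
The boxes printed above are normalized to the range 0 to 1 in the order [ymin, xmin, ymax, xmax]. To crop or measure objects, they usually need to be converted to pixel coordinates. The following is a minimal sketch with a hand-made box; to_pixel_boxes is a hypothetical helper, and the 768 x 1024 size matches the resize used in this article.

```python
import numpy as np


def to_pixel_boxes(boxes, image_height, image_width):
    """Convert normalized [ymin, xmin, ymax, xmax] boxes to integer pixels."""
    scale = np.array(
        [image_height, image_width, image_height, image_width], dtype=np.float32
    )
    return np.rint(boxes * scale).astype(np.int32)


# One made-up normalized box, standing in for detections['detection_boxes']
boxes = np.array([[0.1, 0.2, 0.5, 0.6]], dtype=np.float32)
pixel_boxes = to_pixel_boxes(boxes, image_height=768, image_width=1024)
print(pixel_boxes.tolist())  # [[77, 205, 384, 614]]
```

Each row then reads (top, left, bottom, right), so a crop would be image_np[top:bottom, left:right].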

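Drawing boxes and labels

Pass the image to viz_utils.visualize_boxes_and_labels_on_image_array() to draw the boxes and labels. The function draws directly onto the array it is given.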

image_np_with_detections = image_np.copy()
viz_utils.visualize_boxes_and_labels_on_image_array(
      image_np_with_detections,
      detections['detection_boxes'],
      detections['detection_classes'].astype(np.int64),
      detections['detection_scores'],
      category_index,
      use_normalized_coordinates=True,
      max_boxes_to_draw=200,
      min_score_thresh=.30,
      agnostic_mode=False)
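
Before looking at the annotated image, it can help to print which labels should clear the min_score_thresh setting. The sketch below only approximates the drawing function's selection (a score strictly above the threshold); summarize is a hypothetical helper and the classes, scores, and index are made-up sample values.

```python
from collections import Counter


def summarize(classes, scores, category_index, min_score=0.30):
    """Count detections per label name whose score exceeds min_score."""
    counts = Counter()
    for class_id, score in zip(classes, scores):
        if score > min_score:
            entry = category_index.get(int(class_id))
            counts[entry["name"] if entry else "N/A"] += 1
    return dict(counts)


# Made-up values standing in for detections['detection_classes'] / ['detection_scores']
sample_index = {17: {"id": 17, "name": "cat"}, 18: {"id": 18, "name": "dog"}}
classes = [17.0, 18.0, 17.0, 12.0]
scores = [0.9, 0.5, 0.2, 0.8]

print(summarize(classes, scores, sample_index))
# {'cat': 1, 'dog': 1, 'N/A': 1}
```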

Full code

The complete code is shown below.

import os
os.environ['TF_CPP_MIN_LOG_LEVEL']='2'
#import pylab as plt
import cv2
import tensorflow as tf
import time
import keras
import numpy as np
import pathlib
from object_detection.utils import visualization_utils as viz_utils
from object_detection.utils import label_map_util
def download_model(model_name, model_date):
    base_url = 'http://download.tensorflow.org/models/object_detection/tf2'
    # Download the archive and extract it (untar=True)
    model_dir = keras.utils.get_file(fname=model_name,
                                        origin=f'{base_url}/{model_date}/{model_name}.tar.gz',
                                        untar=True)
    return f'{str(model_dir)}/saved_model'
def download_labels(filename):
    base_url = 'https://raw.githubusercontent.com/tensorflow/models/master/research/object_detection/data'
    label_dir = tf.keras.utils.get_file(fname=filename,
                                        origin=f'{base_url}/{filename}',
                                        untar=False)
    return str(label_dir)
MODEL_DATE = '20200711'
MODEL_NAME = 'centernet_hg104_1024x1024_coco17_tpu-32'
model_path = download_model(MODEL_NAME, MODEL_DATE)
print('Loading model...', end='')
t1 = time.time()
# Do not use keras.models.load_model: this model was produced by tf.saved_model.save
#model = keras.models.load_model(model_path)
# Load the model from local disk; this takes a while
model = tf.saved_model.load(model_path)

t2 = time.time()
print(f'took {t2-t1} seconds.')

LABEL_FILENAME = 'mscoco_label_map.pbtxt'
label_path = download_labels(LABEL_FILENAME)
category_index = label_map_util.create_category_index_from_labelmap(label_path, use_display_name=True)
print(category_index)
for key in category_index.keys():
    print(category_index[key])

# Open an image; IMREAD_COLOR forces 3-channel BGR so the conversion below is valid
img_path='motor.jpg'
img=cv2.imdecode(np.fromfile(img_path, dtype=np.uint8), cv2.IMREAD_COLOR)
# cv2.resize takes the target size as (width, height)
img=cv2.resize(img, (1024,768), interpolation=cv2.INTER_LINEAR)
img=cv2.cvtColor(img, cv2.COLOR_BGR2RGB)
#image_np = np.array(Image.open('tiger.jpg'))

# Convert to a TensorFlow tensor
input_tensor = tf.convert_to_tensor(img)
# Add a leading batch dimension: (batch, height, width, channels)
input_tensor = tf.expand_dims(input_tensor, 0)

detections = model(input_tensor)
num_detections = int(detections.pop('num_detections'))  # number of detections

# detections: per-object information (box, class, score)
print(f'Number of objects: {num_detections}')
# Keep only the first batch element and convert each tensor to NumPy
detections = {key: value[0].numpy()
               for key, value in detections.items()}

detections['num_detections'] = num_detections

print('Object information (box, class, score %):')
for detection_boxes, detection_classes, detection_scores in \
    zip(detections['detection_boxes'], detections['detection_classes'], detections['detection_scores']):
    print(np.around(detection_boxes,4), detection_classes, round(detection_scores*100, 2))

# Draws boxes and labels onto img in place
viz_utils.visualize_boxes_and_labels_on_image_array(
    img,
    detections['detection_boxes'],
    detections['detection_classes'].astype(np.int64),  # the label lookup expects int64 class ids
    detections['detection_scores'],
    category_index,
    use_normalized_coordinates=True,
    max_boxes_to_draw=200,
    min_score_thresh=.40,
    #groundtruth_box_visualization_color='red',
    agnostic_mode=False)
# plt.axis("off")
# plt.imshow(img)
# plt.show()
# img is RGB at this point; convert back to BGR for cv2.imshow
cv2.imshow('test', cv2.cvtColor(img, cv2.COLOR_RGB2BGR))
cv2.waitKey(0)

Notes

Below are evaluation notes for the various models. Blue marks the best ones, pink marks the ones that cannot be used.

MODEL_NAME = 'centernet_hg104_512x512_coco17_tpu-8'  # load 31.38 s, predict 4.54 s, 4*
MODEL_NAME = 'centernet_hg104_512x512_kpts_coco17_tpu-32'  # load 31.92 s, predict 4.85 s, 2*
MODEL_NAME = 'centernet_hg104_1024x1024_coco17_tpu-32'  # load 31.05 s, predict 8.34 s, 5*
MODEL_NAME = 'centernet_hg104_1024x1024_kpts_coco17_tpu-32'  # load 31.47 s, predict 8.65 s, 3*

MODEL_NAME = 'centernet_resnet50_v1_fpn_512x512_coco17_tpu-8'  # load 8.85 s, predict 1.89 s, 3*
MODEL_NAME = 'centernet_resnet50_v1_fpn_512x512_kpts_coco17_tpu-8'  # load 9.20 s, predict 2.19 s, 3*
MODEL_NAME = 'centernet_resnet101_v1_fpn_512x512_coco17_tpu-8'  # load 16.21 s, predict 2.22 s, 2*
MODEL_NAME = 'centernet_resnet50_v2_512x512_coco17_tpu-8'  # load 4.45 s, predict 2.04 s, 3*
MODEL_NAME = 'centernet_resnet50_v2_512x512_kpts_coco17_tpu-8'  # load 8.74 s, predict 2.33 s, 2*

MODEL_NAME = 'efficientdet_d0_coco17_tpu-32'  # load 16.58 s, predict 4.39 s, 2*
MODEL_NAME = 'efficientdet_d1_coco17_tpu-32'  # load n/a, predict n/a
MODEL_NAME = 'efficientdet_d2_coco17_tpu-32'  # load n/a, predict n/a
MODEL_NAME = 'efficientdet_d3_coco17_tpu-32'  # load n/a, predict n/a
MODEL_NAME = 'efficientdet_d4_coco17_tpu-32'  # load n/a, predict n/a
MODEL_NAME = 'efficientdet_d5_coco17_tpu-32'  # load n/a, predict n/a
MODEL_NAME = 'efficientdet_d6_coco17_tpu-32'  # load n/a, predict n/a
MODEL_NAME = 'efficientdet_d7_coco17_tpu-32'  # load 40.66 s, predict 9.30 s, 6*
MODEL_NAME = 'ssd_mobilenet_v2_320x320_coco17_tpu-8'  # load n/a, predict n/a, very inaccurate
MODEL_NAME = 'ssd_mobilenet_v1_fpn_640x640_coco17_tpu-8'  # load n/a, predict n/a
MODEL_NAME = 'ssd_mobilenet_v2_fpnlite_320x320_coco17_tpu-8'  # load n/a, predict n/a
MODEL_NAME = 'ssd_mobilenet_v2_fpnlite_640x640_coco17_tpu-8'  # load n/a, predict n/a, very inaccurate
MODEL_NAME = 'ssd_resnet50_v1_fpn_640x640_coco17_tpu-8'  # load n/a, predict n/a
MODEL_NAME = 'ssd_resnet50_v1_fpn_1024x1024_coco17_tpu-8'  # load 9.29 s, predict 5.15 s, 4*
MODEL_NAME = 'ssd_resnet101_v1_fpn_640x640_coco17_tpu-8'  # load n/a, predict n/a
MODEL_NAME = 'ssd_resnet101_v1_fpn_1024x1024_coco17_tpu-8'  # load 14.77 s, predict 5.53 s, 4*
MODEL_NAME = 'ssd_resnet152_v1_fpn_640x640_coco17_tpu-8'  # load n/a, predict n/a
MODEL_NAME = 'ssd_resnet152_v1_fpn_1024x1024_coco17_tpu-8'  # load 20.06 s, predict 5.97 s, 3*
MODEL_NAME = 'faster_rcnn_resnet50_v1_640x640_coco17_tpu-8'  # load n/a, predict n/a
MODEL_NAME = 'faster_rcnn_resnet50_v1_1024x1024_coco17_tpu-8'  # load 6.39 s, predict 5.23 s, 2*
MODEL_NAME = 'faster_rcnn_resnet50_v1_800x1333_coco17_gpu-8'  # load n/a, predict n/a
MODEL_NAME = 'faster_rcnn_resnet101_v1_640x640_coco17_tpu-8'  # load n/a, predict n/a
MODEL_NAME = 'faster_rcnn_resnet101_v1_1024x1024_coco17_tpu-8'  # load n/a, predict n/a
MODEL_NAME = 'faster_rcnn_resnet101_v1_800x1333_coco17_gpu-8'  # load n/a, predict n/a
MODEL_NAME = 'faster_rcnn_resnet152_v1_640x640_coco17_tpu-8'  # load n/a, predict n/a
MODEL_NAME = 'faster_rcnn_resnet152_v1_1024x1024_coco17_tpu-8'  # load 13.94 s, predict n/a, unusable
MODEL_NAME = 'faster_rcnn_resnet152_v1_800x1333_coco17_gpu-8'  # load 13.99 s, predict 6.03 s, 3*

MODEL_NAME = 'faster_rcnn_inception_resnet_v2_640x640_coco17_tpu-8'  # load n/a, predict n/a
MODEL_NAME = 'faster_rcnn_inception_resnet_v2_1024x1024_coco17_tpu-8'  # load 18.46 s, predict 9.69 s, 6*

MODEL_NAME = 'mask_rcnn_inception_resnet_v2_1024x1024_coco17_gpu-8'  # load 18.50 s, predict n/a, unusable
MODEL_NAME = 'extremenet'  # load n/a, predict n/a, fails to load
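
Load and prediction times like the ones above can be collected with a small timing wrapper. This is a sketch of one way to do it, not necessarily how the figures above were measured; timed is a hypothetical helper, and the first prediction call may include one-off warm-up cost.

```python
import time


def timed(fn, *args, **kwargs):
    """Call fn(*args, **kwargs) and return (result, elapsed_seconds)."""
    start = time.perf_counter()
    result = fn(*args, **kwargs)
    return result, time.perf_counter() - start


# Intended use with the code from this article (not executed here):
#   model, load_s = timed(tf.saved_model.load, model_path)
#   detections, predict_s = timed(model, input_tensor)
total, seconds = timed(sum, range(1_000_000))
print(f'sum={total}, took {seconds:.4f} s')
```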

The model can predict the following object classes:

{'id': 1, 'name': 'person'}
{'id': 2, 'name': 'bicycle'}
{'id': 3, 'name': 'car'}
{'id': 4, 'name': 'motorcycle'}
{'id': 5, 'name': 'airplane'}
{'id': 6, 'name': 'bus'}
{'id': 7, 'name': 'train'}
{'id': 8, 'name': 'truck'}
{'id': 9, 'name': 'boat'}
{'id': 10, 'name': 'traffic light'}
{'id': 11, 'name': 'fire hydrant'}
{'id': 13, 'name': 'stop sign'}
{'id': 14, 'name': 'parking meter'}
{'id': 15, 'name': 'bench'}
{'id': 16, 'name': 'bird'}
{'id': 17, 'name': 'cat'}
{'id': 18, 'name': 'dog'}
{'id': 19, 'name': 'horse'}
{'id': 20, 'name': 'sheep'}
{'id': 21, 'name': 'cow'}
{'id': 22, 'name': 'elephant'}
{'id': 23, 'name': 'bear'}
{'id': 24, 'name': 'zebra'}
{'id': 25, 'name': 'giraffe'}
{'id': 27, 'name': 'backpack'}
{'id': 28, 'name': 'umbrella'}
{'id': 31, 'name': 'handbag'}
{'id': 32, 'name': 'tie'}
{'id': 33, 'name': 'suitcase'}
{'id': 34, 'name': 'frisbee'}
{'id': 35, 'name': 'skis'}
{'id': 36, 'name': 'snowboard'}
{'id': 37, 'name': 'sports ball'}
{'id': 38, 'name': 'kite'}
{'id': 39, 'name': 'baseball bat'}
{'id': 40, 'name': 'baseball glove'}
{'id': 41, 'name': 'skateboard'}
{'id': 42, 'name': 'surfboard'}
{'id': 43, 'name': 'tennis racket'}
{'id': 44, 'name': 'bottle'}
{'id': 46, 'name': 'wine glass'}
{'id': 47, 'name': 'cup'}
{'id': 48, 'name': 'fork'}
{'id': 49, 'name': 'knife'}
{'id': 50, 'name': 'spoon'}
{'id': 51, 'name': 'bowl'}
{'id': 52, 'name': 'banana'}
{'id': 53, 'name': 'apple'}
{'id': 54, 'name': 'sandwich'}
{'id': 55, 'name': 'orange'}
{'id': 56, 'name': 'broccoli'}
{'id': 57, 'name': 'carrot'}
{'id': 58, 'name': 'hot dog'}
{'id': 59, 'name': 'pizza'}
{'id': 60, 'name': 'donut'}
{'id': 61, 'name': 'cake'}
{'id': 62, 'name': 'chair'}
{'id': 63, 'name': 'couch'}
{'id': 64, 'name': 'potted plant'}
{'id': 65, 'name': 'bed'}
{'id': 67, 'name': 'dining table'}
{'id': 70, 'name': 'toilet'}
{'id': 72, 'name': 'tv'}
{'id': 73, 'name': 'laptop'}
{'id': 74, 'name': 'mouse'}
{'id': 75, 'name': 'remote'}
{'id': 76, 'name': 'keyboard'}
{'id': 77, 'name': 'cell phone'}
{'id': 78, 'name': 'microwave'}
{'id': 79, 'name': 'oven'}
{'id': 80, 'name': 'toaster'}
{'id': 81, 'name': 'sink'}
{'id': 82, 'name': 'refrigerator'}
{'id': 84, 'name': 'book'}
{'id': 85, 'name': 'clock'}
{'id': 86, 'name': 'vase'}
{'id': 87, 'name': 'scissors'}
{'id': 88, 'name': 'teddy bear'}
{'id': 89, 'name': 'hair drier'}
{'id': 90, 'name': 'toothbrush'}
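
To check up front whether a subject is covered by this list (a tiger or a rooster is not, as noted in the introduction), the dictionary can be searched by name. The sketch below uses a hand-written subset in the same format as category_index; is_supported is a hypothetical helper.

```python
def is_supported(category_index, name):
    """True if a label with this name exists in the category index."""
    names = {entry["name"] for entry in category_index.values()}
    return name.lower() in names


# Hand-written subset in the same format as category_index
sample_index = {
    16: {"id": 16, "name": "bird"},
    17: {"id": 17, "name": "cat"},
    18: {"id": 18, "name": "dog"},
}

for candidate in ("cat", "tiger", "rooster"):
    print(candidate, is_supported(sample_index, candidate))
```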
