OVIC@NIPS description
THIS IS AN EXTERNAL DOC, ANY PARTNER CAN ACCESS
v2:11/09/2018
ovic@google.com
Continuing the success of LPIRC at CVPR 2018, OVIC is hosting a winter installment with more categories and tasks. We are targeting NIPS2018 to announce the winners for three categories below:
A participant or team can submit to, and win prizes in, multiple categories. Each submission must be a single model in TensorFlow Lite format.
(new) A participant is also encouraged to contribute to the TensorflowLite codebase to support / expedite their models. Final scores will be computed using a stable build after submission closes[1].
Category 1) and 2) are based on ImageNet classification. Training data are available at the ILSVRC 2012 website. Participants are encouraged to check out this tutorial for training quantized Mobilenet models.
The models must expect input tensors with dimensions [1 x input_height x input_width x 3], where the first dimension is the batch size and the last dimension is the channel count. input_height and input_width are the integer height and width expected by the model; each must be between 1 and 1000. The output must be a [1 x 1001] tensor encoding the class probabilities, with the first value corresponding to the “background” class. The list of the full labels is here.
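As a rough local sanity check of the shape constraints above, the following Python sketch tests an input and output shape against the stated rules and picks the top-1 index from a flat list of 1001 probabilities. The function names are hypothetical, and this is not the competition's validator; it only restates the constraints from the text.

```python
def validate_classifier_io(input_shape, output_shape):
    """Return a list of problems; an empty list means the shapes match the rules."""
    problems = []
    if len(input_shape) != 4:
        problems.append("input must be [1, input_height, input_width, 3]")
    else:
        batch, height, width, channels = input_shape
        if batch != 1:
            problems.append("batch size must be 1")
        if channels != 3:
            problems.append("channel count must be 3")
        for name, value in (("input_height", height), ("input_width", width)):
            if not (isinstance(value, int) and 1 <= value <= 1000):
                problems.append(name + " must be an integer between 1 and 1000")
    if tuple(output_shape) != (1, 1001):
        problems.append("output must be [1, 1001]")
    return problems


def top1_index(probabilities):
    """Index of the largest value; index 0 is the background class."""
    return max(range(len(probabilities)), key=probabilities.__getitem__)
```

Because index 0 is the background class, a top-1 index of k > 0 refers to the k-th entry of the label list after the background entry.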
The participants can convert their Tensorflow model into a submission-ready model using the following command:
bazel-bin/tensorflow/lite/toco/toco -- \
  --input_file="${local_frozen}" --output_file="${toco_file}" \
  --input_format=TENSORFLOW_GRAPHDEF --output_format=TFLITE \
  --inference_type="${inference_type}" \
  --inference_input_type=QUANTIZED_UINT8 \
  --input_shape="1,${input_height},${input_width},3" \
  --input_array="${input_array}" \
  --output_array="${output_array}" \
  --mean_value="${mean_value}" --std_value="${std_value}"
where local_frozen is the frozen graph definition;
inference_type is either FLOAT or QUANTIZED_UINT8;
input_array and output_array are the names of the input and output in the tensorflow graph; and mean_value and std_value are the mean and standard deviation of the input image.
Note that:
The input type is always QUANTIZED_UINT8, and specifically, RGB images with pixel values between 0 and 255. This requirement implies that for floating point models, a Dequantize op will be automatically inserted at the beginning of the graph to convert UINT8 inputs to floating-point by subtracting mean_value and dividing by std_value.
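The arithmetic of that conversion can be illustrated with a short Python sketch. The function name is hypothetical, and the values 127.5 for both mean_value and std_value are only an assumed example (they map the 0 to 255 range onto -1 to 1); use whatever values your model was trained with.

```python
def dequantize_pixel(pixel, mean_value, std_value):
    """Convert one UINT8 pixel value (0..255) to floating point."""
    return (pixel - mean_value) / std_value


# Assumed example values: mean_value = std_value = 127.5.
lowest = dequantize_pixel(0, 127.5, 127.5)     # -1.0
highest = dequantize_pixel(255, 127.5, 127.5)  # 1.0
```

With mean_value = 0 and std_value = 255 the same formula would instead map pixel values onto the range 0 to 1.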
Submissions are evaluated based on classification accuracy / time while focusing on the real-time regime (defined below) running on Google’s Pixel 2 phone.
The figure above illustrates the Pareto frontiers estimated from baseline models (shown as dots) in the two latency buckets, and how the test metric is computed for a submission (star) as the offset from the estimated frontier. The figure is for illustration only; no real data points are used.
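Purely to make the idea of "offset from the frontier" concrete, the Python sketch below assumes the frontier is the set of non-dominated baseline points, interpolated linearly in latency, and that the offset is the submission's accuracy minus the frontier accuracy at the same latency. These are assumptions, not the official scoring procedure, which may estimate the frontier and handle the latency buckets differently. All function names and sample numbers are hypothetical.

```python
def pareto_frontier(points):
    """Keep (latency, accuracy) points not dominated by a faster, more accurate one."""
    frontier = []
    best_accuracy = float("-inf")
    # Sort by latency; at equal latency consider the most accurate point first.
    for latency, accuracy in sorted(points, key=lambda p: (p[0], -p[1])):
        if accuracy > best_accuracy:
            frontier.append((latency, accuracy))
            best_accuracy = accuracy
    return frontier


def frontier_accuracy(frontier, latency):
    """Linearly interpolate the frontier at a latency, clamping at both ends."""
    if latency <= frontier[0][0]:
        return frontier[0][1]
    if latency >= frontier[-1][0]:
        return frontier[-1][1]
    for (l0, a0), (l1, a1) in zip(frontier, frontier[1:]):
        if l0 <= latency <= l1:
            return a0 + (a1 - a0) * (latency - l0) / (l1 - l0)


def offset_from_frontier(baselines, submission):
    """Positive when the submission lies above the assumed frontier."""
    latency, accuracy = submission
    return accuracy - frontier_accuracy(pareto_frontier(baselines), latency)


# Made-up baselines as (latency in ms, accuracy in percent); not real data.
baselines = [(10, 50.0), (20, 60.0), (30, 70.0), (25, 55.0)]
offset = offset_from_frontier(baselines, (20, 65.0))
```

In this made-up example the point (25, 55.0) is dominated by (20, 60.0) and is dropped from the frontier before interpolation.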
Items a) and b) allow the participants to debug runtime errors. Submissions must pass the validator and the test, and must be compatible with the benchmarker app in order to be scored.
Item d) allows the participants to measure latency of their submissions on their local phone. Note that latency obtained via d) may be different from the latency reported by the competition’s server due to language differences, device specs and evaluation settings, etc. In all cases the latency reported by the competition’s server will be used.
Category 3) is based on COCO object detection. Training data are available from the COCO website. Participants are encouraged to check out Tensorflow’s ObjectDetectionAPI tutorial for training detection models.
Submission
For category 3), the input instructions are the same: submissions should expect input tensors with dimensions [1 x input_height x input_width x 3], where the first dimension is the batch size and the last dimension is the channel count. input_height and input_width are the integer height and width expected by the model; each must be between 1 and 1000. Inputs should contain RGB values between 0 and 255.
The output should contain four tensors:
The recommended way to produce these tensors is to use TensorFlow’s object detection API. Let config_path point to the TrainEvalPipelineConfig used to create and train the model, and let checkpoint_path point to the checkpoint of the model. Participants can create a frozen TensorFlow model in the directory output_dir using the following command:
bazel-bin/tensorflow_models/object_detection/export_tflite_ssd_graph \
  --pipeline_config_path="${config_path}" --output_directory="${output_dir}" \
  --trained_checkpoint_prefix="${checkpoint_path}" \
  --max_detections=100 \
  --add_postprocessing_op=true \
  --use_regular_nms=${use_regular_nms}
where use_regular_nms is a binary flag that controls whether the regular non-max suppression is used; the alternative is a faster but less accurate non-max suppression implementation.
Participants can convert their Tensorflow model into a submission-ready model using the following command:
bazel-bin/tensorflow/lite/toco/toco \
  --input_file="${local_frozen}" --output_file="${toco_file}" \
  --input_format=TENSORFLOW_GRAPHDEF --output_format=TFLITE \
  --inference_type=${inference_type} \
  --inference_input_type=QUANTIZED_UINT8 \
  --input_shapes="1,${input_height},${input_width},3" \
  --input_arrays="${input_array}" \
  --output_arrays=\
'TFLite_Detection_PostProcess',\
'TFLite_Detection_PostProcess:1',\
'TFLite_Detection_PostProcess:2',\
'TFLite_Detection_PostProcess:3' \
  --change_concat_input_ranges=false --allow_custom_ops \
  --mean_values="${mean_value}" --std_values="${std_value}"
where local_frozen is the frozen graph definition;
inference_type is either FLOAT or QUANTIZED_UINT8;
The images will be resized to input_height x input_width. It is up to the participant to pick dimensions that are neither so small that accuracy suffers nor so large that model run-time suffers.
input_array and output_array are the names of the input and output in the tensorflow graph; and mean_value and std_value are the mean and standard deviation of the input image.
Note that:
The input type is always QUANTIZED_UINT8, and specifically, RGB images with pixel values between 0 and 255. This requirement implies that for floating point models, a Dequantize op will be automatically inserted at the beginning of the graph to convert UINT8 inputs to floating-point by subtracting mean_value and dividing by std_value.
Submissions are evaluated based on detection mAP and time while focusing on the interactive regime (defined below) running on Google’s Pixel 2 phone.
Item d) allows the participants to measure latency of their submissions on their local phone. Note that latency obtained via d) may be different from the latency reported by the competition’s server due to language differences, device specs and evaluation settings, etc. In all cases the latency reported by the competition’s server will be used.
All submissions, along with the empirical Pareto frontier, will be re-computed after submission closes using the same codebase version. Regressions / improvements may happen as a result of versioning difference between the time of submission and the time of evaluation. In case of a significant regression the organizers may consider using the better measurement between the two.
Participants should submit their own work and develop innovative solutions. Please do not submit released tflite models or solutions from the previous OVIC competition, or else the submission may be disqualified.
Latency measurements of all submissions, reference models and the empirical Pareto frontier will be recomputed using the codebase on Nov 30, 2018. Regressions / improvements may happen as a result of versioning difference.
The first and second place teams for each category will be awarded $1,500 and $500, respectively.
A participant must be at least 13 years old, not a citizen of a US-embargoed country, and not affiliated with the organizers or sponsors (Purdue University, Duke University, University of North Carolina Chapel Hill or employees of Facebook or Alphabet Inc.).
Registration open | Oct 15, 2018
Submission open | Nov 1, 2018
Submission closed | Nov 30, 2018
Winner announced | Dec 5, 2018
[1] In case of significant latency regression in the final build, the latency measured at the time of submission may be considered at the discretion of the organizing committee.