Running object detection using Tensorflow.js

Question

I am working on object detection using TensorFlow.js. I am trying to run a custom object detection TensorFlow.js model in a browser. I was able to convert the TensorFlow model to a TensorFlow.js model (in Google Colab) using the following command:

!tensorflowjs_converter \
--input_format=tf_frozen_model \
--output_node_names='detection_boxes,detection_scores,detection_classes,num_detections' \
/content/frozen_inference_graph.pb \
/content/web_model

I am sharing the code snippet of inference.html file [Updated]:

<html>
<head>
<script src="https://cdn.jsdelivr.net/npm/@tensorflow/tfjs@latest"></script>
<!--<script src="webcam.js"></script>-->
</head>
<body>
    <!-- The image is sized to 300x300 to match the model input -->
    <img id="img" src="257.jpg" width="300" height="300"/>
    <button type="button" id="startPredicting" onclick="startPredicting()" >Start Predicting</button>
    <button type="button" id="stopPredicting" onclick="stopPredicting()" >Stop Predicting</button>
    <div id="prediction"></div>
    <script src="index.js"></script>
</body>
</html>

The code snippet of the index.js file is as follows [Updated]:

let model;
let isPredicting = false;

async function init(){
        try {
            model = await tf.loadGraphModel('http://127.0.0.1:8887/uno_model/model.json');
        } catch (err) {
            console.log(err);
        }
}

async function predict() {
        console.log("executing model");
        const img = document.getElementById('img');

        // Convert the <img> element into a tensor of shape [300, 300, 3]
        let tf_img = tf.browser.fromPixels(img);
        tf.print(tf_img);

        // Add a batch dimension, since the model expects [1, 300, 300, 3]
        tf_img = tf_img.expandDims(0);

        console.log(tf_img.shape);  // Image dimension is  [1, 300, 300, 3]

        let output = await model.executeAsync(
            { 'image_tensor' : tf_img},
            [ 'detection_boxes','detection_scores','detection_classes','num_detections']);


        for (let i = 0; i < output.length; i++){
            console.log(output[i].dataSync())
        }

 }

init()


function startPredicting(){
    isPredicting = true;
    predict();
}

function stopPredicting(){
    isPredicting = false;
    predict();
}

It produces the following output [Updated]:

I looked at the above output but I couldn't get class labels etc. How can I extract detection_classes, detection_scores, and detection_boxes? This model works properly with python code.

[Updated]: It seems that I get the output once I provide a [1,300,300,3] image as input to the model.

Could you please guide me? Am I missing something?

Only the tensorflowjs_converter code is written in Google Colab.
– Saurabh Chauhan, Commented Jan 13, 2020 at 17:00

edkeveked · Accepted Answer · 2020-01-14 10:22:33Z

1

Although the Python model is not included in the question, the extracted nodes of the model, together with the sizes and types of the tensors, give enough insight to identify detection_classes, detection_scores, and detection_boxes.

The first tensor has size 400 and corresponds to detection_boxes. dataSync returns a completely flattened array. The size 400 most likely corresponds to the shape [100, 4], which is corroborated by the shapes of the other tensors discussed below. [100, 4] means that there are 100 bounding boxes for the input, most likely an image. Concretely, the first four elements correspond to the first bounding box, the next four to the second, and so on.

The second tensor corresponds to detection_scores. There are 100 detection scores for the 100 bounding boxes. The first element of this array corresponds to the first four elements of the first array (the detection_boxes array).

The third array corresponds to the detection_classes. It is an array of 100 integers where each value is the index of the matched label.

The fourth array corresponds to num_detections. It contains the number of detections: 100.
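The layout described above can be sketched as a small helper that regroups the four flattened arrays into one object per detection. This is only a sketch under assumptions: the function name groupDetections and the minScore threshold are hypothetical, the arrays are assumed to arrive in the order the output names were requested, and the box order [ymin, xmin, ymax, xmax] (normalized to [0, 1]) is the usual TensorFlow Object Detection API convention rather than something confirmed for this particular model.

```javascript
// Regroup the flat arrays returned by dataSync() into per-detection objects.
// Assumption: boxes are laid out as [ymin, xmin, ymax, xmax] per detection,
// normalized to [0, 1] (TensorFlow Object Detection API convention).
function groupDetections(boxes, scores, classes, numDetections, minScore = 0.5) {
  // num_detections is a one-element array; never read past the scores array.
  const count = Math.min(Math.round(numDetections[0]), scores.length);
  const detections = [];
  for (let i = 0; i < count; i++) {
    if (scores[i] < minScore) continue; // skip low-confidence detections
    // Detection i owns elements 4*i .. 4*i+3 of the flattened boxes array.
    const [ymin, xmin, ymax, xmax] = Array.from(boxes.slice(i * 4, i * 4 + 4));
    detections.push({
      classIndex: classes[i],
      score: scores[i],
      box: { ymin, xmin, ymax, xmax },
    });
  }
  return detections;
}

// Made-up values standing in for output[0..3].dataSync():
const exampleBoxes = [0.1, 0.2, 0.5, 0.6, 0.0, 0.0, 1.0, 1.0];
const exampleScores = [0.9, 0.3];
const exampleClasses = [2, 1];
const exampleNumDetections = [2];
console.log(groupDetections(exampleBoxes, exampleScores, exampleClasses, exampleNumDetections));
```

With the question's code, the four arguments would be output[0].dataSync() through output[3].dataSync(). Filtering by score matters because the model always returns 100 slots, and most of them may have very low scores.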

Regarding "I looked at the above output but I couldn't get class labels":

To get the label (string), the index taken from detection_classes should be looked up in the JSON object (a dictionary in Python) or array that contains all the labels and their indexes.

Note that for the js model to return the same output as the Python model, all the processing applied to the image in Python before feeding the model must be replicated in js.
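A minimal sketch of that lookup follows. The LABELS object and the labelFor function are hypothetical, and the label names are placeholders, since the question does not show the model's label map. Label maps exported for the TensorFlow Object Detection API usually start their ids at 1, which is assumed here.

```javascript
// Hypothetical label map: replace the ids and names with those from your own
// label map file. Ids are assumed to start at 1.
const LABELS = {
  1: 'label_a',
  2: 'label_b',
  3: 'label_c',
};

// Map a value taken from detection_classes to its label string.
function labelFor(classIndex, labels = LABELS) {
  // detection_classes may come back as floats (e.g. 2.0), so round first.
  const id = Math.round(classIndex);
  return Object.prototype.hasOwnProperty.call(labels, id)
    ? labels[id]
    : `unknown (${id})`;
}

console.log(labelFor(2));
console.log(labelFor(7));
```

Returning an explicit "unknown" string instead of undefined makes a mismatch between the label map and the model easier to spot in the console.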

edited Jan 14, 2020 at 10:22

answered Jan 13, 2020 at 21:10


8 Comments

Saurabh Chauhan Over a year ago

I got your point, as made in the previous question stackoverflow.com/questions/59575812/…. But what I don't get is: what is wrong with the current example, and why doesn't it detect objects as it should? The same model runs perfectly using Python code.

edkeveked Over a year ago

What do you mean by "it runs perfectly in Python"? The model is doing exactly what you asked it to do. Unless I see the Python code, I can't tell what is wrong here.

Saurabh Chauhan Over a year ago

The original model (using a frozen graph) detects objects in an image or in a video. I am converting the same model to a JSON-formatted model (as tf.js needs a JSON-formatted model). So the JSON-formatted model should produce the same output as the original model.

Saurabh Chauhan Over a year ago

I have also updated the question (I removed the webcam code snippet; the current code uses a single image and runs inference on it. In addition, I make sure that the image size is [1,300,300,3] before it goes to the model) and added a Google Drive directory so that you can reproduce the same result. Please share your views!

edkeveked Over a year ago

Okay, great. It was because you were not feeding the model with the right input size. See my edited answer!

Saurabh Chauhan · Accepted Answer · 2020-01-15 07:41:23Z

0

I finally figured out the problem: it was related to the size of the input frame.

The SSD model needs an image/frame of shape [1,300,300,3] as input. I added this to my code and it solved the problem. With the following line (in inference.html), we can feed an image of shape (300,300,3) to the model:

 <img id="img" src="257.jpg" width="300" height="300"/>

Using the following lines in index.js:

 tf_img = tf_img.expandDims(0);
 console.log(tf_img.shape)  // Image dimension is  [1, 300, 300, 3]

This gives an image of shape [1,300,300,3], which is what the SSD model needs.
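As a guard against the same mistake, a small check can be run on the tensor's shape before calling the model. This is only a sketch: the function name assertInputShape is hypothetical, and the default expected shape is taken from the [1,300,300,3] requirement described in this answer.

```javascript
// Throw a descriptive error if the shape does not match what the model expects.
// The default expected shape is the [1, 300, 300, 3] input discussed above.
function assertInputShape(shape, expected = [1, 300, 300, 3]) {
  const matches =
    shape.length === expected.length &&
    shape.every((dim, i) => dim === expected[i]);
  if (!matches) {
    throw new Error(`Expected input shape [${expected}] but got [${shape}]`);
  }
}

// A batched 300x300 RGB image passes silently.
assertInputShape([1, 300, 300, 3]);

// An image without the batch dimension is rejected.
try {
  assertInputShape([300, 300, 3]);
} catch (err) {
  console.log(err.message);
}
```

In index.js this could be called as assertInputShape(tf_img.shape) right before model.executeAsync, so that a wrongly sized image fails with a clear message.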

answered Jan 15, 2020 at 7:41
