Introduction
The main goal is to detect a face and a mask in the browser without a Python backend. This is a simple web app (SPA) that contains only JS code and can send some data to the back end for further processing. The initial face and mask detection, however, runs entirely in the browser and needs no Python implementation.
At the moment, the app only works in the Chrome browser.
In the next article, I will describe more technical implementation details based on our research.
There are two approaches to implementing this in a browser:
Both of them support WASM, WebGL, and CPU backends. We only compare WASM and WebGL, because performance on the CPU backend is too low to be usable in the final implementation.
TensorFlow.js
On its official website, TensorFlow.js offers several pretrained, ready-to-use models that come with post-processing included. For real-time face detection we chose the BlazeFace model; an online demo is available here.
More information about BlazeFace can be found here.
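As a rough illustration of the setup described above, the sketch below selects a TensorFlow.js backend and loads BlazeFace. It assumes the caller passes in the `@tensorflow/tfjs` module as `tf` (with `@tensorflow/tfjs-backend-wasm` already imported when `'wasm'` is requested) and the `@tensorflow-models/blazeface` module as `blazeface`. The helper name `createFaceDetector` and the shape of the returned objects are hypothetical and are not taken from the app itself.

```javascript
// Minimal sketch: pick a backend, load BlazeFace, return a detect() function.
// `tf` and `blazeface` are injected by the caller (an assumption made for
// this example so the helper does not depend on how packages are bundled).
async function createFaceDetector(tf, blazeface, backend = 'wasm') {
  // setBackend resolves to false when the backend could not be initialised.
  const ok = await tf.setBackend(backend);
  if (!ok) {
    throw new Error(`TensorFlow.js backend "${backend}" is not available`);
  }
  await tf.ready();

  const model = await blazeface.load();

  // `input` may be a video, canvas or image element, or ImageData.
  return async function detect(input) {
    // Second argument (returnTensors) = false: plain arrays, not tensors.
    const faces = await model.estimateFaces(input, false);
    return faces.map((face) => ({
      topLeft: face.topLeft,
      bottomRight: face.bottomRight,
      probability: face.probability,
    }));
  };
}
```

Calling `createFaceDetector(tf, blazeface, 'webgl')` instead would give the same function on the WebGL backend, which makes it straightforward to compare the two backends on the same input.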
, , . :
WASM ( : 160x120px; : 64x64px)
WebGL ( : 160x120px; : 64x64px)
:
HTML API. . , . .
- , . requestAnimationFrame, 16,6 .
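The grabFrame() method of the ImageCapture interface takes a snapshot of the live video in a MediaStreamTrack and returns a Promise that resolves with an ImageBitmap containing the snapshot.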
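A frame loop built on `grabFrame()` and `requestAnimationFrame` could look roughly like the sketch below. The function name `startFrameLoop`, the injected `detect` callback, and the "skip the tick while the previous frame is still being processed" policy are assumptions made for illustration; the scheduler is passed in as a parameter so the same loop can be driven by `requestAnimationFrame` or by a timer.

```javascript
// Sketch of a frame loop: grab a frame, run detection, schedule the next tick.
// `imageCapture` is expected to behave like a browser ImageCapture object;
// `detect` and `schedule` are hypothetical and supplied by the caller.
function startFrameLoop(imageCapture, detect, schedule) {
  let running = true;
  let busy = false;

  async function tick() {
    if (!running) return;
    schedule(tick); // queue the next tick before doing any work
    if (busy) return; // previous frame still in progress: skip this tick
    busy = true;
    try {
      const bitmap = await imageCapture.grabFrame();
      await detect(bitmap);
      // ImageBitmap holds graphics memory; release it once processed.
      if (typeof bitmap.close === 'function') bitmap.close();
    } catch (err) {
      console.error('Frame processing failed:', err);
    } finally {
      busy = false;
    }
  }

  schedule(tick);
  // The returned function stops the loop at the next tick.
  return () => { running = false; };
}
```

In a browser this might be started with `startFrameLoop(new ImageCapture(track), detect, (cb) => requestAnimationFrame(cb))`, where `track` is a video `MediaStreamTrack` obtained from `getUserMedia`.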
- . - , .
:
: < 6 fps , 7-12 fps , 13-18 fps , 19+ fps .
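The four FPS ranges listed above can be expressed as a small helper, shown here only as a sketch. The tier names are placeholders, since the labels attached to each range are not given in the text, and the treatment of values that fall between two listed ranges (for example 6.5 fps) is an assumption. `averageFps` is likewise a hypothetical helper for turning frame timestamps into a single number.

```javascript
// Tier names are placeholders: the labels for each range are not given.
// Values between two listed ranges (e.g. 6.5) go to the lower tier, which
// is an assumption made for this sketch.
function fpsTier(fps) {
  if (fps >= 19) return 'tier-4'; // 19+ fps
  if (fps >= 13) return 'tier-3'; // 13-18 fps
  if (fps >= 7) return 'tier-2';  // 7-12 fps
  return 'tier-1';                // below 7 fps
}

// Average FPS over a list of frame timestamps in milliseconds.
function averageFps(timestampsMs) {
  if (timestampsMs.length < 2) return 0;
  const elapsed = timestampsMs[timestampsMs.length - 1] - timestampsMs[0];
  return elapsed > 0 ? ((timestampsMs.length - 1) * 1000) / elapsed : 0;
}
```

For instance, timestamps collected inside the frame loop can be passed to `averageFps`, and the result to `fpsTier`, to label a measurement for a given device and backend.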
:
. , . "" , . - 5-10 , .
- 50.
BlazeFace TFLite Android IOS (~ 50-200 FPS).
( Google ).
BlazeFace 2 :
: 128 x 128px, , .
: 256 x 256px, , .
. , .
, BlazeFace , 128 x 128px. . , , 64 x 64px.
. . 64 x 64px , 32 x 32px .
?
TensorFlow.js , :
BlazeFace . (> 0,9), ( "" ).
BlazeFace , . 1, 1 ; 2 1- , / .
BlazeFace (, , , ). , , .
:
X%.
.
/ JS.
, , : . , , .
TensorFlow.js (<3Mb). , WASM . , .
WASM JS : ~60Kb
OpenCV.js: 1.6Mb
SPA (+TensorFlow.js): ~500Kb
BlazeFace : 466Kb
(TTI) ~3.5Mb JavaScript + JSON 1.5Mb to 6Mb >10 ; TTI - 4-5 .
Web Worker ( OpenCV.js ), 800-900Kb. TTI 7-8 ; <5 .
. . . , .
. , , . .
Web Workers
JS BlazeFace , Web Worker. . UX . - TensorFlow.js OpenCV.js, JS - TensorFlow.js. , , TTI . , FPS . , FPS . ~ 20%. , FPS .
- . , .
, . postMessage, - . - ( 200 ), , ( JS, React.js ).
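The text above mentions `postMessage` as the channel to the Web Worker. One possible shape for the main-thread side is sketched below. It assumes a policy of keeping at most one frame in flight and dropping frames offered while the worker is busy. The name `createWorkerBridge` and the `{ type: 'frame', frame }` message format are hypothetical and are not taken from the app.

```javascript
// Wraps a Worker-like object (postMessage + onmessage) so that at most one
// frame is in flight; frames offered while the worker is busy are dropped.
// The message format is an assumption made for this sketch.
function createWorkerBridge(worker, onResult) {
  let busy = false;

  worker.onmessage = (event) => {
    busy = false; // the worker finished the previous frame
    onResult(event.data);
  };

  return {
    // Returns true if the frame was sent, false if it was dropped.
    // `transfer` lists transferable objects (e.g. an ImageBitmap) that are
    // moved to the worker instead of being copied.
    send(frame, transfer = []) {
      if (busy) return false;
      busy = true;
      worker.postMessage({ type: 'frame', frame }, transfer);
      return true;
    },
  };
}
```

With a real worker, `bridge.send(bitmap, [bitmap])` would hand the `ImageBitmap` over without copying it, and the worker would reply with the detection result through its own `postMessage` call.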
Web Workers
, - / / . , callback- . , .
Configuration: mobileNetVersion = V3, mobileNetVersionMultiplier = 0.75, mobileNetVersionType = float16, thumbnailSize = 32px, backend = wasm
BlazeFace -.
Web Worker :
, Web Worker , TensorFlow browser.fromPixels Web Worker. , Mac OS ~ 27, - 5. 22 Mac OS 100-300 Web Worker. , .
?
«», , . - ( , , ). «» . FPS, 200-300, . , :
:
, :
:
: > 30fps
+ : 45fps
:
: 2.5fps 12-15fps
+ : 2 12fps
, , 30 .
MobileNetV2 0.35, .
. , uint16 float16, TTI.
WASM WebGL BlazeFace. TensorFlow.js (<3Mb):
WebGL - WASM, WASM ( 3Mb 60 ). GPU WebGL.
TTI WASM WebGL .
TensorFlow.js , WASM SIMD, . , 2-3 WASM. . , . .
~ 3,5Mb JS , 466Kb BlazeFace, 1,1Mb 5,6Mb . TTI > 10 ; - ~ 5.
- OpenCV.js -, TTI .
, . , , . .
, , . . USB , . .
, 4-5 , UX. , / :
, .
/ .
With this kind of user interaction, the delay between real time and the metadata shown on screen is 200-300 ms. Users of the system will perceive such delays as non-critical.