In my previous article, I demonstrated how to implement a simple Android QR code scanner. Building on that codebase, I am going to do something more interesting: integrate the YOLOv4 tiny model that I trained for QR code detection into the Android project. Is YOLOv4 tiny competent for real-time QR code detection? Is it better or worse than traditional computer vision? Let's run an experiment.
Prerequisites
How to Use OpenCV SDK in Android Project
- Unzip opencv-4.5.5-android-sdk.zip.
- Copy the sdk folder to the root of your project and rename it to opencv.
- In settings.gradle, add:

      include ':opencv'

- Add the OpenCV dependency to app/build.gradle:

      dependencies {
          ...
          implementation project(path: ':opencv')
      }
- Load the OpenCV Android library in the onCreate() function:

      private BaseLoaderCallback mLoaderCallback = new BaseLoaderCallback(this) {
          @Override
          public void onManagerConnected(int status) {
              switch (status) {
                  case LoaderCallbackInterface.SUCCESS: {
                      Log.i(TAG, "OpenCV loaded successfully");
                  }
                  break;
                  default: {
                      super.onManagerConnected(status);
                  }
                  break;
              }
          }
      };

      private void loadOpenCV() {
          if (!OpenCVLoader.initDebug()) {
              Log.d(TAG, "Internal OpenCV library not found. Using OpenCV Manager for initialization");
              OpenCVLoader.initAsync(OpenCVLoader.OPENCV_VERSION_3_0_0, this, mLoaderCallback);
          } else {
              Log.d(TAG, "OpenCV library found inside package. Using it!");
              mLoaderCallback.onManagerConnected(LoaderCallbackInterface.SUCCESS);
          }
      }

      protected void onCreate(Bundle savedInstanceState) {
          ...
          loadOpenCV();
      }
How to Deploy the YOLOv4 Tiny Model to Android Project
- Copy backup/yolov4-tiny-custom-416_last.weights, yolov4-tiny-custom-416.cfg, and data/obj.names to the assets folder of your Android project.
- Load and initialize the model:

      private Net loadYOLOModel() {
          InputStream inputStream = null;
          MatOfByte cfg = new MatOfByte(), weights = new MatOfByte();

          // Load class names
          try {
              inputStream = this.getAssets().open("obj.names");
              try {
                  BufferedReader reader = new BufferedReader(new InputStreamReader(inputStream));
                  try {
                      String line;
                      while ((line = reader.readLine()) != null) {
                          classes.add(line);
                      }
                  } finally {
                      reader.close();
                  }
              } finally {
                  inputStream.close();
              }
          } catch (IOException e) {
              e.printStackTrace();
          }

          // Load cfg
          try {
              inputStream = new BufferedInputStream(this.getAssets().open("yolov4-tiny-custom-416.cfg"));
              byte[] data = new byte[inputStream.available()];
              inputStream.read(data);
              inputStream.close();
              cfg.fromArray(data);
          } catch (IOException e) {
              e.printStackTrace();
          }

          // Load weights
          try {
              inputStream = new BufferedInputStream(this.getAssets().open("yolov4-tiny-custom-416_last.weights"));
              byte[] data = new byte[inputStream.available()];
              inputStream.read(data);
              inputStream.close();
              weights.fromArray(data);
          } catch (IOException e) {
              e.printStackTrace();
          }

          return Dnn.readNetFromDarknet(cfg, weights);
      }

      net = loadYOLOModel();
      model = new DetectionModel(net);
      model.setInputParams(1 / 255.0, new Size(416, 416), new Scalar(0), false);
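The setInputParams call tells OpenCV how to turn each frame into a DNN input blob: subtract the mean (here zero), multiply by 1/255 so pixel values land in [0, 1], resize to 416x416, and keep the channel order (no red/blue swap). As a reference, the per-channel arithmetic boils down to this minimal sketch (the class and method names are mine, for illustration only):

```java
public class BlobPreprocess {
    /**
     * Applies the normalization configured by
     * setInputParams(1 / 255.0, new Size(416, 416), new Scalar(0), false)
     * to a single channel value: (pixel - mean) * scale.
     */
    public static double normalize(int pixel, double mean, double scale) {
        return (pixel - mean) * scale;
    }
}
```

With mean 0 and scale 1/255, a white pixel (255) maps to roughly 1.0 and black (0) to 0.0, which matches the value range the Darknet model was trained on.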
QR Code Detection with YOLOv4 Tiny on Android
Before getting started, keep in mind that deep learning here is an alternative to the localization algorithm of the Dynamsoft Barcode SDK, not an alternative to the whole barcode SDK. The QR code decoding still relies on Dynamsoft Barcode Reader.
Three Steps to Detect QR Code with YOLOv4 Tiny
- Convert the Android camera frame (a YUV420 byte array) to an OpenCV Mat. This question has been solved on StackOverflow:

      public static Mat imageToMat(ImageProxy image) {
          ByteBuffer buffer;
          int rowStride;
          int pixelStride;
          int width = image.getWidth();
          int height = image.getHeight();
          int offset = 0;

          ImageProxy.PlaneProxy[] planes = image.getPlanes();
          byte[] data = new byte[image.getWidth() * image.getHeight() * ImageFormat.getBitsPerPixel(ImageFormat.YUV_420_888) / 8];
          byte[] rowData = new byte[planes[0].getRowStride()];

          for (int i = 0; i < planes.length; i++) {
              buffer = planes[i].getBuffer();
              rowStride = planes[i].getRowStride();
              pixelStride = planes[i].getPixelStride();

              int w = (i == 0) ? width : width / 2;
              int h = (i == 0) ? height : height / 2;
              for (int row = 0; row < h; row++) {
                  int bytesPerPixel = ImageFormat.getBitsPerPixel(ImageFormat.YUV_420_888) / 8;
                  if (pixelStride == bytesPerPixel) {
                      int length = w * bytesPerPixel;
                      buffer.get(data, offset, length);

                      if (h - row != 1) {
                          buffer.position(buffer.position() + rowStride - length);
                      }
                      offset += length;
                  } else {
                      if (h - row == 1) {
                          buffer.get(rowData, 0, width - pixelStride + 1);
                      } else {
                          buffer.get(rowData, 0, rowStride);
                      }

                      for (int col = 0; col < w; col++) {
                          data[offset++] = rowData[col * pixelStride];
                      }
                  }
              }
          }

          Mat mat = new Mat(height + height / 2, width, CvType.CV_8UC1);
          mat.put(0, 0, data);
          return mat;
      }
- Convert YUV420 to RGB:

      Mat yuv = ImageUtils.imageToMat(imageProxy, yBytes);
      Mat rgbOut = new Mat(imageProxy.getHeight(), imageProxy.getWidth(), CvType.CV_8UC3);
      Imgproc.cvtColor(yuv, rgbOut, Imgproc.COLOR_YUV2RGB_I420);
In portrait mode, rotate the image 90 degrees clockwise:
      Mat rgb = new Mat();
      if (isPortrait) {
          Core.rotate(rgbOut, rgb, Core.ROTATE_90_CLOCKWISE);
      } else {
          rgb = rgbOut;
      }
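If you later need to map coordinates between the rotated portrait frame and the original landscape buffer (for example, to relate detection boxes back to the unrotated camera image), the ROTATE_90_CLOCKWISE remapping is simple: a source pixel (x, y) ends up at (srcHeight - 1 - y, x). A small sketch (the class and method names are mine, not part of the original project):

```java
public class RotateMapping {
    /**
     * Maps a source pixel (x, y) in an image of height srcHeight to its
     * position after a 90-degree clockwise rotation: (srcHeight - 1 - y, x).
     * The rotated image has width srcHeight and height srcWidth.
     */
    public static int[] rotate90CW(int x, int y, int srcHeight) {
        return new int[] { srcHeight - 1 - y, x };
    }

    public static void main(String[] args) {
        // In a 4x3 landscape frame, the top-left pixel (0, 0) moves to the
        // rightmost column of the top row in the 3x4 portrait frame.
        int[] p = rotate90CW(0, 0, 3);
        System.out.println(p[0] + "," + p[1]); // prints "2,0"
    }
}
```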
- Run YOLO detection and get the detection results:

      MatOfInt classIds = new MatOfInt();
      MatOfFloat scores = new MatOfFloat();
      MatOfRect boxes = new MatOfRect();
      model.detect(rgb, classIds, scores, boxes, 0.6f, 0.4f);
      if (classIds.rows() > 0) {
          for (int i = 0; i < classIds.rows(); i++) {
              Rect box = new Rect(boxes.get(i, 0));
              Imgproc.rectangle(rgb, box, new Scalar(0, 255, 0), 2);

              int classId = (int) classIds.get(i, 0)[0];
              double score = scores.get(i, 0)[0];
              String text = String.format("%s: %.2f", classes.get(classId), score);
              Imgproc.putText(rgb, text, new org.opencv.core.Point(box.x, box.y - 5),
                      Imgproc.FONT_HERSHEY_SIMPLEX, 1, new Scalar(0, 255, 0), 2);
          }
      }
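In the detect call, 0.6f is the confidence threshold and 0.4f is the non-maximum-suppression IoU threshold: among overlapping candidates, a box whose intersection-over-union with a higher-scoring box exceeds 0.4 is discarded. For reference, here is a minimal IoU computation (the helper names are mine, not OpenCV's):

```java
public class IoU {
    /** Intersection-over-union of two axis-aligned boxes given as {x, y, w, h}. */
    public static double iou(int[] a, int[] b) {
        int x1 = Math.max(a[0], b[0]);
        int y1 = Math.max(a[1], b[1]);
        int x2 = Math.min(a[0] + a[2], b[0] + b[2]);
        int y2 = Math.min(a[1] + a[3], b[1] + b[3]);
        // Overlap area is zero when the boxes are disjoint.
        int inter = Math.max(0, x2 - x1) * Math.max(0, y2 - y1);
        int union = a[2] * a[3] + b[2] * b[3] - inter;
        return union == 0 ? 0 : (double) inter / union;
    }
}
```

Raising the NMS threshold keeps more overlapping boxes; for single QR code scanning, the defaults above already give at most one box per code in practice.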
You can save the image to check the detection result:
public static void saveRGBMat(Mat rgb) {
    final Bitmap bitmap = Bitmap.createBitmap(rgb.cols(), rgb.rows(), Bitmap.Config.ARGB_8888);
    Utils.matToBitmap(rgb, bitmap);
    String filename = "test.png";
    File sd = Environment.getExternalStorageDirectory();
    File dest = new File(sd, filename);
    try {
        FileOutputStream out = new FileOutputStream(dest);
        bitmap.compress(Bitmap.CompressFormat.PNG, 90, out);
        out.flush();
        out.close();
    } catch (Exception e) {
        e.printStackTrace();
    }
}
Decode QR Code and Draw Overlay
Based on the YOLO detection result, we can specify barcode format and expected barcode count to speed up QR code decoding with Dynamsoft Barcode Reader. By default, the SDK decodes all supported 1D and 2D barcode formats:
TextResult[] results = null;
int nRowStride = imageProxy.getPlanes()[0].getRowStride();
int nPixelStride = imageProxy.getPlanes()[0].getPixelStride();
try {
    PublicRuntimeSettings settings = reader.getRuntimeSettings();
    settings.barcodeFormatIds = EnumBarcodeFormat.BF_QR_CODE;
    settings.expectedBarcodesCount = 1;
    reader.updateRuntimeSettings(settings);
} catch (BarcodeReaderException e) {
    e.printStackTrace();
}

try {
    results = reader.decodeBuffer(yBytes, imageProxy.getWidth(), imageProxy.getHeight(), nRowStride * nPixelStride, EnumImagePixelFormat.IPF_NV21, "");
} catch (BarcodeReaderException e) {
    e.printStackTrace();
}
Why not use the bounding box directly? You can set the bounding box as the decoding region, but then you have to guarantee that the model is robust enough; otherwise an inaccurate box will hurt the decoding result.
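If you do want to try the bounding box as a decoding region, note that Dynamsoft's runtime settings express a region in percentages of the frame (via a RegionDefinition with percentage measurement, as far as I recall the API). A sketch of converting a pixel-space YOLO box into such percentages, with a safety margin to tolerate loose boxes (the helper class and the margin value are my own, not part of the SDK):

```java
public class RegionHelper {
    /**
     * Converts a pixel-space box {x, y, w, h} into percentage coordinates
     * {left, top, right, bottom} of a frameWidth x frameHeight image,
     * expanded by marginPct on each side and clamped to [0, 100].
     */
    public static int[] toPercentRegion(int[] box, int frameWidth, int frameHeight, int marginPct) {
        int left = (int) Math.round(100.0 * box[0] / frameWidth) - marginPct;
        int top = (int) Math.round(100.0 * box[1] / frameHeight) - marginPct;
        int right = (int) Math.round(100.0 * (box[0] + box[2]) / frameWidth) + marginPct;
        int bottom = (int) Math.round(100.0 * (box[1] + box[3]) / frameHeight) + marginPct;
        return new int[] {
            Math.max(0, left), Math.max(0, top),
            Math.min(100, right), Math.min(100, bottom)
        };
    }
}
```

The resulting values would feed the region fields of PublicRuntimeSettings with percentage measurement enabled; verify the exact field names against the Dynamsoft SDK version you use.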
Next, we draw an overlay above the camera preview. The Android ML Kit demo project provides an excellent GraphicOverlay class, so we don't need to reinvent the wheel:
<androidx.camera.view.PreviewView
    android:id="@+id/camerax_viewFinder"
    android:layout_width="match_parent"
    android:layout_height="match_parent" />

<com.example.qrcodescanner.GraphicOverlay
    android:id="@+id/camerax_overlay"
    android:layout_width="0dp"
    android:layout_height="0dp"
    app:layout_constraintLeft_toLeftOf="@id/camerax_viewFinder"
    app:layout_constraintRight_toRightOf="@id/camerax_viewFinder"
    app:layout_constraintTop_toTopOf="@id/camerax_viewFinder"
    app:layout_constraintBottom_toBottomOf="@id/camerax_viewFinder"/>
GraphicOverlay overlay = findViewById(R.id.camerax_overlay);
In addition, we can modify the BarcodeGraphic class to make it compatible with our QR code scanning results:
BarcodeGraphic(GraphicOverlay overlay, RectF boundingBox, TextResult barcode, boolean isPortrait) {
    super(overlay);
    this.barcode = barcode;
    this.rect = boundingBox;
    this.overlay = overlay;
    this.isPortrait = isPortrait;

    rectPaint = new Paint();
    rectPaint.setColor(MARKER_COLOR);
    rectPaint.setStyle(Paint.Style.STROKE);
    rectPaint.setStrokeWidth(STROKE_WIDTH);

    barcodePaint = new Paint();
    barcodePaint.setColor(TEXT_COLOR);
    barcodePaint.setTextSize(TEXT_SIZE);

    labelPaint = new Paint();
    labelPaint.setColor(MARKER_COLOR);
    labelPaint.setStyle(Paint.Style.FILL);
}

/**
 * Draws the barcode block annotations for position, size, and raw value on the supplied canvas.
 */
@Override
public void draw(Canvas canvas) {
    // Draws the bounding box around the BarcodeBlock.
    if (rect != null) {
        float x0 = translateX(rect.left);
        float x1 = translateX(rect.right);
        rect.left = min(x0, x1);
        rect.right = max(x0, x1);
        rect.top = translateY(rect.top);
        rect.bottom = translateY(rect.bottom);
        canvas.drawRect(rect, rectPaint);

        // Draws other object info.
        if (barcode != null) {
            float lineHeight = TEXT_SIZE + (2 * STROKE_WIDTH);
            float textWidth = barcodePaint.measureText(barcode.barcodeText);
            canvas.drawRect(
                    rect.left - STROKE_WIDTH,
                    rect.top - lineHeight,
                    rect.left + textWidth + (2 * STROKE_WIDTH),
                    rect.top,
                    labelPaint);
            // Renders the barcode text in the label above the box.
            canvas.drawText(barcode.barcodeText, rect.left, rect.top - STROKE_WIDTH, barcodePaint);
        }
    }
}
Here is the code for getting and drawing the QR code bounding box and text:
overlay.clear();
if (classIds.rows() > 0) {
    for (int i = 0; i < classIds.rows(); i++) {
        ...
        TextResult result = null;
        if (results != null && results.length > 0) {
            for (int index = 0; index < results.length; index++) {
                result = results[index];
            }
        }
        overlay.add(new BarcodeGraphic(overlay, rect, result, isPortrait));
    }
}
overlay.postInvalidate();
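The translateX/translateY calls in BarcodeGraphic depend on GraphicOverlay knowing the scale between the camera image and the preview view. For a center-cropped (fill) preview, the mapping boils down to one uniform scale plus a centering offset per axis. This is my simplified reading of what the ML Kit class does; the class and method names below are mine, so verify against the GraphicOverlay you copied:

```java
public class PreviewMapping {
    /**
     * For a center-crop preview, computes the uniform scale and the x/y
     * offsets that map image coordinates to view coordinates.
     * Returns {scale, offsetX, offsetY}; a view coordinate is then
     * imageCoord * scale + offset.
     */
    public static float[] centerCrop(int imgW, int imgH, int viewW, int viewH) {
        // The larger ratio fills the view; the other axis overflows and is cropped.
        float scale = Math.max((float) viewW / imgW, (float) viewH / imgH);
        float offsetX = (viewW - imgW * scale) / 2f;
        float offsetY = (viewH - imgH * scale) / 2f;
        return new float[] { scale, offsetX, offsetY };
    }
}
```

A 480x640 analysis frame shown on a 1080x1920 view scales by 3.0, with the horizontal overflow split evenly between the left and right edges (a negative offsetX).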
Deep Learning vs. Traditional Computer Vision
We run the QR code scanner app to compare the performance of the two approaches: deep learning and traditional computer vision. Both run on the CPU.
YOLO (QR code localization) + Computer Vision (QR code decoding)
When mixing deep learning and traditional computer vision to scan QR codes, YOLOv4 tiny localization takes more than 100 ms and computer vision decoding takes less than 10 ms.
Computer Vision (QR code localization) + Computer Vision (QR code decoding)
In contrast, the pure traditional computer vision pipeline takes less than 20 ms in total.
Conclusion
A well-trained DNN model may do object detection, especially multiple-object detection, better than traditional computer vision, but it takes more time when running on the CPU. For scanning a single QR code in real time on mobile devices, traditional computer vision is the winner.
You can try the code yourself to verify the performance.
Demo Video
Source Code
https://github.com/yushulx/android-camera2-preview-qr-code-scanner