arutema47's blog

書いたり書かなかったり。

10 computer vision papers I enjoyed reading in 2019.


I would like to list the 10 computer vision papers I enjoyed reading most this year. The papers are picked from my favorites in the Mendeley app (great app!).

The selection reflects my own opinion, but I think many of these papers will become mainstream in computer vision research.

For each paper, I will also give the reasons why I picked it.

Object as points (CenterNet)

f:id:aru47:20191230173913p:plain
before and after nms

As shown in the figure, suppressing all the predicted bounding boxes down to the final predictions (non-maximum suppression, NMS) was a pain in the xxx in object detection tasks, and had long been both an accuracy and a speed bottleneck. Tuning the knobs of the NMS thresholds was a lot of pain; everybody knew that, but not many techniques were proposed to fix it.
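To make the pain concrete, here is a minimal sketch of classic greedy NMS in NumPy (my own illustration, not from any particular paper; the 0.5 IoU threshold is exactly the kind of arbitrary knob the post complains about):

```python
import numpy as np

def nms(boxes, scores, iou_thresh=0.5):
    """boxes: (N, 4) array as [x1, y1, x2, y2]; returns indices of kept boxes."""
    order = np.argsort(scores)[::-1]  # highest score first
    keep = []
    while order.size > 0:
        i = order[0]
        keep.append(int(i))
        rest = order[1:]
        # intersection of the top box with all remaining boxes
        xx1 = np.maximum(boxes[i, 0], boxes[rest, 0])
        yy1 = np.maximum(boxes[i, 1], boxes[rest, 1])
        xx2 = np.minimum(boxes[i, 2], boxes[rest, 2])
        yy2 = np.minimum(boxes[i, 3], boxes[rest, 3])
        inter = np.maximum(0, xx2 - xx1) * np.maximum(0, yy2 - yy1)
        area_i = (boxes[i, 2] - boxes[i, 0]) * (boxes[i, 3] - boxes[i, 1])
        areas = (boxes[rest, 2] - boxes[rest, 0]) * (boxes[rest, 3] - boxes[rest, 1])
        iou = inter / (area_i + areas - inter)
        # drop every box that overlaps the kept box too much
        order = rest[iou <= iou_thresh]
    return keep
```

The sequential loop over sorted boxes is also why NMS is awkward to parallelize in hardware, which is the second half of the complaint.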

f:id:aru47:20191230162202p:plain
centernet

The authors came up with a simple and elegant idea: regress each object as a center point. Without NMS, the proposed CenterNet achieves state-of-the-art accuracy. Center regression may become the mainstream of object detection; the technique is efficient in software (fewer tuning parameters) and in hardware (easy to parallelize).
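The NMS-free decoding can be sketched as follows (a NumPy toy version, not the authors' code): a point on the predicted center heatmap is kept only if it equals the maximum of its 3x3 neighborhood, a check the paper implements with a max-pooling layer. The 0.3 score threshold here is an illustrative value I chose:

```python
import numpy as np

def heatmap_peaks(heat, thresh=0.3):
    """Return (y, x) positions of 3x3 local maxima above thresh."""
    H, W = heat.shape
    padded = np.pad(heat, 1, constant_values=-np.inf)
    # maximum over each position's 3x3 neighborhood, via 9 shifted views
    local_max = np.max(
        [padded[dy:dy + H, dx:dx + W] for dy in range(3) for dx in range(3)],
        axis=0,
    )
    ys, xs = np.where((heat == local_max) & (heat > thresh))
    return [(int(y), int(x)) for y, x in zip(ys, xs)]
```

Each surviving peak directly becomes one detection (box size and offset are regressed at that location), so no pairwise box suppression is needed.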

A similar idea was proposed earlier in the CornerNet paper, and CenterNet reuses many of its techniques (preprocessing, loss calculation). While CornerNet predicts the two corners of the bounding box, CenterNet predicts only the bounding box center point, which makes the entire network simpler.

Where is my mirror?

f:id:aru47:20191230162121p:plain
mirrornet

Detecting mirrors was an open problem in computer vision: mirrors cause unintended predictions in LiDAR and may trigger false alarms in self-driving tasks!

This paper 1) constructed a large-scale dataset of mirror images and 2) proposed MirrorNet (a straightforward name..), which segments mirror regions in an image with high accuracy.

This should help robotics and self-driving engineers, since mirror detection has been a big problem.

Learning to see in the dark

f:id:aru47:20191230162600p:plain
learn

Can we fix under-exposed images with deep learning techniques? The authors proved that it can be done, with surprisingly high quality. The network itself is simple (U-Net-like), but the question they tackled was very interesting.

EfficientNet, EfficientDet

f:id:aru47:20191230163210p:plain
en

These two papers crushed the state-of-the-art race in image classification and object detection.

The key methods used in the papers are quite simple (compound resolution scaling, an efficient NAS search space, a better FPN), but the combined accuracy increase was huge. The papers are easy to follow and the experiments are conducted very well; a must-read for all researchers!
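The scaling rule itself fits in a few lines. Written out as a sketch: for a compound coefficient phi, depth, width, and resolution are scaled as alpha**phi, beta**phi, gamma**phi, under the constraint alpha * beta**2 * gamma**2 ≈ 2 so that FLOPs grow roughly 2**phi; the EfficientNet paper's grid search found alpha=1.2, beta=1.1, gamma=1.15:

```python
def compound_scale(phi, alpha=1.2, beta=1.1, gamma=1.15):
    """Return (depth, width, resolution) multipliers for coefficient phi.

    Constants are from the EfficientNet paper's grid search; the
    function itself is just the compound scaling formula written out.
    """
    return alpha ** phi, beta ** phi, gamma ** phi
```

So EfficientNet-B1 (phi=1) is 1.2x deeper, 1.1x wider, and uses a 1.15x larger input resolution than the B0 baseline, and larger variants just increase phi instead of tuning the three axes independently.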

These methods are computationally efficient, and I see them being used in many Kaggle competitions.

PointPillars

f:id:aru47:20191230163649p:plain
pp

3D object detection with point clouds was somewhat chaotic in 2018: many new techniques were being proposed at CVPR, but many turned out to be useless.. It was similar to the days when people worked on weird CNN or ReLU layer modifications and none of them seemed to work.

I liked the idea of PointPillars, which encodes the point cloud as vertical pillars and then converts them into a multi-channel 2D pseudo-image. The object detection itself is conducted in 2D, so successful object detectors like SSD and Faster R-CNN can be used.
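The pillar idea can be sketched in a toy form (my own simplification, not the PointPillars code): bin points into an x-y grid and scatter one feature per pillar into a dense 2D image a standard 2D detector can consume. The real encoder learns per-pillar features with a small PointNet; here each pillar just stores its point count, and the 4x4 grid with 1 m cells is an arbitrary example:

```python
import numpy as np

def pillarize(points, grid=(4, 4), cell=1.0):
    """points: (N, 3) array of x, y, z; returns a (grid_h, grid_w) pseudo-image."""
    image = np.zeros(grid)
    ix = (points[:, 0] // cell).astype(int)  # column index from x
    iy = (points[:, 1] // cell).astype(int)  # row index from y
    # keep only points that fall inside the grid
    ok = (ix >= 0) & (ix < grid[1]) & (iy >= 0) & (iy < grid[0])
    np.add.at(image, (iy[ok], ix[ok]), 1.0)  # count points per pillar
    return image
```

Note that the z coordinate is discarded by the grid itself; in the actual paper the vertical structure survives inside the learned per-pillar feature vector rather than as extra image dimensions.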

This 3D-to-2D approach clicked for me: pure 3D object detectors had much worse accuracy than 2D detectors, while point cloud encoders like PointNet performed well.