Robust multi-resolution pedestrian detection in traffic scenes

Junjie Yan, Xucong Zhang, Zhen Lei, Shengcai Liao, Stan Z. Li

Research output: Contribution to journal, Conference article, peer-review

164 Citations (Scopus)

Abstract

The sharp decline in performance as resolution decreases is the major bottleneck for current pedestrian detection techniques. In this paper, we treat pedestrian detection at different resolutions as different but related problems, and propose a multi-task model that jointly considers their commonalities and differences. The model contains resolution-aware transformations that map pedestrians at different resolutions to a common space, where a shared detector is constructed to distinguish pedestrians from background. For model learning, we present a coordinate descent procedure that learns the resolution-aware transformations and the deformable part model (DPM) based detector iteratively. In traffic scenes, many false positives are located around vehicles; we therefore build a context model to suppress them according to the pedestrian-vehicle relationship. The context model can be learned automatically even when vehicle annotations are not available. Our method reduces the mean miss rate to 60% for pedestrians taller than 30 pixels on the Caltech Pedestrian Benchmark, noticeably better than the previous state of the art (71%).
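
The alternating structure of the learning procedure can be illustrated with a small sketch. This is not the paper's implementation: it assumes a plain linear detector with a squared loss and ridge penalties in place of the DPM-based detector, and the names `fit_multitask`, `objective`, and `make_toy_data` are hypothetical. Each resolution group r gets its own matrix P_r that maps features into a common space, a single weight vector w is shared across groups, and the two are updated in turn while the other is held fixed.

```python
import numpy as np


def objective(X, y, groups, w, P, lam):
    """Squared loss plus ridge penalties on w and on (P_r - I)."""
    scores = np.einsum("i,nij,nj->n", w, P[groups], X)
    eye = np.eye(X.shape[1])
    return (np.sum((y - scores) ** 2)
            + lam * np.sum(w ** 2)
            + lam * np.sum((P - eye) ** 2))


def fit_multitask(X, y, groups, n_groups, lam=1.0, n_iters=10):
    """Alternate between the shared detector w and the per-group maps P_r."""
    n, d = X.shape
    P = np.stack([np.eye(d)] * n_groups)  # start from identity maps
    w = np.zeros(d)
    history = [objective(X, y, groups, w, P, lam)]
    for _ in range(n_iters):
        # Step 1: fix the transformations, solve ridge regression for w
        # on the features mapped into the common space.
        Z = np.einsum("nij,nj->ni", P[groups], X)
        w = np.linalg.solve(Z.T @ Z + lam * np.eye(d), Z.T @ y)
        history.append(objective(X, y, groups, w, P, lam))

        # Step 2: fix w, solve a ridge problem for each P_r separately.
        # The score w^T P_r x is linear in vec(P_r) with features w_i * x_j.
        for r in range(n_groups):
            mask = groups == r
            Xr = X[mask]
            F = np.einsum("i,nj->nij", w, Xr).reshape(len(Xr), d * d)
            rhs = F.T @ y[mask] + lam * np.eye(d).ravel()
            p = np.linalg.solve(F.T @ F + lam * np.eye(d * d), rhs)
            P[r] = p.reshape(d, d)
        history.append(objective(X, y, groups, w, P, lam))
    return w, P, history


def make_toy_data(n=200, d=4, seed=0):
    """Synthetic two-group data (assumption: group 0 plays 'low resolution')."""
    rng = np.random.default_rng(seed)
    groups = rng.integers(0, 2, size=n)
    X = rng.normal(size=(n, d))
    w_true = rng.normal(size=d)
    X[groups == 0] *= 0.5  # attenuated features for the low-resolution group
    y = np.sign(X @ w_true)
    return X, y, groups


if __name__ == "__main__":
    X, y, groups = make_toy_data()
    w, P, history = fit_multitask(X, y, groups, n_groups=2)
    print("objective at start:", history[0])
    print("objective at end:  ", history[-1])
```

Because each step exactly minimizes the same objective over one block of variables, the recorded objective should never increase between steps, which is the property that makes a coordinate descent scheme of this kind well behaved. The detector, loss, and features in the paper itself are more involved than this toy setup.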

Original language: English
Article number: 6619234
Pages (from-to): 3033-3040
Number of pages: 8
Journal: Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition
Publication status: Published - 2013
Externally published: Yes
Event: 26th IEEE Conference on Computer Vision and Pattern Recognition, CVPR 2013 - Portland, OR, United States
Duration: Jun 23 2013 to Jun 28 2013

Keywords

  • DPM
  • Multi-Resolution
  • Multi-task Learning
  • Pedestrian Detection

ASJC Scopus subject areas

  • Software
  • Computer Vision and Pattern Recognition
