Crowd Counting Survey
This is a brief survey of crowd counting. First, I will give some related papers with my comments, from both deep learning and traditional perspectives.
Then I will list some publicly available datasets and benchmarks.
Finally, I will give the most feasible method to predict the crowd density and explain the reasons.
Paper List
Deep Learning based Methods
Dataset
There are different ways to count a crowd. One is to predict the crowd count
directly via body detection or face detection. This approach requires a training dataset with face or body annotations (bounding boxes), and it is hard to apply to images with large crowds. Another way is to generate a heat map of the people; the crowd count is then the integral of the heat map. This approach requires a training dataset with head center point annotations (points). For both types, there are some publicly available
datasets; links to these datasets are listed as follows:
Datasets with head annotations
Datasets with person annotations
Dataset Preprocessing
In this section, I mainly focus on how to process datasets with head annotations. The process consists of two steps: generating ground
truth heat maps and data augmentation.
Generating Ground Truth Heat Maps
Here I describe the method used in paper 4 of the paper list. If there is a head at
pixel \(x_i\), we can denote it as \(\delta(x-x_i)\). Hence an image with \(N\) labeled heads can be represented as the function:
$$H(x) = \sum_{i=1}^{N}\delta(x-x_i)$$
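As a minimal sketch of this representation, the code below builds \(H\) as an impulse map and then smooths it with a Gaussian kernel to obtain a density map whose sum matches the head count. The function names, the fixed `sigma=4.0`, the 3-sigma kernel radius, and the zero padding are all assumptions made for illustration; the referenced paper may choose the kernel differently.

```python
import numpy as np


def impulse_map(points, height, width):
    """Build H(x): one unit impulse per annotated head at (row, col)."""
    h = np.zeros((height, width), dtype=np.float64)
    for r, c in points:
        h[int(r), int(c)] += 1.0
    return h


def gaussian_kernel_1d(sigma, radius):
    """Normalized 1-D Gaussian kernel with 2 * radius + 1 taps."""
    x = np.arange(-radius, radius + 1, dtype=np.float64)
    k = np.exp(-(x ** 2) / (2.0 * sigma ** 2))
    return k / k.sum()


def density_map(points, height, width, sigma=4.0):
    """Approximate F = H * G_sigma with a separable Gaussian convolution.

    sigma is a hypothetical fixed value chosen for this sketch.
    """
    radius = int(3 * sigma)
    k = gaussian_kernel_1d(sigma, radius)
    h = impulse_map(points, height, width)
    # Zero-pad so the 'valid' convolutions return the original image size.
    padded = np.pad(h, radius, mode="constant")
    rows_done = np.apply_along_axis(
        lambda v: np.convolve(v, k, mode="valid"), 1, padded)
    return np.apply_along_axis(
        lambda v: np.convolve(v, k, mode="valid"), 0, rows_done)


if __name__ == "__main__":
    heads = [(20, 20), (30, 40), (45, 25)]  # made-up annotations
    d = density_map(heads, 64, 64)
    print("estimated count:", d.sum())
```

Because the kernel is normalized, each head contributes a total mass of one, so summing the map recovers the count. With zero padding, heads closer to the border than the kernel radius lose part of their mass outside the image, so the sum can fall slightly below \(N\) in that case.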
To convert \(H(x)\) to a continuous density function, we can convolve it
with a Gaussian kernel \(G_{\sigma}\). So the density can be
formulated as \(F(x)=H(x)*G_{\sigma}(x)\). If you want to know more
details, please refer to my Project Page.
Data Augmentation
For data augmentation, we follow the method used in paper 3 of the paper list. First, we generate images at different scales. Then, we crop 225x225 patches with 50% overlap from those scaled images.
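The two augmentation steps can be sketched roughly as follows. The scale factors, the nearest-neighbor resize, and the function names are assumptions for illustration, since the text does not specify them; a stride of 112 pixels gives approximately 50% overlap between neighboring 225x225 patches.

```python
import numpy as np

PATCH = 225
STRIDE = PATCH // 2  # 112 pixels, roughly 50% overlap


def resize_nearest(img, scale):
    """Nearest-neighbor resize using plain NumPy indexing (a simple stand-in)."""
    h, w = img.shape[:2]
    new_h = max(1, int(round(h * scale)))
    new_w = max(1, int(round(w * scale)))
    rows = np.minimum((np.arange(new_h) / scale).astype(int), h - 1)
    cols = np.minimum((np.arange(new_w) / scale).astype(int), w - 1)
    return img[rows][:, cols]


def crop_patches(img, patch=PATCH, stride=STRIDE):
    """Slide a patch x patch window over the image with the given stride."""
    h, w = img.shape[:2]
    patches = []
    for top in range(0, h - patch + 1, stride):
        for left in range(0, w - patch + 1, stride):
            patches.append(img[top:top + patch, left:left + patch])
    return patches


def multiscale_patches(img, scales=(0.5, 0.75, 1.0)):
    """Crop patches from every rescaled copy; the scales are hypothetical."""
    out = []
    for s in scales:
        out.extend(crop_patches(resize_nearest(img, s)))
    return out


if __name__ == "__main__":
    image = np.zeros((480, 480), dtype=np.float32)  # placeholder image
    print("number of patches:", len(multiscale_patches(image)))
```

Rescaled copies smaller than 225 pixels on either side yield no patches in this sketch. The ground truth for each patch would need to be produced consistently, for example by scaling the point annotations before generating the heat map.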
Some visualization: