Bag of Visual Words model
WHAT IT TESTS: classic image representation. OUTLINE: cluster many local descriptors (e.g. k-means) into visual words; assign each image's features to words; represent the image as a histogram of word counts for a classifier.
WHAT IT TESTS: how local features become a fixed-length image vector. ANSWER OUTLINE: extract local descriptors from a large set of training images and cluster them with k-means; each cluster center is a visual word, and the set of centers is the vocabulary. For a new image, assign each descriptor to its nearest word and build a histogram of word counts. The histogram has one bin per word, so its length equals the vocabulary size no matter how many descriptors the image yields, and this fixed-length vector feeds a classifier such as an SVM. The order and position of the words are discarded, hence "bag". RED FLAG: claiming that the representation preserves spatial structure.
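The steps in the outline can be sketched in a few lines of plain Python. This is a minimal illustration under stated assumptions, not a production pipeline: the descriptors are synthetic 2-D points standing in for real local descriptors, the k-means loop is a bare-bones Lloyd's iteration, and the names `blob`, `build_vocabulary`, `nearest_word`, and `bovw_histogram` are hypothetical helpers introduced only for this example.

```python
import random


def sq_dist(a, b):
    """Squared Euclidean distance between two descriptors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def nearest_word(descriptor, vocabulary):
    """Index of the visual word (cluster center) closest to the descriptor."""
    return min(range(len(vocabulary)),
               key=lambda i: sq_dist(descriptor, vocabulary[i]))


def build_vocabulary(descriptors, k, iterations=20, seed=0):
    """Cluster descriptors with a simple k-means; the centers are the words."""
    rng = random.Random(seed)
    centers = [list(d) for d in rng.sample(descriptors, k)]
    for _ in range(iterations):
        clusters = [[] for _ in range(k)]
        for d in descriptors:
            clusters[nearest_word(d, centers)].append(d)
        for i, members in enumerate(clusters):
            if members:  # an empty cluster keeps its previous center
                centers[i] = [sum(col) / len(members) for col in zip(*members)]
    return centers


def bovw_histogram(image_descriptors, vocabulary):
    """Count how many of the image's descriptors fall on each visual word."""
    hist = [0] * len(vocabulary)
    for d in image_descriptors:
        hist[nearest_word(d, vocabulary)] += 1
    return hist


def blob(rng, center, n):
    """Synthetic stand-in for descriptors: n noisy points around a center."""
    return [(center[0] + rng.gauss(0, 0.5), center[1] + rng.gauss(0, 0.5))
            for _ in range(n)]


data_rng = random.Random(1)
# Descriptors pooled from many "training images" (assumed, synthetic data).
training = (blob(data_rng, (0, 0), 30)
            + blob(data_rng, (5, 5), 30)
            + blob(data_rng, (0, 5), 30))
# Descriptors from one new image.
image = blob(data_rng, (5, 5), 10) + blob(data_rng, (0, 0), 2)

vocabulary = build_vocabulary(training, k=3)
histogram = bovw_histogram(image, vocabulary)
print(histogram)  # one count per visual word; length is always k
```

Shuffling `image` before calling `bovw_histogram` gives the same histogram, which is the "bag" property in action: the counts carry no information about where in the image each descriptor came from. In practice the histogram is often normalized before it is passed to the classifier, so that images with different numbers of descriptors are comparable.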
- #bag-of-visual-words
- #k-means
- #visual-vocabulary
- #image-classification
- #histogram