Summary:
Pull Request resolved: facebookresearch#379
1. Fix the GeM post-processing logic.
Before this change, the code assumed that every non-preprocessed feature tensor has the same shape:
```
if cfg.IMG_RETRIEVAL.FEATS_PROCESSING_TYPE == "gem":
gem_out_fname = f"{out_dir}/{train_dataset_name}_GeM.npy"
train_features = torch.tensor(np.concatenate(train_features))
```
This is not the case: ROxford/RParis images do not have a standard size, so the resx layer outputs have different heights and widths (but the same number of channels). GeM pooling maps a feature map of any spatial size to a vector of shape `(num_channels,)`.
The change performs GeM pooling on each image individually, as opposed to on all images at once. This is safe because both GeM pooling and L2 normalization are applied per image anyway.
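A minimal sketch of the per-image pooling described above, using NumPy rather than VISSL's actual helpers (the function names here are illustrative, not the repo's API). GeM raises activations to a power `p`, averages over the spatial dimensions, and takes the `p`-th root, so feature maps with different heights and widths all collapse to `(num_channels,)` vectors that can then be stacked:

```python
import numpy as np

def gem_pool(feat, p=3.0, eps=1e-6):
    """GeM-pool one feature map of shape (C, H, W) down to (C,)."""
    feat = np.clip(feat, eps, None)          # clamp to avoid 0**p issues
    return np.power(np.power(feat, p).mean(axis=(1, 2)), 1.0 / p)

def l2_normalize(vec, eps=1e-12):
    return vec / max(np.linalg.norm(vec), eps)

# Feature maps with different spatial sizes (the images were not resized
# to a fixed resolution) but the same channel count:
feats = [np.random.rand(2048, 7, 9), np.random.rand(2048, 12, 5)]

# Pool and normalize each image individually, then stack the results.
pooled = np.stack([l2_normalize(gem_pool(f)) for f in feats])
print(pooled.shape)  # (2, 2048)
```

Because each image is reduced to a fixed-length vector before stacking, the ragged-shape problem that broke the earlier `np.concatenate` call never arises.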
2. Apply the image transforms before cropping to the bounding box (rather than after cropping).
The experiments show that this yields much better results (left: before, right: after). It is also what the deepcluster implementation does: https://github.com/facebookresearch/deepcluster/blob/master/eval_retrieval.py#L44
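The key detail in the new ordering is that once the full image has been transformed (e.g. resized), the query bounding box must be rescaled into the transformed image's coordinates before cropping. A minimal NumPy sketch, with illustrative names and a nearest-neighbor resize standing in for the real transform pipeline:

```python
import numpy as np

def resize_nn(img, out_h, out_w):
    """Nearest-neighbor resize for an (H, W, C) array."""
    h, w = img.shape[:2]
    rows = np.arange(out_h) * h // out_h
    cols = np.arange(out_w) * w // out_w
    return img[rows][:, cols]

def transform_then_crop(img, bbox, out_size):
    """Resize the full image first, then crop the rescaled bounding box.

    bbox = (x0, y0, x1, y1) in original-image coordinates.
    """
    h, w = img.shape[:2]
    resized = resize_nn(img, out_size, out_size)
    sx, sy = out_size / w, out_size / h      # bbox scale factors
    x0, y0, x1, y1 = bbox
    return resized[int(y0 * sy):int(y1 * sy), int(x0 * sx):int(x1 * sx)]

img = np.zeros((100, 200, 3), dtype=np.uint8)
crop = transform_then_crop(img, bbox=(50, 25, 150, 75), out_size=224)
print(crop.shape)  # (112, 112, 3)
```

Transforming first means the crop is taken at the network's working resolution, instead of cropping a small region and then upscaling it, which is the behavior this change adopts.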
```
Oxford: 61.57 / 41.74 / 14.33 vs. 69.65 / 48.51 / 16.41
Paris: 83.7 / 66.87 / 44.81 vs. 87.9 / 70.57 / 47.39
```
f288434289
f288438150
Differential Revision: D29993204
fbshipit-source-id: d6d02b6b96d59b43a00a1d1e99f34c03ee8a85b2