/microsoft/ RegionCLIP: Region-based Language-Image Pretraining

Code Link
https://github.com/microsoft/regionclip
Description
We show that directly applying CLIP-style language-image models to recognize image regions for object detection leads to poor performance due to a domain shift: CLIP was trained to match an image as a whole to a text description, and it does not capture the fine-grained alignment between image regions and text spans.
Retrieved
2022/06/22
Stars
36