The DINOv2 + text encoder model has been accepted to CVPR.