Abstract: This paper improves upon the Pix2Seq object detector by extending it to videos. In the process, it introduces a new way to perform end-to-end video object detection that improves upon ...
DEIMv2 is an evolution of the DEIM framework that leverages the rich features from DINOv3. Our method is designed with various model sizes, from an ultra-light version up to S, M, L, and X, to be ...
Abstract: Modern diffusion-based image generative models have made significant progress and have become a promising way to enrich training data for the object detection task. However, the generation quality and ...