RF-DETR: A SOTA Segmentation Model with NAS


Three big releases here:
- RF-DETR segmentation is the best segmentation model available.
- The RF-DETR paper and repo give the community a reproducible, Apache 2.0 SOTA model.
- Neural architecture search is how RF-DETR is Pareto optimal over the previous state-of-the-art models.

More task types incoming soon!

Piotr Skalski

Open Source Lead @ Roboflow | Computer Vision | Vision Language Models

RF-DETR paper is out! 🔥 🔥 🔥

TL;DR: RF-DETR is a real-time detection transformer built on top of DINOv2 and weight-sharing NAS. One training run explores thousands of architectures and produces a full accuracy-latency curve for both detection and segmentation.

- DINOv2 backbone: DINOv2 brings strong visual priors, improves results on small or unusual datasets, and provides a solid foundation for the NAS search space.
- NAS over ~6,000 configs: training samples a new architecture every step. Resolution, patch size, decoder depth, queries, and window layout shift dynamically while all subnets share one set of weights.
- Detection: RF-DETR N hits 48.0 AP at 2.3 ms, matching YOLOv8 M and YOLOv11 M at about 2x their speed.
- Segmentation: RF-DETR-Seg N reaches 40.3 mask AP at 3.4 ms, outperforming the largest YOLOv8 and YOLOv11 models.

⮑ 🔗 paper: https://lnkd.in/dNgSV4FH

Huge congratulations to Peter Robicheaux, Isaac Robinson, and Matvei Popov for making it happen!

#computervision #opensource #paper #transformers
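The weight-sharing NAS loop described above can be sketched in a few lines. This is a toy illustration, not RF-DETR's actual code: the search-space values, the `SharedDecoder` class, and the prefix-sharing scheme are all hypothetical stand-ins, chosen only to show the idea that each training step samples a different architecture while every subnet reads and updates one shared set of weights.

```python
import random

# Hypothetical search space, loosely based on the axes the post names
# (resolution, patch size, decoder depth, queries, window layout).
# Concrete values are illustrative, not RF-DETR's real grid.
SEARCH_SPACE = {
    "resolution":    [384, 512, 640, 768],
    "patch_size":    [14, 16],
    "decoder_depth": [2, 3, 4, 6],
    "num_queries":   [100, 200, 300],
    "window_layout": ["global", "windowed", "mixed"],
}

def sample_subnet(rng: random.Random) -> dict:
    """Draw one architecture configuration from the search space."""
    return {axis: rng.choice(opts) for axis, opts in SEARCH_SPACE.items()}

class SharedDecoder:
    """Toy weight-sharing supernet: one pool of 'weights' (layer ids here),
    from which each sampled subnet activates only a prefix of layers."""
    def __init__(self, max_depth: int):
        # Stands in for shared weight tensors owned by the supernet.
        self.layers = list(range(max_depth))

    def active_layers(self, depth: int) -> list:
        # Every subnet indexes the same underlying layers, so gradients
        # from all sampled depths would update the same shared weights.
        return self.layers[:depth]

rng = random.Random(0)
supernet = SharedDecoder(max_depth=max(SEARCH_SPACE["decoder_depth"]))

# One "training step" per sampled architecture, as the post describes.
for step in range(3):
    cfg = sample_subnet(rng)
    active = supernet.active_layers(cfg["decoder_depth"])
    print(f"step {step}: depth={cfg['decoder_depth']}, active layers={active}")
```

Because every subnet's forward pass touches the same shared parameters, a single run amortizes training across the whole space, which is what lets one run trace out a full accuracy-latency curve.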

