Summary of paper End-to-end Object Detection with Transformers

Course: CS4245

Institution: Technische Universiteit Delft (TU Delft)

This is a summary of the paper End-to-end Object Detection with Transformers for the course Seminar of Computer Vision by Deep Learning at TU Delft.

Uploaded on 5 July 2024
Number of pages: 7
Written in 2023/2024
Type: Summary

End-to-end Object Detection with Transformers
Abstract
DETR removes the need for many hand-designed components, such as the
non-maximum suppression procedure or anchor generation. The main ingredients
of the new framework, called DEtection TRansformer or DETR, are a set-based
global loss that forces unique predictions via bipartite matching, and a
transformer encoder-decoder architecture.
Prior methods: Current object detection pipelines include hand-crafted
components like spatial anchor generation and non-max suppression (NMS).
Each of these components is tuned specifically for a given task. For example,
NMS is threshold-based: both an IoU (intersection over union) threshold and a
confidence threshold have to be tuned before it effectively discards
overlapping bounding boxes.
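
To make the two thresholds concrete, here is a minimal sketch of greedy NMS in plain Python. It is an illustration only, not the implementation of any particular detector: the function names, the (x1, y1, x2, y2) box format and the default threshold values are assumptions chosen for the example.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def nms(boxes, scores, iou_threshold=0.5, score_threshold=0.05):
    """Greedy NMS: return indices of kept boxes, highest score first.

    Both thresholds are hypothetical defaults that would need tuning per task.
    """
    # Drop low-confidence boxes, then visit the rest from best to worst.
    order = sorted(
        (i for i, s in enumerate(scores) if s >= score_threshold),
        key=lambda i: scores[i],
        reverse=True,
    )
    keep = []
    for i in order:
        # Keep a box only if it does not overlap too much with a kept one.
        if all(iou(boxes[i], boxes[j]) <= iou_threshold for j in keep):
            keep.append(i)
    return keep


boxes = [(0, 0, 10, 10), (1, 1, 11, 11), (20, 20, 30, 30), (0, 0, 9, 9)]
scores = [0.9, 0.8, 0.7, 0.01]
kept = nms(boxes, scores)
```

With the default IoU threshold the second box is suppressed as a duplicate of the first, while raising the threshold to 0.7 keeps it. This sensitivity to the chosen values is the tuning burden that DETR avoids.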

Introduction
The goal of object detection is to predict a set of bounding boxes and
category labels, one for each object of interest. Modern detectors address
this set prediction task in an indirect way, by defining surrogate regression
and classification problems on a large set of proposals, anchors or window
centers. Their performance is significantly influenced by the postprocessing
steps that collapse near-duplicate predictions.

DETR directly predicts (in parallel) the final set of detections by combining a common CNN
with a transformer architecture. During training, bipartite matching uniquely assigns
predictions to ground-truth boxes.

The DEtection TRansformer predicts all objects at once and is trained
end-to-end with a set loss function that performs bipartite matching between
predicted and ground-truth objects.
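
The matching step can be illustrated with a small sketch. The paper computes the optimal assignment with the Hungarian algorithm and uses a cost that combines the class probability with a box loss (L1 plus generalized IoU). The version below is a simplification under stated assumptions: it searches all assignments by brute force, which is only feasible for tiny inputs, and its cost uses the class probability and an L1 box distance only. All names and the dictionary layout are hypothetical.

```python
from itertools import permutations


def l1(a, b):
    """L1 distance between two boxes given as equal-length tuples."""
    return sum(abs(x - y) for x, y in zip(a, b))


def match_cost(pred, target):
    """Simplified pairwise cost (assumption: no generalized IoU term).

    Low when the prediction is confident in the target's class and the
    boxes are close.
    """
    return -pred["probs"][target["label"]] + l1(pred["box"], target["box"])


def bipartite_match(preds, targets):
    """Return (assignment, cost); assignment[j] is the prediction index
    matched to target j. Assumes len(preds) >= len(targets).

    Brute force over all one-to-one assignments, standing in for the
    Hungarian algorithm.
    """
    best, best_cost = None, float("inf")
    for perm in permutations(range(len(preds)), len(targets)):
        cost = sum(match_cost(preds[i], targets[j]) for j, i in enumerate(perm))
        if cost < best_cost:
            best, best_cost = perm, cost
    return list(best), best_cost


# Class 0 = "cat", class 1 = "dog"; boxes are (cx, cy, w, h) in [0, 1].
preds = [
    {"probs": [0.1, 0.8], "box": (0.5, 0.5, 0.2, 0.2)},
    {"probs": [0.7, 0.2], "box": (0.2, 0.3, 0.1, 0.1)},
    {"probs": [0.3, 0.3], "box": (0.8, 0.8, 0.1, 0.1)},
]
targets = [
    {"label": 0, "box": (0.2, 0.3, 0.1, 0.1)},
    {"label": 1, "box": (0.5, 0.5, 0.2, 0.2)},
]
assignment, cost = bipartite_match(preds, targets)
```

Each ground-truth object is matched to exactly one prediction, so two predictions can never be rewarded for the same object. The prediction left unmatched (index 2 here) would be trained towards the "no object" class. Reordering the predictions changes only the indices in the assignment, not the matched pairs, which is what makes the loss a set loss.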

Compared to most previous work on direct set prediction, the main feature of
DETR is the combination of the bipartite matching loss with transformers that
use (non-autoregressive) parallel decoding.

Related Work
Set Prediction
Set prediction is a task where a model predicts multiple elements whose
ordering is not relevant for correctness (in detection, the set of objects in
an image).
Current approaches handle this by building relationships or predefined
knowledge into the model. For instance, the predicted bounding boxes should
not overlap significantly and should cover all detected objects.
Avoiding near-duplicates: In object detection, the same object is sometimes
predicted with several near-identical bounding boxes. Most current detectors
remove these with NMS as a postprocessing step, whereas direct set prediction
is postprocessing-free and is meant to avoid the duplicates in the first place.

Transformers and Parallel Decoding
Transformers introduced self-attention layers, which, similarly to Non-Local
Neural Networks, scan through each element of a sequence and update it by
aggregating information from the whole sequence.
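
The update described above can be sketched in a few lines of plain Python. This is a deliberately reduced version for illustration: a single attention head with identity query, key and value projections (a real transformer layer learns these projections and adds multiple heads, residual connections and normalisation). The function names are hypothetical.

```python
import math


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(seq):
    """Single-head self-attention over a list of equal-length vectors.

    Assumption: queries, keys and values are the inputs themselves
    (identity projections), to keep the sketch short.
    """
    d = len(seq[0])
    out = []
    for q in seq:
        # Scaled dot-product score of this element against every element.
        scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(d) for k in seq]
        weights = softmax(scores)
        # New value: weighted average over the whole sequence.
        out.append([sum(w * v[c] for w, v in zip(weights, seq)) for c in range(d)])
    return out


sequence = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
updated = self_attention(sequence)
```

Every output element is a weighted average of all input elements, so each position has access to the whole sequence in a single layer. None of the outputs depends on another output, so they can all be computed in parallel.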

Object Detection
Set-based loss: Several object detectors used the bipartite matching loss.
Recurrent detectors: Closest to DETR are end-to-end set prediction methods
for object detection and instance segmentation. Similarly to DETR, they use
bipartite-matching losses with encoder-decoder architectures based on CNN
activations to directly produce a set of bounding boxes. These approaches,
however, were only evaluated on small datasets and not against modern
baselines. In particular, they are based on autoregressive models (more
precisely RNNs), so they do not leverage the recent transformers with parallel
decoding.

The DETR model
Object Detection set prediction loss
