Tag: ICCV

Structured Visual Search via Composition-aware Learning

Mert Kilickaya, Arnold W.M. Smeulders. QUvA Lab, University of Amsterdam. kilickayamert@gmail.com, a.w.m.smeulders@uva.nl ……

Improving Word Recognition using Multiple Hypotheses and Deep Embeddings

Siddhant Bansal, CVIT, IIIT, Hyderabad, India, siddhant.bansal@students.iiit.ac.in; Praveen Krishn……

Viewing instance semantic segmentation as generative adversarial networks GAN Mask R-CNN: Instance semantic segmentation benefits from generative adversarial networks

Quang H. Le, Kamal Youcef-Toumi, Dzmitry Tsetserukou, Ali Jahanian {quan……

[

clee@atmosp.physics.utoronto.ca, Department of Physics, University of Toronto, 60 St. George St, Toronto, M5S 1A7, Canada; james.hogan@mail.utoronto.ca. Abstract: Cra……

Transcription Is All You Need: Learning to Separate Musical Mixtures with Score as Supervision

Abstract: Most music source separation systems require large collections of isolat……

Adaptive Structured Sparse Network for Efficient CNNs with Feature Regularization

Chen Tang*, Wenyu Sun, Zhuqing Yuan, Guijin Wang, Yongpan Liu. Department of Electronic Engineering, Tsi……

An Image is Worth 16×16 Words: Transformers for Image Recognition at Scale

Alexey Dosovitskiy∗,†, Lucas Beyer∗, Alexander Kolesnikov∗, Dirk Weis……

WaveTransformer: A Novel Architecture for Audio Captioning Based on Learning Temporal and Time-Frequency Information

An Tran, Konstantinos Drossos, and Tuomas Virtanen. Audio Resea……

UAV LiDAR Point Cloud Segmentation of A Stack Interchange with Deep Neural Networks

Weikai Tan, Dedong Zhang, Lingfei Ma, Ying Li, Lanying Wang, and Jonathan Li, ……

Investigating Cross-Domain Losses for Speech Enhancement

Abstract: Recent years have seen a surge in the number of available frameworks for speech enhancement (SE) and recogniti……