Tag: Language modelling

UniCase – Rethinking Casing in Language Models

UniCase – Rethinking Casing in Language Models Rafał Powalski, Applica.ai, rafal.powalski@applica.ai; Tomasz Stanisławek, Applica.ai, Warsaw University of……

Self-alignment Pre-training for Biomedical Entity Representations

Self-alignment Pre-training for Biomedical Entity Representations Fangyu Liu†, Ehsan Shareghi†,‡, Zaiqiao Meng†, Marco Basaldella†, Nigel Colli……

Subtitles to Segmentation: Improving Low-Resource Speech-to-Text Translation Pipelines

Subtitles to Segmentation: Improving Low-Resource Speech-to-Text Translation Pipelines. Abstract: In this work, we focus on improving ASR output segmentation in the context of lo……

Multi-Stage Pre-training for Low-Resource Domain Adaptation

Multi-Stage Pre-training for Low-Resource Domain Adaptation Rong Zhang, Revanth Gangi Reddy, Md Arafat Sultan, Vittorio Castelli, Anthony Ferritto, Radu Florian, Efsu……

Recurrent babbling: evaluating the acquisition of grammar from limited input data

Recurrent babbling: evaluating the acquisition of grammar from limited input data Ludovica Pannitto, CIMeC, University of Trento, ludovica.pannitto@unitn.it; Aurélie He……

Scaling Systematic Literature Reviews with Machine Learning Pipelines

Scaling Systematic Literature Reviews with Machine Learning Pipelines Seraphina Goldfarb-Tarrant (equal contribution, order determined by coin flip), Alexande……

Interlocking Backpropagation: Improving depthwise model-parallelism

Interlocking Backpropagation: Improving depthwise model-parallelism Aidan N. Gomez, aidan.gomez@cs.ox.ac.uk, University of Oxford & Cohere; Oscar Key……

BERTering RAMS: What and How Much does BERT Already Know About Event Arguments? – A Study on the RAMS Dataset

BERTering RAMS: What and How Much does BERT Already Know About Event Arguments? – A Study on the RAMS Dataset Varun Gangal, Eduard Hovy, Language Technologies Institute, Carneg……

GLU Variants Improve Transformer

GLU Variants Improve Transformer Noam Shazeer, Google, noam@google.com. Abstract: Gated Linear Units [Dauphin et al., 2016] consist of the component-wise produc……

Investigating Entity Knowledge in BERT with Simple Neural End-To-End Entity Linking

Investigating Entity Knowledge in BERT with Simple Neural End-To-End Entity Linking Samuel Broscheit, Data and Web Science Group, University of Mannheim, Germany, broscheit@informati……