GWU NLP at SemEval-2016 shared task 1: Matrix factorization for crosslingual STS

Hanan Aldarmaki, Mona Diab

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

5 Citations (Scopus)

Abstract

We present a matrix factorization model for learning cross-lingual sentence representations. Using sentence-aligned corpora, the model learns distributed representations by factoring the given data into language-dependent factors and one shared factor. As a result, input sentences from both languages can be mapped into fixed-length vectors and compared directly using cosine similarity. This approach achieves a Pearson correlation of 0.8 on Spanish-English semantic textual similarity.
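The comparison step described in the abstract can be illustrated with a minimal sketch. It assumes the sentence vectors have already been produced by some cross-lingual model; the vectors and the function name `cosine_similarity` below are hypothetical placeholders, not the authors' code or data, and the factorization itself is not shown.

```python
import math


def cosine_similarity(u, v):
    """Return the cosine of the angle between two equal-length vectors."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same length")
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    # Convention chosen for this sketch: a zero vector has similarity 0.
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


# Hypothetical fixed-length vectors for an English sentence and a Spanish
# sentence, standing in for the output of a shared-space model.
english_vec = [0.2, 0.7, 0.1]
spanish_vec = [0.25, 0.65, 0.05]

score = cosine_similarity(english_vec, spanish_vec)
print(f"similarity: {score:.3f}")
```

Because both sentences live in the same vector space, no translation step is needed at comparison time; a score near 1 would indicate that the two sentences are judged close in meaning.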

Original language: English
Title of host publication: SemEval 2016 - 10th International Workshop on Semantic Evaluation, Proceedings
Publisher: Association for Computational Linguistics (ACL)
Pages: 663-667
Number of pages: 5
ISBN (Electronic): 9781941643952
Publication status: Published - 2016
Externally published: Yes
Event: 10th International Workshop on Semantic Evaluation, SemEval 2016 - San Diego, United States
Duration: Jun 16 2016 to Jun 17 2016

Publication series

Name: SemEval 2016 - 10th International Workshop on Semantic Evaluation, Proceedings

Conference

Conference: 10th International Workshop on Semantic Evaluation, SemEval 2016
Country/Territory: United States
City: San Diego
Period: 6/16/16 to 6/17/16

ASJC Scopus subject areas

  • Theoretical Computer Science
  • Computational Theory and Mathematics
  • Computer Science Applications
