Aralex: A lexical database for modern standard Arabic

Sami Boudelaa, William D. Marslen-Wilson

Research output: Contribution to journal › Article › peer-review

73 Citations (Scopus)

Abstract

In this article, we present a new lexical database for Modern Standard Arabic: Aralex. Based on a contemporary text corpus of 40 million words, Aralex provides information about (1) the token frequencies of roots and word patterns, (2) the type frequency, or family size, of roots and word patterns, and (3) the frequency of bigrams and trigrams in orthographic forms, roots, and word patterns. Aralex will be a useful tool for studying the cognitive processing of Arabic through the selection of stimuli on the basis of precise frequency counts. Researchers can use it as a source of information on natural language processing, and it may serve an educational purpose by providing basic vocabulary lists. Aralex is distributed under a GNU-like license, allowing people to interrogate it freely online or to download it from www.mrc-cbu.cam.ac.uk:8081/aralex.online/login.jsp.
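The three kinds of counts named in the abstract can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the toy corpus, the transliterations, the tuple layout, and the function names are hypothetical, and this is not Aralex's own data, schema, or counting procedure.

```python
from collections import Counter

# Hypothetical toy corpus: one (surface form, root, word pattern) tuple per token.
# Transliterations and counts are illustrative only, not taken from Aralex.
corpus = [
    ("kataba", "ktb", "faEala"),
    ("kataba", "ktb", "faEala"),
    ("kitaab", "ktb", "fiEaal"),
    ("kaatib", "ktb", "faaEil"),
    ("darasa", "drs", "faEala"),
    ("daaris", "drs", "faaEil"),
]

def token_frequency(tokens, field):
    """Count every occurrence of a root (field=1) or word pattern (field=2)."""
    return Counter(tok[field] for tok in tokens)

def family_size(tokens, field):
    """Count the distinct surface forms sharing a root or pattern (type frequency)."""
    families = {}
    for tok in tokens:
        families.setdefault(tok[field], set()).add(tok[0])
    return {key: len(forms) for key, forms in families.items()}

def ngram_frequency(items, n):
    """Count contiguous character n-grams across a list of strings."""
    counts = Counter()
    for item in items:
        for i in range(len(item) - n + 1):
            counts[item[i:i + n]] += 1
    return counts

root_tokens = token_frequency(corpus, 1)      # token frequency of each root
root_family = family_size(corpus, 1)          # family size of each root
pattern_family = family_size(corpus, 2)       # family size of each word pattern
root_bigrams = ngram_frequency([tok[1] for tok in corpus], 2)
```

In this toy corpus the root ktb occurs four times as a token but has a family size of three, because kataba is counted once per occurrence in the first measure and once overall in the second.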

Original language: English
Pages (from-to): 481-487
Number of pages: 7
Journal: Behavior Research Methods
Volume: 42
Issue number: 2
Publication status: Published - May 2010
Externally published: Yes

ASJC Scopus subject areas

  • Experimental and Cognitive Psychology
  • Developmental and Educational Psychology
  • Arts and Humanities (miscellaneous)
  • Psychology (miscellaneous)
  • Psychology (all)
