
Resources

CLARIN’s repository contains datasets, models and software. Most are products of the language technology program, but the repository also holds various other data submitted by institutions and individuals. The website of The Icelandic Language Bank (CLARIN-IS B-Centre), malbankinn.is, lists most of the repository's content in a structured way, which gives a good overview of what the repository holds.

News

September 4, 2025

The CLARIN B-Centre in Iceland, hosted by the Árni Magnússon Institute for Icelandic Studies, has changed its name and is now called The Icelandic Language Bank. To mark the occasion, a new website (https://malbankinn.is) was launched today, providing secure and easy access to Icelandic language resources.

Everyone is welcome to download materials from the bank, but the main target groups are researchers and students in the humanities and social sciences who study Icelandic language and society, as well as developers who wish to access datasets, models, and tools related to language technology. The data continue to be hosted on CLARIN’s repository.

February 10, 2025

The Icelandic Gigaword Corpus (IGC) has now been expanded with data from 2022 and 2023. This additional data can be downloaded from the CLARIN repository and searched on the Corpora Website of the Árni Magnússon Institute. In addition, the Corpora Website has been updated and some minor flaws have been fixed.

The first edition of The Icelandic Gigaword Corpus was published in 2018, and new editions appeared every year for the first five years. Each time, new data was added and tagging methods were improved. The first edition contained about 1,259 million running words, while the second edition contained 2,439 million running words. It was not considered necessary to republish the corpus in its entirety this time, as the methods for tagging and processing texts have not changed since the last edition was issued. Instead, an addendum with data from 2022 and 2023 was published, containing around 162 million running words. On the Corpora Website, people can search a new version of the corpus in which the new data has been added to the 2022 edition.

October 9, 2024

The Árni Magnússon Institute's language processing website is up again, enhanced and improved. There you can use the following tools, both by pasting text into a form and by using an API:

Tokenizer - Tokenizer from Miðeind ehf
PoS Tagger - POS from The Language and Voice Technology Lab at the University of Reykjavík
Lemmatizer - Nefnir by Jón Friðrik Daðason
Hyphenation Tool - Skiptir from The Árni Magnússon Institute


CLARIN ERIC

CLARIN ERIC is an ESFRI initiative and one of the EU's ERICs. CLARIN stands for “Common Language Resources and Technology Infrastructure”, and ERIC stands for “European Research Infrastructure Consortium”. CLARIN ERIC operates according to statutes approved by the European Commission.

The main goal of CLARIN ERIC is to make all digital language resources and tools from Europe and beyond accessible through a single sign-on online environment, both to support researchers in the humanities and social sciences and for use in language technology.

CLARIN-IS

Iceland joined CLARIN ERIC on February 1st, 2020, after having been an observer since November 2018. The Ministry of Education, Science and Culture assigned The Árni Magnússon Institute for Icelandic Studies the role of leading partner in the Icelandic National Consortium and appointed Professor Emeritus Eiríkur Rögnvaldsson as National Coordinator. Starkaður Barkarson replaced Eiríkur Rögnvaldsson in October 2021 and was replaced by Steinþór Steingrímsson in September 2025. Most of the relevant institutions participate in the CLARIN-IS National Consortium.

During the first years, The Árni Magnússon Institute ran a Metadata Providing Centre (CLARIN C-Centre), which was upgraded in 2023 to a Service Providing Centre (CLARIN B-Centre) that provides both services and access to resources and knowledge. The B-Centre is called The Icelandic Language Bank and operates the website malbankinn.is.
