Annif - tool for automated subject indexing

How to use Annif

Choose subject vocabulary

Prepare a corpus from training data

Load the vocabulary and train a model

Suggest subjects for new documents
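
The four steps above can be sketched in Python as follows. This is a minimal sketch under stated assumptions, not the project's own tooling. It assumes a TSV corpus layout of one document per line (the text, a tab, then subject URIs in angle brackets) and the `annif load-vocab`, `annif train` and `annif suggest` subcommands, so check both against the Annif documentation for your version. The helper names, the vocabulary ID `yso`, the project ID `tfidf-en` and the file names are hypothetical.

```python
def format_corpus_line(text, subject_uris):
    """Build one TSV corpus row: document text, a tab, then subject URIs.

    Assumed layout: URIs are wrapped in angle brackets and separated by spaces.
    """
    # Collapse tabs and newlines so the text stays in a single TSV column.
    clean_text = " ".join(text.split())
    uris = " ".join(f"<{uri}>" for uri in subject_uris)
    return f"{clean_text}\t{uris}"


def annif_commands(vocab_id, vocab_file, project_id, corpus_file):
    """Return the assumed CLI calls for loading, training and suggesting."""
    return [
        ["annif", "load-vocab", vocab_id, vocab_file],  # load the vocabulary
        ["annif", "train", project_id, corpus_file],    # train the model
        ["annif", "suggest", project_id],               # suggest subjects
    ]


# Hypothetical example data: one training document with two subjects.
row = format_corpus_line(
    "Automated subject\tindexing in libraries",
    ["http://example.org/subject/1", "http://example.org/subject/2"],
)
commands = annif_commands("yso", "vocab.ttl", "tfidf-en", "corpus.tsv")
```

Each command list could be passed to `subprocess.run`. The sketch also assumes that the project ID is already defined in the Annif project configuration and that the `suggest` step reads the new document's text from standard input.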

Annif uses a combination of existing natural language processing and machine learning tools including TensorFlow, Omikuji, fastText and spaCy. It is multilingual and can support any subject vocabulary (in SKOS or a simple TSV format). It provides a command-line interface, a simple Web UI and a microservice-style REST API.
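
As an illustration of the REST API mentioned above, the following sketch builds a suggestion request and parses a response using only the Python standard library. The endpoint path `/v1/projects/<project_id>/suggest`, the form fields `text` and `limit`, the `results` list with `label` and `score` keys, and the base URL `http://localhost:5000` are assumptions about a typical local setup and should be verified against the API documentation of your Annif instance. The function names and the project ID `tfidf-en` are hypothetical.

```python
import json
from urllib.parse import urlencode
from urllib.request import Request, urlopen


def build_suggest_request(base_url, project_id, text, limit=5):
    """Prepare a POST request for the (assumed) suggest endpoint."""
    url = f"{base_url.rstrip('/')}/v1/projects/{project_id}/suggest"
    # Form-encoded body; urllib sends bytes data as a form submission.
    body = urlencode({"text": text, "limit": limit}).encode("utf-8")
    return Request(url, data=body, method="POST")


def parse_suggestions(payload):
    """Extract (label, score) pairs from an (assumed) JSON response body."""
    data = json.loads(payload)
    return [(item["label"], item["score"]) for item in data.get("results", [])]


def suggest(base_url, project_id, text, limit=5):
    """Send the request and return the parsed suggestions."""
    request = build_suggest_request(base_url, project_id, text, limit)
    with urlopen(request) as response:
        return parse_suggestions(response.read().decode("utf-8"))
```

With a running Annif server, `suggest("http://localhost:5000", "tfidf-en", "Some document text")` would return a list of label and score pairs. Building the request and parsing the response are kept in separate functions so both can be checked without network access.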

Get Annif

Code and documentation for Annif are available on GitHub (Apache 2.0 license). Annif can also be installed from PyPI or run as a Docker image from Quay.io. Annif is mainly developed at the National Library of Finland, but others are welcome to join in!

Models

There is a collection of downloadable Annif models in the 🤗 Hugging Face Hub.

Discuss Annif

The annif-users mailing list and web forum is available on Google Groups. The forum is meant for general discussion about Annif, asking for help, and announcements of new versions. All messages are public and anyone is welcome to join!

Please use the forum instead of sending personal e-mail to the Annif developers.

Current users

Finto AI - service for automated subject indexing.

Yle, the Finnish Broadcasting Company, uses Annif to assign tags to online news articles.

The German National Library uses Annif as the core of its automated subject indexing system Erschließungsmaschine (EMa).

The National Library of Sweden uses Annif for automated classification of scholarly publications.

The National Library of Poland uses Annif as a part of the DESKRYPTOR service for automated subject indexing.

Storia Oy generates metadata about upcoming books with Annif.

On Data.europa.eu Annif aids in tagging datasets with EU vocabularies.

AATA, a research database by the Getty Conservation Institute, uses Annif in their indexing process to suggest terms from Getty's Art and Architecture Thesaurus.

In Jyväskylä University Digital Repository and in repositories of other institutes (Osuva, Trepo, Theseus, Taju, Lauda) Annif assists the subject indexing of theses and dissertations.

ZBW – The Leibniz Information Centre for Economics uses Annif as a part of its automated indexing service AutoSE.

More users of Annif and/or Finto AI

Publications

A paper on arXiv describes the Annif-based system that took part in SemEval-2025 Task 5: LLMs4Subjects. Our system was ranked 🥇 1st in the category where the full vocabulary was used, 🥈 2nd in the smaller vocabulary category and 🏅 4th in the qualitative evaluations.

The Annif user survey report (2025) includes opinions about Annif as well as insights and challenges related to its use.

An article that investigates the usage of Annif for Dewey Decimal Classification was published in 2024 in the Journal of Documentation.

An article about Annif and Finto AI was published in 2022 in the peer-reviewed Open Access journal JLIS.it.

The software itself is also archived on Zenodo and has a citable DOI. See the README on the Annif GitHub project site for more details, including BibTeX snippets.
