
Data Set

Open Access. Views: 804

Russian state institutions full-text datasets

A collection of corpora based on contents extracted from the websites of Russian state institutions

Version 1.0, published: Oct. 29, 2024.

Giorgio Comai


Main category: Miscellaneous
Curated by: Eduard Klein
Giorgio Comai (2024): Russian state institutions full-text datasets – A collection of corpora based on contents extracted from the websites of Russian state institutions, v. 1.0, Discuss Data, https://doi.org/10.48320/0578D7FE-35F7-4E9E-A29D-926618A5C6BD

Description

This is a collection of full-text datasets based on contents extracted from the websites of Russian state institutions.

None of the datasets includes items published after 31 December 2023.

These datasets are introduced in the following book chapter, which offers additional context:

> Comai, Giorgio (2025, forthcoming), "Text-mining on-line sources from Russia openly", in *Autocracy, Influence, War: Russian Propaganda Today*, edited by Paul Goode

The name of each corpus is composed of three parts separated by underscores: the bare domain name, a two-letter code for the main language of the contents, and the year of release of the dataset, e.g. `kremlin.ru_ru_2024` for the Russian-language version of Kremlin.ru.
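
The naming convention can be split back into its parts programmatically. The following is a minimal sketch, not part of the dataset's own tooling: the names `CorpusName`, `parse_corpus_name` and `build_corpus_name` are hypothetical and introduced here only for illustration.

```python
from typing import NamedTuple


class CorpusName(NamedTuple):
    """Parts of a corpus name such as 'kremlin.ru_ru_2024' (hypothetical helper)."""
    domain: str
    language: str
    year: int


def parse_corpus_name(name: str) -> CorpusName:
    """Split a corpus name into domain, language code and release year."""
    # Split from the right, so that only the last two underscores are used
    # as separators and the domain part is kept whole.
    domain, language, year = name.rsplit("_", 2)
    if len(language) != 2 or not year.isdigit():
        raise ValueError(f"Unexpected corpus name: {name!r}")
    return CorpusName(domain, language, int(year))


def build_corpus_name(domain: str, language: str, year: int) -> str:
    """Compose a corpus name from its three parts."""
    return f"{domain}_{language}_{year}"


parsed = parse_corpus_name("kremlin.ru_ru_2024")
print(parsed.domain, parsed.language, parsed.year)
```

Splitting from the right keeps multi-level domains such as `transcript.duma.gov.ru` intact, since only the language code and the year follow the last two underscores.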

This release includes the following websites:
- Russia’s president, kremlin.ru, in English, filename: kremlin.ru_en_2024, from 1999-12-31 to 2023-12-31. Items included: 33 165
- Russia’s president, kremlin.ru, in Russian, filename: kremlin.ru_ru_2024, from 1999-12-31 to 2023-12-31. Items included: 45 538
- Russia’s MFA, mid.ru, in English, filename: mid.ru_en_2024, from 2003-01-04 to 2023-12-31. Items included: 25 943
- Russia’s MFA, mid.ru, in Russian, filename: mid.ru_ru_2024, from 2003-01-02 to 2023-12-31. Items included: 56 203
- Russia’s government, government.ru, in Russian, filename: government.ru_ru_2024, from 2006-06-22 to 2023-12-30. Items included: 17 135
- Russia’s government (archived version), archive.government.ru, in Russian, filename: archive.government.ru_ru_2024, from 2008-05-07 to 2013-05-21. Items included: 7 103
- Russia’s prime minister (archived version), archive.premier.gov.ru, in Russian, filename: archive.premier.gov.ru_ru_2024, from 2008-05-07 to 2012-05-07. Items included: 3 323
- Russia’s Duma, duma.gov.ru, in Russian, filename: duma.gov.ru_ru_2024, from 2006-04-05 to 2023-12-30. Items included: 29 094
- Russia’s Duma (transcripts), transcript.duma.gov.ru, in Russian, filename: transcript.duma.gov.ru_ru_2024, from 1994-01-11 to 2023-12-15. Items included: 6 032

File formats: compressed csv files (.csv.gz); Open Document Spreadsheets (.ods)
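
The compressed csv files can be read without unpacking them first. The sketch below uses only the Python standard library and rests on a few assumptions that this page does not confirm: that the files are UTF-8 encoded, that the first row holds column names, and that a file is named after its corpus with a `.csv.gz` suffix. The function name `iter_rows` is hypothetical, and no column names are assumed.

```python
import csv
import gzip
from typing import Dict, Iterator

# Full-text fields may be longer than the csv module's default field size
# limit (131072 characters), so the limit is raised before reading.
csv.field_size_limit(10**9)


def iter_rows(path: str) -> Iterator[Dict[str, str]]:
    """Yield one dict per row of a gzip-compressed csv file.

    Assumes UTF-8 encoding and a header row (not confirmed by the source).
    """
    with gzip.open(path, mode="rt", encoding="utf-8", newline="") as handle:
        yield from csv.DictReader(handle)


# Example usage, with an assumed local file name:
#
#     for row in iter_rows("kremlin.ru_en_2024.csv.gz"):
#         print(list(row.keys()))  # inspect the available columns
#         break
```

Reading row by row keeps memory use low, which matters for the larger corpora in this release.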

A web version of the documentation accompanying this release is available online:
https://tadadit.xyz/datasets/2024/russian_institutions_2024/

Explore through a basic web interface:
https://explore.tadadit.xyz/2024/ru_institutions_2024/

Countries

Russia

Keywords

Russian President, Text Mining, Russian Institutions, Parliament, Government

Language of data

English, Russian

Disciplines

Communication Studies, Political Science

Methods of data collection

Text Mining

Methods of data analysis

Word Frequency

© 2025 Research Centre for East European Studies at the University of Bremen and Göttingen State and University Library. All Rights Reserved.