
kremlin_en - A textual dataset based on the contents published on the English-language version of the Kremlin’s website

A corpus in tabular format with all posts published on the official website of the president of the Russian Federation between 31 December 1999 and 31 December 2020

Version 1.0, published: May 9, 2021

Open Access

Giorgio Comai


Main category: Miscellaneous
Giorgio Comai (2021): kremlin_en - A textual dataset based on the contents published on the English-language version of the Kremlin’s website, v. 1.0, Discuss Data, <doi:10.48320/5EB1481E-AE89-45BF-9C88-03574910730A>.

Description

This corpus includes all posts published on the official website of the president of the Russian Federation between 31 December 1999 and 31 December 2020.

The corpus complies with the Text Interchange Formats (TIF) specification: https://docs.ropensci.org/tif/
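As an illustration of what compliance with the TIF corpus data frame convention means in practice, the following minimal sketch checks that a table has `doc_id` and `text` as its first two columns and that document ids are unique. The function name `check_tif_corpus` and the sample rows (including the extra `date` column) are hypothetical and are not taken from the published files.

```python
import csv
import io


def check_tif_corpus(rows, fieldnames):
    """Return a list of problems; an empty list means the table looks TIF-compliant.

    Checks only two points of the TIF corpus data frame convention:
    the first two columns are named 'doc_id' and 'text', and doc_id values are unique.
    """
    problems = []
    if list(fieldnames[:2]) != ["doc_id", "text"]:
        problems.append("first two columns must be 'doc_id' and 'text'")
        return problems
    seen = set()
    for i, row in enumerate(rows):
        doc_id = row["doc_id"]
        if doc_id in seen:
            problems.append(f"duplicate doc_id at row {i}: {doc_id!r}")
        seen.add(doc_id)
    return problems


# Hypothetical sample standing in for the real CSV file (column layout is assumed).
sample = io.StringIO(
    "doc_id,text,date\n"
    "a1,First sample post.,2000-01-01\n"
    "a2,Second sample post.,2000-01-02\n"
)
reader = csv.DictReader(sample)
sample_problems = check_tif_corpus(list(reader), reader.fieldnames)
print(sample_problems)
```

To run the same check on the downloaded file, replace the `io.StringIO` sample with `open(path, newline="", encoding="utf-8")`, where `path` points to the local copy of the CSV file.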

The corpus is available as a CSV file as well as a TXT file.

This dataset includes word frequency counts aggregated by year.
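For readers who want to see how yearly word counts of this kind can be derived from a TIF-style table, here is a minimal sketch. It assumes each row carries a `date` field in `YYYY-MM-DD` form alongside `doc_id` and `text`; that column name, the simple regular-expression tokenizer, and the sample rows are all assumptions made for illustration and do not describe how the published counts were produced.

```python
import re
from collections import Counter, defaultdict


def word_counts_by_year(rows):
    """Aggregate lowercase word counts per year from rows with 'text' and 'date' keys."""
    counts = defaultdict(Counter)
    for row in rows:
        year = row["date"][:4]  # assumes an ISO-style date string (YYYY-MM-DD)
        # Naive tokenizer: runs of ASCII letters and apostrophes, lowercased.
        words = re.findall(r"[a-z']+", row["text"].lower())
        counts[year].update(words)
    return counts


# Hypothetical rows standing in for the real corpus.
sample_rows = [
    {"doc_id": "a1", "text": "Energy talks on energy policy.", "date": "2000-01-01"},
    {"doc_id": "a2", "text": "Talks continue.", "date": "2001-03-05"},
]
yearly = word_counts_by_year(sample_rows)
print(yearly["2000"]["energy"])  # 2
```

Keeping one `Counter` per year makes it straightforward to compare the frequency of a given term across years, which is the kind of analysis the aggregated counts are meant to support.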

For a full technical note, see:
https://castarter.giorgiocomai.eu/kremlin_en-about/

You can explore this dataset in an interactive interface at the following link:
https://castarter.giorgiocomai.eu/kremlin_en/

This dataset is available as a data package for the R programming language at the following link:
https://github.com/giocomai/tifkremlinen

Countries

Russia

Keywords

Russia, Official Discourse, Content Analysis, Russian President

Methods of data collection

Text Mining

Methods of data analysis

Word Frequency