This dataset consists of two parts:
(1) A database of more than 10,000 transcripts published on the official website of the President of Russia since 31 December 1999 (last update: 30 August 2023).
(2) A frequency analysis of the terms used by President Putin and President Medvedev in these documents.
This is the data used in the word counting tool at https://putin.dekoder.org/.
New in version 1.2:
This version adds texts up to 30 August 2023. It also includes filtered versions of the original transcripts (containing only Putin's and Medvedev's words) and pre-processed wordlists (lemmatised versions of the filtered transcripts) as used in the analysis.
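
For readers who want to reproduce a simple term count from the lemmatised wordlists, the following is a minimal sketch only. It assumes each transcript has already been loaded as a list of lemmas; the function name term_frequencies and the sample data are hypothetical and are not part of the dataset, and the dataset's actual file layout and the exact method behind the published analysis are not described here.

```python
from collections import Counter


def term_frequencies(documents):
    """Count how often each lemma occurs across a collection of documents.

    `documents` is an iterable of token lists, one list of lemmas per
    transcript. How the lists are loaded from disk is left to the reader,
    since the file format is not specified in this description.
    """
    counts = Counter()
    for lemmas in documents:
        counts.update(lemmas)
    return counts


# Hypothetical, made-up sample data (NOT taken from the dataset).
sample_documents = [
    ["economy", "growth", "economy"],
    ["security", "economy"],
]

frequencies = term_frequencies(sample_documents)
print(frequencies.most_common(1))  # → [('economy', 3)]
```

Counting over lemmas rather than raw word forms groups inflected variants of the same word under a single entry, which is presumably why the wordlists are provided in lemmatised form.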