Journal Digital Corpus: Swedish Newsreel Transcriptions
The Journal Digital Corpus (JDC) comprises transcriptions of Swedish historical newsreels, primarily sourced from the SF Veckorevy newsreels produced between the early 1910s and the 1960s. JDC includes transcribed speech from 2,553 newsreels (over two million words) and intertitles from 4,333 videos. Utilizing the custom-built Python libraries SweScribe and stum, the corpus facilitates u