Predicting North Korea’s Policy Changes Through Propaganda: A New Open-Source Method
How advances in machine learning and AI can help inform the West about Pyongyang’s intentions and potential future policies.
The Democratic People’s Republic of Korea, or North Korea, which has been under the Kim family’s rule for three generations over nearly 80 years, remains one of the most repressive and secretive autocracies on earth. However, as North Korea’s nuclear and missile programs continue to advance, the stakes could not be higher for policymakers in the United States and its allies to better understand, or even anticipate, North Korea’s policy shifts.
In this policy brief, we introduce a new open-source method for analyzing North Korean state propaganda to gain insight into the North Korean regime’s priorities. Six years ago, one of us, Weifeng Zhong, cocreated the Policy Change Index (PCI) project for China, which predicts Beijing’s moves by analyzing the text of the People’s Daily, the Chinese Communist Party’s most prominent official newspaper.1 This study is an extension of that approach, this time using North Korea’s flagship propaganda outlet, Rodong Sinmun, or the Workers’ Newspaper.
Like its Chinese and Soviet counterparts, North Korean propaganda primes public opinion in preparation for policy implementation, often in an ideological and obscure style. Thanks to recent advances in machine learning and artificial intelligence, we were able to train a machine algorithm to mine the Kim regime’s propaganda for the subtle shifts that can signal Pyongyang’s potential future policies.
The “Value” of North Korean Propaganda
Western audiences often dismiss North Korean propaganda as synonymous with blatant misinformation and disinformation, understandably thinking of North Korea’s bizarre videos on social media,2 its anti-Western leaflets,3 or even its garbage-laden balloons flying to South Korea.4 But while Pyongyang’s propaganda is difficult for Westerners to interpret, it is in fact of great intelligence value.
The role, and therein the hidden “value,” of the Kim regime’s propaganda is to persuade the masses of its agenda in order to rule them more effectively. This role has a deep connection to China’s and the Soviet Union’s historical uses of propaganda. Kim Il Sung, Kim Jong Un’s grandfather, visited China over 40 times in his lifetime and modeled the Workers’ Party of Korea (WPK) on the Communist Party of China, which treats news media as a purely ideological tool. As early as the 1950s, Mao Zedong repeatedly impressed on the People’s Daily’s editorial staff that the newspaper must be run by politicians, not by intellectuals, and that it must sensitize the people to the party’s new policy ideas in a timely manner.5
The WPK also looked to the Soviet Union in developing its use of propaganda. Vladimir Lenin and Josef Stalin explained the persuasive nature of propaganda in the Soviet Union’s early years: The masses are “backward,” so the Communists need to convince the people that the government’s policies are sound and that they should follow the party’s lead.6 This was done by making all aspects of culture, from media and books to music and cinema, political.7
Like the Communist parties in China and the Soviet Union, the WPK assigns media the role of ideological “education” and runs the party’s most prominent official newspaper, Rodong Sinmun, out of its central committee. The WPK also appoints a propaganda tsar who fully and carefully controls the government’s messaging at home and abroad. North Korea’s propaganda needs to explain the party’s agenda and mobilize the people toward it, and this process can provide some transparency about North Korea’s intentions to outside observers. Moreover, the more unexpected Kim’s policies are to the North Korean people (which is more the rule than the exception in Pyongyang), the more “education” is required to bring the people on board before implementation. This manifests as more space in Rodong Sinmun devoted to upcoming policy announcements. That is, words speak sooner than actions, which enables the anticipation of significant policy moves by the regime through its propaganda.
The WPK’s bureaucracy is tailor-made to implement this persuasive vision of propaganda. Just like their Soviet and Chinese models, the Kims have been closely involved in deciding what words should be uttered by party mouthpieces and how. In a manual on how journalists should do their jobs, Kim Jong Il—the editor of the volume—gave a detailed account of how, in October 1979, his father demanded that the Rodong Sinmun’s editorial staff cover a particular campaign on the newspaper’s second page, as opposed to on a less prominent page.8 The campaign, known as the “Three-Revolution Team Movement,” was meant to promote Kim Jong Il as his father’s successor, which was not an obvious choice given that Kim Jong Il had no public profile at the time and faced fierce competition from rivals such as his uncle and siblings.9 As a result of preparatory propaganda like this in those critical years, Kim Jong Il was able to consolidate support within the party and solidify his heir-apparent status in the early 1980s.
Despite decades and generations of leadership changes in the WPK, its propaganda tsar remains a core member of the party central, suggesting that the essential role of Rodong Sinmun has persisted. Kim Jong Il controlled the WPK’s Propaganda and Agitation Department (PAD) from the late 1960s to 1985, shortly before taking the party’s helm.10 The PAD was subsequently run for three decades by Kim Ki Nam, a close and prominent confidant of the Kim family, before it was taken over in 2015 by Kim Yo Jong, Kim Jong Un’s sister.11
The Policy Change Index for North Korea (PCI-NKO)
Capitalizing on the value of propaganda as a signal of North Korean leaders’ plans, the Policy Change Index for North Korea uses modern AI tools such as deep learning and large language models (LLMs) to detect and interpret changes in the policies that Rodong Sinmun features and prioritizes. Because of the “educational” function of propaganda in the North Korean regime, we hypothesize that when the newspaper changes the emphasis of its coverage, changes in Pyongyang’s policies will follow once the party deems the North Korean public ready.
The PCI-NKO framework
We took the following steps to develop the PCI-NKO algorithm.12
- Collect the full text of Rodong Sinmun from January 2018 to February 2024 and label a set of essential metadata for each article, such as publication date, title, content, and page number. In particular, we focus on whether or not an article was published on the front page—a simple but effective indicator of the importance of the content.
- For every four years of data (such as from January 2018 to December 2021), train a deep learning model tailored to the Korean language to predict whether an article was published on the front page.13 The goal of this step is to characterize, to the best of the algorithm’s performance, how the newspaper prioritizes different content based on the party’s policy direction.
- Deploy the model to the three months following the four-year window and assess whether the algorithm’s performance (in identifying front-page articles) is significantly different from that in training. The goal of this step is to detect prioritization anomalies over the three-month window relative to the newspaper’s four-year baseline.
- Define the difference in the algorithm’s performance between the training period and the deployment period as the value of the PCI-NKO at the point of analysis. When the index is high, it suggests that the editorial priorities in those three months have shifted from the preceding four years.
- Use LLMs to interpret the anomalous articles detected in the three-month window. False positives—articles that were predicted to be on the front page but were not—represent policies with declining priorities. Similarly, false negatives—articles that were predicted to be off the front page but that were actually on the front page—signal policies that are becoming more prominent.14
- Repeat the analysis every month, resulting in a monthly PCI-NKO from April 2022, the earliest data point given the scope of the raw data, to February 2024.
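The rolling-window logic in the steps above can be sketched in a few lines of Python. This is a minimal illustration, not the project's code (which is in the repository cited in note 12): the helper names `shift_months`, `windows_for`, and `pci_value` are hypothetical, and the sketch assumes that each index value is dated on the first day of the month after its three-month deployment window. The brief does not name the performance metric or the sign of the difference, so both are treated as assumptions here.

```python
from datetime import date


def shift_months(d: date, months: int) -> date:
    """Return the first day of the month that lies `months` months from d."""
    idx = d.year * 12 + (d.month - 1) + months
    return date(idx // 12, idx % 12 + 1, 1)


def windows_for(index_date: date, train_months: int = 48, deploy_months: int = 3):
    """Half-open [start, end) training and deployment windows for one index date.

    Assumes the index is dated on the first day of the month that follows
    the three-month deployment window.
    """
    deploy_end = shift_months(index_date, 0)
    deploy_start = shift_months(deploy_end, -deploy_months)
    train_start = shift_months(deploy_start, -train_months)
    return (train_start, deploy_start), (deploy_start, deploy_end)


def pci_value(train_score: float, deploy_score: float) -> float:
    """Index value as the training-minus-deployment performance gap.

    The performance metric and the sign convention are assumptions here.
    """
    return train_score - deploy_score


train, deploy = windows_for(date(2023, 4, 1))
print(train)   # training window: January 2019 up to (not including) January 2023
print(deploy)  # deployment window: January 2023 up to (not including) April 2023
```

Under these assumptions, the earliest index date the raw data can support is April 1, 2022, because its training window begins in January 2018.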
To test the hypothesis that change in the PCI-NKO is predictive of a change in policies, we curated a list of major policy events in North Korea during the period of the analysis and examined whether their occurrences were preceded by significant movements in the PCI-NKO.
Figure 1 plots the monthly PCI-NKO from April 2022 to February 2024, with the value of the index measuring the difference in the algorithm’s performance between the four-year training period and the three-month deployment period. In the figure, we also overlaid the timing of major policy events on the index for comparison. The PCI-NKO does tend to spike before policy events take place.
Predicting North Korea’s nuclear arsenal expansion
The clearest example of a PCI-NKO spike preceding a major policy change in North Korea occurred in April 2023.
The PCI-NKO increased to and stayed at an elevated level in early 2023, registering a value of 0.1 on April 1, 2023. As outlined in step five above, we leveraged LLMs to interpret the changes in Rodong Sinmun’s coverage that were registered by the spikes of the PCI-NKO. We then assessed whether the elevated level of 0.1 was indeed predictive of the regime’s policy change. The April 1 spike represents a major editorial change in Rodong Sinmun between the four years from January 2019 to December 2022 and the three months from January to March 2023. Using LLMs, we categorized sampled articles from the training and deployment windows into 10 policy areas. We then examined the areas where the algorithm makes more false negative mistakes (measured by the false omission rate) in deployment than in training, indicating emerging priorities compared to the baseline.
Figure 2 compares the false omission rate (false negatives as a percentage of predicted negatives) across policy areas, and it shows that “defense and national security” has the biggest training-to-deployment contrast in false omission. Specifically, 10 articles about defense and national security in February and March 2023 were incorrectly predicted by the model to be off the front page, yet they were prominently featured on it. Most of these misclassified articles highlighted advancements in intercontinental ballistic missile (ICBM) technology and military readiness, reflecting Kim Jong Un’s focus on military exercises and counterattack training and suggesting military progress. These anomalies detected by the algorithm highlight the increasing prominence of North Korea’s weapons program in Rodong Sinmun’s coverage.
Consistent with this prediction, on April 13, 2023, North Korea conducted the inaugural launch of the Hwasong-18, its first long-range, solid-fueled ICBM, potentially putting the entire continental United States within range of Pyongyang’s weapons.15 This marked yet another advancement in the weapons program and was followed by other North Korean military events, including the launches of its spy satellites, which initially failed in May but succeeded in November of that year. While weapon developments are concerning and often surprising, they may well be foreseeable when one knows how to look at Pyongyang’s propaganda.
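The false omission rate comparison can be made concrete with a short Python sketch. It is illustrative only: the function names `false_omission_rates` and `emerging_areas` are hypothetical, the records are made-up toy values rather than data from the study, and treating an area that is absent from training as having a baseline rate of zero is an assumption of this sketch.

```python
from collections import defaultdict


def false_omission_rates(records):
    """Map each policy area to FN / (FN + TN).

    `records` holds (area, on_front_page, predicted_front_page) tuples.
    Areas with no predicted negatives are left out to avoid dividing by zero.
    """
    false_neg = defaultdict(int)
    predicted_neg = defaultdict(int)
    for area, on_front_page, predicted_front_page in records:
        if not predicted_front_page:
            predicted_neg[area] += 1
            if on_front_page:
                false_neg[area] += 1
    return {area: false_neg[area] / n for area, n in predicted_neg.items()}


def emerging_areas(train_records, deploy_records):
    """Rank areas by the rise in false omission rate from training to deployment."""
    train = false_omission_rates(train_records)
    deploy = false_omission_rates(deploy_records)
    # Assumption: an area unseen in training gets a baseline rate of 0.0.
    gaps = {area: deploy[area] - train.get(area, 0.0) for area in deploy}
    return sorted(gaps.items(), key=lambda kv: kv[1], reverse=True)


# Toy, made-up records for illustration only (not data from the study).
toy_train = (
    [("defense", False, False)] * 9 + [("defense", True, False)]
    + [("economy", False, False)] * 8 + [("economy", True, False)] * 2
)
toy_deploy = (
    [("defense", False, False)] * 6 + [("defense", True, False)] * 4
    + [("economy", False, False)] * 8 + [("economy", True, False)] * 2
)
print(emerging_areas(toy_train, toy_deploy))  # "defense" ranks first in this toy case
```

In the toy data, the rate for "defense" rises from training to deployment while the rate for "economy" stays flat, which is the kind of contrast the comparison is meant to surface.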
Limitations
One important limitation of the PCI approach is that, while LLMs can assist with interpreting anomalies in propaganda, the policy implications of those anomalies are not always as straightforward as in the April 1, 2023 episode. For example, the PCI-NKO recorded a value of 0.14 on March 1, 2023, the highest in the entire timeframe. An analysis of false negatives similar to what is shown in figure 2 suggests that “social policy and ideology” was the most significant emergent theme. However, as far as we know, there were no major events in that policy area around that time aside from a prolonged absence of Kim Jong Un in the public eye in January 2023. But reading into intricate political subtleties, such as the absence of a leader, is beyond the algorithm’s current capabilities.
We also acknowledge that the algorithm’s training performance still has room for improvement in terms of gearing it toward the specific context. The deep learning model we used, like pre-trained language models more generally, may not take into account the differences between the North and South Korean languages, which have diverged significantly after decades of separation.16 Additionally, the uncertain policy directions of the North Korean government may also have contributed to the algorithm’s performance issues. After all, the PCI approach is at best only as predictive as the government is predictable. We leave these opportunities for improvement to future studies.
Conclusion
As the North Korean regime turns more aggressive yet secretive, Pyongyang’s intentions become a bigger unknown to policymakers in the West. This is a challenge that calls for new thinking and methods. North Korean propaganda may be the door to understanding the unknowns of North Korean policy, and modern AI technology is our key to unlocking it.
Appendix
Acknowledgments
We would like to thank Nick Eberstadt for an inspiring conversation when the idea for this project was first conceived and Tiago Ventura for his helpful advice on the language models for Korean text used in this study. All errors are our own.
About the Authors
Zhiqiang Ji holds an MS in data science for public policy from Georgetown University and a PhD in political philosophy and American politics from Claremont Graduate University. His current research interest is applying data science, natural language processing, and artificial intelligence to the study of ideology and political propaganda. Ji is also an external contributor to the open-source Policy Change Index project (policychangeindex.org).
Weifeng Zhong is an affiliated scholar at the Mercatus Center at George Mason University and a senior advisor to the Office for Fiscal and Regulatory Analysis at the America First Policy Institute. Zhong has a PhD in managerial economics and strategy from Northwestern University. His work bridges the fields of machine learning and artificial intelligence and public policy studies. He is also a core maintainer of the open-source Policy Change Index project (policychangeindex.org).
Notes
1. Julian TszKin Chan and Weifeng Zhong, “Reading China: Predicting Policy Change with Machine Learning” (AEI Economics Working Paper No. 2018-11, American Enterprise Institute, 2019).
2. Dasl Yoon, “This Song Is Catchy and Going Viral. It’s Also North Korean Propaganda,” The Wall Street Journal, July 1, 2024.
3. “North Korea Is Dropping Leaflets on the South—What Do They Say?” BBC, January 19, 2016.
4. Hyung-Jin Kim, “North Korea Flies Trash-Carrying Balloons to South Korea in Another Retaliation Against Leafletting,” Associated Press, June 8, 2024.
5. Mao Zedong, Long Live Mao Zedong Thought (Wuhan University Press, 1968).
6. Alex Inkeles, Public Opinion in Soviet Russia: A Study in Mass Persuasion (Harvard University Press, 1950), 17–18, 162.
7. Peter Kenez, The Birth of the Propaganda State: Soviet Methods of Mass Mobilization, 1917–1929 (Cambridge University Press, 1985).
8. Kim Jong Il, “Giving Prominence to the Three-Revolution Team,” The Great Teacher of Journalists (Pyongyang: Foreign Languages Publishing House, 1983), 37–39.
9. Ra Jong-yil, Inside North Korea’s Theocracy: The Rise and Sudden Fall of Jang Song-thaek (State University of New York Press, 2019), 35.
10. Jae-Cheon Lim, Leader Symbols and Personality Cult in North Korea: The Leader State (Taylor and Francis, 2015), 10.
11. “Kim Yo Jong in De Facto Power of PAD,” Daily NK, July 20, 2015, https://www.dailynk.com/english/kim-yo-jong-in-de-facto-power-of-p/.
12. This section describes the algorithm at an abstract level. For more details, please see the PCI website at https://policychangeindex.org/ and the open-source code repository at https://github.com/PCI-ORG/PCI-NKO.
13. The model used here is known as RoBERTa, a robustly optimized pre-trained language model developed by Meta AI. See “RoBERTa: An optimized method for pretraining self-supervised NLP systems,” Meta AI, July 29, 2019, https://ai.meta.com/blog/roberta-an-optimized-method-for-pretraining-se….
14. See the next subsection for an analysis of an example episode.
15. “The DPRK’s First Solid-Propellant ICBM Launch,” Open Nuclear Network, April 14, 2023, https://opennuclear.org/open-nuclear-network/publication/dprks-first-so….
16. “North and South Korea Through Word Embeddings,” Digital NK, December 23, 2017, https://digitalnk.com/blog/2017/12/23/north-and-south-korea-through-wor….