Andrew Martinez on the Art of Forecasting
What does the NGDP Gap reveal about monetary policy?
Andrew Martinez is a former Treasury economist and is currently an assistant professor of economics at American University. In Andrew’s first appearance on the show, he discusses his career as a forecaster, the current state of forecasting, the intersection of AI and forecasting, the role of the SEP and monetary policy surprises, his work with David on the NGDP Gap measure, and much more.
Subscribe to David's new Substack: Macroeconomic Policy Nexus
Read the full episode transcript:
This episode was recorded on January 13th, 2025
Note: While transcripts are lightly edited, they are not rigorously proofed for accuracy. If you notice an error, please reach out to [email protected].
David Beckworth: Welcome to Macro Musings, where each week we pull back the curtain and take a closer look at the most important macroeconomic issues of the past, present, and future. I am your host, David Beckworth, a senior research fellow with the Mercatus Center at George Mason University. I’m glad you decided to join us.
Our guest today is Andrew Martinez. Andrew is a former Treasury economist and comes to us now from the economics department at American University. Andrew joins us today to discuss the art of forecasting, his work at the Treasury Department, measuring monetary policy shocks correctly, and more. Andrew, welcome to the show.
Andrew Martinez: Thanks for having me, David. A long-time listener, first-time participant. It’s great to be here.
Beckworth: Glad to have you on. This is a long time coming. We’ve actually worked together. I have visited you several times when you were at Treasury. I’ve been eager to get you on the show, but you’ve always been in a position where you can’t because of where you work. Now you are an academic, you can say what is on your mind, right?
Martinez: Right. I will note that I still am doing some part-time consulting work for Treasury, so I still want to issue a disclaimer that these are all of my views and not necessarily those of any of my employers.
Andrew’s Career
Beckworth: Okay. As I mentioned, I’ve actually done some work with Andrew. We’ll get to that later. Tell us about yourself, your career. How did you get into economics and into forecasting in particular?
Martinez: It’s an interesting story. My dad was in the military, and so I moved around the country and the world quite a bit. I spent a lot of time in Germany growing up, went to a German school. When it came time to go to college and what I wanted to study in college, I initially thought, “Oh, I want to be a diplomat, use this experience that I’ve had from living around the world.” I had been to these small schools. I went to high school in rural Maine. I wanted a small liberal arts school, so I chose this small liberal arts school in North Carolina, Guilford College, to go, started going there.
I think it was my spring semester of my freshman year, I took a skills course or a careers in international affairs course. One of the things they said or emphasized was, if you want to do the track for the State Department, the Foreign Service track, there’s economic affairs, political affairs, and they were saying the economic affairs track is almost always undersubscribed. There’s a lot of interest in people who have that economics interest but, at the time, there wasn’t a lot of people, so it made sense to start taking some economics courses. I did that. I started taking a macroeconomics course, and I just fell in love with it.
I think my second economics course was really about the transition of the former Soviet economies into market economies, and doing research on these former USSR or former Soviet economies. I ended up comparing East Germany versus West Germany and their labor market developments, and really seeing how those differences in unemployment rates persisted for decades, even after the wall fell and unification in 1990. That really sparked an interest in research for me, and it was a bug that has kept on going. That’s what really sparked my interest in economics initially.
Beckworth: At Treasury, you were doing a lot of forecasting work, right?
Martinez: Right.
Beckworth: In fact, that’s where I’ve worked with you. You’ve been the forecaster in our group of researchers.
Martinez: Exactly.
Beckworth: Tell us how you become a specialist in that area.
Martinez: My career has taken a bunch of turns. In undergrad, I got really interested in international economics, initially in trade, and I ended up doing a master’s program at GW in international economic policy. Right before I started that, I decided to take a year off and work in construction, and the financial crisis happened. I’m working in construction in rural Maine as the housing market and the financial crisis are imploding.
Beckworth: Wow. What timing.
Martinez: I go to GW to start my master’s program, and I’m like, “Trade is interesting, but I’m really interested in what’s going on, like financial crises,” and so started doing that. As I got more and more into that, I realized, “Oh, it’s really methodology. There’s a lot of methodology components to thinking about these things,” and then also got really interested in time series. I had taken a course from Fred Joutz at GW. He’s since passed away, but he was this really interesting macroeconomist, really applied work. The GW department had a lot of applied work.
At the time, Herman Stekler, Fred Joutz, Tara Sinclair, who I’ve known since then, and Neil Ericsson were the professors who all taught there. I took this course from Fred Joutz, and then even after graduating—I was working at the IMF at the time—I took a course from Neil Ericsson on forecasting and just completely fell in love with it. I think economics tries really hard to be this really rigorous hard science as best it can in this imperfect world, whereas forecasting, I think, is really honest about what it’s doing. It’s using theories, it’s using data, but it’s fundamentally making these guesses about the world.
Then, depending on what horizon you’re forecasting at, you get to see either immediately, or sometimes years later, whether your guess was correct. There’s this really interesting feedback pattern in terms of learning about how your guesses are doing, and making these guesses. That was really attractive to me. I really started that while I was at GW, then moved to Oxford. I wanted to do a PhD, so I went and did a PhD at Oxford under David Hendry, who’s a big-time—
Beckworth: Well known.
Martinez: —time series macro econometrician. He was my supervisor, and really got into and studied forecasting under him. Then that’s what led me to then come back. I’ve always had this interest in the interplay between economic policy and academic research and being able to apply those back and forth. I had interned during my PhD at the Cleveland Fed, so I’d had some experience there, but then started at the Treasury in the fall of 2019, working in their macro analysis group doing forecasting for them. That’s the circuitous way of my story.
Beckworth: That’s so interesting. When you went to Treasury, you had that Tara Sinclair connection, I guess. Was she there?
Martinez: No, Tara showed up later. I started in fall of 2019. Tara, I think, started in 2021, ’22.
Beckworth: You actually preceded her?
Martinez: Exactly. We had known each other since 2008, 2009, when I was at GW. They have a forecasting group that I’ve participated in. Our connection goes way back, yes.
Beckworth: We’ll come back to Tara a few times in this conversation, specifically, the big one being a paper that you have with her, working paper, and we’ll talk about that.
State of Forecasting
Let me go back to forecasting. If I can be so bold as to ask, what is the art or the current state of forecasting as it is in 2025? I want to maybe go down two paths. Let’s talk about AI’s interaction, but put that to the side for now. If I just said, what is the current state of the art in forecasting, what would you say? Bayesian forecasting? If you were to teach it, what would you say?
Martinez: I will be teaching it in the fall. I have been teaching a course with Neil Ericsson over the past couple of years. I think it’s important to take a step back and think about where we’re coming from. From this perspective of forecasting, we’re thinking about macro forecasting. There’s a broader world of forecasting, weather forecasting or forecasting all sorts of things.
Beckworth: Good point.
Martinez: If you think about macro forecasting, there has certainly been an evolution of that over the last several decades. Historically, in macro forecasting, it’s often really hard to beat very simple models. There’s the Meese and Rogoff example of forecasting exchange rates, where it’s really hard to beat random walks. In the, I think, ’70s and ’80s, you had these large-scale macro models that filtered into policy, and you had the Lucas critique out of that instability. Really, behind all of this, the question was, how well do these models forecast? In particular, during periods of crises and instability, how well can you forecast?
It’s pretty easy to come up with a model that forecasts well during periods of stability. If we’re sticking at a 2% or 3% growth rate, and I constantly just forecast a 2% or 3% growth rate, I’m going to do pretty well. You don’t have to have a model, just a simple mental model, to generate that forecast. Big data, I think, has come in and really revolutionized that art. Stock and Watson came in with these dynamic factor models, and that has really caught on.
Then machine learning and model selection algorithms. David Hendry, my supervisor, really emphasizes these model selection or machine learning algorithms, where you can select across a large number of variables, and then also think about instabilities, such as structural breaks, both in-sample and out-of-sample. Thinking about time-varying parameters or thinking about instabilities, that can be done in a frequentist way or in a Bayesian way. Bayesian analysis has become very popular as computational power has grown.
Beckworth: Oh, interesting.
Martinez: Those are all ways of thinking about instabilities and about shrinking down the parameters to try to reduce this risk of overfitting. There’s constantly this risk of, are you overfitting or are you not explaining enough? That’s this bias-variance tradeoff that you’re constantly facing, both as an econometrician and especially as a forecaster. That’s the econometrics methodological literature there.
In the policy world, as a forecaster, you don’t necessarily have all of those techniques at your disposal. Fundamentally, when you’re doing a forecast for policymakers, they don’t want to just know what the forecast is; they want to know why. You can’t just go and think about these huge factor models, or these huge selection models, and select down and be like, “Here’s the forecast.” You have to have a story. You have to come up with, say, “Why is this the way it is?” In that case, a lot of times you’re sticking to more simpler methods, or robust methods that work fairly well, to try and be able to tell this reasonable story where you might not have it quite exactly right, but hopefully you’re getting the main story there.
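Note: to make the bias-variance tradeoff described above concrete, here is a minimal, hypothetical Python sketch (not from the episode, and not any specific method Martinez uses): a kitchen-sink regression with many candidate predictors, a shrinkage and selection estimator standing in for the model selection algorithms he mentions, and a naive mean forecast, all compared on simulated data.

```python
# Minimal sketch of the bias-variance tradeoff in forecasting: many candidate
# predictors, only a few of which actually matter. All numbers are illustrative.
import numpy as np
from sklearn.linear_model import LinearRegression, LassoCV

rng = np.random.default_rng(0)
T, K = 200, 50                        # 200 periods, 50 candidate predictors
X = rng.normal(size=(T, K))
beta = np.zeros(K)
beta[:3] = [0.8, -0.5, 0.3]           # only 3 predictors truly matter
y = X @ beta + rng.normal(size=T)

split = 150                           # pseudo out-of-sample split
X_in, X_out = X[:split], X[split:]
y_in, y_out = y[:split], y[split:]

def rmse(pred):
    return np.sqrt(np.mean((y_out - pred) ** 2))

ols = LinearRegression().fit(X_in, y_in)      # uses everything: low bias, high variance
lasso = LassoCV(cv=5).fit(X_in, y_in)         # shrinks most coefficients toward zero
naive = np.full_like(y_out, y_in.mean())      # "always forecast the average"

print("OLS with all 50 predictors:", round(rmse(ols.predict(X_out)), 3))
print("LASSO, kept", int(np.sum(lasso.coef_ != 0)), "predictors:",
      round(rmse(lasso.predict(X_out)), 3))
print("Naive mean forecast:", round(rmse(naive), 3))
```

On data like this, the unregularized regression tends to fit noise in the 47 irrelevant predictors, while the shrinkage estimator usually lands much closer to the forecast you would get if you knew the three relevant variables in advance.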
Beckworth: That is so interesting, and there are so many questions I could ask. The why part is really fascinating. Let’s start with that. I was at George Washington’s forecasting conference just a few months ago. I participated in it. One of the presentations talked about these machine learning models that do the selection: basically, you throw every forecasting model under the sun into some system and you let the machine determine which are the best ones that fit. That was always the question. Okay, you get the result, then you take it to your boss, and you have to explain it, and you can’t. And so you have to come up with something that is digestible and maybe some causal story. You’re saying it’s easier for policymakers to get simpler models because you can tell the story more directly?
Martinez: Right. That’s the key role of economists in all of this because you can have a data scientist or a computer scientist, really, or even computers nowadays just spit out answers. The key role for economists here is to think about what is that economic mechanism. What is leading to what? If we’re thinking there’s going to be higher inflation in the future, what is driving that? What is at least the story that we think is going on, so that at least we have that story, then we can validate that story empirically and say, “Oh, we were wrong. Here’s why,” or, “We were right. Here’s what’s going to happen next.”
I think that for policymakers certainly to be able to act upon that information and to understand that, it’s important to have that story. As a policy economist who’s participating in that process and trying to provide the best advice, you’re necessarily using those simpler models. Now, that doesn’t mean that you only use those models. In the background, we might be running these other models that are more advanced to be able to understand them.
We’re oftentimes comparing against what the private sector is forecasting or these consensus forecasts, and that has led to a lot of my research, or running a lot of different alternative models, some of which are interpretable, some of which might just be more of a black box, and seeing, is the story we’re telling consistent with what others are telling? Is it similar to what other methods and models are giving us, even though those methods and models might not be explicitly transparent in terms of where those results are coming from?
Beckworth: I want to go back to this comment you made about Bayesian models, or Bayesian statistics versus classical or frequentist. You mentioned that it’s become more popular, in part because it’s easier or cheaper to run these models. You would say the reason that we see so much Bayesian econometrics today is because people can do the computation?
Martinez: I think that’s partially true. Economics definitely goes through different cycles and fads in methods and methodologies, so you see that going on. In statistics and also in econometrics, there are always different schools of thought. Definitely, right now and certainly in the US, Bayesian econometrics is a very popular school of thought. It’s quite nice. Implicitly, the value of the Bayesian approach is that you’re able to impose these priors on the data.
Particularly when you are in situations where you don’t have a lot of data, you impose priors on the data, and that helps govern the data that you have. Now, it’s increasingly used in situations where you have tons and tons of data, so you’re able to still regularize and impose the priors. I come from a more frequentist school of thought. David Hendry’s [method is] based on this likelihood-based approach, which has similarities to the Bayesian school, but it’s much more about letting the data speak, without imposing those priors, really trying to understand the data, where there might be measurement issues, and trying to deal with that directly rather than indirectly through priors.
Beckworth: Maybe we should define Bayesian thinking or Bayesian statistics. Let me do a really dumbed-down example for people like me out there who are listening who aren’t as technical as you. This is an example I’ve heard, and you tell me whether it makes sense or not. Imagine you want to come up with an estimate of the average potato in the world. You’ve gone out to your local shopping market and you’ve seen a few potatoes. That’s your prior knowledge, like, okay, a certain size, weight. You use that to form an opinion. That’s your prior estimate of what you think the world’s average potato is like. That’s a Bayesian. Then as you see more potatoes through your life, you update that, you bring more information in.
Whereas a frequentist would say, “No, I don’t have enough information. I’ve got to actually have a good sample size, drawing on the law of large numbers, get my normal distribution, draw the inferences, do hypothesis testing.” The frequentist would say, “No, you really can’t make that strong conclusion. You’re imposing a lot of prior belief structure on the data.” The Bayesian would say, “Look, I’ve got to do something. I’ve got to make some informed choice.” Is that a fair reading?
Martinez: Yes, I think so. The risk there is that you’re relying too much on those priors. If you’re coming into the world saying, “I have this view that all potatoes are Yukon gold potatoes,” then I go to Peru, I come across these purple and other types of potatoes, and I say, “That’s not a potato,” right?
Beckworth: Yes.
Martinez: The risk is if you’re imposing that structure in a world that’s subject to change, that you might be sampling that prior knowledge on something that is either based on the past that is subject to change or based on a sample that is different from the one you’re dealing with. I think that frequentists, really, that’s why they say, “We have to stick to the data as much as possible to not bias our results.” Now, it really comes down to how much you’re imposing the strength of your prior beliefs.
If you’re a good Bayesian, you’re usually comparing what the resulting distribution is to your prior distribution, and making sure that you’re not restricting the data too much. Those are different schools of thought.
Beckworth: Yes. There’s tradeoffs to doing both approaches.
Martinez: Exactly.
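Note: here is a toy numerical version of the potato example, a hypothetical sketch using conjugate normal updating with made-up numbers, to show how a tight prior pulls the Bayesian estimate away from the frequentist sample mean when the new data come from a different population than the prior assumed.

```python
# Toy version of the potato example: estimating the average potato weight.
# The Bayesian starts from a prior and updates it with data; the frequentist
# just uses the sample mean. Normal prior, normal data, known variances.
import numpy as np

rng = np.random.default_rng(1)

# Prior: "potatoes weigh about 170g," based on a handful seen at the local market
prior_mean, prior_var = 170.0, 10.0**2

# New data: a sample from a different population (say, smaller purple potatoes)
data = rng.normal(loc=120.0, scale=25.0, size=20)
data_var = 25.0**2

# Frequentist estimate: let the data speak
freq_estimate = data.mean()

# Bayesian posterior mean: a precision-weighted average of prior and data
n = len(data)
post_var = 1.0 / (1.0 / prior_var + n / data_var)
post_mean = post_var * (prior_mean / prior_var + data.sum() / data_var)

print("Sample mean (frequentist):", round(freq_estimate, 1))
print("Posterior mean (Bayesian):", round(post_mean, 1))
# With this tight prior, the posterior is pulled part of the way back toward
# 170g even though the sample says roughly 120g; with a diffuse prior, the
# two estimates would essentially coincide.
```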
AI and Forecasting
Beckworth: A few more questions on forecasting, then we’ll get to your papers. The hardest part, as I understand it, in macroeconomic forecasting is the turning points. You mentioned the economy is growing at 2%, 3% a year. That’s easy. It’s predicting when we’re going to go into a recession or a massive boom. How do you get to those turning points? That’s where you, the forecaster, have to come in and play.
That leads me to the next question I want to ask, and the final one in our forecasting discussion here, and that is, what role is AI playing in forecasting? We mentioned Tara Sinclair. She has a paper that she discussed on the podcast previously where she used these AI agents. She feeds them information as if they’re members of the FOMC. She can do almost like counterfactual forecasts, or conditional forecasts: if they’d had an employment report before the meeting versus after the meeting, would they have done things differently? I’m wondering, do you see promise, hope, great innovations in the interaction of AI with forecasting?
Martinez: I think there’s two things in there. The question is, how do we think about turning points, and then what is the value of AI potentially around those turning points, or more generally, in forecasting? When we’re thinking about forecasting, and particularly forecasting turning points, it’s really about how much new information do we have? Can we predict that turning point based on the information we have at the time?
If you think about the Great Financial Crisis, or even to some extent the pandemic, but I think the Great Financial Crisis is easier because you can say, with hindsight, there were these subprime mortgages built up, these risks to the financial system. There were individuals making bets on the economy that the housing market would go into a downturn. There was information there that, with hindsight, if you had that and embedded that within a model, then you could forecast well.
There has been a lot of innovation since the Global Financial Crisis in adapting these models to get higher frequency information, or different ways of incorporating financial market risk into models, to be able to think about these tail risks and incorporate that into the forecast. That is something that had been going on prior to AI, with other innovations, with scraping text data or big datasets. AI can be very useful in that sense in terms of taking in all of that information. AI is very useful at basically accessing all of our publicly, and sometimes not publicly, available information to be able to learn from that. That’s, I think, a benefit of AI and really where methods and models are trying to go.
There’s a tradeoff, though, and that’s a tradeoff we found during the pandemic. As somebody who was working at Treasury during the pandemic, you had a lot of these measures that were initially very useful: initial unemployment claims coming out on a weekly basis, really exploding. Then you had a lot of these private companies making data accessible, hotel reservations or dinner table reservations, things like this. That became really useful initially during the pandemic when there was this huge change going on, and you really needed to just try and calibrate the magnitude of that change. That outweighed any other variation going on. After you move out of that phase where you get this huge shift, then you get to a period where a lot of that higher frequency data, really granular data, is starting to add a lot of noise.
There’s a tradeoff: when you’re going through the shift, these high-frequency measures, these really leading indicators, so to speak, can add a lot of value, but they can start adding noise when you’re not going through a major shift like that. How do you balance that? That’s something economists, econometricians, and forecasters are constantly dealing with. In terms of AI, like I said, I think it’s very useful in terms of being able to sift through massive amounts of data and potentially get at things that might not be at the top of our minds or that we might not be thinking about.
The difficulty there, I think, is something that forecasters are constantly dealing with: real time. In real time, we’re dealing with the data we have access to now. We have this limited information set of what’s available to us now. Policymakers, forecasters, we’re all dealing with that. The markets are dealing with that. We’re having to make our forecasts based on that data. If I were to go back today and make forecasts for 2007, 2008, I would have access to a lot more information, a lot more data. I can make much better forecasts than I would have been able to make back then knowing what I know now, even if I limit myself to the data that I had back then.
That’s the difficulty of evaluating these AI forecasts. These AI forecasts are generated on this huge amount of data. Going back in time, they might be able to produce these forecasts that look amazing, but how well do they do in a real-time data situation where we’re updating them with that information at the time and they don’t necessarily have this understanding of, “Oh, that’s how the pandemic worked.”
Returning to Tara’s paper, they really made an effort of looking at the July interest rate decision and trying to only use information available up until then. So they could look at this one decision, and it looked pretty good, but that makes the process of understanding how well it would do in other situations much more difficult without constraining the models and methods to really only having the information that policymakers or forecasters would have had right at that time.
They really tried to do that in July, but there’s more work that needs to be done in terms of methods and analysis to really see how far we can push these AI methods in doing these counterfactual experiments.
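Note: a minimal, hypothetical sketch of the real-time discipline described above: at each forecast origin the model may only see the data vintage that existed at that date, and it is then judged against what was eventually realized. The vintages and values below are invented for illustration.

```python
# Real-time pseudo-out-of-sample evaluation: fit a simple AR(1) on each data
# vintage as it existed at the forecast origin, forecast one step ahead, and
# compare against the eventually realized value. All numbers are made up.
import numpy as np

# vintages[origin] = the history (say, GDP growth) as known at that origin;
# later vintages revise earlier observations and append new ones.
vintages = {
    "2008Q1": np.array([2.1, 2.4, 1.9, 2.0, 0.6]),
    "2008Q2": np.array([2.1, 2.4, 1.9, 1.8, 0.9, 0.6]),
    "2008Q3": np.array([2.1, 2.3, 1.9, 1.8, 1.0, 0.5, -0.5]),
}
# "Final" values, as published years later, one step ahead of each origin
realized_next = {"2008Q1": 0.9, "2008Q2": -0.5, "2008Q3": -2.1}

errors = []
for origin, y in vintages.items():
    x, target = y[:-1], y[1:]                    # AR(1) by simple OLS
    slope, intercept = np.polyfit(x, target, 1)
    forecast = intercept + slope * y[-1]         # one-step-ahead forecast
    errors.append(realized_next[origin] - forecast)
    print(f"{origin}: forecast {forecast:.2f}, realized {realized_next[origin]:.2f}")

print("RMSE across origins:", round(float(np.sqrt(np.mean(np.square(errors)))), 2))
```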
Beckworth: Great points. We’ve got to constrain the AI agents to be realistic about what they would have known had they started back then. The other point I think you’re alluding to as well is that data itself gets revised, so even if we had the real-time data, GDP, for example, didn’t show the sharp contractions in 2008 until data revisions came out.
Martinez: There’s a really interesting paper—I’m going to blank on the authors—but some Federal Reserve staff economists came out with it last fall and looked at how AI perceives data revisions or data vintages.
Beckworth: Oh, interesting.
Martinez: They went back and asked AI, “What was GDP at this time, assuming you only have this information?” They found that there’s essentially this weird smoothing going on, where AI is sometimes biasing itself and pretending it doesn’t have that information, and sometimes it does, and so there’s these weird look-back, look-forward things going on, which make thinking about real-time data and real-time vintages really difficult, even when you’re telling AI explicitly, “Do not use this information.”
That’s partly because AI is trained on all of the data in the internet, and all of the data in the internet isn’t necessarily structured data or well-defined data that says, “This was generated on this particular date.” Some of that data is structured and well defined, but a lot of that data isn’t, and so you have a lot of these issues. There’s methods being developed to try and basically estimate these AI models in real time, and that’s certainly something that’s going on. There’s a lot of research that’s still going into that.
The SEP and Monetary Policy Surprises
Beckworth: Fascinating. Let’s move on to your papers. Let’s start with your paper that you have done jointly with Tara Sinclair, and it’s titled “When the Fed Reveals Its Hand: The SEP and Monetary Policy Surprises.” This speaks to this question of how much information do you have in real time. Walk us through it. Tell us what’s going on here.
Martinez: I think it helps to think about the motivation for this paper. I’d been at the US Treasury following the macroeconomy, and what I had noticed, or what we had noticed—and I think this was particularly true in the December 2024 meeting—is that you saw a lot of market reaction to the FOMC meeting and the statement, even when the interest rate decision itself was very well telegraphed. There wasn’t a particular surprise about that meeting’s decision, but there were questions that came up about the FOMC’s outlook, so what’s contained in the summary of economic projections that they release every quarter.
The idea was, how do we get at understanding what role these economic projections that the FOMC is releasing might play? Do these matter? How is the market reacting to them? What we did is we looked at, really, this literature. The empirical macroeconomics literature is really interested in understanding exogenous monetary policy and what role monetary policy plays in the economy.
One of the ways they have been doing this in recent years is looking at financial market reactions, high-frequency financial market reactions around FOMC statement releases. Once they make their decision, how does the financial market respond in this 30-minute window around that decision? There’s the structure of the way the FOMC releases their SEPs. They release them every other meeting, so four times per year, which allows you to think about how are markets reacting to non-SEP releases versus SEP releases? It gives you these interesting counterfactuals.
What we did is we first said, “How would we think about this in a theoretical framework,” and then, “What are the predictions from that theoretical framework that we can then test?” In a theoretical framework, we came up with a fairly simple model, which essentially says that during SEP releases, there’s a release of information that the market is getting above and beyond the interest rate decision. They’re getting to see what the Fed’s outlook for these variables are.
You can think of that as really a noisy measure of what the Fed’s true forecast would be because these are the medians, the medians of all the FOMC participants, not necessarily the voting members. You can think of this as a noisy proxy of the FOMC’s true information that they’re paying attention to. That gets released during these SEP release meetings. Then during the non-SEP release meetings, the idea that we came up with was that, really, markets are likely anchoring to that previous forecast that the SEP gave.
During SEP meetings, the market is formulating their expectations of what the SEP is going to be, they make their predictions of what it’s going to be, and then they get to evaluate those predictions against what the SEP actually is. Naturally, what you’re going to have is these bigger surprises during forecast release meetings when you have both the interest rate decision but also that new information about the forecast release, which markets can then use to update their expectations. That was the idea. The question was, “Do we see that in the data?”
We first started looking at just these measures of monetary policy surprises, which academics use as exogenous in their papers to try and understand the effects of monetary policy, so these high-frequency surprises. We looked at whether there was a difference in these surprises during SEP release meetings and non-SEP release meetings, and we found that there was. Looking at the absolute value, the surprises are significantly larger during SEP release meetings, suggesting that there’s more information release going on, and that it’s on the order of about two times the size for the sample that we look at.
Then we also said, “Okay, there could be other things going on here. What is driving those differences?” Bloomberg has a really unique survey. Bloomberg conducts a survey roughly 10 days before every FOMC meeting asking market participants, essentially around 50 banks, what they think the interest rate decision is going to be. For SEP meetings, they also ask what each bank thinks the FOMC’s median SEP is going to be for each of the individual variables.
So you get a measure of what market expectations for the SEP are, and then you can use that to look at the differential between that and what the actual SEP is, so you get what we call an SEP surprise. You get that for all the different variables at different horizons. Then we think of different ways of aggregating it to see whether this actually explains the differential between these larger financial market surprises during SEP release meetings versus non-SEP meetings, and we actually find significant evidence that it does. That’s it in a nutshell, and then we do some additional analysis looking at whether individual banks are updating their own forecasts in response to these surprises as well, and find that they do.
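Note: here is a stylized sketch, with invented numbers rather than the paper’s data or code, of the two calculations just described: comparing average absolute high-frequency surprises at SEP-release versus non-SEP meetings, and constructing an SEP surprise as the released median projection minus the Bloomberg-survey expectation of that median.

```python
# Stylized illustration of the two comparisons: (1) are absolute high-frequency
# surprises larger at SEP-release meetings, and (2) an "SEP surprise" defined as
# the released median SEP minus the survey-expected median. Numbers are invented.
import pandas as pd

meetings = pd.DataFrame({
    "meeting":      ["Jan", "Mar", "May", "Jun", "Jul", "Sep"],
    "sep_release":  [False, True, False, True, False, True],
    "hf_surprise":  [0.02, -0.06, 0.01, 0.05, -0.01, -0.04],  # 30-min futures move, pp
    "sep_expected": [None, 4.6, None, 4.4, None, 4.1],        # survey-expected median
    "sep_actual":   [None, 4.9, None, 4.4, None, 3.9],        # released median SEP
})

# (1) Average absolute surprise by meeting type
abs_by_type = (
    meetings.assign(abs_surprise=meetings["hf_surprise"].abs())
            .groupby("sep_release")["abs_surprise"]
            .mean()
)
print(abs_by_type)

# (2) SEP surprise = released median minus survey expectation (SEP meetings only)
sep_meetings = meetings[meetings["sep_release"]].copy()
sep_meetings["sep_surprise"] = sep_meetings["sep_actual"] - sep_meetings["sep_expected"]
print(sep_meetings[["meeting", "hf_surprise", "sep_surprise"]])
```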
Beckworth: Okay. Just to summarize, you take the widely used kind of event study work that looks at, let’s say, fed funds futures going into the meeting and sees if it changes after the meeting. That’s the typical shock, right?
Martinez: Yes.
Beckworth: You’re like, “Hey, let’s look at SEP versus non-SEP,” and there is this statistically significant difference between the two.
Martinez: Correct.
Beckworth: And what explains it? You say it’s this surprise in the SEP, the summary of economic projections, the numbers that come out versus what was expected. Now, the Bloomberg measure, does that come out right before the meeting, so it’s pretty close?
Martinez: Exactly. The survey is conducted typically 10 days before the meeting, and then closed about seven days before the meeting. It’s closed basically during the blackout period. I mean, there’s still about seven days in between, and we try to do some robustness checks to see if there’s additional information that could come out between that. But yes, it’s about as close as you can get in terms of a survey.
Beckworth: That is interesting. Now, what would that mean for me, a researcher, if I’m plugging these shocks into a vector autoregression or some kind of analysis? Do I need to take account of these differences?
Martinez: Right. This really ties into the literature. There’s this longer literature on trying to understand these monetary policy shocks and thinking about the information effect of the central bank, when the central bank releases this information, and that is what others have argued leads to these puzzles. Nakamura and Steinsson have a 2018 QJE paper where they say it looks like there’s this information effect, and it leads to these puzzles where when you’re using this monetary policy shock, you get these unexpected responses where inflation might go up initially in your VAR, even when there’s a restrictive monetary policy shock. The idea there is that there’s this information effect where the central bank is releasing additional information above and beyond what is thought to be an exogenous monetary policy shock. That’s why we think of them as monetary policy surprises, because it’s not just the shock, there’s additional information being released in that.
There’s literature on this saying, oh, it’s information effects, and then there’s others, Michael Bauer and Eric Swanson very popularly, who say it’s not information, it’s news. It’s the response to news. And so we’re adding that, even in the context of controlling for news, there seem to be these additional information effects, particularly through the release of the SEP.
Beckworth: Okay. We’re talking about high-frequency measures of monetary policy shocks, and you say there’s more information, call it a surprise. I’m just wondering, as I sit here and I think about the Romer and Romer measure, which isn’t high-frequency financial data, but it’s based off of forecasts from the Tealbooks, the Fed staff forecasts. I’m wondering if the Romer and Romer measure also should take into account more recently this development.
Martinez: Romer and Romer actually have a 2008 paper. It’s somewhat different, but they look at the differential between the staff forecasts and the FOMC forecasts at the time. They actually argue that the differential between the staff and the FOMC feeds into their types of monetary policy surprises. So, to the extent that markets are aligned with the staff, then there could be a similar dynamic going on there. But really, our approach is really unique in terms of thinking about what others have thought is this really narrow window in which not a lot of information can happen. We’re adding to this literature that’s saying, even after you’re controlling for as much as possible, it looks like there’s still these information effects going on.
Nominal GDP/Expectations Gap
Beckworth: We’ve been talking about forecasts, about measuring monetary policy shocks, time series analysis. As it turns out, we have collaborated on a project which I have called the nominal GDP gap. You’ve called it expectations gap. It’s an interesting story, because you were working with a friend of mine, Alex Schibuola, at Treasury. I had developed this idea of a nominal GDP gap measure, and I did it because I wanted to come up with a way to get a sense of how much excess or shortage of aggregate demand pressures there were in the economy based on what was expected at the time.
What I did is I went to a consensus forecast measure, the Survey of Professional Forecasters. You guys went to the Blue Chip when you did this later. I just took an average of expectations, leading up to some point, of what nominal GDP in level terms, the dollar size, would be, compared to what it actually turned out to be. That difference is the gap, or if you want to call it a shock—I know shock may be too strong of a term—but it’s definitely a difference from what was expected.
My thinking, at least, was, well, if households are making forecasts for, do I take out a mortgage, do I take out a car loan, do I invest in certain activities, they have to have some sense of where their nominal incomes or dollar incomes are going. Any divergence between that expectation and what actually occurs is a measure of how much excess aggregate demand or stimulus there is in the economy, or whether there’s a shortage. It’s similar to an output gap measure, but this is looking at nominal demand rather than real measures. I did that, and maybe if you want to flesh it out better than I just did, feel free to do so. But you guys picked it up as well, because Alex was thinking about it. Is that right?
Martinez: Yes. This all occurred, if I recall, during the early days of 2020, and so Alex and I were still at Treasury at the time, and we were both working from home. Your brief came out, and we were both reading it. I think Alex might have circulated it, but I may have seen it independently. Anyways, we both came across it and both got really interested in that measure, for slightly different reasons. Both of us had been working with the Blue Chip, because the Blue Chip forecast is something we followed very closely when we were working on the administration’s forecast. And it’s a monthly measure, so it was something we were very curious to see.
I think your measure only went back to 1990. I think the Blue Chip at the time had data back to the 1980s or something like that, so there was a real interest in seeing what it looked like as it went back. I know Alex started really pushing on looking at the Blue Chip. We were in the midst of this incredibly deep recession, really the early days of this incredibly deep recession, and so the idea was really about understanding that slack.
The way I was tackling it was really trying to think about it from a methodological standpoint: How does that differ, this so-called NGDP gap measure, even just from its technical construction, from more empirically oriented output gap measures? Then also, can we think about uncertainty or thinking about sensitivity to that? You were working with the Survey of Professional Forecasters, which is publicly available, using the mean or the median measure. The question was, well, the Survey of Professional Forecasters has the individuals, so can we construct it from an individual basis and try to think of the range of what those individuals might be thinking to give you a range of that? That was something I had worked on, and something we implemented for the Survey of Professional Forecasters as well as the Blue Chip Forecasters.
Then also, looking at how this compares to these other output gap measures. That’s why my notation leans slightly more toward the expectations gap and the NGDP expectations gap, because, based on your construction, implicitly what it is doing is looking at the forecasts made over the past five years of the current quarter’s growth, or the current quarter’s level of GDP, and then comparing the average of those forecasts against what the actual outturn was. From an economic perspective, it seemed like that made a lot of sense in terms of setting expectations for your borrowing and things like that, but then the question is, how does this compare to the other measures?
In particular, what I found was there’s a measure that had become popular, the Hamilton filter, which is a way of getting an output gap empirically. Some economists, Josefine Quast and Maik Wolters, I think based in Germany, had done this extension of the Hamilton filter, where the Hamilton filter is essentially generating forecasts at a particular horizon and then taking the deviation of the actual outcome from those forecasts. They had extended this to look at an average of those forecasts over a particular window, and you can think about those windows. They emphasized that if you chose a window around five years, then that tends to align quite nicely with a lot of these other policy institution forecasts.
It seemed like, oh, here we have a measure that isn’t based on a particular model, but it’s based on forecasters’ models of what they think GDP, particularly NGDP, is going to be, and then you’re averaging over those. And so you can think of this as a model-free version of these empirical output gap measures, at least in a level sense. You were applying that to nominal GDP, but one could also think about applying that to real GDP. Then the question was, how well did this perform? Really focusing on the NGDP aspect, which is something that people didn’t seem to have been interested in up until that time, it seemed to be very useful, particularly during the pandemic and post-pandemic.
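Note: a rough, hypothetical sketch of the construction described above, with invented numbers rather than the published series: the neutral level for a quarter is the average of the forecasts of that quarter’s NGDP level made over the preceding five years or so, and the gap is the percent deviation of actual NGDP from that neutral level.

```python
# Rough sketch of a forecast-based NGDP gap: average the forecasts of quarter t's
# NGDP level made over the preceding ~20 quarters, then take the percent deviation
# of actual NGDP from that "neutral" expected level. All numbers are invented.
import numpy as np

# forecasts_of_t[h] = a forecast of quarter t's NGDP level made h quarters earlier
# (a toy stand-in for SPF or Blue Chip consensus levels, in $ trillions)
forecasts_of_t = np.array([21.0, 21.3, 21.6, 21.8, 22.0, 22.1, 22.3, 22.4,
                           22.6, 22.7, 22.8, 22.9, 23.0, 23.1, 23.1, 23.2,
                           23.2, 23.3, 23.3, 23.4])          # 20 quarters ~ 5 years

actual_ngdp_t = 22.1                                          # realized NGDP level in quarter t

neutral_level = forecasts_of_t.mean()                         # average expected level
ngdp_gap_pct = 100 * (actual_ngdp_t - neutral_level) / neutral_level
print(f"Neutral (expected) NGDP level: {neutral_level:.2f}")
print(f"NGDP gap: {ngdp_gap_pct:.2f}% of the expected level")
```

A negative number here would indicate nominal spending running below what households and firms had been led to expect, the demand shortfall case discussed above; a positive number would indicate excess nominal demand.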
Beckworth: Yes, and you took the initiative and ran with this. So, I did a policy brief, then you and Alex did a working paper at Treasury, which we also published here, and then you took the lead. Even though my name is on this paper, you took the lead and went out and did a very technical treatment of this. It is now being published in the International Journal of Central Banking.
Martinez: Right.
Beckworth: I’ll mention our editor was Chris Waller, Governor Chris Waller, so shout out to you, Chris, for getting this in. Yes, that was great that you went ahead and pushed that forward.
Martinez: Right. I think the motivation for that particular paper is that we had done this working paper, and it wasn’t clear how we wanted to push it forward in terms of a published paper, but I had been reading Ben Bernanke’s book at the time, his 21st Century Monetary Policy book. He had a chapter in there which talked about the 2011 framework review, and talked about the NGDP target proposal and how one of the concerns was about its reliability: whether, if you started targeting NGDP, data revisions could lead to some pretty serious policy mistakes. That really made me wonder, well, have people looked at these measures and analyzed their reliability to the same extent that they’ve done with typical output gap measures?
There are a lot of papers looking at different measures of estimating the output gap and how reliable they are in real time. So we went back and looked at the transcript of that 2011 meeting and saw that that was indeed the case. There were a lot of concerns about the reliability of these measures. Then the question was, well, how reliable are they? In particular, how reliable is this new measure that we had been working on, which is essentially more of a forecast-based NGDP gap measure, which you can think of as a variable growth rate NGDP targeting measure?
Yes, that was the approach we took, and we got some really interesting results in the sense that we can apply the measure to the Survey of Professional Forecasters and the Blue Chip. You can also go back and essentially construct a similar measure based on Federal Reserve staff forecasts in the Greenbooks and Tealbooks at the time, so you can compare the reliability of this kind of forecast-based expectations gap, real or nominal, versus their actual published output gap. And you actually see that using that information, completely in real time, purely based off of what they produced, that measure is much less volatile than the actual output gap measure, which has been critiqued historically.
Orphanides has pioneered this literature on the unreliability of these output gap measures. But actually, the expectations gap, real and nominal from the Fed staff forecast, is actually much less volatile, much less subject to revision. Then you actually see a similar thing for the SPF forecast and the Blue Chip forecast. It basically performs as well as the best empirical benchmarks of output gap statistical measures. That was really exciting to see, and then we did some other exercises as well.
Beckworth: Yes, it’s a paper. I think it might be useful to go back and look at the history of this project. Why did I get into it in the first place? One of the reasons was that people who advocated nominal GDP targeting, or at least were sympathetic to it during the years after the Great Financial Crisis, would often draw a straight linear trend line for nominal GDP and show this big gap: “Hey, man, aggregate demand has collapsed dramatically, we’re going to have hysteresis. We’re going to lose potential real GDP if we don’t close it.”
It’s a great illustration. It’s very stark, very sobering. The problem is, you can’t just draw a straight line forever. Eventually, the economy adapts, and people’s expectations of where nominal GDP is going to go, their plans for their dollar incomes adapt, and so George Selgin told me, “David, you need to come up with a way where you can update that trend line. Don’t just draw a naive linear trend. Find a way that matches reality as people update. We know it’s got to curve at some point.”
I was like, “How do I do this?” And so what I did is I went and looked at these forecasts, and I was like, “Okay, if I take the average of forecasts from forecasters and assume it translates into what households are at least implicitly thinking about their dollar incomes, then I just have to pick a horizon.” I started thinking, like you’ve suggested, that maybe people do make plans a couple of years out, three, four, maybe five years out. I started with those numbers, and then I started mapping it onto actual cyclical unemployment. I was like, man, that looks very similar to other output gap measures. The beautiful thing, as you noted, is this is simply averaging a bunch of forecasts. There’s no structural model. It’s theory-free, I guess, and that to me was like, wow, this is incredible. It shows something you can find through a lot harder work with a structural model.
Martinez: Right. As an econometrician—and we started the conversation talking about forecasting and structural breaks—it’s a model-free measure that is somewhat more robust to structural breaks because people, forecasters, are adapting their models, changing their expectations based on accounting for the pandemic or the Great Recession, and so without having to change the model, it’s naturally adapting based on what forecasters are using as their cutting-edge methods for forecasting.
Beckworth: Okay. With that, our time is up. Our guest today has been Andrew Martinez. Andrew, thank you so much for joining us.
Martinez: Thank you for having me, David.
Beckworth: Macro Musings is produced by the Mercatus Center at George Mason University. Dive deeper into our research at mercatus.org/monetarypolicy. You can subscribe to the show on Apple Podcasts, Spotify, or your favorite podcast app. If you like this podcast, please consider giving us a rating and leaving a review. This helps other thoughtful people like you find the show. Find me on Twitter @DavidBeckworth, and follow the show @Macro_Musings.