Aarushi Kalra on Digital Polarization and Toxicity, Understanding User Behavior, Social Media Algorithms, and Platform Incentives
Kalra and Rajagopalan examine user agency in India’s social media landscape
SHRUTI RAJAGOPALAN: Welcome to Ideas of India, a podcast where we examine academic ideas that can propel India forward. My name is Shruti Rajagopalan, and this is the 2024 job market series, where I speak with young scholars entering the academic job market about their latest research on India.
I spoke with Aarushi Kalra, a Ph.D. candidate in Economics at Brown University. We discussed her job market paper, “Hate in the Time of Algorithms: Evidence from a Large-Scale Experiment on Online Behavior.” We talked about the demand and supply of toxicity against minorities on social media platforms, user behavior, platform behavior, real-world segregation due to ethnic violence, and much more.
For a full transcript of this conversation, including helpful links of all the references mentioned, click the link in the show notes or visit mercatus.org/podcasts.
Hi, Aarushi. Welcome to the show. It’s such a pleasure to see you again and speak with you.
AARUSHI KALRA: Thank you so much, Shruti, for having me. I’m really excited.
Exploring How Social Media Users Engage with Toxic Content
RAJAGOPALAN: You have a job market paper on something that’s super topical right now because we’re all talking about how a very polarized and toxic kind of politics and conversation is playing out on social media, and this is happening across the world. It’s happening in Brazil. It’s happening in the United States. Of course, it’s happening in India. In your job market paper, you are studying this in India through a very large-scale randomized control trial.
In this paper, you basically try to answer the big question of whether social media users engage with harmful content or toxic content because the recommendation algorithms are pushing users towards [that] or because users genuinely prefer that kind of content, and they might go seeking it. To test this, you ran this large-scale RCT with over a million users for a social media platform that should remain unnamed, but it is TikTok-like.
You basically look at Hindi-speaking or rather Hindi-language posts, and you find that without the personalization algorithm, users viewed 27 percent less toxic content, but they also spent 35 percent less time on that particular social media platform overall. On the other hand, if their feeds are personalized, this is not true: they may see more toxic content, but they also engage more.
The even more interesting result you found was that the proportion of toxic posts shared for each toxic post viewed actually increases by 18 percent. Even though, when feeds are randomized, the total number of toxic posts viewed is lower, the number that’s shared goes up. Basically, this shows that in some sense those with a stronger preference for this kind of toxic or polarized content are still seeking it out and still want to share it with others. While these algorithms may amplify certain types of engagement, there is also a deep-seated preference amongst users who may be drawn towards that content. That’s what’s going on with the users.
The second, or rather the third, result that I found interesting is that there’s a trade-off at the platform level: when content is more randomized, the average individual is exposed to less toxic content, but they also engage less on the platform, which means that the platform may not do as well if its revenues are coming from engagement and sharing.
Is this a good summary of what you’ve tried to do? Because it’s quite a complex setup, and you’re studying something that is super nuanced, so I want to make sure we get this set up right.
KALRA: Yes, I think this was a really nice summary. I will just say one thing about the result on toxic shares. Basically, I do find that the total number of toxic posts shared and viewed both go down.
But as a proportion of exposure, the number of toxic posts that are shared actually goes up. In a relative sense, sharing is actually worsening.
RAJAGOPALAN: Before we get into that, maybe this is a good time for you to tell us why you ran this experiment. Then we can get into what these results mean.
Understanding the Drivers of Toxic Speech on the Internet
KALRA: I think there’s a big narrative in India about how the IT cell in India is leading people down these rabbit holes. There were also some stories in the media about how there’s a big infrastructure that leads people into engaging with this kind of content. I always felt like this was a very one-sided story because this was focused entirely on the supply side.
What I wanted to see is, what is going on in the demand side that we think that people are these passive users? Whereas what I find—and what my hunch was going into this—is that people are active users who are going to seek out the content that they want to engage with. That was the starting point for this experiment and project.
The second thing, I think a deeper motivation is also, like you pointed out, the trade-off for the platform itself. I think while the paper does suggest that platforms cannot be expected to self-regulate, it’s also a larger question of whether we can design a system where the values of a platform can ever be aligned with the values of a benevolent social planner because users seek it out, and the platform clearly does benefit, in a certain sense, from creating these filter bubbles. It’s also about, how do we think of policy that’s cross-platform so that we can come up with systems that are safer for everyone to use, including minority communities.
When I first started this project, the initial motivation was just to understand what are the drivers of hate speech in India on social media. I was looking at various changes in incomes, in political power, to see if what we are seeing in the unfortunate situation in India right now—is it a product of backlash? I think that is very hard to study because changes in the real world are much slower than changes on online platforms. So I started thinking about, what are the randomized control trials that I can run on the platform itself? People identify these algorithms as public enemy number one in terms of driving this kind of behavior, so that is where I wanted to intervene and see how true that is.
Definitions of Toxic Content
RAJAGOPALAN: You define toxic content as posts that are hateful, harmful, or likely to make someone feel unwelcome or leave a discussion. This can be against any group. It can be against a majority group, a minority group. It could be across religious, ethnic, gender, caste, linguistic lines, so it’s across the board. You don’t sneak in any particular bias of what you may find toxic. To assess this, you actually use Google Perspective API to look at 20 million posts. Did I get that right?
KALRA: Yes.
RAJAGOPALAN: I found that extraordinary that you were able to—not you personally, but through machine learning—you were actually able to extract and study 20 million posts.
At the baseline level, when you say toxic, you mean quite generally toxic. Like if there’s someone out there who thinks it’s okay to kill all people who own horses, let’s say. That’s very targeted toxic content, though it might not be specifically along caste or linguistic lines. You also, of course, more specifically look at the question of minority rights in India, or rather minority groups in India more than minority rights.
Did I get that part right? I want to make sure our setup is right because there are so many moving parts to this.
KALRA: Absolutely. I think this is a very important point, so I think you’re right. At baseline, I do consider toxicity in general, like it could be sexist posts. Unfortunately, under that definition, even sexist jokes like “wife beating up husband” kind of a thing would also just be included. There’s not much I can do about that because the Perspective API technology is basically the best out there. I can’t write an algorithm that does better than that. What I do is that I focus on political posts that are toxic, and that intersection takes me to specifically anti-minority posts.
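For readers curious about the mechanics, below is a minimal sketch of how a post’s text might be scored with Google’s Perspective API, which is the tool used to classify the roughly 20 million posts. The endpoint and request format follow the public Perspective API documentation; the API key placeholder, the 0.5 cutoff, and the helper names are illustrative assumptions, not details from the paper.

```python
import requests

# Hypothetical illustration of scoring a post's text with Google's Perspective API.
# The endpoint and request schema follow the public Perspective API documentation;
# API_KEY, the 0.5 cutoff, and the surrounding helpers are assumptions for
# illustration, not details taken from the paper.
API_KEY = "YOUR_PERSPECTIVE_API_KEY"
URL = ("https://commentanalyzer.googleapis.com/v1alpha1/"
       f"comments:analyze?key={API_KEY}")

def toxicity_score(text: str, lang: str = "hi") -> float:
    """Return the Perspective TOXICITY probability for a piece of text."""
    payload = {
        "comment": {"text": text},
        "languages": [lang],                    # Hindi-language posts in this study
        "requestedAttributes": {"TOXICITY": {}},
    }
    resp = requests.post(URL, json=payload, timeout=10)
    resp.raise_for_status()
    return resp.json()["attributeScores"]["TOXICITY"]["summaryScore"]["value"]

def is_toxic(text: str, threshold: float = 0.5) -> bool:
    # A post could be flagged as "toxic" above some cutoff; the anti-minority
    # subset would then come from intersecting with a political classifier.
    return toxicity_score(text) >= threshold
```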
Scale of Data and Choice of Language
RAJAGOPALAN: The sample is enormous. You’re talking about a platform that has about 200 million users. You are treating only 20 million in the experiment, and about 4 million are selected overall, so it’s pretty large scale. Was there a reason you chose only the Hindi language users? They’re about a fifth of the total number of users, so it’s not a narrow sliver that you’re picking. It’s a pretty large number. Was there a reason for it? Is it because that’s what the IT cells are supposed to be targeting or was it just for linguistic consistency and to be able to use the API? What is the reason for that?
KALRA: It was mostly linguistic accuracy. I am most proficient in Hindi out of all the more than a dozen languages on this platform. I think it’s a very interesting question just about language. This scholar at Rutgers, Kiran Garimella, has a paper comparing hate speech in different languages. It’s very interesting how it plays out in different contexts and languages. But I thought that given that I’m machine learning a bunch of stuff, I wanted to be confident in what I’m doing, so I chose to stick with Hindi for this project.
Impact of Recommendation Algorithms on User Engagement
RAJAGOPALAN: The way I understand it is, there are different kinds of users on social media. I follow a lot of dog-related posts because I am a dog crazy person, and I have two dogs. Even my dogs have Instagram accounts. I follow a lot of dog content. The way it goes is the more dog content you follow, the more of that kind of content is fed to you, or rather personalized for you. Maybe “fed” makes it seem like there’s a puppet master pulling the strings, but it’s basically personalizing it based on the preferences.
I think I have recently made the mistake of seeing too many memes on the Tauba Tauba song, which means more memes on the Tauba Tauba song will show up on my feed. That’s part of it.
Now, there are going to be all kinds of users. There are some folks like me who are basically really not looking for political content. We’re looking for dog content or funny memes. There are people who are very specifically looking for political content. When you turn off personalization and replace it with randomization, what is the effect on these very diverse groups of people? Because you are looking at a particular kind of post, which is political posts. Is it a homogenous effect? Is it a heterogeneous effect? Maybe you can shed some light on the behavior of users on social media.
KALRA: Yes, that’s a great question. I think for me, for instance, there’s a lot of cat videos that I see. If I were treated, I might just start getting dog videos, and it’s debatable whether I would like that or not. Basically, the big decrease in total viewership or total engagement on the platform is essentially coming from the fact that people are not seeing this particular kind of niche content that they’re used to seeing. The treatment in itself is not targeting toxic content because it’s hard to identify what is toxic, especially for the platform. They don’t want to take a stand on what is toxic and what is not.
The treatment intensity in itself is different for different kinds of users. For instance, someone who is not seeing anything political at baseline is likely to start seeing some political posts. On the other hand, if I was someone who sought out this kind of nasty content on the internet, I’m going to see a lot less of that content. It is the change in treatment intensity for this particular group that I’m going to be mainly exploiting in order to draw my conclusions.
In the theoretical part of the paper, I try to think of an intervention that basically targets these toxic posts but is something that is very hard to implement in the real world, so I do that theoretically.
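To make the treatment contrast concrete, here is a hypothetical sketch of the difference between a personalized feed and the randomized feed described above. The function, the ranking model, and the feed size are placeholders for illustration; they are not the platform’s actual system.

```python
import random

# Hypothetical sketch of the treatment contrast: control users keep the
# personalized ranking, treated users are served posts drawn at random from
# the overall content pool. All names here are illustrative placeholders.
def build_feed(user, post_pool, ranker, treated: bool, k: int = 50):
    if treated:
        # Randomized feed: ignore the user's history entirely.
        return random.sample(post_pool, k)
    # Personalized feed: rank the pool by a learned relevance score.
    return sorted(post_pool, key=lambda p: ranker(user, p), reverse=True)[:k]
```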
RAJAGOPALAN: Basically, one possibility is that someone like me who usually only sees dog posts could end up seeing more toxic posts with the randomization, right?
KALRA: Yes.
RAJAGOPALAN: While there are going to be some people who seek it out, who will see fewer toxic posts, I may also get turned off from the platform because I wasn’t looking for that. So I’m going to see more toxicity, and I want to step away from it. Did I get that right? Are both these effects true? Because when sometimes we look at only the results of the experiments, we’re talking about the average treatment effect, and we don’t look at these very specific things going on.
KALRA: Yes. In some sense, the average treatment effect is a little misleading because the treatment intensity is different depending on who you are. This doesn’t fit into a framework where we can have the assumption that each unit is being treated by a similar treatment or the same exact treatment. You are right that users who weren’t seeing a lot of toxic content at baseline are people who start seeing more toxic content.
Key Findings on Toxic Content Exposure and Sharing
Interestingly, what I find is that these are the people who don’t change their behavior at all. Even though they’re shown more toxic content, I find that they don’t share more or less toxic content than before, than the control group in fact. Interestingly, these are also people who don’t end up leaving the platform either, so it is the users who were seeing more toxic content at baseline who, I find, are the ones driving the aggregate change in total usage. The users who are not really moving are the users who weren’t seeing much toxic content at baseline.
I try to dig further into what is the kind of content that these users are using or seeking out. It is very interesting that a lot of them are seeking out “good morning messages,” which is something my grandmother used to send me, and we taught her how to use WhatsApp. We used to get a big heart or a big rose in a morning message every day. This is a little speculative because I don’t know what other things these people are doing. I do have some survey data, so I can measure whether these people are using other apps, but all of that is self-reported data.
Even in the survey data, I find that users who were seeing more toxic content at baseline are people who actually increased the amount of time they spent on other apps when they were treated. But users who were engaging with “good morning messages” or religious messages, they didn’t actually go anywhere else in the survey data too. It seems that there might be some sort of cross-platform substitution going on for particular types of content, and I think toxic content is one of them.
In terms of elasticities, just the percentage change in the number of toxic posts shared divided by the percentage change in the number of toxic posts viewed—I find that users who are seeing more toxic content at baseline are inelastic, but users who were seeing no toxic content at baseline are even more inelastic.
RAJAGOPALAN: Can you tell us what you mean by inelastic? This is a little less intuitive. I want to separate two things. You’re talking about viewing posts versus sharing posts.
KALRA: Exactly.
RAJAGOPALAN: Now, what you find is the people who are treated are spending less time on the platform. They’re viewing fewer posts, but they may also be sharing fewer posts, or they may be sharing more posts given the proportion of what they’re viewing. Would elasticity of toxic sharing be defined as the ratio of the percentage change in toxic posts shared to the percentage change in toxic posts viewed?
KALRA: That’s right, yes.
RAJAGOPALAN: Did I get that right?
KALRA: Yes.
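To pin down the term, the elasticity being discussed can be written out as below. This is an editorial restatement of the definition given in the conversation, not notation taken from the paper.

```latex
% Editorial restatement of the elasticity of toxic sharing defined above.
\[
  \varepsilon_{\text{toxic}}
    = \frac{\%\,\Delta(\text{toxic posts shared})}
           {\%\,\Delta(\text{toxic posts viewed})}
\]
% "Inelastic" means |\varepsilon| < 1: toxic shares fall proportionally less than
% toxic views fall, so the share rate per toxic post viewed rises under treatment.
```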
RAJAGOPALAN: You have a great example in there. Maybe you should talk about that.
KALRA: Yes. The illustration on user behavior. I think it’s basically got to do with the fact that when I’m treated, I do see [fewer] toxic posts. You’re right that the distinction between viewing and sharing is that I’m thinking of viewing as being more passive, that is the stuff that the algorithm decides, and the user doesn’t decide. But the sharing behavior part is something that users actively choose to do, so I’m basically analyzing behavior by looking at sharing data.
Back to the illustration on user behavior. When a user is treated, they see a lower number of total posts just because they might disengage from the platform because they’re now seeing cat videos when they wanted to see dogs, which is, again, completely fair.
RAJAGOPALAN: I like both. I’m willing to watch both, but I know there are dog people and cat people, but sure.
KALRA: I personally prefer the ones with both dogs and cats getting along. I think those are really the best kind of videos on the internet.
There are people who see less of everything, but when people are treated, the average user is also likely to see less toxic content. Now, as a result of the treatment, they may lower the total number of toxic posts that they share just because they’re seeing less, but because of the disengagement effect, they will share even less content of all other varieties. That effectively means that as a proportion of toxic content that I have been served, I’m going to end up sharing more of it just because I have a proclivity towards that kind of content.
One way of maybe thinking about this is that maybe if I were treated, I’m scrolling down. I’m not seeing dogs and cats, but then I see a toxic post, and when I see it, I share it. That increases the probability with which I’m going to share a toxic post when I see it.
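A small numerical illustration, with made-up numbers rather than the paper’s estimates, shows how total toxic shares can fall under randomization even while the share rate per toxic post viewed rises and the implied elasticity is below one.

```python
# Made-up numbers for illustration only; not estimates from the paper.
control = {"toxic_viewed": 100, "toxic_shared": 10}   # share rate = 0.10
treated = {"toxic_viewed": 60, "toxic_shared": 8}     # share rate ~ 0.13

for name, d in [("control", control), ("treated", treated)]:
    rate = d["toxic_shared"] / d["toxic_viewed"]
    print(f"{name}: viewed={d['toxic_viewed']}, shared={d['toxic_shared']}, rate={rate:.2f}")

# Elasticity of toxic sharing with respect to toxic viewing.
pct_shared = (treated["toxic_shared"] - control["toxic_shared"]) / control["toxic_shared"]
pct_viewed = (treated["toxic_viewed"] - control["toxic_viewed"]) / control["toxic_viewed"]
print(f"elasticity = {pct_shared / pct_viewed:.2f}")  # 0.50: shares fall less than views

# Total toxic shares drop (10 -> 8), yet the probability of sharing a toxic post
# conditional on seeing one rises (0.10 -> 0.13), the pattern described above.
```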
Interpreting How Personalization Shapes Engagement in Toxic Social Media Content
RAJAGOPALAN: This is what fascinates me. How much is this about preferences versus salience of something on a particular platform? I have a preference for dogs versus cats or other kinds of animals, though I love all these animals. I even follow elephant accounts. I’m a little bit of an animal nut. Let’s say I have a preference for one. When it’s personalized, then, obviously, that’s what keeps getting fed to me.
Now, when it is not personalized, let’s say I’m getting dog videos, cat videos, elephant videos, horse videos, and then a bunch of dogs and cats getting along, et cetera, et cetera. Now suddenly, because I had this preference for dog videos, that’s what I end up sharing more with other people. Now, is that happening because I had a preference for it, or is it just standing out more? Because now, when I see it compared to all these other animal videos, I’m like, “Oh, that dog video is really cool,” and that deep inner preference becomes even more salient given the comparison with other things I may not be that interested in.
KALRA: That is an excellent question, and I think there’s no easy answer to that question, which is why I ended up in the territory of behavioral modeling, unfortunately. I think the way that I think about your comments is that, one, there’s preferences which we are trying to measure with how much toxic content people are viewing at baseline, assuming that the algorithm actually learns preferences, and it’s not garbage, which is probably true because people are leaving when the algorithm is turned off.
The second thing about salience is, I would think of it in terms of decreasing returns to exposure or to sharing compared to exposure. When I see a lot of it on the margin, I share one more post. When I don’t see a lot of them, which is what is happening with this particular group of users, I get really excited and the return to sharing that one thing that I just saw is really, really high.
What I do in the behavioral model is I try to figure out which one of these effects is more dominant because, in economics, the answer to any trade-off is that both of these things are important. Maybe we can see a little bit about how much each of these things matters.
In fact, even there, I find that the elasticity or the change in sharing based on baseline views is very high. It does turn out, in my view, that a lot of it is just being determined by people’s priors. Even when there is a change in exposure, that does move things a little bit. People aren’t entirely mechanical, but that change is actually very small, and it’s mostly their baseline preferences that are driving this.
How Recognizing the Agency and Sophistication of Users Shapes Interpretive Models
RAJAGOPALAN: Humor me on this other possibility. This is really coming from maybe someone who’s not like you who really understands the backend of these social media platforms. It’s a less sophisticated user who thinks social media is just like television or news or newspapers or something like that because, you know, there’s a very large proportion of the population who do believe that. Now, there’s one other theory I have for why they are sharing more when it’s randomized, and they see fewer posts that are toxic.
Now when there’s a lot of toxic posts, the assumption might be, “Oh, everyone’s seeing this.” Because I’m getting eight out of 10 posts that are toxic, everyone must be seeing eight out of 10 posts that are toxic. Now I don’t need to share it with these other people, but now when it gets randomized, I’m seeing very little of it. Now it’s come down to 10 or 15 percent, maybe 20 percent, so maybe others missed it. Many people genuinely think Instagram and Facebook and YouTube Shorts are like television: there’s programming going on, as opposed to something that is highly personalized.
I thought maybe that might be another reason why people share posts. This is not exactly salience. It’s almost like if you think everyone else is seeing the same stuff you are seeing, then everyone’s drinking out of the fire hose, so there’s no point sharing it. It’s a little socially awkward to share 20 posts a day with other people, but if there’s only one or two, you’re like, “Oh, maybe they missed it.”
KALRA: I think from a theoretical perspective, I am in fact assuming that users are sophisticated in the sense that I am assuming that they think that other people in their social group are also seeing similar things. The distinguishing feature of this platform is that people share. When I say share, they go off the platform.
RAJAGOPALAN: They share on WhatsApp.
KALRA: Yes. WhatsApp, sometimes even Facebook or Instagram, but mostly on WhatsApp. I think it’s in that sense, if it were true that people were thinking that oh, everyone’s seeing the same thing, then there would have been no shares to begin with.
We do see that there is some sharing that is happening. That means that people think very differently about these two things. One is sharing off the platform and the other is viewing, which, at least, we assume that when they’re viewing something, they think that everyone else is also viewing the same thing.
RAJAGOPALAN: How do you really think about the results that you find? I’m very interested in your interpretation of it.
KALRA: I think, first of all, it’s very depressing to have this finding. I think in my PhD journey, I’ve only come up with results that are very depressing and unexpected in that sense if we talk about the other paper also.
But I also think it’s important to realize that demand is very important, and that people aren’t just passively consuming content; they’re active agents. These are people who are probably going out and voting for particular candidates. I think that a lot of these narratives, especially among social scientists, can be a little bit disrespectful towards people’s agency. “It’s just the IT cell. We just need to have these supply-side interventions, and it will all be okay.”
RAJAGOPALAN: I want to spend a minute on this because the way we normally view this marketplace of ideas is: There are these passive supply side bots or IT cells, and people are paid a few thousand rupees to put out all this content and then make it go viral. On the demand side also, they’re basically sheep or demand-side bots. That’s the assumption, right?
KALRA: Yes.
RAJAGOPALAN: The sophisticated people in this model with agency are either the people who are giving content to the IT cells and driving it or the advertisers, which is odd. These are the groups of people who stand to benefit, but we never truly think about the preferences of the supply side or the demand side.
You’re absolutely right that there are people who go looking for certain things. I go looking for dog videos. There are people who go looking for music or humorous content or devotional content, and there is a lot of political content. It should not shock us that people are going looking for it.
KALRA: Absolutely. I think similarly there’s work in the US by Matt Gentzkow and Jesse Shapiro. I really love this paper about what drives media slant. Their conclusions are similar, that it’s really the demand because even in the US, assuming that most of us are liberal-leaning, a lot of people think that it’s Fox News that is driving people crazy. It’s just Fox News—they watch it, and if we were able to ban Fox News, everything will be okay, and Kamala Harris is going to become president.
RAJAGOPALAN: People on the other side think the same thing. They think that MSNBC is the devil.
KALRA: Yes, it’s driving people nuts.
RAJAGOPALAN: And it’s driving people towards a particular kind of candidate. I think on both sides of the assumption it’s the same. I don’t think one side is assuming worse than the other.
KALRA: Exactly. It’s belittling the intelligence of who you don’t agree with, which is bizarre, but also common, especially among liberal social scientists. I think that is something that I really want to put across that these are people who are demanding this thing. That actually explains a bunch of other things that we see in the real world also like voting patterns, et cetera.
The Challenges of Platform Regulation
I think the other big thing that the results have helped me think about is regulation, in the sense that it’s clear that the regulatory institutions in India are basically non-existent at this point. There are the IT Rules, which basically try to infringe upon people’s free speech rights more than anything because they give the government a lot more control over what people can say and do. On the other side, if someone were to say something in support of the farmers’ movement, that is the kind of stuff that could still be censored. On the contrary, if you were to say something nasty about a particular group of people, it could be Muslims or it could be Christians.
RAJAGOPALAN: Or Sikhs in the case of farmers’ protests.
KALRA: Or Sikhs, yes, in the case of farmers’ protest, there is nothing. Unless someone in the post says, “Let’s go kill these people,” then maybe the platform is going to make an effort to ban that particular comment. Otherwise, there’s just no incentive for a particular platform to stop this kind of hatred on the web.
RAJAGOPALAN: While you are right, that part is a little bit depressing. I think the optimistic part is, even if they manage to regulate this in a major way, I don’t think we can solve it, because this is more about deep-seated preferences that people hold in general, not necessarily just on a social media platform. A particular platform may make it slightly easier to indulge particular preferences or make it easier to share particular things. But this seems to be a broader cultural issue in terms of how we speak with each other and engage with each other. That seems to be the bigger problem.
I actually find this quite optimistic because the standard narrative is all these social media platforms are just out there to make money, and they’re not stopping this and I’m like, “Actually, I’m not sure this can be stopped based on what you find.”
The Challenges of Creating Interventions to Address Toxic Content
KALRA: Exactly. I think in that sense, the thing about thinking about correct interventions in this case, it is really like, how do we intervene on attitudes and make people more humane, I guess. Those are, I think, really hard questions.
I think there’s been some work on contact hypotheses and so on. It’s not clear to me what that would do, either because in a population as dense as India, again, maybe citing myself and going to the other paper on residential segregation, what does segregation really mean in India? Is it that people don’t have contact with their neighbors? They clearly live in very close quarters with their neighbors who might be from a different religion or caste, but they fear them in ways that we don’t really understand. I think there’s a huge scope for thinking about even going beyond theories of contact and so on.
RAJAGOPALAN: Yes, a broader civic engagement project, right?
KALRA: Yes, for sure.
RAJAGOPALAN: That I think is the bigger question: What used to be civic engagement and civil society, and what has broken down on the platforms and outside? I know that sounds depressing, but it almost makes me feel better in some sense.
KALRA: I think the reason that I say it’s depressing is because I have a sense that it was always this way. People just didn’t have an outlet to express themselves.
Social Media as Normalizing Toxic Speech
RAJAGOPALAN: Or maybe they did. It just wasn’t public. They would say this among their very close friends. They would say this in their card games or close satsang group or if they have a girl gang that they hang out with, but they’re not really expressing this beyond that.
KALRA: Exactly. I think that is the biggest criticism of social media, which I think is fair, that social media seems to make it okay to say certain things that weren’t okay to say before. This goes back to my assumption on users being sophisticated enough to think that everyone in their social group is getting the same kind of content. If other people in my social group are seeing similar content when I see it on my app, then I think that it’s okay to say this particular thing. That expands the range of socially acceptable ideas in public discourse, which political scientists, I think, call the Overton window.
In that sense, I think social media does contribute to making these things more public. It’s hard to say how deep-seated or for how long this has been going on, how long have people been talking like this about minorities behind closed doors, and now it’s just okay to say this on the family WhatsApp group because everyone else is also saying the same thing.
RAJAGOPALAN: That normalization, I definitely agree, is one depressing part of social media. There, I think, the depressing news is what you find about how social media platforms lose engagement: people just leave if platforms randomize and reduce this personalized echo chamber, which keeps feeding a particular kind of content. They actually lose the users.
There’s always going to be some other social media platform that is willing to indulge that particular preference, and then people will end up there. Then it’s just a question of how we better moderate whatever the relevant public square is, the way we might think of all the really big platforms.
KALRA: Right. It has to be across the board in that case. The current regulatory framework just does not address it. It’s not interested in addressing that.
The Route of the Ram Rath Yatra as a Lens on Segregation
RAJAGOPALAN: Yes, this is definitely a big one. Now, if you have the time, I’d really like to talk about your other paper because I love the research design of that paper. We talked about this before. In this paper, you are basically looking at the Ram Rath Yatra in 1990 that was led by L. K. Advani and trying to understand some of the consequences of ethnic violence and segregation in India, but using that as one of the instruments. Right?
If I had to give away the reductive conclusion, you’re basically looking at segregation following ethnic violence by studying the route of the Ram Rath Yatra. You find that areas closer to the Yatra route are more segregated. This has consequences on outcomes for Muslim households and their education and their income levels and so on.
What did you find? How did you go about finding it? Why did you look at this particular question?
KALRA: The motivation for this project was to understand the effects of residential segregation on education outcomes to address questions of intergroup inequality. There, I was definitely motivated by the whole research agenda of Raj Chetty, John Friedman, and Opportunity Insights where they see that in the US, residential segregation is out there for everyone to see. I live in Providence, and the east side of Providence, which is where Brown is, is completely different from the west side, which is where everyone speaks Spanish.
RAJAGOPALAN: It’s like two different cities, actually. It’s a little alarming.
KALRA: It should be called two different cities because that’s what they are. It’s not surprising that given that the public school system is so linked to where you live that they find all these negative effects. I was motivated by this and wanted to see if something like this is also true in India. Of course, like every starting PhD student, I went out looking for instruments because residential segregation is obviously going to be correlated with school choice and education levels, and where people want to live may be correlated with what schools are there in a neighborhood.
I thought that one way to identify this would be to look at communal violence and find exogenous variation in communal violence. That in itself pushed the paper in a very different direction where it became less about residential segregation, per se, and more about the effects of communal violence.
It seems that residential segregation is in the mix, but it is not very clear whether residential segregation was always there or whether it happened after the Yatra, because we just don’t have that kind of data in India.
RAJAGOPALAN: The underlying mechanism for that kind of segregation is that when there is ethnic conflict or ethnic violence, those groups that were either unsegregated and living in a mixed way, or were sort of semi-segregated with their own pockets or hamlets, suddenly segregate a little bit more. It’s a way of protecting territory and people of a particular ethnic group because that’s much easier than when people are in mixed living. It’s just easier to police a single apartment building that might be made up of a particular ethnic group.
We’ve seen this in Bombay when there was religious violence there following the Yatra and the bombings and things like that. Is that the underlying mechanism? Did I get that right?
KALRA: Yes. I think so. I’m not entirely sure if that is the only mechanism because there are just so many things happening there. Now, we know from ethnographic work, and Rohini Pande also had a paper where they looked at where riots are more likely to happen, and they found that riots were more likely to happen in more integrated neighborhoods. One of their conclusions is that even though they don’t have definitive evidence, the riots probably led to more segregation of these integrated neighborhoods.
Then there’s ethnographic work outside of Ahmedabad where there is a large concentration of displaced Muslim families. In this ethnographic work, especially by Christophe Jaffrelot, there is a lot of talk about how elders of the community thought about rebuilding.
I think the last time we discussed the mobile human capital hypothesis among Jewish communities. It could be that instead of investing in physical capital for their children, the community elders thought that it was better to invest in human capital for their children, so that it is something they carry with them and something that would help them assimilate into the mainstream more easily.
One question I get a lot is, are these madrasas then? They’re not. These are English medium schools. I’m trying to be a little careful here in terms of exact mechanisms.
RAJAGOPALAN: No. I appreciate that because what you are saying is something beyond just safety and territory. You’re talking about [how] there is a collective action problem across the board on different things, whether it’s a particular kind of public good, whether it is safety and policing, but it could also be the kind of school you wish your kids to go to and whether your children feel welcome or marginalized in a particular school.
There are all these collective action problems, and the underlying mechanism might be the broader question of being able to solve these collective action problems better if they are closer together, more dense, more segregated, because there are economies that come from that. That could be the simple non-ethnic-driven preference. That could just happen to people of different income groups. It could happen to people who speak different languages. In this case, it’s different religions. There’s of course definitely some effect of violence but it may not just be the violence. Is that a better way of putting it?
KALRA: Yes, I think so. I think just because this bigger project, which eventually became my job market paper, came along, there’s one thing that I couldn’t do during my PhD, which is more fieldwork. I’m hoping, if and when someone gives me a job, that while working I’m able to finish the job market paper and also go back to this question, because there’s very little data on education choices in urban India. There’s a big literature on school choice in rural India, but there’s hardly anything we know about the public and private school mix in India. That is definitely something that I would encourage other people to look at, and I would also want to jump in at some point.
RAJAGOPALAN: One thing I love about the paper is the research design of using the Rath Yatra. When I was reading the paper, I was fascinated by this, and now the memories are coming back of reading newspapers as a child when all this was going on.
There’s this question of endogeneity. Did the Yatra go to places which were already ripe for conflict, like a tinderbox that you can trigger quite easily, or did the Rath Yatra actually cause this in some way? To what extent does the research design solve for that? Because you look at both the planned route and the actual route that was taken. What’s the difference between the two, and how do we solve for this?
KALRA: The research design addresses this reverse causality between riots and the Yatra by looking at, as you said, the planned and the actual route, because it’s totally fair to conceive that the Yatra actually went to places that were more vulnerable, that were more on the edge or even more segregated to begin with. What happens when L. K. Advani is exiting Bihar is that the chief minister of Bihar at that time, Lalu Prasad Yadav, arrested L. K. Advani and ended the Yatra.
The exact timing of where the Yatra was stopped gives us some exogenous variation in exactly where the actual route starts diverging from the planned route. In the paper, I also show that I do not find the same patterns replicated with respect to the planned, or placebo, route.
RAJAGOPALAN: As where he actually went. These are places that he had intended to go to.
KALRA: Yes.
RAJAGOPALAN: Which means if this was driven by selecting certain places for certain characteristics, then they should also have this kind of segregation. They should also have particular kinds of educational segregation, female segregation, and the outcomes related to that. Now, those places were never affected, and they don’t have it.
KALRA: I am able to show a first stage that there was basically no correlation between the placebo route and the riots that actually happened.
There is a very high degree of correlation between the actual route and places where there were riots in the lead-up to the demolition of the Babri Masjid. It sounds so weird because now it’s the Ram Mandir.
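For readers who want to see the shape of that first-stage and placebo comparison, here is a minimal sketch under assumed data. The variable names (riots_1990s, near_actual_route, near_planned_route, baseline_controls) and the district-level file are placeholders, not the paper’s actual dataset or specification.

```python
import pandas as pd
import statsmodels.formula.api as smf

# Hypothetical sketch of the first-stage / placebo comparison described above.
# The data frame and every column name are placeholders for illustration.
df = pd.read_csv("districts.csv")   # one row per district, hypothetical file

# First stage: riots should load on proximity to the route Advani actually travelled...
actual = smf.ols("riots_1990s ~ near_actual_route + baseline_controls", data=df).fit()

# ...but not on proximity to the planned segment he never reached after his arrest.
placebo = smf.ols("riots_1990s ~ near_planned_route + baseline_controls", data=df).fit()

print(actual.summary())
print(placebo.summary())
```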
RAJAGOPALAN: Yes. We’re old enough to remember some of this crazy history. But you know this is a weird thing? There’s a weird way in which these things keep coming back. Like now it is the Ram Mandir. The physical structure might have disappeared, but there’s a lot of latent emotional trauma associated on both sides with what happened in this Yatra and the riots that followed and the demolition and so on. It’s really fascinating, this particular project. Thank you so much for doing this, Aarushi. This was such a pleasure. Looking forward to more of your work, especially the fieldwork you were talking about, and good luck with everything on the job market.
KALRA: Thank you so much, Shruti.