Deepti Sharma on Survey Methods and the Hidden Biases in Economic Data

Sharma and Rajagopalan examine why household surveys often get work patterns wrong

SHRUTI RAJAGOPALAN: Welcome to Ideas of India, a podcast where we examine academic ideas that can propel India forward. My name is Shruti Rajagopalan and this is the 2024 job market series where I speak with young scholars entering the academic job market about their latest research on India. 

I spoke with Deepti Sharma, who's an Assistant Professor at Ahmedabad University. She completed her PhD in public policy from the Indian Institute of Management in Bangalore and was a postdoctoral fellow at the Center for Management of Health Services at the Indian Institute of Management in Ahmedabad. Her current research focuses on empirical methods, applied microeconomics, public health and gender studies. We discussed her job market paper, Does it matter who you ask for Time Use Data? We talked about the systematic bias in proxy reporting compared to self-reporting in time-use surveys, some techniques used to fix those biases, the gendered nature of these biases, the policy implications of using these time-use surveys and much more. 

For a full transcript of this conversation, including helpful links of all the references mentioned, click the link in the show notes or visit mercatus.org/podcasts.

Hi, Deepti. Welcome to the show. It’s such a pleasure to have you here.

DEEPTI SHARMA: Thank you so much, Shruti. Nice to meet you.

Proxy-Reporting Versus Self-Reporting in Time-Use Surveys

RAJAGOPALAN: I looked at your job market paper. This is actually super interesting because this is not the sort of thing I come across normally. You look at self-reporting versus proxy-reporting, specifically for time-use data. For those who are not familiar with these survey methods, a proxy respondent is someone who provides answers to survey questions on behalf of someone else in the household, or on the factory floor, or something like that, when the individual respondent is not able to respond themselves.

You find that proxy respondents tend to overestimate the time spent on employment activities and underestimate the time devoted to household production: unpaid domestic work, care activities, and things like that. 

The reason I got so interested in this is this has such important implications for how we do any empirical social science and evidence-based policy because all of us are using these data sets. That there’s this big issue on self- versus proxy-reporting, especially time-use surveys, which are becoming more and more common—this seems to be this big issue I never thought of before. Can you tell us a little bit more about what got you started on this?

SHARMA: This paper, this entire data set, I got hold of this in 2019 when I started my PhD. It’s new data in an Indian context. An earlier version of this data was collected in 1999 as a pilot; it was not representative, and only a few states were included. Now, in 2019, when they released the data and I was looking at it, there were questions on who reported this time use. Is it the self or is it other members? This kind of a variable looked really new for such analysis. This got me searching the literature on self versus proxy reporting, and on whether this kind of question is generally asked in other surveys or not. 

I found that except for health, and except for other experimental data, which researchers like you and me have collected, there are no such questions in observational settings. I found that what we’ve been doing all this while is relying on whatever is given to us without thinking about the source of the data. We just know that some knowledgeable person in the household, probably the head of the household, who is a male, who’s above 60 years old, is responding on behalf of other members, which to my mind is absolutely different from my own response. Because there are so many things which are unobserved, which I can hide strategically or unstrategically, or it’s simply random chance whether he happens to know something about me or not, and similarly for other people in the family.

This particular idea started this question [of] when we are looking at the source of this data collection within the family, what should we do in terms of best practices in terms of the data collection? Then I started to look at and dig more into the data collection methodology literature, which is a very niche literature. I found really good papers, but they were experimental in their design. I thought that it would be a good contribution in terms of coming up with an observational study which has no experimental setup. Enumerators who were trained to collect this kind of data went with the objective that we will collect time-use data like any other NSSO, NFHS kind of a survey.

This is where we started. A lot of this research has benefited from experimental studies, but it’s quite different from experimental studies. I’ll walk you through how and what mechanisms we tested, how we actually identified a lot of things along the lines of this proxy identity, how it changes in different settings, and how culture affects [it], and what are peculiar things about India that normally affect this kind of a response. This is how it started.

Gender Bias and Systematic Bias in Proxy-Reported Data

RAJAGOPALAN: I have so many questions about the paper, but let me just start with one of your main findings, which is surprising when I’m thinking about a large-sample time-use survey dataset. You find that the gap between self- and proxy-reporting is systematically biased, not random. What we mean by that is, normally in these things, even when there are errors, if there’s a difference between you reporting on your own time use versus your grandfather reporting on your time use, when you look at this across lots of families, the expectation is these errors cancel each other out.

Maybe in your case, you will underreport on some margins and your grandfather will overreport because he’s very proud of you. There might be another family where it’s exactly the opposite, so all this gets canceled out, but what you actually find is that that is not true. There is a systematic bias gap in one direction between self and proxy reporting. 

One, what is that bias? What is the magnitude of the bias, and how do you explain what’s going on? Because this is a little bit crazy, and it calls into question the entire data set, fundamentally.

SHARMA: Correct. It’s 40 percent in our data. Proxy reports are approximately 40 percent of responses for the working-age group, which is the 15-to-64 age group. Otherwise, in the overall sample, it’s around 36 percent. 

This bias is systematic in the sense that it is correlated with age and sex (sex especially, which we were able to explore a lot in our analysis), then education, then occupation, and the relationship between the respondent and the proxy, like husband and wife, a non-couple sample, et cetera.

This systematic bias that we found plays a crucial role in estimation. In a couple sample, information is more likely to be shared than in a non-couple sample. In that sense, there are differences between self- and proxy-reported time use, but they are smaller than in a non-couple sample. In a household where people are in the same occupation, their reported time use is much closer between self and proxy, compared to people who are in different occupations.

There are several ways we can reconcile these, but at large, there are time-use reporting differences at the aggregate sample level, and these are driven a lot by gender. Why we focus so much on gender in this paper is that, when I mapped the proxy identity, I was able to identify who is responding, in terms of whether it’s a female or a male, more clearly than any other characteristic like education, because in a household where six members are residing, multiple members can become a proxy.

How Cultural Norms and Gender Perceptions Shape Reporting

RAJAGOPALAN: Here I found one example really, really telling, and I’m quoting from your paper. You say one aspect contributing to the asymmetric measurement of error is differing perceptions of time spent on activities that are typically performed by men versus women. “For example, women might consider time spent taking care of livestock as a work-related activity. In contrast, men may classify it as household work, given the proximity of livestock to the residential compound.” That’s one issue.

Now there’s a second thing that’s going on. You say “an activity in which women simultaneously care for children and watch television might be classified differently by self and the proxy reporter. A woman might classify it mostly as childcare, whereas a proxy might classify it as leisure.” This is incredible to me that even within the household, even things that are so deeply observable. It’s not like they’re all living far apart, in different parts of the same farm holding or compound. This is literally people in the same house in the same room, but what they view as taking place in front of them is completely different.

SHARMA: It’s a perception. Time spent on different activities is notional in this sense. It’s all based on perception. I can consider myself to be spending 12 hours in leisure and other activities, but my grandfather may think that I’m taking care of him. In his mind, I’m doing some unpaid care work in the household, whereas in my mind I’m just spending my leisure time with my grandfather.

Time is a very notional concept in terms of how it is spent, and since it is so notional even in my own understanding of myself, let alone in what a proxy would think, these differences in how activity-wise time use is reported between self and proxy are driven by a lot of cultural issues and norms. In this case, I’m talking about religion, caste, and other rural norms, which shape the beliefs of the individual and get translated into their reported answers.

RAJAGOPALAN: This is part of what’s going on with the gender biases. You find systematic biases across different indicators, and gender seems to be a particularly strong one. Is that what is driving all of the systematic bias? That is, the gender differences are just so big, and, typically, women are not going to be the ones who open the door and speak with a stranger who is surveying the household. Is that what is driving the largest part of the bias or almost the entire bias?

The second part of the question is, is that why it’s systematic? Because this is cultural. This is not about the idiosyncratic things going on within a particular household like the grandfather loses track of time or doesn’t know what’s happening in the next room or something like that. This is just across the board. People just think of certain things that women do as, “Oh, this has to be done by them and not really as work.”

SHARMA: It’s a combination of both the things, both points that you mention, that women will not just open up to an enumerator who is trying to capture what every member did in the last 24 hours, and also, it’s a cultural story. 

One important point that we found out is that most women report as a proxy in this entire cycle. Because they are the ones who are at home, they are more likely to be available, and they are more likely to know what other people are doing in the household.

The source of bias is also driven a lot by gender. When I say that female proxy reporters deviate much more when they report time spent in employment and unpaid domestic work, it is a combination of cultural factors and their own perception, and also the fact that they are not able to observe everything. They don’t have market knowledge about who is going out, actually doing employment, and spending time in different activities. Cultural-norms-related issues also enter the reporting, depending on whether societies are regressive or progressive.

Even if a household has an egalitarian setup, with a division of labor based on bargaining power and mutual understanding, if they live in a regressive society, men will always report according to social desirability. They would never want any enumerator, who is just an outsider, to know that, “Yes, we share an equal load of the work.” They would always report, “Yes, I work in an employment-related activity. She doesn’t. I do not contribute to unpaid domestic work and care work, while she does all those things.” When a male is reporting her time use, all of it gets affected by this. Women, likewise, would try to embed their answers in these cultural beliefs. It has a cultural, religious, caste-based story to it.

RAJAGOPALAN: You know the reverse is also true, right? For instance, if you happen to ask the matriarch who is proxy reporting for others, she is not going to admit that her son is unemployed. In some sense, there’s no unemployment in rural India because they’ll say, “Oh, they do something on the family farm. They work on something.” What is really unemployment gets reported as underemployment, not because they aren’t unemployed or underemployed, but because it’s not polite to say this in public, and certainly not about the male member of the household or the oldest son or something like that. Do you also see it flipped, which is not just the men reporting on the women, but the women reporting on the men is also systematically biased in this sense?

SHARMA: It is, and they tend to overreport time spent in employment activities. This story holds up when a female proxy is reporting a male’s time spent in employment activity. She overreports by more than an hour in some situations.

Challenges in Collecting Accurate Time-Use Data

RAJAGOPALAN: So you’ve identified this big problem with proxy reporting, and we understand now why. Are the biases systematic when people self-report in the same context? Or are there no biases, just random errors canceling each other out? Or are there biases, but smaller in magnitude or different in direction, and so on?

SHARMA: We do not have a direct way to check this because only one individual’s response is captured. If I’m self-reporting, there is no proxy report for me, so I can’t compare within-subject differences.

RAJAGOPALAN: Got it.

SHARMA: Again, going back to the experimental studies where they have looked at self-reports to be the gold standard, we have also considered self-reports to be more accurate because we are talking about time, and time spent is more likely to be known better by self than any other person. 

RAJAGOPALAN: You know, yes and no. I understand why self-reporting is definitely better than proxy-reporting. You pointed out that there are two separate issues. One is that this is all a perception question: how someone perceives their own time use versus how they perceive someone else’s. And that can never be perfect unless you literally put some kind of motion-detection device on people and actually measure where someone is going and how much time they spend on the farm and factory floor and how hard they work. It’s very difficult to parse this out.

On the other hand, you also mentioned that some of this is cultural, which is, how do they present what’s happening within the household to someone outside the household? I imagine that the former problem gets fixed when it comes to self-reporting, but the latter problem is still going to exist, right? The example you gave a few minutes ago is, there’s a male member of the family, and when an outside surveyor asks them, even though they know what they are reporting for themselves, they’re going to adjust the amount of time spent in employment versus unpaid domestic work because of whatever they think the surveyor or the evaluator wants to hear.

SHARMA: They do that. There is social-desirability bias even in the self-reports, but when we compare two contaminated responses, considering one as a benchmark, the contamination is higher in the proxy responses. So I agree.

RAJAGOPALAN: Basically, it’s systematic bias, but it’s just smaller in magnitude, perhaps easier to adjust for, and so on.

SHARMA: Right.

Methodological Approaches to Working with Proxy-Reported Data

RAJAGOPALAN: Okay. Now this is the part which I can’t say I understand very well. Can you explain the method of how you do this for those who are not familiar with propensity-score matching, inverse-probability weighting, and related techniques?

SHARMA: Sure. I will start this explanation by telling the readers of the paper, and those who are going to listen to this podcast, what this study is not. This is not an experimental study. It’s a nonexperimental study, and the objective was to understand how individuals divide their 24 hours into paid and unpaid activities. Given this objective, the survey’s enumerators had no field design or sampling plan for choosing randomly between self and proxy respondents. It just so happened that, in the data, some of the intended respondents were not present, and the enumerators had to collect proxy responses in order to avoid missing data.

This is like a quasi-experimental study, where proxies happen to be part of the data set in order to complete responses and avoid missingness. We do not see any flipping of a coin to select between self and proxy groups. In this case, there is a huge endogeneity concern when it comes to respondent selection, and it is embedded in the dataset itself. So to mitigate this concern, what can we do? We can’t go back to the enumerators and ask them to collect two responses for every individual, one from the self and the other from a proxy. That’s not possible because this is a nonexperimental setting.

What we can do is design this as if it were an experimental setup, by virtually matching all the individuals for whom we have self-reported data, including their time use, whatever time use they’re reporting, and their individual characteristics like age, sex, education, occupation, and other things, with the proxy-reported individuals. Basically, we are matching individuals based on their individual and household characteristics as if they are reporting in two different modes: reporting for themselves and being reported on by others.

What we estimate with this matching is how proxy-reported members would have reported for themselves. With everything in place, we applied these matching techniques, which are simply about creating a common support area over which two individuals can be matched and their responses compared. Whatever deviation is left is attributed to systematic bias, which we then break down by gender, occupation, and so on.
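For readers who want a concrete picture of the matching step Sharma describes, here is a minimal sketch of one-to-one nearest-neighbor matching. The dataset, covariates, and numbers are illustrative toy values, not from the paper:

```python
# Minimal sketch of nearest-neighbor matching on individual characteristics.
# All data below are hypothetical; the real study matches on many more
# covariates (age, sex, education, occupation, household traits).

# Each person: a covariate vector (age, years of education) plus the
# reported daily employment hours.
self_reports = [((25, 12), 6.0), ((40, 8), 5.5), ((60, 4), 3.0)]
proxy_reports = [((26, 12), 7.5), ((38, 9), 7.0), ((58, 5), 4.0)]

def distance(x, y):
    """Squared Euclidean distance between two covariate vectors."""
    return sum((a - b) ** 2 for a, b in zip(x, y))

# For each proxy-reported person, find the most similar self-reporting
# person and record the gap in reported hours. The average gap over the
# matched pairs is the estimated proxy-vs-self reporting difference.
gaps = []
for cov_p, hours_p in proxy_reports:
    cov_s, hours_s = min(self_reports, key=lambda r: distance(cov_p, r[0]))
    gaps.append(hours_p - hours_s)

avg_gap = sum(gaps) / len(gaps)
print(round(avg_gap, 2))
```

In practice one would match on a propensity score estimated from many covariates rather than on raw covariates, but the logic is the same: compare each proxy report against the most similar self report and attribute the remaining gap to reporting mode.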

RAJAGOPALAN: Okay. How do you compare these different matching methods? We are all using these surveys as given, right? A lot of us are not in a position to conduct these surveys. Also, these are done on a fairly large scale. For those who are looking to de-bias these data, what do you find to be the best of the different techniques and methods that are available?

SHARMA: Right. The literature on matching methods is also evolving. We see thousands of papers on difference-in-differences; this literature is evolving in the same way, and there are quite robust techniques available in Stata packages that can do the job fast. Across different matching methods, like IPW and augmented IPW, which take care of a lot of standard-error issues, coarsened exact matching is the newest and most robust. We compare across all the matching techniques, starting from propensity-score matching, which is old but still effective, up to coarsened exact matching, where a lot of robustness checks are taken care of.

Across the board, if we are getting the same answer, then the story goes through, and this is what we found in our paper as well: given different methodologies and different matching techniques (one is nearest neighbor, another is inverse-probability weighting, another is coarsened exact matching), we found qualitatively the same results. These were a few robustness checks across different matching methods that we did.
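As a companion to the matching sketch, here is a minimal illustration of the inverse-probability-weighting idea Sharma mentions, again on a hypothetical toy dataset with a single covariate (sex); none of the numbers come from the paper:

```python
# Minimal sketch of inverse-probability weighting (IPW). The idea: weight
# each report by the inverse probability of its observed reporting mode
# (self vs. proxy), so the two groups are balanced on covariates before
# their mean reported hours are compared. Toy data only.
from collections import defaultdict

# Each record: (sex, proxy, hours). proxy=1 means the time use was
# proxy-reported; hours is reported daily employment time.
records = [
    ("F", 0, 2.0), ("F", 0, 2.5), ("F", 1, 4.0), ("F", 1, 3.5),
    ("M", 0, 7.0), ("M", 0, 6.5), ("M", 0, 7.5), ("M", 1, 8.0),
]

# Step 1: estimate the propensity P(proxy = 1 | sex) from cell frequencies.
counts = defaultdict(lambda: [0, 0])  # sex -> [n_proxy, n_total]
for sex, proxy, _ in records:
    counts[sex][0] += proxy
    counts[sex][1] += 1
propensity = {sex: n_proxy / n_total
              for sex, (n_proxy, n_total) in counts.items()}

# Step 2: compute weighted means of reported hours for each group,
# weighting proxy reports by 1/p and self reports by 1/(1-p).
num_p = den_p = num_s = den_s = 0.0
for sex, proxy, hours in records:
    p = propensity[sex]
    if proxy:
        w = 1.0 / p
        num_p += w * hours
        den_p += w
    else:
        w = 1.0 / (1.0 - p)
        num_s += w * hours
        den_s += w

# Step 3: the weighted difference is the covariate-adjusted proxy-vs-self gap.
gap = num_p / den_p - num_s / den_s
print(round(gap, 2))
```

A real analysis would estimate the propensity score with a logistic regression on many covariates (for instance, Stata's `teffects ipw`), but the balancing logic is exactly this.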

RAJAGOPALAN: That solves the problem of when we get these data as given, and now we have to fix this in some way. Basically, for those who are not familiar with these methods, you are just figuring out a way by which we compare apples to apples. It’s never going to be the same apple, but how close can we get to compare two things, which are sort of the same or almost the same, so that then they’re actually comparable, right? That’s where we’re going with this.

SHARMA: Correct.

Suggested Approaches to Conducting Time-Use Surveys

RAJAGOPALAN: Now, a second implication of your paper is how should we collect these data in the first place and what implications your paper has for how these time-use surveys and other surveys should be conducted. One thing that I can quite clearly gather is, we need to triangulate even within the household. Ask multiple people within the household so that you can triangulate slightly more accurate answers.

This is almost like Rashomon, the movie. You are looking at the same mother hanging out with the same child in the same room, and different people will have different perceptions of how much time was spent in childcare versus leisure versus unpaid domestic work, and hopefully, you get closer to the truth.

What are the other things that surveyors and evaluators should keep in mind when they’re collecting these data in the first place?

SHARMA: Again, benefiting from experimental studies, what can be done in these kinds of nationally representative datasets is that a combination of a time diary and an enumerator-assisted survey can be conducted. I am sure that these agencies are thinking about the time and cost of collecting this kind of data, but what it would do is provide much better data. A lot of experimental studies have looked at how short versus long surveys can help, and how a time diary plus enumerator-assisted responses can help. Enumerator-assisted responses, we can now see through my study, are biased.

Time diaries are hard to collect. Especially in India’s context, where literacy is a problem, it is difficult to do that, and it is cost-ineffective. A combination of the two can be done, and a very recent paper in the Journal of Development Economics discusses this combination of a time diary plus an enumerator-assisted survey. Rohini Pande and co-authors have experimented with this on a very small sample, and they found that it is going to be a cost-effective solution in India’s context. That’s one thing. 

The other thing is that they can try out long versus short surveys. Another very simple and basic thing these survey agencies can provide is weights for nonresponse or adjustments for proxy reporting, which would help econometricians and analysts address these issues, and there is no cost to it. 

A fourth solution: When I was reading a lot about how other countries collect time use, like the American and European time-use surveys, what they do is set up a basic baseline survey face-to-face, but conduct subsequent rounds online or by telephone, so that this problem of proxy versus self does not arise even in the telephone surveys. Again, this is a very cost-effective solution, but how successful it can be in the Indian context is a question.

RAJAGOPALAN: I would say there’s one other method for researchers, but this is not going to solve the scale problem, and it’s not going to be a substitute for the kinds of data that you’re talking about. So what can be done? I’m picking this up from some work that Ashwini Deshpande did with Naila Kabeer. We all know there’s this huge problem in India of low female labor-force participation, and then there’s a secondary problem of declining female labor-force participation, and the two are fundamentally different.

The declining part is an economic puzzle, but what I learned from their work is that the low female labor-force participation is not entirely an economic puzzle. A big part of it is a survey-design, self-reporting, time-use kind of problem. I think this work was done in West Bengal, and they calculate the difference between their own survey in these districts of West Bengal and the National Sample Survey, where we know all the problems and the good things about the NSS data. They found that instead of participation rates in the teens, or at least definitely below 20 percent, the labor-force participation in those districts is actually about 50 percent. I don’t remember the exact number.

That’s the order of magnitude difference just based on, I don’t even want to call it reporting error, but I’d say reporting bias, which is entirely a consequence of the survey design. 

Another way that, potentially, social science researchers can solve this is, you take certain data as given and then maybe run a micro survey in that same space to figure out where there is a systematic error and so on, and that’s really hard to do. Maybe for those of us who are studying something very narrow, it may be done.

SHARMA: A lot of researchers have collected their own time-use surveys in two or three villages from one or two districts. I think this resonates with the study Ashwini and Naila have done, and they found that it’s mostly reporting error. Let me tell you another very interesting point: a time-use study is much more beneficial if we conduct it on top of the existing labor-force participation survey, because it’s not just a binary “participated or not participated” that we capture. It’s how much people are participating, and what is the frequency of the participation?

If an individual has participated in employment and done something to get paid in the last 24 hours, we can extrapolate and tell how much he works. I think the labor force survey captures this using principal activity status, which is the activity done for most of the year. A supplementary question on how much time you spent in the last 24 hours, or a week or so, could be added in subsequent rounds of labor-force participation surveys.

RAJAGOPALAN: That’s super, super helpful. It was interesting reading your paper because there are all these things that one reads in the newspaper, one reads in other papers, and suddenly, everything came together when I read your paper. I was like, “Oh, this is what’s happening in this entire space.” What else are you working on right now?

Impact of Climate Change on Gendered Agricultural Work

SHARMA: As part of my postdoc, I started a couple of projects with several really good scholars at IIMA. One professor, Vidya Vemieddy, is at the Center for Management in Agribusiness. She has also worked on time-use surveys and has collected her own, but when I introduced my paper to her, she got really interested and said, “We can use this because it’s large-scale data, and we can look at how time use gets affected by climate change.”

We are particularly looking at the agricultural sector: how climate change, in terms of extreme heat and extreme rainfall, affects time spent in employment between men and women, and how they compensate for changes in time spent in agriculture by increasing or decreasing their unpaid domestic work back at home. 

We have some initial findings and some preliminary analysis done. We are able to see that when extreme temperatures are recorded in a particular area, women actually take up the most important role, spending more time in the fields doing agricultural activities, and they reduce their time in unpaid domestic work and care work. Men, on the other hand, stop work in the agricultural field only in extreme temperatures, and they contribute at least somewhat toward unpaid care work. This story is much more pronounced for married women, tribal women, and low-income women. For them, it’s necessary to work: even if the temperature is extremely hot and extremely humid, they have to work. This is what we have from that project.

Hysterectomy Rates and Health Insurance Policies in India

The other project, again a very interesting one at the intersection of health and gender, is that we are looking at administrative data and some secondary data to study how health insurance provided by the government of Andhra Pradesh had an ill effect on women’s hysterectomy rates. Andhra Pradesh, Telangana, and Bihar together have the highest hysterectomy rates in India. Since the last NFHS round, we can see the rise in hysterectomies in these states. Overall, hysterectomy rates have risen, and these three states contribute 70 percent of hysterectomies in India. 

Why and how has this state of women worsened over time, and what is the role of state health insurance in this? There was a policy change that removed private hospitals from the social health insurance scheme in Andhra Pradesh, and since then, women have not been able to avail themselves of this facility for a hysterectomy, so they are going to private hospitals. Now they’re uninsured because this particular ailment is not covered under insurance. Either they are genuinely looking for a hysterectomy, or they are getting fooled by hospitals into getting it done. The hospitals promote the uterus as a useless organ: once your childbearing age is over, you can remove your ovaries.

RAJAGOPALAN: Oh, good God.

SHARMA: Or your uterus with the help of hysterectomy.

RAJAGOPALAN: That’s terrible.

SHARMA: Now the women are weak; healthwise, they are not well. They are uninsured, and they’re getting this done in private-sector hospitals in Andhra Pradesh and Telangana. This policy change has led to the increase in hysterectomy rates in Andhra Pradesh, more so after 2011, when the policy was approved. A bad policy introduced by the government of Telangana gave private hospitals the opportunity to profit from this. 

RAJAGOPALAN: This sounds fascinating, and I’m looking forward to reading these papers. Thank you so much for speaking with us. This was such a pleasure.

SHARMA: Same. It was my first such experience, and I really enjoyed it. It was totally worth giving it a lot of time and rethinking the implications of this particular paper and how I can discuss it with a larger audience.

RAJAGOPALAN: I’m so happy to hear that.

SHARMA: Thank you. Thank you.

About Ideas of India

Hosted by Senior Research Fellow Shruti Rajagopalan, the Ideas of India podcast examines the academic ideas that can propel India forward.