Similar literature
 20 similar articles found
1.
Most evaluations are still quasi‐experimental and most recent quasi‐experimental methodological research has focused on various types of propensity score matching to minimize conventional selection bias on observables. Although these methods create better‐matched treatment and comparison groups on observables, the issue of selection on unobservables still looms large. Thus, in the absence of being able to run randomized controlled trials (RCTs) or natural experiments, it is important to understand how well different regression‐based estimators perform in terms of minimizing pure selection bias, that is, selection on unobservables. We examine the relative magnitudes of three sources of pure selection bias: heterogeneous response bias, time‐invariant individual heterogeneity (fixed effects [FEs]), and intertemporal dependence (autoregressive process of order one [AR(1)]). Because the relative magnitude of each source of pure selection bias may vary in different policy contexts, it is important to understand how well different regression‐based estimators handle each source of selection bias. Expanding simulations that have their origins in the work of Heckman, LaLonde, and Smith (1999), we find that difference‐in‐differences (DID) using equidistant pre‐ and postperiods and FEs estimators are less biased and have smaller standard errors in estimating the Treatment on the Treated (TT) than other regression‐based estimators. Our data analysis using the Job Training Partnership Act (JTPA) program replicates our simulation findings in estimating the TT.
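
As a rough illustration of the difference‐in‐differences idea mentioned in this abstract, the sketch below computes the textbook two‐period DID estimate: the pre‐to‐post change in the treated group minus the same change in the comparison group. The function names and the outcome values are hypothetical and chosen only for illustration; this is not the simulation design or the JTPA analysis the abstract describes.

```python
def mean(values):
    """Arithmetic mean of a non-empty list of numbers."""
    return sum(values) / len(values)


def did_estimate(treated_pre, treated_post, comparison_pre, comparison_post):
    """Two-period difference-in-differences estimate.

    Subtracts the comparison group's pre-to-post change from the
    treated group's pre-to-post change.
    """
    treated_change = mean(treated_post) - mean(treated_pre)
    comparison_change = mean(comparison_post) - mean(comparison_pre)
    return treated_change - comparison_change


# Hypothetical outcomes (for example, earnings) for illustration only.
treated_pre = [10.0, 12.0]
treated_post = [15.0, 17.0]
comparison_pre = [9.0, 11.0]
comparison_post = [11.0, 13.0]

estimate = did_estimate(treated_pre, treated_post, comparison_pre, comparison_post)
print(estimate)
```

With these made‐up numbers the treated group changes by 5 and the comparison group by 2, so the estimate attributes the remaining 3 to treatment, under the usual assumption that both groups would otherwise have followed parallel trends.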

2.
The doubly randomized preference trial (DRPT) is a randomized experimental design with three arms: a treatment arm, a control arm, and a preference arm. The design has useful properties that have gone unnoticed in the applied and methodological literatures. This paper shows how to interpret the DRPT design using an instrumental variables (IV) framework. The IV framework reveals that the DRPT separately identifies three different treatment effect parameters: the Average Treatment Effect (ATE), the Average Treatment Effect on the Treated (ATT), and the Average Treatment Effect on the Untreated (ATU). The ATE, ATT, and ATU parameters are important for program evaluation research because in realistic settings many social programs are optional rather than mandatory and some people who are eligible for a program choose not to participate. Most of the paper is concerned with the interpretation of the research design. To make the ideas concrete, the final section provides an empirical example using data from an existing DRPT study.

3.
This paper analyzes 12 recent within‐study comparisons contrasting causal estimates from a randomized experiment with those from an observational study sharing the same treatment group. The aim is to test whether different causal estimates result when a counterfactual group is formed, either with or without random assignment, and when statistical adjustments for selection are made in the group from which random assignment is absent. We identify three studies comparing experiments and regression‐discontinuity (RD) studies. They produce quite comparable causal estimates at points around the RD cutoff. We identify three other studies where the quasi‐experiment involves careful intact group matching on the pretest. Despite the logical possibility of hidden bias in this instance, all three cases also reproduce their experimental estimates, especially if the match is geographically local. We then identify two studies where the treatment and nonrandomized comparison groups manifestly differ at pretest but where the selection process into treatment is completely or very plausibly known. Here too, experimental results are recreated. Two of the remaining studies result in correspondent experimental and nonexperimental results under some circumstances but not others, while two others produce different experimental and nonexperimental estimates, though in each case the observational study was poorly designed and analyzed. Such evidence is more promising than what was achieved in past within‐study comparisons, most involving job training. Reasons for this difference are discussed. © 2008 by the Association for Public Policy Analysis and Management.

4.
Peers affect individuals' productivity in the workforce, in education, and in other team‐based tasks. Using large‐scale language data from an online college course, we measure the impacts of peer interactions on student learning outcomes and persistence. In our setting, students are quasi‐randomly assigned to peers, and as such, we are able to overcome selection biases stemming from endogenous peer grouping. We also mitigate reflection bias by utilizing rich student interaction data. We find that females and older students are more likely to engage in student interactions. Students are also more likely to interact with peers of the same gender and with peers from roughly the same geographic region. For students who are relatively less likely to be engaged in online discussion, exposure to more interactive peers increases their probabilities of passing the course, improves their grade in the course, and increases their likelihood of enrolling in the following academic term. This study demonstrates how the use of large‐scale, text‐based data can provide insights into students’ learning processes.

5.
The (unheralded) first step in many applications of automated text analysis involves selecting keywords to choose documents from a large text corpus for further study. Although all substantive results depend on this choice, researchers usually pick keywords in ad hoc ways that are far from optimal and usually biased. Most seem to think that keyword selection is easy, since they do Google searches every day, but we demonstrate that humans perform exceedingly poorly at this basic task. We offer a better approach, one that also can help with following conversations where participants rapidly innovate language to evade authorities, seek political advantage, or express creativity; generic web searching; eDiscovery; look‐alike modeling; industry and intelligence analysis; and sentiment and topic analysis. We develop a computer‐assisted (as opposed to fully automated or human‐only) statistical approach that suggests keywords from available text without needing structured data as inputs. This framing poses the statistical problem in a new way, which leads to a widely applicable algorithm. Our specific approach is based on training classifiers, extracting information from (rather than correcting) their mistakes, and summarizing results with easy‐to‐understand Boolean search strings. We illustrate how the technique works with analyses of English texts about the Boston Marathon bombings, Chinese social media posts designed to evade censorship, and others.

6.
With an unrepresentative sample, the estimate of a causal effect may fail to characterize how effects operate in the population of interest. What is less well understood is that conventional estimation practices for observational studies may produce the same problem even with a representative sample. Causal effects estimated via multiple regression differentially weight each unit's contribution. The “effective sample” that regression uses to generate the estimate may bear little resemblance to the population of interest, and the results may be nonrepresentative in a manner similar to what quasi‐experimental methods or experiments with convenience samples produce. There is no general external validity basis for preferring multiple regression on representative samples over quasi‐experimental or experimental methods. We show how to estimate the “multiple regression weights” that allow one to study the effective sample. We discuss alternative approaches that, under certain conditions, recover representative average causal effects. The requisite conditions cannot always be met.
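
To make the idea of regression weights more concrete, here is a minimal sketch under a stated assumption: that each unit's weight can be approximated by the squared residual from regressing the treatment on the covariates. The abstract itself does not give a formula, so this formulation, the single‐covariate setup, the function names, and the toy data are all assumptions made for illustration. The sketch also uses the Frisch‐Waugh‐Lovell result to show that the treatment coefficient is driven by those same residuals.

```python
def residualize(d, x):
    """Residuals from an OLS regression of d on an intercept and one covariate x."""
    n = len(d)
    mean_d = sum(d) / n
    mean_x = sum(x) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxd = sum((xi - mean_x) * (di - mean_d) for xi, di in zip(x, d))
    slope = sxd / sxx
    return [(di - mean_d) - slope * (xi - mean_x) for xi, di in zip(x, d)]


def regression_weights(d, x):
    """Assumed weight per unit: squared residual of treatment given the covariate."""
    return [r * r for r in residualize(d, x)]


def treatment_coefficient(y, d, x):
    """Coefficient on d from regressing y on (1, x, d), via Frisch-Waugh-Lovell."""
    resid = residualize(d, x)
    return sum(r * yi for r, yi in zip(resid, y)) / sum(r * r for r in resid)


# Toy data (hypothetical): outcome is exactly 1 + 2*d + 3*x.
x = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
d = [0.0, 0.0, 1.0, 0.0, 1.0, 1.0]
y = [1.0 + 2.0 * di + 3.0 * xi for di, xi in zip(d, x)]

weights = regression_weights(d, x)
beta = treatment_coefficient(y, d, x)
print(weights)
print(beta)
```

Units whose treatment status is well predicted by the covariate receive weights near zero, so they contribute little to the estimate. Comparing the weighted sample with the nominal sample is one way to inspect the "effective sample" the abstract refers to.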

7.
Theory predicts that regression discontinuity (RD) provides valid causal inference at the cutoff score that determines treatment assignment. One purpose of this paper is to test RD's internal validity across 15 studies. Each of them assesses the correspondence between causal estimates from an RD study and a randomized control trial (RCT) when the estimates are made at the same cutoff point where they should not differ asymptotically. However, statistical error, imperfect design implementation, and a plethora of different possible analysis options mean that they might nonetheless differ. We test whether they do, assuming that the bias potential is greater with RDs than RCTs. A second purpose of this paper is to investigate the external validity of RD by exploring how the size of the bias estimates varies across the 15 studies, for they differ in their settings, interventions, analyses, and implementation details. Both Bayesian and frequentist meta‐analysis methods show that the RD bias is below 0.01 standard deviations on average, indicating RD's high internal validity. When the study‐specific estimates are shrunken to capitalize on the information the other studies provide, all the RD causal estimates fall within 0.07 standard deviations of their RCT counterparts, now indicating high external validity. With unshrunken estimates, the mean RD bias is still essentially zero, but the distribution of RD bias estimates is less tight, especially with smaller samples and when parametric RD analyses are used.

8.
We believe that careful application of the logic of economics and public choice shines important light on regulation through litigation and can explain at least partly why regulators choose the litigation route, when they choose it, and how the choice may or may not achieve broad goals of efficiency and fairness. We present three case studies: heavy‐duty diesel engines, silica and asbestos, and the tobacco industry's Master Settlement Agreement (MSA).

9.
Partisanship often colors how citizens perceive real‐world conditions. For example, an oft‐documented finding is that citizens tend to view the state of the national economy more positively if their party holds office. These partisan perceptual gaps are usually taken as a result of citizens' own motivated reasoning to defend their party identity. However, little is known about the extent to which perceptual gaps are shaped by one of the most important forces in politics: partisan elites. With two studies focusing on perceptions of the economy—a quasi‐experimental panel study and a randomized experiment—we show how partisan perceptual differences are substantially affected by messages coming from party elites. These findings imply that partisan elites are more influential on, and more responsible for, partisan perceptual differences than previous studies have revealed.

10.
The Dictator Game, Fairness and Ethnicity in Postwar Bosnia   Total citations: 1, self-citations: 0, citations by others: 1
This study considers the effects of ethnic violence on norms of fairness. Once violence is a foregone conclusion, will cooperative norms ever (re‐)emerge beyond ethnic boundaries? We use an experiment that measures how fairly individuals in a postconflict setting treat their own ingroup in comparison to the outgroups—in this case, examining the behavior of 681 Muslims, Croats, and Serbs in postwar Bosnia‐Herzegovina. To assess fairness, we use the dictator game wherein subjects decide how to allocate a sum of money between themselves and an anonymous counterpart of varying ethnicity. We find that the effects of ethnicity on decision making are captured by our experiments. Although results indicate preferential ingroup treatment, the incidence and magnitude of outgroup bias is much less than expected. We conclude that norms of fairness across ethnicity are remarkably strong in Bosnia, and we take this to be a positive sign for reconciliation after violent conflict.

11.
The sharp regression discontinuity design (RDD) has three key weaknesses compared to the randomized clinical trial (RCT). It has lower statistical power, it is more dependent on statistical modeling assumptions, and its treatment effect estimates are limited to the narrow subpopulation of cases immediately around the cutoff, which is rarely of direct scientific or policy interest. This paper examines how adding an untreated comparison to the basic RDD structure can mitigate these three problems. In the example we present, pretest observations on the posttest outcome measure are used to form a comparison RDD function. To assess its performance as a supplement to the basic RDD, we designed a within‐study comparison that compares causal estimates and their standard errors for (1) the basic posttest‐only RDD, (2) a pretest‐supplemented RDD, and (3) an RCT chosen to serve as the causal benchmark. The two RDD designs are constructed from the RCT, and all analyses are replicated with three different assignment cutoffs in three American states. The results show that adding the pretest makes functional form assumptions more transparent. It also produces causal estimates that are more precise than in the posttest‐only RDD, but that are nonetheless larger than in the RCT. Neither RDD version shows much bias at the cutoff, and the pretest‐supplemented RDD produces causal effects in the region beyond the cutoff that are very similar to the RCT estimates for that same region. Thus, the pretest‐supplemented RDD improves on the standard RDD in multiple ways that bring causal estimates and their standard errors closer to those of an RCT, not just at the cutoff, but also away from it.

12.
We develop an approach to conducting large-scale randomized public policy experiments intended to be more robust to the political interventions that have ruined some or all parts of many similar previous efforts. Our proposed design is insulated from selection bias in some circumstances even if we lose observations; our inferences can still be unbiased even if politics disrupts any two of the three steps in our analytical procedures; and other empirical checks are available to validate the overall design. We illustrate with a design and empirical validation of an evaluation of the Mexican Seguro Popular de Salud (Universal Health Insurance) program we are conducting. Seguro Popular, which is intended to grow to provide medical care, drugs, preventative services, and financial health protection to the 50 million Mexicans without health insurance, is one of the largest health reforms of any country in the last two decades. The evaluation is also large scale, constituting one of the largest policy experiments to date and what may be the largest randomized health policy experiment ever.

13.
Many researchers use unit fixed effects regression models as their default methods for causal inference with longitudinal data. We show that the ability of these models to adjust for unobserved time‐invariant confounders comes at the expense of dynamic causal relationships, which are permitted under an alternative selection‐on‐observables approach. Using the nonparametric directed acyclic graph, we highlight two key causal identification assumptions of unit fixed effects models: Past treatments do not directly influence current outcome, and past outcomes do not affect current treatment. Furthermore, we introduce a new nonparametric matching framework that elucidates how various unit fixed effects models implicitly compare treated and control observations to draw causal inference. By establishing the equivalence between matching and weighted unit fixed effects estimators, this framework enables a diverse set of identification strategies to adjust for unobservables in the absence of dynamic causal relationships between treatment and outcome variables. We illustrate the proposed methodology through its application to the estimation of GATT membership effects on dyadic trade volume.

14.
We analyze the first large‐scale, randomized experiment to measure presidential approval levels at all outcomes of a canonical international crisis‐bargaining model, thereby avoiding problems of strategic selection in evaluating presidential incentives. We find support for several assumptions made in the crisis‐bargaining literature, including that a concession from a foreign state leads to higher approval levels than other outcomes, that the magnitudes of audience costs are under presidential control prior to the initiation of hostilities, and that these costs can be made so large that presidents have incentive to fight wars they will not win. Thus, the credibility of democratic threats can be made extremely high. We also find, however, that partisan cues strongly condition presidential incentives. Party elites have incentives to behave according to type in Congress and contrary to type in the Oval Office, and Democratic presidents sometimes have incentives to fight wars they will not win.

15.
Nearly every aggregate study of minority legislative representation has observed outcomes of elections (officeholders), rather than the supply of minority candidates. Because of this, scholars have left a large amount of important data, the election losers, out of their models of minority representation. The evidence presented in this article demonstrates that voters in the United States cannot choose minority officeholders because there are rarely minority candidates on the ballot. I use state legislative candidate data from Carsey et al. (2008) and Klarner et al. (2012) to test models of Latino representation that correct for first‐stage selection bias. Once candidate self‐selection is taken into account, the probability of electing a Latino increases enormously. I then use data from 2010 to make out‐of‐sample predictions, which clearly favor the conditional model. Thus, our current understanding of Latino representation is significantly biased by ignoring the first stage of an election, a candidate's decision to run.

16.
Unelected officials with coercive powers (e.g., police, prosecutors, bureaucrats) vary markedly in the extent to which citizens view their actions as legitimate. We explore the institutional determinants of legitimate authority in the context of a public goods laboratory experiment. In the experiment, an “authority” can target one “citizen” for punishment following citizen contribution choices. Untargeted citizens can then choose to help or hinder the authority. This latter choice may be interpreted as a behavioral measure of the authority's legitimacy. We find that legitimacy is affected by how authorities are compensated, the transparency with which their decisions are observed, and an interaction between these. When transparency is high, citizens are more willing to assist authorities who receive fixed salaries than those who personally benefit from collected penalties, even when citizens' material incentives are controlled for. Lower transparency reduces support, but only for salaried enforcers.

17.
Despite declining memberships, labor unions still represent large shares of electorates worldwide. Yet their political clout remains contested. To what extent, and in what way, do unions shape workers' political preferences? We address these questions by combining unique survey data of American workers and a set of inferential strategies that exploit two sources of variation: the legal choice that workers face in joining or opting out of unions and the over‐time reversal of a union's policy position. Focusing on the issue of trade, we offer evidence that unions influence their members' policy preferences in a significant and theoretically predictable manner. In contrast, we find that self‐selection into membership accounts at most for a quarter of the observed “union effect.” The study illuminates the impact of unions in cohering workers' voice and provides insight on the role of information provision in shaping how citizens form policy preferences.

18.
Skeptics of school choice are concerned that parents, especially low‐income ones, will not choose schools based on sound academic reasoning. Many fear that, given choice, parents will sort themselves into different schools along class lines. However, most surveys find that parents of all socioeconomic groups cite academic aspects as important when choosing a school. Moreover, almost no parents refer to the social composition of the student body. Many advocates of choice hold up these results as proof that choice will produce desirable outcomes. However, these results may not be reliable because they may simply be verbal responses to survey items rather than indicators of actual behavior. In this research, we report on the search behavior of parents in the Metropolitan Region of Santiago, Chile, examining how they construct their school choice sets and comparing this to what they say they are seeking in choosing schools. The data indicate that parental decisions are influenced by demographics. Based on this evidence, we argue that unfettered choice may reduce the pressure on schools to improve their performance and could potentially increase stratification. © 2006 by the Association for Public Policy Analysis and Management

19.
In principle, experiments offer a straightforward method for social scientists to accurately estimate causal effects. However, scholars often unwittingly distort treatment effect estimates by conditioning on variables that could be affected by their experimental manipulation. Typical examples include controlling for posttreatment variables in statistical models, eliminating observations based on posttreatment criteria, or subsetting the data based on posttreatment variables. Though these modeling choices are intended to address common problems encountered when conducting experiments, they can bias estimates of causal effects. Moreover, problems associated with conditioning on posttreatment variables remain largely unrecognized in the field, which we show frequently publishes experimental studies using these practices in our discipline's most prestigious journals. We demonstrate the severity of experimental posttreatment bias analytically and document the magnitude of the potential distortions it induces using visualizations and reanalyses of real‐world data. We conclude by providing applied researchers with recommendations for best practice.

20.
Evaluations of the impact of social programs are often carried out in multiple sites, such as school districts, housing authorities, local TANF offices, or One‐Stop Career Centers. Most evaluations select sites purposively following a process that is nonrandom. Unfortunately, purposive site selection can produce a sample of sites that is not representative of the population of interest for the program. In this paper, we propose a conceptual model of purposive site selection. We begin with the proposition that a purposive sample of sites can usefully be conceptualized as a random sample of sites from some well‐defined population, for which the sampling probabilities are unknown and vary across sites. This proposition allows us to derive a formal, yet intuitive, mathematical expression for the bias in the pooled impact estimate when sites are selected purposively. This formula helps us to better understand the consequences of selecting sites purposively, and the factors that contribute to the bias. Additional research is needed to obtain evidence on how large the bias tends to be in actual studies that select sites purposively, and to develop methods to increase the external validity of these studies. © 2012 by the Association for Public Policy Analysis and Management.
