Racial Disparities in IRS Audits

Following up on Les’ post yesterday about the upcoming symposium on race and taxes, I want to draw our readers’ attention to an important study released yesterday on the racial impact of Internal Revenue Service (IRS) audits, titled Measuring and Mitigating Racial Disparities in Tax Audits.  Now, before I launch into what the study found, let me make my usual disclaimer: I am not an academic, nor am I an economist; I have zero expertise in statistics or in computer modelling.  But what I read in this incredibly coherent and well-written study is accessible to and understandable by someone such as myself (mathematical equations notwithstanding). Kudos to the authors!  And everyone should read this.

The study estimated that the audit rate for Black taxpayers is 0.81 to 1.24 percentage points higher than the audit rate for non-Black taxpayers, implying that Black taxpayers are audited at 2.9 to 4.7 times the rate of non-Black taxpayers.  Further, the study projected that Black taxpayers claiming the Earned Income Tax Credit (EITC) on their returns are 2.9 to 4.4 times more likely to be audited than non-Black EITC claimants.

The IRS does not gather racial or ethnicity data on taxpayers, so the researchers had to impute race by various methods that have been used in other studies.  The paper lays out the methodology and identifies the drawbacks of this approach.  Then, by using various models, the researchers found that racial differences in income, family size, and household structure still don’t explain the large disparity in audit rates.  Instead, audit selection algorithms based on the mere existence of underreported tax (i.e., a yes/no binary selection) or on underreporting of refundable credits such as the EITC, the Additional Child Tax Credit (ACTC), and the American Opportunity Tax Credit (AOTC) appear to contribute to the racial disparity in audit rates, whereas audit selection algorithms focusing on the predicted total amount of underreporting resulted in Blacks being audited less than non-Blacks.  Thus, the study concludes that the objective of the predictive model underlying audit selection, along with operational considerations such as employee expertise, costs of audits, and congressional or other expectations, can be “critical drivers of disparity.”


Here are some of the fascinating data points from the study:

  • Black EITC claimants are audited at higher rates than non-Black EITC claimants within each decile of underreported taxes;
  • Black unmarried males with children claiming the EITC are audited at two times the rate of non-Black unmarried males with children claiming the EITC.  Similar audit disparities were found between Black and non-Black joint filers, unmarried females, and unmarried males with no dependents.  These disparities continued even after controlling for demographic characteristics such as filing status, household structure, or income, or the combination and interaction of these characteristics.  Thus, these characteristics alone don’t explain the racial disparity in audit rates.
Figure 6: Audit Rate Disparities by EITC Subgroup
  • Blacks and non-Blacks are audited at similar rates in field and office audits; thus the audit disparity is driven by the dominance of correspondence exams as the IRS tool of choice for individual audits.  As I’ve covered in other blog posts here and here, correspondence exam procedures and notifications are very problematic.  Add the racial disparity to those problems, and you have a significant taxpayer rights issue.
  • The audit disparity between Blacks and non-Blacks is larger with respect to pre-refund audits, although the disparity exists in post-refund audits as well.  This is a significant observation; because the IRS is front-loading more compliance activity in the pre-refund environment, it is likely the racial disparity will be exacerbated going forward unless the IRS takes steps to correct for it.

To test possible factors that might influence the audit disparity, the study’s authors built a risk-prediction model using National Research Program random EITC audit data (rather than the operational audits selected by the IRS automated algorithms).   Without going into all the details – I would probably butcher the description, and besides, you need to read the full study – they tested four methods of selecting taxpayers for audits:

  1. True Underreporting Amount:  returns were ranked by the amount of underreported taxes and then, in descending order, selected for audit until a pre-determined audit rate was achieved.  At each audit rate, the model selected Black taxpayers at a lower rate than non-Black taxpayers.
  2. Predictive Underreporting Amount:  Because at the time of audit selection the IRS does not know the true amount of underreported taxes, the authors built a model to rank returns for predicted underreporting based on what the IRS can see at the time of audit selection, and selected returns in a descending order until a specified audit rate was achieved.  This model detected twice the amount of underreporting as the “status quo” model (thus not sacrificing audit selection accuracy) while selecting Black taxpayers at a lower rate than non-Black taxpayers.
  3. Binary Classification Model:  Here, the authors compared the results of the predictive model (described above) with an algorithm that selected returns simply on whether underreported tax might exist.  The “classification” model selected Black taxpayers for audits at higher rates than non-Black taxpayers.
  4. Refundable Credit Model:  This model was trained to predict the total National Research Program adjustments for the EITC, AOTC, and ACTC, and then selected returns for auditing based on the descending order of the predicted overclaims, up to a specified “budget.”  Comparing this model to the baseline status quo model showed that the refundable credit model detects substantially less total underreporting (i.e., the EITC population has other types of underreporting) and selects Black taxpayers for audit at higher rates than non-Blacks for all budget categories.
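
To make the difference between these selection objectives concrete, here is a minimal Python sketch of the "rank, then select in descending order until the audit rate is reached" step that the methods above share. It is an illustration only, not the authors' model: the `Return` class, the function names, the group labels, and every number in `TOY_RETURNS` are invented for this example and are not drawn from the study.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class Return:
    """A hypothetical tax return with two model scores (all values invented)."""
    group: str                    # placeholder group label, not real data
    predicted_amount: float       # predicted dollars of underreporting
    predicted_probability: float  # predicted chance that any underreporting exists


def select_for_audit(returns: list[Return],
                     score: Callable[[Return], float],
                     audit_rate: float) -> list[Return]:
    """Rank returns by `score`, highest first, and take the top share."""
    n_audits = round(audit_rate * len(returns))
    ranked = sorted(returns, key=score, reverse=True)
    return ranked[:n_audits]


def audit_rate_by_group(returns: list[Return],
                        selected: list[Return]) -> dict[str, float]:
    """Share of each group's returns that ended up selected for audit."""
    totals: dict[str, int] = {}
    audited: dict[str, int] = {}
    for r in returns:
        totals[r.group] = totals.get(r.group, 0) + 1
    for r in selected:
        audited[r.group] = audited.get(r.group, 0) + 1
    return {g: audited.get(g, 0) / totals[g] for g in totals}


# Invented toy population: group_1 has a few large predicted amounts,
# group_2 has many small but near-certain predicted errors.
TOY_RETURNS = [
    Return("group_1", 9000, 0.55), Return("group_1", 7000, 0.50),
    Return("group_1", 400, 0.30), Return("group_1", 300, 0.20),
    Return("group_1", 100, 0.10),
    Return("group_2", 1200, 0.90), Return("group_2", 1000, 0.85),
    Return("group_2", 500, 0.40), Return("group_2", 200, 0.15),
    Return("group_2", 50, 0.05),
]

# Same population, same 20 percent audit rate, two different objectives.
by_amount = select_for_audit(TOY_RETURNS, lambda r: r.predicted_amount, 0.2)
by_probability = select_for_audit(TOY_RETURNS, lambda r: r.predicted_probability, 0.2)

print(audit_rate_by_group(TOY_RETURNS, by_amount))
print(audit_rate_by_group(TOY_RETURNS, by_probability))
```

In this contrived population the two objectives select entirely different returns: ranking by predicted amount audits only group_1, while ranking by predicted probability audits only group_2. Real data would never be this clean, but the mechanism is the one the study describes: the choice of what the algorithm is asked to predict can shift who gets audited even though group membership is never an input.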

Having demonstrated that the objective of the prediction model may be one factor contributing to racial disparity in audit selection, the authors considered what other factors/constraints might influence this disparity.  They identify limitations of employee expertise, the cost of audits, and certain policy goals.  So, for example, if underreported tax were attributable not just to EITC but also to the reporting of business income, IRS employees may not be skilled in detecting this underreporting and such audits may need to be done in an office or field environment, which is much more costly than the “cheap” correspondence examinations.  The concern that business EITC audits require expensive and trained resources could push the IRS to conduct less expensive non-business audits.  Since Black taxpayers constitute 21 percent of non-business EITC returns and 11 percent of business EITC returns, these operational concerns would heighten racial disparity in audit selection.  In another simulation, the study found that operational constraints contributed to higher audit rates for Black taxpayers.
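
The arithmetic behind the 21 percent and 11 percent figures can be sketched in a few lines of Python. This is a back-of-the-envelope illustration, not the study's simulation: the function name is hypothetical, and it assumes (purely for illustration) that within each category audits are drawn without regard to race, so the Black share of audits in a category equals the Black share of returns in that category.

```python
# Shares quoted in the paragraph above.
NON_BUSINESS_BLACK_SHARE = 0.21  # Black share of non-business EITC returns
BUSINESS_BLACK_SHARE = 0.11      # Black share of business EITC returns


def black_share_of_audits(non_business_fraction: float) -> float:
    """Expected Black share of EITC audits for a given audit mix.

    `non_business_fraction` is the fraction of audits aimed at non-business
    EITC returns; the remainder goes to business EITC returns.
    Assumption (illustrative only): selection within each category is
    proportional to the category's racial composition.
    """
    if not 0.0 <= non_business_fraction <= 1.0:
        raise ValueError("non_business_fraction must be between 0 and 1")
    return (non_business_fraction * NON_BUSINESS_BLACK_SHARE
            + (1.0 - non_business_fraction) * BUSINESS_BLACK_SHARE)


# Shifting the mix toward cheaper non-business audits raises the share.
for fraction in (0.0, 0.5, 1.0):
    print(fraction, round(black_share_of_audits(fraction), 3))
```

Under that assumption, every audit moved from the business category to the non-business category raises the expected Black share of audits, which is the direction of the effect described above.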

Another constraint is at the policy level: given OMB’s classification of the EITC, ACTC, and AOTC as “improper payments,” Congress and the IRS itself have determined that, to address these improper payments, the IRS must conduct more EITC audits.  As the study shows, selecting based on EITC underreporting alone results in higher audit rates for Blacks.  (Note that nothing in the law governing improper payments requires audits.  Rather, it says agencies must come up with strategies to address these payments; those strategies can include education, outreach, soft compliance touches, and any number of non-audit approaches.)

Now here is one of the key implications of the study – that even where audit selection processes are largely automated, so that there is no overt/intentional discrimination in audit selection, the focus or objectives of those automated processes can significantly impact racial disparity in audit selection.  When algorithms are designed to select for underreported refundable credits rather than the totality of underreported tax, for example, the racial disparity continues. But what is really neat about this study is that it uses algorithms and machine learning to ferret out what might be creating that disparity, which in turn can be used to improve the actual selection algorithm.

And one final note.  One thing that the study makes the case for and I have been harping about for a long time is the need to focus audit selection on amount underreported rather than specific issues like EITC.  The former approach accepts that dollars are fungible – i.e., a dollar of tax underreported because of unreported income or overreported expenses/deductions is the same as a dollar of tax underreported because of improperly claimed refundable credits.  It is only because of value-laden biases that we treat these two differently.  The study demonstrates the impact of those biases on differing racial groups.

I’ve only scratched the surface of this important paper.  I’m sure I’ve missed some of the points, and I intend to study it further.  There’s much more work to be done here, which is what the authors say in closing their report:

[O]ur analysis of counterfactual audit algorithms does not account for the full set of constraints facing tax authorities like the IRS, such as the types of compliance issues that can be explored through correspondence audit, or differences in audit response rates depending on whether the audit is pre- versus post-refund. A more complete optimal policy analysis would require accounting for these additional objectives and constraints. Finally, audit selection constitutes only one dimension in which tax administration may differently affect taxpayers by race. Disparities may also exist with respect to such processes as collections, appeals, settlements, and guidance (citations omitted). The approach described in this paper can serve as a foundation to explore disparities in these areas as well.