How do they know the race of the taxpayer?
...
We use total income, filing status, age, number of dependents, sex, first name, last name, and the ZIP Code Tabulation Area (ZCTA) of the residence as the explanatory variables to make inferences about the race and Hispanic origin category of the primary taxpayer of a filing unit or family. We apply Bayesian inference to estimate the probabilities that each taxpayer in our tax sample is in each of the 6 racial and Hispanic origin groups (White, Black, American Indian or Alaska Native (Native), Asian or Pacific Islander (API), multiple-race, and Hispanic), given the explanatory variables.
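To make the "Bayesian inference" step concrete, here is a rough sketch of how a posterior over the 6 groups could be computed from a prior (for example, the group shares in the taxpayer's ZCTA) and per-variable likelihoods (for example, how common a last name is within each group). This is not the paper's actual model: treating the variables as conditionally independent given the group is an assumption made here for illustration, the function name `posterior` is hypothetical, and every number below is a made-up placeholder, not real data.

```python
from math import prod

GROUPS = ["White", "Black", "Native", "API", "Multiple", "Hispanic"]


def posterior(prior, likelihoods):
    """Return P(group | variables) via Bayes' rule.

    prior:       dict mapping group -> P(group), e.g. ZCTA-level shares.
    likelihoods: list of dicts, each mapping group -> P(observed value | group).
    Assumes (for illustration only) that the variables are conditionally
    independent given the group, so their likelihoods multiply.
    """
    unnormalized = {
        g: prior[g] * prod(lk[g] for lk in likelihoods) for g in GROUPS
    }
    total = sum(unnormalized.values())
    if total == 0:
        raise ValueError("all groups have zero probability for this input")
    return {g: value / total for g, value in unnormalized.items()}


# Placeholder numbers, invented for this example only.
zcta_prior = {
    "White": 0.50, "Black": 0.20, "Native": 0.02,
    "API": 0.08, "Multiple": 0.05, "Hispanic": 0.15,
}
last_name_likelihood = {
    "White": 0.001, "Black": 0.001, "Native": 0.001,
    "API": 0.001, "Multiple": 0.001, "Hispanic": 0.020,
}

probs = posterior(zcta_prior, [last_name_likelihood])
for group in GROUPS:
    print(f"{group:9s} {probs[group]:.3f}")
```

So nobody "knows" the race of an individual taxpayer. The output is a set of six probabilities per taxpayer that sum to 1, and in this toy case a last name that is much more common in one group shifts most of the probability toward that group even though the ZCTA prior favored another.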