
The Internal Revenue Service is collecting a lot more than taxes this year -- it's also acquiring a huge volume of personal information on taxpayers' digital activities, from eBay auctions to Facebook posts and, for the first time ever, credit card and e-payment transaction records, as it expands its search for tax cheats to places it's never gone before.

The IRS, under heavy pressure to help Washington out of its budget quagmire by chasing down an estimated $300 billion in revenue lost to evasion and errors each year, will start running "robo-audits" of tax forms and third-party data that it hopes will help close this so-called "tax gap." But the agency reveals little about how it will employ its vast, new network scanning powers.

Tax lawyers and watchdogs are concerned about the sweeping changes being implemented with little public discussion or clear guidelines, and Congressional staff sources say the IRS's use of "big data" will be a key issue when the next IRS chief comes to the Senate for approval. Acting commissioner Steven T. Miller replaced Douglas Shulman last November.

"It's well-known in the tax community, but not many people outside of it are aware of this big expansion of data and computer use," says Edward Zelinsky, a tax law expert and professor at Benjamin N. Cardozo School of Law and Yale Law School. "I am sure people will be concerned about the use of personal information on databases in government, and those concerns are well-taken. It's appropriate to watch it carefully. There should be safeguards." He adds that taxpayers should know that whatever people do and say electronically can and will be used against them in IRS enforcement.

IRS's big data tracking

Consumers are already familiar with Internet "cookies" that track their movements and send them targeted ads that follow them to different websites. The IRS has brought in private industry experts to employ similar digital tracking -- but with the added advantage of access to Social Security numbers, health records, credit card transactions and many other privileged forms of information that marketers don't see.

"Private industry would be envious if they knew what our models are," boasted Dean Silverman, the agency's high-tech top gun who heads a group recruited from the private sector to update the IRS, in a comment reported in trade publications. The IRS did not respond to a request for an interview.

In trade presentations and public documents, the agency has said it will use a massively parallel computer system that can analyze data from different networks to find irregularities and suspicious activities.

Much of the work already has been automated to process and analyze electronic tax returns in current "robo-audits" that flag unusual behavior patterns. With IRS audit staff reduced by budget cuts this year, the agency will be forced to rely on computer-generated audits more than ever.

The agency declined to comment on how it will use its new technology. But agency officials have been outlining plans at industry conferences, working with IBM, EMC and other private-sector specialists. In presentations, officials have said they may use the big data for:

  • Charting and analyzing social media such as Facebook.
  • Targeting audits by matching tax filings to social media or electronic payments.
  • Tracking individual Internet addresses and emailing patterns.
  • Sorting data into 32,000 categories of metadata and 1 million unique "attributes."
  • Machine learning across "neural" networks.
  • Statistical and agent-based modeling.
  • Relationship analysis based on Social Security numbers and other personal identifiers.

Officials have said much of the data will be used only for research. The agency's economic forecasts and data are a key part of Washington's budget infrastructure. Former commissioner Douglas Shulman said in an IRS statement that the technology will employ "billions of pieces of data" to target enforcement and to "detect and combat noncompliance."

U.S. Tax Court records show that information gathered from Facebook and eBay postings has been used by the IRS in defending tax challenges. Privacy advocates at the Electronic Frontier Foundation obtained, through a Freedom of Information Act disclosure, the IRS's 38-page manual used to train auditors to search Internet addresses, Facebook postings and other social media to back audit enforcement, and the group has published it.

In practice, the third-party data has been used only if the irregular returns merit more attention. In one much-cited example, IRS officials talk about prisoners who were filing false claims for energy tax credits for window replacements.

The agency, wary of public opinion about invasive audit practices, has pulled back from using so-called "social audits," which, for example, might single out horse-racing enthusiasts or sailboaters for special attention. But by screening existing data for one million unique attributes, the agency can quietly create a DNA-like code to understand the economic behavior of any individual.

The IRS last year used a profiling test model to study 1,500 tax preparers with histories of reporting deficiencies and managed to recover $200 million. It cited the experience as proof that its data analysis works. Early this year, however, a new set of rules it developed for tax preparers was thrown out by a federal court, which said the agency had overstepped its mandate. The IRS would not comment on whether the rules were based on its new screening tools.
