“My AI told me so” — Why that won’t save you with the IRS

As artificial intelligence tools become fixtures in how people research and manage their finances, a question is quietly moving from hypothetical to urgent: what happens when a taxpayer relies on AI for tax advice, gets it wrong, and tries to argue reasonable cause? The answer, under current law, is not encouraging.

There is a long and not particularly glorious tradition of taxpayers arriving at an IRS examination with a variant of the same defence: someone else prepared my return, I relied on them, and whatever went wrong is therefore not my fault. The IRS has been hearing this argument for decades. So have the courts. The results have not, on the whole, been favourable to the taxpayer.

Now a new version of that argument is taking shape, one that will likely occupy tax professionals and courts for years to come. As AI tools become more capable, more accessible, and more confidently wrong, a growing number of taxpayers are using them not just to organise their affairs but to make substantive tax decisions. The question of what happens when those decisions turn out to be mistaken is one the legal system has not yet fully answered. But the existing framework gives a clear enough preview.

The foundational rule has not changed

Start with the bedrock principle, because it governs everything else. Signing a tax return is a legal declaration that you, personally, believe it to be correct. Authorising someone, or something, else to prepare it does not alter that obligation. The IRS Internal Revenue Manual is explicit: handing the preparation work to a representative does not relieve the taxpayer of their legal responsibilities.

The IRS’s own penalty relief guidance goes further, stating that reliance on a tax professional does not generally qualify as reasonable cause for a late or incorrect filing. You remain responsible for knowing the filing requirements, meeting the deadlines, and ensuring the numbers are right. This principle has withstood decades of creative argumentation, and there is no obvious reason why the identity of the “adviser”, human or machine, would change it.

The reasonable cause defence: what it actually requires

The law does provide a genuine defence. Under IRC §6664(c), an accuracy-related penalty will not apply if there was reasonable cause for the underpayment and the taxpayer acted in good faith. Reliance on a qualified adviser can satisfy that standard, but only under conditions that courts have defined with some precision.

The governing test comes from Neonatology Associates, P.A. v. Commissioner, 115 T.C. 43 (2000), and it is a three-part requirement. The taxpayer must show, first, that the adviser was a competent professional with sufficient expertise to justify reliance. Second, that the taxpayer provided all necessary and accurate information to that adviser. Third, that the taxpayer actually relied in good faith on the adviser’s judgment. All three elements are required. Falling short on any one of them defeats the defence.
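
The all-or-nothing structure of the test can be sketched in a few lines of Python. This is purely an illustration of the logic described above, not a legal determination: the class and function names are hypothetical, and each yes/no field stands in for what is, in practice, a judgment on all the facts and circumstances.

```python
from dataclasses import dataclass


@dataclass
class RelianceFacts:
    """Hypothetical yes/no summary of the three Neonatology elements."""
    adviser_competent_professional: bool   # prong 1
    full_accurate_information_given: bool  # prong 2
    actual_good_faith_reliance: bool       # prong 3


def failed_prongs(facts: RelianceFacts) -> list[str]:
    """Return the names of any elements the taxpayer cannot show."""
    checks = {
        "competent professional": facts.adviser_competent_professional,
        "necessary and accurate information": facts.full_accurate_information_given,
        "actual good faith reliance": facts.actual_good_faith_reliance,
    }
    return [name for name, satisfied in checks.items() if not satisfied]


def reliance_defence_available(facts: RelianceFacts) -> bool:
    """All three elements are required; any single failure defeats the defence."""
    return not failed_prongs(facts)


# Example: two elements shown, one missing.
example = RelianceFacts(True, False, True)
print(reliance_defence_available(example))  # False
print(failed_prongs(example))               # ['necessary and accurate information']
```

Because the elements are conjunctive, a single failed prong is enough for the function to return False, however strong the other two may be.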

Treasury Regulation §1.6664-4 adds further requirements: the advice must not have been based on unreasonable factual or legal assumptions, and the taxpayer must have disclosed all facts they knew or should have known to be relevant. The regulation also specifies that reliance on a professional does not automatically demonstrate good faith; it merely creates the possibility of demonstrating it, subject to all the circumstances.

Apply that framework to an AI tool, and the difficulties become apparent quickly.

AI and the three-part test: a poor fit

The first prong of the Neonatology test asks whether the adviser was a competent professional with sufficient expertise. An AI tool is not a professional in any legally recognised sense. It holds no qualification, carries no licence, is subject to no professional disciplinary regime, and owes no duty of care to the user. It cannot be sanctioned for bad advice, and it will not be available to testify about what it said and why. The Taxpayer Advocate Service has noted that AI chatbots from leading tax preparation providers gave inaccurate answers to complex tax questions roughly half the time. A 50% error rate on complex questions is not what “sufficient expertise to justify reliance” looks like in the case law.

The second prong, whether the taxpayer provided necessary and accurate information, introduces a different problem. AI tools operate on whatever inputs the user provides, with no mechanism to probe for missing context, request clarification on ambiguous facts, or flag that they are missing information they would need to give reliable advice. A human adviser, at least in theory, should ask follow-up questions. An AI tool will give an answer regardless. The taxpayer who receives a confident-sounding response without realising that the tool is working with incomplete information has not satisfied the requirements of informed reliance.

The third prong, actual good faith reliance, raises its own complications. For reliance to be genuine, it generally needs to be documented, specific, and traceable. The taxpayer needs to be able to show that they received particular advice, from a source they had reason to trust, that addressed the specific position taken on the return. An interaction with a general-purpose AI tool, even one dedicated to tax, does not produce the kind of record a court would recognise as the basis for a professional opinion.

The sophistication problem gets worse

The existing case law already applies a harder standard to sophisticated taxpayers. In Johnson v. Commissioner, the Tax Court held that a taxpayer’s five decades of experience in the real estate industry counted against him when he claimed not to have noticed significant errors on his own returns. The court’s position, essentially, was that a person of his experience should have spotted the problems — regardless of what his CPA had done.

That logic applies with particular force to AI reliance. The taxpayers most likely to use AI tools for complex financial research are not the unsophisticated, tax-unaware individuals who tend to receive the most sympathy from courts. They are high-income, financially literate individuals managing complex affairs across multiple jurisdictions, exactly the profile to which courts apply the most demanding standard of diligence. The argument that a sophisticated investor with an international portfolio relied on an AI chatbot for their Form 5471 positions and had no reason to question the output is not one that is likely to land well.

Where the stakes are highest

For taxpayers with international structures, the clients most likely to be considering AI-assisted preparation of the more demanding information returns, the penalty exposure is particularly severe. A missed or incorrect Form 5471 carries a $10,000 penalty per form per year. An untimely Form 3520 can trigger a penalty equal to 35% of the gross value of the reportable transaction. Where undisclosed foreign financial assets are involved, the accuracy-related penalty doubles from 20% to 40% under IRC §6662(j). These are not rounding errors. At this level of exposure, the question of whether a reasonable cause defence is available is one with real financial consequences.
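
To see how quickly these figures add up, the arithmetic can be sketched in Python. This is a rough illustration only: the taxpayer and all dollar amounts in the example are hypothetical, the function name is invented for the purpose, and the sketch uses only the rates quoted above, ignoring interest and any other rules that might raise or lower the amounts.

```python
FORM_5471_PENALTY = 10_000       # dollars per form per year, as quoted above
FORM_3520_RATE_PCT = 35          # percent of the gross value of the reportable transaction
ACCURACY_RATE_PCT = 20           # standard accuracy-related penalty
ACCURACY_RATE_FOREIGN_PCT = 40   # where undisclosed foreign financial assets are involved


def estimate_exposure(forms_5471: int, years: int, form_3520_gross_value: int,
                      underpayment: int, undisclosed_foreign_assets: bool) -> dict[str, int]:
    """Rough penalty exposure in whole dollars, using only the figures quoted above."""
    rate = ACCURACY_RATE_FOREIGN_PCT if undisclosed_foreign_assets else ACCURACY_RATE_PCT
    exposure = {
        "form_5471": forms_5471 * years * FORM_5471_PENALTY,
        "form_3520": form_3520_gross_value * FORM_3520_RATE_PCT // 100,
        "accuracy_related": underpayment * rate // 100,
    }
    exposure["total"] = sum(exposure.values())
    return exposure


# Hypothetical taxpayer: 2 Forms 5471 missed for 3 years, one late Form 3520 on a
# $200,000 transaction, and a $50,000 underpayment tied to undisclosed foreign assets.
print(estimate_exposure(2, 3, 200_000, 50_000, True))
# {'form_5471': 60000, 'form_3520': 70000, 'accuracy_related': 20000, 'total': 150000}
```

On these assumed facts the penalties alone reach $150,000, three times the underlying underpayment.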

International tax is also the area where AI tools are most likely to be wrong in ways that are hardest to detect. The IRS has been issuing Practice Units on international topics at a high rate, with dozens in 2025 alone covering areas such as BEAT, foreign tax credit computations and limitations, and treaty-based positions. These reflect the IRS’s current interpretive positions on evolving law. An AI model trained on data from even twelve months ago may not reflect any of this. The model may not know what it does not know, and nothing about a confidently worded AI response signals that it is based on outdated or incomplete information.

The IRS is getting smarter at the same time

There is a further dimension that makes this more than a theoretical concern. At the same time as taxpayers are adopting AI tools, the IRS is deploying its own. Its Large Partnership Compliance model uses machine learning to screen complex returns for misreporting risk. Its audit selection models are becoming more sophisticated across multiple taxpayer segments. The agency has been transparent about its use of AI to detect patterns associated with incorrect reporting.

The practical implication is that the IRS’s capacity to identify errors, including the kinds of errors that AI-assisted preparation is most likely to produce, is growing. The tax gap the IRS is trying to close runs into hundreds of billions of dollars annually, and the agency is under sustained pressure to deploy its resources more effectively. AI-generated errors on international information returns, filed by high-income taxpayers, are precisely the category the IRS’s enhanced analytical tools are designed to find.

What a defensible position actually looks like

The existing reasonable cause framework, applied to AI, points toward a clear enough conclusion about what defensible practice looks like, even if the case law has not yet fully addressed the question directly.

AI can legitimately assist with tax preparation work. It can help normalise financial statements prepared under different accounting standards, draft work paper narratives, flag potential issues, and accelerate research. None of that is inherently problematic. The question is who reviews the output and what they bring to that review.

If a qualified specialist, someone who genuinely has the expertise the Neonatology test requires, reviews AI-assisted work product, applies professional judgment to it, identifies and corrects errors, and takes ownership of the positions filed, then the “competent professional with sufficient expertise” element of the test may be satisfied. The AI is functioning as a tool in the hands of a professional, in the same way that tax software has always been. The professional is exercising genuine judgment.

The risk arises when AI output is treated as a conclusion rather than a starting point. When the research is not reviewed by someone who knows enough to identify its gaps. When the work papers are accepted because the system produced them, without a professional checking whether the underlying analysis is correct. When the form numbers are filed because the software generated them. In those situations, there is no competent professional in the picture, and the reasonable cause defence, which requires one, is not available.

The argument that will not work

When the first significant penalty cases involving AI-assisted tax preparation reach the courts, and they will, the argument that will not succeed is some version of: “The AI is very sophisticated and widely used, and I had no reason to doubt it.” That argument would not succeed with a widely used human preparer who got it wrong without having been given the relevant information. It will not succeed with an AI tool that was never in a position to know what it needed to know.

The law as it stands requires a competent professional, full disclosure, and genuine reliance. That requirement was designed for a world of human advisers, but it maps reasonably well onto the AI context too, because the underlying principle has not changed. The taxpayer is responsible for their return. Tools, however sophisticated, do not change that. What changes it is the presence of a qualified human who genuinely owns the analysis.

The IRS has heard every version of “it wasn’t really me” for decades. The AI version is new. The answer is likely to be the same.

This article is provided for general informational purposes and does not constitute legal or tax advice. Taxpayers with specific compliance concerns should consult a qualified tax professional.
