AI hallucinations aren’t just for high school students: It seems that federal judges are turning their jobs over to the chatbots — and suffering the problems that come with it.
In recent weeks two federal judges have had to withdraw rulings because of fabricated information that experts said likely came from faulty artificial intelligence queries.
The snafus have prompted tough new questions about judges’ work and what sort of accountability can be applied to officials with lifetime appointments and few outside checks on their authority.
U.S. District Judge Henry Wingate, a Reagan appointee in Mississippi, withdrew a July 20 opinion three days after he issued it. The opinion had blocked a state law barring Mississippi from sponsoring diversity, equity and inclusion and gender identity programs.
His opinion had referred to a nonexistent state law and nonexistent parties, and cited declarations that weren't actually part of the case. He expunged the original temporary restraining order and replaced it with a new one on the docket.
Meanwhile, U.S. District Judge Julien Xavier Neals, a Biden appointee in New Jersey, withdrew an opinion that misstated case outcomes and included fabricated quotes attributed to litigants and to prior court decisions.
While the judges haven’t publicly explained their errors, experts said they bear all the hallmarks of AI “hallucinations,” the name given to situations in which an AI query returns seemingly plausible but inaccurate results.
Those sorts of errors have popped up in the work of courtroom lawyers, but among judges they remain a rare and worrying occurrence.
Until the judges themselves explain what happened, attributing the errors to AI remains speculation.
Judge Neals’ office declined to comment and referred questions to Chief Judge Renee Marie Bumb. Her phone connects directly to the clerk’s office, which said it couldn’t comment or get a message to her.
“Chambers does not like to be contacted directly,” said the aide who answered the phone. “I don’t think I can help you.”
Judge Wingate’s office didn’t respond to an emailed inquiry, and has rebuffed other news outlets.
Experts said the judges must do more.
“AI-generated errors aren’t just simple mistakes that can be fixed on appeal — they represent a serious failure of basic judicial duties,” said Susan Tanner, a professor at the Brandeis School of Law. “When a judge cites fake cases, that’s not a minor oversight or a legal interpretation error. It’s a fundamental breach of competence.”
Ms. Tanner said the judges owe the public a full explanation of what went wrong and why, and should not be content merely to withdraw the erroneous work with little fanfare.
Christina M. Frohock, a professor of legal writing at the University of Miami School of Law, said it’s not yet clear what led to the errors in the two cases.
“Something happened, but we don’t know what,” Ms. Frohock said. “Most likely, when mistakes arise in court orders, lawyers will flag the mistakes, and the court will correct its order. Now that AI is everywhere, I hope these news reports serve as a blaring alarm for everyone: we all need to check the facts and the law, especially if relying on AI for assistance. Check everything.”
As for demanding accountability, litigants could file judicial complaints, which would go to higher courts to address. And federal judges could face impeachment, though that’s a stretch, suggested Amy J. Schmitz, a law professor at Ohio State University.
“Public pressure and media exposure often prove to be the most effective catalysts for accountability,” Ms. Schmitz said. “Again, it seems that public trust requires that there be consequences, and judicial conduct commissions may be a route to ensuring that judges follow ethical rules.”
In Mississippi, Attorney General Lynn Fitch has asked Judge Wingate to publicly explain what went wrong in his now-withdrawn temporary restraining order against the state’s anti-DEI law.
Ms. Fitch said the judge couldn’t re-cork the bottle merely by deleting the erroneous order. She included a copy of the bungled ruling in her own filing so it remains on the public docket for now.
Damien Charlotin, who tracks AI hallucination cases in the legal sphere, maintains a global database of nearly 250 cases. Four of those involve judicial rulings, including the two new ones last month; another came from a state court in Georgia, and the fourth from a case in India.
He said there are also some judicial rulings from Latin America that he hasn't been able to run to ground yet.
He said the New Jersey errors came to light after lawyers tried to cite the opinion as precedent in another case and realized the opinion was wrong.
“In terms of systemic damage, it is immensely worse when a judge does it,” Mr. Charlotin said.
He said the AI question gets at an underlying debate in legal circles over how much judges should delegate to their aides in researching and writing opinions. It also challenges the longstanding practice of some judges and lawyers who pepper their writings with lists of case citations copied and pasted from previous opinions.
“Hallucinations just bring these tensions to the fore,” Mr. Charlotin said. “Which is why accountability will require more than just ensuring that clerks and everyone involved in creating an opinion check their cites.”
“As long as the workload does not attenuate, the reflex to delegate (to clerks, to AIs, or a mix of the two) will remain, and thus (as long as the technology improves) the potential for error,” he said in an email to The Washington Times.