Moral Machine – Navigating the Carpenter Trap: AI Data and Privacy Concerns


May 06, 2026

By Christopher D. Warren

The Moral Machine host Chris D. Warren interviews Morristown, New Jersey federal criminal defense attorney Ernesto Cerimele about the New York Times v. OpenAI lawsuit and Judge Wang’s May preservation order requiring OpenAI to retain all U.S. user-submitted materials. They argue the order’s scale (billions of users/chats) creates an unprecedented pool of private, potentially confidential AI chat logs, raising Fourth Amendment concerns under Carpenter and limits on the third-party doctrine for modern digital data. Cerimele discusses risks of general warrants, overbroad subpoenas, and law enforcement access to preserved logs revealing politics, health, relationships, and legal strategy. Proposed guardrails include search-warrant requirements, particularity (timeframes/keywords/topics), minimization akin to wiretap rules, filter teams for privilege, and stronger judicial oversight, citing New Jersey’s State v. Benoa suppressing an overbroad four-year phone search.

Watch or listen to the podcast here:

YouTube   

Transcript:

**This transcript has been prepared automatically by AI and may contain inaccuracies**

Chris D. Warren [00:00:00]:
Thank you for joining the Moral Machine. I’m joined by my guest, Ernesto Cerimele. He’s an attorney and founding partner of Clingaman Cerimele Attorneys in Morristown, New Jersey. Ernesto focuses his practice on federal criminal defense, in New Jersey and around the country, and he’s a frequent lecturer on new and novel issues affecting the defense bar and trial attorneys. So, Ernesto, we’re talking about the New York Times vs. OpenAI case, where OpenAI is being sued by the New York Times for training its model on the Times’ data. In May, Judge Wang, the judge in that case, entered a broad preservation order requiring OpenAI to retain all materials submitted going forward by every user in the United States of America. So let’s start talking about that case and how it affects individual rights under the Fourth Amendment, and let’s jump right into it.

Ernesto Cerimele [00:00:59]:
It’s a complicated idea, for sure. You know, preservation in civil cases is common, right? We know that. The issue here is the sheer magnitude of what is being preserved. You know, according to OpenAI and ChatGPT, they have 4 billion users a month. There are 2 billion chats going back and forth in a single day. And so the spirit behind the order makes all the sense in the world. But if the litigation goes on for another month, a year, five years, as some of these complex cases could, the preservation will essentially be one of the biggest pools of private, potentially confidential information that we’ve seen as a society.

Chris D. Warren [00:01:58]:
So what you’re trying to say is that an AI chat log preservation of this magnitude is essentially a seizure under the Fourth Amendment. There’s a case we had discussed earlier, Carpenter, where the Supreme Court’s reasoning suggests that the dossier created under this civil court order would most certainly be an unconstitutional end run around the Fourth Amendment. What are your thoughts on that?

Ernesto Cerimele [00:02:28]:
Well, that’s the million dollar question, right? Because on the one hand, the traditional third-party doctrine, the case law says that if you voluntarily share your information with a third party, you lose your expectation of privacy. And under that view, the government or the civil litigant would almost certainly take the position that once you voluntarily disclose information and logs to OpenAI, they can get access to that with just a subpoena. Now, there’s also another line of cases, Carpenter, Riley, which essentially warn against uncritically applying the third-party doctrine to modern digital data, recognizing that it’s just so different from evidence that we’ve seen in the past. And Chief Justice Roberts in that opinion called certain digital data a detailed, encyclopedic, and effortlessly compiled chronicle. In essence, the more revealing, the more comprehensive and involuntarily retained the data is, the more the Fourth Amendment will kick back in. And so AI logs and conversations, which can expose private thoughts, sensitive conversations, legal research, you know, requests for information about your health, look a lot more like what we’re talking about in Carpenter than the third-party doctrine. And that really should require a subpoena.

Ernesto Cerimele [00:04:11]:
Excuse me, a search warrant rather than a subpoena.

Chris D. Warren [00:04:15]:
So this, if AI logs are this Carpenter-grade material, this, these, these thoughts, these inner thoughts of, of, you know, as you’re saying, billions of conversations, what type of guardrails do you think should be put in place for the government in accessing essentially a database that they created through a civil court order?

Ernesto Cerimele [00:04:38]:
Yeah, the Fourth Amendment hates what are called general warrants, broad warrants that let the government rummage through everything. It’s a license to seize and take what they want. With AI logs, the risk is that the warrant could say, give us all conversations for this user for the past five years, which in essence would expose someone’s private life, politics, health, relationships, legal strategy, those types of things. So the court fortunately does have options. And you see this in the context of other search warrants: the court needs to be particular. The court needs to narrowly tailor the search warrant. It should specify timeframes, topics, keywords. So instead of saying that the government gets everything, it should be conversations between X and Y during March and April mentioning this specific event.

Ernesto Cerimele [00:05:47]:
So particularity is the single most important thing that can be done. The court could also borrow from federal wiretap statutes and order some type of minimization. So when law enforcement gets a wiretap and they start listening to the recordings of the conversations, they have an obligation to avoid and delete irrelevant material and privileged material, and maybe use a filter team to protect some of the privileged stuff. And lastly, to be candid, you need judicial oversight. To provide some context, search warrants are all done essentially ex parte, right? So a defense attorney isn’t seeing what the government is looking for before they get it. They apply to a magistrate or a superior court judge, and the judge in, you know, nine out of ten cases will rubber-stamp approve, including the scope of what the state or the government is looking for. And so with respect to this information, in the digital age, it’s incumbent on these magistrates to really police law enforcement and not rubber-stamp approve.

Ernesto Cerimele [00:07:16]:
This kind of brings me back to a case that came down about a month ago in the New Jersey Appellate Division. It’s spot on, to be candid. It’s State versus Benoa. It’s unpublished, but it was a case that was significant enough that the New Jersey Association of Criminal Defense Lawyers participated. And even though it’s unpublished, it’s worth having a conversation about. Just to kind of summarize without going into too much detail here, Chris, a woman was arrested for aggravated manslaughter, a very, very serious crime. She was alleged to have been under the influence of drugs at the time of the accident. And the state applied for a search warrant of her phone, which was granted.

What the state was looking to do, what its stated purpose was, was to review her information not for, you know, a day before the accident, an hour before the accident, a week before the accident, but for four years. And their hope was to find something in those four years that suggested she was a drug user, in order to support their case. And so last month, the Appellate Division reversed and suppressed the evidence. They basically said that you can’t do that. Warrants need to be narrowly tailored, particularly in the digital age. They can’t be overbroad. They need to be particular. And really, that’s what we’re talking about here with respect to ChatGPT user logs.

Chris D. Warren [00:08:58]:
So the order from the Southern District was pretty clear that it was the preservation of everything, all material from any user, for any reason, for any purpose. And that’s a pretty broad category of documents, everything applying to everyone. And these conversations aren’t happening between the end user and someone else; they’re essentially happening with themselves, right? And a piece of software. So how would you narrowly tailor a search in your practice? How would you fight a subpoena or a warrant that doesn’t have that type of particularity?

Ernesto Cerimele [00:09:42]:
Yeah, so it’s tricky, you know, and the tricky part is, we just don’t have any case law about it right now, right? We’re relying on cases that involve other pieces of technology, other types of data sets. And this is extraordinarily unique. In order for AI to work, it needs to breathe, it needs to evolve, it needs to hear the user, right? I mean, the user needs to be as specific as possible with respect to the issues that person is having, whether it’s, you know, planning a flight to Hawaii or getting some type of advice about a skin condition, right?

And those are extraordinarily private conversations. So, you know, as I said, law enforcement really should need to tailor warrants to specific times and specific subject matters. If they don’t, then what’s being created here is a pool of data. You know, some people who tried to intervene in that case between the New York Times and OpenAI referred to it as, you know, a large mass surveillance by the government. I wouldn’t go that far. But realistically, what it does is it creates one of the largest sets of personal and private data that’s ever existed.

Chris D. Warren [00:11:30]:
So let’s talk about the mass surveillance aspect of it. And yeah, that’s sort of an irreverent look at it. But is that not in effect what the government has done? Judge Wang, in footnote two of her order, indicates that this is definitely not a mass surveillance program, even though it’s the government creating this large Carpenter-esque dossier on every American that uses ChatGPT. And as for expectations of privacy: users are pressing the delete button, but this order prevents that information from being deleted. So their expectation of privacy still is that, hey, I deleted that log, or this was a personal conversation, or I wasn’t using the training feature on GPT, I pay for the pro subscription or whatever it is, so in their minds they’re going to believe that this is private information. How is this order, a civil court order that is affecting the private thoughts of everyone who uses this software, not mass surveillance?

Ernesto Cerimele [00:12:38]:
Yeah, I don’t think that the judge intended that, and she was very clear in her order that she didn’t intend some type of mass surveillance, and she disagreed with the potential interveners’ characterization of it. I think the judge had the right intent. The judge intended to preserve evidence to be able to help the plaintiff, because if the evidence was destroyed, that plaintiff might not be able to use certain pieces of evidence to further their case.

So the spirit of the order makes all the sense in the world. The unintended consequences are really what we’re talking about here, Chris, because, you know, Judge Wang is not in cahoots with the FBI, right? She’s not in cahoots with the state or law enforcement. We know that. But law enforcement is aware, or will quickly become aware, that this information exists, that this potential evidence exists. So if they have a potential defendant who’s committed a crime, why wouldn’t they reach out to OpenAI or ChatGPT and try to collect that person’s communications with the AI bot? They should and they would.

And as more law enforcement agencies become aware of this pool, I refer to it as a pool, you could refer to it as a dossier or something like that, but it’s a massive pool of information and evidence that the user contemplated would be destroyed. And so the more that law enforcement keys in on this issue, I would expect that they are going to try to access it in some way, shape, or form. The next question, and I don’t have a crystal ball, but it’s coming, is when does that happen, in what form does that happen, and ultimately what does a magistrate do at that point in time?

Chris D. Warren [00:14:50]:
Yeah, I mean, as a criminal defense attorney, I don’t envy your position in terms of defending against these warrants or these subpoenas under the different mechanisms the government’s going to try to use, like a third-party doctrine strategy. I mean, the exception here is that there’s no expectation of privacy when you’re sharing this information. I think there’s a whole line of cases on that exact issue.

But, you know, back to the underlying lawsuit. It has to do with the training of OpenAI’s LLM, ChatGPT, on material that belonged to the New York Times. So how would a forward-looking AI preservation order relate to information that was used to create the AI bot at that point in time? So, you know, thinking about it broadly, I don’t see a relationship between Wang’s order and the material that needs to be preserved for litigation in that particular case.

Ernesto Cerimele [00:15:57]:
When OpenAI actually opposed the order, they said that the actually relevant material in the information that would otherwise be deleted is something like 0.001%. It is so minuscule compared to what’s being preserved. So the order is certainly not narrowly tailored, and it certainly could have been.

Chris D. Warren [00:16:23]:
So how would your defense strategy change in light of this order? What would you do different, or how would you, without giving away the secret sauce, counsel your clients or prepare your defense knowing that this pool or dossier or this pre-surveillance of your client exists in some way?

Ernesto Cerimele [00:16:46]:
Yeah, you have to argue that this is a search and seizure under Carpenter requiring a search warrant. You know, the fear here is that because this information now exists, and law enforcement certainly knows that it does, could they tap into it with just a subpoena? And therein lies the issue, because a criminal subpoena is not like a civil subpoena, where litigants in a civil case are generally aware that another party is sending a subpoena. In a criminal case, subpoenas are usually sent long before someone is charged.

So for example, if someone is charged with bank fraud, a subpoena is issued to the bank long before the defendant is charged. So a defense attorney doesn’t typically have an opportunity to challenge the subpoena in real time. Ultimately, what you have to do once someone is charged is you have to move to suppress the evidence. If it was obtained only by way of subpoena and not by way of a search warrant, then you have a Carpenter-related argument.

You argue that the information is detailed, encyclopedic, effortlessly compiled, and just like emails or the cell-site records in Carpenter, a search warrant would be required.

Ernesto Cerimele [00:18:17]:
You know, the other thing you have to do is make the argument that the records here are the fruit of a constructive seizure. The government or a civil entity caused the company to hold data, to retain data that should have expired, that the users expected to be deleted. And so that makes the later subpoena or search warrant somewhat of a derivative constitutional violation. That’s something that hasn’t been decided yet, but it needs to be.

Chris D. Warren [00:18:54]:
So it’s going to be, it’s definitely going to be.

Ernesto Cerimele [00:18:57]:
It’s going to be.

Chris D. Warren [00:18:58]:
So it sounds like you’re going along my line of thinking here, that this sort of is a bit of a pre-surveillance or a mass surveillance, unintentional or not. That is what the danger is here. I don’t see a way that the government would be able to maintain these records without this order. It would be on the front page of every newspaper and on every single news station, for sure, if the government was keeping this type of record on every single American.

Ernesto Cerimele [00:19:34]:
That’s exactly right. If the government came out and said, yes, we’re recording your telephone calls, or we’re tracking your emails, or we’re tracking all of your ChatGPT conversations over the course of the past five years, you’d be concerned about it, right? And so, you know, this order, again, it wasn’t meant to create this massive pool of private information, but that’s the direct consequence.

Chris D. Warren [00:20:06]:
Ernesto, thank you so much for joining us today. Really appreciate your insight into this Carpenter problem, this Carpenter trap, relating to the requirement that OpenAI maintain and preserve all logs of all inputs from users. So thank you so much again for joining us. We really appreciate it.

Ernesto Cerimele [00:20:23]:
Thanks, Chris.

Views expressed on Moral Machine are the author’s own and do not reflect those of the New Jersey Supreme Court Attorney Ethics Committee (District VI) or Falcon Rappaport & Berkman LLP.