What happens to your most intimate data

The previous piece argued that a companion feels alive because it remembers you. This one asks the question that follows directly from it: what happens to everything it remembers? These apps hold some of the most intimate data a person produces — and the honest answer to where that data goes is, by the standards of any other consumer category, alarming.

What you actually hand over

Most software collects data as a side effect of doing something else. A companion is different: intimate disclosure is the activity. People talk to these apps far more, and far more openly, than they talk to almost any other product — sharing private thoughts, relationship details, information about friends and family, photos, and sexual content, often for hours a day. The thing that makes a companion valuable is precisely that it invites you to reveal yourself. So the privacy question here is not about a careless leak on the side; it is about what a company does with a deliberately assembled record of your inner life.

What the research found

The most thorough look came from the Mozilla Foundation, which in February 2024 reviewed eleven romantic AI chatbots and gave every single one its "Privacy Not Included" warning label, placing them among the worst product categories the organization had ever assessed. Ten of the eleven failed to meet basic security standards, such as requiring a strong password, and the researchers counted tens of thousands of trackers firing within a minute of using one app, shipping data to advertising and marketing companies.

The nuance matters, and it cuts both ways. For an app like Replika, Mozilla's read was that the content of intimate chats is probably not sold, but behavioral data is shared and possibly sold to advertisers, cookies are hard to opt out of, and there is no guarantee your data is deleted even after you delete your account. So the worst-case fear (your confessions auctioned to advertisers) is not clearly proven, but the baseline (a permanent, weakly secured record you cannot reliably erase) is bad enough on its own.

Does it train on you?

The question users ask most is whether their conversations are used to train the model, and the uncomfortable answer is usually "you can't be sure they aren't." As MIT Technology Review laid out, the default across these products is that users are opted in to data collection, and data already absorbed into a model's training is unlikely to ever be removed. Opt-out, where it exists, puts the burden on the user to understand a policy most people never read. There is real variation between apps — MIT Technology Review noted that one, Nomi, said it does not collect data for tracking — but the category-level default leans toward collection, not restraint.

It's a feature, not a bug

Here is the structural point that makes companion privacy genuinely hard rather than just badly executed. MIT Technology Review's framing is that the privacy risk is, in a sense, required: the product only works if it accumulates an intimate, persistent record of you, and that record is the liability. The very thing the last post praised (a memory that makes the companion feel like it knows you) is the same data store that makes these apps the most sensitive concentrations of personal data most people will ever create. You cannot have the feeling of being known without something, somewhere, holding the knowledge.

That is why "just be more careful" is not a real answer, and why the problem has started attracting regulators rather than only privacy advocates. The €5 million fine Italy levied against Replika's parent company was grounded substantially in data-protection and consent failures — an early signal that the intimate-data problem is being treated as a legal liability, not just a reputational one.

What responsible design would have to look like

If the risk is structural, then the only honest fixes are structural too — and they are worth naming, because they are the standard the category should be held to rather than a feature checklist. A companion that took this seriously would not train on your conversations by default; it would make deletion real and verifiable rather than a policy sentence; it would minimize what it collects instead of hoovering up behavioral data for advertisers; and it would give the user genuine control, including options that keep the most sensitive data off remote servers entirely. None of this is exotic — it is ordinary data-protection hygiene applied to an extraordinarily sensitive product. The fact that it is the exception rather than the norm is the story.

This also makes clear why privacy cannot be separated from the next question in the series. A product that holds this much intimate data, about people in vulnerable states, is not just a privacy problem — it is a duty-of-care problem. What responsible design looks like when users get genuinely attached, and what the new wave of regulation actually requires, is where this goes next.
