Atlassian enables default data collection to train AI
427 points
9 hours ago
| 35 comments
| letsdatascience.com
| HN
martinald
7 hours ago
[-]
Atlassian just goes from misstep to misstep. I still use their products quite often. The amount of P0 bugs I experience is absolutely crazy:

- Bitbucket workers are hopelessly out of date (self hosted). We've had to put so many random workarounds in especially for Docker, as they don't keep them up to date enough

- I have had a bug in JIRA for years where I can't reorder a new ticket unless I refresh the page

- Every new feature they introduce into JIRA/Bitbucket over the past couple of years just doesn't work.

- I tried their AI stuff on the free trial, didn't work at all, tried to cancel, can't cancel the free trial online and had to write a load of support tickets (of which the support ticket contact form bugged out multiple times).

Anyone have any insight into why things have got so so dysfunctional? Tech debt? Talent leaving? Both? Even 'bad' enterprise software tends to be able to keep the most basic features running, but Atlassian is a whole new category. If you check their 'community' it is just hundreds/thousands of bugs with workarounds.

reply
rurp
6 hours ago
[-]
> I tried their AI stuff on the free trial, didn't work at all, tried to cancel, can't cancel the free trial online and had to write a load of support tickets (of which the support ticket contact form bugged out multiple times).

Absolutely insane that this is legal. The only reason to do this is to trick and abuse customers. It would be trivially easy to legislate away if our government cared to.

Atlassian seems like a typical entrenched big company, albeit an extreme example. They make money by selling to the bosses of their users and being the default name brand for many cases. Once a company gets to a certain size and doesn't directly compete much on quality internal corruption and incompetence can run rampant.

reply
HoldOnAMinute
4 hours ago
[-]
>> internal corruption and incompetence can run rampant

This affliction happens to almost every company, eventually. Nobody seems to have solved this.

reply
hungryhobbit
1 hour ago
[-]
Enshittification
reply
1minusp
4 hours ago
[-]
I generally agree with this comment, but what option does a decision maker have here? (apart from similar products that probably will end up doing the same things anyway). Are there equivalent scale/functionality products that can truly serve as an option?
reply
colechristensen
6 hours ago
[-]
It's explicitly not legal in California and some other places.
reply
pintxo
5 hours ago
[-]
Also for business customers? I would expect such regulations to only apply to b2c contexts.
reply
mhitza
7 hours ago
[-]
Featuritis. Just keep pumping out features with no thought. Today, probably also AI-coded.

Even in mid-sized projects, if you keep pushing for only new features you'll get a similar system. At least that's my experience from 3 or so mid-sized projects that I've worked on, where nothing mattered other than checking off features from a huge backlog.

reply
wsatb
7 hours ago
[-]
The search function in Jira has always been unusable. It’s perhaps the worst part of the entire platform, but nice to see they’re still focused on adding features I will never use.
reply
saganus
7 hours ago
[-]
I've always thought I was the only one experiencing this and felt like I was crazy.

I guess it's "good" to know that I'm not alone.

The number of times I've searched for a ticket that I know is there (because I either have it open in a different tab, or because I just created it) but can't find is just way too many.

reply
wsatb
5 hours ago
[-]
The results usually seem completely random to me. It's like the feature never made it out of proof of concept territory. The only advantage of all the email noise Jira sends out is that I can usually search my email for what I'm looking for.
reply
jsk2600
4 hours ago
[-]
I used JIRA back in 2009, and that is exactly what we did to work around the shitty search function in JIRA.
reply
brobdingnagians
4 hours ago
[-]
YouTrack's search is one of the main reasons I use it. Nice query language to filter down on any fields, including custom fields, never had an issue finding things. It's great. With the number of useless search functions in so many products, I'm happy that at least my issue tracking does it right.
reply
pydry
5 hours ago
[-]
ironically it's the one place where an agent might be of some use and they created one and it's terrible.
reply
siva7
4 hours ago
[-]
at least they didn't break their pattern of disappointing users. consistency is key.
reply
zelphirkalt
37 minutes ago
[-]
You can add to this list: Every single input field they have in Confluence and Jira is misbehaving or broken. Apparently, we can't just have a text input widget that works well. Also apparently, this billion dollar enterprise cannot afford to write or use a proper markdown parser, and apparently we, the user lowlives, cannot be trusted with the full "power" of basic markdown laugh.
reply
ravenstine
6 hours ago
[-]
Jira is buggy as hell these days. Lots of desyncing that forces me to refresh the page. I can have a ticket open on a sprint board and the modal spontaneously closes after a while, forcing me to reopen it frequently. The other week there were tickets that simply refused to show up in their respective sprint board no matter what I did; later the epic magically appeared on the board out of nowhere, then finally the individual tickets themselves reappeared.

Gotta love the value that vibe coding has added to this world.

reply
mrweasel
2 hours ago
[-]
Atlassian also shut down their self-hosted offerings. I'm not sure which version they were on with their datacenter edition when that got cancelled. Part of it might also simply be a lax approach to QA, now that they don't have to support thousands of installations in on-prem environments. When you can just push out an update, your QA has to be much much better.
reply
myself248
5 hours ago
[-]
I'm sure Atlassian's shareholders appreciate your sacrifice.
reply
spike021
2 hours ago
[-]
Jira has vim-like bindings for navigating tickets on boards and years later the feature barely works. It has bugs like pressing the j/k keys changes the URL and random fields but stays on the same ticket or doesn't render the newly navigated-to ticket, etc.
reply
rdedev
2 hours ago
[-]
Here's mine: you cannot link an existing branch to a Jira issue. Maybe this is easier said than done, but I can't find their reasoning anywhere.
reply
xixixao
4 hours ago
[-]
Confluence is ok and has improved recently.

Jira is garbage (frontend, backend). Tough but true.

reply
pojzon
1 hour ago
[-]
Sounds like every other SaaS company that was bought by an investment fund to milk it dry until everyone migrates off of it.

Not surprised. Quote „…with significant institutional ownership from Vanguard, BlackRock, and others”.

reply
ezoe
7 hours ago
[-]
Umm? Is there a single step Atlassian got right? It's a cancer of software development that the suits force us to swallow, while real development and useful documents live outside of their service because it's so stressful to use.
reply
taocoyote
3 hours ago
[-]
Stack ranking
reply
kevcampb
9 hours ago
[-]
I really wish I could find a better source to link to for this. By default, all free and paid customers are being opted-in to their data being used for AI training.

All your Confluence pages, Jira tickets, etc.

https://support.atlassian.com/security-and-access-policies/d... describes how to disable this, but it also appears that the setting to disable this doesn't exist (it's not visible on any of our instances).

reply
pryanbeng
5 hours ago
[-]
They said the opt out features will be rolled out to the Admin portal in May.

I got this info from an email they sent out

>To give you control over this change, we're introducing new in‑app settings that allow you to manage in‑app data contribution. Initially, these settings will apply to data in Jira, Confluence, and Jira Service Management, including data in your Atlassian Platform apps (Rovo, Home, Teams, Projects, Assets, Goals, Analytics, and Administration). We'll notify you when settings become available for additional apps you own, so you can review them in Atlassian Administration. Between today and May 19, 2026, we'll gradually roll out these settings in Atlassian Administration. We'll send you another email on May 19th as a reminder, so you have time to review and make any adjustments before August 17, 2026.

reply
carld
5 hours ago
[-]
I also do not see the setting to opt out. I'm at Atlassian Administration > Security, and I do not see Data contribution. I've looked at other, multiple setting pages and I do not see it.

So, is this an automatic opt-in without the ability to opt-out?

reply
somewhatgoated
5 hours ago
[-]
Opt out features will be introduced at a later time
reply
freakynit
3 hours ago
[-]
"In-app data covers user-generated content: page titles and bodies in Confluence, Jira issue titles, descriptions, comments, custom emoji names, custom status names, and workflow names" ... damn!!
reply
m4rtink
5 hours ago
[-]
What about really sensitive stuff, like private tickets that contain all kinds of things such as customer data, embargoed CVE fixes, or even sensitive health-related data? Are they just going to cobble all of that into a model so it can leak out to random people?
reply
rdedev
1 hour ago
[-]
There is a bunch of manufacturing related investigation reports written up in jira tickets or confluence pages at the pharma company I work for.
reply
kepano
6 hours ago
[-]
This seems to be the official description of the changes:

https://www.atlassian.com/trust/ai/data-contribution/faqs

reply
bradleyankrom
7 hours ago
[-]
reply
kevcampb
7 hours ago
[-]
Unfortunately that one has a subheading of "From August 17, the outfit will collect customer metadata by default unless you pay for the top tier"

It's not just metadata, it's all "in-app data"

reply
MagicMoonlight
5 hours ago
[-]
That's insane. Every single one of those things is highly sensitive and confidential information. How could you ever trust them after this? That information is priceless for shorting your company on the stock market.

Not that they'd ever do that of course. Nobody with highly sensitive information about rival companies would ever do that.

reply
tgv
6 hours ago
[-]
reply
itomato
5 hours ago
[-]
Opt-out at the Org level.

To get value out of Rovo, it needs detail. Your over-subscribed Jira power user/admin can't effectively make it happen. No guarantees Atlassian (Rovo itself) can make it happen either, but the patterns are going to develop and evolve closer and closer to the Agents that make the features.

They have a peculiar definition of Metadata, however. It's a proprietary data product derived from user content. It's a bit shit the way they sell it as metadata. It's a derivation. It's a product of Content, so it's Content - privacy safeguards cannot begin to cover the variation.

"Metadata includes two data types referred to as content attributes and common patterns.

Content attributes are statistical characteristics, numeric fields, and derivatives of your in-app data. Examples of content attributes may include the number of story points assigned to a Jira work item or the complexity of a Confluence page. Common patterns are phrases, keywords, and topics we extract from search queries and results, Rovo Chat (conversations, prompts, and responses), and custom configuration data that are frequently seen across many customers, while omitting rare data that may be unique to your organization. Examples of common patterns may include common words, phrases, or Rovo Chat prompt topics that are frequently used by customers, such as “vacation policy” or “recap team activity.”"

reply
Nathanba
6 hours ago
[-]
reply
kevcampb
6 hours ago
[-]
"Your available data contribution settings will be available no later than May 19, 2026."

So let me guess, they're hoping that we forget about this by then, so that they can scoop up our data? I can't think any other reason for it.

reply
atomic128
4 hours ago
[-]
Rumors that Anthropic is in talks to buy Atlassian, presumably for the training data. Data poisoning efforts are underway: https://www.reddit.com/r/PoisonFountain/comments/1sqrq24/atl...
reply
mrweasel
2 hours ago
[-]
I know at least two companies that won't be able to use Atlassian products anymore if that's the case. They really don't give a shit about privacy and regulatory requirements.
reply
svilen_dobrev
39 minutes ago
[-]
github etc hold source code -> scraped -> so AI may generate any of that.

And the specs become the new source (code).

fast forward..

Atlassian etc hold source specs -> scraped -> so AI may generate any of that.. then any of above..

the new source would be (?what? company missions? get-rich-quick-schemes?)

fast forward..

reply
zurfer
28 minutes ago
[-]
Hmm, if the stock keeps falling that might really happen.
reply
Bnjoroge
6 hours ago
[-]
Plenty of other companies enable this by default too, such as GitHub, Figma, Adobe, Vercel. I think it's fair to assume that if you have data stored within any company, they'll by default use it for training.
reply
tombert
5 hours ago
[-]
Maybe this will become The Year of the Self Hosted.

For stuff that I don't particularly care about privacy I've kept on the cloud (e.g. my blog, which is public anyway and as such is probably training bots regardless), but for stuff that I don't want to be used to train their models and/or sell to advertisers I have moved to be self hosted on my own network.

reply
pydry
5 hours ago
[-]
self hosting needs to be easier to set up for that to happen.

we're not far off it being good enough but it's not there yet.

reply
jsk2600
5 hours ago
[-]
Atlassian made self-hosting 'less easier' on purpose. They even discontinued their on-prem products.
reply
dreknows
6 hours ago
[-]
The opt-out-by-default pattern has been gradually normalizing in enterprise SaaS, but what makes this particularly egregious is the combination of two things: the data scope (not just metadata, but all in-app content per kevcampb's link) and the broken opt-out (the disabling setting not rendering on any instance).

One is a policy decision you can argue about. Both together suggest the friction is intentional.

The data residency point is worth flagging separately - a lot of enterprise buyers treat region-pinning as a privacy guarantee for everything in their contract. It was never that. Residency tells you where data is stored at rest, not who can access it for what purpose.

reply
tgv
6 hours ago
[-]
What makes this extra scummy is this:

“If customers were to right now terminate their contract, the new data contribution settings will not apply to them as these will not be enforced until August 17, 2026,” (from https://www.theregister.com/2026/04/18/atlassians_new_data_c...)

So you can't even take a bit of time to consider your options.

reply
huwsername
7 hours ago
[-]
If the rumours of an Anthropic acquisition are true, this makes a lot of sense. Anthropic are probably looking for a clean, high-signal dataset of metadata around business tasks that they can buy.
reply
m4rtink
5 hours ago
[-]
I'm thinking it would be ideal if Broadcom buys Atlassian instead and pulls another VMware. Problem solved - forever. ;-)
reply
siva7
4 hours ago
[-]
Oh what the.. i can't pay for a 2000$ max sub :/
reply
mrweasel
2 hours ago
[-]
I know of a company that's stuck on the datacenter edition, because they aren't allowed by some customers to store their data in the cloud. I can't imagine how much they must pay for that.

Until they finish evaluating competitors, and eventually migrate to .... something, they are completely stuck. Jira is at the heart of all of their workflows and they cannot and will not move to cloud. This was an Atlassian partner, but they got screwed over on that part as well.

reply
ezoe
7 hours ago
[-]
I doubt the data in Atlassian is anywhere close to clean or organic. It was designed in hell to force-feed shit to real programmers who do their real work outside of Atlassian.
reply
jerjerjer
6 hours ago
[-]
Programmer adjacent data can already be consumed from git repos. Atlassian has PM data.
reply
redwood
3 hours ago
[-]
Sounds very questionable, like boiler room is trying to do a pump n dump. I would not believe these rumors until we hear reputable sources outside of forum speculation
reply
jerhewet
6 hours ago
[-]
Will Atlassian be harvesting code and content from private Bitbucket repositories? The wording in their policies and FAQs is vague, so I'd like to get a definitive (Yes / No) answer.
reply
zelphirkalt
14 minutes ago
[-]
If it is vague, then that probably is a very clear answer to your question.
reply
maxloh
5 hours ago
[-]
The adage was "If you're not paying for the product, you are the product." Now enterprises are paying to become the product. That's ridiculous.
reply
CobrastanJorji
2 hours ago
[-]
Microsoft, Amazon, Google, everybody else with both having-business-customers and also data-collecting businesses: "We swear that we absolutely will not collect/train our stuff on business customer data."

Atlassian: "Yolo!"

reply
zelphirkalt
28 minutes ago
[-]
They are lowering the threshold for this kind of shit for everyone else. We should kill it with fire, before this spreads even further. But I guess most businesses led by non-technical people will simply not care and give their customer data to the AI sharks at no additional cost.
reply
ai-tamer
54 minutes ago
[-]
Genuine question: how many agent-hours to rebuild Jira from scratch and migrate 100% of the content out? Split the work, pool our agents, ship by August 17. ;-)
reply
firesteelrain
6 hours ago
[-]
No wonder they wanted to stop supporting the Data Center versions for on prem.
reply
microflash
6 hours ago
[-]
I read this as "Stop using this product" toggle every time a company does this without consent. It has done a good amount of mental and financial improvements to me.
reply
deferredgrant
3 hours ago
[-]
Does anyone know what falls under "other cloud products" mentioned here?

Would that include something like Trello?

reply
reeseparker63
7 hours ago
[-]
Worth noting that Atlassian's data residency options don't exempt you from this—your data can still be used for training even if you've pinned it to a specific region.
reply
wingmanjd
4 hours ago
[-]
I made this a while back to move us off our on-prem Atlassian to Gitlab [1]. Maybe it'll help someone if they want something similar. Fair warning: I haven't tried this recently, so YMMV.

[1] https://gitlab.com/jeremygonyea/jira-to-gitlab-migration-too...

reply
kepano
7 hours ago
[-]
The official Atlassian FAQ on this change:

https://www.atlassian.com/trust/ai/data-contribution/faqs

reply
willis936
6 hours ago
[-]
Presumably the government and HIPAA carveouts are for legal obligations. Trade secret theft is illegal so I wonder why they're not considering this.
reply
danny_codes
4 hours ago
[-]
Maybe if you put your data in Atlassian then you failed to adequately protect your trade secret? IIRC you need to make a reasonable effort to protect the secret.
reply
willis936
1 hour ago
[-]
Establishing MNDAs is considered reasonable effort and this is a policy update that basically says "we are ignoring all MNDAs".
reply
dylan604
4 hours ago
[-]
Because nobody will prosecute them for violations
reply
everdrive
1 hour ago
[-]
Who wouldn't these days. Just assume if a company has your data it's training AI on it. No company cares about your privacy more than they do their profits. Not a one.
reply
yalok
5 hours ago
[-]
Does this include repos content in BitBucket?
reply
qsera
6 hours ago
[-]
I am wondering, why not just rsyncrypto the source code before pushing to the repo?

>rsyncrypto is a utility that encrypts a file (or a directory structure) in a way that ensures that local changes to the plain text file will result in local changes to the cipher text file. This, in turn, ensures that doing rsync to synchronize the encrypted files to another machine will have only a small impact on rsync's wire efficiency.

https://manpages.ubuntu.com/manpages/focal/man1/rsyncrypto.1...

reply
RomanPushkin
4 hours ago
[-]
They're so desperate because their stock went down ~10 times in last 5 years or so
reply
fred_is_fred
4 hours ago
[-]
To anyone using a model trained on my company's Jira tickets, I apologize for the regression.
reply
nyellin
2 hours ago
[-]
Why does Atlassian need to train AI models?
reply
odie5533
1 hour ago
[-]
Rumor is they're being bought by Anthropic.
reply
an0malous
6 hours ago
[-]
We need to kill SaaS. Apps should be local-first and have peer-to-peer data sync. These companies won't stop until they use your data to replace you and enrich their owners.
reply
rogerthis
6 hours ago
[-]
Beautiful on paper. But it does not scale outside a certain type of tech people.
reply
an0malous
5 hours ago
[-]
What’s the scaling bottleneck? If you made a local-first, P2P version of Figma what would break first? For a company of like 50 people, I doubt you’d have more than 100GB of data so it should fit on everyone’s computers. The P2P syncing part seems solvable, even if you need a centralized handshake server somewhere. And from the user perspective I don’t see why the UX couldn’t be identical, so it’s all the same to them.

It seems like the real bottleneck is something else.

reply
moring
3 hours ago
[-]
> If you made a local-first, P2P version of Figma what would break first?

The guy who has to keep it running day by day, next to the other 30 local-first systems.

reply
an0malous
3 hours ago
[-]
What is there to run? There are millions of apps that don’t require maintenance, this was the default before SaaS.
reply
rsynnott
6 hours ago
[-]
Imagine an AI based on jira tickets. _That's_ the torment nexus.
reply
pkilgore
6 hours ago
[-]
Does this apply to Loom?
reply
itomato
5 hours ago
[-]
Loom isn't mentioned in the Partner materials I have read. That's about all I can say.
reply
zelphirkalt
1 hour ago
[-]
Oh another piece of the abysmal tools stack that should bite the dust. Maybe I will still see a software job without terrible tooling in the EU.
reply
titzer
5 hours ago
[-]
AI contributing to rising natural stupidity.
reply
jason_s
5 hours ago
[-]
I'm really tired of JIRA, to the point where I have expressed it publicly: https://www.embeddedrelated.com/showarticle/1772.php
reply
rvz
5 hours ago
[-]
No surprise here. It's by design.
reply
shadowgovt
5 hours ago
[-]
The only silver lining I can see in this is that if they replace their existing tooling with AI integration, we might actually get search and confluence that works.

I've lost count of how many times I search for a keyword and get no relevant results, but the document I'm looking for, which contains the keyword, is in my automatic pop-up of recent documents visited.

reply
arjunthazhath
5 hours ago
[-]
Omg
reply
RobRivera
1 hour ago
[-]
Yet another opportunity to provide an alternative that keeps data private
reply
oliver236
7 hours ago
[-]
genius move.
reply
tqwhite
7 hours ago
[-]
I don't see it as a misstep at all. The purpose of Stack Overflow is to share expertise.

I am 100% supportive of it being used for training... AI, you, everyone.

reply
UqWBcuFx6NV4r
7 hours ago
[-]
Dude, what?
reply
malfist
7 hours ago
[-]
What? Atlassian is not stack overflow.
reply