How are you doing this today? Or is it more a mix of lifecycle rules, cron jobs, and manual cleanup? I feel like this is a blocker in enterprise deals when selling to regulated industries.
I used to work on a data platform team and built a cleaning service that used tags and object-hierarchy trees to find and clean up old PII. Not an easy thing to do, since our data analytics bucket held over 7 PiB of data.
Overall the architecture was based on three components: detector, enforcer, cleaner. The detector sifted through the data lake to find PII datasets (LLM-based); the enforcer traced each dataset's ETL in our VCS to set the appropriate tags/metadata (a custom coding agent); finally, the cleaner used search to find and clean the data based on that metadata.
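To make the flow concrete, here is a minimal sketch of how the three stages could compose. Everything here is my own stand-in: the keyword heuristic fakes the LLM detector, and the enforcer just stamps a tag directly instead of running a coding agent against VCS history.

```python
from dataclasses import dataclass, field

@dataclass
class Dataset:
    path: str
    tags: set = field(default_factory=set)

# Stand-in for the LLM-based detector: flag datasets whose sampled
# column names look like PII.
PII_HINTS = {"email", "ssn", "phone", "name", "address"}

def detect(datasets, sample_columns):
    """Return datasets whose columns match a PII hint."""
    return [d for d in datasets
            if PII_HINTS & {c.lower() for c in sample_columns(d)}]

def enforce(flagged):
    """Stand-in for the enforcer: tag flagged datasets as PII."""
    for d in flagged:
        d.tags.add("pii")

def clean(datasets, retention_ok):
    """Cleaner: keep untagged datasets and tagged ones still in retention."""
    return [d for d in datasets
            if "pii" not in d.tags or retention_ok(d)]

# Hypothetical demo data
datasets = [Dataset("s3://pii-demo/users"), Dataset("s3://pii-demo/logs")]
columns = {"s3://pii-demo/users": ["email", "signup_ts"],
           "s3://pii-demo/logs": ["ts", "level"]}

flagged = detect(datasets, lambda d: columns[d.path])
enforce(flagged)
remaining = clean(datasets, retention_ok=lambda d: False)
```

In the real system each stage ran independently at scale, so the metadata written by the enforcer was the only contract between detection and cleanup.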
I've had some big enterprise deals fall through because of something like this - military, insurance, fintech, etc.