They find high accuracy in detecting many conditions: diabetes (83%), heart failure (90%), sleep apnea (85%), etc.
https://en.m.wikipedia.org/wiki/Receiver_operating_character...
So, 83% is actually not that great, given that you can achieve 50% by guessing randomly.
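To make the 50% baseline concrete, here is a minimal sketch, assuming the quoted figures are ROC AUC values (as the linked article suggests). The labels and scores are synthetic, and `roc_auc` is a hand-rolled helper for illustration, not anything from the paper.

```python
import random

def roc_auc(labels, scores):
    """Probability that a random positive outscores a random negative (ties count half)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))

random.seed(0)
labels = [1] * 500 + [0] * 500  # synthetic ground truth, not real patient data
random_scores = [random.random() for _ in labels]  # a "classifier" that guesses

auc_random = roc_auc(labels, random_scores)
print(f"AUC of random guessing: {auc_random:.3f}")  # lands close to 0.5
```

So on this scale the useful range runs from 0.5 to 1.0, not from 0 to 1.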
Had you merely called it an early instance of pretraining, I'd be fine with it.
https://stats.stackexchange.com/questions/185507/what-happen...
In fact, even when the wearable foundation model was better, it was only marginally better.
I was expecting much more dramatic improvements with such rich data available.
Sometimes you just have to use ultrasound or MRI or stick a camera in the body, because everything else might as well be reading tea leaves, and people generally demand very high accuracy when it comes to their health.
You can export your own Apple XML data for personal use and processing, but if you want to build an application that requests Apple XML data from users, that likely crosses into clinical research territory, with data security policy requirements and de-identification needs.
- aidlab.com/datasets
- physionet.org
I don't even trust Apple themselves, who will sell your health data to any insurance company any minute now.
The reality is that no matter how ethical the company you trust with that data is, you're still one hack or pissed off employee away from having that data leaked, and all of that data is freely up for grabs to the state (whose 3 letter agencies are likely collecting it wholesale) and open to subpoena in a lawsuit.
I have about 3-3.5 years' worth of Apple Health + Fitness data (via my Apple Watch) encompassing daily walks / workouts / runs / HIIT / weight + BMI / etc. I started collecting this religiously during the pandemic.
The exported Fitness data is ~3.5GB
I'm looking to do some longitudinal analysis - for my own purposes first, to see how certain indicators have evolved.
Has anyone done something similar? Perhaps in R, Python? Would love to do some tinkering. Any pointers appreciated!
Thanks!!
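One possible starting point in Python, as a rough sketch: it assumes the export is the usual `export.xml` with `<Record>` elements carrying `type`, `startDate` and `value` attributes, which is how these exports are commonly laid out but worth checking against your own file. `daily_means` is a made-up helper name. It streams the file with `iterparse`, which avoids building the whole tree for a multi-GB export, and nothing leaves your machine.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

def daily_means(source, record_type):
    """Stream a health export and average one record type per calendar day.

    `source` is a path or file object. The attribute names (type, startDate,
    value) are assumptions about the export layout.
    """
    sums = defaultdict(float)
    counts = defaultdict(int)
    for _, elem in ET.iterparse(source, events=("end",)):
        if elem.tag == "Record" and elem.get("type") == record_type:
            try:
                value = float(elem.get("value"))
            except (TypeError, ValueError):
                value = None  # skip non-numeric (category) records
            if value is not None:
                day = elem.get("startDate", "")[:10]  # "YYYY-MM-DD" prefix
                sums[day] += value
                counts[day] += 1
        elem.clear()  # drop this element's attributes and children once read
    return {day: sums[day] / counts[day] for day in sorted(sums)}
```

Something like `daily_means("export.xml", "HKQuantityTypeIdentifierHeartRate")` would then give a date-to-mean mapping you can hand to pandas or plot directly; the type string is again an assumption to verify in your export.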
Bonus: when you’re done, you’ll have an app you can sell.
My sentiments, exactly.
Though I'm looking to scratch my own itch for now...
I am curious to do my own analysis, for two main reasons:
- some data is confidential (I'd hate for it to leave my devices)
- wanna DIY / learn / iterate
Will ping you in any case. Thanks
For example, Apple Watch VO2Max (cardio fitness) is based on a deep neural network published in 2023: https://www.empirical.health/blog/how-apple-watch-cardio-fit...
Apple and Columbia did recently collaborate on a heart rate response model -- one which can be downloaded and trialed -- but that was not related to the development of their VO2Max calculations.
Apple is very secretive about how they calculate VO2Max, but it is likely a pretty simple calculation (e.g. how strongly your heart responds relative to the level of activity inferred from your motion, type of exercise and movements). The most detail they provide is in https://www.apple.com/healthcare/docs/site/Using_Apple_Watch..., which is mostly a validation that it provides decent enough accuracy.
FWIW, the article above links directly to both the paper and a GitHub repo with PyTorch code.
Neat, though the paper and the GitHub repo have nothing to do with Apple's VO2Max estimations. It's related to health, and touches on VO2Max and health sensors, but the only source claiming any association at all is that Empirical site. And given that this research came out literally years after Apple added VO2Max estimates to their health metrics, it seems pretty conclusive that it is not the source of Apple's calculations. It is neat research on predicting heart rate response to activity (which might come into play for filling in the measurement gaps that happen during activity when a device isn't tight enough, etc).
>What’s your source on Apple not using the neural network for VO2Max estimation?
You're asking me to prove a negative. Apple never claims that they do any complex math or deep neural networks to derive VO2Max, and from my own observations of its estimates of mine, it seems remarkably trivial.
Trivial can still be accurate. But it hardly seems complex. Like, guess people's A1c based upon age, body fat percentage and demographics, and you'll likely be high-90s accurate with trivial algebra.
>even for seemingly simple metrics like heart rate
Deriving heart rate from a green light imperfectly reflecting off skin, watching for tiny variations in colour change, is actually super complex! Doing it accurately is pretty difficult, which is why wearable accuracy is all over the place, though Apple is one of the leaders and has been for years. Guessing a number based upon HR and activity level isn't quite as complex.
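To give a feel for the idea, here is a toy sketch under heavy assumptions: a synthetic "reflected light" signal (a clean pulse wave plus slow drift plus noise), with the pulse rate recovered by scanning candidate frequencies for the strongest one. `estimate_bpm` and every parameter here are made up for illustration; real devices also have to cope with motion artifacts, skin tone, fit and much more, so this is nowhere near what a watch actually does.

```python
import math
import random

def estimate_bpm(samples, sample_rate_hz, min_bpm=42, max_bpm=210):
    """Return the candidate rate (in whole bpm) with the most spectral power."""
    mean = sum(samples) / len(samples)
    centered = [x - mean for x in samples]  # remove the constant light level
    best_bpm, best_power = None, -1.0
    for bpm in range(min_bpm, max_bpm + 1):
        omega = 2 * math.pi * (bpm / 60.0) / sample_rate_hz
        # Correlate the signal with a cosine and a sine at this frequency.
        re = sum(x * math.cos(omega * i) for i, x in enumerate(centered))
        im = sum(x * math.sin(omega * i) for i, x in enumerate(centered))
        power = re * re + im * im
        if power > best_power:
            best_bpm, best_power = bpm, power
    return best_bpm

random.seed(1)
rate = 50        # samples per second (assumed)
true_bpm = 72    # pulse baked into the synthetic signal
signal = [
    math.sin(2 * math.pi * (true_bpm / 60) * t / rate)   # pulse component
    + 0.5 * math.sin(2 * math.pi * 0.2 * t / rate)       # slow baseline drift
    + random.gauss(0, 0.3)                               # sensor noise
    for t in range(rate * 10)                            # 10 seconds of data
]
estimated = estimate_bpm(signal, rate)
print(f"true: {true_bpm} bpm, estimated: {estimated} bpm")
```

Even this clean case needs a long enough window and a sensible frequency band; the hard part in practice is that the real signal looks nothing like this tidy.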
I wonder now if all of the derived metrics on Garmin (Training readiness, Training load, Training status) are purely statistical algorithms.