Ask HN: Experience using your user's Google Drive instead of a database?

50 points | 1 year ago | 24 comments

There are several apps that let you use their service without creating an account or storing any of your data with them. You sign in with Google, Dropbox, etc., and the app reads and writes everything to your storage provider.

Have any of you done this? What has your experience been like? Have you found a way to make this work with multi-player? What storage provider has the best API for this?

whalesalad

1 year ago

I've often felt we need an abstraction for just this. "Bring your own storage" so that you can sign up and provide a "bucket", then the service will read/write to that. I think the difficulty lies in all the discrepancies between different storage mechanisms and normalizing the SLA there. If Dropbox is down, and your user is on Dropbox, you'll need a way to communicate that their particular "storage" is offline. The trade-off is likely worth it: being in full control of your data and getting a data export "for free" matters to a lot of folks.

OpenDAL was on HN recently and would be a pretty decent abstraction to use for this: https://github.com/apache/incubator-opendal

mickael-kerjean

1 year ago

> I've often felt we need an abstraction for just this. "Bring your own storage"

I made exactly this: https://github.com/mickael-kerjean/filestash The public API enables programmatic access to S3, SFTP, FTP, Git, WebDAV, Samba, local FS, NFS, Backblaze, Storj, Artifactory, and more. Under the hood, each storage backend implements this interface:

  type IBackend interface {
    Ls(path string) ([]os.FileInfo, error)
    Cat(path string) (io.ReadCloser, error)
    Mkdir(path string) error
    Rm(path string) error
    Mv(from string, to string) error
    Save(path string, file io.Reader) error
    Touch(path string) error
  }

There are some hooks so you can define your own and do even funkier stuff, like a virtual FS which composes onto anything you want or need. There are a bunch of weird implementations, like the MySQL storage, which represents first-level folders as databases, second-level folders as tables, and files as the actual rows.
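The folder-to-database mapping described above can be sketched as a small path resolver. This is only an illustration of the idea, not Filestash's actual code; the `location` type and `resolve` function are hypothetical.

```go
package main

import (
	"fmt"
	"strings"
)

// location is a hypothetical result type: which database, table, and row
// a virtual path names. Empty fields mean "not specified by this path".
type location struct {
	Database, Table, Row string
}

// resolve maps "/db/table/row" onto a location. Shorter paths leave the
// trailing fields empty, so "/db" names a database and "/db/table" a table.
func resolve(path string) (location, error) {
	parts := strings.FieldsFunc(path, func(r rune) bool { return r == '/' })
	if len(parts) > 3 {
		return location{}, fmt.Errorf("path %q is deeper than database/table/row", path)
	}
	var loc location
	fields := []*string{&loc.Database, &loc.Table, &loc.Row}
	for i, p := range parts {
		*fields[i] = p
	}
	return loc, nil
}

func main() {
	for _, p := range []string{"/", "/shop", "/shop/orders", "/shop/orders/42"} {
		loc, err := resolve(p)
		fmt.Printf("%-16s -> %+v (err=%v)\n", p, loc, err)
	}
}
```

With a resolver like this, an `Ls` on `/shop` would list tables and a `Cat` on `/shop/orders/42` would read one row.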

Fun fact: it was born out of a reflection on the infamous top comment of the Dropbox launch on HN.

freedomben

1 year ago

Whoa, this is a solution I've looked for several times and can never find. Thanks!

networked

1 year ago

> I've often felt we need an abstraction for just this. "Bring your own storage" so that you can sign up and provide a "bucket", then the service will read/write to that.

The remoteStorage protocol is one: https://remotestorage.io/. Sadly, it seems remoteStorage never took off. The first version of the spec was published in 2011, and the reference implementation reached version 1.0 in 2017, but if you look at the list of known apps that support it (https://remotestorage.io/apps/), there are few active projects.

pradn

1 year ago

There are a bunch of S3-compatible bucket stores. I suppose one could ask the user for their S3 bucket key. It should be possible for an S3-like service to provide one-bucket-only keys with rate limits. The user knows where their data is and exactly when it's being read or written. And when done using the app, the user can even revoke access.
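As a sketch of what a "one-bucket-only" key could look like on AWS S3 itself, the Go program below prints an IAM-style policy document scoped to a single bucket. The bucket name is a placeholder, S3-compatible services differ in how much of this policy language they support, and rate limits are not something such a policy expresses.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// statement and policy mirror the JSON shape of an AWS IAM policy document.
type statement struct {
	Effect   string   `json:"Effect"`
	Action   []string `json:"Action"`
	Resource []string `json:"Resource"`
}

type policy struct {
	Version   string      `json:"Version"`
	Statement []statement `json:"Statement"`
}

// singleBucketPolicy allows listing one bucket and reading, writing, and
// deleting objects inside it, and nothing else.
func singleBucketPolicy(bucket string) policy {
	return policy{
		Version: "2012-10-17",
		Statement: []statement{
			{
				Effect:   "Allow",
				Action:   []string{"s3:ListBucket"},
				Resource: []string{"arn:aws:s3:::" + bucket},
			},
			{
				Effect:   "Allow",
				Action:   []string{"s3:GetObject", "s3:PutObject", "s3:DeleteObject"},
				Resource: []string{"arn:aws:s3:::" + bucket + "/*"},
			},
		},
	}
}

func main() {
	// "my-app-data" is a placeholder bucket name.
	out, err := json.MarshalIndent(singleBucketPolicy("my-app-data"), "", "  ")
	if err != nil {
		panic(err)
	}
	fmt.Println(string(out))
}
```

A key attached to a policy like this can be revoked by the user without touching anything else in their account.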

Of course, having users manage a storage bucket and its key is a bit tedious. Browsers could generate a per-app bucket and provide the bucket key to the app, all with user permission (like when a site asks for location permissions).

These buckets can be shared between apps too. As long as apps can read the same data formats, data portability is stellar.

It would be great to have an ecosystem of apps that all use compatible buckets/formats.

But this is unlikely to happen because this sort of transparency and data portability isn't currently in big demand from users or governments. Privacy activists have gotten a great number of wins over the past decade. We need to make progress on many more computing issues.

spencerflem

1 year ago

RemoteStorage (https://remotestorage.io/) seems to be trying to do this too.

I also really like the https://sandstorm.io approach, which goes a little further.

daveguy

1 year ago

https://rclone.org is a very effective storage shim. You'd still want to limit supported backends to provide a smooth UX. There are binaries for the major OSs (including BSDs, Plan9, and Solaris). They have Android builds but they are not first class (yet). Unfortunately I do not see any support for iOS.

whalesalad

1 year ago

Good for sync but not at all the ask here

gwbas1c

1 year ago

I was a lead at a major competitor of Google Drive for a while.

Yes, using a cloud storage provider might work instead of running your own database...

> Have you found a way to make this work with multi-player?

... But this is where you will run into serious problems.

"Cloud drives" aren't designed for highly concurrent multi-user access. That is when you need an ACID-compliant database (or otherwise understand the issue well enough that you know what you're doing).

In our case, our API was based around light usage of files. If two people edited the same file at the same time, we couldn't resolve the conflict automatically, because these were just files: we had no knowledge of the files' schemas.

Likewise, our APIs weren't designed for "live" high-performance usage. In our case, if it took under two minutes for an update to sync, we were happy. That's an eternity in a multi-player game.

In general, if your application is going to use a cloud drive as its data storage, you should target light storage needs, i.e., the user's data is generally private, the user is generally making light changes, and multi-user usage doesn't require heavy "live" data.

For example, this would work great for a password manager that allows sharing passwords via the cloud drive's existing sharing system. It would also work great for an online image editor, sheet music editor, or anything where you could envision a desktop app that edits files.

wanderingmind

1 year ago

I'm not sure if the diagrams.net (drawio) people hang out here, but I use their web tool to create flowcharts and architecture diagrams that are then stored in Google Drive. They may be able to shed some light on it. I must warn that using it in Firefox is a bit spotty compared to Chrome, though this is likely due to issues with Google OAuth and FF extensions.

freedomben

1 year ago

Same, but I can never create a new file in a directory. It will get stuck in some weird auth loop. So I always create it in the "root" directory and then move it afterward.

I love diagrams.net (draw.io) though so I just appreciate that it exists

ivanjermakov

1 year ago

They use the Google Picker API, and I think it utilizes the `drive.file` scope to give the app read/write access to a single (picked) file.

thebestmoshe

1 year ago

Originally, when I started working on [Fileshark](fileshark.app) I designed it around using your own Google Drive as the storage backend. I ended up going in a different direction. Here are some of the reasons:

- Having the entire product be reliant on Google is just too big a risk. Google has changed APIs and restricted access for third parties.

- Lots of added complexity around search, pagination, etc.

- Due to the last point, all the files need to be tracked in an external database. This adds a requirement to keep everything in sync.

- Now that the files are not stored in Google Drive, I’m working on adding an option to support auto backups to Google Drive.

tech234a

1 year ago

Aternos (a free/ad-supported Minecraft server host) backs up a user’s servers to a user-provided Google Drive account. Google Drive allows apps to store hidden data that the user cannot directly see or modify, which is useful from a security perspective on Aternos, as server backups can contain executable jar files.

smallerfish

1 year ago

I am working on a side project that currently stores data as json-decorated markdown in localstorage, without any backend. It would be trivial to write a backend that oauths with dropbox/github/etc and sync data through it, but there's little point in doing so (assuming you allow data export, and unless you want to avoid storage costs), as once user data is going through a closed backend, you have to trust the provider.

A while back I looked into the feasibility of accessing various APIs from the browser directly, and I ran into a dead end. Everything I found has CORS restrictions that prevent this from working. If anybody knows of anything, I'd love to take a look at it.

tonyarkles

1 year ago

>It would be trivial to write a backend that oauths with dropbox/github/etc and sync data through it, but there's little point in doing so

I don't know what your project is, but on any given day I'm switching between three laptops (one OSX, one Linux, one Windows), an iPad, and my iPhone. localStorage is really great for prototyping but doesn't sync between machines. Export is OK, but sync is awesome. Plus, having access to the raw data can, for me, be really beneficial for integrating with local tools. For example, I believe I once had a Python script that could parse specific attributes out of a Drawio document; I wish I could remember more details about that.

smallerfish

1 year ago

For sure, sync is another big challenge. Once you had "third party storage from the browser" you could manage sync.

I've been all over browser filesystem access too (you could e.g. use syncthing to sync from the filesystem) and it's just too clunky for users to use still.

I also tried building a Chrome extension in an attempt to use the sync storage that's available to extensions (used to sync your settings across devices), but it's crippled by being capped at only 100 KB.

tonyarkles

1 year ago

Yeah, browser filesystem access is ridiculously clunky, especially if it’s more of a “document you’re iterating on” where you have to remember to download it, make sure it’s in the right place to sync, and upload it again on the other devices. I think the thing that third-party storage really wins at is that it’s very low friction and pretty close to universally available for a semi-technical user (e.g. drawio/diagrams.net supports OneDrive, Google Drive, Dropbox, GitHub, and GitLab). Chrome extensions are relatively high friction in comparison, plus… in my personal example, I don’t actually use Chrome on any of the devices. Safari on iPhone and iPad, Firefox on Windows/Linux/OS X.

lostmsu

1 year ago

I have this storing stuff on OneDrive: https://h5reader.azurewebsites.net/

seltzered_

1 year ago

Possibly related - Dave Winer's writing about using dropbox for apps:

- http://scripting.com/2015/06/25/dropboxCouldBeKingOfTheOnepa... ('Dropbox could be king of the one-page app', 2015)

- http://scripting.com/2023/08/05/130423.html ('Identity as a product', 2023)

blackbear_

1 year ago

I developed a Telegram bot [1] that allows users to save text messages on Google Sheets, including simple text parsing to arrange the content into columns. Upon signup I automatically create a spreadsheet for the user so that they can start using the bot right away, but they can also change it to a personal spreadsheet later. And of course multiple users can add to the same spreadsheet if they know the link. Note that the bot uses a local SQLite for its own data.

Technology-wise, it is just a Java app running on my Raspberry Pi, using Google's libraries for Drive [2] to create the spreadsheets and the actual Sheets API [3] to modify content. Authentication was a bit hard to figure out, mostly because of the poor documentation. Each query to these APIs takes about one second, so it is definitely not suitable if you require low latency on the client's side or lots of concurrent accesses to the same spreadsheet. Not sure about pricing since my bot is not so popular and I am still in the free tier (:

[1] https://t.me/gsheet_notes_bot

[2] https://developers.google.com/drive/api/guides/about-sdk

[3] https://developers.google.com/sheets/api/guides/concepts

yawnxyz

1 year ago

I'm doing something similar for myself, but using SpreadAPI (https://spreadapi.roombelt.com/), which is free and uses Google Apps Script.

I use GPTs with it to save ideas / thoughts / notes, but also to parse URLs with lists like "best burgers" and GPTs will just add details to the sheets.

Sheets is GREAT for personal / small group use, but if you're a massive company you'll probably run into limitations quickly (5 API calls/s, 200k cells, monthly limits, etc.). I've never even gotten close.

newaccount74

1 year ago

I have an app that allows users to select data directories for storing config files. This allows customers to sync their settings using an existing syncing solution. Whether they use Google Drive, Dropbox, iCloud, SyncThing, or even Git is up to them.

The huge advantage is that most people already have some kind of syncing solution set up, and they can just use that. This is especially important for corporate environments where IT has to evaluate what's allowed and what isn't.

It comes with some drawbacks. Your code needs to be resilient enough to deal with inconsistent syncs, you have to use file system observation, stuff might be delayed etc.

But on the plus side, it seems to work very reliably, I've not gotten any complaints yet, and if something does go wrong savvy users can just fix it themselves. It's just a bunch of JSON files on disk, after all.

Most sync tools offer a way to share folders with other users, so if you do it right you get multiplayer for free.

joshsabol46

1 year ago

Can you share any reference material / libraries that you used to build these integrations?

chatmasta

1 year ago

There's probably a real business opportunity in selling a product that abstracts this plumbing into an SDK for developers, so they can give their users a polished flow for authenticating and configuring the storage on the frontend, and then interact with the user's chosen directory on the backend. I imagine it's been done before with various "file upload" products but I'm not familiar with the space.

As a user, I would love this kind of integration because it means data portability is built into the product.

newaccount74

1 year ago

Sorry, it seems my comment was misleading. There is no integration with sync apps.

My app is a Mac app, so it just reads/writes JSON files to disk in a folder that the user selects. The user selects a folder in iCloud drive, Dropbox, etc.

There is nothing special about my code, except that it is written with the expectation that some service modifies the files on disk in the background. Every JSON file has a uuid, so I can track renames of the files. But for the most part it's just a lot of handwritten code that converts JSON into objects and vice versa.
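The UUID-based rename tracking described above could look roughly like the Go sketch below. It is a guess at the general shape, not the app's actual code (which is a Mac app); the `uuid` field name and the `indexDir` and `renames` helpers are assumptions made for illustration.

```go
package main

import (
	"encoding/json"
	"fmt"
	"os"
	"path/filepath"
)

// header is the only part of each file this sketch reads.
// The "uuid" field name is an assumption.
type header struct {
	UUID string `json:"uuid"`
}

// indexDir maps uuid -> file path for every readable *.json file in dir.
// Files that cannot be read or parsed (for example, half-synced ones) are skipped.
func indexDir(dir string) (map[string]string, error) {
	paths, err := filepath.Glob(filepath.Join(dir, "*.json"))
	if err != nil {
		return nil, err
	}
	index := make(map[string]string)
	for _, p := range paths {
		raw, err := os.ReadFile(p)
		if err != nil {
			continue
		}
		var h header
		if json.Unmarshal(raw, &h) != nil || h.UUID == "" {
			continue
		}
		index[h.UUID] = p
	}
	return index, nil
}

// renames returns uuid -> [old path, new path] for files present in both
// snapshots under different paths.
func renames(before, after map[string]string) map[string][2]string {
	moved := make(map[string][2]string)
	for id, oldPath := range before {
		if newPath, ok := after[id]; ok && newPath != oldPath {
			moved[id] = [2]string{oldPath, newPath}
		}
	}
	return moved
}

func main() {
	dir, err := os.MkdirTemp("", "settings")
	if err != nil {
		panic(err)
	}
	defer os.RemoveAll(dir)

	oldPath := filepath.Join(dir, "theme.json")
	if err := os.WriteFile(oldPath, []byte(`{"uuid":"a1","dark":true}`), 0o644); err != nil {
		panic(err)
	}
	before, err := indexDir(dir)
	if err != nil {
		panic(err)
	}

	// Simulate a sync tool renaming the file in the background.
	newPath := filepath.Join(dir, "theme (conflicted copy).json")
	if err := os.Rename(oldPath, newPath); err != nil {
		panic(err)
	}
	after, err := indexDir(dir)
	if err != nil {
		panic(err)
	}

	for id, r := range renames(before, after) {
		fmt.Printf("%s moved: %s -> %s\n", id, filepath.Base(r[0]), filepath.Base(r[1]))
	}
}
```

Skipping unparseable files rather than failing is one way to stay resilient while a sync tool is still writing a file.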

rco8786

1 year ago

The original YNAB (https://www.ynab.com/) used a JSON file stored in your Dropbox as the backend. It worked pretty well, but definitely lacked some professionalism and had a few rough edges (related to Dropbox sync delays, mainly).

eclipticplane

1 year ago

If you like that model, Tiller (https://www.tillerhq.com/) syncs your accounts to an Excel or Google Sheet that you own.

ChuckMcM

1 year ago

Haven't done this. Curious though about "storage" (type 1) vs. "non-volatile data APIs" (type 2). The former I think of as "open/read/write/close/delete" (and maybe truncate); the latter I think of as having things like "truncate/seek", which add state machine semantics (if you don't do the operations in the same order, you get a different state of your store). The "hybrid" is to use type 1 storage as a log and use it to reconstruct a type 2 store. You still need to ensure the underlying type 1 log is written in "order" (vector clocks help here) in order to preserve data integrity.

If you have a suitably robust "middleware" layer, this should be doable with any type 1 storage back end.
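A minimal Go sketch of the "hybrid" idea: treat each stored object as an immutable log record and rebuild the key/value state by replaying the records in order. To keep it short, it assumes a single total order given by a sequence number rather than vector clocks; the `entry` type and `replay` function are hypothetical.

```go
package main

import (
	"fmt"
	"sort"
)

// entry is one immutable log record, e.g. one small file in the user's storage.
// Seq is assumed to give a total order; real multi-writer setups would need
// something richer, such as vector clocks.
type entry struct {
	Seq   uint64
	Op    string // "set" or "delete"
	Key   string
	Value string
}

// replay rebuilds the current state by applying records in Seq order,
// regardless of the order in which they were listed or downloaded.
func replay(log []entry) map[string]string {
	ordered := append([]entry(nil), log...) // do not reorder the caller's slice
	sort.Slice(ordered, func(i, j int) bool { return ordered[i].Seq < ordered[j].Seq })

	state := make(map[string]string)
	for _, e := range ordered {
		switch e.Op {
		case "set":
			state[e.Key] = e.Value
		case "delete":
			delete(state, e.Key)
		}
	}
	return state
}

func main() {
	// Records arrive out of order, as a directory listing might return them.
	log := []entry{
		{Seq: 3, Op: "delete", Key: "draft"},
		{Seq: 1, Op: "set", Key: "draft", Value: "v1"},
		{Seq: 2, Op: "set", Key: "title", Value: "hello"},
	}
	fmt.Println(replay(log)) // prints map[title:hello]
}
```

Because records are only ever added, the storage back end needs nothing beyond create, list, and read.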

bilater

1 year ago

I really wish there were more industry-standard solutions like this. There are lots of semi-serious apps I build where I'd rather not worry about auth and storage, but it's part of the product, so I'll do some hacky stuff with local storage (which only goes so far).

lostmsu

1 year ago

I am in the process of doing just that for a cloud gaming platform of mine: https://borg.games

If you want to save your game progress, you just log in with OneDrive.

I previously made a single-page EPUB reader using a similar technique with GDrive: https://h5reader.azurewebsites.net/ (that one does not work with third-party cookies disabled, though, because it uses an iframe to log in instead of redirects, and I never bothered to update it).

Mastice123

1 year ago

What about using something like personal decentralized storage? That would help it stay up all the time and ensure the user owns the data, no? There is a cool project, https://zyphe.com/, that does that. They are just at the beginning and have decided to run a decentralized KYC solution on top for now: https://www.togggle.io/

base2john

1 year ago

Solid protocol is trying to do this too, https://www.inrupt.com/solid

chatmasta

1 year ago

Is it possible to grant Google Drive OAuth permissions with scope limited to a single folder?

My experience with Drive OAuth is sketchy Colab notebooks asking me to authenticate with Drive, but in such a way that I'm apparently granting access to everything inside of it. I've never been sure if that's because of limitations on the OAuth API, the Colab client specifically, or just sloppy developers.

willsmith72

1 year ago

What are the main reasons products would do this?

Is it user-privacy focused? Product cost-savings focused? Something else?

The privacy discussion would be interesting. On one hand, users "own" their data in the sense that they can delete it themselves at any time, but you also allow a product (hopefully limited) access to your own storage provider.

tonyarkles

1 year ago

Depending on what the data format is, it's also a hedge against product obsolescence. If it's a JSON file that gets stored in my Dropbox and your product goes away I can still extract my data out of it and do something useful with it.

Marcelovk

1 year ago

I use a password manager that does this, called SafeInCloud (Android). It asks you to select one of the available backends, such as Gmail, Dropbox and others, and keeps the entire password DB encrypted there. It syncs between my devices quite well for that scenario, where writes are infrequent.

1B05H1N

1 year ago

Write/read performance is not great, and there is no way to get help from Google if you're not an enterprise customer (in the event of something happening). I use a cloud provider for important things and online drives as backups for smaller personal projects.

mfrye0

1 year ago

I know a guy who built his whole product on HubSpot using their free CRM and APIs. I believe he said he's storing a few hundred million records there and there have been no issues so far.

joshstrange

1 year ago

All of the products I used to use that did this (like 1Password, YNAB, etc.) now have their own cloud sync and it's better in every way. Yes, you pay for it, but it's well worth the stability and speed IMHO.

Dropbox has fallen so far in my eyes over the past ~decade. It used to be rock-solid file storage, a USB drive in the cloud; now it's so busy trying to be everything to everyone and failing at most things. The macOS app is hot garbage, literally one of the worst pieces of software I've had to work with. It's buggy, doesn't give useful info, gets stuck syncing, and has alerts that you can't act on. My mom's laptop yells at her that some files might not be openable and she should upgrade to their new file provider API. OK fine, let's do that. Oh wait, you can't, because you have a personal and a business account; try again later. It's been this way for a while now (like a year+).

I have Maestral on my laptop but honestly I just need a tiny push to leave Dropbox completely after being a paying customer for over a decade.

spoonjim

1 year ago

FYI, I will not sign into an app with Google or Facebook etc. unless I need to (e.g., it’s an app for doing something with my email). Too risky.

darig

1 year ago

The language/framework I created uses Google Sheets as a datastore, even providing language constructs for JOINs and pivot tables, and caching time-series report line items by date range in a separate auto-generated Google Sheet. It all works fine, but SQL with a caching layer like memcached is obviously a better option if you have control over your server at all.

zogrodea

1 year ago

Why is @darig's reply dead? It looks reasonable and interesting to me (although they could have perhaps made it clearer that it ties into this topic because of developing a programming language that does what the title asks).

dang

1 year ago

Comments by banned accounts are killed by default.

When you see a comment that's [dead] but shouldn't be, you can vouch for it. This is in the FAQ: https://news.ycombinator.com/newsfaq.html#cvouch.

zogrodea

1 year ago

Thanks for the explanation sir.