BUT there's another piece that makes or breaks these tools... whether they can build a community around them and stick around for years...
Open‑source cloud storage projects come and go when maintainers burn out... a sustainable business model or strong contributor base matters as much as technical checklists...
ALSO interoperability is underrated... if your drive can speak WebDAV or S3 and plug into existing identity systems, teams are more likely to try it...
In the end people want something that won't vanish after the honeymoon... that's harder than adding a progress bar...
I don't want to be part of a community around my cloud storage. I want it to work and I want to think about it as little as possible.
I use Syncthing and it does a pretty great job at this, no one ever insisted I need to join a Syncthing community, yet it keeps on working.
I don't pay a dime for Syncthing but I'm vaguely aware that they're linked to a company called Kastelo which provides enterprise support for Syncthing deployments. Probably a lot of Syncthing development is paid for that way.
Incidentally I founded an open source consulting company that's totally unrelated to cloud storage. We have enterprise as well as smaller contracts. We develop some addons in-house and the bigger enterprise contracts tend to subsidize most of the work that goes into them. We haven't asked anyone to be part of a community and I don't think we need to.
Communities are nice, but if you want your software to last I think a good business model and a good marketing strategy are a better bet. Bonus, you can quit your day job.
So understanding the long-term stability of a project is about more than checking whether there is a company backing it; it's important to analyze the nature and diversity of the interests involved. I think it's just as important that there exists a larger community, one the business depends on for extra feature and bugfix work, that is capable of forking. When a fork is possible, it is much less likely to ever be necessary.
Edit: @n3t heard, wrt the turn of phrase
Why? Who cares? If the tool solves the problem, you need a community to maintain it. And that's it.
I use:
- Syncthing (https://syncthing.net/) to keep files synchronized between desktop and laptop computers
- WebDAV (https://github.com/hacdias/webdav) to access the files on the server via other applications
- Cryptomator (https://cryptomator.org/) to encrypt/decrypt sensitive directories (which are synchronized through Syncthing); Cryptomator also lets me access those directories via WebDAV
- MaterialFiles on Android to access the files on the server
I access my mini server from outside with a Wireguard VPN created on my Fritz!Box router.
Between home and office I created a site-to-site Wireguard VPN between the two Fritz!Box routers.
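The FRITZ!Box handles this through its UI, but the equivalent plain WireGuard config for one side of a site-to-site link looks roughly like this (addresses, keys and hostname below are placeholders, not from my actual setup):

    # /etc/wireguard/wg0.conf on the "home" side (placeholder keys/subnets)
    [Interface]
    Address = 10.0.1.1/24
    PrivateKey = <home-private-key>
    ListenPort = 51820

    [Peer]
    # the office router
    PublicKey = <office-public-key>
    Endpoint = office.example.com:51820
    AllowedIPs = 10.0.2.0/24   # route the office LAN through the tunnel
    PersistentKeepalive = 25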
Why is that? I've been using Nextcloud in our company and for myself, and I couldn't be happier: no issues in 3 years, all the tools and plugins I need, and sync running perfectly, hassle-free and performant. I thought it was generally liked up until now. I didn't try any of the alternatives though, so they might indeed be better; I don't have any reason to try them tbh, as NC works almost too well.
I use a cron job to back up Nextcloud to B2 and S3 Glacier.
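Roughly, the shape of such a job, with rclone as one common option (remote names and paths here are illustrative; a real job would also put Nextcloud in maintenance mode and dump the database first):

    # /etc/cron.d/nextcloud-backup -- paths and remote names are hypothetical
    30 2 * * * root rclone sync /var/www/nextcloud/data b2:nextcloud-backup
    30 3 * * * root rclone sync /var/www/nextcloud/data glacier:nextcloud-archive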
How much storage do you use and how much does it cost?
Have you ever tried restoring from Glacier?
I've only tested partial restores from Glacier since it is expensive. I've got a raidz2 array locally as insurance against having to restore from a backup.
That single 18TB HD is hardly safe from a disaster or even plain old hardware death, and it's a single point of failure. You need at least three times as many drives to start having something you can actually rely on to keep your data for 3-5 years.
> But compare it to immich for example and they're just not playing in the same league imo
I mean, this doesn't make sense at all, tbf. They're literally not in the same league, as they're targeting different use cases. Nextcloud offers a MUCH broader experience, while Immich has a very clear-cut focus and does nothing outside of that. Comparing them doesn't make any sense, except if you're actually talking about the UI exclusively. Then, yes, Immich feels much more modern and smooth.
An old write up is here: https://itsfoss.com/peergos/
It might be better in their weird AIO solution? But I don't like the idea of giving a Docker container the ability to spawn more containers. I just use one of their normal Docker containers and have had to manually change a lot to make it work as they actually suggest. Just recently I set up their notify_push plugin, as it improves performance, but the provided setup instructions didn't work in my setup and I had to manually tweak several things.
And while community is great, I don't think the Nextcloud developer community is that big or active. Their plugin system is basic and archaic; lots of things there are begging for a rework.
So while Nextcloud is decent once set up, I am happy to see some fresh OSS projects solving similar issues appear. Maybe their approach will be better.
Seafile took me by surprise in terms of how quick it was at picking up new files and changes - syncing works incredibly well too. I moved all my files from my Google Drive into my Seafile instance and I'm now using it on all my devices as my main cloud storage solution.
The ability to just run it in a snap has really contributed to this, imho; Nextcloud is enterprise software you just happen to be able to run in your homelab.
Been a user for around a decade. It is really great. Nextcloud was choking on large repos back in the day, and it requires a beefier machine.
I first ran Seafile on a cheap ARM board with 2 GB of RAM and a 2-core CPU.
Running it in Docker made it so much easier to administer (maybe add in the missing db indexes if there's a major version change).
If you want, I can paste my docker-compose.yml for reference as it's relatively complex.
Overall it's no WordPress instance that works everywhere.
That’s all!
It auto-updates, and I can bet it will not break.
Anyone who wants to seriously use Nextcloud should look into the AIO Docker containers or rolling the individual containers themselves. Nextcloud has expanded into a full groupware stack, and it's expected that you have an actual admin managing the system, as with any real deployment of enterprise software.
If you need more advanced or fancy/niche features, AIO might be a better, though heavier, fit (I run an instance of AIO at home, mostly for testing). Snap is lightweight and a bit opinionated (in reasonable ways, in my view), and the documentation used to mention some of its limitations. In exchange, you get a snappier, more robust installation.
I used a Docker container and it led to so many error messages, having to wipe all the containers and restart, etc.
So I gave up and didn't bother.
There's lots more to hosting your own file share/sync tool than just standing it up.
He complained about the difficulty of installing an application. He didn’t complain about establishing a personal data center.
That one line will give you Nextcloud. Exactly one more snap command will give you a self-signed cert. Alternatively, the line below will give you remote access, a domain, and a valid certificate for your application:
curl -fsSL https://tailscale.com/install.sh | sh
You will have a functioning personal Drive on a VPS or a computer at this point!
Toggle snapshots on VPS for backups.
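So the whole bootstrap is roughly the following (assuming the official Nextcloud snap and Tailscale; adjust to taste):

    sudo snap install nextcloud                  # the "one line" Nextcloud install
    sudo nextcloud.enable-https self-signed      # optional: self-signed certificate
    curl -fsSL https://tailscale.com/install.sh | sh
    sudo tailscale up                            # remote access over your tailnet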
Setting up services with public clouds also takes some steps.
If Twake nails those and keeps a sane on prem story with S3 and LDAP, it has a shot. The harder part is trust and docs. Clear threat model. Crisp migration guides from Drive and Dropbox. And a tiny CLI that just works on a headless box. Do these and teams will try it for real work, not just weekend tests.
I don't think I've ever considered a data store without that being one of my top concerns. This anxiety comes from real-life experience where the business I worked at had backups enabled for the primary data store for years, but when something finally happened and we lost some production data, we quickly discovered that the backups weren't actually possible to restore from, and had been corrupted this whole time.
I grabbed the hard-drive off the shelf, put it in an enclosure and handed them the source-code... (At the time, every time I upgraded my system, I would just keep my old drives, so... had a stack of them - buy a new external enclosure, slot it and park it.)
Then once you grow/need higher reliability, you can start adding more advanced checks, like checking that it has the tables/data structures you expect, and so on.
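A sketch of what such a check can look like, assuming a Postgres dump and a hypothetical users table (adapt to your own schema):

    # Nightly restore test: load the latest dump into a scratch database
    # and fail loudly if a basic sanity query doesn't pass.
    set -e
    createdb restore_test
    pg_restore --no-owner -d restore_test /backups/latest.dump
    rows=$(psql -tA restore_test -c "SELECT count(*) FROM users;")
    [ "$rows" -gt 0 ] || { echo "restore test failed: users table is empty"; exit 1; }
    dropdb restore_test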
"Something went wrong!"
Isn't that just 91.5% JavaScript?
TypeScript is not real.
Personally I separate church and state by writing tests in JS and application code in TS.
If you're asking why these languages at all when this and that other language is faster, most likely it's less of a bottleneck than estimated.
With clients some of them have already made this bad decision; with my own personal files I get to avoid it.
And then I saw Npm references and thought “in JavaScript?!” But at least it’s typescript.
For people whose UX is dragging and dropping stuff into the browser, and/or using a desktop sync client only, sure, why not; the UI looks clean and familiar. But as someone who has used and still uses like 3 different similar things concurrently, the only real reason I use Drive is the seamless zero-dependency office-like web software being part of the product.
(yes I know it's a curse too, I ended up writing a piece of software just to migrate company drive stuff to my personal drive when a company I was a cofounder in went bust to have a record ... those google docs can really only exist in Drive natively, any export is an immediate downgrade)
Is there decade-old software that provides a UI or an API wrapper around these features for a "Google Drive" alternative? Maybe over the SMB protocol?
Another issue would be permissions: if I wanted to restrict access to a file to a subset of users, I'd have to make a group for that subset. Linux caps a user at 65,536 supplementary group memberships (NGROUPS_MAX), which could quickly be exhausted for a nontrivial number of users and shares.
There is no support for writing multiple xattrs and file contents in one transaction.
Journaled filesystems that immediately flush xattrs to the journal do have atomic writes of single xattrs, so you'd need to stuff all the data into one xattr value and serialize/deserialize it (with e.g. JSON, or potentially Arrow IPC with Feather ~mmap'd from xattrs; edit: but getxattr() doesn't support mmap, and there are xattr size limits: ext4: 4K, XFS: 64K, Btrfs: 16K).
Atomicity (database systems) https://en.wikipedia.org/wiki/Atomicity_(database_systems)
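For illustration, the single-xattr-with-JSON approach with the standard attr tools looks like this (attribute name and fields are made up; mind the per-filesystem size limits above):

    # store all metadata as one JSON blob in a single user xattr
    setfattr -n user.docmeta -v '{"title":"Q3 report","version":3,"tags":["finance"]}' report.odt

    # read it back (and hand it to your app for deserialization)
    getfattr --only-values -n user.docmeta report.odt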
The file system is already a database.
For naming, just name the directory the same way on your file system.
Shareable urls can be a hash of the path with some kind of hmac to prevent scraping.
Yes, and if you move a file, you can create a symlink to preserve the old path.
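A minimal sketch of the path-plus-HMAC share token using Node's built-in crypto (the secret, path and URL shape are invented for illustration):

    import { createHmac, timingSafeEqual } from "node:crypto";

    // Hypothetical server-side secret; rotate it and all share links change.
    const SECRET = process.env.SHARE_SECRET ?? "change-me";

    // Derive an unguessable share token for a file path.
    function shareToken(path: string): string {
      return createHmac("sha256", SECRET).update(path).digest("hex");
    }

    // Verify a token from a URL like /share/<token>?path=<path>.
    function verifyToken(path: string, token: string): boolean {
      const expected = Buffer.from(shareToken(path), "hex");
      const given = Buffer.from(token, "hex");
      return given.length === expected.length && timingSafeEqual(expected, given);
    }

    console.log(shareToken("/photos/2024/cat.jpg"));

Since the token is derived from the path, moving a file breaks its share links, which is where the symlink trick comes in.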
Filesystem or LVM snapshots immediately come to mind
> or shareable URLs to files without a database?
Uh... is the path to the file not already a URL? URLs are literally an abstraction of a filesystem hierarchy already.
I use ZFS snapshots and like them a lot for many reasons. But I don’t have any way to quickly see individual versions of a file without having to wade through a lot of snapshots where the file is the same because snapshots are at filesystem level (or more specifically in ZFS, at “dataset” level which is somewhat like a partition).
And also, because I snapshot at set intervals, there might be a version of a file that I wanted to go back to but which I don’t have a snapshot of at that exact moment. So I only have history of what the file was a bit earlier or a bit later than some specific moment.
I used to have snapshots automatically trigger every 2 minutes and snapshot cleanup automatically trigger hourly, daily, weekly and monthly. In that setup there was a fairly high chance that if I made some mistake editing a file, I also had a version that kept the edits from right before, as long as I discovered the mistake right away.
These days I snapshot automatically a couple of times per day and cleanup every few months with a few keystrokes. Mainly because at the moment the files I store on the servers don’t need that fine-grained snapshots.
Anyway, the point is that even if you snapshot frequently, it's not going to be particularly ergonomic to find the version you want. So maybe the "Google Drive" UI would also have to check each revision to see if it was actually modified and only show those that were. And even then it might not be the greatest experience.
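One rough way to get that "only show revisions that actually changed" behaviour from the shell, assuming GNU coreutils and a dataset mounted at /tank/data (path and filename are placeholders):

    # list distinct versions of one file across all snapshots,
    # deduplicated by content hash (md5 hex is 32 chars, hence -w32)
    f="documents/notes.txt"
    for snap in /tank/data/.zfs/snapshot/*/; do
      [ -f "$snap$f" ] && md5sum "$snap$f"
    done | sort -k1,1 | uniq -w32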
https://cockpit-project.org/applications
--
With no command line use needed, you can:
- Navigate the entire filesystem
- Create, delete, and rename files
- Edit file contents
- Edit file ownership and permissions
- Create symbolic links to files and directories
- Reorganize files through cut, copy, and paste
- Upload files by dragging and dropping
- Download files and directories

I have no idea how this project was designed, but a) it's to be expected that disk operations can and should be cached, and b) syncing file shares across multiple nodes can easily involve storing metadata.
For either case, once you realize you need to persist data then you'd be hard pressed to justify not using a database.
Broadly, for all the configuration HashiCorp Vault makes you do, you can achieve a much more useful set of permissions with a Samba file share and ACLs (it certainly makes it easy to grant targeted access to specific resources, and with IIS and Kerberos you even get an HTTP API).
Samba is a complicated piece of software built around protocols from the 90s. It’s designed around the old idea of physical network security where it’s isolated on a LAN and has a long long history of serious critical security vulnerabilities (eg here’s an RCE from this month https://cybersecuritynews.com/critical-samba-rce-vulnerabili...).
My solution: Share nothing and use rsync.
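For reference, the rsync approach is usually just a one-liner over SSH (host and paths here are placeholders):

    # -a keeps permissions/times, -z compresses, --delete mirrors deletions
    # (point it at the wrong directory and --delete will happily empty it)
    rsync -avz --delete ~/Documents/ user@backup.example.com:/srv/documents/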
I’d say Dropbox et all is closer to a good design but their backend is insanely crazy optimized to make it work and proprietary. There’s an added challenge that everything these days is behind a NAT so you usually end up needing to have a central rendezvous server where nodes can find each other.
Since you’re looking at rsync where you want something closer to Dropbox, I’d say look at syncthing. It’s designed in a way to make personal file sharing secure.
put all kinds of versioned metadata on docs without coming up with strange encodings, and even though POSIX (and NodeJS) offers a lot of FS related features it probably makes sense to keep things reeeeally simple
and it's easy to hack on this even on Windows
Sure, it's more overhead, but you can't put a price on preventing your NAS from developing sentience.
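A minimal sketch of the "keep it really simple" idea in Node: versioned metadata as a plain sidecar JSON file next to each document (naming and fields are invented, not taken from any project discussed here):

    import { readFile, writeFile } from "node:fs/promises";

    // Metadata for "report.odt" lives in "report.odt.meta.json" next to it.
    interface DocMeta {
      version: number;
      tags: string[];
      updatedAt: string;
    }

    async function readMeta(docPath: string): Promise<DocMeta | null> {
      try {
        return JSON.parse(await readFile(`${docPath}.meta.json`, "utf8"));
      } catch {
        return null; // no metadata written yet
      }
    }

    async function writeMeta(docPath: string, meta: DocMeta): Promise<void> {
      await writeFile(`${docPath}.meta.json`, JSON.stringify(meta, null, 2));
    }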
Would love to see your source code for your take on this product.
Using a database isn’t some kind of heavy-handed horrendous thing depending on the implementation (e.g., as long as it leaves your content files alone).
There is a database in most if not all useful cases, but there could also be the actual files separately.
1: This appears to be backed by a French company called Linagora. I don't know much about the company, but they've been around for a while.
2: I experimented with MongoDB for a similar product, and it turned out to be very unreliable. A lot may have changed since I used MongoDB, but in general I'm wary of any product that uses it unless there's an expectation that the data is lossy.
(Which was the problem MongoDB had at the time: their CTO only wanted to target lossy-data use cases, but the people interested in using MongoDB wanted a database that was easier to use than SQL.)
What happened was that its document model, and flexible index model, made it very attractive as an easy-to-use database. I used to call it the "Visual Basic" of databases.
I think the less technical people in marketing latched on to how a lot of people found MongoDB easier to work with, and there was a lot of selling to people who it shouldn't have been sold to.
The problem was that the lossy nature of MongoDB didn't rear its ugly head until deep into a project, and the assumptions made when writing documents led to situations where operations required changing multiple documents, or other corner cases that triggered loss in larger schemas.
Of course, if you used MongoDB as intended, which was for ingesting lots of data with some tolerance of loss, you were totally fine.
The ones I've tried could only do one-off downloads via the web, or sync whole folders, but not do the placeholder thing. That doesn't really work for me.
The mobile experience last I tried was pretty rough though. I don't really need my files on my phone and I have a web interface on my home server I can use to grab them in a pinch, but it's something to keep in mind.
You could of course build the app yourself from source.
Cool app though!
But yeah there's a reason people don't do this anymore
I would add to that list something like a splitwise alternative.
And open source too? Seems too good to be true.
With so much surveillance I think there's a real need for E2E on anything. I just bought the basic Tutanota package - but maybe that's just my OCD acting out.
EDIT: This is closer, and you can self-host
https://github.com/cryptoboid/splitio
But it's in JavaScript <throw up> can't win them all.
So far Google is amazing at search. Hopefully others will get better, but it's really hard to evaluate cloud software based on that.
I do this all the time, right before open sourcing a project. Basically while it's private, commit quality can be a bit rough, and if I want to open source it, I'll remove .git, make a new init commit then open source it. No one needs to see what I do in my private abode :)
It's much better to refactor (rebase) the messy commits, removing the personal or embarrassing stuff; although that might result in a "false" history, a series of smaller-sized commits will usually be much easier to follow than reading a whole code base all at once.
Really, I see a ton of open-source projects that do this, and it results in a lot more opacity and friction than necessary.
It results in fewer people being able to check the code and contribute to the project.
If the project is from the get-go supposed to be a long-lived project (like professional development for a business) then I agree, don't smoke the entire history no matter how embarrassing it is.
But for my personal projects, I can let you know that having access to the git history before I made it FOSS will make you dumber rather than being helpful for anything, compared to one clean starting commit.
I don't? I said I remove it because it isn't useful to anyone, might even be adding more confusion than it solves, not because I'm embarrassed over anything.
If there might be some usefulness hidden there (for example, trying something and then reverting it shows that you did explore it), it's also possible to place the old stuff in another repository or another branch (better the latter, unless it increases the repository's size too much)
True, those things tend to go into the documentation itself, checked into the codebase itself instead of being somewhat hidden inside the git history. Usually I end up having both a "Open Problems" (things yet to solve) and a "Tried X, this is why it didn't work" section somewhere in the documentation.
> it's also possible to place the old stuff in another repository
Yes, before the process I initially described, I usually leave a copy intact with the full-full history, but that's not what I published, just kept as an archive.
That's good, and yes, if that repository history really wouldn't add anything it's fine to squash everything
> > it's also possible to place the old stuff in another repository
>
> Yes, before the process I initially described, I usually leave a copy intact with the full-full history, but that's not what I published, just kept as an archive.
Ok, I meant a public repository though
TDrive would work
> If you have a US startup called X and you don't have x.com, you should probably change your name.
But they do own https://twake-drive.com/ already? What exactly is your point here? Either you misunderstand the linked article, or I do. But it seems people would be able to find it just fine if they search for it, as twake-drive.com comes up as the first result when I search for "Twake Drive".
Besides, Graham's articles are almost always geared towards startups in one way or another. This doesn't seem to be that, so I'm not sure I'd even try to read it if I were the owner of Twake Drive.
Re: should they read it? Either you want your product to spread, or you don't.
If you're posting it on HN, you want to share it, and for it to be shared. A tough name makes it harder to share, so you have to decide if you really want your product to spread or not.
You search that in Google with file sharing keywords and the AI will helpfully correct it to 'do you mean GDrive?'
They would've lost a prospective user to a competitor while sounding like a knockoff of some other product.
I've definitely been more motivated to de-cloud as the tech bros capitulate and push their AI way too hard.
Was that because of team expertise or particular aspects of TS you thought suited the domain?
JavaScript was a poor choice that will hold the project back, just as choosing PHP as the base has done (and continues to do) a lot of damage to Nextcloud/ownCloud. This is not a task for a scripting language, because they're disqualified on performance. It's also not a task for dynamic typing; using TypeScript helps with that, but it doesn't change the fact that JavaScript is just generally slow and does not play well across multiple CPUs.
FWIW, the people working on this project have Mission and Vision pages on their website: https://linagora.com/en/mission https://linagora.com/en/vision
Took me a whopping 17 seconds to find those two.
-Someone, surely