A few weeks ago, when Twitter announced that it was effectively closing its platform, I put a lot of thought into possible alternatives. The obvious choice is a distributed social network, where there is no single owner of the service, so nobody can decide to enforce rules and cut off API access. The two most popular distributed networks are Diaspora and Status.net. Here is how they work, in short:
Anyone can launch an instance (node) of the software. The instance stores their data, and possibly the data of everyone who registers on that node. When a user registered on one node wants to follow or otherwise interact with someone registered on another node, a federation protocol (OStatus, in the case of Status.net) is used to facilitate that. So, in fact, you can register anywhere, even on your own server, and communicate with the rest of the world.
And that sounds great, right? Even assuming latency is not an issue, I had one question, and I didn’t like the answer. Imagine you launch a Status.net node and let all your friends register there. Then your service becomes popular with a wider audience, and a lot more people join. You have two options: turn it into a revenue stream (you’d have to think of a business model, probably ads), or simply say “I don’t have time for this, I’m shutting it down”. In the second case, all the users’ data is gone. True, every user can download their data, and you can keep it for a couple of months before you delete it, but it’s a hassle: they’d need to move to another node, register, and import their data.
And while this is OK for tech-savvy people, it certainly can’t go mainstream. You can’t have all of Twitter move to Status.net nodes, because the bigger nodes will need a business model to support themselves, and the smaller ones will die every now and then, leaving tons of unhappy users who are unwilling to move their data around or don’t know how. It’s still better than having a single vendor, but it’s not exactly “distributed” if you have 3 huge nodes. And these nodes might at some point decide to restrict the network, because they have the data in one place. “Yes, but people can move away from them”. People can even move away from Facebook: they can download all their data and import it somewhere else.
There’s a difference, of course: you can move your Status.net data to another node and still communicate with the others, and that’s why Status.net and Diaspora may turn out to be a good model. But the fact that small nodes just die and lose their data bothered me. (I can’t omit the fact that Status.net is not user-friendly at all; this was my second failed attempt to use it.)
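To make the failure mode concrete, here is a minimal sketch of the federated model described above. It is not Status.net or Diaspora code: the `Node` class, the `fetch_timeline` helper, and the `small.example` / `big.example` hosts are all hypothetical, and the cross-node fetch is a plain function call standing in for whatever a real federation protocol would do.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """A hypothetical federated node: it alone stores its users' posts."""
    host: str
    posts: dict = field(default_factory=dict)  # user name -> list of posts
    online: bool = True

    def register(self, user: str) -> str:
        self.posts[user] = []
        return f"{user}@{self.host}"  # network-wide address of the user

    def publish(self, user: str, text: str) -> None:
        self.posts[user].append(text)

    def shut_down(self) -> None:
        # The operator gives up; nothing was replicated anywhere else.
        self.online = False
        self.posts.clear()


def fetch_timeline(network: dict, address: str) -> list:
    """Stand-in for a cross-node request made on behalf of a remote follower."""
    user, host = address.split("@")
    node = network.get(host)
    if node is None or not node.online:
        return []  # the home node is gone, and so is the data
    return list(node.posts.get(user, []))


small = Node("small.example")
big = Node("big.example")
network = {n.host: n for n in (small, big)}

alice = small.register("alice")
small.publish("alice", "hello from my friend's server")

before = fetch_timeline(network, alice)  # what a follower on big.example sees
small.shut_down()
after = fetch_timeline(network, alice)   # the same request after the node dies
print(before, after)
```

The address stays valid only as long as the home node does, which is exactly the part that bothered me: unless Alice exports her data and re-registers elsewhere, her followers end up with an empty timeline.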
At that point I thought that there is an even more distributed way of doing social networks, and I even considered turning welshare into such a piece of software. The point is to have the data distributed (replicated), just as in a database cluster, except that the cluster is internet-wide rather than confined to a single network. Various NoSQL databases have good approaches to that, Cassandra being the one I like most: you can add and remove nodes at any time, and the data always remains somewhere. This way, even if you decide to shut down your server, your users will be able to log in at a different URL (to which you will point or redirect them) without any additional steps. Each node can then develop a business model and the system will thrive. The difference from the federated model above is that no data will be lost, and regular users won’t even notice they are using a distributed network. Optionally, you can even replicate the data to cloud storage providers like Dropbox.
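A rough sketch of the replication idea, in the spirit of the hash ring used by Dynamo-style stores such as Cassandra, might look like the following. It is not Cassandra's actual implementation: the `Ring` class, the `ring_position` helper, the node names, and the replication factor of 3 are assumptions made for illustration, and the code only computes which nodes should hold a user's data; it does not move any data.

```python
import hashlib
from bisect import bisect_right


def ring_position(key: str) -> int:
    """Map a node name or user id onto the ring with a stable hash.

    MD5 is used here only for even spreading, not for security.
    """
    return int(hashlib.md5(key.encode("utf-8")).hexdigest(), 16)


class Ring:
    """Hypothetical internet-wide cluster: each user lives on several nodes."""

    def __init__(self, nodes, replicas=3):
        self.replicas = replicas
        self.nodes = sorted(nodes, key=ring_position)

    def owners(self, user: str) -> list:
        """Return the `replicas` nodes that follow the user clockwise."""
        if not self.nodes:
            return []
        positions = [ring_position(n) for n in self.nodes]
        start = bisect_right(positions, ring_position(user)) % len(self.nodes)
        count = min(self.replicas, len(self.nodes))
        return [self.nodes[(start + i) % len(self.nodes)] for i in range(count)]

    def remove(self, node: str) -> None:
        """A node operator shuts the server down."""
        self.nodes.remove(node)


ring = Ring(["a.example", "b.example", "c.example", "d.example", "e.example"])
holders = ring.owners("alice")

# One of Alice's holders disappears; the other two still have her data.
ring.remove(holders[0])
survivors = [n for n in holders if n in ring.nodes]
new_holders = ring.owners("alice")
print(holders, survivors, new_holders)
```

Because the surviving holders remain among the new owners, a user redirected to any of them can log in without importing anything; a real system would also copy the data to the newly responsible node to restore the replica count.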
And here comes the paradox. You can’t have a fully distributed social network, because you’d have to distribute (replicate) the data, and then potentially everyone would have access to it. If a node grows big, it will hold user data from other nodes, which turns into a privacy nightmare. So unless your users are fine with anyone being able to read all their data, you can’t replicate it. The problem lies with the data: it is good that you can own your data with Status.net and Diaspora, but the reality is that the regular user doesn’t want to manage their data. They would rather just give it to a provider and let the provider manage it. And the provider can then run aggregate queries on it and serve ads.
The good thing is that a network like Twitter doesn’t actually have any privacy (apart from protected accounts), so implementing such a distributed data mechanism is still an option there. But as far as I can see, it’s not an option for a full-featured social network: you need a single keeper of the data, and you can’t democratize access to it if privacy is a concern.