A forewarning: this is a more technical/policy-oriented post, the first of many I plan to write in this vein. I hope some of my techie friends will get something out of it.

Can’t Open Your Email Inbox? Good Luck

I was perusing The New York Times and ran across this article, “Digital Domain – Can’t Open Your E-mail Inbox? Good Luck”. It reminded me of a post I’ve been meaning to write for a while addressing why on earth I would have my own domains, run my own server, host my own e-mail, etc.

First, some thoughts on the meaningful article from the NYT. When hotmail.com first launched and took off, I thought webmail, the first instance of really putting personal data in the “cloud”, was a genuinely innovative and exciting technology. As with many innovations, its implications became clear only after some time had passed. Hotmail.com blew up, Microsoft paid a pretty penny for it, and now all the major players have some sort of webmail component in their service offering.

What is interesting to see nowadays is how dependent people have become on these webmail services. If the service goes down or they are not allowed access for some reason, their productivity is lost. I recall meeting with an adviser of mine a few months ago, and he was complaining to me about how his day was shot because he used GMail and all his information and calendaring were contained within the service, which had been down for 10 hours for him. I suggested he take advantage of their IMAP access so he could at least have a local client on his computer that cached everything in his webmail account.
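For the technically curious, the idea of keeping a local copy over IMAP can be sketched in a few lines of Python with the standard library's imaplib. This is only a rough illustration, not a replacement for a real mail client: the function name backup_mailbox, the one-file-per-UID layout and the host and credentials in the usage note are all my own placeholders, and it ignores details like UIDVALIDITY changes and nested folders.

```python
import imaplib
from pathlib import Path


def backup_mailbox(conn, mailbox, dest_dir):
    """Save each message in `mailbox` as <uid>.eml under dest_dir.

    `conn` is an already logged-in imaplib.IMAP4 (or IMAP4_SSL) object.
    Returns the number of messages newly written to disk.
    Assumption: UIDs stay stable for this mailbox (UIDVALIDITY is not checked).
    """
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)

    # Open read-only so the backup never changes flags on the server.
    status, _ = conn.select(mailbox, readonly=True)
    if status != "OK":
        raise RuntimeError(f"could not open mailbox {mailbox!r}")

    status, data = conn.uid("search", None, "ALL")
    if status != "OK":
        raise RuntimeError("UID SEARCH failed")

    saved = 0
    for uid in (data[0] or b"").split():
        target = dest / f"{uid.decode()}.eml"
        if target.exists():
            continue  # already cached locally, skip the download
        status, parts = conn.uid("fetch", uid, "(RFC822)")
        if status != "OK":
            continue
        for part in parts:
            # The message body arrives as the second item of a tuple.
            if isinstance(part, tuple):
                target.write_bytes(part[1])
                saved += 1
                break
    return saved
```

To try it, you would log in first, for example with `imaplib.IMAP4_SSL("imap.example.com")` followed by `conn.login(...)`, and then call `backup_mailbox(conn, "INBOX", "mail-backup")`. Running it again only fetches messages that are not already on disk, which is the property that would have saved my adviser's day.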

There is a certain level of trust that users have with their service provider when relying solely on their service to meet what is now a vital communications need. Changing your email address nowadays and informing your contacts may be more of a hassle than changing your phone number. I’ve been using a lifetime bouncer provided to me by the Stanford Computer Science Department which has allowed me to swap around the underlying email service as I please. If this bouncer/aliasing approach were more commonplace, people would be able to get around some of the downtime issues that plague their productivity.

Getting back to the key issue in the NYT article, the issue of trust is really a two-way street when it comes to these types of service offerings. Users do not want to provide too much identifying information (generally speaking) when it comes to online service registration, and yet service providers need a distinct way to authenticate an individual’s access to a particular account. The fallout from the hacking of Sarah Palin’s email account should not come as a surprise, as we saw the same thing a couple of years ago when Paris Hilton’s T-Mobile account was hacked by someone guessing her dog’s name as her password. Imagine now how much more widespread this problem can become as personal details about individuals become more readily available through revealing social networking profiles, blogs, professional networks, etc., while users still choose relatively unsophisticated passwords.
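To make the guessing problem concrete, here is a small Python sketch of the kind of check a service (or a cautious user) could run: does a password contain anything that is already public about its owner? The function name guessable_from, the three-character minimum and the sample details are all hypothetical choices of mine, and a real password policy would need far more than this.

```python
import re


def _normalize(text):
    """Lowercase and drop everything except letters and digits."""
    return re.sub(r"[^a-z0-9]", "", text.lower())


def guessable_from(password, public_details):
    """Return the public details that appear inside the password.

    `public_details` is a list of strings someone could find on a profile
    page or blog: pet names, home town, birth year, and so on.
    Very short details (under 3 characters) are ignored to avoid noise.
    """
    pw = _normalize(password)
    hits = []
    for detail in public_details:
        token = _normalize(detail)
        if len(token) >= 3 and token in pw:
            hits.append(detail)
    return hits
```

Calling `guessable_from("rex2008!", ["Rex", "Palo Alto"])` flags the pet name, since appending a year and an exclamation mark does nothing to hide it. An empty result does not mean a password is strong, only that it is not trivially derived from the details you supplied.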

RMS – Richard Stallman on Cloud Computing

Before I left on my vacation, I ran across some rumblings on Hacker News and other sites in response to comments made by rms, also known as Richard M. Stallman. An article at The Guardian summarizes his comments, so I would suggest skimming that before continuing on here. If you think about it for a minute, it’s clear how these two articles relate. Now, I appreciate much of what Stallman has done for the open source community with the Free Software Foundation. GNU and many of his other contributions have provided the bedrock for building systems and services that would otherwise have been very costly to build, which would have impeded innovation. That said, I think Stallman’s take on “cloud computing,” as many things are being (incorrectly) lumped under this umbrella, is a bit extreme.

First, there is much discussion going on now about what exactly cloud computing is. I don’t want to get into defining it right now, other than to point out that, like others, I feel the term is being used too liberally to label many services which don’t fit. Webmail is closer to SaaS (Software as a Service) than anything else in my book. But that’s beside the key points that Stallman brings up against users relying on services like GMail. I tend to agree with his points about why it is not prudent to entrust important personal data to a 3rd party. I never understood how most people would be okay with Google parsing their emails to target advertising. It seemed like the beginning of a very slippery slope. If a 3rd party can scan your email to target ads, what is the next thing they can/want to do with your personal data to deliver more advertising to you?

For some users, this trade-off is acceptable as they are willing to forgo privacy for the cost savings or features of a particular service. For most users, however, I believe this pushes back the line on acceptable privacy practices. As this becomes the norm, will users balk at the next, more intrusive foray into the sphere of personal data? I won’t agree with Stallman that users are “stupid” for making such choices, but I will concede they are either ignorant of Google’s practices or less vigilant about their privacy. His notion of service providers “locking” users in is also not a fair assessment, as more providers are supporting IMAP which, in effect, makes their email data extremely transferable. (See imapsync, which will effortlessly let you move email from one IMAP provider to another.) While this is true for email, Stallman’s assessment is accurate for some other cloud services. For example, if you built a service that utilized storage on Amazon’s S3 service and then decided that you wanted to move to Nirvanix, you would have to invest in porting/integrating some layer to work with Nirvanix’s APIs instead of S3’s. This may change as the cloud service industry matures, but that remains to be seen, as it would create a race to the bottom on margin and would make the services commodities.
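One way to keep that porting cost down is to write your service against a thin storage interface of your own rather than against a provider's API directly. The Python sketch below is only an illustration of the idea: BlobStore, InMemoryStore and migrate are names I made up, and the in-memory class stands in for provider-specific adapters (one wrapping S3, one wrapping Nirvanix) that you would still have to write yourself. Nothing here is either provider's actual API.

```python
from typing import Dict, Iterable, Protocol


class BlobStore(Protocol):
    """The only storage operations the rest of the service may use."""

    def put(self, key: str, data: bytes) -> None: ...
    def get(self, key: str) -> bytes: ...
    def keys(self) -> Iterable[str]: ...


class InMemoryStore:
    """Stand-in backend; a real adapter would call a provider's API here."""

    def __init__(self) -> None:
        self._blobs: Dict[str, bytes] = {}

    def put(self, key: str, data: bytes) -> None:
        self._blobs[key] = data

    def get(self, key: str) -> bytes:
        return self._blobs[key]

    def keys(self) -> Iterable[str]:
        return list(self._blobs)


def migrate(source: BlobStore, target: BlobStore) -> int:
    """Copy every blob from source to target; return how many were copied."""
    copied = 0
    for key in source.keys():
        target.put(key, source.get(key))
        copied += 1
    return copied
```

With this in place, switching providers means writing one new adapter and calling `migrate(old_store, new_store)`, rather than touching every part of the service that reads or writes data. The trade-off is that you can only use features every backend supports, which is exactly the commodity dynamic described above.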

I believe there is some inherent value in cloud services, though I’m not sure it can be realized at the consumer level quite yet. For small businesses and startups, there is a lot of value that can be achieved with minimal cost by utilizing the infrastructure expertise and investment of others that have come before them. For consumers, the issues that should trump cost are privacy and access. Just as there is a very delicate balance in ceding civil rights for increased security, as we have seen with things like the Patriot Act and its sequel, I believe a similarly delicate balance exists in ceding access to and control of personal data for “free” services. When the ends are hard to ascertain (are we safe enough? and am I getting the most value in exchange for my personal data?), it is easier to continue giving up what we once held dear.

The Middle Ground

As with many things in life, I find that extremes are usually not the best way to go and finding some middle ground balances the pros and cons of a variety of approaches. One extreme is currently being pushed by significant online players like Google, Microsoft and Yahoo. They are all encouraging their users to push all their critical personal data like email, personal media, contacts, etc. into cloud services under their branded umbrella, be it Google Apps, Microsoft Live, etc.  The other extreme is what Stallman is calling for, rejecting the use of services that seek to re-attribute ownership of data from you to the service provider.

I believe that there is a middle ground that can allow users to benefit from both. Users already unconsciously strike this middle ground and the best example is sharing personal media. Typically, an individual will take many more pictures than they put online through various services like Picasa, Flickr, Photobucket, etc. While they may avoid putting all the pictures online for convenience (who wants to wait around for hours for pictures to upload?), they are also creating a dividing line between content they keep to themselves and content they want to utilize a 3rd party service for.

I believe this can be the archetype for the balance between private personal data and exposed personal data that ends up somewhere in the cloud. The user should be responsible for defining that access and control and service providers should take the ethical route of educating users about their practices up front instead of behind some stale “Privacy Policy” link that sounds so boring no one but EFF members will venture to.

Control What You Can

I have zero expertise in professional IT administration. I do believe, however, that it is worth taking the time to take control over what resources and services you need if you can leverage building blocks that are widely used.

I’ll give an example of what I mean. When it comes to email, there are a handful of widely deployed open source solutions that are used by millions of people every day. The advantage in leveraging these building blocks is that you benefit from their heavy usage: bugs get caught and are then fixed in later versions of the software. There is little value in building your own stack of email services at this point as all of the key components have become commodities.

You could take this rationale to another level and believe that it is then worthwhile to use services that have millions of users like GMail or Hotmail. I believe there is a valid distinction here which the NYT article hit on. While you will likely benefit in a similar way using a service with millions of users as you would using open source components with millions of users, the difference lies in the ownership of data. Own the data, control the privacy and be responsible for the access.

Learning Opportunities

Running your own server is not trivial. There are lots of stumbling blocks to overcome, but in the end, it is extremely fulfilling. I believe I’ve learned enough to perform the basic IT tasks needed to bring a small office up if needed. This is something that will hopefully prove to be useful sometime down the road. I’ve also picked up a lot more knowledge about how the Internet’s infrastructure works when it comes to routing mail, resolving domain names, etc. I could go on and on, but the key is looking at running your own server as an empowering opportunity that really has a lot of interesting solutions you can tailor to your own needs. Best of all, there’s lots of useful information and guides out there to get you started.

Why IT Administrators and Employees Don’t Like Each Other

Running your own server to serve your needs isn’t all roses. I understand now that putting up IT services for others is difficult work, and that I was wrong to trivialize it earlier. It takes a certain kind of person to be so detailed, knowledgeable and instinctive when it comes to IT-related tasks, and that shouldn’t be underappreciated. When I was at my last employer, we used to grumble about things that seem like a dream at my current employer, like software package version management. Employees will always be pushing the envelope on technology they want available in the IT infrastructure and IT administrators will always be balancing that with the risk of rolling that technology out to support more than just the interested user(s).

If you choose to go this route of running your own server, you’ll get to see both sides of this struggle. As a consumer, you’ll want all the bells and whistles and the latest shiny software drop. As the administrator, you’ll wonder how well the new package has been tested, whether it is compatible with other dependencies and so on. There is much to learn in balancing needs versus risk in this situation that can improve your instincts about evaluating software in general.

You Get Out of It What You Put Into It

For many years, I was a customer of shared hosting companies that basically provided you with space for a web site, access to remote storage over FTP, email, domain hosting and domain management. Several times a year, I would encounter downtime that prevented access to my web site, remote storage and email. These occurrences were extremely frustrating because they inhibited much of my online activity. A few months ago, I had all I could take and decided to ditch the shared hosting model and sign up with a VPS (Virtual Private Server) provider.

Basically, a VPS is a dedicated slice of a server that is shared much more strictly, guaranteeing you resources that will prevent your services from getting bogged down. There are managed and unmanaged VPS providers, with managed accounts offering much less configurability and control and unmanaged accounts offering just enough to shoot yourself in the foot. I opted for an unmanaged provider after scouring the web for reviews and settled on Linode. The purpose of this post wasn’t really to be an ad for VPS providers or Linode in particular, but I would highly recommend them to anyone interested in going down a similar road. Certainly, it’s not for the technically faint of heart, but it can be very worthwhile to learn a lot about what makes some Internet services tick by building your own stack.

Best of all, I get to control a lot of data that I care about. My sites, my email, and even personal data related to some services like bookmark management are managed on my VPS. Mozilla Weave is a project that is attempting to solve these latter issues by allowing users to self-host personal data used by in-browser services like bookmark managers. It’s been a worthwhile journey so far, and hopefully I’ll continue to learn more along the way.