data overload

Recently I’ve been on a self-analysis kick: how do I spend my time each day (really spend it, not what I plan to do), what do I eat (again, what I really eat vs. what I hope to), how much exercise I really get, what I’d like to spend more or less time on, etc.

This kind of data is deeply interesting to the individual it pertains to, and deeply irrelevant to the rest of the world: except in aggregate, when this data suddenly transforms from an extraordinarily boring log into a marketing strategy, or a pattern for health interventions, or a map of how people in certain types of workplaces spend their day.

Which is why, when I bought a Fitbit the other day (in a quest to gather even more data) and was setting it up, I lingered over their Terms of Use (which yes, I am just enough of a self-proclaimed legal nerd to read). The bit that’s most interesting is the non-identifiable data part — in aggregate, Fitbit can share and sell the data their devices collect and upload to the internet. Since these are essentially wearable pedometers, that means quite literally every step you take, and for some of the devices, every heartbeat you have…

…in the cloud, synced to your phone, synced to their servers.

And these traces are only provided-as-aggregated through the grace of the service and its security; of course Fitbit both has and stores disaggregated data as well, because that’s what you’re signing up for, more than likely linked to a Facebook account or, through Google, your email, or perhaps to another app.

Do you check yes to those terms?

Fitbit and their ilk, along with other related services like personal genomics kits, have found a clever human hack: we all seem to love data about ourselves. How fat, how far, how likely, how many likes. Wearable technology is trying to find its niche, except when it comes to wearable self-monitoring, which (I predict) just needs to get slightly more usable, or maybe integrated into phones directly, before it explodes even more. After all, everyone I know, and probably everyone you know too, wants to get more exercise or lose a few pounds or both; they know what to do and yet need some motivation to do it, even if that motivation is just a tiny machine strapped to your wrist.

Does privacy trump data? Very few things trump the desire to lose five pounds, in my experience, except perhaps for chocolate cake and butter. And by the time you’re reading those terms, you’ve got the hundred-dollar item in your hand, packaging ripped open, ready to play with a new toy.

I checked yes.

Here’s what we have to ask ourselves: does it matter?

I made a decision long ago to put lots of personal information online. The remainder is no doubt easily findable. I have a fairly prominent position running a project that many people, and many people in many governments and security services, care about. I travel to lots of countries. My credit card number (linked to a bank account that I manage online) has been stolen twice (it’s been fine; bank systems worked). Like most people, I carry a mobile phone; like most people, I have not disabled the GPS, which means there’s a trace of where I am all the time anyway. I still use Facebook, which is about as horrifying from a personal-privacy-online perspective as it gets. My email and most chats are all in Gmail, except for my work email, which is in Microsoft’s cloud — and all of that, we know from simple statistics, has a pretty decent probability of being hacked. Most of my documents, some of which are confidential, also live in the cloud. (I have worried more about this than I ever have about a record of where I am turning up publicly online.)

I am not temperamentally inclined to be overly worried about mentally unstable people showing up on my doorstep, though I’d guess I have a better-than-average chance of that happening. I am also not especially inclined to be worried about my personal privacy; I am by nature open. But I’m a researcher: I know exactly what it means to dig up information on people, and I like to make explicit decisions about what those results look like for myself. In the case of Fitbit, or any of a thousand other similar decisions made in a few seconds hovering over a terms of use checkbox, the reward is immediate and juicy (data about my exercise patterns!) and the disaster is amorphous and only a long-term possibility (what would a data breach of Fitbit servers even look like?).

So here’s what I’ve concluded: individually, like my individual data, it does not matter. In aggregate, as we all make decisions about systems that can or do track us, it matters a great deal. Consumers should not be forced to make these decisions after the point of sale, based on incomplete information (is Fitbit a trustworthy company? How many times have they been subpoenaed for data?).

We need a better way of making these decisions, and a standard set of best practices for various industries to adopt. Wikimedia has a rather groundbreaking privacy policy: you read and post what you want, subject to our content terms (which are minimal) and community content standards (which are not), and we collect next to nothing about you when you do it. In our current world of data collection everywhere, this is extraordinary. It’s not always a simple decision, either: as a researcher and someone who wants the site to be more usable, I wish we did have more information on how people use Wikipedia. But these considerations have to fit into our core value of user privacy, not the other way around.

This is not a groundbreaking analysis, and I am not a legal scholar. What I know is I, personally, want there to be a better way: to weigh privacy decisions about signing up for services with a standard metric in hand (like the EPA numbers on cars, perhaps) of what it means to do so. Now that is data we could all use.


One Response to data overload

  1. Interesting thoughts, and I think we have a similar view of putting public data online.

    Have you heard of the CommonTerms project? It looks like it might solve some of your problems, or at least make the reading of ToU a bit faster. I’m not sure of the current status, as the Roadmap talks about May 2013 as “currently”, but you could try reaching out to @common_terms or @Plannero to see what they are up to.
