17 November 2015

When you get up at 2:30 in the morning to catch a train to attend a workshop, I think it’s inevitable that you question what on earth it is you’re up to.

I’m in Edinburgh today for an Alan Turing Institute (ATI) scoping workshop on the data analytics pipeline.

It’s just over a year since the open meeting at London’s Royal Academy which launched the call for partners in the joint venture that was to form the ATI. As it stands, the institute will be an international leader in the area right from the start. Rumors circulate about who will be seconded there, but given the choice of director and the concentration of powers from the joint venture partners, the chips are well stacked for them. The question is, at what cost to the wider UK academic environment in this area? The claim on world-leadership at outset will merely be based on a reshuffling of the cards, placing the aces all in London’s hand. This itself may already be enough to benefit the country as a whole. After all, in the near vicinity of the institute are DeepMind, the Guardian, Facebook London and a host of start-ups. Next door is the Crick Institute … maybe that’s where we should play our aces!

Britain is a hub and spoke economy with London at the centre, but there are supposed to be efforts to rectify that. Data certainly presents such an opportunity, but the new distribution of expertise won’t help.

Let’s up the stakes. We are an international community, and a focus on UK priorities seems a little parochial. Where are the challenges in data on the world stage at the moment?

I think they are massive and manifold. In particular, by its very success the machine learning community has created a major challenge for our society. To be effective, machine learning methods need to centralise data for exploitation. They need widespread access to all facets of their subjects. What are those subjects? This isn’t an Internet of Things: it’s an Internet of People.

The tendency to centralise data implies that we are heading for a form of digital oligarchy where the knowledge wielded through data will be held by only a few. This in turn implies a loss of individual freedom that brings to mind a future where our virtual selves are subject to a rule of law that bears a close resemblance to medieval feudalism.

What’s this got to do with the Alan Turing Institute? Turing himself was a victim of draconian laws and the unwarranted meddling of authorities in personal matters. He was prosecuted under British law for homosexuality, a law that is still a legacy bequeathed to many African states and forms the basis of ongoing persecution across the world.

The problems of those states go far beyond our own discarded laws, but the opportunities for progress on the back of an information revolution are also cause for a great deal of hope. The opportunity in Africa actually stems from the lack of a pre-existing centralised structure. This means people are poised to adopt new models of information sharing that are cheaper, decentralised and may be more equitable in the long run. These models will evolve in an era where a personal information storage device, the mobile phone, underpins the infrastructure. This facilitates peer-to-peer interaction, which provides a chance for the user to regain control.

We need to exploit the opportunities offered by our information futures without succumbing to the pitfalls that are triggered by our tendency to give up our personal freedom with our data.

So where is the generation of learning methods that we will deploy on these platforms? Where are the methods which allow for user-centric models of data management? Where are the methods that are privacy aware and transparent to those they aim to help? Where are the methods that can assimilate a variety of data modalities, most of which would be missing for any given individual? Where are the methods that are designed not for an internet of things, but for an internet of people?

There are always exceptions, but the wider community currently seems to be in a mad rush to exploit the strengths of our current generation of approaches rather than developing the next generation of technologies. This short-termism is necessary for industry, but the role of academics should not be to compete with them but to complement them. Unfortunately, in modern academia we also seem subject to the whims of the prevailing wind, or at least we are when we operate individually.

The Alan Turing Institute offers an opportunity to rally. The data science community in the UK is internationally leading but split across the statistics and machine learning fields. Success with UK science funding very much relies on the establishment of a clique of aligned interests. Ironically the breadth of applicability and utility of data science technologies has meant that until now the community has never had the cohesion to overwhelm the inherent conservatism of the councils. This has meant that the fundamental methodologies needed are not debated or funded. The platform on which we build our data science futures is an academic hobby horse.

The scoping workshop I’m attending today will present an opportunity to make the case. The initial success of the Alan Turing Institute will be measured not next year but in a decade’s time. If all it does is more of the same, then it will have damaged a robust eco-system of research in the UK with no particular beneficial effect. However, the reason I got up at 2:30 in the morning is that I don’t believe that’s going to happen. The only reason to seek “Critical Mass” in data science is to ensure that you are forming the agenda rather than following it.

In the UK we are taking a distributed data science community and centralising it in the Alan Turing Institute … today I’ll argue we should use that concentration of power to drive an agenda that decentralises the Internet of Things, ensuring it becomes an Internet of People. That would seem the right way to honour Alan Turing. After all, he was not just a great mathematician, but a human being whose fundamental freedoms were infringed.
