First Day of Data Science in Africa
Early start, awake at 4 to work on slides and lectures.
I feel like the key is flexibility. Kenya will be a new audience for me and I need to adapt the material. At dinner Ciira emphasised the importance of giving them a good background introduction to machine learning, one that wasn’t too technical, but then later showing them that there was serious maths there.
This is a good approach. There is no hiding it. The task is big. But I know that John is also here with immense experience of applying data science in practice.
Dedan Kimathi
A bit of background on Dedan Kimathi University. We are in Nyeri, about 150 kilometers from Nairobi. To the east is Mount Kenya. It is 100 kilometers away, and in the haze it looks like a cloud. It must be an extraordinary sight on a clear day.
The University is new. Until 2012 it was a technical college and then a college as part of a larger University. It now has its charter. There are about 5000 students and the campus is on the site of a colonial coffee plantation. The guest house, where the speakers are staying, is the farm house.
Ciira has organized banners and t-shirts. There is one banner across the front of the University entrance, and one on the main building: Data Science for Africa.
There are a lot of students. Up to 100 expressed interest, and in the class numbers vary around 80. I’m writing this now as John explains classification by k-nearest neighbours. The students are quiet initially when asked to make suggestions, but once someone is willing to break the silence the questions flow.
students attending the school
Before we started the sessions we were welcomed by the Vice Chancellor and the Deputy Vice Chancellor. I got a chance to have a few words with the Vice Chancellor whilst we were being introduced. It must have been a tremendous journey to take the University from college status to fully chartered. The University is in an excellent position: it is close enough to Nairobi that the journey (3 hours) is not prohibitive, yet we are in a slightly more rural area. Nyeri is a colonial town. It is the town where Princess Elizabeth became queen when her father died; she was on safari here.
There are challenges: Ciira is Department Chair for Electrical Engineering and his department has 7 faculty and 500 students. But there is also a willingness to embrace new ideas and move the institution forward. The minister in charge of ICT visited recently, and has backed the workshop. Many researchers will make their way out of Nairobi towards the end of the week, along with members of the UN Global Pulse lab.
What is Data Science?
For the first session I focussed on the background to machine learning, putting it in the wider context of modelling and prediction. A key thing is to get good interaction with the audience. Different cultures seem to react in different ways to this. Students were quiet at first, but after some time there were lots of good questions, mainly about the context of data science: where machine learning fits in, and how it relates to other fields like statistics, predictive analytics, AI and data mining. It was a lot of fun to try and answer these questions, although I tried to emphasize that I was giving my own perspective. I was asked whether there was a good book about what data science is (and also how it relates to data analytics). One reason it’s an interesting field is that currently there is no “canonical” definition of what a data scientist needs to know, and what their role will be. It feels early for a definitive text. Part of what we were doing with the workshop and conference was working to help define that term. The talk in the end covered much broader ground than I’d intended, but this was totally appropriate.
Coffee break involved some local coffee, roasted, ground and brewed all on the campus. DeKUT coffee. It is weaker than I’d normally drink, but very high quality.
Mathematical Fundamentals
Ciira asked me to emphasise the maths a little more in my second session, to try and ensure that the students understand that it isn’t only about plugging methods together and combining them to get a result. They need to understand that the fundamentals are mathematically based, and that the principles need to be assimilated to get the best out of these algorithms. There was a question directly about this: “Why do I need to know this if the software is available?” The answer is that we don’t yet have a toolkit of learning algorithms to do every job in AI we need doing. Just as a craftsman needs to understand tools, so does a data scientist.
This workshop is about the basic toolsets, but those basic tools can be turned to many uses if they are understood and applied creatively. I mentioned the boy on his bicycle from the day before and how cleverly he was riding it. The bicycle was like an ML method slightly ill designed for the purpose it was being used for (it was too big), but the boy was skillfully managing to get something out of it. For the moment that’s what we are faced with. We are attacking this from two directions: firstly we are trying to improve the tools (make the bicycle smaller) and secondly we are trying to improve the understanding of those that need to deploy these methods (teaching people to adapt their riding style to the bike they’ve got).
Lab Session
The lab session went very well, thanks to the efforts of Mike Smith and Andreas Damianou. The internet was pretty slow, so getting people up and running involved a lot of innovative thinking. Mike Smith excels in this area, and soon had a local server running plus a system of distribution using memory sticks.
Mike Smith explaining the lab session
Julius Adebayor had also arrived, so after the lab session had finished we headed into town for a couple of beers to celebrate a successful first day. It had gone very well, although the key as always was adaptability. I probably only covered about half the material I’d originally planned for, but I covered an enormous amount of ground that I hadn’t thought about! The most important thing is to react to the audience, and see where they are. In the end, the presentation I gave was a much better one than I’d planned, and it was really the students that brought that out of me, with questions that often went right to the core of what data science is and what they might be able to get out of it.
Tomorrow is the day when the workshop gets serious. John Quinn will be presenting on his perspectives. They are hewn from over a decade of experience of tackling real challenges in African Data Science.