Africa must reap the benefits of its own data
Twenty-two years ago, when I was a doctoral student in artificial intelligence (AI) at the University of Cambridge, I had to create all the AI algorithms I needed to understand the complex phenomena I was studying.
For starters, AI is computer software that performs intelligent tasks that normally require human beings, while an algorithm is a set of rules that instructs a computer to execute specific tasks. In that era, the ability to create AI algorithms was more important than the ability to acquire and use data.
Google has created an open-source library called TensorFlow, which implements many of the most widely used AI algorithms. In this way, Google encourages people to develop applications (apps) using its software, with the payoff being that Google will collect data on any individual using the apps developed with TensorFlow.
Today, an AI algorithm is no longer a competitive advantage, but data is. The World Economic Forum calls data the new “oxygen”, while Chinese AI specialist Kai-Fu Lee calls it the new “oil”.
Africa’s population is increasing faster than that of any other region in the world. The continent has a population of 1.3-billion people and a total nominal GDP of $2.3-trillion. This increase in population is in effect an increase in data, and if data is the new oil, it is akin to an increase in oil reserves.
Even oil-rich countries such as Saudi Arabia do not experience an increase in their oil reserves. How do we as Africans take advantage of this huge amount of data?
There are two categories of data in Africa: heritage and personal. Heritage data resides in society, whereas personal data resides in individuals. Heritage data includes data gathered from our languages, emotions and accents. Personal data includes health, facial and fingerprint data.
Facebook, Amazon, Apple, Netflix and Google are data companies. They trade data to advertisers, banks and political parties, among others. For example, the controversial company Cambridge Analytica harvested Facebook data to influence the 2016 US presidential election, which potentially contributed to Donald Trump’s victory.
Google collects language data to build Google Translate, an application that translates text from one language to another. This app claims to cover African languages such as Zulu, Yoruba and Swahili. However, Google Translate is less effective in handling African languages than it is in handling European and Asian languages.
Now, how do we capitalise on our language heritage to create economic value? We need to build our own language database and create our own versions of Google Translate.
An important area is the creation of an African emotion database. Different cultures exhibit emotions differently. These differences are very important in areas such as the safety of cars and aeroplanes. If we can build a system that can read pilots’ emotions, this would enable us to establish whether a pilot is in a good state of mind to operate an aircraft, which would increase safety.
To capitalise on the African emotion database, we should create a data bank that captures the emotions of African people in various parts of the continent, and then use this database to create AI apps that read people’s emotions. Mercedes-Benz has already implemented “Attention Assist”, a system that alerts drivers to fatigue.
Another important area is the creation of an African health database. For some conditions, AI algorithms are able to diagnose diseases as well as or better than human doctors. However, these algorithms depend on the availability of data. To capitalise on this, we need to collect such data and use it to build algorithms that will be able to augment medical care.
Some of the latest technological developments are intelligent personal assistants, which respond to voice instructions. Google has developed Google Assistant, Amazon has Alexa, Apple has Siri and IBM has Watson. These assistants are very effective but do not handle African accents well.
We can enhance these assistants by including emotion-detection algorithms and by making them more robust to different accents, especially the rich and diverse accents of Africa. For us to capitalise on our accent heritage, we need to create our own database of African accents and use this to build intelligent personal assistants that can understand African languages.
Facial-recognition algorithms do not work very well for African faces, for three reasons. The first is the limitations of the existing libraries of African faces. The second is the suboptimal data collection for African faces, which are different from Asian and European faces. The third is that we have not designed AI algorithms for face recognition from the African perspective.
Companies such as Facebook are collecting huge amounts of data from African people who have Facebook accounts. We should think about how we can create our own face database. Departments of home affairs can use this database to increase security at points of entry into countries. Currently, for the smart identity card, the home affairs department is imaging the face from the front only. For a facial database, images of the sides of the face are required as well. In a way, Facebook is building this, with our help, as we upload our images. These images also express emotions, and so contribute to another aspect of the database.
There is so much heritage and personal data we can collect and monetise to derive economic value. This includes pictures of eye irises and fingerprints, which are very valuable for building biometric security systems. However, for us to be ready, we need to develop the skills to effectively collect and analyse this data. These skills are data analytics and AI algorithmic skills.
To use open-source AI algorithms such as those in TensorFlow, one requires some understanding of programming. Data analytics skills should go beyond the basic statistics courses we often find in our universities and must include advanced topics such as signal processing as well as handling incomplete and imperfect data sets.
How do we then increase our capacity to collect and analyse data? First, we should introduce national data banks that collect this data. However, we should do this in a way that protects data security and privacy. One way of achieving this is to expand the mandate of organisations such as Stats SA to include the gathering of personal and heritage data, in addition to gathering and analysing economic data and conducting the national census.
Regional organisations such as the Southern African Development Community (SADC) must create data banks that gather and monetise regional data.
At the continental level, the African Union (AU) should establish data banks that will consolidate and monetise a continental database. New opportunities for this are arising, for instance through the AU’s approach to the “AU passport”.
In creating such data banks, we should bear in mind that any given database is usually incomplete and imperfect. We should equip data-gathering organisations with the competence to analyse incomplete and imperfect data.
If we are able to explore the vast heritage and personal data of Africa’s 1.3-billion people, while observing all ethical requirements, we can create the “Saudi Arabia” (oil) of the fourth industrial revolution.
Prof Marwala is vice-chancellor and principal of the University of Johannesburg. He deputises for President Cyril Ramaphosa on the SA presidential commission on the fourth industrial revolution.