Sponsored
Picture: SUPPLIED/OLD MUTUAL

The amount of data produced and stored globally has grown exponentially over the last decade, and that growth is expected to accelerate.

One example of the speed of this advancement is the adoption of smart devices, which constantly record and store information about their users. While this information may be useful to individuals regarding their own habits, in the hands of data scientists the aggregation of this data can be used to predict emerging trends among populations. 

Within the fund management industry, analysing data has always formed part of the market analysis process and of creating a competitive offering. However, as data has evolved, and with the rise of “big data”, the structuring and understanding of data and its sources are changing the face of asset management.

Fund managers hold two extreme views of data. At one end of the spectrum is the view that all data is noise and that no additional insight can be gained by examining it, which is consistent with the efficient market hypothesis (EMH). At the other end is the view that all data has the potential to explain some part of the market, and that nothing should be ignored.

The basis of the EMH is that all relevant information about a company is known by all market participants and is already reflected in its share price. This implies that no additional insight can be gained by analysing data.

About the author: Reza Fakie is a portfolio manager at Old Mutual Investment Group. Picture: SUPPLIED

Fund managers at the Old Mutual Investment Group do not subscribe to the view that all data is noise, as market evidence has disproved this time and again.

Regarding the theory that all relevant information about a company is already known by all market participants and is reflected in the share price, there is still much debate as to what level of public information is captured in the price of shares. Old Mutual's fund managers, for example, do not believe that past share histories and patterns by themselves drive future share prices. However, the interaction of investors with these patterns should not be ignored: if enough investors believe that a historical price pattern will hold and act on it, the result may be a self-fulfilling prophecy that only reinforces the belief in the pattern.

For fund managers, it is important to distinguish between data that drives the share price of a company over the long term, which in Old Mutual's experience is linked to the underlying fundamentals of the company, and behaviour-based movements, which may not be based on any insight into the company. It is important to note that even behaviour-based price movements may persist for periods of time irrespective of the performance of the company.

Certain investment styles and market anomalies have persisted over time and have allowed individuals to generate positive alpha by investing in them. This is evidenced by styles such as value and growth, as well as the market anomaly of momentum, which have allowed investors to outperform the market for significant periods of time.

The second view, however, has grown in popularity with the mainstream use of machine-learning models, especially deep learning models. Deep learning makes it possible, without any understanding of the underlying data, to build a model that takes in a large amount of unprocessed data and generates a relationship between the inputs and some output to be forecast. These models, however, completely obscure what the underlying relationship between the input and output is.

The danger is that for any model to generate forecasts outside the data used to build it, the assumed behaviour of the inputs relative to the output needs to persist. If one does not know how a model assigns importance to the input data, or how the input data is being transformed, then it is difficult to know whether the model is working as intended or whether it is driven by spurious relationships, in which two things seem to be very closely related but are actually unrelated (see graphic).
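One common way a spurious relationship arises is when two unrelated series both happen to trend in the same direction. The sketch below is illustrative only: the two series are made-up numbers, and comparing period-to-period changes instead of levels is one simple check assumed here, not a method described by the author. In this constructed example the correlation of the levels is close to 1, while the correlation of the changes is close to zero.

```python
def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


def changes(series):
    """Period-to-period differences, which strip out a shared linear trend."""
    return [later - earlier for earlier, later in zip(series, series[1:])]


# Hypothetical data: two unrelated series that both drift upwards over time,
# each with its own independent wobble around the trend.
periods = range(20)
series_a = [t + (1 if t % 2 == 0 else -1) for t in periods]
series_b = [2 * t + (1, 1, -1, -1)[t % 4] for t in periods]

level_corr = pearson(series_a, series_b)
change_corr = pearson(changes(series_a), changes(series_b))

print(f"correlation of levels:  {level_corr:.2f}")
print(f"correlation of changes: {change_corr:.2f}")
```

A high correlation that disappears once the shared trend is removed is a sign that the apparent relationship may not carry any real information.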

Machine-learning models are useful in uncovering relationships in data that may have been overlooked by traditional methods. However, one then has the responsibility to fully understand what that relationship is. 

Graphic: SUPPLIED

Old Mutual's fund managers take a balanced approach here, carefully examining and understanding the data that goes into their process and then constantly testing to ensure that, once a relationship has been identified, it does persist.

Data is only useful for investors or fund management if the information extracted from it has the following characteristics:

  • It must tell you something about the future: Most data tells you about the current environment. However, for data to be useful it must provide insight into the future of companies and their share prices.
  • It must be actionable: Information which cannot be acted on may inform a broader world view but may not add value to the fund management process.
  • All data must be able to be behaviouralised: For any model to be effective, the underlying data must be understood well enough so that the relationship between it and what you are trying to forecast can reasonably be expected to persist.

It is also important to examine the source of information for consistency and quality. New data sources and new types of data can be expected to appear with increasing frequency, but one needs a sufficient history of any new data type to understand how it behaves.

An example is satellite imaging of the parking lots of malls as an indicator of retail activity. One may reasonably believe that this gives a good indication of the number of shoppers. However, without enough history to establish a baseline for comparison, it is difficult to draw any conclusions from the data.

One may also not be able to sufficiently capture the seasonality of the data. For example, people tend to spend more time in malls at certain times of the year, such as over Christmas, and at certain times of the month, such as after payday. If this is not adjusted for, incorrect conclusions can be drawn.
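As a rough illustration of the seasonality point, the sketch below compares each month's count with the same month a year earlier rather than with the previous month. The car counts are invented for illustration, and a year-over-year comparison is only one simple adjustment assumed here; it also needs at least two years of history, and it does not address within-month effects such as payday.

```python
def year_over_year_change(monthly_counts, period=12):
    """Fractional change of each observation versus the same month one period earlier."""
    return [
        (current - previous) / previous
        for previous, current in zip(monthly_counts, monthly_counts[period:])
    ]


# Hypothetical monthly car counts for one mall: flat through the year,
# with a December spike, and overall growth of 10% in the second year.
year_one = [100] * 11 + [180]
year_two = [110] * 11 + [198]
counts = year_one + year_two

# Naive view: second-year December versus the November just before it.
naive_december_jump = (counts[23] - counts[22]) / counts[22]

# Seasonally aware view: every month versus the same month a year earlier.
seasonal_view = year_over_year_change(counts)

print(f"November to December change: {naive_december_jump:.0%}")
print("Year-over-year changes:", [f"{change:.0%}" for change in seasonal_view])
```

Read naively, the December figure looks like a surge in activity, while the year-over-year view shows the same steady growth in every month.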

Data like this is also likely to miss other trends that affect it. For example, the rise of ride-sharing companies, like Uber, would mean fewer cars parked in parking lots without any decrease in the retail activity in the mall.

It is also important to ensure that any new data considered for a model complements the existing data rather than duplicating it. Old Mutual's fund managers would, for example, not use both the price-to-earnings (PE) ratio and the earnings-to-price ratio (earnings yield) in the same model, because one is simply the reciprocal of the other: it is the same data, trivially transformed. This example may seem simplistic, but when dealing with large amounts of data it is easy to miss these duplicates.

The impact is that any information gained from this data is self-reinforcing, so it may seem more important than it really is.
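A minimal sketch of how such duplicates might be screened for is shown below. It is an assumption for illustration, not the fund managers' actual process: the feature names, values and threshold are hypothetical. Because earnings yield is the reciprocal of PE, the two are not linearly related, so the sketch compares ranks (a Spearman-style correlation) rather than raw values.

```python
from itertools import combinations


def ranks(values):
    """Rank of each value (0 = smallest). Assumes no ties, for simplicity."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0] * len(values)
    for rank, index in enumerate(order):
        result[index] = rank
    return result


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


def near_duplicates(features, threshold=0.99):
    """Pairs of features whose rank correlation is close to +1 or -1."""
    flagged = []
    for name_a, name_b in combinations(features, 2):
        rho = pearson(ranks(features[name_a]), ranks(features[name_b]))
        if abs(rho) >= threshold:
            flagged.append((name_a, name_b))
    return flagged


# Hypothetical ratios for five companies.
pe = [8.0, 12.5, 15.0, 22.0, 30.0]
features = {
    "price_to_earnings": pe,
    "earnings_yield": [1.0 / ratio for ratio in pe],
    "dividend_yield": [0.031, 0.045, 0.012, 0.027, 0.020],
}

print(near_duplicates(features))
```

A flagged pair is only a prompt to look more closely; deciding which of the two inputs to keep still requires understanding what each one measures.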

Aside from the behaviour of data, most predictive models require that the data inputs be transformed so that they are comparable with one another. This helps to ensure that the relationships captured are not spurious and that no single piece of data has an outsize impact on the model.
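One widely used transformation of this kind is standardisation, which rescales each input to a mean of zero and a standard deviation of one. The sketch below assumes that choice purely for illustration; the article does not say which transformation is used, and the input names and values are hypothetical.

```python
from statistics import mean, pstdev


def standardise(values):
    """Rescale values to mean 0 and (population) standard deviation 1."""
    centre = mean(values)
    spread = pstdev(values)
    if spread == 0:
        # A constant input carries no information; map it to zeros.
        return [0.0 for _ in values]
    return [(value - centre) / spread for value in values]


# Hypothetical inputs on very different scales: market capitalisation
# in rand and dividend yield as a fraction.
market_cap = [1.2e9, 5.0e10, 3.3e11, 8.0e9]
dividend_yield = [0.031, 0.045, 0.012, 0.027]

scaled = {
    "market_cap": standardise(market_cap),
    "dividend_yield": standardise(dividend_yield),
}

for name, values in scaled.items():
    print(name, [round(value, 2) for value in values])
```

After rescaling, both inputs sit on the same scale, so a model is less likely to weight market capitalisation more heavily simply because its raw numbers are larger.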

Fund managers are constantly inundated with large amounts of data and an ever-growing list of new types of data. While it is important to constantly examine new data to decide if it has sufficient information content to be useful, it is imperative that all the data in a model is fully understood to uncover the true relationship between it and what one is trying to forecast.

It is Old Mutual's belief that a systematic and evidence-based approach will lead to the best outcome.

This article was paid for by Old Mutual Investment Group.    
