Machine Learning and Big Data: Buy-Side European Bond Opportunities

By Andy Webb and Luigi Marino from MTS. This article first appeared on TABB Forum; you can read the original here.

One of the most interesting recent developments in capital markets has been the conjunction of machine learning and big data. As the ability and incentive to capture data at ever finer granularity has taken us to the microsecond and beyond, the volume of available data has grown enormously. At the same time, machine learning techniques have continued to improve their predictive ability as they are trained on ever more data. The third ingredient is that the hardware (such as GPUs) needed to train machine learning algorithms on all this data has become increasingly commoditised and is now readily available in very high density at relatively modest cost.

It's not so long ago that capturing and time stamping market data at microsecond granularity was unheard of, but this has now become commonplace, with some record sets for a single day easily running into millions of rows. Furthermore, the range of related instruments in this sort of record set can be similarly impressive: for instance, MTS Historical Data delivers a complete record set for more than 1800 European government bonds, plus additional data for related repo transactions.

At the same time, the machine learning tools to analyse and extract market intelligence from this data have continued to improve. One of the key changes in this area has been the way in which scale has driven progress in deep learning (large neural networks) in several interconnected ways. As neural networks grow in size and are fed ever larger volumes of data, their prediction performance continues to improve. By contrast, older machine learning techniques simply plateau in performance above a certain volume of data (see Figure 1). This means that very large data sets, such as MTS Historical Data, are valuable because they can be used to push prediction accuracy to higher levels.

Figure 1[1]

At the same time, advances in and commoditisation of high-performance hardware - particularly high-density parallel processing with GPUs - have ensured that the horsepower needed to train large neural networks on huge volumes of data is available at reasonable cost (see Figure 2).


Figure 2[2]

The right data

While collecting data at high frequency has become commonplace in markets such as FX and equities, the sheer range of instruments in bond markets and their occasionally patchy trading activity have meant that they have tended to lag in this respect. For instance, many corporate bonds trade relatively infrequently and/or in low volumes, so a comprehensive data set of multiple bonds with coinciding time stamps for each data point is highly unlikely to emerge.

An exception to this is a bond trading platform where very high volumes of similar types of liquid bond are traded. Under these conditions, it is entirely possible to build a data set that is sufficiently granular and consistent for a deep learning algorithm to use for meaningful prediction. MTS Historical Data is a case in point. It provides trade-by-trade data (size, price and yield) at up to microsecond granularity back to April 2003 for cash bonds and January 2010 for repo. Also included are the top three levels in the order book per bond for each trading day with microsecond time stamps, as well as the full depth of the un-netted MTS Cash order book, including every new order and price tick since June 2011. A further important distinction is that all MTS Historical Data is completely original and direct from source; it is not derived or blended with other data in any way.
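As a rough illustration of how such a record set might be handled, the Python/pandas sketch below loads trade-by-trade data and aligns it to a regular time grid. The file name, column names and ISIN are placeholders for this example only, not the actual MTS Historical Data schema.

```python
import pandas as pd

# Hypothetical file and column layout -- the real MTS schema may differ.
trades = pd.read_csv(
    "mts_trades_sample.csv",
    parse_dates=["timestamp"],                        # microsecond-resolution time stamps
    usecols=["timestamp", "isin", "price", "yield", "size"],
)

# Restrict to one bond (placeholder ISIN) and resample to a regular one-minute grid
# so observations across instruments can later be aligned for modelling.
bond = trades[trades["isin"] == "XS0000000000"].set_index("timestamp")
bars = bond["price"].resample("1min").ohlc().ffill()
print(bars.head())
```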

The fact that this is available at this fine level of granularity across more than 1800 European government bonds makes it valuable from a machine learning perspective. As these bonds have various common factors and interrelationships, there is a reasonable prospect that one subset of them will have value as the raw basis for machine learning features that can be used for predicting the future behaviour of others. The presence of repo data further enhances the possible predictive opportunity here, given the "intimate relationship between cash and repo markets and the impact of supply and demand effects"[3].

The right tools

Armed with high-quality, rich, big data, the predictive possibilities are almost endless, because the tools to make the most of this opportunity are readily available and in most cases are open source rather than proprietary. Multiple scalable machine learning libraries are immediately accessible for widely used languages such as R and Python, plus others such as Julia. In addition, various machine learning frameworks, which provide a single consistent interface to multiple machine learning algorithms, are also available. The advantage of these frameworks (such as caret, h2o and mlr) is that they provide an easy route to combining multiple machine learning algorithms via stacking (or blending) for a single prediction task. This can be particularly useful when using large data sets, such as MTS Historical Data, as sufficient data will be available for training and out-of-sample testing of both the first-stage learners and any further algorithms that sit on top of them. For example, a support vector machine, random forest and gradient boosting machine might be the first-stage learners, but their output could then be used as some of the feature inputs to a deep neural net overlay. Such stacking techniques can often, though not always, outperform any individual algorithm in terms of predictive accuracy.
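The frameworks named above are largely R- and JVM-centric; as a minimal Python analogue of the same idea, the sketch below stacks the three first-stage learners mentioned under a small neural-network overlay using scikit-learn. The synthetic data and hyper-parameters are illustrative stand-ins for engineered bond features, not a recommended configuration.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import (GradientBoostingClassifier, RandomForestClassifier,
                              StackingClassifier)
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier
from sklearn.svm import SVC

# Synthetic stand-in for engineered bond features and a buy/sell label.
X, y = make_classification(n_samples=5000, n_features=20, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, shuffle=False)

# First-stage learners: SVM, random forest and gradient boosting machine.
base_learners = [
    ("svm", SVC(probability=True)),
    ("rf", RandomForestClassifier(n_estimators=200)),
    ("gbm", GradientBoostingClassifier()),
]

# Their out-of-fold predictions feed a neural-network overlay (the meta-learner).
stack = StackingClassifier(
    estimators=base_learners,
    final_estimator=MLPClassifier(hidden_layer_sizes=(32, 16), max_iter=500),
    cv=5,
)
stack.fit(X_train, y_train)
print("Out-of-sample accuracy:", stack.score(X_test, y_test))
```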

One of the less glamorous aspects of machine learning is data munging/wrangling: transforming raw data into features usable as inputs to a machine learning algorithm. This covers a range of activities, such as cleaning (e.g. removing text from supposedly numeric data fields) and rescaling (many machine learning algorithms require input features within a certain numeric range). To this can be added the critical task of feature selection/creation. Again, the good news is that the tools for all these tasks are readily available, and again many of them are open source and well-maintained.
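For example, a minimal sketch of the cleaning and rescaling steps described above, using pandas and scikit-learn on a toy frame with illustrative column names:

```python
import pandas as pd
from sklearn.preprocessing import MinMaxScaler

# Toy example of two common wrangling steps; column names are illustrative only.
raw = pd.DataFrame({
    "price": ["101.25", "101.30", "n/a", "101.28"],   # text mixed into a numeric field
    "size":  [5_000_000, 10_000_000, 2_000_000, 7_500_000],
})

# 1. Cleaning: coerce text to numbers, then drop (or impute) unusable rows.
raw["price"] = pd.to_numeric(raw["price"], errors="coerce")
clean = raw.dropna()

# 2. Rescaling: many algorithms expect features within a fixed range, e.g. [0, 1].
scaled = MinMaxScaler().fit_transform(clean[["price", "size"]])
print(scaled)
```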

The right hardware

From a hardware perspective, one of the advantages of machine learning is that many (not all) of the most effective algorithms are (or can be) embarrassingly parallel in operation. This fits well with the advances made in recent years in using commodity GPUs for parallel processing tasks, as opposed to their traditional use for video display. NVIDIA has been one of the quickest movers here with its range of Tesla high-performance computing (HPC) GPU cards.
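As a minimal sketch (assuming PyTorch and a CUDA-capable card, neither of which is prescribed here), the only change typically needed to move a small network's training step onto a GPU is to place the model and data on the device:

```python
import torch
import torch.nn as nn

# Use a GPU if one is available, otherwise fall back to the CPU.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

model = nn.Sequential(nn.Linear(20, 64), nn.ReLU(), nn.Linear(64, 1)).to(device)
optimiser = torch.optim.Adam(model.parameters(), lr=1e-3)
loss_fn = nn.MSELoss()

# Dummy batch standing in for engineered bond features and a target.
x = torch.randn(256, 20, device=device)
y = torch.randn(256, 1, device=device)

optimiser.zero_grad()
loss = loss_fn(model(x), y)
loss.backward()
optimiser.step()
print(f"device={device}, loss={loss.item():.4f}")
```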

A further opportunity is the ready availability of cloud computing services that provide all the hardware necessary for even the largest scale machine learning projects on a rental rather than purchase basis. While large providers such as Amazon Web Services (AWS) or Google Cloud Platform (GCP) attract a lot of attention, there are also various smaller providers, such as FloydHub, that focus specifically on turnkey cloud computing solutions for machine learning.

The right mindset

One of the most fundamental changes currently underway is the growing acceptance of machine learning in financial markets. Neural networks in particular have in the past had something of a hit-and-miss reputation, but their latest iteration in the form of deep learning is already gaining acceptance and implementation on the sell-side for large-scale projects. On the buy-side, there is also growing acceptance, but there is still some way to go. At Bloomberg’s Buy-Side Week 2017 New York event, just 16% of participants said they had incorporated any kind of machine learning into their investment strategies. At the other end of the spectrum, 32% hadn't even considered using it in this role, while 24% were researching how to do it and 26% wanted to learn how to do it.

Historically, there has been some resistance from buy-side quants who regarded classical modelling techniques as credible and machine learning as dubious. This is changing as the quant community increasingly accepts machine learning as a useful tool and potentially the next alpha capture opportunity. This also applies beyond quants to the buy-side more generally, with value managers in particular increasingly prone to a fear of missing out[4].

The right opportunity

Pull all these themes together and it becomes apparent that the buy-side is looking at a major opportunity in a market where some might least have expected it. The patchy, inconsistent data available for many bonds has long been a bugbear for human analysts, never mind machine ones. This is why the availability of a data set as comprehensive and granular as MTS Historical Data is significant. In conjunction with machine learning, it opens up an entire new market segment of alpha opportunities for the buy-side.

The data is equally well suited to building either classification (buy/sell signal) or regression (absolute price/yield value) models for prediction. These can then be deployed for live trading via the MTS BondVision API. Beyond directional trading strategies, the connected nature of the instruments in the MTS Historical Data universe also opens the door to a broad range of relative value arbitrage strategies.
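For illustration, the sketch below constructs both kinds of target from a synthetic price series; the resampling interval, prediction horizon and feature choices are assumptions made for the example only, not a prescribed workflow.

```python
import numpy as np
import pandas as pd

# Synthetic mid-price series standing in for a resampled bond price history.
idx = pd.date_range("2017-01-02 08:00", periods=1000, freq="1min")
mid = pd.Series(100 + np.random.randn(1000).cumsum() * 0.01, index=idx)

horizon = 5  # predict five intervals ahead (illustrative choice)

# Regression target: the absolute price level at the horizon.
y_reg = mid.shift(-horizon)

# Classification target: buy (1) if the price is higher at the horizon, else sell (0).
y_cls = (y_reg > mid).astype(int)

# A few simple illustrative features derived from the same series.
features = pd.DataFrame({
    "ret_1": mid.pct_change(),
    "ret_5": mid.pct_change(5),
    "vol_20": mid.pct_change().rolling(20).std(),
})
print(features.join(y_cls.rename("label")).dropna().head())
```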

Conclusion: right place, right time

From the perspective of buy-side participants looking for new sources of alpha, the availability of MTS Historical Data is a good example of right place, right time. It covers a market segment that many would usually consider extremely challenging from a machine learning perspective, because of the typically intermittent nature of the data available. Nevertheless, by focusing on only the most liquid and actively traded European government bonds, MTS Historical Data delivers a comprehensive dataset.

An additional hurdle when building models for bonds is the complexity of the models typically involved, which can easily result in over-fitting when working with a limited range of potential inputs of weak information value. Again, MTS Historical Data helps here by providing a highly granular yet broad data set that facilitates the extraction of high-quality machine learning features, which can be used to train robust models capable of adequate generalisation.
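One common safeguard against over-fitting with time-ordered market data is walk-forward validation; as a minimal sketch, the example below uses scikit-learn's TimeSeriesSplit on synthetic data standing in for engineered bond features.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.model_selection import TimeSeriesSplit, cross_val_score

# Synthetic features/target as a stand-in for engineered bond inputs.
rng = np.random.default_rng(0)
X = rng.normal(size=(2000, 15))
y = X[:, 0] * 0.5 + rng.normal(scale=0.5, size=2000)

# Walk-forward splits: each fold trains on the past and tests on the future,
# giving a more honest estimate of generalisation than shuffled cross-validation.
cv = TimeSeriesSplit(n_splits=5)
scores = cross_val_score(GradientBoostingRegressor(), X, y, cv=cv, scoring="r2")
print("Fold R^2:", np.round(scores, 3))
```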

Therefore, the combination of MTS Historical Data's high-quality big data with readily accessible innovations in large-scale machine learning and reasonably priced hardware represents a considerable buy-side opportunity.