Best of TTU – Machine Learning

Top Traders Unplugged - A podcast by Niels Kaastrup-Larsen

Today I would like to share with you some lessons from expert Robert Sinnott, who offers his in-depth knowledge of the modern evolution of systematic trading in the form of machine learning, and explains how machine learning has affected his firm's short-term and long-term strategies. If you would like to hear the full episode, you can listen by clicking here. I'm pretty sure you are going to learn something valuable from what Robert shared with me.

Rob: In statistics, machine learning, and decision theory, many of the concepts that are talked about actually got their start in the '70s, the '60s, even the '50s. Where a lot of the nuance, and a lot of the excitement, has come today is in where and how they are applied, what data sets they're applied to, and to what degree you can automate and systematize the application of these tools.
Machine learning is a toolbox. To say that we use machine learning to build our algorithms is kind of like saying, “I use tools to build a house.” It’s not really additive in terms of your understanding. So, let me break that down.

"Machine learning is a toolbox."

Let's talk about what we actually do with machine learning. Before we go into the tools that we use, which I think are really interesting, let me also break down the problems that we try to solve, because there's a lot of hype about machine learning, and I think there are some kinds of problems where there is a lot of potential for growth.
I also think there are some kinds of problems that, regardless of the amount of machinery you throw at them, are still going to be a challenge, and that is where people who are well versed in machines, and specifically in their limitations, will still be able to add value as humans rather than just automatons.
So when we think about machine learning, I would say that there are two kinds of problems. You have what you would think of as classification problems, or (as I like to call them) stationary problems, meaning that the problem you're working on doesn't change over time.
Here's a great example: Google has come out with a lot of really interesting results and a lot of really impressive, fast algorithms for identifying various things in videos and images. Go back ten years, and it was a really hard thing to identify a cat in a photo. Now it's trivial. In fact, you can do it for arbitrary objects. You can just go online, and there are online classifiers that let you make these decisions.
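
To make Rob's point concrete, here is a minimal sketch of the kind of off-the-shelf image classification he describes, using a pretrained model (my own illustration, not something from the episode; it assumes Python with a recent torchvision installed, and "cat.jpg" is a placeholder file name):

```python
import torch
from torchvision import models
from PIL import Image

# Load a pretrained ImageNet classifier and its matching preprocessing.
weights = models.ResNet50_Weights.DEFAULT
model = models.resnet50(weights=weights)
model.eval()
preprocess = weights.transforms()

# "cat.jpg" is a placeholder for any image you want to classify.
img = Image.open("cat.jpg")
batch = preprocess(img).unsqueeze(0)  # add a batch dimension

with torch.no_grad():
    probs = model(batch).softmax(dim=1)

top = probs.argmax(dim=1).item()
print(weights.meta["categories"][top], f"{probs[0, top].item():.2f}")
```
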
One of the very, very early uses of these things in financial markets was counting cars in parking lots, or identifying cars in parking lots, or identifying the levels of oil in silos, or trying to predict crop yields, things like these. With those kinds of questions, it doesn't matter how many people are looking at the field trying to forecast whether it is going to be a high-yield crop or a low-yield crop; that doesn't change the success of detecting that yield.
It doesn't matter how many people are looking at whether or not that's a cat in the video or the image; that doesn't change your success rate. That's a static problem, and the more data you can throw at it, the more training samples you can throw at the problem, the better your algorithm will be, up to some asymptote.
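
That "better up to some asymptote" behavior is easy to demonstrate. The sketch below (again my own illustration on synthetic data, assuming scikit-learn is available) fits the same classifier on progressively larger training sets; test accuracy climbs quickly at first and then flattens out:

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

# A fixed, stationary classification problem with held-out test data.
X, y = make_classification(n_samples=20000, n_features=20, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.25, random_state=0)

# More training samples help, but with diminishing returns.
for n in [50, 200, 1000, 5000, 15000]:
    clf = LogisticRegression(max_iter=1000).fit(X_tr[:n], y_tr[:n])
    print(f"{n:>6} samples -> test accuracy {clf.score(X_te, y_te):.3f}")
```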

What are the challenges there? Well, one challenge is an abundance of features: the more things you potentially know about a data set, the harder it is to glean what's true. Another is noise: the noisier a data set, the more pixelated an image, for example, the harder it's going to be to get your answer. But again, the more data that you have, the more training samples that you have, the better your results will be.
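
Both challenges, too many candidate features and too much noise, show up clearly in a small experiment (once more a sketch of my own, not from the episode, assuming scikit-learn): with only a handful of informative features hidden among hundreds of irrelevant ones plus label noise, a nearly unregularized fit chases the noise, while an L1-regularized fit prunes spurious features and typically generalizes better:

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

# 5 informative features buried among 495 irrelevant ones, with 10% label noise.
X, y = make_classification(n_samples=500, n_features=500, n_informative=5,
                           n_redundant=0, flip_y=0.1, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.5, random_state=0)

# Large C is effectively unregularized; small C encourages sparsity via the L1 penalty.
for C in [1e6, 0.1]:
    clf = LogisticRegression(penalty="l1", solver="liblinear", C=C).fit(X_tr, y_tr)
    kept = int((clf.coef_ != 0).sum())
    print(f"C={C}: {kept} features kept, test accuracy {clf.score(X_te, y_te):.3f}")
```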