I intentionally avoid the world of quantitative investing on this podcast. The whole point of this format is to learn about many different fields, and the vast majority of my time is already spent in the quant world. Occasionally I’ve broken this rule because of something unique, including this week’s conversation with Richard Craib, the founder and CEO of Numerai. If you listen to the podcast often you’ll have heard me reference Numerai, a hedge fund which blends quant investing, cryptocurrencies, crowdsourcing, and machine learning. Talk about a PR company’s dream. One important note: Numerai is both incredibly open and very secretive. You may sense a bit of frustration on my part, but that is only because I am a fellow quant who loves details about data and modeling, and we couldn’t go deeper into the details on the record. We discuss how Numerai has created an incentive structure to work with data scientists around the world in an attempt to build better investing models. The idea of having data scientists stake cryptocurrency in support of the quality of their models is fascinating. Like many hedge funds, Numerai doesn’t share its track record, so we don’t know if this works, but I hope you, like me, use this conversation as inspiration for how different technologies can intersect.

Hash Power is presented by Fidelity Investments

Please enjoy my conversation with Richard Craib.


For more episodes go to InvestorFieldGuide.com/podcast.

Sign up for the book club, where you’ll get a full investor curriculum and then 3-4 suggestions every month at InvestorFieldGuide.com/bookclub.

Follow Patrick on Twitter at @patrick_oshag


Show Notes

2:32 – (First Question) – How he came up with Numerai and how it’s related to his background

4:08 – How he works with and models the data for his system

5:24 – Describing machine learning as it relates to his work, and specifically linear regression

7:11 – The important stages in his sequence

8:46 – How the scale in the number of data scientists they use is different from other areas

11:30 – Which is the most important aspect of creating alpha: their data, the algorithm work, or the proprietary ensembling of those algorithms

14:30 – The idea of staking in blockchain

17:30 – Does the magnitude of the stake matter in blockchain

19:10 – Understanding the full incentive structure for both staked and unstaked work

21:07 – How is the prize pool determined

22:29 – Philosophy on how to source interesting data

26:11 – His thoughts on the crowd model and the wisdom of crowds

27:12 – The size of stakers for Numerai

27:51 – Interpreting the models and knowing when something is broken

30:03 – How they think about people not submitting their models

31:48 – Their model building

32:39 – Most interesting set of things they are working on to improve the overall process

35:38 – The Market for "Lemons": Quality Uncertainty and the Market Mechanism

37:11 – How people can come along with their own data

39:00 – His thoughts on the quantitative investment community

40:44 – What else is interesting him in the hedge fund world

44:03 – Building a marketplace and staving off competition

46:16 – Kindest thing anyone has done for him




00:00:04 Hello and welcome, everyone. I'm Patrick O'Shaughnessy, and this is Invest Like the Best. This show is an open-ended exploration of markets, ideas, methods, stories, and of strategies that will help you better invest both your time and your money. You can learn more and stay up to date
00:00:18 at InvestorFieldGuide.com. Patrick O'Shaughnessy is the CEO of O'Shaughnessy Asset Management. All opinions expressed by Patrick and podcast guests are solely their own opinions and do not reflect the opinion of O'Shaughnessy Asset Management. This podcast is for informational purposes only and should not
00:00:39 be relied upon as a basis for investment decisions. Clients of O'Shaughnessy Asset Management may maintain positions in the securities discussed in this podcast. I intentionally avoid the world of quantitative investing on this podcast. The whole point of this format is to learn about many different fields, and the
00:01:05 vast majority of my time is already spent in the quant world. Occasionally I've broken this rule because of something unique, including this week's conversation with Richard Craib, the founder and CEO of Numerai. If you listen to the podcast often, you'll have heard me reference Numerai, a hedge fund which blends quant
00:01:19 investing, cryptocurrencies, crowdsourcing, and machine learning. Talk about a PR company's dream. One important note: Numerai is both incredibly open and very secretive. You may sense a bit of frustration on my part, but that is only because, as a fellow quant who loves details about data and modeling, we couldn't
00:01:34 go deeper into the details on the record. We discuss how Numerai has created an incentive structure to work with data scientists around the world in an attempt to build better investing models. The idea of having data scientists stake cryptocurrency in support of the quality of their
00:01:47 models is fascinating to me. Like many hedge funds, Numerai doesn't share its track record, so we don't know if this works, but I hope you, like me, use this conversation as inspiration for how different technologies can intersect. Like the Hash Power documentary, this episode and other Hash Power
00:02:02 singles are brought to you by Fidelity Investments, a company that is constantly researching and experimenting with emerging technologies like crypto assets and blockchain to improve the lives of their customers. Fidelity provides a comprehensive set of products and services to individual investors, employers, and financial advisory firms. For
00:02:19 more information, please visit Fidelity.com. Please enjoy my conversation with Richard Craib. Richard, thanks so much for joining me today. You represent such an interesting intersection, which is my favorite kind of investing conversation, between traditional quantitative investing, machine learning slash artificial intelligence, but also
00:02:44 blockchain technology, and I've always used Numerai as one of the interesting examples of people using blockchain for something that seems to make a lot of sense. So maybe you could begin just by giving us a touch of background on how you came up with this idea, maybe how
00:02:57 that's related to your own history investing, your own professional background, and then we'll go as deep as possible into some of the interesting features of the Numerai model. Yeah, the first set of ideas for Numerai came when I was working as a quant,
00:03:11 and there was some quant work at the fund I was working at, but there wasn't any machine learning that they were doing. So I decided to spend a year researching how we could use machine learning, and then I finally found something that I thought was quite good. But at the
00:03:25 same time I was also playing in data science competitions on Kaggle, and I sort of thought, okay, well, the best way to get really good performance is to get the world to compete on the datasets. And that was true of nearly everything that had ever been done. The
00:03:43 Netflix Prize was won way better than the people at Netflix could do, and then every other tournament on Kaggle that was won was solved way better than the companies in the competition could do. And at this time I was also reading about Bitcoin, and
00:03:57 I thought, you know, maybe there could be something interesting where finance, which has kind of stayed the same for a long time, could be changed by these two technologies, blockchain and machine learning. Maybe you could describe the actual process here behind outsourcing, if you will, the analytical or
00:04:14 the model-building piece. We're going to spend a lot of time on data and what kind of data it is and what you do to it to make this possible, but if you could just describe for those unfamiliar the actual sequencing of what you put out there, how
00:04:25 you take that and incorporate it into your system. Well, we have a large dataset that is built up from different kinds of financial data, and it's structured in a way where you have these target variables, and you can use the data we give to make a model that predicts
00:04:41 from the features to the targets. And we decided to give that away. So no hedge funds really want to give away their data, but Numerai was the first hedge fund to give away all of its data for free. But all the data, it contains all the patterns
00:04:58 of the stock market, but you have no idea what the data is. So you have to use machine learning to basically find a way to model the data even though you don't know what feature one means or feature two means. And that allows us to share our
00:05:14 datasets with anybody without them being able to run off and start their own hedge funds with them, but they are actually kind of incentivized to come back and submit predictions to us, which we then trade. Maybe we could use this as an opportunity to kind of
00:05:27 level set on some terminology and describe a generic machine learning process as it relates back to the more traditional, like, linear regression or things you might be a bit more familiar with. So you mentioned this idea of features and targets. Maybe if we relate that back to
00:05:41 what the actual data is that you upload. So when I looked at it this morning, you know, there's a whole bunch of identifiers that could be securities, there's sort of what I would call blind factors, so some that you've regularized, which are kind of numbers between zero and
00:05:53 one. Maybe you could describe kind of what that might represent. I would think about that as like input variables. And then you've got outcome variables, or targets, or labels, whatever. Maybe you could describe that generic process of machine learning as it relates to something more simple
00:06:05 like linear regression. That would be a good jump-off point for the rest of the conversation. Yeah, well, machine learning is regression, and so it really isn't much more than curve fitting to datasets. So one way to think about it is, if you have a bunch of points
00:06:20 in like a scatter plot on the X-Y axes, you can find a line that fits those points with a regression. But you actually don't need to know what the X and Y axes are to find that line. And on Numerai there are fifty dimensions instead of
00:06:39 two, I think it's fifty right now, and then the users find curves that fit those high-dimensional spaces. And once you have a regression, you have a model to predict something. And so you're finding classes of points that should be, you know, labeled with the target of zero,
00:06:59 which might mean the stock is going to go down, or classes of points that are all labeled one, which might mean the stock's going to go up, and the curve that you find using machine learning will hopefully be very accurate in predicting new data as well. So I always think about these
00:07:13 things, especially quantitative processes, as sort of all following a similar rubric, and what's interesting about Numerai is how differently it's doing certain parts of this. So first I'd be curious if you agree with this general framework, which would look something like engineering the features themselves, doing whatever transforms
00:07:27 you need to, in your case regularizing them, selecting the features, the fifty you mentioned, in the case of machine learning, like, hyperparameter tuning, and then you've got to select the models that are going to work, that's being built by your audience, ensemble them maybe, and put it
00:07:40 into production. Do you think that that is kind of, is that how you would describe the stages that are important in this sequence? Yeah, that sounds kind of right. I mean, the one thing that people can think with something like Numerai is that we are trying to
00:07:52 crowdsource stock tips, but really it's not like that at all. Like, no one's telling us any specific stock to buy. We don't think it makes sense to crowdsource that. It does make sense to do machine learning competitions where you set the problem up in a very
00:08:10 specific way and you ask people to solve that. And so the regularization step, which is kind of like normalizing the data, making it seem like something that can be modeled and that doesn't seem too non-stationary, then taking that is the part where
00:08:27 we let our users... So our users aren't doing any insight into financial data; they just have to do the machine learning piece. And that's an important distinction with Numerai, where that's the piece that we want people to work on because that's the piece where the crowd is
00:08:43 going to have a large edge over just one person doing it. Can you describe that edge in a little bit more detail? So I feel like there are two key areas of edge. The first is the data itself, and one of the things we find is the
00:08:55 more time you spend on getting a unique or really clean dataset, you know, that drastically improves the outcomes of this exercise. And then the second would be the competition that you're running, effectively and in an open way, which is ever-increasing scale at fitting that curve. Maybe
00:09:11 just talk about the dimensions of that scale, like where are there diminishing returns? You know, if I'm a hedge fund, or one of these huge hedge funds, quant hedge funds, that has, call it, a hundred PhDs that are working on a custom dataset, why wouldn't there just
00:09:23 be diminishing returns to the next fifty PhDs looking at the same thing? There are only so many machine learning tactics, you know, it changes, but it's not that crazy. So where does that belief that going from a hundred to however many data scientists
00:09:36 you have in the world working on this competition, where is that scale, like what's being differentiated? There's definitely, you know, kind of only a handful of machine learning algorithms out there that people regularly use, like neural nets and decision trees, random forests, support vector
00:09:52 machines. There aren't that many of those. So it is true that anybody could kind of run through each of those and then just ensemble them, but really what you're trying to get at is much more creative approaches where there can be certain kinds of emphasis on certain things
00:10:09 that you wouldn't be able to do just by choosing a whole battery of machine learning tests. So some of what our users have done, they definitely are using a lot of neural nets, but some of what they've done is very peculiar. Like, we can't tell
00:10:22 exactly what they've done, but it ends up having really, really strong performance. So I think it definitely helps, and then it also helps in a technical sense. It always helps to have more, so even if it is diminishing, it still helps to have more. But it wouldn't work if
00:10:37 Numerai had to hire every single one of them. We would be the biggest hedge fund in the world by far, by a hundred x, if we hired everybody to actually sit in our office and work on this. So we can do things in a better, more efficient way
00:10:51 by letting people do whatever they want, use whatever programming language they want, use whatever algorithm they want, and that makes it very competitive and worthwhile to do it in this way. And unlike many other things, with financial data, if you have a fifty-two percent edge,
00:11:07 to turn that into a fifty-three has a really, really big impact on your Sharpe. It could be a difference in the Sharpe of one or two. So that's why it is particularly helpful in this case. I don't think you can really have a Numerai for
00:11:22 health care or a Numerai for other kinds of data, because they never have that kind of difference between a fifty-two and fifty-three percent edge. So if there are kind of three parts of this creation of alpha, I'm assuming that's the end goal here, the first
00:11:36 being data, you know, getting the data that you think matters, we'll talk about that, the second piece being what you just described, which is this kind of global network of data scientists working on the best algorithms, and then the third being how you ensemble or create sort
00:11:48 of a meta-model, I guess, of the best models that people have submitted, which of those three, or how do you assign the importance or the impact of those three categories on your outcomes, on the alpha itself? Which is the most important, by how much? Arguments for and against.
00:12:02 Which ones were those? So you've got three primary sources of edge. Let's call it the data that you have, what the data is, how you're collecting it, how you're normalizing it, you know, that's kind of proprietary, it's happening at Numerai, no one in the
00:12:13 world knows what that data is; the algorithm work that the crowd is doing; and then the proprietary ensembling of those models. I think that data is very important. I think the founders of Two Sigma, I'm not sure they've said it publicly, but they often say that
00:12:30 it's the data that's the most important thing, and that is kind of true. If we gave out a problem that didn't make sense in the first place, say it didn't take costs into account, or was tradable only in countries that you actually
00:12:45 can't trade in from the US by law, anything like that, the algorithms would then find inefficiencies, but that wouldn't really matter because you couldn't actually execute on that. So that's one part of the data, and then the other part is having it be clean and very long.
00:13:01 So I think data is critical, and Numerai doesn't allow our users to bring their own data. They have to use our data, and so we are limited by the data we have, but they can use any algorithm. So I think, yeah, data and algorithms are always good.
00:13:20 If you have very bad data, it won't work. If you have a very bad algorithm, it won't work. But I think the critical thing actually is not really in the list of three things, because the ensembling of the models, we don't use anything... we used to
00:13:34 use a kind of very intricate way of ensembling the models together, but since we released Numeraire, our cryptocurrency, that actually had this massive impact on everything, where the people who stake their model with the cryptocurrency, basically they're making a bet on their model by betting
00:13:54 cryptocurrency, those users end up having much, much better models. And I think it was actually that piece of, like, game theory, changing the game a little bit, you having skin in the game, that actually is the biggest thing we've done by far, because it
00:14:10 captures kind of what's going on in normal hedge funds. If you invest in Numerai's fund, you know, I have a whole bunch of my own money in Numerai's fund, so the people who invest are kind of investing alongside me. So we all have skin in the
00:14:21 game and everything works. But if you don't have that, then things don't work very well, and blockchain is actually kind of a neat way to get people to have something to lose. Can you describe that mechanic itself? So maybe say a little bit more about the currency that
00:14:35 you created, the genesis of that idea, and then this idea of staking. So I think some people listening will be fairly familiar with staking, but a lot won't. So maybe just describe that process and why blockchain and crypto are uniquely set up to do that. Well, we
00:14:49 started off at Numerai paying people, in December 2015, right, we started paying our users in Bitcoin. And it wasn't really like anything ideological or something. It was just way easier to pay in Bitcoin because our users were anonymous and they were all around
00:15:06 the world, and so it was much easier to use Bitcoin for payments. But because I had been involved and interested in Ethereum, and invested in Ethereum, I was curious about how we could use Ethereum, and a lot of our early investors were people in
00:15:22 the blockchain space, like Olaf Carlson-Wee, Fred Ehrsam, Joey Krug, these are all, like, my friends in Silicon Valley. And so we started talking about, could Numerai use its own cryptocurrency in some way? And of course there isn't really any good reason to have your own
00:15:36 cryptocurrency if all you're going to do is make another currency. But if you make an application that needs smart contracts, then having your own token makes a lot more sense. And so with Numeraire, what you could do is stake your predictions that you send to
00:15:56 us, and if your predictions do well, then you earn more money, and if they do badly, then we destroy your stake. So you'd never want to submit predictions unless they were good, because you wouldn't stake it unless they were good. So
00:16:10 it's a big filter for us. And the problem with something like Numerai, or any crowdsourcing on the internet, is that the cost of spam for the spammer is low. It's easy to send millions of emails and hope that someone clicks on one of them. It's
00:16:24 easy to make loads of accounts on Twitter. All of these kinds of things are kind of what we're used to with the internet, but in the real world, because there's a cost to doing things, people do the right thing. And that's what we wanted to make happen on
00:16:38 Numerai. We wanted to add a little cost, to say, are you willing to put ten dollars on your predictions? If you're not, we don't really want to use your predictions. And so that has this really nice impact. And why you kind of need a blockchain for this, like, you might
00:16:52 say, well, what if you just wire the money? But somehow it doesn't work, because what you're doing with the stake is you're having the blockchain be the escrow. Because if we just said, wire us your money, we promise to destroy your money if you
00:17:08 do badly, and we promise to give it back, well, how can you know the promise is going to be honored? But if you do it in a smart contract, then you know everything's going to be honored. And therefore the stakes on Numerai
00:17:19 have been really large. I mean, over a million dollars has been staked on Numerai tournaments. A lot of that's been returned and some of that's been destroyed, but it is amazing what you can do with something like this. Now in terms of the efficacy, is there a bigger gap
00:17:34 between staked and unstaked algorithms, or between, like, small stake and large stake? Does the magnitude of the stake matter, or is it more of a binary, that just when people stake anything you get much better results? That's a great question. I was very curious about how that
00:17:48 would work. Anyone who's not staking, when we looked at it, it was something like you get a Sharpe of about one, and if you take the people who are staking, the Sharpe goes to two point one. That's a very big difference. You've got to think that the
00:18:02 floor Sharpe is actually pretty high, because the data that we're giving away is already decent. Even if you just use regression you get to a Sharpe of one. But then to go to a Sharpe of two point one is amazing, and that's just anyone who stakes more
00:18:16 than one NMR. And what you find is, if you say, well, what if they stake more than fifty or something, then you get fewer and fewer models, and those models actually don't work that well. So if you're staking a really large
00:18:30 amount, it can often be something strange that you're trying, and then it doesn't work very well unless it's balanced out by others. So we actually see that it's good to use any model that's staked with any number of NMR, and if you use them all together then
00:18:46 the average is really, really good. Do you still accept unstaked submissions? Yeah, we do. We let anybody submit to us, and there's no cost to making the account, submitting, or downloading the data. And the reason you might do that is because you can still kind of get some
00:19:03 practice, and if you win four weeks in a row with a free model, then you might want to, like, start staking, then buy some NMR and start staking. So say a bit more about the full incentive structure here for both staked and unstaked. So if
00:19:17 I'm somebody that doesn't want to stake anything and I just want to submit an algorithm, and like you said, I win four weeks in a row, like, is there a payout associated with that? If you don't stake at all, you can win very small amounts for, like, verifying your
00:19:29 phone number, and then if you win, like, you get zero point one NMR. And the problem with anything when you're giving it out for free is that it's going to be, like, gamed as much as possible. So we have to have those incentives
00:19:41 be pretty low to prevent people from gaming it. But once a user starts to stake and starts to win, then they end up becoming really, really active. There are some people who've never missed a stake since we started it. And the payouts there, what's the structure? So do you basically win
00:19:59 what you stake? Like, how is the return on staking structured for stakers? Well, as with a lot of things on Numerai, it's kind of complicated. Again, you have to be careful about people making multiple accounts, so any system you design has to be, like, Sybil resistant. And I think if
00:20:14 anybody's been following anything about Ethereum, Vitalik, whatever topic he's writing about, is always talking about how you make something Sybil-attack resistant, because you basically can't rely on any identity. So on Numerai, when you stake NMR, you're part of a staking auction, and in that
00:20:34 auction we're basically trying to allocate money to the people who are staking, and you kind of want the auction to be such that there's no incentive to make another account and submit again to kind of game that. So what we do is sort of a variant of a multi-unit Dutch
00:20:52 auction, I think, and then people, like, do their stakes in that way, and that way, like, the more you stake the better, but it's also not, like, overpowering to really, really rich users who might stake loads just to try to win the whole prize pool. How is the
00:21:08 prize pool determined, like the size of it, and how has that changed through time, or has it changed through time? And maybe just to put, like, some tangible numbers around this, what would be an example of a data scientist, I know they're all anonymous, but some data scientist that's been active,
00:21:22 and how much they've taken home as a result of their work on the platform? Well, we definitely have actually had some that have made more than a million dollars, which is crazy, and that's because the current value of our tokens has been very high compared to, say, how much
00:21:37 work it took initially to win some of the NMR before it was launched. Some users did incredibly well. Now we pay about five thousand dollars per tournament per week, which is about twenty-five thousand dollars per week, a hundred thousand dollars per month. So it's very
00:21:53 profitable to people who get into it. It's quite strange to me that, I mean, I started Numerai very inspired by the Netflix Prize, which went on for a very long time, and it was only a million dollars, and now Numerai has paid out over eight million dollars
00:22:08 since starting. And like you said, a lot of that, I know if you look at the chart of Numeraire, obviously it had, like, an insane, I don't know what the network value was at the peak, but something remarkably high, and as with all crypto I
00:22:20 think it's settled down lower than that. But it's this neat way of engaging a big and smart audience. Again, we talked about the importance of the data. There, I'd just love to hear your philosophy on what features and what targets to feed the system. I understand that the whole
00:22:36 point here is that you're not going to reveal, like, what the actual data sources are, but just, like, the philosophy around what to choose, where to look for it, how to source interesting data, what targets to tie it to. Like, obviously, you said before, maybe zero is bad returns
00:22:50 and one is good returns, or something simple like that, but there are all sorts of outcomes in markets that could be binary as your targets that you're trying to fit to. So I'd just love to hear your philosophy on how to choose the data, what to feed, and how to
00:23:02 choose and feed targets as well. I think for any of this data you want to make sure that it's normalized in some way. So, I'll tell you, I mean, for Numerai's data, one thing we, I think, have been open about publicly is that we don't use alternative data or
00:23:19 any data that's itself kind of already very new. We try to use very clean, very long, structured data, and, you know, we're already doing a lot of complicated things in one company to then also use very, very complicated new data. But once you have that, one of the
00:23:37 things you're trying to do is make the past look a little bit like the present. So if you had a hundred pictures of people's faces, it would be way better if they were all looking in the same direction and had the same lighting. It's just way better if all
00:23:53 the faces were like mug shots, because then they're all looking like they're from some kind of similar face distribution. But if there are people looking to the side, people with glasses on, then it starts to be irregular, and it's much better to have regular data. So you
00:24:08 kind of want to make the past look something like the present by making the past normalized in some way, regular in some way, and that's what we're trying to do with the features. And the same thing for the target variable. It turns out you don't
00:24:24 really want, when you make a hedge fund, you don't really want to make money when the stock goes up. You kind of want to do it on a risk-adjusted basis. So maybe your target of one should be something like a risk-adjusted return, and
00:24:38 that's similar, kind of like your goal with regularizing your training data, you also want to regularize your targets. So Numerai is just a very simple binary classification problem. In lots of other financial applications people actually use regression, where, you know, you want to model the exact
00:24:57 return, but we kind of, like, simplify it by making it zero-one. Am I interpreting the piece on data the correct way when I think, okay, this sounds like, if you've got a big long history of stocks and, you know, it's a lot of data on
00:25:10 stocks historically, that you're looking at things like, you know, financial statements and price and things of that nature that are, I would say, let's say, common in sort of the quant research world, things like the sources of data that led to the value factor and the momentum factor
00:25:25 and things of that nature? Yeah, that is the kind of classic quant-style data. Value and momentum and things like that are part of the literature of quant investing, so you sort of have to have those as a starting point, but you can do
00:25:42 lots of other things as well. And again, none of the features on Numerai are that, your feature one isn't value or anything like that, but in principle there are those very common features in the stock market, and so those ideas are somewhere in the dataset.
00:26:00 How much do you subscribe to this kind of old-school wisdom-of-crowds model? I think there are four or five components required for a crowd to be superior to an individual. It's like, you know, diversity of opinion, there can't be groupthink, they can't be
00:26:16 overlapping, got to be decentralized. How do you think about that model? Is that something that you thought about when setting this up? Yeah, I mean, people often talk about the wisdom of crowds and they actually forget all of those requirements. You have to have
00:26:29 certain things for it to make any sense. So if you poll people on the street, you say, what stock should I buy, and you ask a hundred people, that's not going to be a good strategy, because all those people are very correlated
00:26:43 with each other and they'll all just say buy Tesla and sell Snapchat or something, and so you definitely need... Mathematically, if you look at what it means to average two things, if they are collinear then the average is identical to
00:26:59 just using one of them. If they are both orthogonal and they each have their own signal, then they actually can be additive when you combine them. So these things are, like, yeah, very important, and that's the way we think about this. How big is the group of stakers? Let's say,
00:27:15 how many people have staked any Numeraire at any point? In the beginning of the year we used to get sixty stakes per week, and now that's up to six hundred stakes per week. So it's pretty cool that it's being used so much, and it's way more used
00:27:32 than most things on Ethereum. Sometimes there are more stakes, in dollars and number, than there are CryptoKitties being sold. And so it's really interesting to see that maybe blockchain isn't about consumer things like CryptoKitties, but maybe it's about enabling weird companies
00:27:49 that couldn't exist, like Numerai. It's definitely interesting that on a weekly basis six hundred data scientists would be submitting algorithms to a blind data set. You know, it definitely feels like things like that will happen in the future. Like you said, maybe this isn't well suited,
00:28:02 the structure, to the health care world or other verticals, but it definitely shows, just in raw numbers, the power of the crowd. It begs the question for me of, as the person at the top of the centralized piece of this, which is, you know, taking these algorithms
00:28:16 and actually turning them into a strategy for real money, how you think philosophically about the move from old quant, I'll call it, which was, you know, form a hypothesis, test it, more linear, or sometimes nonlinear, but more linear, least-squares-regression-type work, versus something now that is
00:28:33 way less interpretable. So if you're working in fifty feature dimensions: we can all visualize a scatterplot, maybe a three-dimensional one, but once you get to fifty dimensions, in many cases we have no idea really what's going on in these models to come up with the predictions. And
00:28:48 one interesting question in quant is always: how do you know when it's broken? So if you put a model into production, how long do you give it to work? What are the signals that you look for to unwind it and maybe move on to something else? I'd love
00:28:59 to hear your philosophy on whether or not you care about interpretability of what the models are doing, and how to know when something is broken. We definitely don't care about interpretability, and I think we've probably taken that almost as far as it could ever go. If
00:29:12 you think about the way Numerai works: we do not build our own models, we just rely on the people to build the models. So the data scientists who are building on Numerai's data, they're operating on data that they can't see; they don't know what
00:29:28 it is. And then when they submit predictions to us, they are not giving us their models; we don't know how they made their models. How interesting. They're just giving us the predictions, so we're like licensing predictions from them, but we never see their code. So what we
00:29:45 are putting into production is the raw predictions, without even knowing the algorithm that was used to create them. So we don't know what the model is, and it was built on data that the data scientists don't know either, so it can't be less interpretable. Are you concerned at
00:30:04 all, from like a supplier-power standpoint, with that? It's great; it surprises me that you don't get the model itself, you're just getting the predictions. So how do you control for, you know, the people that build the best model not submitting their predictions anymore in the future? Did you think
00:30:16 about having the process be that they actually have to give you their model? I'm curious how you thought about that. Yeah, it does make a kind of tension, a good tension, between us and our users, because they really can have this genuine threat of not coming back
00:30:32 if we don't pay them or we do something stupid. And I like that tension. And I think I see other companies sort of doing things similar to Numerai, like this one, Quantopian, where they have all the user models just
00:30:48 sitting on their servers, and they say, "Well, we promise, you know, we won't use them without asking your permission." But that's not really part of the crypto zeitgeist, which is like: you actually want to be able to genuinely jump out of the systems
00:31:06 to express a certain thing, and be engaged in a kind of trustless relationship. You know, we are not trusting our users with our data, we're totally obfuscating it, and why should they trust us with their models? And I think that's the way things are going, and
00:31:22 in most systems, you know, when you don't feel locked in you end up being more engaged than if you were locked in, in a certain sense. I remember that, I think with PayPal, there was this thing that the easier they made it to withdraw the money that you had
00:31:36 in PayPal, the less people would withdraw, because they always knew there were all these ways of withdrawing. And I think that's a nice way to go with something like this: make it as easy to leave as possible. You mentioned that you don't build models internally. Do
00:31:50 you build a model of models on top of the ones that have been the best performing? Well, we actually did do a lot of complicated things with making a sort of meta-model that was based on doing machine learning on the user models to do the ensemble, but
00:32:05 it turns out, yeah, with the power of the NMR staking, it's very hard to beat taking a simple average of the staked models. We have a few things that help besides, but that ends up being a really good baseline. And so, yeah, it's almost like the
00:32:22 users are self-selecting by choosing to stake or not. So they're telling us they should be in the meta-model, and what they're telling us is true. Looking forward, what is the most interesting, I guess, set of things that you're working on that you
00:32:35 think might improve the overall process? So the move from not having the cryptocurrency to having a stakeable cryptocurrency sounds like maybe one of the major, if not the major, evolutions in the success of the model, to a two point one
00:32:48 Sharpe. What are the things, looking forward, that you think will keep you at the edge of the competition? Some big ones that we've done recently were improving the staking auction in some of the ways I described earlier, where you make it rational only to submit your best model.
00:33:03 That's had a huge impact, taking us even further on Sharpe. And then also releasing multiple tournaments: right now on Numerai there are actually five simultaneous tournaments at any time, and users can, you know, maybe be good at one but not the other one, and so we're getting more diversity
00:33:21 of models because the target variables are different in each of those. So that kind of thing is very interesting. We released the five tournaments, and it only took a couple of weeks for the average person to stake in four point two five of the tournaments.
00:33:39 So it's sort of like this amazing thing where our data scientists are totally ready for whatever data we give them, and they're willing to try lots of different techniques on all these tournaments. So I think, if you think about another hedge fund, imagine you had a
00:33:56 hedge fund that had a very good global equity model, but then they decided, you know, "We should do a currency thing." That would sort of take many years of hiring people and getting a whole bunch of new things set up before they could even do anything. And for us
00:34:10 it's like, "Oh, we found the currency data set, let's stick it on and see what people can do with it," and then very quickly have the best model in the world for that data set. So it's really nice to be able to do that. So there's lots of progress
00:34:25 we can make just by releasing new data sets, and every single thing we've done like that has improved the Sharpe, so it's not clear whether that will end. And then going forward, I think the one interesting thing that we've sort of hinted at,
00:34:39 which we haven't announced yet, is the idea of: could we crowdsource the data part? So we are crowdsourcing models; could we get people to give us data? And if you just did that simply, like, "Hey, everybody who has a data set, email me," well, you'd have the
00:34:59 same problem: you'd get a whole bunch of bad data, a bunch of spam. And maybe you can use a blockchain, maybe you can use NMR, to give predictions of a different kind. And so that's something that's very interesting and something we've been thinking about for a
00:35:18 very long time. Is that sort of like an oracle problem, effectively, in crypto? It is, in a certain way. Yeah, we would need to trust something, but I think you can basically... do you know George Akerlof? Yeah, the name sounds
00:35:37 extremely familiar. "The Market for Lemons." He's a Nobel Prize-winning economist. He had this thing about how, if you have a market where there's lots of bad cars, then none of the good cars will be in that market. And in financial data there's a similar thing happening, where if
00:35:59 someone were to email you and say, "Hey, I have an amazing model that predicts the stock market perfectly," you know, you don't even reply to that email. You don't even bother, even though what he's promising has insane value. The probability that he's actually got something is so, so
00:36:16 low it's not even worth considering. But maybe if he emailed you instead, "I have this really good model, and I've actually staked ten thousand NMR on the fact that it's good," maybe you'd actually want to read that email and spend a little bit of time
00:36:37 looking at his data, considering you're playing a different game now: you're playing a game where he's actually got something at stake. And I think that's pretty cool. And I think you can extend that to basically wherever there's market collapse. These huge markets do not exist for "I have
00:36:54 signals for stocks." There are always so many questions: Are the signals really going to work? Why isn't the person just trading it themselves? All these things make it impossible for good markets to form around that. I think those markets would form if you had
00:37:10 stakes attached to all of them. Really interesting idea, and so I'm going to ask a clarifying question. The way you're describing it sounds like, kind of, what you're doing is alternative data sources, maybe stuff that hasn't been around all that long, but this sounds like you might
00:37:23 allow someone to come with a predictive signal that's based on their own data, so long as they staked it. Or is it just that they have some interesting data and they're staking the idea, or they're willing to put a stake behind the idea that if they feed a raw
00:37:39 data set into the other part of your engine, the data scientists, they'll find something compelling? Which of those two is it? It's really more just, right now on the website, when you upload predictions, there are all these abstract security IDs; you know, like you
00:37:53 don't know what they are. But if you could upload to Numerai's website that you think Apple's going to go up and you think Google is going to go up, and you could actually give real ticker symbols, then you could be generating that up-and-down signal from arbitrary
00:38:09 data that you have, whatever. So it's both. At the end of the day the submission is still a signal, not just a raw data set. Got it. Someone couldn't say, you know, "I've created this amazing, you know, whatever it is, data set on all companies, but I don't know
00:38:23 what the hell to do with it, so have your data scientists figure it out." Instead it's: no, just do whatever you want to do, but you're submitting a signal and you're staking something on the predictions embedded in that signal. Yeah. Yeah, understood. I think, like, opening
00:38:37 Numerai up to let other people use it, or to let other companies use it, or other industries, I think it's just the wrong approach. Instead it's more about individuals like you. If you have some data that you want
00:38:53 to follow the protocol to upload, then this could be a really good place to put it, especially if you want to stake on it. I would love to hear your impression or take on the rest of the, maybe even specifically, quantitative investment community. You said you came
00:39:07 from what sounds like a more traditional hedge fund where you were working on machine learning when maybe, at the time, not many other people were. You mentioned Quantopian, Two Sigma; you mentioned some of the names in the quant world. I'm just curious to
00:39:20 get your take on where you think that world is in relation to some of the topics that we discussed. So how many of the big players are actively using machine learning techniques versus the kind of old traditional way of thinking about quant? Yeah, I think definitely machine
00:39:39 learning is not as well used as people might think by the big ones. They might have teams that are sort of doing machine learning, but probably none of this stuff is actually in production. And I think places like Renaissance don't use machine learning, from what I know from some
00:39:55 of the people who work there; that's just not what they do. And one of the reasons for that is that they just kind of have an edge: they have a really good data edge. Supposedly they've been collecting stock market data long before anyone else
00:40:09 and actually storing it in databases long before everyone else did. Their data edge is just so big that they don't need their modeling edge to be very big. That's counterintuitive. Like, I guess the brand of Renaissance is not, you know, "We have the best data"; that's
00:40:22 not what they're saying. They're stocked with the smartest people, but maybe they just have the best data. That's why I asked the question earlier about, like, if you had a pie chart of the importance of the data you have versus the... I guess it's the informational versus analytical
00:40:34 or modeling edge that everyone always talks about: which is more important? It's, you know, interesting that maybe at places with all these PhDs, maybe the edge is a data one, not a modeling one. Really interesting. What else is exciting to you in, I guess, the hedge
00:40:47 fund world? It's interesting to hear all of these very crypto-like philosophies applied to a very old-school, unchanging model. So are you encouraged by that overlap, or are you frustrated by it in any interesting ways? You know, I'm assuming you're set up with prime
00:41:05 brokers just like everybody else, and you have the same legal teams as every other, you know, headline hedge fund. I'm just curious about your take on the intersection of those two worlds and any surprising reactions you might have had to them. I
00:41:17 think one of the interesting things about crypto is, when you're designing something in crypto, you're really thinking about what's the best thing; you're not thinking about making money. And this could be more true of, say, someone like Vitalik. When you're thinking about designing proof of stake or something
00:41:38 like that, the last thing on your mind is money. It's just: how could you get the system to work, and all these different actors to behave in a certain helpful way? And I don't think anybody has thought about the hedge fund industry
00:41:53 with that lens. So I'm trying to be the Vitalik of hedge funds, and basically look at the stock market and say: well, if you're an alien coming to Earth, you're like, well, clearly capital allocation is a super important problem. You need all the money to go
00:42:11 to the right places, otherwise society won't move forward and technologies won't be created. And so this is like a really important thing, and the way that the humans have set this up is: well, let's have a huge, weird, zero-sum-esque game where there's just, like, all
00:42:28 the smart people from the top universities joining hedge funds that are identical to other hedge funds in every single way, except that their friends are at that one and they're at this one. And they're all buying the same data, and none of them are sharing any knowledge about what's going
00:42:41 on. It seems like a Schelling point; it doesn't seem like a super Nash equilibrium. There's probably some technologies we can use to make all this much, much better. Can you describe Schelling point, just not to take it for granted? Something that is the... is the Nash equilibrium. I don't
00:42:56 have a definition with me, but anyway, it is an equilibrium, but it's kind of like a local optimum or something; like, it could be way better if certain things changed. That's what Numerai's been doing, where you have things like: well, we want you
00:43:11 to help us, work on this data with us, but we don't want you to run off with the data. If we could just solve that, we'd have a completely different system; we'd actually have a distributed hedge fund, which has never been done. And so that's the problem
00:43:25 we first worked on. And then we're like, well, now how do we get people to really express their beliefs? And then we, like, used Numeraire and made a cryptocurrency that did that. And yeah, so the kind of game design of the company is
00:43:43 the company. We are not trying to do anything except make a system that is resilient and, long term, going to be way, way better. So yeah, that's sort of my take on the hedge fund industry, and I mean, I like that, in a way, no
00:44:01 one else is coming at it from that perspective. One kind of final question on your strategy: I'm obsessed with marketplace businesses in general. I just think it's so interesting when a business has to build two sides of a market. Effectively, you know, you've got data scientists on
00:44:15 one side, investors on the other side, and you're the thing in between. The concern would be that somehow a competitor comes up and spins up something that pays out more, or in some way caters better to the data scientists, has a better audience-building strategy.
00:44:31 So I'd just love to hear your thoughts on that: like, how you reach data scientists, how they become aware of Numerai, whether that's very intentional versus just word of mouth. Maybe this is word of mouth, but if it's not word of mouth, like, strategies you have for building that
00:44:43 audience and for keeping them happy, given that they represent, I think, such a large portion of your advantage. We didn't do any advertising or anything like that, but Numerai, because it had all these unique properties, ended up being a PR dream. Yeah, it sort of became like,
00:45:00 there was a lot of press about Numerai. And it's very interesting to me how, if you ask a Silicon Valley software engineer to name three hedge funds, Numerai would maybe be one of them. It's kind of crazy. But yeah,
00:45:15 that was a big part of getting users. And yeah, I mean, everything we've been doing has been, like, zero to one, so we have made some mistakes, and some of our users get grumpy with us about certain things, and we
00:45:30 have been kind of innovating faster than we've been maintaining, in some sense, in some parts of what we're doing. But I do think that our users are... you know, at any given time, if you're holding something in the markets, it's because you think it's better than selling it,
00:45:46 and we have loads of users holding NMR and staking every week, and that's up ten X since the beginning of this year. So I like where we are in terms of all the important things. And, you know, I've never understood cryptocurrency valuations, so it's hard to know how to
00:46:03 think clearly about that when they're so correlated, even when Numeraire is so different. But yeah, I think it'll be hard for another company to come and do something similar to Numerai. I think there's a lot of things under the hood as well that can't be seen. Well,
00:46:16 definitely an interesting intersection of some of the more important technologies of our day, so it's been interesting to hear about everything you're doing. My closing question for all my guests is to ask what the kindest thing that anyone's ever done for you is. What's coming to mind is,
00:46:32 in fact, the early parts of Numerai, when I came to Silicon Valley without any connections here, and I was coming from South Africa, and Howard Morgan, the co-founder of Renaissance Technologies, was our first investor. And he kind of has a similar thing, I think, where he's just
00:46:50 not at the time in his life where he wants to just invest in things that are nothing but making money. He wants to invest in very interesting things, and things that could be really interesting for the world and society and things like that. So he
00:47:06 was extremely kind to me by investing in Numerai. Wonderful. Well, thanks again, Richard, for your time, and I hope to catch up as you guys continue to evolve. Yes, thank you. Hey everyone, Patrick here again. To find more episodes of Invest Like the Best, go to
00:47:23 InvestorFieldGuide.com/podcast. If you're a book lover, you can also sign up for my book club at InvestorFieldGuide.com/bookclub. After you sign up, you'll receive a full investor curriculum right away and then three to four suggestions for new
00:47:38 books every month. You can also follow me on Twitter at @patrick_oshag (O-S-H-A-G). If you enjoy the show, please leave a quick review for us on iTunes, which will help more people discover Invest Like the Best. Thanks so much for listening and

Transcribed by algorithms.
Disclaimer: The podcast and artwork embedded on this page are from Patrick O'Shaughnessy, which is the property of its owner and not affiliated with or endorsed by Listen Notes, Inc.

