You can read the whole thread here. Worth it as always with Ryan's tweetstorms.
Faced with the growing volume of companies in the market and the constraints of a $130 million early-stage fund at Sunstone, we did significant work on this problem in 2015/2016.
I had previously been part of the teams at Atlas Venture (2007-2010) and Accel Partners (2012-2013) that tried to leverage structured public data about companies. This was also the thesis of my angel investment in Mattermark. We would aggregate data from all public sources - Crunchbase, SEC, press, social media - to build Markov chains that look at likelihood and quality of both next raise and eventual outcome.
There were always several issues. On the one hand, data quality and availability are poor: the sample (N) isn't the total population, since many great companies are fairly secretive. On the other, the information advantage (IA) was non-existent: what came out of these models was akin to the Correlation Ventures results. I.e., if Sequoia does a Series A in consumer, you should try to get into the Series B. If Matrix does an enterprise software deal, that's probably a good company. Well, no kidding, but good luck getting into that deal if it's doing well.
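To make the Markov-chain idea concrete, here's a minimal sketch (not our actual model, and the transition counts are made up for illustration): funding stages become states, transition probabilities are estimated from observed round counts, and you solve for the probability of eventually reaching a good outcome from each stage.

```python
# Illustrative Markov chain over funding stages. States and counts are
# hypothetical; a real model would estimate them from aggregated round data.
import numpy as np

states = ["seed", "series_a", "series_b", "exit", "dead"]
counts = np.array([
    [0, 400,   0,  20, 580],  # from seed: next raise, exit, or die
    [0,   0, 250,  50, 200],  # from series_a
    [0,   0,   0, 120, 130],  # from series_b
    [0,   0,   0,   1,   0],  # exit is absorbing
    [0,   0,   0,   0,   1],  # dead is absorbing
])
P = counts / counts.sum(axis=1, keepdims=True)  # row-stochastic transition matrix

# Probability of eventually exiting from each state: iterate h = P @ h
# while pinning the absorbing states (exit = 1, dead = 0).
h = np.array([0, 0, 0, 1.0, 0])
for _ in range(100):
    h = P @ h
    h[3], h[4] = 1.0, 0.0

for s, p in zip(states, h):
    print(f"P(exit | {s}) = {p:.3f}")
```

With these toy numbers the model says a seed-stage company has a ~16% chance of reaching an exit. The problem described above shows up immediately: the output is only as good as the publicly observable transitions you feed in.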
So at Sunstone where we are focused on Seed and Series A, we tried something very different: we ignored the company data and instead focused wholly on the founder. We aggregated CV-level data of founding teams across ~5,000 companies and ran a vanilla LSTM neural network on it to make predictions about whether a company would get financed and what the quality of its investor base would turn out to be.
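For intuition, here's what "run an LSTM over CV-level data" looks like in miniature. This is a sketch, not the model we built: each founder CV becomes a sequence of per-career-step feature vectors, an LSTM consumes the sequence, and a logistic output head scores the chance of a raise. The weights below are random; in practice they would be trained on labelled founding teams.

```python
# Minimal numpy LSTM forward pass over a founder-CV sequence (illustrative;
# feature dimensions, hidden size, and all weights are arbitrary stand-ins).
import numpy as np

rng = np.random.default_rng(0)
d_in, d_h = 8, 16  # features per CV entry, hidden size (both assumptions)

# One weight matrix per gate (input, forget, cell, output), acting on [x; h]
W = {g: rng.normal(0, 0.1, (d_h, d_in + d_h)) for g in "ifco"}
b = {g: np.zeros(d_h) for g in "ifco"}
w_out = rng.normal(0, 0.1, d_h)  # logistic readout

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_score(cv_steps):
    """cv_steps: (T, d_in) array, one row per job/education entry."""
    h, c = np.zeros(d_h), np.zeros(d_h)
    for x in cv_steps:
        z = np.concatenate([x, h])
        i = sigmoid(W["i"] @ z + b["i"])          # input gate
        f = sigmoid(W["f"] @ z + b["f"])          # forget gate
        o = sigmoid(W["o"] @ z + b["o"])          # output gate
        c = f * c + i * np.tanh(W["c"] @ z + b["c"])
        h = o * np.tanh(c)
    return float(sigmoid(w_out @ h))  # score in (0, 1): "will this team raise?"

founder_cv = rng.normal(size=(5, d_in))  # five career steps, fake features
print(f"P(raise) = {lstm_score(founder_cv):.3f}")
```

The appeal of the sequential framing is that career trajectories have order: a stint at a startup after a big-company job reads differently than the reverse, and an LSTM can pick that up where a bag-of-features model can't.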
The early results were promising: while data quality was still a problem, we didn't mind as much. As a small fund, it's OK to run the model and end up with a small set of companies that you can then manually screen. False positives were less of a problem than in other use cases. Remember, at that point we hadn't even looked at what the company does. Out of 1,000 screened companies, the model would recommend we invest in ten companies. All of which looked promising.
Getting this right would solve the dealflow issue for a small fund entirely. We were super excited. If this worked, we could focus all our attention on being amazing for these ten companies - both before and after investment.
Not only could we be very aggressive in reaching out to them - we knew we were likely to want to make the investment. So we could buy them flights to come see us. Have them picked up at the airport by a driver. Spend two days with them, really getting to know them. Give them the Zappos treatment. Freeing up all the time spent sourcing and evaluating investments also meant we would be free to spend most of our time helping our existing companies. This was a game-changer. Dealflow would be handled by a machine!
We took our prototype and pitch to Y Combinator. Partially because it was fun to go through the process, partially because this felt like it might be a separate Sunstone fund. We got through to in-person interviews and had a great chat with Sam Altman & company. In the end, YC felt like it would be competitive but offered to introduce us to their LPs.
Alas, as we tried to scale the approach, we found that data quality became a harder and harder constraint. Too few founding teams declare themselves as founders on LinkedIn before their seed round, and many companies that met our criteria had already raised. In the midst of the frustration, our very talented developer Michael Hirn left for the beckoning world of crypto.
Two years on, I'm starting to feel the itch of another attempt. Given the constraints around N, what's critical is generating our own data. I still believe in the IA hypothesis that the founding team is the best predictor of a good venture outcome - at least at Series Seed and A.
To that end, I'd like to develop a founder self-assessment tool that takes into account founder experience, disposition, personality traits, skills, etc. Think a Hogan-type assessment. That data would be correlated with self-reported prior success criteria and publicly available data about the company, and compared to a model of currently successful founders.
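One simple way to "compare to a model of currently successful founders" would be a centroid profile: average the trait scores of founders who have already succeeded and measure how close a candidate's self-assessment sits to that profile. A hypothetical sketch (trait names and all scores are invented for illustration):

```python
# Illustrative only: score a founder's self-assessment against a centroid
# profile built from assessments of previously successful founders.
import numpy as np

TRAITS = ["resilience", "self_awareness", "ambition", "team_cohesion", "management"]

# Fake 1-5 trait scores for three "successful" founders (placeholder data)
successful = np.array([
    [4.5, 3.8, 4.9, 4.2, 3.5],
    [4.1, 4.0, 4.7, 4.5, 3.9],
    [4.7, 3.5, 4.8, 4.0, 3.2],
])
centroid = successful.mean(axis=0)

def similarity(assessment):
    """Cosine similarity between a candidate profile and the centroid."""
    a = np.asarray(assessment, dtype=float)
    return float(a @ centroid / (np.linalg.norm(a) * np.linalg.norm(centroid)))

candidate = [4.3, 3.9, 4.6, 4.1, 3.4]  # a hypothetical self-assessment
print(f"fit score: {similarity(candidate):.3f}")
```

In practice you'd want something richer than a single centroid (successful founders likely cluster into several archetypes), but even this toy version makes the data requirement obvious: the model is worthless without a large, labelled pool of assessments, which is exactly why scale matters.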
We previously assembled this type of data manually for around 100 companies in Europe that were $100M+ outcomes and got to super-interesting results: e.g. you should have at least one immigrant on your team, social cohesion is key (you should have worked together or founded something together), having founded a failed or moderately successful startup is a plus, etc. But we also felt like there were key things we were skipping, questions like: does this person have a good social support system? Are they self-aware? Are they good managers?
A founder assessment like this would need scale - it should be taken by as many founders as possible. Possibly with some sort of "coaching" slant like "we'll let you know what to work on to become a world-class founder." At a minimum I'd like to have 25,000 responses.
Once you felt good about your data, your model and its predictions, you could turn it into a fellowship-type seed programme: "Take this assessment with your team. It will take 30 minutes, and if you get through, we'll give you $100-250K to get started." Again: game-changer.
The firm on the back of this would again have solved the dealflow issue, but it would also have a natural value-add: to take the assessment and coach founders to greatness based on their results. What should they be working on? What weaknesses should they be aware of? What strengths should they play to?
When Marc Andreessen said he thought venture was one of the few industries that couldn't be automated by AI, I thought "this sounds like what I would say if I were an accountant." It can undoubtedly be done. It will be done. The question is: will it be done by us?
If you're qualified and feel like helping me build something like this (read: data scientist, HR/psychology background), let me know. Could be a very worthwhile endeavor and Sunstone would provide an awesome platform to do this together.