Behind the release: Our product team talks Canoe AI & Hybrid Extraction
Alternative investment documents are notoriously messy. A capital call from one quarter might have a single investment listed; the next quarter’s might have seven. Formats change, tables appear and disappear, and no two GPs report quite the same way. It’s this complexity that has made alternatives one of the last frontiers for automation. We sat down with Catherine Bacon, Product Manager of Canoe AI, to discuss the company’s new hybrid extraction strategy – a multi-modal AI approach that adapts to whatever the documents throw at it.
Let’s start with the big picture. What is Canoe AI, and what problem is it solving?
Catherine Bacon: Canoe AI is our strategic AI program, integral to all parts of our platform. The beauty of AI is its flexibility. A lot of the approaches alts solutions have brought to bear so far aren’t flexible enough to keep up as things change over time. Canoe AI represents a more dynamic approach that’s nimble in responding to new information, new changes, and new formats, with less intervention from people.
The nature of alternatives data is that we see so much variation in how it’s reported. Even with industry-standard templates, the depth of data may change from document to document. There’s more to report in one document than in the next one. So, when it comes to extraction, an approach based on the document’s structure doesn’t adapt well to those nuanced changes, whereas an AI approach is really looking at the context surrounding the information.
Everyone claims to have AI now. What makes Canoe AI different?
Catherine Bacon: It comes down to the expertise and the model tuning to specific data points that this industry cares about. We ran tests with out-of-the-box models and they didn’t perform well. They’re wrong a lot. They don’t meet our standards for quality and accuracy.
It’s our strongly held belief that off-the-shelf AI doesn’t get you far enough in nuanced industries, and alts is a nuanced industry. The best outcomes and the best data will come from applying deep industry expertise to tune models, ensuring that data is extracted in full accordance with the nuances of the industry.
You recently rolled out Hybrid Extraction for Canoe AI. It sounds like Canoe is not simply replacing one method with a new one with this update. Could you talk about that?
Catherine Bacon: Hybrid extraction is based on the philosophy that we should use the best method to extract a given data point. We have reason to believe that large language models (LLMs) will be the best approach for most data points because they are more nimble and less reliant on continuous structure. But here’s the thing—some documents might still perform great with more traditional, pattern-based extraction. Consider account statements, which tend to be highly structured. Maybe that performs great with patterns, and we can actually provide extraction faster with a pattern-based approach.
Hybrid extraction is really about the flexibility to utilize the best method and achieve the most accurate and complete result with the least intervention. We’re not locked into only one method of extracting data.
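To make the idea of choosing a method per data point concrete, here is a minimal sketch in Python. It is an illustration only, not Canoe's implementation: the routing table, the document types, the field name `total_due`, and both extractor functions are hypothetical, and the LLM call is a stub that returns nothing.

```python
import re
from typing import Callable, Dict, Optional, Tuple

# An extractor takes raw document text and returns a value, or None if not found.
Extractor = Callable[[str], Optional[str]]


def pattern_total_due(text: str) -> Optional[str]:
    """Pattern-based method: fast, but relies on a stable label and layout."""
    match = re.search(r"Total Due:\s*\$?([\d,]+\.\d{2})", text)
    return match.group(1) if match else None


def llm_total_due(text: str) -> Optional[str]:
    """Stand-in for a model call (hypothetical). A real system would query
    a tuned model here; this stub always returns None."""
    return None


# Hypothetical routing table: (document type, field) -> preferred method.
ROUTES: Dict[Tuple[str, str], Extractor] = {
    ("account_statement", "total_due"): pattern_total_due,
    ("capital_call", "total_due"): llm_total_due,
}


def extract(doc_type: str, field: str, text: str) -> Optional[str]:
    """Dispatch to the method registered for this data point.
    Unregistered combinations default to the model-based stub."""
    extractor = ROUTES.get((doc_type, field), llm_total_due)
    return extractor(text)


statement = "Account Statement\nTotal Due: $1,250.00"
print(extract("account_statement", "total_due", statement))
```

The design point is that the choice of method lives in data (the routing table) rather than in the extraction code, so a field can be moved from one method to another without touching the rest of the pipeline.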
Can you give a concrete example of what the ideal state of hybrid extraction would be?
Catherine Bacon: The most obvious example is something like a capital call where investments are being made. You might have one investment made for a portfolio company in a call, or you might have six. Every new investment is generally reported as an individual row. So you go from a call this month that had one row to a call next time that has seven rows.
The rigidity of pattern-based machine learning doesn’t respond well to changes in depth because new elements emerge that weren’t previously present. Whereas an LLM-based approach can respond to that change and see, “Ah, this is still an investment, I now have six of them. I had one before, I don’t care how many I’ll have next time—I know what an investment looks like.”
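The difference can be sketched with a toy example in Python. Everything here is assumed for illustration: the sample notices, the row format, and both functions are hypothetical. The second function is not an LLM; it is a simple stand-in for the idea of recognizing a row by what it looks like rather than by where it sits on the page.

```python
import re
from typing import List, Tuple

# Hypothetical row format: "Investment: <name>  <amount>"
ROW = re.compile(r"^Investment:\s+(?P<name>.+?)\s+(?P<amount>[\d,]+\.\d{2})$")

ONE_ROW_CALL = """Capital Call Notice
Investment: Acme Holdings  500,000.00
Total: 500,000.00"""

THREE_ROW_CALL = """Capital Call Notice
Investment: Acme Holdings  500,000.00
Investment: Birch Labs  250,000.00
Investment: Cedar Energy  125,000.00
Total: 875,000.00"""


def rigid_extract(text: str) -> List[Tuple[str, str]]:
    """Layout-bound: assumes exactly one investment row, on the second line."""
    lines = text.splitlines()
    match = ROW.match(lines[1]) if len(lines) > 1 else None
    return [(match["name"], match["amount"])] if match else []


def row_aware_extract(text: str) -> List[Tuple[str, str]]:
    """Count-agnostic: collects every line that looks like an investment."""
    rows = []
    for line in text.splitlines():
        match = ROW.match(line)
        if match:
            rows.append((match["name"], match["amount"]))
    return rows


print(len(rigid_extract(THREE_ROW_CALL)))      # the fixed-position version misses rows
print(len(row_aware_extract(THREE_ROW_CALL)))  # the row-aware version finds all of them
```

Both functions agree on the one-row notice. They diverge only when the depth of the document changes, which is the situation described above.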
So hybrid extraction combines patterns and LLMs. Does it also blend multiple LLMs?
Catherine Bacon: Yes, we’ve trained multiple models for individual fields. We also have the ability to run any number of models as appropriate to get the fields that we’re after, even within the same document. It’s not a one-size-fits-all approach that you might encounter elsewhere.
Can you walk us through how the retraining process works for Canoe AI, and what key areas of improvement clients are expecting?
Catherine Bacon: We take a really flexible approach when it comes to improving Canoe AI’s performance. We’re always keeping an eye on how it’s doing, looking for any gaps, and determining where retraining would actually make the biggest impact for our clients. We really view retraining as a partnership with our clients. The more fields they map in the Data Mapping Model, the smarter and more accurate Canoe AI becomes over time.
When we retrain, we focus on a few key accuracy metrics like overall accuracy, accuracy when a field is missing, and accuracy when it’s present. The goal is to make sure Canoe isn’t pulling data that doesn’t exist and that it’s getting the real data right as often as possible. If we notice performance dipping in any of those areas, we dig in to understand why and fix it. Right now, we’re consistently seeing accuracy above 90%, which has made workflows noticeably more efficient for clients. And finally, when we train and test our models, we use a really diverse mix of data across different clients, funds, and GPs. That’s important because it helps ensure the model performs well across a wide range of documents, not just one narrow dataset.
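One way to read the three metrics named above is sketched below in Python. The interview names the metrics but does not define them, so the definitions here are assumptions: each sample is an (expected, extracted) pair, `None` means the field is absent from the document, and a sample counts as correct when the two values match exactly.

```python
from typing import Dict, List, Optional, Tuple

# (expected value, extracted value); None means "field not in the document".
Pair = Tuple[Optional[str], Optional[str]]


def accuracy(pairs: List[Pair]) -> Optional[float]:
    """Share of pairs where the extraction matches the expectation.
    Returns None for an empty list rather than dividing by zero."""
    if not pairs:
        return None
    return sum(expected == extracted for expected, extracted in pairs) / len(pairs)


def field_metrics(pairs: List[Pair]) -> Dict[str, Optional[float]]:
    """Split accuracy by whether the field truly exists in the document."""
    present = [p for p in pairs if p[0] is not None]
    missing = [p for p in pairs if p[0] is None]
    return {
        "overall": accuracy(pairs),
        "when_present": accuracy(present),  # is the real data extracted correctly?
        "when_missing": accuracy(missing),  # is nothing pulled when nothing exists?
    }


samples: List[Pair] = [
    ("100", "100"),   # present, correct
    ("200", "200"),   # present, correct
    ("300", "350"),   # present, wrong value
    (None, None),     # missing, correctly left empty
    (None, "75"),     # missing, but a value was pulled anyway
]
print(field_metrics(samples))
```

Splitting the score this way separates the two failure modes mentioned above: a low `when_missing` score points to data being pulled that does not exist, while a low `when_present` score points to real data being extracted incorrectly.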
How might this latest Canoe AI release translate to real impact for a client who is already using Canoe?
Catherine Bacon: We expect to see higher rates of extraction per document—more fields, more data, more accuracy per field, and less intervention.
It’s not enough to just get the required fields per document. We really want to ensure that we’re going to a depth of fields per document that matches how clients are using data from those documents. This means we are continuing to invest in calls and distributions, in particular, to ensure we can get down to the level of detail that our clients are using for accounting. It should be more of a review-and-pass process, with that continuing to require less and less touch over time.
A recent MIT report suggested that 95% of enterprise AI implementations are failing. What would Canoe’s message be to an investor who is experiencing AI fatigue?
Catherine Bacon: Even though AI feels very new and present and all the rage, it’s an industry that’s been around for a long time. Our ML team members are experts who’ve been working on AI for more than the last six months—even though six months may feel like when AI suddenly exploded.
AI won’t result in failure when it’s outcome-oriented: When you know that you have a thing that you need to accomplish, and you can direct the AI appropriately to do it. People want results in a business context. They need to orient around the outcomes that AI can generate for them, the workflows that AI can actually automate for them, and the things that, at the end of the day, they’ll be able to accomplish tomorrow with AI that they weren’t able to accomplish yesterday.
Having a successful AI offering also requires commitment. There’s commitment in the continual tracking of success and accuracy, in the continual retraining as data evolves, and in the continual investment in tuning and staying relevant with the industry. This is not a one-time, “we’ve done it, so check the box” effort. It’s a continuous practice to ensure the success of this approach.
What’s the one thing about Canoe AI that clients might not immediately realize is revolutionary?
Catherine Bacon: I’m genuinely very excited about our multi-strategy approach. AI is changing so quickly, and the advances in visual extraction are exciting. I’m excited about the investments we’ve made in a platform that allows us to plug in the latest and greatest in AI strategies and supplement with whatever strategy is most effective for a given data point.
Where does Canoe AI go from here?
Catherine Bacon: It’s limitless. The pace at which AI advancements are coming out is mind-blowing. Things that were unthinkable or impossible-seeming three months ago are just someone’s average Tuesday now. It’s hard to say where it’s going to go other than it’s going to move fast, and we’re going to be able to do things with AI soon that we can’t even think of right now. It’s an exponential curve, and we’ll be taking advantage of it, head-on.
How does all of this feed into Canoe’s mission of “alts, smarter”?
Catherine Bacon: It’s really in both halves. For AI to be successful in delivering outcomes for our clients, it has to be all about alts. It can’t just be your generic run-of-the-mill AI. It needs to come with that deep industry knowledge and expertise. We’re committed to delivering AI through that lens.
And the “smarter” part comes in with the hybrid strategy approach. We’re not limited to a single method of extracting data. We expect new advancements continually, and we’ve built a platform that allows us to continue to evolve with the best methodologies.
Editor’s Note: This interview has been edited for length and clarity. The views expressed are those of Catherine Bacon alone.
###
About Canoe Intelligence
Canoe Intelligence (“Canoe”) is the platform for smarter alts management. We redefine alternative investment intelligence with AI-driven software that directly addresses the core challenges of private markets. Our technology empowers institutions, LPs, and wealth managers to future-proof their alts infrastructure, modernizing systems and providing a scalable foundation for long-term growth and compliance. By automating manual data processing with AI-native precision, Canoe helps clients reduce operational costs and risks, significantly lowering overhead and mitigating errors. Ultimately, our timely, accurate, and comprehensive data enables investment teams to drive superior investment outcomes through deeper insights and more profitable allocation strategies. With Canoe, it’s all about making Alts, smarter. Learn more at www.canoeintelligence.com.
MEDIA CONTACT:
Betsy Miller Daitch
Canoe Intelligence
+1 443-690-6200
bdaitch@canoeintelligence.com