Q&A with the Founders of Subsalt
Subsalt is an early-stage data startup headquartered in Charlotte and founded last year by a trio of former senior leaders at Passport. The company has largely been flying under the radar, and it seems the founders like it that way.
Toward the end of 2021 and with no fanfare, Subsalt raised a sizable $800,000 pre-seed round with investment from Rackhouse Venture Capital [1], Grotech Ventures [2], Creative Co [3], and operator-angels from CLT and the Bay Area. The goal of the funding is to develop and test a beta product and then release a generally available version for commercial use.
The product concept is a lot to wrap your head around at first. However, it is based on a powerful insight: there is significant potential to unlock productivity by making it less costly and time-consuming for companies to share their data with external partners, as data protection and data privacy concerns become increasingly salient in our society.
We recently spoke with the three co-founders (Ben Winokur, Luke Segars, and David Singletary) about their backgrounds, the problem they are trying to solve, their plans for 2022, and what the name is all about. Our conversation has been edited for clarity and brevity.
Can you talk about how the business was formed?
Ben: Subsalt was founded to address a problem we saw at Passport: data sharing is really valuable, but as organizations have tried to share data in new ways, novel privacy challenges have emerged. At Passport, we saw this as tech companies began to share transportation data with cities, specifically in the case of e-scooter data. This problem was outside the boundaries of what Passport needed to solve to serve our clients, but seeing the impact of it made it clear how important this problem is and convinced us that the stakes of data portability are really high, both in terms of the potential benefits and the potential costs.
What are each of your backgrounds?
Ben: I joined Passport as general counsel, so I have a legal background. I had the chance to work with Luke in the product management organization at Passport, and for the last two years I was the chief of staff.
David: I joined just before Ben as a salesperson, and for the last few years I was the head of sales, working closely with Ben and Luke. Ben and I worked on hundreds of deals together, negotiating data rights and agreements with cities.
Luke, Ben, and I collaborated to bring some paradigm-shifting technology to market in this space, which was really neat. It was very much a “zero-to-one” moment, like a startup inside of a startup. The three of us got really good at figuring out the problem statement and a solution that fit it really well, and then we took it to market. We worked collaboratively with a small number of clients to bring it to life, and we did that a couple of times. In some ways, it was the beta version of us starting a company together: figuring out how to find product-market fit. Generally, at Subsalt we are trying to bring a new paradigm to a fairly conservative space. Big corporations are cautious, regulated industries are very cautious, and we’ve got experience in that space.
Luke: I’ve spent my career doing product work. I have a technical background and have been building our product at Subsalt. I spent about 6 years in the Bay Area before moving to Charlotte and recently moved to Seattle. I really enjoy enterprise product work and understanding problems that folks have in these really complex organizations. Like David said, the three of us have attacked the same problems from three different angles together. We came at this issue when we were collecting lots of data, for example when predicting parking demand. Technically, it was easy to share data with other organizations. You have Snowflake and a bunch of tools out there to move information from point A to point B. But legally, it is very difficult to grant access to a potentially very sensitive dataset (for example, location data on where people are parking and where they live). So there was a lot of friction in practically doing things with data even though technically you could.
That is ultimately the problem we encountered working at Passport. We found that this problem exists broadly in every market, especially as privacy starts to become something that is top of mind.
We are starting in healthcare, where regulations like HIPAA make it a major legal issue if you share things in a way that you aren’t supposed to.
Are you mainly focused on the healthcare vertical?
Ben: We are building a horizontal solution, not a vertical solution, to this problem. But finding the right wedge into this problem set is something that we have spent a lot of time exploring early on. We spent a lot of time talking with people in organizations that have variants of this problem. What we found is that the problem was best defined in healthcare. It is the place where HIPAA privacy rules were actually creating a relatively consistent problem space in terms of the processes that exist inside healthcare organizations to manage privacy compliance. Those processes create inefficiencies and actual costs.
Hospital systems need to make their data available to partners in the supply chain: an insurance company, a post-acute care company, or AI/ML vendors that glean insights from that data. Our approach here was to go where the problem was best defined and understood, where there was a real cost center to go solve.
Who is your competition right now?
Ben: Data access in healthcare right now is largely process-oriented. There are committee approvals necessary. Each entity that needs data has its request scrutinized to make sure it gets only the minimum amount of data necessary, which is actually a requirement in HIPAA. Each entity has to verify that it is agreeing to a set of formal cybersecurity policies. And they are paying in time and efficiency loss and the expense of internal resources. The manual approval process is how they show compliance and good-faith adherence to HIPAA standards. So what we are ultimately competing with is a lot of paper process and people process, places where spending more manual time is currently legally necessary. We think technology can play a huge role in making those processes unnecessary, so that these things can happen much more quickly and these organizations can get a lot more value out of their data.
Luke: And it’s not hours or days that they’re spending to get these access requests done. Every access request can take weeks or months. We are working with three organizations who have this problem in spades. We’ve seen this with well-regarded researchers working on COVID who waited months before research could start. This is happening all the time. It’s a constant cost center that comes from trying to do things with your data. It’s like a tax on productivity with data.
On your website you talk about “synthetic” data. What is synthetic data and how can it be used in practice?
Luke: The use cases here are things where you don’t need individual level precision. If you need to call a patient to follow up to see how they are feeling, synthetic data is not the right tool for the job.
We focus on analytics tasks at Subsalt — so machine learning, business intelligence, and research — all things looking for population level patterns. Those are the things that Subsalt is useful for. Those are the use cases where it is possible to de-identify people.
Synthetic data has really matured with the deep learning wave in data science. 10-15 years ago it had the same shape, but the quality and representativeness of synthetic data was quite low. A lot of the new deep learning architectures have allowed the quality of synthetic data to dramatically improve within the last 5-10 years.
What we use synthetic data for is to create “lookalike” data that can’t be tied back to real patients. For example, for a COVID testing dataset, we can train a model and create records that are not tied back to individual people. You generate brand new records that are not real and throw away the original data — which has a lot of benefits from a compliance perspective to go back to the problem we are solving. You are effectively not sharing HIPAA-covered health data at that point.
We’ve been working with some ML platform companies to train models on real data and synthetic data and compare results. We’ve gotten within a few points of precision with synthetic data on things like classification models without having to go through the months-long process of manual data approvals. We are actively working with some folks like that in the industry to see if we can actually mitigate the need for real data for things like machine learning, making the whole transaction a lot smoother on both sides.
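For readers who want to see the shape of that comparison, here is a minimal Python sketch of the "train on synthetic, test on real" idea. It is not Subsalt's method: the toy dataset, the function names, and the per-class Gaussian generator are all assumptions made for illustration, standing in for the deep learning models Luke mentions, and this simple generator carries no formal privacy guarantee. It also reports plain accuracy rather than precision.

```python
import random
import statistics

random.seed(0)

def make_real_records(n):
    """Toy stand-in for a sensitive dataset: (feature, label) pairs."""
    records = []
    for _ in range(n):
        label = random.randint(0, 1)
        # Hypothetical feature: class 1 is centred higher than class 0.
        feature = random.gauss(2.0 if label else 0.0, 1.0)
        records.append((feature, label))
    return records

def fit_generator(records):
    """Learn per-label mean, stdev, and count; this is the whole 'model' here."""
    params = {}
    for label in (0, 1):
        values = [x for x, y in records if y == label]
        params[label] = (statistics.mean(values), statistics.stdev(values), len(values))
    return params

def sample_synthetic(params, n):
    """Draw brand-new records from the fitted distributions instead of copying rows."""
    total = sum(count for _, _, count in params.values())
    records = []
    for _ in range(n):
        label = 1 if random.random() < params[1][2] / total else 0
        mean, stdev, _ = params[label]
        records.append((random.gauss(mean, stdev), label))
    return records

def train_threshold(records):
    """Tiny classifier: predict 1 when the feature exceeds the midpoint of the class means."""
    mean0 = statistics.mean(x for x, y in records if y == 0)
    mean1 = statistics.mean(x for x, y in records if y == 1)
    return (mean0 + mean1) / 2

def accuracy(threshold, records):
    """Share of records whose predicted label matches the true label."""
    correct = sum((x > threshold) == bool(y) for x, y in records)
    return correct / len(records)

real_train = make_real_records(2000)
real_test = make_real_records(1000)
synthetic_train = sample_synthetic(fit_generator(real_train), 2000)

# Train once on real data and once on synthetic data, then score both on held-out real data.
acc_real = accuracy(train_threshold(real_train), real_test)
acc_synth = accuracy(train_threshold(synthetic_train), real_test)
print(f"trained on real: {acc_real:.3f}, trained on synthetic: {acc_synth:.3f}")
```

The point of the design is that both classifiers are scored on the same held-out real records, so any gap between the two numbers reflects what was lost by training on generated data.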
Can you go into the ROI a bit more? How do you explain the value proposition to potential customers?
David: There are a bunch of immediate direct costs tied up in how this data gets shared today. The long-term vision is enabling all industries to share data as if it were all unregulated, which is the way that all commodities trade. Think about a data economy where you can buy and sell data freely on an exchange like Snowflake or Amazon. Those can't be transactions that take 6 months each to get a contract on; they should be able to be traded in hours and minutes, not days and months.
We think we can unlock an exponential number of data transactions between companies in ways that are just impossible today and becoming harder over time as regulations about privacy extend beyond healthcare into things that are unregulated today.
What does the rest of this year look like for Subsalt?
Ben: There are a few milestones that we laid out when we raised our recent round of investment.
On the product side:
Get a general availability product v1 to market. Luke got an alpha version up and running by himself and hired a data scientist to work with on getting a beta program up and running.
We are working with three design-build-style partners on proofs-of-concept so that we move from unaffiliated beta testers to true commercial partners who test the system and make sure it is sufficient to achieve ROI on cost savings and increased use of data.
On the customer side:
We are working with three organizations now. Our goal is to have a paying customer by late Q2/early Q3 and have that be first revenue and validation of willingness to pay in the market: validation that customers who have access to the software see value in it and that it’s solving an important problem for them. We want to achieve the earliest indications of product-market fit and pour a little gas on the fire to scale up the product development motion and go-to-market motion.
We have to ask. How did you come up with the company name?
Ben: It's a tongue-in-cheek reference to the axiom that "data is the new oil." Since we're focused on allowing companies to access and use sensitive and regulated data that has traditionally been hard and expensive to provision, we saw a comparison to subsalt oil formations, which are oil reserves trapped underneath a salt formation. These reserves were historically difficult and expensive to extract, but technology has improved and they're now commercially feasible to drill.
If you want to learn more about Subsalt, you can connect with the Subsalt team via their website or LinkedIn.
Thanks for reading the latest edition of our Q&A series! We’ll be back in two weeks with our May Dispatch.
What would you like to see in future Dispatches?
Are there any topics or companies that you would like to see covered? Individuals in the startup scene we should interview? Drop us a comment and we will make it happen!
If you enjoy reading these Dispatches, please share us with your network!
[1] Rackhouse is a VC based out of San Francisco. It focuses on data science investments and is run by Uber’s former head of data science.
[2] Grotech is a VC based out of Maryland but has a strong presence in the southeast with several portfolio companies based in Charlotte and the Research Triangle.
[3] Creative Co is a growth equity studio based in Charlotte that was founded in 2019.