More than ever before, our everyday lives are connected to a globalized grid. Products are remotely sourced and shipped; traveling to a place 3,000 miles away may be easier than crossing a large city in traffic; and we can broadcast information to anyone and everyone at the tap of a finger.
A startup called Sanas has developed AI voice technology that aims to make an important component of that grid work more smoothly: helping people who speak the same language, but with different accents, better understand each other by filtering one set of pronunciations and converting it to another in real time. Today the startup is announcing $32 million in funding on the heels of strong momentum, as it comes out of stealth and launches more widely.
Insight Partners is leading the investment, with participation from new backers GV (formerly Google Ventures), strategic backer Assurant Ventures and angel investor Gokul Rajaram. Previous backers Human Capital, General Catalyst, Quiet Capital and DN Capital are also participating in this Series A round. In addition to the investment, Sanas is also announcing a strategic partnership with Alorica, one of the world’s largest BPOs, offering the technology to its 100,000 employees and 250 enterprise customers globally.
The company is not disclosing its valuation, but we understand it to be $150 million post-money. This Series A is one of the biggest for a voice AI startup, and from what we understand, it comes after Sanas turned down an acquisition offer from Google. (If you can’t buy them, invest in them!)
As you might have guessed from its list of investors, Sanas’ technology is already being deployed in call centers. In particular, it has found a lot of traction with far-flung customer service providers whose agents become targets of abuse when they speak the same language as a customer, but with a heavy accent.
In addition to insurance giant Assurant and BPO leviathan Alorica, other clients include large collections firm ERC and travel-industry BPO IGT. In a sad commentary on the state of our world, Sanas CEO and co-founder Maxim Serebryakov said that the results of using the technology in these workplaces have been dramatic in terms of reducing agent harassment.
Sanas plans to continue expanding its business in that vertical, but it is also beginning to size up other use cases in the enterprise: for example, as a plug-in for video calls, or for voice-based interactive services, where it could help machines (and the ML-based systems behind them) understand a wider range of pronunciations.
Serebryakov initially co-founded the company with Sean Zhang and Andres Pérez Soderi, two fellow students at Stanford’s artificial intelligence lab, after a fourth friend had to drop out of school and return to his native Nicaragua because of a family emergency.
The friend took a job back home at a call center serving customers in the U.S., and even though he was completely fluent in English (and a student on leave from Stanford, no less), he faced endless abuse over the phone from people who did not like how he pronounced things.
The three others understood that judgment, reaction and abuse all too well, being first-generation immigrants themselves (and I would add that I know it well too, both in my current life and from growing up as a first-generation immigrant in America). And so they decided to put their AI training to the test to see if they could fix it. (Earlier this year, Sanas also picked up a fourth co-founder, Sharat Keshav, now its COO, who left another company he had co-founded, Observe.AI, after learning about Sanas and deciding he wanted to help build it.)
There are tons of tools out there today to “autotune” and modify a person’s voice, in real time or after the fact; they’re nearly as common as photo filters at this point. But as Serebryakov notes, it is especially difficult to preserve the natural, real voice while changing the way it says what it is saying.
What’s interesting is how Sanas has approached a problem this abstract: it has fed thousands of hours of differently accented speech into a system and trained it to match one set of pronunciations to another, using a mix of techniques and methods that is now patent-pending. The end result is that Sanas’ pronunciation “translation” engine can be used with any language, not just English, as you might have assumed. (Serebryakov tells me it’s already being used to “smooth out” pronunciations in Japan, China and South Korea, for example.)
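To make the real-time aspect concrete, here is a purely illustrative sketch of how a streaming voice-conversion loop is typically structured: audio arrives in small fixed-size frames, each frame passes through a conversion model, and converted audio is emitted with minimal delay. Everything here is an assumption for illustration; the frame size, sample rate, and the `convert_accent` stand-in are generic speech-processing conventions, not details of Sanas’ proprietary engine.

```python
# Illustrative sketch of a frame-by-frame streaming conversion loop.
# The "model" below is a pass-through placeholder, NOT Sanas' actual system.
from typing import Iterator, List

FRAME_MS = 20          # 20 ms frames are a common real-time audio choice
SAMPLE_RATE = 16_000   # 16 kHz mono is typical for speech models
FRAME_SAMPLES = SAMPLE_RATE * FRAME_MS // 1000  # 320 samples per frame

def frames(samples: List[int], size: int = FRAME_SAMPLES) -> Iterator[List[int]]:
    """Split a sample stream into fixed-size frames, zero-padding the tail."""
    for start in range(0, len(samples), size):
        frame = samples[start:start + size]
        if len(frame) < size:
            frame = frame + [0] * (size - len(frame))
        yield frame

def convert_accent(frame: List[int]) -> List[int]:
    """Placeholder for a learned accent-conversion model.

    A real system would map source-accent acoustic features onto
    target-accent features while preserving the speaker's voice.
    Here the audio simply passes through unchanged.
    """
    return frame

def stream_convert(samples: List[int]) -> List[int]:
    """Run every frame through the converter, as a live pipeline would."""
    out: List[int] = []
    for frame in frames(samples):
        out.extend(convert_accent(frame))
    return out

audio = list(range(1000))      # fake PCM samples standing in for microphone input
converted = stream_convert(audio)
print(len(converted))          # 1280: input padded up to 4 whole frames
```

The key design point the sketch shows is latency: because each 20 ms frame is converted independently as it arrives, the listener hears converted speech almost immediately rather than after the whole utterance.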
“This kind of technology applies globally, from one accent to another,” he said. “It will take time, but our goal is to let people communicate in any accent.”
There’s a certain unease around the concept of what Sanas is building here. It raises a lot of questions of potential abuse, and moreover, some may find it distasteful and retrograde that technology should be developed specifically to obscure a person’s true identity: shouldn’t the people who judge others by their accents learn to be more open-minded and accepting, rather than the targets of that prejudice forever accommodating it by hiding anything that marks them as foreign or different?
However, there are counterpoints as well. Sanas is not building applications specifically for consumers, nor making its technology accessible to them at this time, because of how it could be misused. Even its customers aren’t using a cloud-based version of the technology: to keep things extra secure, it sits on-premises, and so customers control the data that passes through and is generated by Sanas’ system.
As for hiding true identities, that is certainly a big issue, and one we will all have to grapple with for a long time to come. In the meantime, Sanas is giving people on the receiving end of those jabs a way to cope better, and, in some very practical ways, making it easier for people (even those with good intentions) to understand each other’s pronunciations.
I had a demo of the service during my interview, in which Sanas called up one of its clients’ agents in India and asked him to chat with me, first in his own accent, and then with his Midwest-neutral tone turned “on.” It was a little eerie knowing what was happening in the background, but on the surface I was quite surprised at how natural it all sounded. Well, natural-ish, at least: his voice was clear, but perhaps a little too articulate, and almost a bit robotic and emotion-free.
Obviously, that too is somewhat intentional for now, and could evolve if that’s what customers and other users want.
“The reason we’re focusing on call centers is because it’s low-hanging fruit,” Serebryakov said, noting that the difficulties of building the technology effectively from the ground up were daunting, but that the use case also happens to fit it well. “When building this, it was important for us to take the path of least resistance. No songs, no laughs, no hyper-emotional speech. What we’re dealing with is how these users interact at work, and that’s a scope we can keep under control.” There’s no crying in baseball, and no fun and games in the call center either.
“Insight Partners is thrilled to deepen our relationship with Sanas on such cutting-edge and transformative technology,” Insight MD Ganesh Bell said in a statement. “As the company comes out of stealth, I look forward to working with this extremely talented and passionate team to create a product that will, among many things, help end the unfortunate prejudice and discrimination that English speakers inflict on one another over language, and which many Sanas employees have experienced personally.”