Standing on the Shoulders of People who Stand on the Shoulders of Giants

Ben Halpern ・3 min read

Software is becoming more and more about playing well with others: creating co-beneficial relationships in order to work together to build great stuff. Within those relationships, different parties play different roles, all in the name of that shared goal. Yesterday, DeepGram, a company that uses artificial intelligence to index audio and video and provide search as a service, rose to the top of Product Hunt. When that happened, thousands of other developers and I got simultaneous exposure to a shiny new toy. I cannot speak for everyone, but I got absolutely giddy about getting to play with this service. I want to expand on why this sort of thing is so exciting to me. I will use DeepGram as the case study, but the principles apply to the general trend of great technology that is built to be distributed as an underlying layer for other great technology.

DeepGram

DeepGram uses deep learning to index audio files and dynamically create remarkably accurate text snippets for search and transcription across those files. I have read up on deep learning enough to make small talk at a party, but the concepts executed on by Scott Stephenson are well above my pay grade. Even so, DeepGram did not invent the concepts they are working with. These ideas have been decades in the making. The company is making great use of computer science concepts that have been building to this point and packaging them in a digestible format. DeepGram is standing on the shoulders of giants, and I am more than happy to stand on theirs in order to make really cool stuff.

I have already started playing around with the Audio Search API DeepGram has provided, along with 40 hours of free indexing. For me, this immediately offers up solutions for ideas I had been batting around in my head for a while for personal and professional projects. But DeepGram is proprietary software, and anyone who develops on top of it is subject to the failures or whims of the company. If Google or Facebook buys up DeepGram and decides to kill off this silly software-as-a-service business the company is carving out in favor of pressing internal projects, there is nothing developers like me can do. This sort of platform risk is inherent to our industry, and anyone who wants to jump in on DeepGram right now needs to weigh it as a legitimate cost of using something new. Early adopters of Twilio had to take the same risk, and the uncertainty never goes away entirely. Zynga built a multibillion-dollar company on the back of a Facebook platform that was eventually closed off. The company has lost 80 percent of its value since its peak.

Companies like the aforementioned Google and Facebook most certainly have their own in-house versions of this technology. The NSA probably does as well. The technology is not totally novel, and DeepGram is not even the first company to provide a consumable API in the space. Pop Archive was already around. However, the brilliance of the DeepGram product and the simple nature of the offering make this a big leap forward for the community. It is going to be exciting to see what solo hackers and small teams make with this audio searching capability.

Accurate transcription of audio and the ability to jump to different parts of audio files based on snippets have applications across many industries, and automating this work removes an immense amount of inefficiency and cost. Check out this example application the DeepGram team built, which allows you to search YouTube videos for soundbites by US presidential candidates. With the help of some simple language-specific wrapper libraries, even a novice programmer could make use of this technology.
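To give a rough feel for what "jumping to a part of an audio file based on a snippet" might look like from the consumer's side, here is a minimal Python sketch. It is not DeepGram's API or response format. The `Segment` type, the `find_snippet` helper, and the transcript data are all made up for illustration, assuming only that some service hands back transcribed text with start times.

```python
from dataclasses import dataclass


@dataclass
class Segment:
    """One transcribed chunk of audio (hypothetical shape, not DeepGram's format)."""
    start_seconds: float
    text: str


def find_snippet(segments: list[Segment], query: str) -> list[float]:
    """Return the start time of every segment whose text contains the query."""
    needle = query.lower()  # case-insensitive substring match
    return [s.start_seconds for s in segments if needle in s.text.lower()]


# Made-up transcript data, for illustration only.
transcript = [
    Segment(0.0, "Thank you all for coming tonight"),
    Segment(4.5, "We are going to talk about jobs"),
    Segment(9.2, "and about jobs for the next generation"),
]

# Each returned value is a position a player could seek to.
print(find_snippet(transcript, "jobs"))
```

The hard part, producing accurate timestamped text from raw audio, is exactly what a service like this takes off your hands. What is left for the developer is closer to the few lines above.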
