<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Ra Baryon Neutrino</title>
    <description>The latest articles on DEV Community by Ra Baryon Neutrino (@chengsokdara).</description>
    <link>https://dev.to/chengsokdara</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F813850%2Fbf8357af-393f-4bcc-bc22-e4d38c3e0220.jpeg</url>
      <title>DEV Community: Ra Baryon Neutrino</title>
      <link>https://dev.to/chengsokdara</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/chengsokdara"/>
    <language>en</language>
    <item>
      <title>Transform Your Speech into Text with the Power of OpenAI and useWhisper</title>
      <dc:creator>Ra Baryon Neutrino</dc:creator>
      <pubDate>Wed, 04 Jun 2025 06:46:10 +0000</pubDate>
      <link>https://dev.to/chengsokdara/transform-your-speech-into-text-with-the-power-of-openai-and-usewhisper-5175</link>
      <guid>https://dev.to/chengsokdara/transform-your-speech-into-text-with-the-power-of-openai-and-usewhisper-5175</guid>
      <description>&lt;blockquote&gt;
&lt;p&gt;This article was generated using ChatGPT from README.md&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h2&gt;
  
  
  Demo
&lt;/h2&gt;

&lt;p&gt;  &lt;iframe src="https://www.youtube.com/embed/__YsEghlLaU"&gt;
  &lt;/iframe&gt;
&lt;/p&gt;

&lt;p&gt;Are you tired of spending hours transcribing speech into text manually? Are you looking for a way to save time and increase accuracy? If so, you'll want to check out useWhisper, a React Hook for OpenAI Whisper API that comes with speech recorder and silence removal built-in.&lt;/p&gt;

&lt;h2&gt;
  
  
  Introduction
&lt;/h2&gt;

&lt;p&gt;Transcribing speech is a common task in many industries, including journalism, entertainment, and customer service. However, the process can be time-consuming and often leads to errors. With useWhisper, you can quickly and accurately transcribe speech into text using the power of OpenAI.&lt;/p&gt;

&lt;h2&gt;
  
  
  Getting Started
&lt;/h2&gt;

&lt;p&gt;Getting started with useWhisper is easy. First, install the package:&lt;/p&gt;

&lt;p&gt;&lt;code&gt;npm i @chengsokdara/use-whisper&lt;/code&gt;&lt;/p&gt;

&lt;p&gt;or&lt;/p&gt;

&lt;p&gt;&lt;code&gt;yarn add @chengsokdara/use-whisper&lt;/code&gt;&lt;/p&gt;

&lt;p&gt;Once you've installed useWhisper, you can start using it in your React app. Import the hook:&lt;/p&gt;

&lt;p&gt;&lt;code&gt;import { useWhisper } from '@chengsokdara/use-whisper'&lt;/code&gt;&lt;/p&gt;

&lt;p&gt;Next, call the hook within your component, passing in your OpenAI API token:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;useWhisper&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;@chengsokdara/use-whisper&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;

&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;App&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="nx"&gt;recording&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="nx"&gt;speaking&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="nx"&gt;transcript&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="nx"&gt;transcripting&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="nx"&gt;pauseRecording&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="nx"&gt;startRecording&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="nx"&gt;stopRecording&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;useWhisper&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;
    &lt;span class="na"&gt;apiKey&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;env&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;process&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;OPENAI_API_TOKEN&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="c1"&gt;// YOUR_OPEN_AI_TOKEN&lt;/span&gt;
  &lt;span class="p"&gt;})&lt;/span&gt;

  &lt;span class="k"&gt;return &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nx"&gt;div&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt;
      &lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nx"&gt;p&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt;&lt;span class="na"&gt;Recording&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="nx"&gt;recording&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="sr"&gt;/p&lt;/span&gt;&lt;span class="err"&gt;&amp;gt;
&lt;/span&gt;      &lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nx"&gt;p&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt;&lt;span class="na"&gt;Speaking&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="nx"&gt;speaking&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="sr"&gt;/p&lt;/span&gt;&lt;span class="err"&gt;&amp;gt;
&lt;/span&gt;      &lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nx"&gt;p&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt;&lt;span class="na"&gt;Transcripting&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="nx"&gt;transcripting&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="sr"&gt;/p&lt;/span&gt;&lt;span class="err"&gt;&amp;gt;
&lt;/span&gt;      &lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nx"&gt;p&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt;&lt;span class="nx"&gt;Transcribed&lt;/span&gt; &lt;span class="na"&gt;Text&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="nx"&gt;transcript&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;text&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="sr"&gt;/p&lt;/span&gt;&lt;span class="err"&gt;&amp;gt;
&lt;/span&gt;      &lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nx"&gt;button&lt;/span&gt; &lt;span class="nx"&gt;onClick&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;{()&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="nf"&gt;startRecording&lt;/span&gt;&lt;span class="p"&gt;()}&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt;&lt;span class="nx"&gt;Start&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="sr"&gt;/button&lt;/span&gt;&lt;span class="err"&gt;&amp;gt;
&lt;/span&gt;      &lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nx"&gt;button&lt;/span&gt; &lt;span class="nx"&gt;onClick&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;{()&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="nf"&gt;pauseRecording&lt;/span&gt;&lt;span class="p"&gt;()}&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt;&lt;span class="nx"&gt;Pause&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="sr"&gt;/button&lt;/span&gt;&lt;span class="err"&gt;&amp;gt;
&lt;/span&gt;      &lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nx"&gt;button&lt;/span&gt; &lt;span class="nx"&gt;onClick&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;{()&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="nf"&gt;stopRecording&lt;/span&gt;&lt;span class="p"&gt;()}&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt;&lt;span class="nx"&gt;Stop&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="sr"&gt;/button&lt;/span&gt;&lt;span class="err"&gt;&amp;gt;
&lt;/span&gt;    &lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="sr"&gt;/div&lt;/span&gt;&lt;span class="err"&gt;&amp;gt;
&lt;/span&gt;  &lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;


&lt;p&gt;You're now ready to start using useWhisper to transcribe speech!&lt;/p&gt;
&lt;h2&gt;
  
  
  Configuration Options
&lt;/h2&gt;

&lt;p&gt;One of the benefits of useWhisper is its flexibility. It provides several configuration options that you can use to customize your speech recognition experience.&lt;br&gt;
The available options are:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;apiKey: Your OpenAI API key (required).&lt;/li&gt;
&lt;li&gt;autoStart: Automatically start speech recording on component mount.&lt;/li&gt;
&lt;li&gt;customServer: Supply your own Whisper-compatible REST API endpoint.&lt;/li&gt;
&lt;li&gt;nonStop: If true, recording auto-stops after stopTimeout of silence; as long as the user keeps speaking, the recorder keeps going.&lt;/li&gt;
&lt;li&gt;removeSilence: Remove silence before sending the audio file to the OpenAI API.&lt;/li&gt;
&lt;li&gt;stopTimeout: Required when nonStop is true; controls how long the recorder waits in silence before auto-stopping.&lt;/li&gt;
&lt;/ul&gt;
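As a sketch, the options above can be combined into a single configuration object. The option names come from the list; `UseWhisperConfig` is a hypothetical type written for this example, and the `stopTimeout` unit (milliseconds) is an assumption:

```typescript
// Illustrative useWhisper configuration. UseWhisperConfig is a local,
// hypothetical type mirroring the option list, not the package's export.
type UseWhisperConfig = {
  apiKey: string          // OpenAI API key (required)
  autoStart?: boolean     // start recording on component mount
  customServer?: string   // your own Whisper-compatible REST endpoint
  nonStop?: boolean       // auto-stop after a silent stopTimeout
  removeSilence?: boolean // strip silence before uploading
  stopTimeout?: number    // required when nonStop is true (assumed ms)
}

const config: UseWhisperConfig = {
  apiKey: process.env.OPENAI_API_TOKEN ?? '',
  nonStop: true,
  stopTimeout: 5000, // stop after roughly 5 seconds of silence
  removeSilence: true,
}
```

An object like this would be passed straight to `useWhisper(config)`.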
&lt;h2&gt;
  
  
  Methods and States
&lt;/h2&gt;

&lt;p&gt;In addition to the configuration options, useWhisper also returns several states and methods that you can use to monitor and control speech recording:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;recording: Whether speech is currently being recorded.&lt;/li&gt;
&lt;li&gt;speaking: Whether the user is currently speaking.&lt;/li&gt;
&lt;li&gt;transcript: Object returned after Whisper transcription is complete.&lt;/li&gt;
&lt;li&gt;transcribing: Whether silence removal and the OpenAI Whisper API request are in progress.&lt;/li&gt;
&lt;li&gt;pauseRecording: Pause speech recording.&lt;/li&gt;
&lt;li&gt;startRecording: Start speech recording.&lt;/li&gt;
&lt;li&gt;stopRecording: Stop speech recording.&lt;/li&gt;
&lt;/ul&gt;
&lt;h2&gt;
  
  
  Transcript Object
&lt;/h2&gt;

&lt;p&gt;The transcript object returned from the hook contains the following:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;blob: The recorded speech as a JavaScript Blob.&lt;/li&gt;
&lt;li&gt;text: The transcribed text returned from the Whisper API.&lt;/li&gt;
&lt;/ul&gt;
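A tiny, hypothetical helper shows how those two fields might be consumed; `Transcript` here is just a local type mirroring the list above, not the package's exported type:

```typescript
// Minimal shape mirroring the transcript fields listed above.
type Transcript = { blob?: Blob; text?: string }

// Hypothetical helper: summarize a transcript for display or logging.
const describeTranscript = (t: Transcript): string => {
  const audio = t.blob ? `${t.blob.size}-byte recording` : 'no recording'
  const text = t.text ?? '(not transcribed yet)'
  return `${audio}: ${text}`
}
```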
&lt;h2&gt;
  
  
  Source Code
&lt;/h2&gt;


&lt;div class="ltag-github-readme-tag"&gt;
  &lt;div class="readme-overview"&gt;
    &lt;h2&gt;
      &lt;img src="https://assets.dev.to/assets/github-logo-5a155e1f9a670af7944dd5e12375bc76ed542ea80224905ecaf878b9157cdefc.svg" alt="GitHub logo"&gt;
      &lt;a href="https://github.com/chengsokdara" rel="noopener noreferrer"&gt;
        chengsokdara
      &lt;/a&gt; / &lt;a href="https://github.com/chengsokdara/use-whisper" rel="noopener noreferrer"&gt;
        use-whisper
      &lt;/a&gt;
    &lt;/h2&gt;
    &lt;h3&gt;
      React hook for OpenAI Whisper with speech recorder, real-time transcription, and silence removal built-in
    &lt;/h3&gt;
  &lt;/div&gt;
  &lt;div class="ltag-github-body"&gt;
    
&lt;div id="readme" class="md"&gt;
&lt;div class="markdown-heading"&gt;
&lt;h1 class="heading-element"&gt;useWhisper&lt;/h1&gt;
&lt;/div&gt;

&lt;p&gt;React Hook for OpenAI Whisper API with speech recorder, real-time transcription and silence removal built-in&lt;/p&gt;




&lt;ul&gt;
&lt;li&gt;
&lt;div class="markdown-heading"&gt;
&lt;h3 class="heading-element"&gt;Demo&lt;/h3&gt;
&lt;/div&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;div class="markdown-heading"&gt;
&lt;h6 class="heading-element"&gt;Real-Time transcription demo&lt;/h6&gt;
&lt;/div&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;em&gt;Video: use-whisper-real-time-transcription.mp4&lt;/em&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;
&lt;div class="markdown-heading"&gt;
&lt;h3 class="heading-element"&gt;Announcement&lt;/h3&gt;

&lt;/div&gt;
&lt;p&gt;useWhisper for React Native is being developed.&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Repository: &lt;a href="https://github.com/chengsokdara/use-whisper-native" rel="noopener noreferrer"&gt;https://github.com/chengsokdara/use-whisper-native&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Progress: &lt;a class="issue-link js-issue-link" href="https://github.com/chengsokdara/use-whisper-native/issues/1" rel="noopener noreferrer"&gt;chengsokdara/use-whisper-native#1&lt;/a&gt;&lt;/p&gt;

&lt;/div&gt;
  &lt;/div&gt;
  &lt;div class="gh-btn-container"&gt;&lt;a class="gh-btn" href="https://github.com/chengsokdara/use-whisper" rel="noopener noreferrer"&gt;View on GitHub&lt;/a&gt;&lt;/div&gt;
&lt;/div&gt;



&lt;h2&gt;
  
  
  Contact me
&lt;/h2&gt;

&lt;p&gt;For web or mobile app development using React or React Native:&lt;br&gt;
&lt;a href="mailto:chengsokdara@gmail.com"&gt;chengsokdara@gmail.com&lt;/a&gt;&lt;br&gt;
&lt;a href="https://chengsokdara.github.io" rel="noopener noreferrer"&gt;https://chengsokdara.github.io&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Conclusion
&lt;/h2&gt;

&lt;p&gt;useWhisper is a game-changer for anyone who needs to transcribe speech into text quickly and accurately. With its built-in speech recorder and silence removal, you can save time and reduce costs. Whether you're a journalist, a customer service representative, or anyone else who transcribes speech, useWhisper is the tool for you. Try it out today!&lt;/p&gt;

</description>
      <category>react</category>
      <category>whisper</category>
      <category>openai</category>
      <category>nextjs</category>
    </item>
    <item>
      <title>AI-Powered Hiring: From Inbox Chaos to Structured Data with Postmark &amp; LLM</title>
      <dc:creator>Ra Baryon Neutrino</dc:creator>
      <pubDate>Sun, 01 Jun 2025 18:12:07 +0000</pubDate>
      <link>https://dev.to/chengsokdara/ai-powered-hiring-from-inbox-chaos-to-structured-data-with-postmark-llm-59kf</link>
      <guid>https://dev.to/chengsokdara/ai-powered-hiring-from-inbox-chaos-to-structured-data-with-postmark-llm-59kf</guid>
      <description>&lt;p&gt;This is a submission for the &lt;a href="https://dev.to/challenges/postmark"&gt;Postmark Challenge: Inbox Innovators&lt;/a&gt;.&lt;/p&gt;

&lt;h2&gt;
  
  
  Postmark x Dev.to Challenge by Cheng Sokdara
&lt;/h2&gt;

&lt;h2&gt;
  
  
  What I Built
&lt;/h2&gt;

&lt;p&gt;I built a web application that automates the initial stages of the hiring process by leveraging Next.js and Postmark's inbound email parsing feature. The application receives job application emails, uses an LLM (OpenAI GPT) to intelligently extract relevant information from the email body and any attached resumes (PDF or DOCX), and then stores this structured data in Firebase Firestore. This data, including candidate details, job application specifics, and the original parsed email, is then accessible via a dashboard for easy viewing and management.&lt;/p&gt;

&lt;h2&gt;
  
  
  Demo
&lt;/h2&gt;

&lt;p&gt;You can try out the live demo here:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://postmark-devto-chengsokdara.vercel.app" rel="noopener noreferrer"&gt;https://postmark-devto-chengsokdara.vercel.app&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Faxwbaf7dzzkg087f0m1h.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Faxwbaf7dzzkg087f0m1h.png" alt="Dashboard with webhook URL" width="800" height="362"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fjtaurlz2vqbtwfxwwttz.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fjtaurlz2vqbtwfxwwttz.png" alt="Job application table page" width="800" height="362"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fxhnmsqqvn79x3xglgei0.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fxhnmsqqvn79x3xglgei0.png" alt="Job application details page" width="800" height="362"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ft2iz8b399wd34vmfpbti.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ft2iz8b399wd34vmfpbti.png" alt="Dashboard on mobile" width="362" height="781"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fofkni6atd513mesjefd9.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fofkni6atd513mesjefd9.png" alt="Parsed email table page on mobile" width="364" height="787"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Testing Instructions:&lt;/strong&gt;&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt; Navigate to the demo application and log in using a Google account.&lt;/li&gt;
&lt;li&gt; Once on the dashboard, you'll need to provide a username and a disposable OpenAI API key. This key is used securely on the server side to process the parsed emails with the LLM.&lt;/li&gt;
&lt;li&gt; After submitting the form, a unique webhook URL will be generated for you.&lt;/li&gt;
&lt;li&gt; Copy this webhook URL.&lt;/li&gt;
&lt;li&gt; Go to your Postmark account, navigate to your inbound email settings, and paste this webhook URL. Save the changes.&lt;/li&gt;
&lt;li&gt; Now, using any email client, compose a job application email (you can attach a resume in PDF or DOCX format) and send it to your Postmark inbound email address.&lt;/li&gt;
&lt;li&gt; Wait a few seconds for the processing to complete.&lt;/li&gt;
&lt;li&gt; Refresh the dashboard in the web app or navigate to the "Applications," "Candidates," or "Emails" pages to see the parsed data.&lt;/li&gt;
&lt;/ol&gt;

&lt;h2&gt;
  
  
  Code Repository
&lt;/h2&gt;

&lt;p&gt;The complete source code is available on GitHub:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://github.com/chengsokdara/postmark-devto-chengsokdara" rel="noopener noreferrer"&gt;https://github.com/chengsokdara/postmark-devto-chengsokdara&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  How I Built It
&lt;/h2&gt;

&lt;p&gt;I developed this project from scratch with the help of ChatGPT and Grok. This involved writing original utility functions and a few custom UI components.&lt;/p&gt;

&lt;p&gt;The core process flow is as follows:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt; Postmark receives an inbound job application email and parses it, providing a JSON representation.&lt;/li&gt;
&lt;li&gt; My application ingests this JSON. If a resume (PDF or DOCX) is attached, it's parsed using &lt;code&gt;pdf2json&lt;/code&gt; for PDFs or &lt;code&gt;mammoth&lt;/code&gt; for DOCX files.&lt;/li&gt;
&lt;li&gt; The combined data (parsed email content and parsed resume text) is then sent to the OpenAI GPT LLM.&lt;/li&gt;
&lt;li&gt; The LLM processes this information, extracting key details and structuring them.&lt;/li&gt;
&lt;li&gt; This structured data, which includes candidate information, job application details, and the original parsed email, is then saved to Firebase Firestore.&lt;/li&gt;
&lt;/ol&gt;
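Step 2 above can be sketched as a small content-type dispatch. The attachment field names follow Postmark's inbound JSON, while `parserFor` is a hypothetical helper written for this article, not the repository's actual code:

```typescript
// Postmark inbound attachments carry Name, ContentType, and base64 Content.
type PostmarkAttachment = { Name: string; ContentType: string; Content: string }

// Hypothetical dispatch mirroring step 2: choose the resume parser by
// MIME type (pdf2json for PDFs, mammoth for DOCX).
const parserFor = (att: PostmarkAttachment): 'pdf2json' | 'mammoth' | null => {
  switch (att.ContentType) {
    case 'application/pdf':
      return 'pdf2json'
    case 'application/vnd.openxmlformats-officedocument.wordprocessingml.document':
      return 'mammoth'
    default:
      return null // unsupported attachment type: skip it
  }
}
```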

&lt;p&gt;My experience with Postmark's inbound parsing was straightforward; the JSON output was easy to work with and integrate into the application flow.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Tech Stack:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Framework:&lt;/strong&gt; Next.js (App Router) with TypeScript&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Styling:&lt;/strong&gt; Tailwind CSS &amp;amp; Daisy UI&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Database:&lt;/strong&gt; Firebase Firestore&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Authentication:&lt;/strong&gt; Firebase Auth&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;State Management (Global Toast):&lt;/strong&gt; Zustand&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;File Parsing:&lt;/strong&gt; &lt;code&gt;pdf2json&lt;/code&gt; (for PDFs), &lt;code&gt;mammoth&lt;/code&gt; (for DOCX)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Validation:&lt;/strong&gt; Zod&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;LLM:&lt;/strong&gt; OpenAI GPT-3.5 (or your preferred model)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Icons:&lt;/strong&gt; Heroicons&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Email Parsing Service:&lt;/strong&gt; Postmark&lt;/li&gt;
&lt;/ul&gt;
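To make the "structured data" concrete, here is a guess at what one stored document could look like; every field name and value is illustrative only, and the real schema lives in the repository:

```typescript
// Hypothetical Firestore document shape for one processed application.
// Field names are illustrative; see the GitHub repository for the
// actual schema.
type ApplicationDoc = {
  candidate: { name: string; email: string; phone?: string }
  application: { position: string; receivedAt: string } // receivedAt as ISO 8601
  rawEmail: { subject: string; textBody: string }
}

const example: ApplicationDoc = {
  candidate: { name: 'Jane Doe', email: 'jane@example.com' },
  application: { position: 'Frontend Developer', receivedAt: '2025-06-01T18:12:07Z' },
  rawEmail: { subject: 'Application: Frontend Developer', textBody: '...' },
}
```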

&lt;h2&gt;
  
  
  Team Submissions
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://dev.to/chengsokdara"&gt;@chengsokdara&lt;/a&gt;&lt;/p&gt;




&lt;p&gt;Contact me for web or mobile app development using React or React Native&lt;br&gt;
&lt;a href="https://chengsokdara.github.io" rel="noopener noreferrer"&gt;https://chengsokdara.github.io&lt;/a&gt;&lt;/p&gt;

</description>
      <category>devchallenge</category>
      <category>postmarkchallenge</category>
      <category>webdev</category>
      <category>api</category>
    </item>
  </channel>
</rss>
