DEV Community

cychu42

Hacktoberfest Week 2: Transcript

Topic

This time, I contributed to a repo from Software Engineering Unlock, a podcast that discusses topics in software engineering. The maintainers needed help going over automatically generated transcripts and correcting them to improve accessibility, so I took on the issue for one of the episodes and made a pull request.

Changes

A Snippet

  • Added missing content at the very beginning and the end.
  • Added some punctuation.
  • Fixed incorrect parts of the transcript.
  • Fixed typos, including misspelled names and missing capitalization.
  • Fixed time codes.
  • Removed some filler words.

Process

I went over the transcript line by line while listening to the podcast. I discovered that some content in the audio was missing from the transcript, probably added during post-recording editing, so I added it to the transcript and shifted the time codes.
This went a lot smoother than last week, as I now have experience with rebasing and squashing commits, as well as with following contribution guidelines. The maintainer was very nice in our interaction, and the pull request was accepted without a hitch.

Learning

It wasn't difficult per se. It mostly came down to carefully editing the lines to match the audio and figuring out what the speakers were saying in a rather casual conversation, so it took some concentration and repeatedly going over the same section to double-check.
In order to do this, I learned about accessibility and transcribing rules, such as the format used, what to do with background sounds, and so on.
For instance, the format used was: [time_code] <speaker>: <words>, like [00:02:25] Dr. Smith: I would say so.
For background noise, you only include it if it's relevant, as (<sound_description>). Make sure to italicize the description, like (laugh).
Things like stutters and filler words are supposed to be omitted to improve readability.
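
To make the line format concrete, here is a minimal Python sketch that checks whether a transcript line follows the [time_code] <speaker>: <words> shape. It is only an illustration: the name `parse_line` is hypothetical, the regular expression is an assumption based on the single example above, and the repo's actual guidelines may allow variations this does not cover.

```python
import re

# Assumed line shape, taken from the example above: [HH:MM:SS] Speaker: words
LINE_RE = re.compile(r"^\[(\d{2}):(\d{2}):(\d{2})\]\s+([^:]+):\s*(.*)$")


def parse_line(line):
    """Return (seconds, speaker, words), or None if the line does not match."""
    match = LINE_RE.match(line.strip())
    if match is None:
        return None
    hours, minutes, seconds, speaker, words = match.groups()
    # Convert the time code to a total number of seconds.
    total = int(hours) * 3600 + int(minutes) * 60 + int(seconds)
    return total, speaker.strip(), words


print(parse_line("[00:02:25] Dr. Smith: I would say so."))
# prints (145, 'Dr. Smith', 'I would say so.')
```

A check like this can flag lines with a malformed time code or a missing speaker, but it cannot tell whether the words or the time code actually match the audio.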

If I were to do it again, I would probably avoid shifting the time codes by a fixed number of seconds without following the audio. I thought I just needed to shift the rest of the time codes by a fixed amount after including the added content from the final audio, but I was apparently wrong. As I discovered, the time codes didn't match exactly. It's better to follow along with the audio and fix each time code. Perhaps the original transcript was off in the first place.
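
For illustration, the constant shift described above could be sketched in Python roughly like this. The function name `shift_time_codes` is hypothetical, and the [HH:MM:SS] shape is assumed from the format example earlier. As noted, a fixed offset is at best a first pass, and each time code still needs to be checked against the audio.

```python
import re

# Assumed time code shape: [HH:MM:SS]
TIME_CODE_RE = re.compile(r"\[(\d{2}):(\d{2}):(\d{2})\]")


def shift_time_codes(text, offset_seconds):
    """Shift every [HH:MM:SS] time code in text by a fixed number of seconds."""

    def _shift(match):
        hours, minutes, seconds = (int(group) for group in match.groups())
        # Clamp at zero so a negative offset never produces a negative time.
        total = max(0, hours * 3600 + minutes * 60 + seconds + offset_seconds)
        return "[{:02d}:{:02d}:{:02d}]".format(
            total // 3600, total % 3600 // 60, total % 60
        )

    return TIME_CODE_RE.sub(_shift, text)


print(shift_time_codes("[00:02:25] Dr. Smith: I would say so.", 40))
# prints [00:03:05] Dr. Smith: I would say so.
```

If the original time codes are already slightly off, a uniform shift simply carries that drift along, which may be why the shifted codes did not line up exactly.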
