The full source code for this project is on GitHub: https://github.com/shravyanayani/iosPdfVoiceReader
Why Choose Native Apps?
Native apps on iOS are designed to take full advantage of the underlying platform’s features and capabilities. In this case, the phone’s built-in Text-to-Speech (TTS) engine can be integrated directly, allowing the app to read PDF content aloud without third-party libraries or external services.
Beyond feature access, iOS native apps also excel in performance and responsiveness. Since they are optimized for the specific operating system, they can handle tasks like parsing text and generating speech more efficiently, resulting in a smoother, faster, and more reliable user experience.
Why Create PDF Voice Reader?
PDF is one of the most widely used document formats, valued for its portability across devices and operating systems. Books, research papers, articles, and even web pages can easily be saved as PDFs, making the format a de facto standard for digital reading.
While PDFs are convenient for distribution, reading them visually isn’t always practical or desirable. In many situations, such as driving, exercising, winding down before sleep, or cutting back on screen time, having the document read aloud is far more convenient.
Although there are existing apps in app stores that provide PDF-to-speech functionality, many of them come with drawbacks. They often include intrusive advertisements and lack the customization options users truly need. For example, most do not allow skipping repetitive elements like headers, footers, or page numbers, which disrupt the listening experience.
By creating PDF Voice Reader, these limitations can be overcome. The app not only eliminates ads but also offers greater flexibility, allowing users to tailor the reading experience to their needs. This makes it a more personalized, efficient, and user-friendly solution for anyone who wants to consume PDF content through voice.
Key Features of PDF Voice Reader
1. Native Text-to-Speech (TTS) Integration
PDF Voice Reader leverages the built-in Text-to-Speech engine of the mobile operating system. This ensures seamless performance without external dependencies. The app converts the text extracted from a PDF into high-quality speech, using the same voice and settings already available on the iOS device. Users can also customize the voice directly from their device’s system settings.
2. File Selection with Native File Picker
Users can easily select a PDF file from their device using the native iOS file picker dialog. Once chosen, the selected document is displayed in the app, ready to be read aloud. This makes the process quick, intuitive, and consistent with the device’s user experience.
3. Playback Controls
The app includes simple but powerful controls for listening:
Play/Pause/Resume the reading at any time.
Adjust the reading speed through a dropdown menu with options for slower or faster playback.
4. Page Navigation
Reading doesn’t have to start at the beginning of a document. Users can:
Enter a specific page number to jump directly to that page.
Restart playback from the chosen page once the controls are activated.
Use the Next Page and Previous Page buttons to move one page backward or forward.
This feature is especially useful for textbooks, research papers, or long-form PDFs.
5. Phrase Ignoring for Cleaner Listening
A distinctive feature of PDF Voice Reader is the ability to ignore repetitive phrases such as headers, footers, or page numbers.
Users can add these phrases to an “Ignore List” so they won’t be read aloud.
Each ignored phrase is displayed in a list with a delete icon, allowing users to manage or remove phrases at any time.
This customization significantly improves the listening experience, making the content flow more naturally.
6. iOS Theme and Controls
The PDF Voice Reader app is built using iOS native controls, ensuring a familiar look, feel, and behavior consistent with other iOS apps. This not only enhances user-friendliness but also makes the interface more intuitive, as users can rely on the interactions they already know. Additionally, the app automatically adapts to the system’s chosen theme—whether light mode or dark mode—providing a seamless and visually consistent experience.
Why Xcode for Building PDF Voice Reader?
To create the PDF Voice Reader iOS app from scratch, I chose Xcode as the development environment. Xcode is the official IDE (Integrated Development Environment) for iOS, designed specifically for building, testing, and deploying apps on Apple devices. Its tight integration with the native iOS SDK makes it the most reliable and future-proof choice for native development.
1. Access to Native SDKs
Xcode ships with the current iOS SDK, and each project can set a deployment target older than that SDK. This is critical for building apps that not only use the newest platform features but also remain accessible to users on slightly older iOS versions.
2. Built-In Device Simulators
One of Xcode’s most useful features is its built-in iOS Simulator, which allows developers to test the app on multiple device models and iOS versions without needing the physical hardware. This makes it possible to verify behavior and UI layout across a wide variety of scenarios, saving significant development time, although performance is still best checked on a real device.
3. Standardized Layouts and Controls
Xcode also provides native UI components and layout tools that strictly follow Apple’s Human Interface Guidelines. By leveraging these, the PDF Voice Reader app automatically inherits key iOS features such as theming (light and dark modes), accessibility standards, and a familiar look-and-feel. This ensures the app feels natural to users while maintaining high compatibility with iOS design principles.
4. Streamlined Development Workflow
From code editing and debugging to interface design and deployment, Xcode offers a comprehensive workflow in one place. This integration reduces complexity and allows for faster, more efficient development compared to using third-party tools.
Why Use Swift for PDF Voice Reader?
For developing the PDF Voice Reader app, I chose Swift as the programming language. Swift is Apple’s modern, powerful, and intuitive language designed specifically for building apps across the Apple ecosystem, including iOS, iPadOS, watchOS, and macOS.
1. Native Performance and Compatibility
Swift is fully integrated with the iOS SDK and Apple’s development tools, making it the best choice for achieving native performance. Apps written in Swift run efficiently, take advantage of the latest iOS features, and integrate seamlessly with system services like Text-to-Speech.
2. Simplicity and Readability
Swift’s syntax is clean, concise, and expressive, making it easier to write and maintain code compared to older languages like Objective-C. This simplicity helps speed up development while reducing the chances of errors, making the codebase more maintainable over time.
3. Safety and Reliability
One of Swift’s strengths is its focus on safety. Features like strong typing, optionals, and automatic memory management help catch errors early during compilation rather than at runtime. This leads to more reliable and stable apps—crucial for providing a smooth reading experience to users.
4. Modern Features for Faster Development
Swift offers powerful features such as closures, generics, and structured concurrency, which make coding more efficient and expressive. These modern tools enable developers to implement features like customizable playback or phrase filtering with less code and greater clarity.
5. Future-Proof and Actively Supported
Swift is actively maintained and improved by Apple and the open-source community. Choosing Swift makes it more likely that the app stays compatible with future versions of iOS and benefits from ongoing performance improvements, security updates, and new language features.
Steps to Create the Project in Xcode
Since this is a single-screen app, we can start with the standard iOS app template in Xcode. Follow these steps:
1. Open Xcode
   From the top menu, go to File → New → Project.
2. Choose a Template
   In the dialog that appears, select the iOS tab.
   Under Application, choose App.
   Click Next.
3. Configure the Project with the Following Settings
   Product Name: PDFReadAloud
   Organization Identifier: com.productivity
   Interface: SwiftUI
   Language: Swift
   Check the box for Include Tests to add a testing target.
   Click Next.
4. Select the Project Location
   Create or select a folder named PDFReadAloud.
   Click Create to generate the project.
At this point, Xcode will scaffold the project with the necessary files and structure, and you’ll be ready to start coding the app.
Significant Code Fragments
Code to open a PDF file selection dialog and display file name.
// Inside the SwiftUI view body: a button that opens the document picker
Button(action: {
    pdfViewModel.showDocumentPicker = true
}) {
    HStack {
        Image(systemName: "doc.fill")
        Text("Select PDF File")
    }
    .padding()
    .background(Color.blue)
    .foregroundColor(.white)
    .cornerRadius(8)
}
.sheet(isPresented: $pdfViewModel.showDocumentPicker) {
    DocumentPicker(pdfURL: $pdfViewModel.pdfURL, pdfFileName: $pdfViewModel.pdfFileName)
}
// Show the name of the selected file once one has been picked
if !pdfViewModel.pdfFileName.isEmpty {
    Text("Selected file: \(pdfViewModel.pdfFileName)")
        .font(.subheadline)
}
import SwiftUI
import UniformTypeIdentifiers

// Wraps UIDocumentPickerViewController so it can be presented from SwiftUI
struct DocumentPicker: UIViewControllerRepresentable {
    @Binding var pdfURL: URL?
    @Binding var pdfFileName: String
    @Environment(\.presentationMode) var presentationMode

    func makeUIViewController(context: Context) -> UIDocumentPickerViewController {
        let picker = UIDocumentPickerViewController(forOpeningContentTypes: [UTType.pdf])
        picker.allowsMultipleSelection = false
        picker.delegate = context.coordinator
        // Show file extensions in the picker
        picker.shouldShowFileExtensions = true
        return picker
    }

    func updateUIViewController(_ uiViewController: UIDocumentPickerViewController, context: Context) {}

    func makeCoordinator() -> Coordinator {
        Coordinator(self)
    }

    class Coordinator: NSObject, UIDocumentPickerDelegate {
        let parent: DocumentPicker

        init(_ parent: DocumentPicker) {
            self.parent = parent
        }

        func documentPicker(_ controller: UIDocumentPickerViewController, didPickDocumentsAt urls: [URL]) {
            guard let url = urls.first else { return }
            // Start accessing the security-scoped resource; access ends when this method returns
            let didStartAccessing = url.startAccessingSecurityScopedResource()
            defer {
                if didStartAccessing {
                    url.stopAccessingSecurityScopedResource()
                }
            }
            do {
                // Create a copy in the app's documents directory for persistent access
                let documentsDirectory = FileManager.default.urls(for: .documentDirectory, in: .userDomainMask).first!
                let destinationURL = documentsDirectory.appendingPathComponent(url.lastPathComponent)
                // Remove any existing file
                if FileManager.default.fileExists(atPath: destinationURL.path) {
                    try FileManager.default.removeItem(at: destinationURL)
                }
                // Copy the file
                try FileManager.default.copyItem(at: url, to: destinationURL)
                // Update the view model with the local URL
                DispatchQueue.main.async {
                    self.parent.pdfURL = destinationURL
                    self.parent.pdfFileName = url.lastPathComponent
                }
            } catch {
                print("Error copying file: \(error.localizedDescription)")
                // If copying fails, fall back to the original URL.
                // Create the bookmark here, while security-scoped access is still active.
                do {
                    let bookmarkData = try url.bookmarkData(options: .minimalBookmark, includingResourceValuesForKeys: nil, relativeTo: nil)
                    UserDefaults.standard.set(bookmarkData, forKey: "pdfBookmark")
                    DispatchQueue.main.async {
                        self.parent.pdfURL = url
                        self.parent.pdfFileName = url.lastPathComponent
                    }
                } catch {
                    print("Failed to create bookmark: \(error.localizedDescription)")
                }
            }
            parent.presentationMode.wrappedValue.dismiss()
        }
    }
}
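The copy-into-Documents step is plain Foundation code, so it can be pulled out of the delegate and exercised on its own. The sketch below is one way to do that. `importCopy(of:into:fileManager:)` is a hypothetical helper name, not part of the project, and it assumes the caller has already started security-scoped access to the source URL.

```swift
import Foundation

/// Copies `source` into `directory`, replacing any file with the same name,
/// and returns the URL of the copy. (Hypothetical helper for illustration.)
func importCopy(of source: URL,
                into directory: URL,
                fileManager: FileManager = .default) throws -> URL {
    // Make sure the destination folder exists; this is a no-op if it already does
    try fileManager.createDirectory(at: directory, withIntermediateDirectories: true)
    let destination = directory.appendingPathComponent(source.lastPathComponent)
    // copyItem fails if the destination exists, so remove an older copy first
    if fileManager.fileExists(atPath: destination.path) {
        try fileManager.removeItem(at: destination)
    }
    try fileManager.copyItem(at: source, to: destination)
    return destination
}
```

In the picker delegate, the call would look like `try importCopy(of: url, into: documentsDirectory)`, with the bookmark fallback kept in the `catch` branch.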
Code to select a specific page number to begin reading from.
// Inside the SwiftUI view body: page number field
Text("Page #")
TextField("1", text: $pdfViewModel.pageNumberText)
    .keyboardType(.numberPad)
    .frame(width: 60)
    .textFieldStyle(RoundedBorderTextFieldStyle())
    .onChange(of: pdfViewModel.pageNumberText) { newValue in
        // The field is 1-based, while currentPage is a 0-based index
        if let pageNumber = Int(newValue), pageNumber > 0 {
            pdfViewModel.currentPage = pageNumber - 1
        }
    }
// Previous Page button, disabled on the first page
Button(action: {
    pdfViewModel.previousPage()
}) {
    HStack {
        Image(systemName: "arrow.backward")
        Text("Previous Page")
    }
    .padding()
    .background(pdfViewModel.hasPreviousPage ? Color.blue : Color.gray)
    .foregroundColor(.white)
    .cornerRadius(8)
}
.disabled(!pdfViewModel.hasPreviousPage)
// Next Page button, disabled on the last page
Button(action: {
    pdfViewModel.nextPage()
}) {
    HStack {
        Image(systemName: "arrow.forward")
        Text("Next Page")
    }
    .padding()
    .background(pdfViewModel.hasNextPage ? Color.blue : Color.gray)
    .foregroundColor(.white)
    .cornerRadius(8)
}
.disabled(!pdfViewModel.hasNextPage)
// View model: move one page forward and start reading it
func nextPage() {
    guard hasNextPage else {
        updateStatus("Already at the last page", isError: true)
        return
    }
    currentPage += 1
    pageNumberText = "\(currentPage + 1)"
    readCurrentPage()
}

// View model: move one page back and start reading it
func previousPage() {
    guard hasPreviousPage else {
        updateStatus("Already at the first page", isError: true)
        return
    }
    currentPage -= 1
    pageNumberText = "\(currentPage + 1)"
    readCurrentPage()
}

var hasPreviousPage: Bool {
    // Requires a loaded document and a page before the current one
    return pdfDocument != nil && currentPage > 0
}

var hasNextPage: Bool {
    guard let pdfDocument = pdfDocument else { return false }
    return currentPage < pdfDocument.pageCount - 1
}
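The page field accepts free text, so it helps to validate the input against the document's page count before using it as an index. A minimal sketch, assuming a hypothetical helper `pageIndex(fromInput:pageCount:)` that is not in the project:

```swift
import Foundation

/// Converts 1-based user input such as "12" into a 0-based page index.
/// Returns nil when the input is not a number or is outside 1...pageCount.
func pageIndex(fromInput input: String, pageCount: Int) -> Int? {
    let trimmed = input.trimmingCharacters(in: .whitespaces)
    guard pageCount > 0,
          let pageNumber = Int(trimmed),
          pageNumber >= 1,
          pageNumber <= pageCount else {
        return nil
    }
    return pageNumber - 1
}
```

The `onChange` handler could then assign `currentPage` only when this returns a non-nil index, which also covers page numbers larger than the document.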
func readPDF() {
    guard var url = pdfURL else {
        updateStatus("No PDF file selected", isError: true)
        return
    }
    // Check if we need to restore security-scoped resource access
    var didStartAccessing = false
    if !url.isFileURL || !FileManager.default.fileExists(atPath: url.path) {
        // Try to resolve from bookmark if needed
        if let bookmarkData = UserDefaults.standard.data(forKey: "pdfBookmark") {
            do {
                var isStale = false
                let resolvedURL = try URL(resolvingBookmarkData: bookmarkData,
                                          options: .withoutUI,
                                          relativeTo: nil,
                                          bookmarkDataIsStale: &isStale)
                if isStale {
                    updateStatus("PDF bookmark is stale, please select the file again", isError: true)
                    return
                }
                // Start accessing the security-scoped resource
                didStartAccessing = resolvedURL.startAccessingSecurityScopedResource()
                // Use the resolved URL from here on, both locally and in the view model
                url = resolvedURL
                pdfURL = resolvedURL
            } catch {
                updateStatus("Failed to access PDF file: \(error.localizedDescription)", isError: true)
                return
            }
        }
    }
    // Load the document from the (possibly resolved) URL
    let loadedDocument = PDFDocument(url: url)
    // Stop accessing the security-scoped resource if needed
    if didStartAccessing {
        url.stopAccessingSecurityScopedResource()
    }
    guard let pdfDoc = loadedDocument else {
        updateStatus("Failed to load PDF document. The file may be corrupted or password-protected.", isError: true)
        return
    }
    pdfDocument = pdfDoc
    // Start from the requested page if it is valid, otherwise fall back to page 1
    if let pageNumber = Int(pageNumberText), pageNumber > 0 && pageNumber <= pdfDoc.pageCount {
        currentPage = pageNumber - 1
        updateStatus("Successfully loaded PDF with \(pdfDoc.pageCount) pages", isError: false)
    } else {
        currentPage = 0
        pageNumberText = "1"
        updateStatus("Invalid page number. Starting from page 1", isError: true)
    }
    readCurrentPage()
}
func readCurrentPage() {
    guard let pdfDocument = pdfDocument, currentPage >= 0, currentPage < pdfDocument.pageCount else {
        updateStatus("Invalid page number or no PDF loaded", isError: true)
        return
    }
    // Stop any utterance that is still in progress
    speechSynthesizer.stopSpeaking(at: .immediate)
    guard let page = pdfDocument.page(at: currentPage) else {
        updateStatus("Failed to load page \(currentPage + 1)", isError: true)
        return
    }
    // Extract text from the PDF page; treat a missing string as empty
    var pageText = page.string ?? ""
    // If there is no readable text, announce that instead
    if pageText.trimmingCharacters(in: .whitespacesAndNewlines).isEmpty {
        print("No text found on page \(currentPage + 1)")
        pageText = "No readable text found on this page."
        updateStatus("No readable text found on page \(currentPage + 1)", isError: true)
    } else {
        updateStatus("Reading page \(currentPage + 1) of \(pdfDocument.pageCount)", isError: false)
    }
    // Filter out excluded texts
    for excludedText in excludedTexts {
        pageText = pageText.replacingOccurrences(of: excludedText, with: "")
    }
    // Create and configure the utterance, keeping the rate inside the range the synthesizer accepts
    let utterance = AVSpeechUtterance(string: pageText)
    let requestedRate = selectedRate * AVSpeechUtteranceDefaultSpeechRate
    utterance.rate = min(max(requestedRate, AVSpeechUtteranceMinimumSpeechRate), AVSpeechUtteranceMaximumSpeechRate)
    // Always read with a US English voice
    utterance.voice = AVSpeechSynthesisVoice(language: "en-US")
    // Use the default pitch and full volume
    utterance.pitchMultiplier = 1.0
    utterance.volume = 1.0
    currentUtterance = utterance
    speechSynthesizer.speak(utterance)
    isPlaying = true
    shouldContinueToNextPage = true
    print("Reading page \(currentPage + 1) with \(pageText.count) characters")
}
// Helper function to update status messages
private func updateStatus(_ message: String, isError: Bool) {
    DispatchQueue.main.async {
        self.statusMessage = message
        self.isError = isError
        // Auto-clear success messages after 5 seconds
        if !isError {
            DispatchQueue.main.asyncAfter(deadline: .now() + 5) {
                // Only clear if it's still the same message
                if self.statusMessage == message {
                    self.statusMessage = ""
                }
            }
        }
    }
}

// AVSpeechSynthesizerDelegate callback: called when an utterance has been spoken to the end
func speechSynthesizer(_ synthesizer: AVSpeechSynthesizer, didFinish utterance: AVSpeechUtterance) {
    // Automatically move to the next page if we're at the end of the current page
    if shouldContinueToNextPage && hasNextPage {
        DispatchQueue.main.async {
            self.nextPage()
        }
    } else {
        DispatchQueue.main.async {
            self.isPlaying = false
        }
    }
}
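The decision made in the delegate callback (continue to the next page or stop) can also be written as a pure function, which makes it testable without a speech synthesizer. `FinishAction` and `actionAfterFinishing(page:pageCount:shouldContinue:)` are illustrative names, not part of the project:

```swift
/// What the reader should do once the current page has been spoken to the end.
enum FinishAction: Equatable {
    case readPage(Int)   // 0-based index of the page to read next
    case stop            // last page reached, or auto-advance is switched off
}

/// Mirrors the `shouldContinueToNextPage && hasNextPage` check as a pure function.
func actionAfterFinishing(page: Int, pageCount: Int, shouldContinue: Bool) -> FinishAction {
    let hasNextPage = page < pageCount - 1
    return (shouldContinue && hasNextPage) ? .readPage(page + 1) : .stop
}
```

A view model could call this from `didFinish` and switch on the result, keeping the delegate method itself very small.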
Code to change the reading speed, either faster or slower.
// Inside the SwiftUI view body: reading speed menu
Text("Reading Speed")
Picker("", selection: $pdfViewModel.selectedRate) {
    ForEach(pdfViewModel.availableRates, id: \.self) { rate in
        Text("\(rate, specifier: "%.2f")x").tag(rate)
    }
}
.pickerStyle(MenuPickerStyle())
.onChange(of: pdfViewModel.selectedRate) { newValue in
    pdfViewModel.updateSpeechRate(newValue)
}
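// View model: speed multipliers offered in the menu (1.0 is the default speech rate)
let availableRates: [Float] = [0.25, 0.5, 0.75, 1.0, 1.25, 1.5, 2.0, 3.0]

func updateSpeechRate(_ rate: Float) {
    // Only restart speech if something is currently being read
    if let utterance = currentUtterance, speechSynthesizer.isSpeaking {
        speechSynthesizer.stopSpeaking(at: .immediate)
        // Speak a new utterance at the new rate; this restarts the page text from the beginning
        let newUtterance = AVSpeechUtterance(string: utterance.speechString)
        // Keep the rate inside the range the synthesizer accepts
        let requestedRate = rate * AVSpeechUtteranceDefaultSpeechRate
        newUtterance.rate = min(max(requestedRate, AVSpeechUtteranceMinimumSpeechRate), AVSpeechUtteranceMaximumSpeechRate)
        newUtterance.voice = utterance.voice
        currentUtterance = newUtterance
        speechSynthesizer.speak(newUtterance)
        isPlaying = true
    }
}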
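`AVSpeechUtterance.rate` is documented as a value between `AVSpeechUtteranceMinimumSpeechRate` and `AVSpeechUtteranceMaximumSpeechRate`, so a large multiplier such as 3.0 needs clamping. The sketch below shows the multiplier-to-rate mapping in isolation. To stay runnable without AVFoundation it hard-codes 0.0, 0.5, and 1.0 for the minimum, default, and maximum rates; these are assumptions standing in for the framework constants, and `utteranceRate(forMultiplier:)` is a hypothetical helper.

```swift
// Assumed stand-ins for AVSpeechUtteranceMinimumSpeechRate,
// AVSpeechUtteranceDefaultSpeechRate and AVSpeechUtteranceMaximumSpeechRate.
let assumedMinimumRate: Float = 0.0
let assumedDefaultRate: Float = 0.5
let assumedMaximumRate: Float = 1.0

/// Maps a user-facing multiplier (1.0 = normal speed) to an utterance rate,
/// clamped to the assumed valid range.
func utteranceRate(forMultiplier multiplier: Float) -> Float {
    let requested = multiplier * assumedDefaultRate
    return min(max(requested, assumedMinimumRate), assumedMaximumRate)
}
```

With these assumed bounds, the 2.0x and 3.0x menu options both map to the maximum rate, so they would sound the same. The set of multipliers offered in the menu may be worth adjusting accordingly.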
Code to pause, resume, or start the reading.
// Inside the SwiftUI view body: Play/Pause button, disabled until a PDF is selected
Button(action: {
    pdfViewModel.togglePlayPause()
}) {
    HStack {
        Image(systemName: pdfViewModel.isPlaying ? "pause.fill" : "play.fill")
        Text(pdfViewModel.isPlaying ? "Pause" : "Play")
    }
    .padding()
    .background(pdfViewModel.pdfURL != nil ? Color.blue : Color.gray)
    .foregroundColor(.white)
    .cornerRadius(8)
}
.disabled(pdfViewModel.pdfURL == nil)

// View model: pause if playing, resume if paused, otherwise start reading the current page
func togglePlayPause() {
    if isPlaying {
        // Pause at the next word boundary and stop auto-advancing to the next page
        speechSynthesizer.pauseSpeaking(at: .word)
        isPlaying = false
        shouldContinueToNextPage = false
        updateStatus("Paused reading at page \(currentPage + 1)", isError: false)
    } else {
        if speechSynthesizer.isPaused {
            speechSynthesizer.continueSpeaking()
            isPlaying = true
            shouldContinueToNextPage = true
            updateStatus("Resumed reading from page \(currentPage + 1)", isError: false)
        } else {
            readCurrentPage()
        }
    }
}
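`togglePlayPause()` effectively moves between three situations: nothing is being read, reading is in progress, or reading is paused. The sketch below models that as an explicit state. `PlaybackState` and `toggled(_:)` are illustrative names, and the project itself tracks this with `isPlaying` and the synthesizer's `isPaused` flag.

```swift
/// The three situations the Play/Pause button has to handle.
enum PlaybackState: Equatable {
    case idle      // nothing has been started, or the last page finished
    case playing   // an utterance is being spoken
    case paused    // an utterance was paused and can be continued
}

/// Returns the state after the Play/Pause button is tapped.
func toggled(_ state: PlaybackState) -> PlaybackState {
    switch state {
    case .playing: return .paused    // corresponds to pauseSpeaking(at: .word)
    case .paused:  return .playing   // corresponds to continueSpeaking()
    case .idle:    return .playing   // corresponds to readCurrentPage()
    }
}
```

Making the state explicit is a design choice rather than a requirement. Its main benefit is that the button label and the transition logic can both be derived from one value.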
Code to add phrases to an exclusion list and remove them individually when needed.
// Inside the SwiftUI view body: input field for a phrase to exclude
Text("Exclude Text:")
    .font(.headline)
HStack {
    TextField("Text to exclude", text: $excludeText)
        .textFieldStyle(RoundedBorderTextFieldStyle())
    Button(action: {
        if !excludeText.isEmpty {
            pdfViewModel.addExcludeText(excludeText)
            excludeText = ""
        }
    }) {
        Image(systemName: "plus.circle.fill")
            .foregroundColor(.blue)
    }
}
// Horizontally scrolling list of excluded phrases, each with a delete button
ScrollView(.horizontal, showsIndicators: true) {
    HStack {
        ForEach(pdfViewModel.excludedTexts, id: \.self) { text in
            HStack {
                Text(text)
                    .padding(.horizontal, 8)
                    .padding(.vertical, 4)
                    .background(Color.red.opacity(0.2))
                    .cornerRadius(4)
                Button(action: {
                    pdfViewModel.removeExcludeText(text)
                }) {
                    Image(systemName: "xmark.circle.fill")
                        .foregroundColor(.red)
                }
            }
        }
    }
}

// View model: add a phrase unless it is already in the list
func addExcludeText(_ text: String) {
    if !excludedTexts.contains(text) {
        excludedTexts.append(text)
        saveExcludedTexts()
    }
}

// View model: remove a single phrase from the list
func removeExcludeText(_ text: String) {
    if let index = excludedTexts.firstIndex(of: text) {
        excludedTexts.remove(at: index)
        saveExcludedTexts()
    }
}

// Persist the list in UserDefaults so it survives app restarts
func saveExcludedTexts() {
    UserDefaults.standard.set(excludedTexts, forKey: "excludedTexts")
}

func loadExcludedTexts() {
    if let savedTexts = UserDefaults.standard.stringArray(forKey: "excludedTexts") {
        excludedTexts = savedTexts
    }
}
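Plain substring replacement can leave stray whitespace behind, and a short phrase can cut a longer one in half if it is applied first. One possible refinement, shown as a sketch with a hypothetical function `removingIgnoredPhrases(_:from:)`, removes longer phrases first and then collapses the leftover whitespace:

```swift
import Foundation

/// Removes every ignored phrase from `text` (case-sensitive, like the
/// original loop), then collapses runs of whitespace into single spaces.
func removingIgnoredPhrases(_ phrases: [String], from text: String) -> String {
    // Longest first, so "Chapter 1 Draft" is removed before "Chapter 1"
    let ordered = phrases.filter { !$0.isEmpty }.sorted { $0.count > $1.count }
    var result = text
    for phrase in ordered {
        // Replace with a space so the neighbouring words do not run together
        result = result.replacingOccurrences(of: phrase, with: " ")
    }
    // Split on any whitespace (including newlines) and rejoin with single spaces
    return result.split(whereSeparator: { $0.isWhitespace }).joined(separator: " ")
}
```

Collapsing newlines into spaces is a design choice made for this sketch. If line breaks should be kept as natural pauses for the synthesizer, the last step could be limited to spaces and tabs instead.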