In the ever-evolving landscape of artificial intelligence, creating a chatbot capable of handling real-time conversations with streaming responses is a fascinating challenge. In this blog post, we'll walk you through the process of building a Kotlin-based chatbot using Ollama AI. We'll cover everything from setting up the project to implementing and testing the streaming capabilities. Let's dive in!
Project Overview
Our goal is to create a chatbot that leverages Ollama AI to provide real-time streaming responses to user input. We'll build it with Kotlin and OkHttp, and test it with MockWebServer. The chatbot will display responses as they are received, ensuring a smooth and interactive user experience.
1- Initialize the Kotlin Project:
- Start by creating a new Kotlin project in IntelliJ IDEA.
- Add the necessary dependencies in your build.gradle.kts file:
dependencies {
    implementation("org.jetbrains.kotlin:kotlin-stdlib")
    implementation("com.squareup.okhttp3:okhttp:4.9.1")
    implementation("org.json:json:20210307")
    implementation("org.jetbrains.kotlinx:kotlinx-coroutines-core:1.5.2")

    // Test dependencies: the tests below use JUnit 4 annotations (org.junit.*),
    // which kotlin-test-junit brings in
    testImplementation(kotlin("test"))
    testImplementation("org.jetbrains.kotlin:kotlin-test-junit:1.9.10")
    testImplementation("com.squareup.okhttp3:mockwebserver:4.9.1")
}
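For completeness, here is a minimal sketch of the rest of build.gradle.kts. It assumes Kotlin 1.9.10 and a top-level main() in a file named Main.kt; adjust the version and main class to your setup:

plugins {
    kotlin("jvm") version "1.9.10"
    application
}

repositories {
    mavenCentral()
}

application {
    // "MainKt" is the class Kotlin generates for a top-level main() in Main.kt
    mainClass.set("MainKt")
}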
2- Install Ollama and Download the Model:
- Before running the chatbot, you need to install Ollama on your machine and download the necessary model. Follow the Ollama installation guide to set up Ollama.
- Once installed, download the model using the following command:
ollama pull llama2-uncensored
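Before wiring up the client, it's worth checking that the server is actually up. A quick sanity-check sketch using OkHttp (it assumes Ollama's default port, 11434, and that a GET on the root path answers with a short status message):

import okhttp3.OkHttpClient
import okhttp3.Request

// Quick sanity check: expect a 200 and a short "Ollama is running" style message
fun main() {
    val client = OkHttpClient()
    val request = Request.Builder().url("http://localhost:11434/").build()
    client.newCall(request).execute().use { response ->
        println("${response.code}: ${response.body?.string()}")
    }
}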
3- Create the OllamaClient:
- This class will handle sending requests to Ollama AI and processing the streaming responses.
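Before looking at the code, it helps to know what the wire format looks like. Ollama's /api/generate endpoint streams newline-delimited JSON: each line is a small JSON object carrying a fragment of the answer in a response field, and the final line reports "done": true. Abridged (the real objects carry extra fields such as created_at and timing stats), a stream looks roughly like:

{"model":"llama2-uncensored","response":"Hello","done":false}
{"model":"llama2-uncensored","response":" there","done":false}
{"model":"llama2-uncensored","done":true}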
import okhttp3.*
import okhttp3.MediaType.Companion.toMediaType
import okhttp3.RequestBody.Companion.toRequestBody
import okio.BufferedSource
import org.json.JSONObject
import java.io.IOException

class OllamaClient(
    // Injectable so tests can point the client at a MockWebServer
    private val baseUrl: String = "http://localhost:11434/api/generate"
) {
    private val client = OkHttpClient()

    fun streamResponse(
        prompt: String,
        onResponse: (String) -> Unit,
        onComplete: () -> Unit,
        onError: (Exception) -> Unit
    ) {
        // Ask Ollama for a streamed completion
        val requestBody = JSONObject()
            .put("model", "llama2-uncensored")
            .put("prompt", prompt)
            .put("stream", true)
            .toString()
            .toRequestBody("application/json".toMediaType())

        val request = Request.Builder()
            .url(baseUrl)
            .post(requestBody)
            .build()

        client.newCall(request).enqueue(object : Callback {
            override fun onFailure(call: Call, e: IOException) {
                onError(e)
            }

            override fun onResponse(call: Call, response: Response) {
                if (!response.isSuccessful) {
                    onError(IOException("Unexpected code $response"))
                    return
                }
                response.body?.use { responseBody ->
                    val source: BufferedSource = responseBody.source()
                    // Read the NDJSON stream one line (one JSON object) at a time
                    while (!source.exhausted()) {
                        val line = source.readUtf8Line()
                        if (!line.isNullOrBlank()) { // skip blank lines defensively
                            val jsonResponse = JSONObject(line)
                            if (jsonResponse.has("response")) {
                                onResponse(jsonResponse.getString("response"))
                            }
                        }
                    }
                    onComplete()
                }
            }
        })
    }
}
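With the client in place, you can try it in isolation before building the conversation loop. A throwaway sketch for a scratch file (it assumes Ollama is running locally with the model pulled in step 2):

fun main() {
    OllamaClient().streamResponse(
        prompt = "Why is the sky blue?",
        onResponse = { fragment -> print(fragment) }, // fragments print as they arrive
        onComplete = { println("\n[done]") },
        onError = { e -> println("Error: ${e.message}") }
    )
    Thread.sleep(30_000) // demo only: keep the JVM alive while the async call streams
}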
4- Create the ConversationHandler:
- This class will manage the conversation, ensuring that user inputs are processed and responses are displayed in real-time.
import kotlinx.coroutines.*
import java.util.concurrent.CountDownLatch

class ConversationHandler(private val ollamaClient: OllamaClient) {
    private val conversationHistory = mutableListOf<String>()

    fun start() = runBlocking {
        while (true) {
            print("You: ")
            val userInput = readLine()
            if (userInput.isNullOrEmpty()) break

            conversationHistory.add("You: $userInput")
            val context = conversationHistory.joinToString("\n")

            var completeResponse = ""
            // streamResponse is asynchronous: block the loop with a latch so we
            // don't prompt for the next input until this answer has finished
            val latch = CountDownLatch(1)
            ollamaClient.streamResponse(
                context,
                onResponse = { responseFragment ->
                    completeResponse += responseFragment
                    // \r rewrites the current line so the answer grows in place
                    print("\rOllama: $completeResponse")
                },
                onComplete = {
                    println() // move to the next line after completion
                    conversationHistory.add("Ollama: $completeResponse")
                    latch.countDown()
                },
                onError = { e ->
                    println("\nOllama: Error - ${e.message}")
                    latch.countDown()
                }
            )
            latch.await()
        }
    }
}
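A note on the CountDownLatch: OkHttp's enqueue is asynchronous, so streamResponse returns immediately. Without the latch, the loop would print the next "You: " prompt while the previous answer was still streaming in. Blocking until onComplete or onError fires keeps the turns strictly alternating, which is what you want in a console chat.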
5- Main Function:
- This will serve as the entry point for your application.
fun main() {
    val ollamaClient = OllamaClient()
    val conversationHandler = ConversationHandler(ollamaClient)
    conversationHandler.start()
}
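If you applied the application plugin sketched earlier, you can start the chatbot with ./gradlew run (or run main directly from IntelliJ). Submitting an empty line exits the loop.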
Testing the Streaming Response
To ensure that our OllamaClient handles streaming responses correctly, we'll write unit tests using MockWebServer.
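One detail that is easy to get wrong: OllamaClient makes a single HTTP call per prompt, so a mocked stream has to be enqueued as one MockWebServer response whose body contains all the JSON lines, not as several separate responses; only the first of those would ever be consumed.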
- Set up the test class:
import okhttp3.mockwebserver.MockResponse
import okhttp3.mockwebserver.MockWebServer
import org.json.JSONObject
import org.junit.After
import org.junit.Before
import org.junit.Test
import java.util.concurrent.CountDownLatch
import java.util.concurrent.TimeUnit
import kotlin.test.assertEquals
import kotlin.test.assertNotNull
import kotlin.test.assertNull
import kotlin.test.assertTrue

class OllamaClientTest {
    private lateinit var mockWebServer: MockWebServer
    private lateinit var ollamaClient: OllamaClient

    @Before
    fun setUp() {
        mockWebServer = MockWebServer()
        mockWebServer.start()
        // Point the client at the mock server instead of a real Ollama instance
        ollamaClient = OllamaClient(mockWebServer.url("/api/generate").toString())
    }

    @After
    fun tearDown() {
        mockWebServer.shutdown()
    }

    @Test
    fun `test streamResponse returns expected response`() {
        // Enqueue the whole stream as ONE response body, one JSON object per line
        val responseBody = listOf(
            JSONObject().put("response", "Hello").toString(),
            JSONObject().put("response", " there").toString(),
            JSONObject().put("response", ", how are you?").toString()
        ).joinToString("\n")
        mockWebServer.enqueue(MockResponse().setBody(responseBody).setResponseCode(200))

        val completeResponse = StringBuilder()
        var streamError: Exception? = null
        val latch = CountDownLatch(1)
        ollamaClient.streamResponse(
            prompt = "hello",
            onResponse = { responseFragment ->
                completeResponse.append(responseFragment)
            },
            onComplete = {
                latch.countDown()
            },
            onError = { e ->
                streamError = e
                latch.countDown()
            }
        )

        // Wait for the async callback instead of sleeping for a fixed time,
        // and assert on the test thread so failures are actually reported
        assertTrue(latch.await(5, TimeUnit.SECONDS), "Streaming did not complete in time")
        assertNull(streamError)
        assertEquals("Hello there, how are you?", completeResponse.toString())
    }

    @Test
    fun `test streamResponse handles error`() {
        mockWebServer.enqueue(MockResponse().setResponseCode(500).setBody("Internal Server Error"))

        var receivedError: Exception? = null
        val latch = CountDownLatch(1)
        ollamaClient.streamResponse(
            prompt = "hello",
            onResponse = { _ ->
                throw AssertionError("This should not be called on error")
            },
            onComplete = {
                throw AssertionError("This should not be called on error")
            },
            onError = { e ->
                receivedError = e
                latch.countDown()
            }
        )

        assertTrue(latch.await(5, TimeUnit.SECONDS), "Error callback was not invoked in time")
        assertNotNull(receivedError)
    }
}
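Using CountDownLatch.await with a timeout instead of Thread.sleep keeps the tests fast and deterministic: they return as soon as the callback fires and fail with a clear message if it never does.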
Repository
https://github.com/josmel/ChatbotKotlinOllama
Conclusion
Building a real-time streaming chatbot using Kotlin and Ollama AI is a rewarding challenge that showcases the power of modern AI and streaming capabilities. By following this guide, you can create a chatbot that not only responds quickly but also handles conversations smoothly. Remember to install Ollama and download the necessary model before running your project.
Happy coding! Feel free to reach out with any questions or comments. If you found this guide helpful, please share it with others and follow me for more Kotlin and AI tutorials!