We recently saw the announcement of Google’s new Gemini generative AI model and the related set of SDKs/APIs, and I was curious to see how these could be exercised in a Kotlin Multiplatform (KMP) project. There are dedicated SDKs for Android and iOS (and other platforms), but for KMP shared code it looks like Gemini’s REST APIs need to be used right now. Those APIs support a range of text- and image-based queries, but for this initial exploration we’re going to focus on text only.
Introducing Gemini, Google’s largest and most capable AI model. 🧵 #GeminiAI https://t.co/T0tIw9HQyO
— Google (@Google) December 6, 2023
Generating API key
First we need to create an API key in Google AI Studio.
Code
We create our project using the Kotlin Multiplatform Wizard. This will create a Kotlin/Compose Multiplatform project that supports Android, iOS (using shared Compose code), Desktop and Wasm based Web clients.
To make requests using Gemini’s REST APIs we make use of the Ktor framework as shown below. Note that we’re using BuildKonfig to allow us to store the API key in the local.properties file.
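import io.ktor.client.HttpClient
import io.ktor.client.call.body
import io.ktor.client.plugins.contentnegotiation.ContentNegotiation
import io.ktor.client.request.post
import io.ktor.client.request.setBody
import io.ktor.http.ContentType
import io.ktor.http.contentType
import io.ktor.serialization.kotlinx.json.json
import kotlinx.serialization.ExperimentalSerializationApi
import kotlinx.serialization.Serializable
import kotlinx.serialization.json.Json

// Request/response models covering only the fields used here.
@Serializable
data class Part(val text: String)

@Serializable
data class Content(val parts: List<Part>)

@Serializable
data class Candidate(val content: Content)

@Serializable
data class Error(val message: String)

// Either error or candidates is expected to be populated.
@Serializable
data class GenerateContentResponse(val error: Error? = null, val candidates: List<Candidate>? = null)

@Serializable
data class GenerateContentRequest(val contents: Content)

class GeminiApi {
    private val baseUrl = "https://generativelanguage.googleapis.com/v1beta/models"

    // Generated by BuildKonfig from the value in local.properties.
    private val apiKey = BuildKonfig.GEMINI_API_KEY

    @OptIn(ExperimentalSerializationApi::class)
    private val client = HttpClient {
        install(ContentNegotiation) {
            // ignoreUnknownKeys lets us model only part of the response.
            json(Json { isLenient = true; ignoreUnknownKeys = true; explicitNulls = false })
        }
    }

    // Sends a text-only prompt to the gemini-pro model.
    suspend fun generateContent(prompt: String): GenerateContentResponse {
        val part = Part(text = prompt)
        val contents = Content(listOf(part))
        val request = GenerateContentRequest(contents)

        return client.post("$baseUrl/gemini-pro:generateContent") {
            contentType(ContentType.Application.Json)
            url { parameters.append("key", apiKey) } // API key is passed as a query parameter
            setBody(request)
        }.body<GenerateContentResponse>()
    }
}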
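The `BuildKonfig.GEMINI_API_KEY` reference above relies on Gradle configuration that isn't shown in this post. As a rough sketch of what that wiring might look like in the shared module's `build.gradle.kts`: the package name and the `gemini_api_key` property name below are assumptions made for illustration, and the plugin version is assumed to be declared elsewhere in the build.

```kotlin
import com.codingfeline.buildkonfig.compiler.FieldSpec.Type.STRING
import java.util.Properties

plugins {
    // Version assumed to be declared elsewhere (e.g. in the root build file).
    id("com.codingfeline.buildkonfig")
}

buildkonfig {
    // Hypothetical package for the generated BuildKonfig object.
    packageName = "com.example.gemini"

    // Read local.properties, which is normally kept out of version control.
    val localProperties = Properties().apply {
        val file = rootProject.file("local.properties")
        if (file.exists()) file.inputStream().use { load(it) }
    }

    defaultConfigs {
        // "gemini_api_key" is an assumed property name; falls back to an empty string.
        buildConfigField(STRING, "GEMINI_API_KEY", localProperties.getProperty("gemini_api_key", ""))
    }
}
```

With something like this in place, `local.properties` would contain a line such as `gemini_api_key=<your key>`, and the key stays out of the committed source.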
Right now the project includes a basic UI that allows entering a text prompt and calling the above generateContent API. This is implemented using the following Compose Multiplatform code (the same code is shared across all four clients).
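import androidx.compose.foundation.layout.*
import androidx.compose.foundation.rememberScrollState
import androidx.compose.foundation.verticalScroll
import androidx.compose.material.* // Material 2 assumed; use material3 if your project does
import androidx.compose.runtime.*
import androidx.compose.ui.Alignment
import androidx.compose.ui.Modifier
import androidx.compose.ui.unit.dp
import kotlinx.coroutines.launch

@Composable
fun App() {
    val api = remember { GeminiApi() }
    val coroutineScope = rememberCoroutineScope()
    var prompt by remember { mutableStateOf("Summarize the benefits of Kotlin Multiplatform") }
    var content by remember { mutableStateOf("") }
    var showProgress by remember { mutableStateOf(false) }

    MaterialTheme {
        Column(
            Modifier.verticalScroll(rememberScrollState())
                .fillMaxWidth().padding(16.dp)
        ) {
            // Prompt input and submit button share a row (roughly 70/30 split).
            Row {
                TextField(
                    value = prompt,
                    onValueChange = { prompt = it },
                    modifier = Modifier.weight(7f)
                )
                TextButton(
                    onClick = {
                        if (prompt.isNotBlank()) {
                            coroutineScope.launch {
                                showProgress = true
                                content = generateContent(api, prompt)
                                showProgress = false
                            }
                        }
                    },
                    modifier = Modifier.weight(3f)
                        .padding(all = 4.dp)
                        .align(Alignment.CenterVertically)
                ) {
                    Text("Submit")
                }
            }

            Spacer(Modifier.height(16.dp))
            // Show a spinner while the request is in flight, otherwise the result.
            if (showProgress) {
                CircularProgressIndicator()
            } else {
                Text(content)
            }
        }
    }
}

// Returns the text of the first candidate, or a placeholder if there is none.
suspend fun generateContent(api: GeminiApi, prompt: String): String {
    println("prompt = $prompt")
    val result = api.generateContent(prompt)
    // Safe calls avoid an index error when candidates or parts come back empty.
    return result.candidates?.firstOrNull()?.content?.parts?.firstOrNull()?.text
        ?: "No results"
}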
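To make the wire format behind these calls a bit more concrete, here is a small standalone sketch (needing only kotlinx.serialization) that encodes a request and decodes two sample responses using the same model shapes as above. The JSON payloads are made-up samples trimmed down for illustration, not captured API output, and `firstText` is a hypothetical helper that also surfaces the error message rather than only returning "No results".

```kotlin
import kotlinx.serialization.Serializable
import kotlinx.serialization.json.Json

// Same shapes as the models shown earlier in the post.
@Serializable data class Part(val text: String)
@Serializable data class Content(val parts: List<Part>)
@Serializable data class Candidate(val content: Content)
@Serializable data class Error(val message: String)
@Serializable data class GenerateContentResponse(
    val error: Error? = null,
    val candidates: List<Candidate>? = null
)
@Serializable data class GenerateContentRequest(val contents: Content)

// Unknown keys are ignored because the models cover only part of the response.
val json = Json { ignoreUnknownKeys = true }

// Hypothetical helper: first text part, else the error message, else a placeholder.
fun firstText(response: GenerateContentResponse): String =
    response.candidates?.firstOrNull()?.content?.parts?.firstOrNull()?.text
        ?: response.error?.message
        ?: "No results"

fun main() {
    // What the request body looks like once serialized.
    val request = GenerateContentRequest(Content(listOf(Part("Hello"))))
    println(json.encodeToString(GenerateContentRequest.serializer(), request))
    // prints {"contents":{"parts":[{"text":"Hello"}]}}

    // Illustrative success payload (extra fields are skipped on decode).
    val ok = """{"candidates":[{"content":{"parts":[{"text":"Hi there"}],"role":"model"},"finishReason":"STOP"}]}"""
    println(firstText(json.decodeFromString(GenerateContentResponse.serializer(), ok)))

    // Illustrative error payload.
    val failed = """{"error":{"code":400,"message":"API key not valid","status":"INVALID_ARGUMENT"}}"""
    println(firstText(json.decodeFromString(GenerateContentResponse.serializer(), failed)))
}
```

Keeping both `error` and `candidates` nullable with defaults means the same response class can decode either outcome, which is why the lookup is written as a chain of safe calls.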
The following then shows an example of the Wasm-based Compose for Web client for the project.
The code shown here can be found in the GeminiKMP project on GitHub.
Featured in Kotlin Weekly Issue #388 and Android Weekly #604
Related tweet
New basic Kotlin Multiplatform sample to exercise Gemini Generative AI APIs (using @JetBrainsKtor with REST version of Gemini APIs - https://t.co/sfMOPSfcut).
UI is using Compose Multiplatform running on iOS, Android, Desktop and (Wasm based) Web. https://t.co/7fK7wtRAKP
— John O'Reilly (@joreilly) December 31, 2023