Exploring use of Gemini Generative AI APIs in a Kotlin/Compose Multiplatform project

We recently saw the announcement of Google’s new Gemini Generative AI model and the related set of SDKs/APIs, and I was curious to see how these could be exercised in a Kotlin Multiplatform (KMP) project. There are dedicated SDKs for Android and iOS (and other platforms), but for KMP shared code it looks like Gemini’s REST APIs need to be used right now. Those APIs support a range of text- and image-based queries, but for this initial exploration we’re going to focus on text only.

Introducing Gemini, Google’s largest and most capable AI model. 🧵 #GeminiAI https://t.co/T0tIw9HQyO
— Google (@Google) December 6, 2023

Generating API key

Initially we need to create an API key in Google AI Studio.

Makersuite

Code

We create our project using the Kotlin Multiplatform Wizard. This will create a Kotlin/Compose Multiplatform project that supports Android, iOS (using shared Compose code), Desktop and Wasm based Web clients.

To make requests using Gemini’s REST APIs we make use of the Ktor framework as shown below. Note that we’re using BuildKonfig so that the API key can be stored in the local.properties file rather than in source code.

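import io.ktor.client.HttpClient
import io.ktor.client.call.body
import io.ktor.client.plugins.contentnegotiation.ContentNegotiation
import io.ktor.client.request.post
import io.ktor.client.request.setBody
import io.ktor.http.ContentType
import io.ktor.http.contentType
import io.ktor.serialization.kotlinx.json.json
import kotlinx.serialization.ExperimentalSerializationApi
import kotlinx.serialization.Serializable
import kotlinx.serialization.json.Json
// BuildKonfig is generated into the package configured in the build script; import it from there.

@Serializable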
data class Part(val text: String)

@Serializable
data class Content(val parts: List<Part>)

@Serializable
data class Candidate(val content: Content)

@Serializable
data class Error(val message: String)

@Serializable
data class GenerateContentResponse(val error: Error? = null, val candidates: List<Candidate>? = null)

@Serializable
data class GenerateContentRequest(val contents: Content)

class GeminiApi {
    private val baseUrl = "https://generativelanguage.googleapis.com/v1beta/models"
    private val apiKey = BuildKonfig.GEMINI_API_KEY

    @OptIn(ExperimentalSerializationApi::class)
    private val client = HttpClient {
        install(ContentNegotiation) {
            json(Json { isLenient = true; ignoreUnknownKeys = true; explicitNulls = false})
        }
    }

    suspend fun generateContent(prompt: String): GenerateContentResponse {
        val part = Part(text = prompt)
        val contents = Content(listOf(part))
        val request = GenerateContentRequest(contents)

        return client.post("$baseUrl/gemini-pro:generateContent") {
            contentType(ContentType.Application.Json)
            url { parameters.append("key", apiKey) }
            setBody(request)
        }.body<GenerateContentResponse>()
    }
}
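The BuildKonfig setup itself isn't shown above. As a rough sketch, the Gradle configuration for the shared module might look something like the following. The package name and the `gemini_api_key` property key are assumptions made for illustration (only the `GEMINI_API_KEY` field name comes from the code above), and the plugin version is assumed to be declared elsewhere in the build.

```kotlin
// build.gradle.kts of the shared module (sketch only)
import com.codingfeline.buildkonfig.compiler.FieldSpec.Type.STRING
import java.util.Properties

plugins {
    // Kotlin Multiplatform, Compose and serialization plugins omitted for brevity
    id("com.codingfeline.buildkonfig")
}

// Load local.properties if it exists; this file is normally kept out of version control
val localProperties = Properties().apply {
    val file = rootProject.file("local.properties")
    if (file.exists()) file.inputStream().use { load(it) }
}

buildkonfig {
    // Hypothetical package name; the generated BuildKonfig object lives here
    packageName = "com.example.gemini"

    defaultConfigs {
        // "gemini_api_key" is an assumed property key; falls back to an empty string if missing
        buildConfigField(STRING, "GEMINI_API_KEY", localProperties.getProperty("gemini_api_key") ?: "")
    }
}
```

With a setup along these lines, the key would be added to local.properties as a single `gemini_api_key=...` entry and read in shared code as `BuildKonfig.GEMINI_API_KEY`.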

Right now the project includes a basic UI that allows entering a text prompt and calling the above generateContent API. This is implemented using the following Compose Multiplatform code (the same code is shared across all 4 clients).

@Composable
fun App() {
    val api = remember { GeminiApi() }

    val coroutineScope = rememberCoroutineScope()
    var prompt by remember { mutableStateOf("Summarize the benefits of Kotlin Multiplatform") }
    var content by remember { mutableStateOf("") }
    var showProgress by remember { mutableStateOf(false) }

    MaterialTheme {
        Column(
            Modifier.verticalScroll(rememberScrollState())
                .fillMaxWidth().padding(16.dp)
        ) {
            Row {
                TextField(
                    value = prompt,
                    onValueChange = { prompt = it },
                    modifier = Modifier.weight(7f)
                )
                TextButton(
                    onClick = {
                        if (prompt.isNotBlank()) {
                            coroutineScope.launch {
                                showProgress = true
                                content = generateContent(api, prompt)
                                showProgress = false
                            }
                        }
                    },
                    modifier = Modifier.weight(3f)
                        .padding(all = 4.dp)
                        .align(Alignment.CenterVertically)
                ) {
                    Text("Submit")
                }
            }

            Spacer(Modifier.height(16.dp))
            if (showProgress) {
                CircularProgressIndicator()
            } else {
                Text(content)
            }
        }
    }
}

suspend fun generateContent(api: GeminiApi, prompt: String): String {
    println("prompt = $prompt")
    val result = api.generateContent(prompt)
    return if (result.candidates != null) {
        result.candidates[0].content.parts[0].text
    } else {
        "No results"
    }
}
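The response model also carries an `error` field, and the `candidates` and `parts` lists could in principle be empty. One way to surface the error message and guard against empty lists is sketched below. To keep the sketch standalone it uses plain stand-in classes without `@Serializable`; `ApiError` (standing in for `Error`) and `extractText` are hypothetical names, not part of the project.

```kotlin
// Stand-ins mirroring the response classes, without serialization annotations
data class Part(val text: String)
data class Content(val parts: List<Part>)
data class Candidate(val content: Content)
data class ApiError(val message: String)
data class GenerateContentResponse(
    val error: ApiError? = null,
    val candidates: List<Candidate>? = null
)

// Hypothetical helper: prefer the error message if present, otherwise take the
// first part of the first candidate, otherwise fall back to "No results"
fun extractText(response: GenerateContentResponse): String {
    response.error?.let { return "Error: ${it.message}" }
    return response.candidates
        ?.firstOrNull()
        ?.content
        ?.parts
        ?.firstOrNull()
        ?.text
        ?: "No results"
}
```

For example, `extractText(GenerateContentResponse())` returns "No results", while a response whose `error` is set returns that error's message prefixed with "Error: ".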

The following shows an example of the Wasm-based Compose for Web client for the project.

Gemini Web Client

The code shown here can be found in the GeminiKMP project on GitHub.

Featured in Kotlin Weekly Issue #388 and Android Weekly #604

New basic Kotlin Multiplatform sample to exercise Gemini Generative AI APIs (using @JetBrainsKtor with REST version of Gemini APIs - https://t.co/sfMOPSfcut).

UI is using Compose Multiplatform running on iOS, Android, Desktop and (Wasm based) Web. https://t.co/7fK7wtRAKP
— John O'Reilly (@joreilly) December 31, 2023
