How to get the word before the last word from a string (edge‑case‑safe) in Kotlin

1 Answer

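// Kotlin strings on the JVM are sequences of UTF-16 code units.
// Characters such as 世 and 界 lie in the Basic Multilingual Plane, so each one is a
// single Char, and the regex class \p{L} below matches them as letters.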

/**
 * Returns the word before the last word, or null if the text has fewer than two words.
 * A "word" is a maximal run of Unicode letters and numbers, so whitespace,
 * punctuation, hyphens, and apostrophes all act as separators.
 * A real null is returned rather than the string "null", which could itself be a word.
 */
fun getWordBeforeLast(text: String): String? {
    // [^\p{L}\p{N}]+ matches one or more characters that are NOT Unicode letters or numbers.
    // filter { it.isNotEmpty() } drops the empty strings that split() produces
    // when the text starts or ends with a delimiter.
    val words = text.split(Regex("[^\\p{L}\\p{N}]+"))
        .filter { it.isNotEmpty() }

    // getOrNull returns null for an out-of-range index, which covers 0 or 1 words;
    // otherwise size - 2 is the second-to-last element.
    return words.getOrNull(words.size - 2)
}

fun main() {
    println("=== Testing: Get Word Before Last ===\n")

    val tests = listOf(
        "python kotlin",
        "  many   spaces   here   now  ",
        "OneWord",
        "",
        "   ",
        "Hello, world!",
        "Tabs\tand\nnewlines work too",
        "Unicode 世界、こんにちは",
        "Ends with punctuation.",
        "Multiple words, with punctuation, here!",
        "state-of-the-art program example"
    )

    for (t in tests) {
        val result = getWordBeforeLast(t)
        
        println("Input: \"$t\"")
        println("Output: $result")
        println("-".repeat(40))
    }
}



/*
OUTPUT:

=== Testing: Get Word Before Last ===

Input: "python kotlin"
Output: python
----------------------------------------
Input: "  many   spaces   here   now  "
Output: here
----------------------------------------
Input: "OneWord"
Output: null
----------------------------------------
Input: ""
Output: null
----------------------------------------
Input: "   "
Output: null
----------------------------------------
Input: "Hello, world!"
Output: Hello
----------------------------------------
Input: "Tabs	and
newlines work too"
Output: work
----------------------------------------
Input: "Unicode 世界、こんにちは"
Output: 世界
----------------------------------------
Input: "Ends with punctuation."
Output: with
----------------------------------------
Input: "Multiple words, with punctuation, here!"
Output: punctuation
----------------------------------------
Input: "state-of-the-art program example"
Output: program
----------------------------------------

*/
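
One limitation of splitting on [^\p{L}\p{N}]+ is that combining marks (category \p{M}) count as delimiters, so a decomposed "é" (e followed by U+0301) cuts a word in two. The sketch below is one possible variant, not part of the answer above: it matches word tokens directly instead of splitting, and lets a token continue through combining marks. The names WORD and wordBeforeLastOrNull are hypothetical, and the token pattern is an assumption about what should count as a word.

```kotlin
// Hypothetical variant: WORD and wordBeforeLastOrNull are illustrative names.
// A token starts with a letter or number and may continue with letters,
// numbers, or combining marks (\p{M}), so decomposed accents stay attached.
private val WORD = Regex("[\\p{L}\\p{N}][\\p{L}\\p{N}\\p{M}]*")

fun wordBeforeLastOrNull(text: String): String? {
    // findAll yields every non-overlapping token, so no empty strings appear.
    val words = WORD.findAll(text).map { it.value }.toList()
    // getOrNull returns null when the index is negative or past the end.
    return words.getOrNull(words.size - 2)
}

fun main() {
    println(wordBeforeLastOrNull("Hello, world!"))
    println(wordBeforeLastOrNull("OneWord"))
    // "cafe\u0301" is "café" written with a combining acute accent.
    println(wordBeforeLastOrNull("un cafe\u0301 noir") == "cafe\u0301")
}
```

Matching tokens rather than splitting also removes the need for the isNotEmpty filter. Hyphens and apostrophes are still treated as separators here, the same as in the original function.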

 



answered Mar 29 by avibootz
