TravisCI, TeamCity, and Kotlin

Andrew Hagedorn, March 2020

I have been using TeamCity for continuous integration for almost a decade. Once you get used to its quirks it is...fine. A few years back my team took a detour into using TravisCI, mostly for NodeJS projects, and initially it was a breath of fresh air. It seemed easier to set up and iterate on and, possibly most importantly, it was different.

TravisCI

There are some aspects of TravisCI that I really like, and most of them derive from the in-repository YAML file that defines everything about how the CI should run:

sudo: false
language: node_js
node_js:
- '12.0'
before_script:
  - npm install --ignore-scripts
script:
  - npm test
  - npm run build
  - npm
deploy:
  skip_cleanup: true
  provider: npm
  email: $NPM_EMAIL
  api_key: $NPM_AUTH_TOKEN
  on:
    branch: master

In TeamCity I had been using the UI to define CI and I found that defining your CI in the repository has a couple of major advantages:

Having the definition of CI in a consistent place next to the code makes it easier to discover and modify the CI pipeline. Just look for the .travis.yml file in the root of the repository.
It's in version control so you can track changes. While you can configure TeamCity to track changes made in the UI, it is an inferior process since it can only be used to track down issues after the fact. Having the changes to the pipeline in the same pull request as the code that requires them means the reviewer has all the context, and the pipeline is much more likely to work because CI just passed on the branch.

My favorite feature is the flexibility of the YAML file on branches. Are you on a branch and want to experiment with a new integration test runner or add a couple of extra build steps? No problem! I loved that the shape of your CI pipeline could be completely changed in a branch with no setup. If you decide that a proof of concept for CI was not the right path forward, then there are no leftover build artifacts that you need to clean up. And if it is successful, you just need to merge the branch in.

This is not to say it was perfect; in fact there were a lot of rough edges that got exposed with heavy usage:

Flaky tests seem to be a fact of life with Selenium testing, and a lack of test history made detecting and eliminating flaky tests painful.
There is no concept of shared secrets, so rolling your NPM access token means manually updating every repository.
There is a lot of copy-pasted code across YAML files.

For a while I was really happy with TravisCI, but over time my view on it soured. In addition to those rough edges, having two CI systems that worked completely differently added unnecessary friction and complexity to my team's tech stack. Ultimately, the team came to the same conclusion and we decided to consolidate our CI systems back onto TeamCity.

TeamCity and the Kotlin DSL

As we moved back we didn't want to lose the good things from TravisCI, namely the in-repository definition of the CI system. To that end we looked into using the TeamCity Kotlin DSL. We had high hopes that we could make it even better than the YAML files; in theory, a programming language should allow us to share build steps more easily than configuration files.

We initially tried to solve for step duplication by using a shared library of build steps. For example, you could create a build type for linting the codebase, put it in a jar, and share it across builds:

import jetbrains.buildServer.configs.kotlin.v2019_2.*
import jetbrains.buildServer.configs.kotlin.v2019_2.buildSteps.script

version = "2019.2"

// used in each build
project {
  buildType(Lint)
}

// shared in a library
object Lint: BuildType({
    name = "Linting"
    steps {
        script {
            scriptContent = "# make sure your dependencies are set up"
        }
        script {
            scriptContent = "# do some linting"
        }
        script {
            scriptContent = "# does anything need to be cleaned up or cached between builds?"
        }
    }
})

For trivial things this approach works well. However, if they are trivial, does sharing them actually gain you much?

The more interesting case is when you try to do something complex. What happens when you want to parallelize your integration tests across multiple build steps and potentially multiple build agents? In my experience you struggle for a few reasons:

Each build step must exist in master before you can iterate on it. When you decide not to go with the complex approach, you need to clean up the master CI.
There is no way to validate your library changes without publishing the library and trying it on a live build. Speculative builds with a push-and-pray attitude are not great at the best of times. Add in a long feedback cycle for something like integration tests that can take 10 minutes to run, and making changes to the pipeline is miserable.
You have lost the ability to quickly iterate on a branch in your repository to test out changes.
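To make the "complex" case more concrete, here is a minimal sketch of what a shared library helper for fanning integration tests out over several build configurations might look like. The helper name shardedIntegrationTests, the SHARD_INDEX and SHARD_COUNT environment parameters, and the npm run test:integration script are all hypothetical and not taken from our actual setup; the sketch only assumes the standard v2019_2 DSL building blocks (BuildType, params, steps, script, dependencies, snapshot).

```kotlin
import jetbrains.buildServer.configs.kotlin.v2019_2.*
import jetbrains.buildServer.configs.kotlin.v2019_2.buildSteps.script

// Hypothetical library helper: registers `shardCount` build configurations that
// each run one slice of the integration tests, plus an aggregate configuration
// that depends on all of them.
fun Project.shardedIntegrationTests(shardCount: Int): BuildType {
    val shards = (1..shardCount).map { index ->
        BuildType {
            id("IntegrationTests_Shard$index")
            name = "Integration tests (shard $index of $shardCount)"
            params {
                // Assumed convention: the test runner reads these to pick its slice.
                param("env.SHARD_INDEX", "$index")
                param("env.SHARD_COUNT", "$shardCount")
            }
            steps {
                script {
                    scriptContent = "npm run test:integration" // hypothetical npm script
                }
            }
        }
    }
    // Every shard has to be registered in the project before it can run.
    shards.forEach { buildType(it) }

    val aggregate = BuildType {
        id("IntegrationTests")
        name = "Integration tests"
        dependencies {
            // Snapshot dependencies let the shards run in parallel on separate agents.
            shards.forEach { shard -> snapshot(shard) {} }
        }
    }
    buildType(aggregate)
    return aggregate
}
```

Even in this small form, every generated configuration has to exist in the project settings before it can run, and the helper itself can only be exercised by publishing it and triggering a real build.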

There are more drawbacks than just how the code is organized; you also now need to really know Kotlin and TeamCity to iterate on your CI pipeline. Unless you are a Kotlin shop, this means that for all your engineers it's just one more thing they need to be up to speed on to be productive. That this happens infrequently is its own double-edged sword; engineers who have added a build step in the past have likely forgotten most of what they learned, so it's always painful.

Our second approach was to mimic the structure of Travis and add helpers for cleaning up the syntax to create a VCS root or run a script:

import jetbrains.buildServer.configs.kotlin.v2019_2.*
import jetbrains.buildServer.configs.kotlin.v2019_2.buildSteps.script
import MySharedHelpers.*

version = "2019.2"

project {
    git("my-repo-name", "My Repo VCS")
    stage("Lint") {
        steps {
            sh("npm install")
            sh("npm run lint")
        }
    }
}

Personally, I find this approach much easier to reason about; you have essentially traded the dashes of YAML for brackets. You still have the ability to share code when that is high leverage, but for the average engineer there is a much lower barrier to entry when adding a build step. If you need to add to or update the CI pipeline, you don't need to understand a complex web of Kotlin; instead you just need to add another sh step.
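The helpers themselves can stay small. The following is only a sketch of how git, stage, and sh could be written as extension functions on the DSL types; it is an illustration, not the actual contents of MySharedHelpers. The GitHub URL pattern, the my-org organization, and the safeId helper are assumptions, and wiring the VCS root into each stage (for example through a vcs block) is left out.

```kotlin
package MySharedHelpers

import jetbrains.buildServer.configs.kotlin.v2019_2.*
import jetbrains.buildServer.configs.kotlin.v2019_2.buildSteps.script
import jetbrains.buildServer.configs.kotlin.v2019_2.vcs.GitVcsRoot

// Hypothetical: derive an id from a display name by dropping every character
// other than letters, digits, and underscores.
private fun safeId(text: String): String = text.replace(Regex("[^A-Za-z0-9_]"), "")

// Registers a Git VCS root in the project and returns it.
fun Project.git(repoName: String, displayName: String): GitVcsRoot {
    val root = GitVcsRoot {
        id(safeId(displayName))
        name = displayName
        url = "git@github.com:my-org/$repoName.git" // assumed host and organization
    }
    vcsRoot(root)
    return root
}

// Creates a build configuration named after the stage and registers it.
fun Project.stage(stageName: String, block: BuildType.() -> Unit): BuildType {
    val stageBuild = BuildType {
        id(safeId(stageName))
        name = stageName
        block()
    }
    buildType(stageBuild)
    return stageBuild
}

// Adds a command line step that runs a single shell command.
fun BuildSteps.sh(command: String) {
    script {
        name = command
        scriptContent = command
    }
}
```

The point of keeping the helpers this thin is that the settings file in each repository still reads top to bottom like the Travis YAML did, while anything genuinely shared lives in one place.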

Conclusion

As I think back on our initial decision to use TravisCI, I think much of it had to do with the broader team's perception of TeamCity rather than the reality. It was perceived as a complex and hard-to-use tool, which was largely due to two factors:

A centralized CI team had significant knowledge of the system, while the broader team had little.
Our setup was complicated because its subject matter was complicated. Our whole codebase was a large monolithic application and as it grew the build pipeline needed to be complicated to ensure our CI was still fast and reliable.

For better or worse, those are not true any longer. In the intervening years we moved to AWS, broke portions of the monolith into services, and adopted new frameworks and languages. More importantly for this topic, the CI for those systems has largely been federated out to the teams that own the codebases, which means many more members of the team are knowledgeable about the tools we use.

So while moving back to TeamCity has not been incredibly satisfying, it has definitely been the correct move from a technical and organizational perspective. Our CI systems have been faster and more robust since we moved, and we will continue to experiment with the right mix of sharing and ease of use to enable the team to be productive.
