Using GitLab CI as a build system for a monorepository

The constantly growing demand for scalability and speed of development often results in the decision to start writing microservices. Once that decision is made, it brings a completely new set of problems to solve. One of them is especially painful: tracking changes at the points where services communicate with each other. And so another buzzword enters the picture: monorepo.

A monorepository is not a panacea for everything and has its own cons. Tooling is poor and often built to fit the workflow of its makers, and not everybody has time to write their own tools around it. As I use GitLab CI in my current stack, I would like to show a few tips on how to use it with a monorepository.

Structure of the repository in the presented case:

services/
services/project-one/
services/project-two/
.gitlab-ci.yml

Let’s pretend project-one is a service written in JavaScript and project-two in PHP; that way I can show that these tips are build-system specific but programming language agnostic.

Use include to organise .gitlab-ci.yml

Having one file declaring all pipelines in a repository with multiple services would be big and completely unreadable. Instead, it is possible to have a separate file declaring the pipeline of each service and include them in the root .gitlab-ci.yml.

.gitlab-ci.yml

include:
  - local: ./services/project-one/.gitlab-ci.yml
  - local: ./services/project-two/.gitlab-ci.yml

That way the root file can be limited to stages:, include: and generic job definitions, like for example a commit message linter.
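A minimal sketch of such a root file could look like the following. The commit lint job, its image and the commitlint invocation are hypothetical illustrations and not part of the example repository:

.gitlab-ci.yml

stages:
  - install
  - test

include:
  - local: ./services/project-one/.gitlab-ci.yml
  - local: ./services/project-two/.gitlab-ci.yml

# A generic job that is not tied to any single service,
# e.g. linting commit messages on every pipeline.
commit lint:
  stage: test
  image: node:lts-alpine   # hypothetical image choice
  script:
    # hypothetical linter call; assumes a commitlint config
    # is committed at the repository root
    - npx --yes @commitlint/cli --from=HEAD~1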

Control the workflow with only:changes

The jobs are now separated per service, but pushing to the repository still triggers pipelines for all of the projects. To prevent this behaviour and trigger jobs only when something changes in a service’s directory, add only:changes to the jobs of each project:

services/project-one/.gitlab-ci.yml

project-one test:
  stage: test
  before_script:
    - cd services/project-one
    - yarn install
  script:
    - yarn run test
  only:
    changes:
      - services/project-one/**/*

Disclaimer: it triggers the specified jobs based on changes in the last push, not on all changes in the merge request. Still, there is a way to manually trigger all jobs in CI for your branch: just go to the project’s pipelines and use Run pipeline.
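If your team works mostly through merge requests, one possible variant (a sketch, assuming pipelines for merge requests are enabled for the project) is to restrict such jobs to merge request pipelines, where GitLab compares changes: against the target branch rather than the last push:

project-one test:
  stage: test
  before_script:
    - cd services/project-one
    - yarn install
  script:
    - yarn run test
  only:
    refs:
      # run only in merge request pipelines, so changes: is
      # evaluated against the target branch of the merge request
      - merge_requests
    changes:
      - services/project-one/**/*

Note that with this setup the job runs only when a merge request is open, not on plain branch pushes.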

Cache dependencies per service

To speed up a pipeline you can cache the dependencies between jobs and even between runs. It is especially useful when a service has a few jobs requiring dependencies and the pipeline downloads them in each of these jobs.

services/project-two/.gitlab-ci.yml

project-two install:
  stage: install
  before_script:
    - cd services/project-two
  script:
    - composer install -n
  cache:
    key: $CI_COMMIT_REF_SLUG-project-two
    policy: pull-push
    paths:
      - services/project-two/vendor

project-two test:
  stage: test
  before_script:
    - cd services/project-two
    - composer install -n
  script:
    - composer test
  only:
    changes:
      - services/project-two/**/*
  cache:
    key: $CI_COMMIT_REF_SLUG-project-two
    policy: pull
    paths:
      - services/project-two/vendor

For the cache key, this example uses the branch name and the service name, so the cache will be sustained between commits on that branch.
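An alternative worth knowing about (a sketch, not what the example repository does) is GitLab’s cache:key:files, which derives the key from the lock file, so the cache is shared across branches and invalidated only when the dependencies actually change:

project-two install:
  stage: install
  before_script:
    - cd services/project-two
  script:
    - composer install -n
  cache:
    key:
      # key derived from the last commit that changed composer.lock
      files:
        - services/project-two/composer.lock
    policy: pull-push
    paths:
      - services/project-two/vendor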

Why is composer install still in the job running the tests? The cache can be purged between the runs of the install and test jobs, or the storage service can fail. This way you can be sure that a failure of the caching mechanism will not create failures in the CI pipelines; when the cache is present, the second install finishes almost instantly.

Reduce job definitions with extends

The job definitions are already growing with lines that are identical between them; imagine adding a job linting the code of project-two. It would have to declare the same cache, change into the same directory, and so on. Let’s change that situation and prevent the repetition of settings.

services/project-one/.gitlab-ci.yml

.project-one:
  before_script:
    - cd services/project-one
  cache:
    key: $CI_COMMIT_REF_SLUG-project-one
    policy: pull
    paths:
      - services/project-one/node_modules
  only:
    changes:
      - services/project-one/**/*

project-one install:
  extends: .project-one
  stage: install
  script:
    - yarn install
  cache:
    policy: pull-push
  
project-one test:
  extends: .project-one
  stage: test
  script:
    - yarn install
    - yarn run test

project-one lint:
  extends: .project-one
  stage: test
  script:
    - yarn install
    - yarn run lint

Adding another job to run additional tests no longer creates lots of lines and can focus on what commands it has to run.
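For instance, a hypothetical build job (not part of the example repository, and assuming a build script exists in package.json) now boils down to a handful of lines:

project-one build:
  extends: .project-one
  stage: test
  script:
    - yarn install
    - yarn run build

It inherits the cache, the directory switch and the only:changes rule from the .project-one template.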

Define the needs of jobs to start

The pipeline from the example starts to grow, but what if installing dependencies in the project-one install job fails? By default, a failed job in one service prevents the CI jobs of another from running, because the next stage starts only when the whole previous stage succeeds. That behaviour can be changed too, by using the needs keyword.

services/project-two/.gitlab-ci.yml

.project-two:
  before_script:
    - cd services/project-two
  cache:
    key: $CI_COMMIT_REF_SLUG-project-two
    policy: pull
    paths:
      - services/project-two/vendor
  only:
    changes:
      - services/project-two/**/*

project-two install:
  extends: .project-two
  stage: install
  script:
    - composer install -n
  cache:
    policy: pull-push
  
project-two test:
  extends: .project-two
  stage: test
  needs:
    - project-two install
  script:
    - composer install
    - composer test
  
project-two lint:
  extends: .project-two
  stage: test
  needs:
    - project-two install
  script:
    - composer install
    - composer lint

Under the needs keyword, you specify the jobs that must succeed for this job to run. That way a failure of a command in one service’s pipeline won’t prevent jobs of another service from running, and jobs can start as soon as the jobs they need have finished, instead of waiting for the whole previous stage.


Link to the repository of example: https://github.com/Azaradel/gitlab-ci-monorepo