#backend
Jan 9, 2023

Backend Developer’s Guide to Monorepos

Jan Zlámal

Some time ago, along with the migration of the platform from an on-premises solution to AWS, we migrated almost all of our git repositories (one project, one repository) into one git repository (monorepository/monorepo) that contains everything. As Showmax is built on a microservices architecture, we had a lot of repositories so it was a dramatic change. In this post I'd like to capture in a few lines some of the pros and cons from my perspective as a Ruby backend developer.

To begin with, I should note that when the migration to a monorepo was announced, I was skeptical about the change. My first experience with the monorepo approach was about 13 years ago. Back then it was in SVN, which works a bit differently from git, and the workflows were arguably worse. Simply put, with SVN the monorepo approach didn't work very well. Today, I can say that working in a git monorepo doesn't differ significantly from the one-git-repo-per-project approach. Let's see the changes:

Negative

  • `git pull` takes a few seconds longer because the repository contains many more tags.
  • All IDE instances share one underlying source code repository, and so:
      • it gets more complicated to work on multiple tickets at the same time: you need to stash, shelve, or commit split changes (my colleague pointed out that there is a `git worktree` command that can manage multiple working trees in one repository. I have personally never worked with it, so perhaps I'll write more about it in the future);
      • it is necessary to check that nothing is left unfinished in another IDE instance opened for another project.
  • Boundaries of big changes through several services are not obvious and are sometimes so large that when a merge request is created it needs to be split before the code review can start.
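The `git worktree` command mentioned in the list above keeps a second branch checked out in its own directory, so unfinished work on one ticket does not have to be stashed. Here is a minimal sketch, assuming a throwaway repository. The branch names `ticket-a` and `ticket-b`, the directory names, and the file names are all hypothetical.

```shell
set -eu

# Throwaway repository standing in for the monorepo (all names are hypothetical).
base=$(mktemp -d)
git init -q "$base/monorepo"
cd "$base/monorepo"
git symbolic-ref HEAD refs/heads/master
git config user.name "Example Dev"
git config user.email "dev@example.com"

mkdir -p services/service0
echo "puts 'v1'" > services/service0/app.rb
git add .
git commit -q -m "init"

# Ticket A: unfinished work simply stays in the main working tree.
git checkout -q -b ticket-a
echo "# work in progress" >> services/service0/app.rb

# Ticket B: a second working tree on its own branch, in a sibling directory.
git worktree add -b ticket-b "$base/monorepo-ticket-b" master
echo "puts 'hotfix'" > "$base/monorepo-ticket-b/services/service0/hotfix.rb"
git -C "$base/monorepo-ticket-b" add .
git -C "$base/monorepo-ticket-b" commit -q -m "ticket B hotfix"

# Both working trees share one repository and one history.
git worktree list
```

Each IDE instance could then open its own directory, while commits made in either working tree end up in the same repository.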

Positive

  • Merge requests can contain multiple services and so the code reviewer sees the feature as a whole package.
  • All changes through multiple services for a given feature can be seen in `git log`.
  • It only takes one pull for everything to be up-to-date.
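To illustrate the `git log` point, here is a small sketch in a throwaway repository with hypothetical file names. It shows a single commit spanning two services, and then the same history narrowed to one service with a path filter.

```shell
set -eu

repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Example Dev"
git config user.email "dev@example.com"

mkdir -p services/service0 services/service2
echo "v1" > services/service0/a.rb
echo "v1" > services/service2/b.rb
git add .
git commit -q -m "init example repo"

# One feature touching two services in a single commit.
echo "v2" >> services/service0/a.rb
echo "v2" >> services/service2/b.rb
git add .
git commit -q -m "feature1"

# A later change that only concerns service2.
echo "v3" >> services/service2/b.rb
git add .
git commit -q -m "service2 only"

# The whole feature, with per-file statistics across services.
git log --stat -n 1 --grep=feature1

# The same history narrowed down to a single service.
git log --oneline -- services/service0
```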

Let’s walk through a practical example of both scenarios: creating a merge request that spans multiple services, and splitting a merge request that has grown too big. This demonstration uses only standard `git` commands. At the end of the article, I share a simpler solution that uses JetBrains products.

All examples are demonstrated on a simplified monorepository. To create the example repository and simulate the changes, I wrote a quick script mrsc.rb that you can use.

Development environment setup

We will use a standard Linux distribution with the `git` and `tree` commands, which should already be available to you. The `tree` command is used to visually display the file structure.

~$ uname -a
Linux 5.18.0-2-amd64 #1 SMP PREEMPT_DYNAMIC Debian 5.18.5-1 (2022-06-16) x86_64 GNU/Linux
~$ git --version
git version 2.35.1
~$ tree --version
tree v2.0.2 (c) 1996 - 2022 by Steve Baker, Thomas Moore, Francesc Rocher, Florian Sesser, Kyosuke Tokoro

Creating the monorepo

First we create the structure of the future monorepo:

~$ mkdir monorepo && cd monorepo
~/monorepo$ ruby ../mrsc.rb init
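The script itself is not reproduced in this article. If it is not at hand, a rough stand-in for the `init` step can be sketched in plain shell. This is not the author's script: it only recreates the seven file paths that appear in the listing further down, the file contents are placeholders, and it works in a temporary directory rather than `~/monorepo`.

```shell
set -eu

# Hypothetical stand-in for `ruby ../mrsc.rb init`; file contents are placeholders.
monorepo=$(mktemp -d)
cd "$monorepo"

mkdir -p gems/gem1
echo "# placeholder" > gems/gem1/gem_file1.rb

for service in service0 service1 service2; do
  mkdir -p "services/$service/app"
  echo "# placeholder" > "services/$service/app/app_file1.rb"
  echo "# placeholder" > "services/$service/service_file1.rb"
done

# List what was created.
find . -type f | sort
```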

Then we create a git repository from it and stage the created structure:

~/monorepo$ git init && git add . && git status && tree .
Initialized empty Git repository in ~/monorepo/.git/
On branch master

No commits yet

Changes to be committed:
  (use "git rm --cached <file>..." to unstage)
	new file:   gems/gem1/gem_file1.rb
	new file:   services/service0/app/app_file1.rb
	new file:   services/service0/service_file1.rb
	new file:   services/service1/app/app_file1.rb
	new file:   services/service1/service_file1.rb
	new file:   services/service2/app/app_file1.rb
	new file:   services/service2/service_file1.rb

.
├── gems
│   └── gem1
│       └── gem_file1.rb
└── services
    ├── service0
    │   ├── app
    │   │   └── app_file1.rb
    │   └── service_file1.rb
    ├── service1
    │   ├── app
    │   │   └── app_file1.rb
    │   └── service_file1.rb
    └── service2
        ├── app
        │   └── app_file1.rb
        └── service_file1.rb

9 directories, 7 files

Let’s commit changes and show the change in git:

~/monorepo$ git commit -m "init example repo" && git log --stat
[master (root-commit) d96dc0d] init example repo
 7 files changed, 23 insertions(+)
 create mode 100644 gems/gem1/gem_file1.rb
 create mode 100644 services/service0/app/app_file1.rb
 create mode 100644 services/service0/service_file1.rb
 create mode 100644 services/service1/app/app_file1.rb
 create mode 100644 services/service1/service_file1.rb
 create mode 100644 services/service2/app/app_file1.rb
 create mode 100644 services/service2/service_file1.rb

commit d96dc0db07383d25c70414b97223747a98b94a35 (HEAD -> master)
Author: Jan Zlamal <jan.zlamal@showmax.com>
Date:   Thu Jul 14 16:34:25 2022 +0200

    init example repo

 gems/gem1/gem_file1.rb             | 3 +++
 services/service0/app/app_file1.rb | 5 +++++
 services/service0/service_file1.rb | 5 +++++
 services/service1/app/app_file1.rb | 5 +++++
 services/service1/service_file1.rb | 1 +
 services/service2/app/app_file1.rb | 2 ++
 services/service2/service_file1.rb | 2 ++
 7 files changed, 23 insertions(+)

Small change through more services

Now that we have the monorepository prepared, let’s demonstrate the advantage of the monorepo approach. In this example we will make a change across multiple services. Thanks to the monorepo approach, the code reviewer will have a better overview of what has changed across the whole architecture and how it works together.

First we create a feature branch, simulate the changes, stage, and commit:

~/monorepo$ git checkout -b feature1
Switched to a new branch 'feature1'
~/monorepo$ ruby ../mrsc.rb example1
~/monorepo$ git add . && git status && tree .
On branch feature1
Changes to be committed:
  (use "git restore --staged <file>..." to unstage)
	modified:   services/service0/app/app_file1.rb
	new file:   services/service0/app/app_file2.rb
	new file:   services/service2/app/app_file2.rb

.
├── gems
│   └── gem1
│       └── gem_file1.rb
└── services
    ├── service0
    │   ├── app
    │   │   ├── app_file1.rb
    │   │   └── app_file2.rb
    │   └── service_file1.rb
    ├── service1
    │   ├── app
    │   │   └── app_file1.rb
    │   └── service_file1.rb
    └── service2
        ├── app
        │   ├── app_file1.rb
        │   └── app_file2.rb
        └── service_file1.rb

9 directories, 9 files
~/monorepo$ git commit -m "feature1" && git log --stat -n 1
[feature1 8913d49] feature1
 3 files changed, 10 insertions(+), 5 deletions(-)
 create mode 100644 services/service0/app/app_file2.rb
 create mode 100644 services/service2/app/app_file2.rb

commit 8913d49209d443e3f9a3b3584d52a8ce6a60ed4d (HEAD -> feature1)
Author: Jan Zlamal <jan.zlamal@showmax.com>
Date:   Thu Jul 14 16:44:40 2022 +0200

    feature1

 services/service0/app/app_file1.rb | 8 +++-----
 services/service0/app/app_file2.rb | 4 ++++
 services/service2/app/app_file2.rb | 3 +++
 3 files changed, 10 insertions(+), 5 deletions(-)

We see that the change is small and ideal for one merge request. Let's assume the change has been reviewed and approved, and so we merge it into the master branch and delete the feature branch. Success, everyone is happy.

~/monorepo$ git checkout master
Switched to branch 'master'
~/monorepo$ git merge feature1 && git branch -d feature1
Updating d96dc0d..8913d49
Fast-forward
 services/service0/app/app_file1.rb | 8 +++-----
 services/service0/app/app_file2.rb | 4 ++++
 services/service2/app/app_file2.rb | 3 +++
 3 files changed, 10 insertions(+), 5 deletions(-)
 create mode 100644 services/service0/app/app_file2.rb
 create mode 100644 services/service2/app/app_file2.rb

Deleted branch feature1 (was 8913d49).

Big change through more services

For the second example, we will demonstrate a disadvantage of the monorepo approach. The starting steps are the same as in the previous example but the change is considerably larger than expected. Since each service is changed as a project in its own instance of the IDE, we may not even realize that it's so big until we actually create the merge request.

It’s very difficult to review big merge requests. Without a “pair code review session” with the one who created the merge request, it may sometimes be simply impossible to correctly understand the overall concept of the change.

In this example we will show how to revert the big commit and split it in a way that every impacted service can have its own merge request.

First we create a new git branch called feature2. Then we stage and create the big commit with all the changes.

~/monorepo$ git checkout -b feature2
Switched to a new branch 'feature2'
~/monorepo$ ruby ../mrsc.rb example2
~/monorepo$ git add . && git status && tree .
On branch feature2
Changes to be committed:
  (use "git restore --staged <file>..." to unstage)
	modified:   gems/gem1/gem_file1.rb
	modified:   services/service0/app/app_file1.rb
	deleted:    services/service0/app/app_file2.rb
	new file:   services/service0/app/app_file3.rb
	modified:   services/service1/app/app_file1.rb
	modified:   services/service2/app/app_file1.rb
	modified:   services/service2/app/app_file2.rb
	new file:   services/service2/lib/lib_file1.rb
	modified:   services/service2/service_file1.rb

.
├── gems
│   └── gem1
│       └── gem_file1.rb
└── services
    ├── service0
    │   ├── app
    │   │   ├── app_file1.rb
    │   │   └── app_file3.rb
    │   └── service_file1.rb
    ├── service1
    │   ├── app
    │   │   └── app_file1.rb
    │   └── service_file1.rb
    └── service2
        ├── app
        │   ├── app_file1.rb
        │   └── app_file2.rb
        ├── lib
        │   └── lib_file1.rb
        └── service_file1.rb

10 directories, 10 files
~/monorepo$ git commit -m "feature2" && git log --stat -n 1
[feature2 b4077d1] feature2
 9 files changed, 130 insertions(+), 9 deletions(-)
 delete mode 100644 services/service0/app/app_file2.rb
 create mode 100644 services/service0/app/app_file3.rb
 create mode 100644 services/service2/lib/lib_file1.rb

commit b4077d1926a5729911fa9c85ad84f8b8c3442247 (HEAD -> feature2)
Author: Jan Zlamal <jan.zlamal@showmax.com>
Date:   Fri Jul 15 13:42:26 2022 +0200

    feature2

 gems/gem1/gem_file1.rb             |  1 +
 services/service0/app/app_file1.rb | 26 ++++++++++++++++++++++++++
 services/service0/app/app_file2.rb |  4 ----
 services/service0/app/app_file3.rb | 13 +++++++++++++
 services/service1/app/app_file1.rb |  8 +++-----
 services/service2/app/app_file1.rb |  1 +
 services/service2/app/app_file2.rb | 45 +++++++++++++++++++++++++++++++++++++++++++++
 services/service2/lib/lib_file1.rb | 39 +++++++++++++++++++++++++++++++++++++++
 services/service2/service_file1.rb |  2 ++
 9 files changed, 130 insertions(+), 9 deletions(-)
~/monorepo$ git status
On branch feature2
nothing to commit, working tree clean

In practice the change can often be even larger, but for this demonstration it is enough. Every time I create a merge request, I look at it myself before assigning someone to review it, and in this case I would notice that it is just too big. It would be better to undo the commit and split it into multiple smaller merge requests. Now that we know what we want to achieve, let’s see how to do it.
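The size of a branch can also be checked from the terminal before a merge request exists. Here is a minimal sketch, assuming a throwaway repository with hypothetical file names. It prints the overall size with `git diff --shortstat` and counts the changed files per project by cutting each path to its first two components.

```shell
set -eu

repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Example Dev"
git config user.email "dev@example.com"

mkdir -p gems/gem1 services/service0 services/service2
for dir in gems/gem1 services/service0 services/service2; do
  echo "v1" > "$dir/file.rb"
done
git add .
git commit -q -m "init"

# A feature branch that touches every project.
git checkout -q -b feature2
for dir in gems/gem1 services/service0 services/service2; do
  echo "v2" >> "$dir/file.rb"
done
git add .
git commit -q -m "feature2"

# Overall size of the branch, measured from the point where it left master.
git diff --shortstat master...HEAD

# Number of changed files per project (first two path components).
touched=$(git diff --name-only master...HEAD | cut -d/ -f1-2 | sort | uniq -c)
echo "$touched"
```

If the second listing shows many projects, that is a hint that the change may be easier to review as several merge requests.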

First we need to "uncommit" the changes. This is done by moving the current branch back to the previous commit with the `git reset` command. In our case we will use a soft reset, which moves the branch but leaves the index and the working tree untouched, so all the changes from the last commit stay staged. That is exactly what we need. If we used a hard reset, the changes from the last commit would be removed from the index and the working tree as well.

~/monorepo$ git reset --soft HEAD^ && git status
On branch feature2
Changes to be committed:
  (use "git restore --staged <file>..." to unstage)
	modified:   gems/gem1/gem_file1.rb
	modified:   services/service0/app/app_file1.rb
	deleted:    services/service0/app/app_file2.rb
	new file:   services/service0/app/app_file3.rb
	modified:   services/service1/app/app_file1.rb
	modified:   services/service2/app/app_file1.rb
	modified:   services/service2/app/app_file2.rb
	new file:   services/service2/lib/lib_file1.rb
	modified:   services/service2/service_file1.rb
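The difference between the two reset modes can be tried safely in a throwaway repository. This is only a sketch with a hypothetical `file.rb`: it performs a soft reset, records what is still staged, recreates the commit, and then performs a hard reset for comparison.

```shell
set -eu

repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Example Dev"
git config user.email "dev@example.com"

echo "v1" > file.rb
git add .
git commit -q -m "init"

echo "v2" > file.rb
git add .
git commit -q -m "big change"

# Soft reset: the branch moves back one commit, the change stays staged.
git reset -q --soft HEAD^
staged_after_soft=$(git diff --cached --name-only)
content_after_soft=$(cat file.rb)

# Recreate the commit, then compare with a hard reset:
# the change disappears from the index and the working tree.
git commit -q -m "big change again"
git reset -q --hard HEAD^
content_after_hard=$(cat file.rb)

echo "staged after soft reset: $staged_after_soft"
echo "file after soft reset:   $content_after_soft"
echo "file after hard reset:   $content_after_hard"
```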

Now the changes are staged again, exactly as they were just before the commit. The `HEAD^` part means the parent of the current commit, that is, one commit back. It's the same as `HEAD~1`. We can use `HEAD~2` or `HEAD~5` if we want to go back 2 or 5 commits respectively, or we can directly specify the hash of the commit we want to get to. The hashes can be looked up with the `git reflog` command, which lists every position that HEAD has pointed to:

~/monorepo$ git reflog
8913d49 (HEAD -> feature2, master) HEAD@{0}: reset: moving to HEAD^
b4077d1 HEAD@{1}: commit: feature2
8913d49 (HEAD -> feature2, master) HEAD@{2}: checkout: moving from master to feature2
8913d49 (HEAD -> feature2, master) HEAD@{3}: merge feature1: Fast-forward
d96dc0d HEAD@{4}: checkout: moving from feature1 to master
8913d49 (HEAD -> feature2, master) HEAD@{5}: commit: feature1
d96dc0d HEAD@{6}: checkout: moving from master to feature1
d96dc0d HEAD@{7}: commit (initial): init example repo

In this example, the `git reset --soft HEAD^` command has the same effect as `git reset --soft 8913d49`, because `8913d49` is the parent of the `b4077d1` commit we are undoing. Thanks to the `git reset` and `git reflog` commands, you basically don't have to worry about "screwing up" something in git, like a bad commit or merge, because you can get back to any previously committed state.
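As a small illustration of that safety net, the sketch below throws away a commit with a hard reset and then brings it back through the reflog. It runs in a throwaway repository with a hypothetical `file.rb`. Note that only committed work can be recovered this way. Uncommitted changes discarded by a hard reset are not recorded in the reflog.

```shell
set -eu

repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Example Dev"
git config user.email "dev@example.com"

echo "v1" > file.rb
git add .
git commit -q -m "init"

echo "v2" > file.rb
git add .
git commit -q -m "feature2"
lost=$(git rev-parse HEAD)

# "Oops": the feature2 commit is no longer part of the branch.
git reset -q --hard HEAD^

# The reflog still remembers where HEAD pointed before the reset.
git reflog -n 3

# HEAD@{1} is the position of HEAD before its most recent move.
git reset -q --hard 'HEAD@{1}'
git log --oneline -n 1
```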

Let’s finally split the big merge request into, let's say, 3 merge requests: one for service0, one for service2, and one for service1 + gem1, because the changes in these two are closely related.

First, we `git stash` everything and find out the stash number.

~/monorepo$ git stash push -m "feature2" && git stash list
Saved working directory and index state On feature2: feature2

stash@{0}: On feature2: feature2

The stash number is 0. Since we are in the feature2 branch, which is in the same state as the master, we can rename it to feature2_service0. We unstash the changes related to service0 using the `git restore` command and then we stage the changes:

~/monorepo$ git branch -m "feature2_service0"
~/monorepo$ git restore --source=stash@{0} -- services/service0 && git add . && git status
On branch feature2_service0
Changes to be committed:
  (use "git restore --staged <file>..." to unstage)
	modified:   services/service0/app/app_file1.rb
	deleted:    services/service0/app/app_file2.rb
	new file:   services/service0/app/app_file3.rb

We then commit and update our big merge request with this change.

~/monorepo$ git commit -m "feature2 for service0"
[feature2_service0 ea8057e] feature2 for service0
 3 files changed, 39 insertions(+), 4 deletions(-)
 delete mode 100644 services/service0/app/app_file2.rb
 create mode 100644 services/service0/app/app_file3.rb

Next, we create a branch and merge request for service1 + gem1 in the same way.

~/monorepo$ git checkout master && git checkout -b feature2_service1_gem1
Switched to branch 'master'
Switched to a new branch 'feature2_service1_gem1'
~/monorepo$ git restore --source=stash@{0} -- services/service1 gems/gem1 && git add . && git status
On branch feature2_service1_gem1
Changes to be committed:
  (use "git restore --staged <file>..." to unstage)
	modified:   gems/gem1/gem_file1.rb
	modified:   services/service1/app/app_file1.rb
~/monorepo$ git commit -m "feature2 for service1 and gem1"
[feature2_service1_gem1 aca2003] feature2 for service1 and gem1
 2 files changed, 4 insertions(+), 5 deletions(-)

Now we just need to make changes for service2.

~/monorepo$ git checkout master && git checkout -b feature2_service2
Switched to branch 'master'
Switched to a new branch 'feature2_service2'
~/monorepo$ git restore --source=stash@{0} -- services/service2 && git add . && git status && git commit -m "feature2 for service2"
On branch feature2_service2
Changes to be committed:
  (use "git restore --staged <file>..." to unstage)
	modified:   services/service2/app/app_file1.rb
	modified:   services/service2/app/app_file2.rb
	new file:   services/service2/lib/lib_file1.rb
	modified:   services/service2/service_file1.rb

[feature2_service2 9e641ff] feature2 for service2
 4 files changed, 87 insertions(+)
 create mode 100644 services/service2/lib/lib_file1.rb

Finally, we delete the no longer needed stash (which is at position 0) and switch to the master.

~/monorepo$ git stash drop 0 && git checkout master
Dropped refs/stash@{0} (1c0d783c065a2b45365102ad9c9b1b6719164a5c)

Switched to branch 'master'
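The manual steps above repeat the same pattern for every branch, so they can be condensed into a small helper. Here is a minimal sketch, assuming a throwaway repository. The `split` function and the file names are hypothetical, and the helper assumes that each split owns a disjoint set of directories.

```shell
set -eu

repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Example Dev"
git config user.email "dev@example.com"

projects="gems/gem1 services/service0 services/service1 services/service2"
for dir in $projects; do
  mkdir -p "$dir"
  echo "v1" > "$dir/file.rb"
done
git add .
git commit -q -m "init"

# One oversized, uncommitted change touching every project, parked in a stash.
for dir in $projects; do
  echo "v2" >> "$dir/file.rb"
done
echo "new" > services/service2/extra.rb
git add .
git stash push -q -m "feature2"

# split <branch> <path>...: new branch from master holding only the given paths.
split() {
  branch=$1
  shift
  git checkout -q -b "$branch" master
  git restore --source='stash@{0}' -- "$@"
  git add -- "$@"
  git commit -q -m "feature2: $branch"
}

split feature2_service0 services/service0
split feature2_service1_gem1 services/service1 gems/gem1
split feature2_service2 services/service2

# Clean up: back to master, the stash is no longer needed.
git checkout -q master
git stash drop -q
git branch --list 'feature2_*'
```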

Just to check that everything looks as it is supposed to:

~/monorepo$ git log --all --stat
commit 9e641ff098084dd329f8731ccf04a6db9cb293cf (feature2_service2)
Author: Jan Zlamal <jan.zlamal@showmax.com>
Date:   Fri Jul 15 15:08:15 2022 +0200

    feature2 for service2

 services/service2/app/app_file1.rb |  1 +
 services/service2/app/app_file2.rb | 45 +++++++++++++++++++++++++++++++++++++++++++++
 services/service2/lib/lib_file1.rb | 39 +++++++++++++++++++++++++++++++++++++++
 services/service2/service_file1.rb |  2 ++
 4 files changed, 87 insertions(+)

commit aca20037700ab9e820fe2c2d1a53a241a2165287 (feature2_service1_gem1)
Author: Jan Zlamal <jan.zlamal@showmax.com>
Date:   Fri Jul 15 14:36:41 2022 +0200

    feature2 for service1 and gem1

 gems/gem1/gem_file1.rb             | 1 +
 services/service1/app/app_file1.rb | 8 +++-----
 2 files changed, 4 insertions(+), 5 deletions(-)

commit ea8057e50de93c7ae8defdee0d4b0841f4d1dc3d (feature2_service0)
Author: Jan Zlamal <jan.zlamal@showmax.com>
Date:   Fri Jul 15 14:31:11 2022 +0200

    feature2 for service0

 services/service0/app/app_file1.rb | 26 ++++++++++++++++++++++++++
 services/service0/app/app_file2.rb |  4 ----
 services/service0/app/app_file3.rb | 13 +++++++++++++
 3 files changed, 39 insertions(+), 4 deletions(-)

commit 8913d49209d443e3f9a3b3584d52a8ce6a60ed4d (HEAD -> master)
Author: Jan Zlamal <jan.zlamal@showmax.com>
Date:   Thu Jul 14 16:44:40 2022 +0200

    feature1

 services/service0/app/app_file1.rb | 8 +++-----
 services/service0/app/app_file2.rb | 4 ++++
 services/service2/app/app_file2.rb | 3 +++
 3 files changed, 10 insertions(+), 5 deletions(-)

commit d96dc0db07383d25c70414b97223747a98b94a35
Author: Jan Zlamal <jan.zlamal@showmax.com>
Date:   Thu Jul 14 16:34:25 2022 +0200

    init example repo

 gems/gem1/gem_file1.rb             | 3 +++
 services/service0/app/app_file1.rb | 5 +++++
 services/service0/service_file1.rb | 5 +++++
 services/service1/app/app_file1.rb | 5 +++++
 services/service1/service_file1.rb | 1 +
 services/service2/app/app_file1.rb | 2 ++
 services/service2/service_file1.rb | 2 ++
 7 files changed, 23 insertions(+)
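Beyond reading the log, it is possible to check mechanically that the split branches together contain exactly the original change. The sketch below is one way to do it and rests on two assumptions: the hash of the original big commit is still known (for example from `git reflog`), and the splits do not overlap. It restores the files from that commit rather than from a stash, merges the splits into a throwaway `verify` branch, and compares the result with the big commit. All names are hypothetical.

```shell
set -eu

repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Example Dev"
git config user.email "dev@example.com"

for project in service0 service1 service2; do
  mkdir -p "services/$project"
  echo "v1" > "services/$project/file.rb"
done
git add .
git commit -q -m "init"

# The original oversized commit.
git checkout -q -b feature2
for project in service0 service1 service2; do
  echo "v2" >> "services/$project/file.rb"
done
git add .
git commit -q -m "feature2"
big=$(git rev-parse HEAD)

# Split it per service, restoring the files straight from the big commit.
for project in service0 service1 service2; do
  git checkout -q -b "feature2_$project" master
  git restore --source="$big" -- "services/$project"
  git add -- "services/$project"
  git commit -q -m "feature2 for $project"
done

# Throwaway branch that combines all the splits again.
git checkout -q -b verify master
for project in service0 service1 service2; do
  git merge -q --no-edit "feature2_$project"
done

# No output and exit status 0 from `git diff --quiet` mean the trees are identical.
if git diff --quiet "$big" verify; then
  echo "the splits add up to the original change"
fi
```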

Conclusion

In the end, switching to a monorepo approach was not a big change. One just needs to keep an eye on a few things. I only ran into the situation where I needed to split a large merge request once and it was related to renaming a payment method in all the services that use it. If you analyze the feature a bit in the beginning, you're very likely to break it down into smaller merge requests that make sense on the first try.

The ability to make merge requests across multiple services seems great to me. As the person doing the code review, I get a lot more insight from one such merge request than from separate merge requests for individual services, which may also arrive at different times. Often I don’t have a full understanding of what exactly my colleagues are working on, so seeing the changes as a whole gives more context.

Tips for JetBrains products users

At Showmax, the vast majority of Ruby developers use RubyMine, which simplifies many operations, especially by making them easier to understand than they are in the terminal.

For me, it's a great tool that saves a lot of time in many areas, including splitting up large changes. In a git window, in local changes, you can create a shelf from selected files and folders and then unshelve it at any time into any change list. This also makes it possible to work on several features simultaneously, and the whole process is radically shorter than the pure terminal and git procedure. Nevertheless, it is good to know both because it may come in handy in the future.
