Mono Repo vs Multi Repo

Tech Evermaps
10 min read · May 17, 2022

There are two main strategies for hosting and managing code via Git: mono-repo (single repo) and multi-repo. Both approaches have their advantages and disadvantages, and either can be used for any code base in any language. Both work for projects containing a handful of libraries or thousands of them, for teams of a few members or hundreds, and for private or open source code. Which one to choose depends on various factors.

What are the pros and cons of each approach? When should you use one or the other? Let’s find out!

What are repositories?
A repository stores all the files of a project along with the history of their changes, allowing developers to “version control” the project’s resources throughout its development.

We usually refer to Git repositories (as provided by GitHub, GitLab or Bitbucket), but the concept also applies to other version control systems (such as Mercurial).

There are two main strategies to host and manage our code base via Git: the mono-repo approach and the multi-repo approach.

What is a Mono repository?

The mono-repo approach uses a single repository to house all the code for the multiple libraries or services that make up an organization’s projects. At the extreme, an organization’s entire code base — spanning various projects and coded in different languages — is housed in a single repository.

Advantages of mono repository
Hosting the entire code base on a single repository offers the following benefits.

  • Lowers barriers to entry
    When new staff members start working for a company, they have to download the code and install the necessary tools to start working on their tasks. Let’s assume that the project is scattered across many repositories, each with its own installation instructions and tooling requirements. In this case, the initial setup will be complex and, more often than not, the documentation will not be complete, forcing these new team members to ask their colleagues for help. A single repository simplifies things. Because there is a single location containing all the code and documentation, you can streamline the initial setup.
  • Centralized code management
    Having a single repository gives all developers visibility into all code. This simplifies code management because we can use a single issue tracker to monitor all issues throughout the application lifecycle. This is valuable, for example, when an issue spans two (or more) child libraries and the bug lives in the library the others depend on. With multiple repositories, it can be difficult to find the piece of code where the problem occurs. On top of that, we would need to decide in which repository to create the issue, and then invite and notify members of other teams to help solve it. With a single repository, however, locating code problems and collaborating on troubleshooting becomes easier.
  • Painless application-wide refactorings
    When creating an application-wide code refactoring, multiple libraries will be affected. If you host them via multiple repositories, managing all the different pull requests to keep them in sync with each other can be a challenge.
    A single repository makes it easy to make all the changes to all the code for all the libraries and submit it under a single pull request.
  • Harder to break adjacent functionality
    With mono-repo, we can set up all tests for all libraries to run whenever a single library is modified. As a result, a change in one library is less likely to have negative effects on other libraries without anyone noticing.
  • Teams share a development culture
    With a single-repo approach, it becomes difficult (though not impossible) for unique subcultures to form among different teams. Since they share the same repository, they will most likely share the same programming and management methodologies and use the same development tools.
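
The "harder to break adjacent functionality" point can be illustrated with a small sketch. The article describes running all tests on every change; a common refinement, shown here as an assumption and not as something the article prescribes, is to run the tests of the changed library plus every library that depends on it. The dependency map and the library names are hypothetical.

```python
from collections import deque

# Hypothetical dependency map: library -> libraries it depends on.
DEPENDS_ON = {
    "ui": ["core", "http"],
    "http": ["core"],
    "billing": ["http"],
    "core": [],
    "docs": [],
}


def libraries_to_test(changed, depends_on):
    """Return the changed library plus all of its direct and transitive dependents."""
    # Invert the map: library -> libraries that depend on it.
    dependents = {}
    for lib, deps in depends_on.items():
        dependents.setdefault(lib, [])
        for dep in deps:
            dependents.setdefault(dep, []).append(lib)

    # Breadth-first walk over the inverted graph.
    affected = {changed}
    queue = deque([changed])
    while queue:
        current = queue.popleft()
        for lib in dependents.get(current, []):
            if lib not in affected:
                affected.add(lib)
                queue.append(lib)
    return sorted(affected)


print(libraries_to_test("http", DEPENDS_ON))
```

Because every library lives in the same repository, this map can be computed from a single checkout, which is what makes such a check practical in a mono-repo.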

Problems with the mono-repo approach

Using a mono repository for all our code has several drawbacks.

  • Slower development cycles
    When the code in one library contains breaking changes that cause dependent libraries to fail their tests, those libraries must also be fixed before the changes can be merged. If the dependent libraries belong to other teams, who are busy with other tasks and are unable (or unwilling) to adapt their code, development of the new feature may come to a halt.
    In addition, the project may only move at the speed of the slowest team in the company. This can frustrate the fastest team members, to the point that they may want to leave the company.
    Finally, a change to one library also triggers the tests for all the other libraries. More tests to run, plus the time it takes to get them all to pass, slow down how quickly we can iterate on our code.
  • Requires downloading the entire code base
    When the single repository contains all of a company’s code, it can be huge, containing gigabytes of data. To contribute to any library hosted within it, anyone would need to download the entire repository. Dealing with a large code base means poor use of space on our hard drives and slower interactions with it. For example, daily actions such as running git status or searching the code base with a regular expression can take several seconds or even minutes longer than with several repositories.
  • Unmodified libraries can have a new version
    When we tag the single repository, all code is assigned the new tag. If this action triggers a new release, then all libraries hosted in the repository will be published again with the version number of the tag, even if many of these libraries have not undergone any changes.
  • Forking is more difficult
    Open source projects should make it as easy as possible for contributors to get involved. With multiple repositories, contributors can go directly to the specific repository of the project they want to contribute to. With a single repository hosting various projects, however, contributors must first find their way into the right project and will need to understand how their contribution may affect all other projects.
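
The "unmodified libraries can have a new version" problem can be shown with a minimal sketch. It assumes a release process in which a repository-wide tag publishes every library under that tag's version number. The library names, the change flags and the tag are hypothetical.

```python
# Hypothetical state of a mono-repo: library -> whether it changed since the last tag.
CHANGED_SINCE_LAST_TAG = {"core": True, "http": False, "ui": False}


def releases_for_repo_tag(tag, changed):
    """A repository-wide tag applies to every library, changed or not."""
    return {lib: tag for lib in changed}


def unchanged_but_republished(changed):
    """Libraries that receive a new version number without any code change."""
    return sorted(lib for lib, was_changed in changed.items() if not was_changed)


print(releases_for_repo_tag("2.4.0", CHANGED_SINCE_LAST_TAG))
print(unchanged_but_republished(CHANGED_SINCE_LAST_TAG))
```

Here only "core" changed, yet all three libraries end up published as 2.4.0.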

Google decided early on to use a monorepo. According to figures Google published in 2016, it contains:
  • 86 terabytes of data.
  • 2 billion lines of code.
  • 9 million unique source files.

What is a multi repository?

The multi-repo approach uses multiple repositories to host the multiple libraries or services of a project developed by an organization. At its extreme, it will host each minimal set of reusable code or standalone functionality (such as a microservice) under its own repository.

Advantages of multi-repo

Hosting each library independently of all others offers a plethora of benefits.

  • Independent library version management
    When tagging a repository, its entire code base is assigned the new tag. Since only the code for a specific library is in the repository, the library can be tagged and versioned independently of all other libraries hosted elsewhere. Having an independent version for each library helps define the application’s dependency tree, allowing us to configure which version of each library to use.
  • Independent service versions
    Since the repository contains only the code of a service and nothing else, it can have its own deployment cycle, independent of any progress made on the applications that access it. The service can use a rapid release cycle such as continuous delivery (where new code is deployed after passing all tests). Some libraries accessing the service may use a slower release cycle, such as those that produce a new version only once a week.
  • Helps define access control throughout the organization
    Only team members involved in the development of a library need to be added to the corresponding repository and download its code. This gives each layer of the application an implicit access control policy. People involved in the library are granted edit rights, while everyone else may have no access to the repository at all, or may be given read rights but not edit rights.
  • Allows teams to work independently
    Team members can design the library architecture and implement its code working independently of all other teams. They can make decisions based on what the library does in the general context without being affected by the specific requirements of an external team or application.
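
To illustrate independent library versioning, here is a minimal sketch of how an application could configure which version of each library to use. It assumes each library is tagged with three-part version numbers in its own repository and that the application asks for a major version per library. The release history and library names are hypothetical.

```python
# Hypothetical release history: each library lives in its own repository
# and is tagged independently of the others.
RELEASES = {
    "auth": ["1.0.0", "1.1.0", "2.0.0"],
    "maps": ["0.9.0", "1.0.0"],
}


def parse(version):
    """Turn '1.10.2' into (1, 10, 2) so versions compare numerically."""
    return tuple(int(part) for part in version.split("."))


def resolve(requirements, releases):
    """Pick, for each library, the newest release within the requested major version."""
    resolved = {}
    for lib, major in requirements.items():
        candidates = [v for v in releases[lib] if parse(v)[0] == major]
        if not candidates:
            raise LookupError(f"no release of {lib} with major version {major}")
        resolved[lib] = max(candidates, key=parse)
    return resolved


# The application stays on auth 1.x even though auth 2.0.0 has been released.
print(resolve({"auth": 1, "maps": 1}, RELEASES))
```

Real package managers implement richer constraint syntax; the point is only that each library's version moves on its own timeline.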

Problems with the multi-repo approach

Using multiple repositories can lead to several problems.

  • Libraries must be constantly resynchronized
    When a new version of a library containing breaking changes is released, the libraries dependent on that library will have to be adapted to start using the latest version. If the release cycle of the library is faster than that of its dependent libraries, they could quickly fall out of sync with each other. Teams will need to constantly update to use the latest versions from other teams. Since different teams have different priorities, this can sometimes be difficult to achieve. As a result, a team unable to catch up may end up sticking with the outdated version of the library it depends on. This outcome will have implications for the application (in terms of security, speed and other considerations), and the development gap between libraries may only widen.
  • Can fragment teams
    When different teams don’t need to interact, they can work in their own silos. In the long run, this could lead to teams producing their subcultures within the company, such as using different programming or management methodologies or using different sets of development tools.
    If a team member eventually has to work on another team, they may experience culture shock and have to learn a new way of doing their job.
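
The resynchronization problem can be made visible with a small report of how far each consumer has fallen behind. This is a sketch under the assumption that versions are three-part numbers and that the pinned versions of every service are available in one place. All service names, library names and versions are hypothetical.

```python
# Hypothetical data: latest release of each library, and the versions each service pins.
LATEST = {"auth": "3.2.0", "maps": "1.4.1"}
PINNED = {
    "service-a": {"auth": "1.8.0", "maps": "1.4.1"},
    "service-b": {"auth": "3.0.0"},
}


def major(version):
    return int(version.split(".")[0])


def major_versions_behind(pinned, latest):
    """Report, per service, the libraries that lag by at least one major version."""
    report = {}
    for service, deps in pinned.items():
        lagging = {
            lib: major(latest[lib]) - major(version)
            for lib, version in deps.items()
            if major(latest[lib]) > major(version)
        }
        if lagging:
            report[service] = lagging
    return report


print(major_versions_behind(PINNED, LATEST))
```

In this example only service-a is reported, because it is two major versions behind on "auth".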

Mono-repo vs. multi-repo: main differences

Both approaches ultimately address the same objective: managing the code base. Therefore, they both have to solve the same challenges, including version management, promoting collaboration between team members, issue management, test execution, etc.

Their main difference is when team members have to make decisions: upstream (before the change is merged) for mono-repo, or downstream (after the change is released) for multi-repo.

Let’s analyze this idea in more detail.

Since all libraries are versioned independently in multi-repo, a team that releases a library with significant changes can do so safely by assigning a new major version number to the latest version. Other groups can have their dependent libraries stick with the old version and move to the new one once their code has been adapted.
This approach leaves the decision to adapt all other libraries to each responsible team, which can do so at any time. If they do it too late and new versions of libraries are released, it will become increasingly difficult to bridge the gap between libraries.
As a result, while one team may iterate quickly and often on its code, other teams may be unable to catch up, ultimately producing libraries that diverge.
On the other hand, in a mono-repo environment, we cannot release a new version of a library that breaks another library because their tests will fail. In this case, the first team must communicate with the second team to integrate the changes. This approach forces teams to completely adapt all libraries whenever a change needs to occur for a single library. All teams are forced to talk to each other and find a solution together.
As a result, the first team will not be able to iterate as fast as they want, but the code between the different libraries will not start to diverge at any point.
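
The contrast described above can be sketched in a few lines. The first function assumes a multi-repo team that follows a semantic-versioning-style convention (breaking changes bump the major number); the second assumes a mono-repo merge gate that requires every dependent library's tests to pass. Both functions and their names are hypothetical illustrations, not part of any real tool.

```python
def multi_repo_next_version(current_version, breaking):
    """Downstream decision: release now, and signal breakage with a new major version."""
    major, minor, _patch = (int(part) for part in current_version.split("."))
    if breaking:
        return f"{major + 1}.0.0"
    return f"{major}.{minor + 1}.0"


def mono_repo_can_merge(dependent_test_results):
    """Upstream decision: the change merges only if every dependent library's tests pass."""
    return all(dependent_test_results.values())


# Multi-repo: the breaking change ships as 2.0.0; dependents may stay on 1.x for now.
print(multi_repo_next_version("1.4.2", breaking=True))

# Mono-repo: the same change is blocked until the failing dependent is adapted.
print(mono_repo_can_merge({"ui": True, "billing": False}))
```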

In short, the multi-repo approach can help create a culture of “go fast and break things” among teams, where independent agile teams can produce their output at their own speed. In contrast, the mono-repo approach fosters a culture of awareness and care, where teams are not left to deal with a problem on their own.

[Figure: Summary of the advantages of multi-repo and mono-repo]

Mono-repo vs. multi-repo: how to choose?

There is no one-size-fits-all solution. The answer lies in your team’s collaborative analysis of the project requirements and available resources. All possible approaches need to be considered, and you need to tailor your decision to the specific requirements and needs of your organization, just as we did at Outbrain.
As with many development issues, there is no predefined answer on which approach to use. Different companies and projects will benefit from one strategy or another depending on their unique conditions, such as:

  • How large is the code base? Does it contain gigabytes of data?
  • How many people will be working on the code base? Is it about 10, 100 or 1,000?
  • How many packages will there be? Is it about 10, 100 or 1,000?
  • How many packages does the team have to work on at any given time?
  • How closely are the packages related?
  • Are different programming languages involved? Do they require the installation of any special software or hardware to work?
  • How many deployment tools are needed and how complex are they to implement?
  • What is the culture in the company? Are teams encouraged to collaborate?
  • What tools and technologies do the teams know how to use?

The WordPress account on GitHub hosts examples of the mono-repo and multi-repo approaches.

Hybrid-repo

With a hybrid-repo, one repo holds the internal libraries and APIs that are shared between teams. To keep compilation fast and the repo lean, we make sure it does not include slow-to-compile libraries, such as libraries written in Scala or libraries that have long-running tests.

One of the advantages of hybrid-repos is that they reduce dependency conflicts, since a single repo manages all shared dependencies. Repositories with services now have to upgrade only one dependency instead of several versions from multiple teams, and we can still use the bumper tool to bump (external/internal) versions, which makes version alignment easier.

However, we need to make sure the repo stays at a manageable size and does not keep growing.

Evermaps Use Case

We wrote an in-depth article on why we believe mono repos are the right choice for our Store Locator Factory.


Evermaps is a SaaS company and web solutions provider that helps its customers develop and manage their reputation, visibility, and local SEO.