
Centralised vs Distributed Source Control Systems

15.3
Hello and welcome to this video on centralised versus distributed source control systems. In the last video, we talked a little bit about continuous integration and the building blocks of continuous integration. We touched on source control systems and the importance they have. In this video, we’re going to go into a lot of detail about the difference between a centralised source control system and a distributed source control system. We’ll also talk a bit about the benefits of each type of system. That will help you make the choice when you go and select a source control system in your organisation. And we’ll do a demo.
51
We’ll jump into Visual Studio Team Services and have a quick whistle-stop tour of the code hub within Visual Studio Team Services. And this slide, for reference, as I said, I’m highlighting that we’re going to be talking about version control– specifically, the difference between centralised and distributed version control systems. All righty. So code doesn’t exist unless it’s committed into source control. And as we’re moving into this new era of source code-driven concepts where database is treated like code, infrastructure is treated like code– and today, security and process are even treated like code.
94
And if you don’t have your code committed in a central repository, well, it’s as good as saying that you don’t have a single version of truth about your software that you’re creating. You know, just talking about the infrastructure as code, I would recommend that you go and check out the course that I’ve got up in the library in EDX that takes you into infrastructure as code and talks about how you can apply automation to get to the state of nirvana and have self-healing infrastructure in place. An interesting trend that has come out, as I was fiddling around with Google Trends, is anything that’s searched enough is probably being used enough. Now, that’s just a hypothesis there.
136.4
But if there’s a technology out there and you’re trying to implement it, well, you might not find that it’s working. So you do a few Google searches to figure out how it works. And so Google Search is recording all these data points. And then when you go into Google Trends, you can pivot the search results over a duration of time by category. And in this case, the chart that I’ve created, I’ve done a pivot on– a compare on centralised versus distributed source control systems for computer and IT as a business area over the last 10 to 15 years.
179.4
And there’s a very interesting trend that you see there where SVN, which is the centralised source control system, showing up in green, was at its peak back in the days when Waterfall was the official process for delivery, right? It kind of makes sense because centralised version control systems are best suited for that model of delivery. Whereas the distributed source control system– in this case, Git– starts to spike when agile and scrum came into being. And more so over the last five, seven years where open source has really taken the front seat. Now, this is just the trends off of Google. But again, don’t use that as the pure basis to make the judgement on the distinction.
225.2
I will talk through the data points so that you’re better equipped to take that decision. Centralised source control system, in this case, I’m talking about Team Foundation Version Control. But it’s similar to SVN. The core strengths and the best use– but before we get there, the thing, if you haven’t used a Centralised source control system, is that a Centralised source control system has one server somewhere which acts as the central authority. And anyone who wants to pull code down from the central server needs to connect up to it. So as you connect up to it and download the code, you create a working copy of the code base that you’re working on.
262.4
Now, you don’t tend to download the full history, which means you’ve only got the last change that was committed for the file or for the bunch of files that you’ve downloaded, which means if you wanted to– you would always need to be connected back to the server to get the latest and greatest back from the server. So the moment you go into the file to edit it, it sends a notification back to the server saying that I’m about to edit the file. And it records that. So if someone else tries to edit the file, it will tell them that this file is already being edited by person X.
291.8
And it’s able to do that because every time you want to edit something, you have to go back to the server and pull that information down. And in the process, the server records these key metrics for you. Now, what that establishes is that in a Centralised source control system, you always need connectivity back to the server. Now over the last few years, Team Foundation Version Control released another type of workspace which is called Local Workspace, which includes the history of the last two changes, the current and the previous. Which means you don’t always have to go back to the server. It’s better than absolutely Centralised, but it still has its shortcomings.
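The check-out behaviour described above can be sketched as a toy model. This is only an illustration, not how TFVC or SVN is actually implemented; the `CentralServer` class, its method names, and the file path are hypothetical.

```python
class CentralServer:
    """Toy central server that records who is editing each file."""

    def __init__(self):
        self._editors = {}  # file path -> set of user names editing it

    def begin_edit(self, path, user):
        """Record that `user` is editing `path`; return who else already is."""
        others = sorted(self._editors.get(path, set()) - {user})
        self._editors.setdefault(path, set()).add(user)
        return others

    def end_edit(self, path, user):
        """Forget the edit once the user checks in or undoes the change."""
        self._editors.get(path, set()).discard(user)


server = CentralServer()
first = server.begin_edit("src/app.cs", "alice")   # nobody else is editing yet
second = server.begin_edit("src/app.cs", "bob")    # server reports alice
print(first, second)
```

Every call goes through the one server object, which is why this model only works while the client can reach the server.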
332.4
So what are the key strengths of a Centralised source control system? The first is it easily scales to very large code bases. A few years ago, Microsoft themselves had all the teams using Team Foundation Version Control. And they had huge code bases, as huge as 500 gigs to a terabyte. And it can deal with that, no issues. The other benefit that it offers you is very granular permission control. What that means is if you wanted to apply permissions at the file level and audit permissions at the file level, this allows you to do it.
364.7
Because you have that constant connection back to the server where you can validate if someone is authorised to see that file and then deny them if they’re not. It permits monitoring and usage. So if you’re working in a space where you have IP in code and you kind of want to track who’s accessing that code and why and when they’re using it, then it gives you that auditing. What it also allows you to do is set up exclusive locking. So if you go on the server and say, I want to exclusively lock this file out so no one can edit it, then it allows you to do that. It’s best suited, as I said, for large code bases.
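File-level permissions, auditing, and exclusive locks all rely on the server seeing every request. A minimal sketch of that idea follows; the `AuditedServer` class, the user names, and the file path are made up for illustration and do not reflect any real TFVC API.

```python
from dataclasses import dataclass, field


@dataclass
class AuditedServer:
    """Toy server enforcing per-file read access and logging every request."""
    readers: dict = field(default_factory=dict)    # path -> set of allowed users
    locks: dict = field(default_factory=dict)      # path -> user holding the lock
    audit_log: list = field(default_factory=list)  # (user, action, path, allowed)

    def read(self, path, user):
        allowed = user in self.readers.get(path, set())
        self.audit_log.append((user, "read", path, allowed))
        if not allowed:
            raise PermissionError(f"{user} may not read {path}")
        return f"contents of {path}"

    def lock(self, path, user):
        # The first user to ask holds the lock; later requests are refused.
        holder = self.locks.setdefault(path, user)
        granted = holder == user
        self.audit_log.append((user, "lock", path, granted))
        return granted


server = AuditedServer(readers={"pricing/engine.cs": {"alice"}})
server.read("pricing/engine.cs", "alice")
try:
    server.read("pricing/engine.cs", "bob")
    bob_denied = False
except PermissionError:
    bob_denied = True
alice_lock = server.lock("pricing/engine.cs", "alice")
bob_lock = server.lock("pricing/engine.cs", "bob")
print(bob_denied, alice_lock, bob_lock, len(server.audit_log))
```

The audit log records denied requests as well as granted ones, which is the kind of trail the auditing scenario above needs.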
400.5
It’s best suited when you want to have an audit and access process. And it’s also meaningful to have TFVC as your preferred source control system if the files and the types of projects that you’re creating within that are hard to merge. For example, a few years back, SSIS or [INAUDIBLE], these kind of– it was very hard to merge them. You literally had to overwrite the code in branch B from branch A if you were going to merge them. And they’re really just suited for Centralised, where you can have exclusivity in the locking and have a limited number of people make the changes so the merging becomes easier. Distributed, on the other side, as the name suggests, is distributed.
439.7
You don’t have a central authority. Every node is a server in itself, if you want to put it that way. That could be on a [INAUDIBLE] machine or on a server. It doesn’t really matter. But the biggest benefits with distributed is that you have cross-platform support. So if you were working in a team where you had people working on Windows, Linux, Mac, then this would just work where all of you, across all of the devices, types of devices, and types of OS could use the same source control model. The other benefit that it gives you is it has a lot of automation baked in into the source control system.
477.1
And we’ll talk about that in future videos like Git Hooks, like a pull request, and really fantastic features that allow you to review the code and automate a lot of the processes that you would run on the code. It has complete offline support. When you clone a repository, you get the full history of the repository, which means you could go back to as far back as you wanted without having connectivity back to the server. The drawback, of course, is with offline connectivity, with full history, if the history was really, really large or if the code base was really large, that operation would take a rather long time to complete.
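The trade-off between full offline history and clone time can be shown with a toy model. This is a simplified sketch, not Git's real data structures; the `Repository` class and the commit messages are invented for the example.

```python
import copy
from dataclasses import dataclass, field


@dataclass
class Repository:
    """Toy repository: an ordered list of commit messages."""
    history: list = field(default_factory=list)

    def commit(self, message):
        self.history.append(message)

    def clone(self):
        # A clone copies the whole history, so its cost grows with history size.
        return Repository(history=copy.deepcopy(self.history))

    def log(self):
        # Reads only local data; no server round trip is needed.
        return list(reversed(self.history))


origin = Repository()
for message in ["initial commit", "add login page", "fix typo"]:
    origin.commit(message)

laptop = origin.clone()
laptop.commit("work done on the train")  # committed without any connection
print(laptop.log())
print(origin.log())
```

The clone can inspect and extend its history on its own, while the original stays unchanged until the new commit is pushed back.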
516.1
And that is why Git is best suited– or distributed source control systems are best suited for smaller code bases, not for huge code bases. The other benefit with this type of code base is that it’s got a very enthusiastic user base. And a lot of this user base is evolving the underlying framework in the open source. So it’s gaining a lot of new features because of open-source contributions to it. It’s best suited for small and modular code bases, for highly distributed teams, for teams that are working across platforms, and especially for projects that are greenfield. So without further ado, I’m going to flip into a demo. And we’ll talk about– I’ll show you the code hub within Team Services.
560.4
I will give you a brief walkthrough of how the code hub is plugged in into the different parts within Team Services and how that overlap really helps you benefit from the integration that it offers. All righty. So I’m going to navigate into my Visual Studio Team Services account here. Now, if you don’t have a Visual Studio Team Services account, you can navigate to visualstudio.com and create a free account for yourself. In this Team Services account, I’ve got a demo project already. Now, if I go into the demo project and I click on Code, it will directly take me to the code hub.
599.9
Now, as you come into the code hub, you can see at the top-left side here you’ve got a repository called Demo. Now, this Demo repository is one that I’d created when I created the project. But as you can see, I haven’t put any code in here. And that’s why it’s saying that I should initialize the repository by adding a Readme file. I’m not going to do that. Instead, I’m going to pull this down and show you some other repositories that I have here as well. Now, one really cool feature that’s available within Visual Studio Team Services is that it gives you the ability to import an existing repository.
631.3
If you’re already using Git locally or on another platform like GitHub or privately hosting it, and you wanted to import a repository that you’ve been working on, then simply choose the Import Repository section and select the source type as Git. And specify the clone URL of your Git repository. Now, I’ve got a Git repository on GitHub that I have been working on. So I’m going to click on Clone or Download and copy the endpoint. Take it back to Team Services. Put the clone URL here. Now, if this was a private repository that needed authorization, then of course, I could select this box, specify my username and the personal access token.
673.2
But because my repository on GitHub is public and open source, I don’t need to specify that. And here, you just specify the name of the target repository that you want to create at the point of the import. So for demo’s sake, I’ll call this ImportFromGitHub.
692.9
And click Import. Now, the import process entirely depends on how big your repository is. My repository is probably a few megabytes at best. And so the import process will finish fairly, fairly quickly. There you go. The import process has succeeded. Now, if I click on the files, I can see my full code base has been imported. And the history is intact as well. So if I go and click on History, I can see all the commits that were done. Now, this is really, really cool, right?
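An import like this one can also be approximated by hand with standard Git commands. The sketch below only builds the command lines and does not run them. It is not what the Import Repository feature runs internally; the function name, the `repo.git` working directory, and both URLs are placeholders.

```python
def mirror_commands(source_url, target_url, workdir="repo.git"):
    """Build git commands that copy a repository with its branches and tags."""
    return [
        # Bare mirror clone: fetches every ref from the source repository.
        ["git", "clone", "--mirror", source_url, workdir],
        # Push every ref from the mirror to the (empty) target repository.
        ["git", "-C", workdir, "push", "--mirror", target_url],
    ]


commands = mirror_commands(
    "https://example.com/source/project.git",   # placeholder source URL
    "https://example.com/target/project.git",   # placeholder target URL
)
for command in commands:
    print(" ".join(command))
```

Each list could be passed to `subprocess.run` on a machine with Git installed and access to both repositories.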
722.5
If I had tags, or if I had branches, multiple branches and history, then all of that gets ported over, which means you could just finish something here and start from exactly where you left off within Team Services. What this also gives you is the ability to import a repository from TFVC. Now, within Team Services, I have a project called CentralizedSystem. That has some code in it. As you can see on your screen now, we’ve got a branch called Main. And within the Main branch, we’ve got some legacy code lying in there. So I’m going to take the URI of this path, bring it back in here. As I have TFVC selected, I’m going to stick that in here.
766.2
And then I have an option to migrate the history from this code base as well. I can go as far as 180 days’ worth of history. But this is something I created yesterday. So it’s not going to have a lot of history in there. You have to call out what you want the target repository to be called. And I’m going to call it ImportFromTFVC.
789.2
Click Import.
792.8
Oops, I have a space in the path here. So I’m going to remove that and try again. Brilliant. There we go. Again, this is a tiny, tiny Centralised repository. So it should probably be five megs or so. So the import should be fairly quick. There we go. The import is complete. If I click on Files now, and I click on History, you can see I’ve got the exact changes in there. And if I click on the item, it actually does a comparison as well. So I can see I’m really putting the history to use now. So with that, I can import from existing repositories. I can go on the Manage Repository section, where I have the ability to specify permissions.
835
So if I wanted to lock down the master branch, for example, here, I could do that. At the same time, if I wanted to give certain people access to this repository and specify what sort of accesses they had, then I can control that from here. There’s also this really cool thing called branch policies, which we’re going to cover in a separate video in great detail. But you’ve got all the admin functions here as well. So because I’m the collection administrator here, I have the option of deleting the repository if I wanted to, as well.
864.7
Done. Now, if I go back, I want to show you a few more things here. The first thing I want to show you is the Git setup within Visual Studio Team Services is the same Git version that you have in the open-source. So if you wanted to use SSH, for example, then it allows you to do that. If I navigate into the Security section here, then in the SSH public key section, I can add an SSH key. I’ve already added one for the laptop that I’m using. But you could, of course, use the instructions here to generate a new SSH key and then add that here.
900.6
The other thing I wanted to quickly show you was the integration that the repository has within the other parts of Visual Studio Team Services. For example, if I go into the Dashboard section, there is a widget available that will show me the churn in the code in my repository. Let me click Add. Click Configure. And then as you can see here, it’s showing me from the import here that I have two commits in the repository in the last seven days. So if tracking the code churn is something that you’re interested in, then the out-of-the-box widgets kind of support that.
938.2
The other thing that I wanted to show you was if you go into the Work section in the Backlogs and select User Story, I’ve got a user story here. Team Services allows you to create a Git branch right from within Work Items here. So let me just create a new work item, call it, maybe, Demo Integration.
965.5
Right-click that. And say New Branch. And it gives me an option here to say, all right, this work item, you want to create a new branch. What branch do you want to spin this off from? What is the branch name? And it just associates this work item and links it so that you have that end-to-end traceability in place. So that was a very quick demo of the code hub. Of course, in the future videos, we’re going to go into a lot of detail. But there, I just wanted to set the stage and show you the integration that’s available.

In the previous step, you gained a great understanding of CI and the building blocks of CI.

In this step, we are going to differentiate between Centralised Source Control Systems and Distributed Source Control Systems.

Source Control is the practice of tracking and managing changes to software code. A Source Control System helps you keep track of features, bug tickets and any other changes in your development project over time.

Source Control Systems help software teams work faster and smarter. They are especially useful for DevOps teams, since they help reduce development time and increase successful deployments.

Centralised Source Control System

A Centralised Source Control System keeps the history of changes on a central server, from which everyone requests the latest version of the work and to which everyone pushes their latest changes.

In Centralised Source Control, there is a server and a client. The server is the master repository which contains all of the versions of the code. To pull code from the server, you have to be connected to the server.

When you pull, you typically retrieve only the latest version of the files and not the full history of changes, so you need the server whenever you want older versions.

Key Strengths of Centralised Source Control include:

  • Scalability to very large codebases
  • Granular permission control
  • Permits monitoring of usage
  • Exclusive locking

Centralised Source Control is Best Suited For:

  • Large codebases
  • Audit and access control process
  • Hard to merge files

Distributed Source Control System

In a Distributed Source Control System, each developer or client has their own repository, and so holds a copy of the entire history of the code and all of its branches on their local machine or server.

Each client can work locally or disconnected from the master repository and commit changes to a local repository. To communicate a set of changes to the master repository, you issue a request to the master repository and push your local repository code to the master repository.
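The push step can be illustrated with a small sketch. It models each history as a plain list of commit names and accepts a push only when the local history already contains everything in the master repository, which mirrors Git's default refusal of non-fast-forward pushes. The `push` function and the commit names are hypothetical.

```python
def push(local_history, remote_history):
    """Return the remote history after a push, or raise if it is rejected.

    The push is accepted only when the local history starts with everything
    the remote already has (a "fast-forward").
    """
    if local_history[:len(remote_history)] != remote_history:
        raise ValueError("push rejected: fetch and merge the remote changes first")
    return list(local_history)


shared = ["c1", "c2"]
alice = shared + ["c3-alice"]   # alice commits locally
bob = shared + ["c3-bob"]       # bob commits locally, unaware of alice

shared = push(alice, shared)    # accepted: alice is ahead of the remote
try:
    shared = push(bob, shared)  # rejected: bob has not seen c3-alice
    bob_rejected = False
except ValueError:
    bob_rejected = True
print(shared, bob_rejected)
```

In this model the second developer has to bring the newer commit into their local history before pushing, which is what keeps the master repository consistent.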

Key Strengths of Distributed Source Control include:

  • Cross-platform support
  • Open source friendly code review model via pull requests
  • Complete offline support
  • Portable history
  • Enthusiastic growing user base

Distributed Source Control is Best Suited For:

  • Small and modular codebases
  • Evolving through open source
  • Highly distributed teams
  • Teams working across platforms
  • Greenfield codebases

In the next step, we investigate basic Git commands.

This article is from the free online

Microsoft Future Ready: Continuous Integration Implementation

Created by
FutureLearn - Learning For Life
