By: Gediminas Backevicius  Topic: Engineering

Create your Own Git Server on Azure Cloud

Introduction

Git is a distributed revision control and source code management (SCM) system with an emphasis on speed. It scales from small to very large projects. The key difference compared with Team Foundation Server (TFS) or Subversion (SVN) is that it is decentralized. With Git, you can do practically anything offline, because developers have their own repositories on their machines. Git has many other advantages, and one of them is branching: you can create and merge branches very easily. It's also open source and free, so it can be used without any charge.

Here at Devbridge we already use a centralized workflow with our own Git server hosted in the cloud. This means that we have shared repositories for projects, stored on a dedicated server. Developers exchange changes through these shared repositories.

This technically deep article describes how to set up a private Git server on Microsoft Azure Cloud.

Setup Git Server

Git was initially designed for Linux kernel development and is best supported on that operating system, so Linux is the better choice for a Git server. Git tools exist for other operating systems such as Windows, Mac OS, BSD, Solaris and Android, but all instructions in this article install the Git server on Ubuntu Server, a Linux distribution.

To manage the Git server we recommend Gitolite. It is easy to install, open source and free. Gitolite also makes it easy to control access to Git repositories.

The installation prerequisite is an Azure Cloud account. If you don't have one, visit the Microsoft Azure Cloud website and try it for free for 90 days.

The first step is to create a virtual machine. Choose the Ubuntu image from the virtual machine gallery and wait a few seconds. To connect to the created Ubuntu Server virtual machine you must install a Secure Shell (SSH) client on your computer. On Windows you can use the PuTTY client.

Choose an Ubuntu LTS version: LTS releases receive five years of long-term support.

After the virtual machine is created you need to attach a new empty virtual disk where Git repositories will be stored. You need to do so because by default the virtual machine is created with two disks. The first one has 30GB space dedicated to the operating system, which is not enough space for storing both the operating system and Git repositories together. The second disk is temporary.

To attach a new disk, go to the menu and click on “Attach empty disk.”

The disk settings dialog will appear.

After selecting the disk settings, wait for a minute and then connect to Ubuntu. When you are connected, you need to mount the new disk where Git repositories will be stored. To mount a new disk, you must first initialize it:

  1. Start fdisk on the new disk:

sudo fdisk /dev/sdc

  2. Type n to create a new partition, type p to make it a primary partition, type 1 to make it the first partition, and then press Enter to accept the default cylinder values. After all of this, type w to write the partition table to the disk.

  3. Create the file system on the new partition:

sudo mkfs -t ext4 /dev/sdc1

The new disk is now initialized, but there is no target directory on which to mount it yet. First, create a dedicated user that will own the repositories:

sudo adduser --system --shell /bin/sh --gecos 'git version control' --group --disabled-password --home /home/git git

When you have created a Git user with its home directory you can mount the new disk. To achieve this you need to type the following commands:

sudo mkdir /home/git/repositories

sudo mount /dev/sdc1 /home/git/repositories

sudo chown -R git:git /home/git/repositories/
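One way to confirm that the mount took effect is to look for the mount point in /proc/mounts. The following is a minimal sketch under that assumption; the helper name is_mounted and the sample text are illustrative only and are not part of the original setup.

```python
def is_mounted(mounts_text, mount_point):
    """Return True if mount_point is listed as a mount target.

    mounts_text is the content of /proc/mounts, where each line reads
    "<device> <mount point> <fs type> <options> <dump> <pass>".
    """
    target = mount_point.rstrip("/") or "/"
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) >= 2 and fields[1] == target:
            return True
    return False


# Made-up sample in the /proc/mounts format, used only for illustration.
sample = (
    "/dev/sda1 / ext4 rw 0 0\n"
    "/dev/sdc1 /home/git/repositories ext4 rw,relatime 0 0\n"
)
print(is_mounted(sample, "/home/git/repositories"))
```

On the server itself, the text would come from open("/proc/mounts").read() instead of the sample string.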

There is one problem: when the server is restarted, you must mount the disk again, which is very inconvenient. To mount the disk automatically, add an entry to the ‘/etc/fstab’ configuration file, which lists the partitions to mount at boot:

UUID=c026b981-874d-484d-a7c8-e591ef3f809d       /home/git/repositories  ext4    defaults        0 0

To list the UUIDs of all attached devices, run the blkid command:

sudo blkid
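Copying a UUID by hand is error-prone, so the fstab line can also be generated from the blkid output. The sketch below assumes blkid prints lines of the form `/dev/sdc1: UUID="..." TYPE="ext4"`. The helper names parse_blkid and fstab_entry are hypothetical, and the /dev/sda1 UUID in the sample is made up.

```python
import re


def parse_blkid(output):
    """Map device name -> UUID from the text printed by `sudo blkid`."""
    uuids = {}
    for line in output.splitlines():
        device, sep, attributes = line.partition(":")
        # \b keeps PARTUUID="..." from being mistaken for UUID="..."
        match = re.search(r'\bUUID="([^"]+)"', attributes)
        if sep and match:
            uuids[device.strip()] = match.group(1)
    return uuids


def fstab_entry(uuid, mount_point, fs_type="ext4", options="defaults"):
    """Format one /etc/fstab line with the dump and pass fields set to 0."""
    return "UUID=%s\t%s\t%s\t%s\t0 0" % (uuid, mount_point, fs_type, options)


# Sample blkid output; only the /dev/sdc1 UUID comes from the article.
sample = (
    '/dev/sda1: UUID="11111111-2222-3333-4444-555555555555" TYPE="ext4"\n'
    '/dev/sdc1: UUID="c026b981-874d-484d-a7c8-e591ef3f809d" TYPE="ext4"\n'
)
print(fstab_entry(parse_blkid(sample)["/dev/sdc1"], "/home/git/repositories"))
```

The printed line can then be appended to /etc/fstab after checking it by eye.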

Setup Gitolite Server

Before installing Gitolite you need to install the git-core package:

sudo apt-get install git-core

Once git-core is installed, you can install Gitolite version 3.x. To do so, log in as the git user and download the Gitolite source code from GitHub:

cd /home/git

sudo su - git

mkdir bin

git clone https://github.com/sitaramc/gitolite.git

gitolite/install -to /home/git/bin

To set up Gitolite you need to upload the admin’s public key to the server and then type these commands:

bin/gitolite setup -pk admin.pub

exit

Now you can clone ‘gitolite-admin’ and manage your users and repositories.

Setup Gitweb

You can also install Gitweb, a web interface for Git. With Gitweb you can browse directory trees, see the log or shortlog of a given branch, and examine commits, commit messages and the changes made by a given commit.

To install Gitweb type this command:

sudo apt-get install gitweb

To access Gitweb you must open port 80 for the virtual machine. To do so, go to the virtual machine’s configuration and add an endpoint for that port. We have configured Gitweb so that it shows our repositories grouped by client.

The Gitweb configuration can be found in /etc/gitweb.conf. In this file you need to change the $projectroot and $projects_list values:

$projectroot = "/home/git/repositories";
$projects_list = "/home/git/projects.list";
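To preview how repositories fall into per-client groups, the projects list can be read and grouped by the first path component. This is only a sketch. It assumes that each line of the file starts with a repository path relative to $projectroot and that repositories are stored as Client/Repository. The helper name group_by_client and the sample entries are hypothetical.

```python
from collections import OrderedDict


def group_by_client(projects_list_text):
    """Group repository paths by their first path component."""
    groups = OrderedDict()
    for line in projects_list_text.splitlines():
        fields = line.split()
        if not fields:
            continue
        path = fields[0]  # ignore anything that follows the path
        client, slash, name = path.partition("/")
        if not slash:
            # Repository stored directly under the project root
            client, name = "(no client)", path
        groups.setdefault(client, []).append(name)
    return groups


# Made-up entries in the assumed one-path-per-line format.
sample = (
    "Client1/Repository1.git\n"
    "Client1/Repository2.git\n"
    "Client2/Site.git\n"
    "Standalone.git\n"
)
for client, repos in group_by_client(sample).items():
    print("%s: %s" % (client, ", ".join(repos)))
```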

By default Gitweb is public, which means that anyone can access it. If that is not acceptable, you can enable basic authentication.

Gitolite Server Administration

The biggest disadvantage of Gitolite is that it doesn’t have any user interface. Users, repositories and access rules are maintained by making changes to a special repository called 'gitolite-admin' and pushing those changes to the server. When there aren’t many repositories this doesn’t cause any problems, but as the number of repositories grows, management takes more time.

Structure of our 'gitolite-admin' repository:

/keydir
     admin.pub
     firstname.lastname.pub
/conf
     gitolite.conf
     Client1/
          Repository1.conf
          Repository2.conf
     Client2/
     Client3/

In the main configuration file you can declare teams. You can divide developers into teams, and it will make repository management easier.

Example ‘gitolite.conf’ file:

@admins = admin dev1
@team1 = developer1 developer2 developer3
@team2 = developer4

include "*/*.conf"

Example ‘Repository1.conf’ file:

repo Client1/Repository1
     owner = Devbridge
     category = Client1
     desc = Repository’s description goes here…
     RW+ = @admins @team1
     R = gitweb
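Because every repository gets its own small file, these files can be generated rather than typed by hand. The following sketch renders a file in the same shape as the example above. The function names render_repo_conf and conf_path are hypothetical, and the default owner value is simply the one used in the example.

```python
def render_repo_conf(client, repository, description, writers, owner="Devbridge"):
    """Render the content of conf/<client>/<repository>.conf."""
    lines = [
        "repo %s/%s" % (client, repository),
        "     owner = %s" % owner,
        "     category = %s" % client,
        "     desc = %s" % description,
        "     RW+ = %s" % " ".join(writers),
        "     R = gitweb",
    ]
    return "\n".join(lines) + "\n"


def conf_path(client, repository):
    """Path inside the gitolite-admin repository, matching include "*/*.conf"."""
    return "conf/%s/%s.conf" % (client, repository)


print(conf_path("Client1", "Repository1"))
print(render_repo_conf("Client1", "Repository1", "Sample repository",
                       ["@admins", "@team1"]))
```

The rendered text would be written to the printed path in a local clone of 'gitolite-admin', then committed and pushed like any other change.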

Create Git Server Backups on Azure Blob Storage

When you have a Git server in which all your repositories are stored, it is good practice to make a backup of your repositories. Backups are necessary in order to restore shared repositories if the server crashes. If you don’t have a backup or it’s corrupted, you can restore a shared repository from your local repository.

We store our backups in Windows Azure Blob Storage. Once a day, all repositories are archived and copied into storage. In order to do this you can use our script written in Python. This script is based on Python Windows Azure SDK, which allows accessing Windows Azure Storage.

Content of the ‘git_backup.py’ file:

import os
import hashlib
import shutil
import ConfigParser
from datetime import datetime
from azure.storage import BlobService

# Size of each uploaded block: 4 MB
BLOCK_SIZE = 4 * 1024 * 1024

def read_config():
    config = ConfigParser.ConfigParser()
    # Keep option names case-sensitive
    config.optionxform = str
    config.read("config.ini")
    return config

def create_repositories_backup(tmpdir, repositories):
    # Archive the whole repositories directory into tmpdir
    shutil.make_archive(tmpdir + get_file_name_without_extension(),
                        get_archive_format(),
                        repositories)

def get_archive_format():
    return "zip"

def get_file_name_without_extension():
    # One archive per day, for example repositories_2012-12-31
    return "repositories_%s" % datetime.now().strftime("%Y-%m-%d")

def get_file_name():
    return get_file_name_without_extension() + "." + get_archive_format()

def upload_file(name, key, container, tmpdir, directory, filename):
    blob_service = BlobService(account_name=name, account_key=key)
    # Note: 'container' grants public read access to the container;
    # omit this argument to keep the backups private
    blob_service.create_container(container,
                                  x_ms_blob_public_access='container')
    with open(tmpdir + filename, "rb") as file:
        block_list = []
        while True:
            block = file.read(BLOCK_SIZE)
            if not block:
                break
            # Use the block's SHA-256 digest as its block ID
            block_id = hashlib.sha256(block).hexdigest()
            block_list.append(block_id)
            blob_service.put_block(container, directory + filename,
                                   block, block_id)
            print("Block was uploaded...")
    # Commit the uploaded blocks, in order, as a single blob
    blob_service.put_block_list(container, directory + filename, block_list)
    print("File '%s' was uploaded to blob storage." % filename)

def remove_tmp_file(tmpdir, filename):
    os.remove(tmpdir + filename)

def main():
    print("git-dump v1.1")
    print("Copyright (c) 2012 Devbridge")
    if os.path.exists("config.ini"):
        config = read_config()
        tmpdir = config.get("git", "tmpdir")
        create_repositories_backup(tmpdir, config.get("git", "repositories"))
        upload_file(config.get("storage", "account-name"),
                    config.get("storage", "account-key"),
                    config.get("storage", "container"),
                    tmpdir,
                    config.get("storage", "directory"),
                    get_file_name())
        remove_tmp_file(tmpdir, get_file_name())
    else:
        print("Configuration file 'config.ini' not found.")

if __name__ == "__main__":
    main()
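The block-splitting part of the upload can be tried without an Azure account. The sketch below isolates that loop in a hypothetical helper, iter_blocks, which is not part of the script. It reads a stream in fixed-size pieces and uses each piece's SHA-256 digest as its ID, in the same way as the script. The tiny block size in the demonstration is only there to keep the example readable.

```python
import hashlib
import io

BLOCK_SIZE = 4 * 1024 * 1024  # 4 MB, the block size used by the script


def iter_blocks(stream, block_size=BLOCK_SIZE):
    """Yield (block_id, block) pairs until the stream is exhausted."""
    while True:
        block = stream.read(block_size)
        if not block:
            break
        yield hashlib.sha256(block).hexdigest(), block


# Ten bytes split into blocks of at most four bytes: sizes 4, 4 and 2.
data = io.BytesIO(b"a" * 10)
for block_id, block in iter_blocks(data, block_size=4):
    print("%s... %d bytes" % (block_id[:8], len(block)))
```

Note that two blocks with identical content get the same ID, because the ID is derived only from the block's bytes.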

Content of the ‘config.ini’ file:

[storage]
account-name = enter your account name here...
account-key = enter your account key here...
container = backups
directory = git/
[git]
repositories = /home/git/repositories/
tmpdir = /tmp/
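A missing option in this file only shows up when the backup runs, so it can help to check the file beforehand. The following is a minimal sketch written for Python 3 (it uses the configparser module). The helper name missing_options is hypothetical, and the list of required options is taken from the config.get calls in the backup script.

```python
import configparser

# Options the backup script reads, per section
REQUIRED = [
    ("storage", ["account-name", "account-key", "container", "directory"]),
    ("git", ["repositories", "tmpdir"]),
]


def missing_options(config_text):
    """Return "section.option" names that are absent from config_text."""
    parser = configparser.ConfigParser(interpolation=None)
    parser.optionxform = str  # keep option names case-sensitive
    parser.read_string(config_text)
    missing = []
    for section, options in REQUIRED:
        for option in options:
            if not parser.has_option(section, option):
                missing.append("%s.%s" % (section, option))
    return missing


# A deliberately incomplete configuration, for illustration only.
sample = "[git]\ntmpdir = /tmp/\n"
print(missing_options(sample))
```

The directory and tmpdir values should keep their trailing slash, because the backup script builds paths by plain string concatenation.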

To run this script, you must install the Python Windows Azure SDK. First, install pip, the Python package manager:

sudo apt-get install python-pip

Once pip is installed, you can install the Azure SDK. To do so, type this command:

sudo pip install azure

To automate this script, create a daily job file named ‘git-backup-daily’ and copy it to ‘/etc/cron.daily/’:

#!/bin/sh

cd /home/git

python git_backup.py

You also need to make the job file executable:

sudo chmod +x /etc/cron.daily/git-backup-daily

To test the daily job, run it by hand:

sudo /etc/cron.daily/git-backup-daily

Conclusion

After all configuration and tweaking are complete, we have a fully functional Git server with repository and user administration, backups, and web access to repositories. On top of all this, everything is hosted on the solid Azure Cloud.

