Git submodules are useful when you need to use another project as source code inside your own repo. You could copy that project's files straight into your repo, but then you have to decide whether to commit them as part of your repo and, if not, how to track which version you are using.
In the example below, the repo using a submodule repo is called the outer repo.
Add a submodule to a repo
Run git submodule add
From the outer repo, run:
git submodule add <submodule_url> <dir>
e.g.:
git submodule add https://github.com/example/example.git example
This creates a directory example in your repo, which is a submodule referring to https://github.com/example/example.git.
git submodule add adds an entry to the .gitmodules file in your repo with the URL and directory above. It also immediately clones the submodule repo into the specified directory.
Now your outer repo can use the files from the submodule repo.
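To see what git submodule add leaves behind, here is a minimal sketch that runs in a throwaway sandbox. The repo names lib and outer, the temporary directory, and the demo identity are all made up for illustration; the local repo lib stands in for a real submodule URL. The protocol.file.allow override is only there because the demo uses a local path as the submodule URL.

```shell
#!/bin/sh
set -eu

# Throwaway sandbox: every path and name here is made up for the demo.
work=$(mktemp -d)
cd "$work"

# Commit identity for the demo, independent of any global git config.
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

# A tiny local repo standing in for the submodule's remote.
git init -q lib
echo 'hello' > lib/README
git -C lib add README
git -C lib commit -q -m 'initial commit'

# The outer repo.
git init -q outer
cd outer

# Recent git versions block local-path submodule clones by default;
# the -c override is needed only because this demo uses a local path.
git -c protocol.file.allow=always submodule add "$work/lib" example

cat .gitmodules      # the recorded path and URL
ls example           # the submodule's files are already checked out
git status --short   # .gitmodules and example show up as staged additions
```

With a real remote URL, as in the example above, the override should not be necessary.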
Commit the change
Adding a submodule is a change to the outer repo, so we need to commit it.
git add .gitmodules example/
git commit -m "added submodule"
Notice that the commit contains not only the entry in .gitmodules but also an entry for example/ itself, which points to a specific commit in the submodule repo. You can think of a submodule as being tracked in the outer repo by:
- submodule directory in outer repo,
- submodule repo URL,
- submodule commit ID.
A change to any of these needs a commit in the outer repo. Much of the confusion around submodules comes from forgetting that the outer repo points to one specific commit in the submodule repo, and keeps pointing to it until you update it.
Now the outer repo is equipped with a submodule on your local computer. You will probably also want to push the change to the remote of your outer repo.
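The pinned commit can be inspected directly. The sketch below, again using made-up local repos (lib, outer) in a temporary directory, commits the submodule and then lists the tree entry for example. It should appear with mode 160000 and type commit rather than as an ordinary directory.

```shell
#!/bin/sh
set -eu

# Throwaway sandbox: every path and name here is made up for the demo.
work=$(mktemp -d)
cd "$work"
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

# A tiny local repo standing in for the submodule's remote.
git init -q lib
echo 'hello' > lib/README
git -C lib add README
git -C lib commit -q -m 'initial commit'

# The outer repo with the submodule added and committed.
# (-c protocol.file.allow=always is needed only for the local-path URL.)
git init -q outer
cd outer
git -c protocol.file.allow=always submodule add "$work/lib" example
git commit -q -m 'added submodule'

# The outer repo stores example as a single entry that names a commit.
git ls-tree HEAD example

# That commit ID matches what is checked out inside the submodule.
git -C example rev-parse HEAD

# One-line summary of every submodule and its current commit.
git submodule status
```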
Clone git repo with submodules
Now other folks need to work on your repo. They will clone your repo:
git clone /url/to/repo/with/submodules
Note that this does not automatically pull the files from submodules in this repo.
git submodule init
They first need to run:
git submodule init
git submodule init copies the mapping in .gitmodules into the local .git/config file.
git submodule update
Then they need to run git submodule update to fetch the specific commit recorded in the outer repo and check it out in the submodule directory. This will pull files from the submodule repo.
Note that this does not pull the latest commit from the submodule repo; it checks out exactly the commit the outer repo points to.
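The clone, init, update sequence can be tried end to end with local stand-in repos. In this sketch the names lib, outer, and clone are hypothetical, and the protocol.file.allow override is only required because the submodule URL is a local path.

```shell
#!/bin/sh
set -eu

# Throwaway sandbox: every path and name here is made up for the demo.
work=$(mktemp -d)
cd "$work"
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

# Submodule remote and an outer repo that already uses it.
git init -q lib
echo 'hello' > lib/README
git -C lib add README
git -C lib commit -q -m 'initial commit'
git init -q outer
git -C outer -c protocol.file.allow=always submodule add "$work/lib" example
git -C outer commit -q -m 'added submodule'

# A colleague clones the outer repo.
git clone -q outer clone
cd clone
ls -A example                            # prints nothing: the directory is empty

# Step 1: copy the mapping from .gitmodules into .git/config.
git submodule init
git config --get submodule.example.url

# Step 2: fetch and check out the commit the outer repo points to.
git -c protocol.file.allow=always submodule update
ls example                               # the submodule's files are now present
```

If you prefer fewer steps, git submodule update --init combines the two commands, and git clone --recurse-submodules does both at clone time.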
Work in submodule
If you cd into the submodule directory, git treats the submodule as the current repo. All git commands now apply to the submodule, as if the outer repo did not exist. You can switch branches, make changes, commit, push, etc., just as in a normal git repo.
If you cd back to the parent directory, git operates on the outer repo again.
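One way to check which repo git thinks you are in is git rev-parse --show-toplevel, which prints the root of the current working tree. The sketch below uses hypothetical local repos (lib, outer) and compares the answer from the outer repo with the answer from inside the submodule.

```shell
#!/bin/sh
set -eu

# Throwaway sandbox: every path and name here is made up for the demo.
work=$(mktemp -d)
cd "$work"
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

# Submodule remote and an outer repo that uses it.
# (-c protocol.file.allow=always is needed only for the local-path URL.)
git init -q lib
echo 'hello' > lib/README
git -C lib add README
git -C lib commit -q -m 'initial commit'
git init -q outer
cd outer
git -c protocol.file.allow=always submodule add "$work/lib" example

# In the outer directory, the working tree root is the outer repo.
outer_top=$(git rev-parse --show-toplevel)

# Inside the submodule, git reports the submodule as its own repo.
cd example
sub_top=$(git rev-parse --show-toplevel)
git remote get-url origin    # the submodule's own remote, not the outer repo's
cd ..

echo "outer repo root: $outer_top"
echo "submodule root:  $sub_top"
```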
Use a different commit in submodule
You may need to use a different commit for the submodule, for example, when some new work is done in the submodule’s remote repo.
You can cd to the submodule directory, check out the branch or commit you want, and pull if needed.
When you cd back to the parent directory, git status tells you that the submodule's commit has changed, which counts as a change in the outer repo. Commit (and push) this small change so the outer repo tracks the new commit of the submodule repo.
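The whole round trip might look like the sketch below. It assumes made-up local repos (lib as the submodule's remote, outer as the outer repo) and a branch named main; git init -b needs git 2.28 or later. After new work lands in lib, the submodule is moved forward and the outer repo records the new commit.

```shell
#!/bin/sh
set -eu

# Throwaway sandbox: every path and name here is made up for the demo.
work=$(mktemp -d)
cd "$work"
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

# Submodule remote (branch main) and an outer repo pinned to its first commit.
# (-c protocol.file.allow=always is needed only for the local-path URL.)
git init -q -b main lib
echo 'hello' > lib/README
git -C lib add README
git -C lib commit -q -m 'initial commit'
git init -q outer
git -C outer -c protocol.file.allow=always submodule add "$work/lib" example
git -C outer commit -q -m 'added submodule'

# New work appears in the submodule's remote.
echo 'more' >> lib/README
git -C lib commit -q -am 'second commit'

# Move the submodule to the new commit.
cd outer/example
git checkout -q main
git pull -q --ff-only
cd ..

# The outer repo now sees example as modified.
git status --short
git diff --submodule=log

# Record the new commit in the outer repo.
git add example
git commit -q -m 'update example submodule'
```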
Keep submodule up-to-date
Because the outer repo pins the submodule to a specific commit, git submodule update checks out that commit directly, which leaves the submodule's HEAD detached.
It’s easier to keep the submodule on the main branch of its repo and simply pull whenever you need the latest changes:
$ cd my_submodule
$ git checkout main
$ git pull