“You are not using Git correctly”

Paul Marsh
4 min read · May 28, 2021

Recently I was reading a plea for help from someone who was getting a “your file is too large to upload” style error message from Git. That made me ask: are we using Git correctly?

What is Git?

Whilst Git was originally created as the source repository for Linux, it has now taken on a life of its own. Honestly, I cannot definitively answer this question, but I can offer some possible, overlapping rationales for using it. Note that many of these rationales apply equally well to other source repositories, not just Git.

Version Separation

If you are working in a team, or simply want to try developing a new feature, the ability to branch out into a pseudo-isolated version of the code is really helpful. It allows you to make changes without necessarily updating the existing code. I.e. your current code is safe; if your new code does not work, at worst you can throw it all away and go back to the current good version.

Version Time Machine

Whilst having separation via branches is good, you also have a timeline of changes within a branch. This allows you to move your code back to a particular change-set or point in time. It is a bit like a Time Machine back-up: if you do not like your current changes, you can revert to an earlier point.

Version Paper Trail

If you store all your change-set information as you go, you create a virtual paper trail of all the changes. If you want to know when a certain block of code was changed, and who changed it, it is all there. This is also useful because a change can be associated with other items, such as a bug report. You therefore have the opportunity not only to see the change but also to understand why it was made.

Gated Review

If you are using a branching strategy to help ensure the quality of a specific branch, typically some form of release branch, then you can enforce rules so that merges between branches can only occur when certain criteria have been met. E.g. has the code built correctly, have all the tests passed? One typical criterion is whether a certain group of people has looked at and reviewed your changes. This usually requires the changes to be in a human-digestible form, e.g. comparing source code, or changes to an icon. It is typically not very useful for other forms of data, e.g. DLLs, terrain data, lighting data, etc.
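To make the idea of a gate concrete, here is a minimal sketch in Python. It is not any hosting service's API: `MergeRequest`, `can_merge` and the default of two approvals are hypothetical names and values chosen purely for illustration.

```python
from dataclasses import dataclass


@dataclass
class MergeRequest:
    """Hypothetical summary of a proposed merge into a protected branch."""
    build_passed: bool
    tests_passed: bool
    approvals: int  # number of reviewers who have approved the changes


def can_merge(request: MergeRequest, required_approvals: int = 2) -> bool:
    """Return True only when every gate criterion has been met.

    The threshold of two approvals is an assumption for this sketch.
    """
    return (
        request.build_passed
        and request.tests_passed
        and request.approvals >= required_approvals
    )
```

For example, `can_merge(MergeRequest(build_passed=True, tests_passed=False, approvals=3))` is `False`: plenty of reviewers, but the tests have not passed, so the gate stays shut.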

Build Repository

Since the repository is a focal point for the project, it could make sense for it also to be the place where everything related to the project is stored. I.e. if I want a fully reproducible build, I could store everything needed to create the project: the required Visual Studio installer, the database files, the output binaries, etc. Whilst that is an extreme example, having a central repository for the majority of the project’s assets has its advantages. Git Large File Storage (LFS) can support this style of repository.
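Git LFS decides which files it manages from entries in the repository's .gitattributes file, which is what `git lfs track` writes for you. As a rough sketch, the Python below builds those entries for a list of patterns. The helper name `lfs_attribute_lines` and the example patterns are my own assumptions, not part of Git LFS.

```python
def lfs_attribute_lines(patterns):
    """Build .gitattributes entries that hand matching files to Git LFS.

    Each entry has the same form as the line written by `git lfs track`.
    """
    return [f"{pattern} filter=lfs diff=lfs merge=lfs -text" for pattern in patterns]


if __name__ == "__main__":
    # Example patterns only; pick whatever large binary types your project has.
    for line in lfs_attribute_lines(["*.dll", "*.exe"]):
        print(line)
```

In practice you would normally just run `git lfs track "*.dll"` rather than write the file by hand; the sketch only shows what ends up in it.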

What is the Conclusion?

As with almost everything, which rationales make sense depends on the context. Practically I would guess, and it is a guess, that the majority of the time you will NOT want to use Git as a Build Repository. If you do not, you will need a supporting storage mechanism for releases, such as a cloud file store, and you typically only put items into Git that fulfill one of the other rationales. I.e. when you are wondering whether something should be included or .gitignored, ask yourself:

  1. Can a human compare one version of the item with another? If so DO include it. E.g. source code — yes, icons — yes, DLL — no
  2. Is the item generated? If so do NOT include it. E.g. lighting data, intermediate compilations, Terrain Data

NB: As with everything, common sense and pragmatism should be used. E.g. in Unity, source code items have a .meta file that will confuse Unity if it is missing, so it belongs in the repository even though you would not typically “review” it.
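The two questions, plus the pragmatic .meta exception, can be sketched as a small Python check. The suffix and directory lists below are assumptions for illustration only, as is the function name `should_commit`; a real project would tune them and most likely express the result as .gitignore rules instead.

```python
from pathlib import PurePosixPath

# Assumed examples of items a human can meaningfully compare (question 1).
REVIEWABLE_SUFFIXES = {".cs", ".py", ".txt", ".png", ".ico"}
# Assumed examples of directories that only hold generated output (question 2).
GENERATED_DIRS = {"bin", "obj", "Library"}
# Pragmatic exceptions: not reviewed, but the tooling needs them.
ALWAYS_KEEP_SUFFIXES = {".meta"}


def should_commit(path: str) -> bool:
    """Return True if the item looks like it belongs in the repository."""
    p = PurePosixPath(path)
    # Question 2: anything under a generated directory is left out.
    if any(part in GENERATED_DIRS for part in p.parts[:-1]):
        return False
    suffix = p.suffix.lower()
    # Pragmatism: keep files such as Unity's .meta even though nobody reviews them.
    if suffix in ALWAYS_KEEP_SUFFIXES:
        return True
    # Question 1: include only what a human can compare between versions.
    return suffix in REVIEWABLE_SUFFIXES
```

With these assumed lists, `should_commit("Assets/Player.cs")` and `should_commit("Assets/Player.cs.meta")` are both `True`, whilst `should_commit("bin/Debug/Game.dll")` is `False`.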

Written by Paul Marsh

Unity, VR, Enterprise and .Net Developer
