Git: a guide to proper use

Everyone knows how to use git nowadays, or do they? I've found that many engineers, while being able to use git, do not know how to use it effectively or why we do certain things in git.

Of course, I know that many just commit everything with the same message, but I'd never had the pleasure of working closely with them. Thanks to the newly joined members of my team, that changed. They thought they could get away with commits such as "fix tests". I had a different idea, and it took me considerable effort to get them on track and explain to them the fine art of the git commit.

We use git to enable many engineers to work on a project at the same time. That much is always clear, because it's the explicit mechanism implemented by git. More subtly, git also lets us work with time and knowledge. With git we can share the knowledge of the development process with others, in the present and in the future.

The code is the result of the development process, and yes, we share that in git, but that's the equivalent of sharing the output of an application. In software engineering, we need to know how that output was reached. Sharing this information with git requires more careful use than most engineers give it.

Making small commits is the first step. How small? A single function, if the function is simple. A new module, again only if the module is simple.

Writing clear commit messages is the second step, deeply intertwined with the first one. It's hard to say in 50 characters what you're doing in 200 lines, so to have well-written commits you need small commits first. Having a common commit template is very helpful, because everyone knows what to write in the commit. For example:

feat/cli: create command shows a success message

This commit is very effective at communicating that the create command now shows a success message. feat/cli clarifies it even more: it's an addition and it's in the Cli module. Implicitly, it's also saying that this codebase has a command-line interface that has a create command. If the author is adding a success message, we can guess that the create command was added recently or that it was not so important.

As we read the git log, we can adjust our guess and obtain information about the development process that we could not have obtained if the commit message were just a little less precise:

create shows a success message
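
A template is easier to follow when a tool can check it. The sketch below is one possible check, not a standard: the "<type>/<scope>: <summary>" pattern is inferred from the example above, the list of allowed types is an assumption for illustration, and check_subject is a hypothetical helper name. The 50-character limit is the budget mentioned earlier.

```python
import re

# Assumed template: "<type>/<scope>: <summary>", e.g. "feat/cli: ...".
# The allowed types are hypothetical; pick whatever your team agrees on.
ALLOWED_TYPES = {"feat", "fix", "docs", "refactor", "test"}
MAX_SUBJECT_LENGTH = 50

SUBJECT_PATTERN = re.compile(
    r"^(?P<type>[a-z]+)/(?P<scope>[a-z0-9_-]+): (?P<summary>\S.*)$"
)


def check_subject(subject):
    """Return a list of problems found in a commit subject (empty if none)."""
    problems = []
    if len(subject) > MAX_SUBJECT_LENGTH:
        problems.append(
            f"subject is {len(subject)} characters, limit is {MAX_SUBJECT_LENGTH}"
        )
    match = SUBJECT_PATTERN.match(subject)
    if match is None:
        problems.append("subject does not match '<type>/<scope>: <summary>'")
    elif match.group("type") not in ALLOWED_TYPES:
        problems.append(f"unknown type '{match.group('type')}'")
    return problems


if __name__ == "__main__":
    for subject in (
        "feat/cli: create command shows a success message",
        "create shows a success message",
    ):
        print(subject, "->", check_subject(subject) or "ok")
```

The first subject passes and the second is flagged for not matching the template. A check like this could run from a commit-msg hook, so the template is enforced before the commit exists rather than in review.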

Now that we have small and descriptive commits, it's easier to spot a change that others may not understand and add a body to the commit message to explain it. And because commits are now meaningful, developers will be more likely to pay attention to them, both when reading and when writing them.

We have also unlocked the power of git blame. We can go back in time and know with pinpoint accuracy who made a change, instead of getting lost in large unhelpful commits. We can also understand that change, because there are descriptive messages.

To further streamline our git experience, we need to do one more thing: use rebase for pull requests. Rebasing means having a cleaner git log, with fewer interruptions due to merge commits, and a cleaner git blame for the same reason.

I've met skepticism about using rebase, the r-word of programming according to some engineers (documentation is the d-word, according to one certain manager, and at some point I'll write about that story too). Rebase rewrites the commit history, and the timestamps of reapplied commits can end up out of order. For example, reading the log from newest to oldest, the last commit can be at 11:00 and the one that follows it, instead of being at 10:00, is at 12:00.

Git knows better. By default it does not change the author timestamps of rebased commits, because it knows the importance of time; only the committer timestamp is updated. We can use this mechanism to identify rebases. By looking for commits with non-linear sequences of timestamps, i.e. moving back and forth instead of just going back, we can tell where rebases most likely happened.
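
The timestamp check can be automated. The sketch below is a heuristic under stated assumptions: it reads author timestamps with git log's %at format placeholder, it assumes a linear history listed newest first, and the function names read_author_times and find_time_jumps are made up for this example. Out-of-order timestamps can have other causes too, such as cherry-picks or a wrong clock, so treat the result as a hint.

```python
import subprocess


def read_author_times(repo_path="."):
    """Read (hash, author timestamp) pairs from git log, newest first."""
    output = subprocess.run(
        ["git", "-C", repo_path, "log", "--format=%H %at"],
        capture_output=True,
        text=True,
        check=True,
    ).stdout
    pairs = []
    for line in output.splitlines():
        commit_hash, timestamp = line.split()
        pairs.append((commit_hash, int(timestamp)))
    return pairs


def find_time_jumps(commits):
    """Return (newer, older) hash pairs where the older commit in the log
    has a later author timestamp than the commit listed just before it."""
    jumps = []
    for newer, older in zip(commits, commits[1:]):
        if older[1] > newer[1]:
            jumps.append((newer[0], older[0]))
    return jumps


if __name__ == "__main__":
    for newer, older in find_time_jumps(read_author_times()):
        print(f"{newer[:8]} was authored before its predecessor {older[:8]}")
```

With the example above, a log that reads 11:00, 12:00, 09:00 from newest to oldest yields one pair: the 11:00 commit and the 12:00 commit beneath it.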