Writing Code? You Need Git, and ChatGPT Can Help
Years ago, I was on a project where we wrote code that took in raw data sets, filtered and aggregated them in various ways, and then sent the results to an external partner. We frequently changed the code, and sometimes we’d be asked for the version that had generated a particular set of results.
But that was difficult to deliver because we weren’t using version control tools, which keep copies of code or other documents and let you see and compare previous versions. We might have saved an earlier version of the code that had generated a particular set of results, or we might not.
Not using version control tools leads to other problems as well:
If you change your code and break something, it’s difficult to go back to the version that worked – or even to compare different versions to see what the bug is or where it was introduced.
If multiple people need to collaborate on code, you then have to figure out how to manually merge the changes they made.
If you’re engaging in primitive types of version control, like naming documents version_1, version_2, version_final, etc., you might forget which is actually the most recent version, or someone else might not be able to tell from looking at your files.
There’s a tool you can incorporate into your workflow, and your organization’s workflow, that solves all of these problems. It’s called git. It’s free and open source, and it’s for everyone who writes code, regardless of what language they’re using or what they’re doing. If you have code, you need git.
Using version control is about tracking code changes, but it’s also about a culture of transparency and collaboration. The next time someone asks “how did you get that number?”, your answer should be: “Here’s the link to my code, along with an explanation of what it does and why I implemented it this way.”
How to Use Git Locally on Your Computer
First, you need to have git installed in whatever environment you’re using. Then, your process will look like the following:
When you start a new project, you’ll put the files related to that project in a folder, and you’ll initialize a “repository” in that folder.
Your repository will include all of the files you want to track each version of. These will include your code, as well as a “readme” file which explains your project in terms of its inputs (like the data you need to run it), its outputs (the things it produces, like a graph or report), and its general purpose.
Each time you make a major change or fix something, you’ll save your files and “commit” them, which records a snapshot of those changes in the repository.
Git keeps the whole history of those commits, so if you want to compare different versions or go back (“revert”) to a previous version, you can do that.
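The steps above can be sketched on the command line roughly as follows. This is a minimal illustration rather than the only way to do it: it assumes git is installed, it works in a throwaway temporary folder, and the folder name, file names (readme.md, analysis.py), commit messages, and the name and email are all made up for the example.

```shell
# Hypothetical example: every path, file name, and message here is illustrative.

# Work in a throwaway folder so nothing else on your machine is touched.
project="$(mktemp -d)/my_project"
mkdir -p "$project"
cd "$project"

# Initialize a repository in the project folder.
git init --quiet

# Tell git who is making commits (placeholder identity, for this repository only).
git config user.name "Example User"
git config user.email "user@example.com"

# Create a readme and a first version of the code, then commit them.
echo "Inputs: raw data. Outputs: summary report." > readme.md
echo 'print("version 1")' > analysis.py
git add readme.md analysis.py
git commit --quiet -m "Add readme and first version of analysis"

# Change the code and commit again.
echo 'print("version 2")' > analysis.py
git add analysis.py
git commit --quiet -m "Update analysis"

# Show the history, one line per commit.
git --no-pager log --oneline

# Compare the latest version of the code with the one before it.
git --no-pager diff HEAD~1 HEAD -- analysis.py

# Undo the latest commit by adding a new commit that reverses it.
git revert --no-edit HEAD > /dev/null
cat analysis.py
```

The final cat should show the first version of the file again. Note that git revert does not erase anything: it adds a new commit that undoes the latest one, so the full history stays available.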
You can interact with git using the command line, but you can also use the git GUI, or graphical user interface.
If instead you’re using a cloud environment like Databricks, it will handle version control somewhat differently, but the same basic functionality will still apply.
How to Use Your Git Server
Having individuals use version control is a good start, but for the big benefits, you also need a git server, which is a location that members of your organization use to share code with each other. You can use GitHub, including private repositories that only members of your organization can see, or you can buy a private git server, either from GitHub or from another service like GitLab.
The first part works the same: you have local repositories (or code that you’re using with your cloud version control tools). In addition, individuals “push” their local repositories to this server. They can then make those repositories visible to anyone within your organization, or restrict visibility as needed. When other people need to collaborate on the same code, they can make a copy of that repository, download it, make changes, and then push those changes back to the original, where the person who owns the repository can accept (or reject) them. On GitHub, that proposal to merge changes back is called a “pull request.”
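The push-and-collaborate cycle can be tried out without any server at all. In the sketch below, a "bare" repository in a temporary folder stands in for the git server. With GitHub or GitLab you would use the repository's URL instead, and the accept-or-reject step would normally happen on the service's review page rather than through a direct push. All names, paths, and messages are made up for illustration.

```shell
# Hypothetical example: a local folder stands in for the git server.
base="$(mktemp -d)"

# The "server": a bare repository, which stores history but has no working files.
git init --quiet --bare "$base/server.git"

# The owner creates a local repository and makes a first commit.
git init --quiet "$base/owner"
cd "$base/owner"
git config user.name "Owner"
git config user.email "owner@example.com"
echo "Project readme" > readme.md
git add readme.md
git commit --quiet -m "Add readme"

# The owner registers the server under the name "origin" and pushes to it.
# -u remembers the server branch so a plain "git pull" works later.
git remote add origin "$base/server.git"
git push --quiet -u origin HEAD

# A collaborator downloads ("clones") the repository from the server.
git clone --quiet "$base/server.git" "$base/collaborator"
cd "$base/collaborator"
git config user.name "Collaborator"
git config user.email "collaborator@example.com"

# The collaborator makes a change, commits it, and pushes it back.
echo "How to run: see analysis.py" >> readme.md
git commit --quiet -am "Explain how to run the project"
git push --quiet origin HEAD

# The owner pulls the collaborator's change into their local copy.
cd "$base/owner"
git pull --quiet
cat readme.md
```

After the pull, the owner's readme.md should contain the collaborator's added line, and git log in either folder shows who made which commit.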
A basic GitHub repo page lists the project’s files and displays the readme. The readme file describes the project and is a particularly useful resource, including for people in your organization who may need to understand where the numbers in a report came from but not interact with the code itself.
ChatGPT Lowers the Barriers to Getting Started
I’m a regular git user, but because I mostly use it on my own rather than for collaboration, there are things I rarely do. And when I have to do one of those things, it’s often a frustrating experience.
ChatGPT has been enormously helpful in these kinds of situations. For instance, last week I needed to copy some code from someone else’s repository, make some changes locally, and push it back and manage the changes. I am capable of doing this with help from Google, but it probably would have taken me 20 minutes to figure out, and it would have been annoying.
Instead, I asked ChatGPT for help, and it wrote out the 7-step process I needed, including the commands for each step. When I ran into a small issue, I described it, and ChatGPT was quickly able to troubleshoot with me. It took a process that would have been frustrating and made it go nearly as smoothly as the commands that I already know.
Why Is This a Good Use Case for ChatGPT?
There are a few reasons why git is a particularly good use case for ChatGPT:
There are a million and a half resources on the internet for how to use git, and ChatGPT has been trained on them, but they won’t necessarily have the 7 specific steps you need, and they won’t troubleshoot with you.
You can verify whether what ChatGPT gave you worked: run the commands and check the result, for example with git status or git log.
It’s difficult to cause a security issue. Just don’t paste your GitHub account password into ChatGPT!
If you’re coding and not using version control, adopting git is, in my opinion, the most critical improvement you can make to your workflow. If your organization isn’t using GitHub (or something like it) then adopting it is also the most important change your organization could make.
Writing documentation is also important, and it complements git. Documentation, such as project-level readme files and inline comments within your code, can also be stored and tracked in your git repositories — and ChatGPT can help you get started with writing good documentation as well.