Learning a New Codebase

These are notes on getting up to speed in an unfamiliar codebase. Lately at work, I’ve had to jump into a brand new codebase almost every day, and these are the tips and methods I’ve found for getting to know one as quickly as possible.

First: Vendoring & Dependencies

When opening a new codebase, one of the first things to find out is what it depends on. Find the codebase’s dependency (vendor) manifest and use it as an introduction to the libraries, packages, and external dependencies the codebase uses.

This could be:

  • package.json in a Node.js or frontend JavaScript codebase
  • build.gradle for a Java application built with Gradle
  • go.mod for Go
  • Gemfile for a Ruby application

Once you locate this file, look up some of the dependencies to see what they do. Then try installing all of the application’s dependencies (for example, with npm install or bundle install) and see if there are any errors.
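As a rough sketch of that first pass over a manifest, here is one way to flatten a parsed package.json into a single list you can read through. The helper name listDependencies and the example manifest are made up for illustration; only the dependencies and devDependencies fields come from the package.json format itself.

```javascript
// Hypothetical helper: flatten the dependency sections of a parsed package.json.
function listDependencies(manifest) {
  const sections = ['dependencies', 'devDependencies'];
  const result = [];
  for (const section of sections) {
    const entries = manifest[section] || {}; // a section may be missing entirely
    for (const [name, range] of Object.entries(entries)) {
      result.push({ name, range, section });
    }
  }
  return result;
}

// Made-up manifest for illustration. In a real project you would load it with:
//   JSON.parse(require('fs').readFileSync('package.json', 'utf8'))
const exampleManifest = {
  name: 'example-app',
  dependencies: { express: '^4.18.2' },
  devDependencies: { jest: '^29.7.0' },
};

for (const dep of listDependencies(exampleManifest)) {
  console.log(`${dep.section}: ${dep.name} ${dep.range}`);
}
```

Each name in the resulting list is something to look up before reading any application code.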

Locate the Entrypoint

Once the dependencies are resolved and installed, the next step is to get the application up and running. To do that, and to understand what happens when it starts, you need to find the application entrypoint: the first, highest-level file of the application.

Look for this pattern:

<command> <file>.<extension>

The command here could be the language’s command-line tool (node or python), and the extension is the one used for source files in that language (.js or .py).
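For a Node.js project, one place this pattern often shows up is the start script in package.json. The following is a minimal sketch, assuming the script has the simple two-part shape above. parseStartCommand and guessEntrypoint are hypothetical helpers, and real start scripts can be more involved (flags, wrappers, chained commands), in which case this returns nothing useful and you have to read the script by hand.

```javascript
// Split a start command such as "node src/server.js" into its two parts.
// Returns null when the script is not of the form <command> <file>.<extension>.
function parseStartCommand(script) {
  const match = /^(\S+)\s+(\S+\.\w+)$/.exec(script.trim());
  return match ? { command: match[1], file: match[2] } : null;
}

// Hypothetical helper: guess the entrypoint from a parsed package.json.
// It checks the "start" script first and falls back to the "main" field.
function guessEntrypoint(manifest) {
  const start = manifest.scripts && manifest.scripts.start;
  const parsed = start ? parseStartCommand(start) : null;
  if (parsed) return parsed.file;
  return manifest.main || null;
}

// Made-up manifest for illustration only.
const sampleManifest = {
  main: 'index.js',
  scripts: { start: 'node src/server.js' },
};

console.log(guessEntrypoint(sampleManifest)); // src/server.js
```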

Identify Routers

Most applications, frontend or backend, have some concept of routing. This could be pages or views in a frontend application, or a list of API routes in a backend codebase.

Either way, there is usually a file that lists the routes. It tends to sit one level below the application entrypoint, so it is still quite high-level. Be aware that an application can have more than one router, or none at all.
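To make it easier to recognize a router when you see one, here is a framework-free sketch of the idea: a table that maps a method and path to a handler. The routes, handlers, and the resolveRoute function are all invented for this example; real frameworks each have their own syntax, but the shape is usually similar.

```javascript
// A made-up route table: each entry maps a method and path to a handler.
const routes = [
  { method: 'GET', path: '/health', handler: () => ({ status: 200, body: 'ok' }) },
  { method: 'GET', path: '/users', handler: () => ({ status: 200, body: '[]' }) },
];

// Look up the matching route and run its handler, or return a 404 response.
function resolveRoute(method, path) {
  const route = routes.find((r) => r.method === method && r.path === path);
  return route ? route.handler() : { status: 404, body: 'not found' };
}

console.log(resolveRoute('GET', '/health').status); // 200
console.log(resolveRoute('POST', '/missing').status); // 404
```

Reading a table like this top to bottom gives you a map of what the application exposes.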

Fixing Warnings and Outdated Dependencies

Once you’ve understood the dependencies, the entrypoint, and the routing, a good way to get really familiar with the codebase is to try fixing some of the warnings. Most applications write something to stdout or to a log. Check that output for warnings. Look for WARN. Warnings don’t stop the application from working, and they don’t cause breakage. Yet.
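If the output is long, a small filter can pull the warnings out for you. This is a minimal sketch that assumes warnings are marked with the word WARN or Warning somewhere in the line. The findWarnings helper and the sample log lines are made up, and the real format depends on the logger your application uses.

```javascript
// Hypothetical helper: return the lines of a log that look like warnings,
// together with their 1-based line numbers.
function findWarnings(logText) {
  return logText
    .split('\n')
    .map((text, index) => ({ line: index + 1, text }))
    .filter((entry) => /\bwarn(ing)?\b/i.test(entry.text));
}

// Made-up log output for illustration only.
const sampleLog = [
  'INFO server listening on port 3000',
  'WARN config key "timeout" is deprecated',
  'INFO request GET /health 200',
  'Warning: cache directory not found, using default',
].join('\n');

for (const warning of findWarnings(sampleLog)) {
  console.log(`${warning.line}: ${warning.text}`);
}
```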

By fixing a warning you can get more familiar with the app, how it works, and maybe even start to do some debugging.

Try to fix a warning. This could be through updating a dependency, removing some dead code, or refactoring a function body. Anyhow, if you’ve removed a warning without breaking anything, you’ve done a great job!

If there are no warnings, you can also try updating an outdated dependency.

Note: fixing warnings and updating dependencies might truly be the best way to get well-acquainted with a codebase.

Read Tests

Fixing warnings and dependencies is a great way to understand your application on a mechanical level. Perhaps more important when jumping into a new codebase, however, is understanding the behavior of the application.

We use tests to assert and guarantee that our application can perform the tasks we write it to perform. Hopefully the codebase you’re jumping into has tests.

Usually tests are written in a way that summarizes the application behavior in a succinct, human-readable way. For example:

// Group related cases with describe; each it block checks one behavior.
describe('Application counter', () => {
  it('should increment counter', () => {
    app.increment();
    expect(app.count).toBe(1);
  });
});

By reading this test, you can tell quite easily that this application maintains the state of an integer called count, and the application has a function called increment.
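To show how much the test tells you, here is the smallest implementation I can infer from it. This is only a guess at the shape of app based on the two names the test uses (increment and count); the real application behind a test like this will almost certainly do more.

```javascript
// Hypothetical implementation inferred from the test; the real app may differ.
const app = {
  count: 0, // the integer state the test asserts on
  increment() {
    this.count += 1;
    return this.count;
  },
};

app.increment();
console.log(app.count); // 1
```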
