Can The Real Codeowners Please Stand Up? Code Provenance at Scale
Figuring out code ownership at a large company can be challenging. And identifying code owners during code-related incidents is hard – with an element of stress to boot. The Product Security team at Twilio set out to solve our code ownership challenges in a way that we think can help you as well.
Today, we’re proud to release two things that go a long way towards solving the problem:
- about.yaml - a new code ownership file specification that has all the information you need to trace any code’s current owner across your company
- Gordon - a GitHub app that monitors repositories to keep about.yaml files up to date.
Why do we need this?
More times than we’d like to admit, we have found a bug or vulnerability in a piece of code, run git blame to see who last touched that code, and found out that the person no longer works at Twilio – or is on PTO. Then the adventure starts: pick the next name in the git blame timeline and go down the rabbit hole to find the right owner to work on a fix.
That’s a lot of time wasted in a state of emergency, isn’t it? Every Security team out there has had to go through this situation at some point.
Now imagine a galaxy far, far away (spoiler alert: not far at all, actually) where the code ownership information and all your required metadata (e.g., owning team, Jira project, PagerDuty information) for every piece of code lives within that codebase and is machine-parsable.
The Product Security team at Twilio set out to see if they could make that a reality. And thus – the about.yaml and Gordon initiative.
about.yaml: The YAML file that knows it all
“about.yaml” is the file specification we came up with to solve our difficulties finding code owners. It’s designed to be included in all repositories company-wide and have all the information we need to track ownership.
But why that name? The file is essentially talking “about” the codebase and its ownership, hence our choice of “about.yaml” (yes – our humor is quirky!). The YAML specification is extendable and can be modified to fit the needs of your company. We at Twilio use multiple specifications to scale this file to our codebases while leaving it adaptable for when new teams and companies join us.
One about.yaml specification we think would serve as a great example of the power of the paradigm – and also can be widely adopted – is below:
This specification has the following fields:
- Version - the version of the file format, so that you can change the YAML specification later without breaking automation tied to a specific format.
- Organization - identifies which organization the file comes from during automation efforts – useful if a team joins your company, perhaps through an acquisition. For us, this might allow Twilio and any child organizations to maintain slightly different formats when needed.
- Jira_id - if you use Jira, the Jira project ID of the team owning the codebase.
- Pagerduty_id - the PagerDuty schedule ID of the team owning the codebase, which can be used to page the team when there’s an incident on a particular codebase.
If you introduce an about.yaml file like the above, you have a specification that gives you ownership information on repositories.
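To make the field list concrete, here is a minimal Python sketch of a file with those four fields and a check that none of them is missing. It is an illustration under assumptions, not the published specification: the lowercase key names, every value in the sample, and the helper names parse_flat_yaml and missing_fields are hypothetical, and the tiny parser only handles flat "key: value" lines (a real service would use a proper YAML library).

```python
# Hypothetical about.yaml content; every value below is a made-up placeholder.
SAMPLE_ABOUT_YAML = """\
version: 1
organization: example-org
jira_id: EXAMPLE
pagerduty_id: PXXXXXX
"""

# Field names taken from the list above, lowercased here as an assumption.
REQUIRED_FIELDS = ("version", "organization", "jira_id", "pagerduty_id")


def parse_flat_yaml(text):
    """Parse flat 'key: value' lines only; not a general YAML parser."""
    data = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        key, sep, value = line.partition(":")
        if not sep:
            raise ValueError(f"not a 'key: value' line: {raw_line!r}")
        data[key.strip().lower()] = value.strip()
    return data


def missing_fields(about):
    """Return the required fields that are absent or empty."""
    return [name for name in REQUIRED_FIELDS if not about.get(name)]


about = parse_flat_yaml(SAMPLE_ABOUT_YAML)
print(missing_fields(about))
```

Checking for presence is only the first step; it says nothing about whether the Jira or PagerDuty IDs point at anything real.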
But – there are still a few important questions. How do you get everyone to add this to their repositories? And even if you can convince people to do so – how do you know the data in these files is actually valid, and not gibberish?
As a Security team of developers who favor automated solutions to our problems, we introduced Gordon – an automated service that validates the contents of about.yaml files company-wide.
Gordon: an automated service to help determine code ownership
Gordon is a GitHub app that you can install on your GitHub organization. Gordon runs as a status check on every commit of a pull request: it gets the about.yaml file from the default branch and from the commit reference branch, and validates each piece of information mentioned in it. If it can validate all the data in the about.yaml file, the status check passes; otherwise it fails and users see a cross mark on their pull requests.
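The pass/fail decision described above can be sketched in a few lines of Python. This is not Gordon's actual code: the function names, the stub validators, and the rule that both copies of the file must validate are assumptions made for illustration (the post does not say how the two results are combined). The strings "success" and "failure" are two of the states GitHub accepts for a commit status.

```python
from typing import Callable, Dict, List, Optional

# A validator takes a field value and returns True when the value looks valid.
# In a real deployment these would query Jira, PagerDuty, and so on.
Validator = Callable[[str], bool]


def check_about(about: Optional[Dict[str, str]],
                validators: Dict[str, Validator]) -> List[str]:
    """Return a list of problems; an empty list means the file validated."""
    if about is None:
        return ["about.yaml is missing"]
    problems = []
    for field, is_valid in validators.items():
        value = about.get(field)
        if not value:
            problems.append(f"{field} is missing")
        elif not is_valid(value):
            problems.append(f"{field} has an invalid value: {value!r}")
    return problems


def status_for_commit(default_branch_about, commit_about, validators) -> str:
    """Assumed rule: both copies of the file must validate for the check to pass."""
    problems = (check_about(default_branch_about, validators)
                + check_about(commit_about, validators))
    return "success" if not problems else "failure"


# Stub validators standing in for real Jira and PagerDuty lookups.
validators = {
    "jira_id": lambda v: v in {"EXAMPLE"},
    "pagerduty_id": lambda v: v in {"PXXXXXX"},
}
good = {"jira_id": "EXAMPLE", "pagerduty_id": "PXXXXXX"}
bad = {"jira_id": "NOPE", "pagerduty_id": "PXXXXXX"}

print(status_for_commit(good, good, validators))
print(status_for_commit(good, bad, validators))
```

Keeping the validators in a dictionary keyed by field name means a new specification version can add a field without touching the decision logic.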
Here’s a pull request with a successful Gordon app status check due to passing validation of the contents of the about.yaml file:
And here’s a pull request with a failed Gordon app status check due to failed validation of the contents of the about.yaml file:
Why we needed Gordon
In 2017 GitHub released CODEOWNERS to help solve this problem. One limitation of CODEOWNERS is that it is tied to users – and they might be out of office or no longer with the company. We also did not find a place in it for metadata such as the Jira project ID, which we need to automate our code security vulnerability management.
We wanted a reliable, deploy-and-forget solution that constantly monitored the validity of the contents of the about.yaml file. It had to be a service we never needed to touch except when we decided to add a new version of the YAML specification for the organization, or if we decided to propagate this file to a new acquisition.
Dealing with shared ownership
One of the main challenges we faced while enforcing the Gordon status check across all of Twilio’s code bases was “shared code”.
We had (and have!) repositories which had code developed in collaboration with a lot of people and teams across the company. While collaboration is good, it is an issue when you need to quickly determine ownership in an emergency. We have been working to find a single owning team for every piece of code in the company, but as a workaround, we use shared_repos.json files where we list all of the “shared repos” on which Gordon checks are ignored.
The benefit of this approach is that you can roll out Gordon without having figured out how to deal with your own shared repositories – and it also gives you an inventory of shared repos you need to eventually handle.
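As a rough Python sketch of that workaround, the check can simply be skipped for any repository on the shared list. The layout of shared_repos.json is not described in this post, so the plain JSON list of repository names below is an assumption, as are the repository names and the helper names load_shared_repos and should_check.

```python
import json

# Hypothetical file content; a plain JSON list of repository names is assumed.
SHARED_REPOS_JSON = '["example-org/shared-tools", "example-org/monorepo"]'


def load_shared_repos(text):
    """Return the set of repository names on which checks are skipped."""
    repos = json.loads(text)
    if not isinstance(repos, list):
        raise ValueError("expected a JSON list of repository names")
    # Lowercase the names so lookups are case-insensitive.
    return {str(name).lower() for name in repos}


def should_check(repo_full_name, shared_repos):
    """True when the repository is not on the shared list and should be validated."""
    return repo_full_name.lower() not in shared_repos


shared = load_shared_repos(SHARED_REPOS_JSON)
print(should_check("example-org/payments", shared))
print(should_check("example-org/monorepo", shared))
```

Because the list lives in one file, it doubles as the inventory of shared repositories that still need a single owner.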
Try using Gordon and about.yaml
For the past two years, Gordon has proven to be very helpful in promoting the adoption of about.yaml files on repositories and has helped us determine code ownership across the organization. We’re excited to release it to the open source community – and can’t wait to hear about how you use Gordon in your organization.
To learn more about how to deploy Gordon and start using about.yaml files, see: https://github.com/twilio-labs/gordon
Laxman Eppalagudem is a Senior Product Security Engineer at Twilio focused on securing Twilio’s products before they go out to customers. He can be reached at seppalagudem [at] twilio.com