
Bazelcon 2022 Recap

Siggi Simonarson, Co-founder @ BuildBuddy

Earlier this month we wrapped up the 2022 Bazelcon conference in New York City. The conference featured talks by many prominent Bazel users including Lyft, Spotify, Tesla, Slack, Stripe, Tinder, Tecton, Coinbase, Uber, and more.

Here are some of the highlights from the conference:

  • We gave a talk previewing the upcoming BuildBuddy 3.0 release
  • Six companies that are BuildBuddy Enterprise customers presented on the main stage
  • We co-hosted a Bazelcon happy hour with Google Cloud that had over 200 attendees
  • We shared what we've been working on over the past year with rules_xcodeproj
  • We gave away 370 BuildBuddy shirts and 1600 BuildBuddy stickers
  • We finally got to meet so many of our incredible customers & open source contributors in person!

Tweets, talks, and pictures from the event below!

How We Use ClickHouse to Analyze Trends Across Millions of Builds

Lulu Zhang, Engineer @ BuildBuddy

When you use BuildBuddy with Bazel to build and test software, BuildBuddy captures information about each Bazel invocation, such as the number of builds, build duration, remote cache performance, and more. BuildBuddy has a Trends page to visualize trends in this data over time.

The Trends page lets you see how improvements you make to your builds affect your average build duration and other stats. It also exposes areas that might need improvement. For example, if your cache hit rate goes down over time, your build might have some non-deterministic build actions that could be fixed, or some newly introduced dependencies that cause more frequent cache invalidations.

When we first created the Trends page, we used MySQL queries to aggregate build stats and generate the data we wanted to display. This worked well for a time, but we soon ran into performance issues for customers with very large numbers of builds. Adding various indices temporarily improved performance by reducing the number of rows read, but it was not sufficient. Some customers run millions of builds monthly, and for them the Trends page (which can look back up to a year) was taking more than 20 minutes to load.

The queries behind the Trends page require aggregating multiple columns, such as cache hits and cache misses. A traditional row-based database like MySQL is not ideal for this use case: because data is stored row by row, aggregating a column requires more I/O seeks than in a column-based database, which stores each column's data in contiguous blocks. Column-based databases also achieve higher compression ratios, because consecutive values in a column share the same type and often repeat.

With a row-based store, computing a sum of cache hit counts means loading every block that contains a row, even though most of each row is irrelevant to the query. With a column-based store, all of the cache hit values are stored together in the same block.
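To make the access pattern concrete, here is a sketch of the kind of aggregation a Trends-style page runs. The table and column names are illustrative, not BuildBuddy's actual schema:

```sql
-- Hypothetical Trends-page aggregation: daily build counts,
-- average duration, and cache hit rate over the last year.
SELECT
    toDate(created_at) AS day,
    count()            AS builds,
    avg(duration_usec) AS avg_duration_usec,
    sum(cache_hits) / (sum(cache_hits) + sum(cache_misses)) AS cache_hit_rate
FROM invocations
WHERE created_at >= now() - INTERVAL 1 YEAR
GROUP BY day
ORDER BY day;
```

A row store must read every block containing a matching row to answer this, while a column store touches only the handful of columns the query actually references.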

We therefore expected that ClickHouse, a column-based database, would improve the performance of the Trends page queries. We validated ClickHouse's performance against our use case: processing 1.5 million rows and calculating the stats took ClickHouse 0.317 seconds. The same query took MySQL about 24 minutes.

One of our goals for the data migration was to make sure the data stayed accurate. We added monitoring and compared data between MySQL and ClickHouse after we enabled double writing in production. One source of inconsistency was that data was inserted into ClickHouse both by the backfill script and by the production servers. Unlike a traditional database, ClickHouse's ReplacingMergeTree engine only deduplicates data in the background at an indeterminate time. As a result, we had to manually run the OPTIMIZE operation to force ClickHouse to deduplicate data after the backfill was done. Once we were confident in the data's consistency, we finally enabled the Trends page to read from ClickHouse.
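A minimal sketch of what this looks like in ClickHouse (the table and columns here are hypothetical stand-ins, not our production schema):

```sql
-- ReplacingMergeTree collapses rows that share the same ORDER BY key,
-- but only during background merges, at an indeterminate time.
CREATE TABLE invocations
(
    invocation_id String,
    created_at    DateTime,
    cache_hits    UInt64,
    cache_misses  UInt64
)
ENGINE = ReplacingMergeTree
ORDER BY invocation_id;

-- Force an immediate merge so that duplicate rows written by both the
-- backfill script and the double-writing servers are collapsed now.
OPTIMIZE TABLE invocations FINAL;
```

Until that merge runs, queries can observe both copies of a row, which is why the comparison against MySQL initially reported inconsistencies.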

What's next

We are excited about the possibilities ClickHouse unlocks for providing analytical insights into builds, targets, tests, and remote execution. For example, we want to add graphs that show where remote actions spend most of their time. These insights can help guide remote execution performance optimizations.

We would love to hear your feedback about what stats and graphs you are interested in seeing. Join our Slack channel or email us at hello@buildbuddy.io with any questions, comments, or thoughts.

Welcoming Iain Macdonald

Siggi Simonarson, Co-founder @ BuildBuddy

To fulfill our mission of bringing the world's best developer tools to every company, we're intensely focused on hiring outstanding Software Engineers. That's why we're excited to share today that Iain Macdonald is joining BuildBuddy's engineering team!

Iain joins us from Google, where he spent over 10 years as an engineer working across the company from Gmail to Google Maps.

We look forward to working alongside Iain to build the future of developer tools.

Welcome to BuildBuddy, Iain!

Welcoming Maggie Lou

Siggi Simonarson, Co-founder @ BuildBuddy

To fulfill our mission of bringing the world's best developer tools to every company, we're expanding our team to keep up with our ever-growing customer base. That's why we're excited to share today that Maggie Lou is joining BuildBuddy's engineering team!

Maggie joins us from Thumbtack.

We look forward to working alongside Maggie to build the future of developer tools.

Welcome to BuildBuddy, Maggie!

Bazel Remote Cache Debugging

Brandon Duffany, Engineer @ BuildBuddy

Using a remote cache is a great way to speed up your Bazel builds! But by default, Bazel uploads almost everything to the remote cache.

If your network is slow and your build artifacts are very large (like a Docker image), this can lead to poor performance.

To address this, and make it easier to fix, we built the new cache requests card.

In this post we'll explore what insights this card can give you into your builds, as well as some fun details about how the card works under the hood.

Distributed Scheduling for Faster Builds

Tyler Williams, Co-founder @ BuildBuddy

Let's start with "what's BuildBuddy" for the kids in the back. In short, we provide a UI, distributed cache, and remote execution platform for your Bazel builds. That means we securely compile your code, cache the artifacts, and help you visualize the results. We make it possible to build projects like TensorFlow from your laptop in under 5 minutes instead of 90 minutes.

Obviously to do all this, we have to handle some thorny engineering challenges, one of which is scheduling remote executions. For that, we have a scheduler. The scheduler just matches actions (basically jobs) received by our API to remote workers that actually do the work. If you think of a full build of something like Tensorflow as a 10 course meal, a single action is like a recipe for a tiny part of that meal. To make it easier to visualize, here's a real action from building BuildBuddy:

Bazel's Remote Caching and Remote Execution Explained

Brentley Jones, Developer Evangelist @ BuildBuddy

Bazel's famous remote caching and remote execution capabilities can be a game changer, but if you're not familiar with how they work, they can be a bit of a mystery.

Well, don't worry. I'm here to go over the fundamentals of remote caching and remote execution, with a nuts-and-bolts (or rather actions-and-spawns 😄) overview of Bazel's remote capabilities.

How Bazel 5.0 Makes Your Builds Faster

Brentley Jones, Developer Evangelist @ BuildBuddy

In our last post, we summarized the changes that were in the Bazel 5.0 release. There were a lot of changes though, so it can be hard to determine which ones are impactful to you and why.

Don't worry, we've got your back. In this post we highlight the changes that help BuildBuddy users build even faster!