Tales From Hack Day: How Braze Senior Site Reliability Engineers Brian Bernstein and Matt DiSipio Went “Off the Rails”

Published on October 05, 2022/Last edited on October 05, 2022/5 min read

AUTHOR
Brian Bernstein and Matt DiSipio

Three times a year, employees from around Braze take two days away from their normal duties to participate in Hack Days. These events—a long-running Braze practice that reflects how the company creates space for dreaming up and implementing new ideas—provide a chance to encourage innovative thinking, highlight pet interests, and even optimize the Braze platform in ways big and small.

To recognize the work that goes into each Hack Day, Building Braze will be profiling participants with particularly memorable projects or experiences. This week, we’re talking to Brian Bernstein and Matt DiSipio, Senior Site Reliability Engineers here at Braze*.

Exploring New Challenges

We’ve each done several Hack Days over the last few years—Matt participated in some of the very first ones going back to 2019. We’ve both done solo Hack Day projects over the years, but we agree that it’s better to be on a team.

Hack Days are a great way to collaborate with people you don’t normally get to work with, and to dig into some new challenges. In some cases, you might want to solve a specific problem you’ve identified, but it’s totally fine to just come in to help out and have a learning experience.

A Benchmark for Performance

Our Hack Day project, which we called “Going Off the Rails,” started as a “What if we tried this?” solution to what Brian saw as a potential barrier to scaling. We partnered with Elliot Foster, a Staff Software Engineer at Braze, to see if we could improve performance by making our API endpoints asynchronous using the Rust programming language instead of Ruby on Rails.

The idea was to set a benchmark with our current setup, run the same load against the new version, and then compare the results. We wanted to see how the asynchronous model would perform compared to our synchronous model. This was an opportunity to possibly make a huge change that would impact the whole company by helping our systems work better.

Brian had experience with Rust, but Matt and Elliot saw Hack Day as an opportunity to get more familiar with it as they built out this project. Matt set up the test environment and Brian did the coding in Rust, giving Matt an opportunity to learn the language a bit and try some coding himself. This is actually what drew Matt into this project: he wanted more exposure to Rust.

We saw some scaling issues with our front end. Because Rails is synchronous, one worker is tied to a single request at a time. When that worker blocks waiting for a DB connection or similar, it has to pause processing. With asynchronous endpoints, a worker can continue processing another request when a previous request gets blocked. That was the main impetus for us re-implementing this endpoint in Rust: to see if an asynchronous front end can scale better because of this difference in behavior.
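To make that difference concrete, here is a minimal sketch of the two worker models. It is written in Python with the standard library's asyncio purely for illustration, not in Rust or Rails, and it is not the code from our Hack Day project. The handler names, the 50 ms wait, and the request count are all hypothetical stand-ins for a request that blocks on a DB connection.

```python
import asyncio
import time

BLOCK_SECONDS = 0.05  # hypothetical wait for a DB connection or similar
NUM_REQUESTS = 8      # hypothetical number of incoming requests


def handle_sync(request_id: int) -> int:
    """Synchronous handler: the worker is stuck until the wait finishes."""
    time.sleep(BLOCK_SECONDS)
    return request_id


async def handle_async(request_id: int) -> int:
    """Asynchronous handler: while this request waits, the worker can serve others."""
    await asyncio.sleep(BLOCK_SECONDS)
    return request_id


def run_sync_worker(num_requests: int) -> float:
    """One synchronous worker handles requests strictly one after another."""
    start = time.perf_counter()
    for request_id in range(num_requests):
        handle_sync(request_id)
    return time.perf_counter() - start


async def serve_all(num_requests: int) -> list[int]:
    """One asynchronous worker interleaves all requests on a single event loop."""
    return await asyncio.gather(*(handle_async(i) for i in range(num_requests)))


def run_async_worker(num_requests: int) -> float:
    """Time how long the single asynchronous worker takes to finish every request."""
    start = time.perf_counter()
    asyncio.run(serve_all(num_requests))
    return time.perf_counter() - start


if __name__ == "__main__":
    print(f"sync worker:  {run_sync_worker(NUM_REQUESTS):.3f}s")
    print(f"async worker: {run_async_worker(NUM_REQUESTS):.3f}s")
```

In this toy setup the synchronous worker's total time grows with the number of requests, while the asynchronous worker's waits overlap. Real endpoints do CPU work as well as waiting, so a sketch like this only shows the scheduling difference and says nothing about actual production numbers.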

Clear Goals and Good Time Management Skills

We’ve done enough Hack Days to have learned a thing or two about getting the most out of the experience. The first, and largest, challenge is the time crunch. It’s a short period of time to create something when you factor in preparing for the presentation. You have to be very careful with your scope to be sure you have a project that you can finish. A lot of people now do a fair amount of pre-work on their Hack Day projects, which is great if you have something you believe in and you want to make a splash.

The second challenge is scope creep and going down rabbit holes. It’s easy to get distracted when you’re just ‘exploring’. From the very beginning it was important to have clear goals regarding what we wanted to do. We thought a good approach was to find a simple use case and set up an A/B comparison of performance for the two versions. This is the approach we used for “Going Off the Rails,” and from the standpoint of being able to compare performance, it was a success, driving significantly stronger performance and giving us the ability to handle high concurrency without errors.

Your own personal success benchmark depends on what you want to get out of Hack Day. It could just be learning a new technology, or it could be demonstrating that a new technology might be a better fit or perform better than what we’re using at Braze. For others, it’s not necessarily about having a finished product, but about taking advantage of the experimental context to just go for it.

Interested in getting involved in our Hack Days? Braze is hiring for a variety of roles across our Engineering, Product Management, and User Experience teams. Check out our careers page to learn more about our open roles. For more on asynchronous workloads, see our article on creating integration tests.

*All content in the blog post is of a general nature and for example only. It does not refer to any specific problems, features, or products that Braze intends to make available to customers, either now, or in the future. Nothing in the article should be considered as advice, nor does any information on the post constitute a partial, nor complete roadmap of future technologic features or function of the Braze platform.
