Datto Engineering Blog

Proactively Monitoring TLS Certificate Deployment

Unless you’re using the ACME protocol with a certificate authority such as Let’s Encrypt, you’re probably well aware of the annoyance of certificate rotation. Here at Datto, we use certificates in many places, with a validity period of around a year depending on the certificate authority. Last February, we noticed that several production hosts were serving expired certificates for one of our major Internet-facing domains, a mistake that many other companies make as well. This caused several problems, and once the issues were addressed, we decided to take a proactive stance in monitoring certificates for all of our TLS-enabled services. I won’t dive into why the certificates weren’t properly rotated, but rather into what we’re doing from now on so that this sort of issue doesn’t occur again.

How I stumbled upon CVE-2021-21702 in PHP’s SOAP extension

Over the past year or so, I’ve been focused on fuzzing research and the different areas where I could apply the techniques and tools I’ve come across or created. During this time, I decided to take a break from fuzzing, mainly because I was feeling burnt out, and went back to web pentesting. While looking for some classes of web vulnerabilities, I focused heavily on XXE (XML External Entity) injection as an attack vector. To understand how PHP 7 mitigates this class of vulnerability, I looked at how the SoapClient class parses XML data returned from a SOAP server. After some trial and error, I identified a null pointer dereference bug in the PHP SOAP extension that resulted in CVE-2021-21702.

Engineer to Manager, There and Back Again

"What do you want?" A question I had to find the answer to in order to preserve my sanity and my career.

Troubleshooting TLS Session Re-Use and Mutual Authentication in HAProxy

We take data protection seriously at Datto, which is why we’ve been increasingly using mutual TLS authentication to secure communications between components in our application stack. Our use of HashiCorp Vault has accelerated this security pattern, as Vault makes it easy to deploy and manage multiple CAs. Recently, we saw an increase in TLS-related errors for one of our mutually-authenticated application endpoints. In this article, I’ll walk you through how we debugged and resolved this problem. I’ll also take you on a deep dive into reproducing this issue, and I’ll hopefully teach you some fun OpenSSL commands along the way.

Integrating Okta authentication into Nginx reverse proxies

Recently, the Datto SaaS Protection SRE team was challenged to add authentication to an open source web application that lacked a strong authentication story. We knew that we didn’t want to write an entire authentication layer just for this one application, as the return on that time investment would be rather low. Instead, we looked for a solution that would be easy to implement, easy to automate, and easy to understand months down the road, long after the shine had worn off and it had become just another application we managed.

Predicting Hard Drive Failure with Machine Learning

We’ve all had a hard drive fail on us, and often it’s as sudden as booting your machine and realizing you can’t access a bunch of your files. It’s not a fun experience. It’s especially not fun when you have an entire data center full of drives that are all important to keeping your business running. What if we could predict when one of those drives would fail, and get ahead of it by preemptively replacing the hardware before the data is lost? This is where the history of predictive drive failure at Datto begins.

Automated OS Qualification with Ansible

Upgrading thousands of servers is challenging and filled with uncertainty. This article describes how we leveraged Ansible to build automation that increases confidence in our upgrade process.

Reliably rebooting Ubuntu using watchdogs

Rebooting Ubuntu is hard. I don’t really know why, but in my twelve years as an Ubuntu user, I’ve encountered countless “stuck at reboot” scenarios. Somehow, typing reboot always comes with that extra special feeling of uncertainty and the thrill of danger. This post describes the short story of how we managed to make Ubuntu machines reliably reboot using Linux watchdogs.

How Datto manages trust within a fleet of devices

Learn how Datto manages the rollout of trusted root certificates to a fleet of hundreds of thousands of devices without causing a single failed backup!

Unifying Our User Interfaces

Datto has a lot of great products in an arsenal that only keeps growing as we acquire more businesses. Because of this, integration is very important, and a key part of that is user experience. To deliver a consistent experience, we knew we had to streamline our user interfaces. As product designers, we are advocates for the user and have a proclivity for good aesthetics. So for us, this was not only a chance to stretch our visual design muscles but also to improve usability across our products.