Datto Engineering Blog

Predicting Hard Drive Failure with Machine Learning

We’ve all had a hard drive fail on us, and often it’s as sudden as booting your machine and realizing you can’t access a bunch of your files. It’s not a fun experience. It’s especially not fun when you have an entire data center full of drives that are all important to keeping your business running. What if we could predict when one of those drives would fail, and get ahead of it by preemptively replacing the hardware before the data is lost? This is where the history of predictive drive failure at Datto begins.

Automated OS Qualification with Ansible

Upgrading thousands of servers is challenging and filled with uncertainty. This article describes how we leveraged Ansible to build automation that increases confidence in our upgrade process.

Reliably rebooting Ubuntu using watchdogs

Rebooting Ubuntu is hard. I don’t really know why, but in my twelve years as an Ubuntu user, I’ve encountered countless “stuck at reboot” scenarios. Somehow, typing reboot always comes with that extra special feeling of uncertainty and the thrill of danger. This post describes the short story of how we managed to make Ubuntu machines reliably reboot using Linux watchdogs.

How Datto manages trust within a fleet of devices

Learn how Datto manages the rollout of trusted root certificates to a fleet of hundreds of thousands of devices without causing a single failed backup!

Unifying Our User Interfaces

Datto has a lot of great products in an arsenal that only keeps growing as we acquire more businesses. Because of this, integration is very important, and a key part of that is user experience. To accomplish this, we knew we had to streamline our user interfaces. As product designers, we are advocates for the user and have a proclivity for good aesthetics. So for us, this was a time not only to stretch our visual design muscles but also to improve usability across our products.

ROP Chaining on ARM for Research Purposes

Tutorial on how to construct ROP chains from difficult ROP gadgets in ARM assembly.

Building OpenTSDB for 10,000 hosts

Over the past 2 years, Datto has been working on how we could collect consistent data from our entire fleet of hosts. In this post, we'll discuss how we leveraged OpenTSDB to collect nearly 1 million metrics a second across our infrastructure.

A few police-action stories from working on the Datto Linux backup agent

A few years ago, Datto released its very own Linux backup agent. In this post, we reminisce about lessons learned from developing and supporting the Datto Linux Agent (DLA) shortly after its production release.

Wi-Fi Roaming: how it works, how to successfully deploy it and how to troubleshoot it

Understanding how Wi-Fi roaming works is key to a successful Wi-Fi deployment. This post focuses on the most difficult Wi-Fi environment, the enterprise business, and explains how to fix problems and why things sometimes don't work.

Automating Vault and Consul Template Management

Configuring and managing Vault isn't too difficult, but integrating it with our existing configuration management tools posed a unique challenge. We wanted to ensure that we could continue using Puppet, our config management tool of choice, to automatically handle the day-to-day operations of our new Vault deployment.