Risks of Ad Hoc Database Scripts in Engineering

Writing ad hoc database scripts might seem faster for modifying records than the more structured production code (like endpoints, cron jobs, backfills), but are we really convinced they are reliable? These scripts come with a host of drawbacks that can’t be ignored. It’s telling that mature organizations resort to using DB scripts only in the heat of active incident mitigation and resolution, having learned the hard way about the risks they pose, often as a lesson learned from previous incidents.

Never “one-time”

In my experience, db scripts always have a way of being reused and the next engineer inherits all of the below risks again. Instead with code, there is more structure and testing.

No application logs

An on-call engineer responding to an incident will first look at the logs for the effected entity and will find nothing. Junior engineers will be stumped while Senior engineers may think to look at Change Management tickets for potential outside sources for changes.

When updating records with a WHERE clause, the records impacted are even more difficult to identify afterwards. While the database table’s createdAt, createdBy, updatedAt, updatedBy, etc. columns may help, the updated* columns will get overwritten and this is not the first place that is searched.

No automated testing

How can reviewers verify that the database was set up to handle all use cases? How can the reviewers verify the output of the script? After pull request comments, does the author manually rerun all the test cases? If the script is reused, how will the next engineer replicate all of the testing? Did the author look at just the “expected” impact, or did he check all other tables too?

Bypasses service and DAO

Systems codify logic and checks into multiple layers. Services also have rate-limiting, locking to avoid database contention, and monitoring. DAOs know the acceptable Enum fields to insert. db scripts bypasss all safeguards.

Difficult to code review

Missing a statement in a WHERE clause can increase the scope of the change. DELETE records can easily CASCADE to other tables.

Visually, if a DB script inserting fewer than 100 records is difficult to review, anything beyond that will receive a no-look LGTM.

Gives Human Database Write Access

Humans make mistakes. They can miscopy a DB script. They can run the wrong script. They can miss a step in the change management ticket. Giving humans the least access is best.