Today was a nightmare. I was up super early (right at 5AM – thanks bladder), but made a tactical error (insofar as sleep goes). I looked at my phone. I had a notification on Mastodon telling me that my personal blog story was unreachable with a 404 error. OH NO! I was instantly jerked awake, and sprang into action (at 5AM). I discovered that quite a lot of stuff was down. I decided to write a story about what happened (given that’s what I do), so if you want to read my horror story, check this out:
My web server has a total of nine domains on it. Of the nine, six of them were down. I’ve had this happen before, and usually rebooting the server will fix it. So I did that, and it didn’t fix it. Uh-oh. I dug into it deeper and discovered that two major problems had occurred:
- My server’s hard drive filled up
- The MySQL server crashed
Either of these would prevent websites that use WordPress from working, as they require the MySQL database behind them to be running, and with the drive full and databases down, nothing would work. Several of my sites were down. No worries, I have backups for that.
There are two kinds of backups made on my server daily. The first is a full server backup: I get a tarball of each of the domains, the theory being I could restore the entire server from that. In fact, I’ve used these full server backups in prior server issues. It worked for the other domain names today – but not the Rangers site. I keep a week’s worth of this type of backup. Problem is the three most recent backups (including today’s) were corrupt; they didn’t have a backup of the Rangers site (again, due to the drive filling up). That left four backups going back to a week ago. I restored all of them. NONE of them had the database.
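A check like the one below could flag a backup that is corrupt or missing its database before you actually need it. This is only a minimal sketch: it assumes the database dump sits inside the tarball as a file ending in `.sql` or `.sql.gz`, which is a guess at the layout and not something confirmed about this server's backup software. The function name and suffix list are made up for illustration.

```python
import tarfile
import zlib

# Hypothetical naming convention: database dumps end in one of these suffixes.
DUMP_SUFFIXES = (".sql", ".sql.gz")


def tarball_has_database(path):
    """Return True if the archive is readable and holds a non-empty SQL dump."""
    try:
        # "r:*" lets tarfile detect the compression (none, gzip, bz2, xz).
        with tarfile.open(path, "r:*") as archive:
            for member in archive:
                is_dump = member.name.endswith(DUMP_SUFFIXES)
                if member.isfile() and is_dump and member.size > 0:
                    return True
    except (tarfile.TarError, EOFError, OSError, zlib.error):
        # A truncated or corrupt archive (for example, one written
        # while the drive was full) ends up here.
        return False
    return False
```

Running `tarball_has_database()` over each night's tarball right after it is written, and alerting on any `False`, would turn a silent bad backup into something you hear about the same day.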
It sent me into a MAJOR panic, as without the database, my site was gone. With no database, all the files on the server were pointless as this is all driven from WordPress. This Rangers site has been around since 1998, and while I don’t do the daily updates anymore, there’s a lot of archives here, and a few bits I still maintain. I’m very proud of the uniform number coverage, so the very real possibility this morning of having lost all that work was seriously distressing to me.
So OK, that brings me to the second kind of backup. All my sites that use WordPress (of which this is one) do a nightly backup via a software plugin called BackupBuddy. The problem is that BackupBuddy requires you to have database access to use it, and without that, I couldn’t use that backup, so I thought I was screwed since the other kind wasn’t working. During this part of the recovery, I discovered my root problem: the system was doing a SQL dump when the drive filled up, and it appears that the drive-full situation while accessing SQL caused a mess which meant that most of my domain names that used databases lost their access to the data – even if it was still there. I had to reset the passwords on everything, and then it let me clean up the right way.
Fortunately this cleanup allowed me to reinstall my Rangers site from scratch. I had to pretend WordPress was a virgin install – it basically reset the database. Given my old WordPress install was actually still THERE, just unreferenced because of the new database, I was then able to restore from one of the existing BackupBuddy backups. It took some time to restore everything (as the tarball for my Rangers site is about 3.18 GB). But EVENTUALLY I got everything going.
I didn’t declare everything restored until about 2:20PM on Monday afternoon, and I first discovered the problem right at 5AM.
So now what? I cleaned up all the mess, and the drive is back to about 50% usage, which is where it normally lives with the week’s worth of backups there, but I need to make sure this doesn’t happen again. For the full server backups, I will be going to remote backups. I used to, but Dropbox decided some time ago to do away with their command line backup option. I discovered today in fixing everything that the server maintenance software I use has an option to back up to Google Drive, something I will be signing up for. It will cost me $75 a year to do that, but I think the cost is worth it for peace of mind.
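Remote backups aside, a small disk-usage check run from cron is one way to get a warning before the drive fills. The sketch below uses Python's standard `shutil.disk_usage`; the 80% alert level and the function names are assumptions picked for illustration (normal usage here is about 50%), not a setting from any particular server tool.

```python
import shutil

# Hypothetical alert level; the post says normal usage sits around 50%.
ALERT_PERCENT = 80.0


def disk_usage_percent(path="/"):
    """Return how full the filesystem holding `path` is, as a percentage."""
    usage = shutil.disk_usage(path)
    return usage.used / usage.total * 100


def is_over_threshold(percent, threshold=ALERT_PERCENT):
    """Return True when usage has reached the alert level."""
    return percent >= threshold


if __name__ == "__main__":
    percent = disk_usage_percent("/")
    if is_over_threshold(percent):
        print(f"WARNING: disk is {percent:.1f}% full")
    else:
        print(f"OK: disk is {percent:.1f}% full")
```

The printed warning is a stand-in; in practice you would swap it for an email or whatever notification you actually read at 5AM.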
Bottom line is that everything is working here again. It was all due to me losing database access because the drive filled up. What a mess. Serious stress bomb this morning too. Normally I take my blood pressure every morning at home, but my wife and I figured this was a good day to skip that, as there was no way I was going to get a good reading. I will tonight now that everything’s running again.
The moral of the story: don’t let a database-heavy web server fill up. All kinds of chaos ensues when it happens.