I don’t think I have enough traffic yet for anyone to have noticed, but I had some downtime last night. Completely self-inflicted, of course. I’m not sure exactly which things were going wrong, and that is exactly the problem: I was less careful than I should have been. Here’s the story:
I’ve been working on my site lately, adding new stuff to it: a few plugins that look like they might be more useful than problematic. Then I realized my site didn’t have https yet. Not that I previously thought it did; I just hadn’t fully realized the implications until a few days ago.
It turns out that making a wordpress site use https correctly is a pretty common stumbling block. There are a couple of plugins that are supposed to help, but even then, lots of people still wind up with redirect problems, or with mixed content being served (i.e., things referenced in the page still using http even when the page is requested with https, causing browser warnings even if the certificate is properly signed).
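To make the mixed content idea concrete, here is a minimal sketch in Python that scans a page’s HTML for sub-resources still referenced over plain http. The class and function names are my own invention, the tag list is deliberately short, and it treats every link tag as a fetched resource, which is a simplification. It also only looks at the HTML itself, not at urls inside stylesheets or scripts.

```python
from html.parser import HTMLParser


class MixedContentFinder(HTMLParser):
    """Collect sub-resource URLs that still use plain http."""

    # (tag, attribute) pairs that make the browser fetch something as part
    # of the page. Ordinary <a href> links are navigation, so they are skipped.
    # Simplification: every <link href> is counted, even non-stylesheet ones.
    RESOURCE_ATTRS = {
        ("img", "src"),
        ("script", "src"),
        ("link", "href"),
        ("iframe", "src"),
    }

    def __init__(self):
        super().__init__()
        self.insecure = []

    def handle_starttag(self, tag, attrs):
        for name, value in attrs:
            if (tag, name) not in self.RESOURCE_ATTRS or not value:
                continue
            if value.lower().startswith("http://"):
                self.insecure.append((tag, value))


def find_mixed_content(html):
    """Return a list of (tag, url) pairs for insecure sub-resources."""
    finder = MixedContentFinder()
    finder.feed(html)
    return finder.insecure
```

Feeding it the HTML of a page that was requested over https gives a list of the references that would need to change. An empty list only means this simple check found nothing.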
I made myself a self-signed certificate to use for testing. I didn’t wanna use yet another plugin when I already feel like I might be pushing too many, so I thought I’d see if I could do it with just wordpress settings. But once I changed the url settings, boom! Redirect loops. Firefox detected that the site was redirecting in a way that would never complete.
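I never pinned down which rules were fighting each other, so the redirect table below is entirely made up. It is just a small Python sketch of what a loop like that looks like: one rule sends http to https, another sends https back to http, and following them never ends. The function name and the example.com urls are hypothetical.

```python
def follow_redirects(start, redirects, max_hops=20):
    """Follow redirects from start and return (chain, looped).

    redirects maps a url to the url it redirects to. A url that is not
    in the table is treated as a page that answers directly.
    """
    chain = [start]
    seen = {start}
    url = start
    for _ in range(max_hops):
        if url not in redirects:
            return chain, False  # reached a page that does not redirect
        url = redirects[url]
        chain.append(url)
        if url in seen:
            return chain, True  # back at a url we already visited
        seen.add(url)
    # Too many hops without finishing; treat it as never completing.
    return chain, True


# Hypothetical conflicting rules, for illustration only.
conflicting = {
    "http://example.com/": "https://example.com/",
    "https://example.com/": "http://example.com/",
}
chain, looped = follow_redirects("http://example.com/", conflicting)
```

A browser does roughly the same bookkeeping, which is how it can give up and report the loop instead of redirecting forever.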
I tried to find where in the files or databases the url settings are stored and failed (one of these days I really need to learn how a wordpress database is actually organized), and I checked all the .htaccess files for redirects. Then I decided to try renaming all the .htaccess files so that they weren’t named .htaccess anymore. I even removed the certificate too. After that it was almost, sorta working. It looked like a webpage that was missing its stylesheets, or at least what I think a webpage looks like when it’s missing its stylesheets. The links were all there, but everything was in one column and there were no pictures. There was a button though, which seemed a little odd, so I’m probably not understanding exactly what it was doing. But here’s the tricky part:
The login page was still using the https url. So even though the page was sort of loading, I couldn’t log in to try disabling plugins and stuff. My best guess at the time was that the https url was stored somewhere in the database. I also thought something might be caching it, since I’ve recently enabled a lot of caching settings, but I’d already restarted everything that should be storing the caches and removed all the files from the nginx cache. (I’m not using disk caching, but more on that topic in another post. Technically I think nginx’s cache is currently on a disk. I intend to make a tmpfs for it soon so that it will be in ram too. First I want to make sure I understand how the virtualization affects mounting and whether /tmp is already special; if I decide it probably won’t hurt anything I’ll probably mount all of /tmp in ram.) I was able to grep the table files for https and narrow it down to one of three tables. And I know how to log into mysql, use a database, describe a table, and select from a table where things are like things, which sounds like it should be enough. But I didn’t know quite what I was even looking for, so I’d either find no results or way too many results to see what was going on. This is part of why I mentioned wanting to learn more about how wordpress organizes its database, so that it’s easier to find things. It’s like using awk on exim logs: very powerful, but most useful when you know all three of the command syntax, the structure of what you’re looking in, and what you’re actually looking for.
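For reference, a default wordpress install keeps its two address settings in the options table (wp_options with the default table prefix) under the option names siteurl and home. Here is a minimal sketch of a lookup narrowed to just those two rows. It is only an illustration: it uses Python’s built-in sqlite3 with a made-up in-memory table as a stand-in for the real MySQL database, and the helper names and example.com values are hypothetical.

```python
import sqlite3


def build_demo_db():
    """Create an in-memory table shaped like the parts of wp_options used here."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE wp_options ("
        "option_id INTEGER PRIMARY KEY, "
        "option_name TEXT, "
        "option_value TEXT)"
    )
    # Made-up rows for illustration; example.com is a placeholder domain.
    conn.executemany(
        "INSERT INTO wp_options (option_name, option_value) VALUES (?, ?)",
        [
            ("siteurl", "https://example.com"),
            ("home", "https://example.com"),
            ("blogname", "Example Blog"),
        ],
    )
    return conn


def find_url_settings(conn):
    """Return the two address settings instead of everything that mentions https."""
    rows = conn.execute(
        "SELECT option_name, option_value FROM wp_options "
        "WHERE option_name IN ('siteurl', 'home') "
        "ORDER BY option_name"
    )
    return rows.fetchall()
```

Asking for two named options returns two rows, which is a lot easier to read than the pile of matches from a LIKE search on https. The same SELECT should work in the mysql client, assuming the table prefix is the default one.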
I think I might have had a little more success if it hadn’t already been past my usual bedtime, and/or if I hadn’t had a headache at the time (I’d had less caffeine than I’ve gotten used to lately). But I was tired and annoyed. Annoyed at wordpress for not working how it seemed like it should, annoyed at myself for not having figured out how to fix it yet, and embarrassed, because I’d been so excited about building my site and had just made a post that I was telling people about, so they might see that I’d broken my site and not yet fixed it. I really didn’t wanna leave it visibly broken overnight. So I tried restoring from a backup.
First, I used the backup restore functionality within WHM. I have my automatic backups configured to run daily and I keep a lot of them, so I did have a backup, and it wasn’t all that old either. The behavior was unchanged though. My main site page was still only sort of loading, and the login link still had https and was getting a 404 (and when I tried to use plain http it redirected to https and thus still got the 404). I looked in the folder and saw both .htaccess and .htaccess-bak. The restore didn’t remove files that weren’t in the backup, so not everything was exactly as it had been at two in the morning.
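One way to spot leftovers like that .htaccess-bak is to compare the live folder against a copy of the backup. This is a rough Python sketch, assuming the backup has already been extracted into a directory somewhere; the function names are hypothetical and it only compares file names, not contents.

```python
import os


def relative_files(root):
    """Return the set of file paths under root, relative to root."""
    found = set()
    # os.walk includes dotfiles such as .htaccess.
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            full = os.path.join(dirpath, name)
            found.add(os.path.relpath(full, root))
    return found


def files_not_in_backup(live_root, backup_root):
    """List files that exist in the live folder but not in the backup copy."""
    return sorted(relative_files(live_root) - relative_files(backup_root))
```

Anything it lists is a file the restore would have left in place, since restoring only writes what the backup contains.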
I almost decided to go to sleep then, but then I remembered that if I remove the cPanel account and then restore it from the backup, then it should be exactly how it was. So I tried that.
But then everything was getting a 404. I couldn’t figure out why it would be doing that if all the files and databases were the same as they were at 2am, when it was working fine. But I also knew whatever was happening was probably more complicated than I was going to figure out that late at night. So I went to sleep.
This morning, I tried making a mysqldump of the database and seeing what happened if I made a new wordpress install in a different account and then imported the database dump. The main page loaded fine, but all the links were still pointing at the other site. Apparently the links are all in the database. (There is a lot of stuff kept in the database. I really should learn how it’s organized.) So then, after saving another copy of the files I already had, I tried to do a new wordpress install in the original account, intending to later find just the parts of the database that pertain to the posts and import only those. But even the wordpress install page was giving a 404.
I almost thought it was going to end up being easier to start the whole thing over, but I wanted a break first. After Brad got home from class, I told him about it and he suggested taking one more look at the apache error logs. And it is a good thing he did. There were mod_ruid2 errors about not being able to change directory correctly. It turned out something weird was happening with the apache jailshell. Brad pointed out that it is still listed as experimental, so I should check whether things were any better with it disabled. At first it didn’t make a difference, but then he also pointed out that apache sometimes isn’t restarted fully enough after some types of changes. Once I restarted apache, the wordpress install page was loading. Oh yeah, I had also deleted my files (after copying them somewhere else).
Then I did another backup restore, from yesterday morning’s backup, and then my page was loading just fine, and I was very happy. I logged in and started writing this post, to explain what was wrong with my site, but also to share things I have learned.
I think now that the apache problem might have been that the jailshell was confused about the mountpoints. One of them wasn’t unmounted correctly when I terminated the account, so I unmounted it manually so that the repquota cron would work again. I think some of the other jailshell settings might not have been cleaned up correctly when the account was deleted and recreated.
Now that it’s working again, I might try briefly to see if apache jailshell will work right if I re-enable it. If it doesn’t work I will re-disable it. It is still experimental after all, so some bugs are to be expected. After that, my plan is to go over some settings changes I made yesterday that got lost when I did the backup restore, and then research how to use https with wordpress. I do still want to do that, but after what happened yesterday, I’m going to try to research it better before jumping into things I don’t actually know how to do.