Alt title: How I cut GIGABYTES off a WordPress database
One of the things we care about here at Cinch is website performance. We provide our customers support and guidance on how to keep a website fast, as well as managing some of those tweaks ourselves. Because I’m sure you’ve heard this before: your website speed is directly related to your blah blah blah SEO blah blah conversions something something customer satisfaction. Blah blah. And it’s certainly true, but it’s only one small part of those things, and there is an ROI on how much effort put towards site speed. If the effort is more than what’s being put towards creating good, relevant content, the there’s an anecdote about a tree, some barking, and potentially finding another tree.
But when a site is visibly slow it’s time to take a look under the hood. For the last few months, my own site Spigot has been suffering from periodic downtime and visible slowdowns on both the front end and WordPress admin. I put it aside for months hoping it would just get better and guess what? It didn’t. So it was time to investigate.
All the Tweets
I’m a true believer in the power of self-publishing, and of owning your content. I believe that blogging on self-hosted platforms is better than posting on privately owned platforms. I thought Facebook would have died off by now but I’m still convinced that some day all the content posted there will disappear. And so it was with my own Tweeting… I enjoy using Twitter, but also wanted to own my own tweets. So I’ve spent countless hours over the years importing and organizing them on the Spigot site. Why? I don’t know any more… but the thousands of tweets, and script I used to import those tweets were bogging the site down.
It’s been my experience that WordPress is very good at dealing with large volumes of things, but I think the number of posts (tweets), Akismet spam slinging, and tweet redirects caused a perfect storm. So it was clearly time for them to go, all I needed was to convince myself that I no longer needed them. And since the tweets brought no real content value to the site, and the only page visits were from bots, the decision was easy. So I bulk deleted all tweets, removed the import script, and killed off the Custom Post Type.
I did the same with the Instagram imports.
My work was done here… [Double hand swipe]
Not quite enough
A few weeks passed and even though the site was no longer experiencing periodic downtime, it still felt slow. And someone pointed out that it was slow. And even though GTMetrix said everything is fine, it was time for a deeper look.
Gigabase
I’d guessed that the site database had to be pretty big with all those tweets, and I’d assumed that deleting tweets would also remove them from the database. But in some reason that didn’t happen. And I can’t figure it out if it’s supposed to happen… But the database was nearly 3GB in size. Searching the database for the string tweet
returned over 23 million results. There were nearly 6000 tweets, which translates into 3800-ish records for each tweet! What were all those records for? I’m not well versed enough in SQL to start poking around, so it was time for something extreme.
Extreme Clean
With both Cinch and Spigot, we import and migrate a LOT of websites. When rebuilding an existing website, we always start from scratch and selectively import the important content – posts, products, etc. Starting from scratch gives us a clean slate to start from and rids us of any legacy cruft. It’s the right way to do it…
But I wasn’t about to recreate all the pages for Spigot. I’ve never tried this technique before, but thought I’d give it a try: Export all content from the live site, and import to a fresh WordPress install. Put all the plugins and themes in place and recreate the site that way. Reconnect all the connections and see what happens…
Well it worked
There were a few hiccups – mostly with importing and images and attaching attachments – but in the end the gambit worked. I’m pretty sure all the pieces are put back together (and if not I’ve got a backup). The database size is now 60MB and the site is snap snap snappy McSnapperson. And GTMetrix still likes it too :)
What’s the lesson here?
First of all, don’t import all your tweets into a website that has nothing to do with your tweets. Don’t mix your ketchup with your ice cream. They don’t belong together. But there are other lessons:
- Keep an eye on your database. I had no idea it had gotten that big.
- Optimize it periodically, but that wouldn’t have helped in this case
- Extreme circumstances require extreme tactics
- A self published blog is better than Facebook.
I know this post wasn’t dense with actual technique, but it’s a good story about sleuthing, sledge hammer action, and scalpel precision. I’m also pretty sure it won’t be read by many, putting a big hole in my “A blog is better than Facebook” theory.