Note: This post was originally published on May 3, 2021. It has been updated to reflect new information, including an extension of the Drupal 7 end-of-life date.
Drupal 7 was originally scheduled to reach its end-of-life date in 2021. That date has been extended several times and is now scheduled for November 1, 2023. A lot of organizations are reconsidering their website plans and many of them are looking to switch to WordPress.
Over the years, WordPress has continued to dominate the CMS landscape, now powering over 43% of the entire web (compared to Drupal’s paltry 1.7%). There are many reasons to consider WordPress (ease of use, security, flexibility, etc.), but for this post, we’ll assume you’ve already decided to make the switch.
Making that switch can be a challenge, especially if you have a large website and need to migrate all of your content out of Drupal and into WordPress. So, let’s break down some approaches you can use here.
There are essentially three key issues: getting your content out of Drupal, getting your content into WordPress, and preserving your search engine rankings in the process. Let’s take them one at a time.
Getting your content out of Drupal
This is probably the biggest challenge. There are basically 4 ways:
1. Migrate the content manually
AKA good ol’ fashioned copy and pasting. For small sites, this is honestly often the easiest way. A manual approach is also the best way to handle your main site pages, since you’ll likely reorganize and rewrite content during a redesign process. For larger content-heavy sites, though, this is a tedious nightmare that will take a ton of time and introduce a large margin for error. If you’re trying to migrate, for example, 10 years of press releases or blog posts, you’ll want a more automated approach.
2. Use a Drupal export module
There are a handful of Drupal modules designed to help with data exporting. The most popular is Views Data Export, which allows you to take data from Drupal’s Views module and export it into a format like a CSV or XML file. In our experience, this approach works well for simple configurations, but sites that have elaborate setups with custom fields, taxonomies, etc. may prove more difficult. Pro tips:
- Use the Image URL Formatter module to make it easier to export your featured images as absolute URLs
- Exporting into XML and wrapping your main content in CDATA blocks makes it much easier to preserve complex formatting than a traditional CSV export
- If you’re dealing with complex fields like repeaters (where a resource might have several associated references, for example), you can implement hook_views_pre_render in a custom module to loop through them and create nested XML tags for easy importing later
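To show what the last two tips produce, here is a minimal Python sketch of the target XML shape: the body wrapped in CDATA and a repeater field written as nested tags. This is not what Views Data Export generates verbatim. The element names (`posts`, `post`, `body`, `references`, `reference`) and the function names are our own assumptions for illustration.

```python
from xml.dom.minidom import Document


def add_cdata(doc, parent, text):
    """Append text to parent as CDATA, splitting any literal ']]>' safely."""
    pieces = text.split("]]>")
    for i, piece in enumerate(pieces):
        if i > 0:
            piece = ">" + piece      # second half of a split "]]>"
        if i < len(pieces) - 1:
            piece = piece + "]]"     # first half of a split "]]>"
        parent.appendChild(doc.createCDATASection(piece))


def build_export(posts):
    """Return an XML string with one <post> element per dict in posts."""
    doc = Document()
    root = doc.createElement("posts")
    doc.appendChild(root)
    for post in posts:
        node = doc.createElement("post")
        root.appendChild(node)

        title = doc.createElement("title")
        title.appendChild(doc.createTextNode(post["title"]))
        node.appendChild(title)

        # CDATA keeps the HTML body byte-for-byte, with no entity escaping.
        body = doc.createElement("body")
        add_cdata(doc, body, post["body"])
        node.appendChild(body)

        # Repeater field: one nested tag per value instead of a joined string.
        refs = doc.createElement("references")
        for url in post.get("references", []):
            ref = doc.createElement("reference")
            ref.appendChild(doc.createTextNode(url))
            refs.appendChild(ref)
        node.appendChild(refs)
    return doc.toxml()
```

Any XML-aware importer should read the body back as the original HTML, while a CSV export of the same content would have to survive quoting, embedded commas, and line breaks.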
3. Write SQL scripts
This approach involves going directly into the database and writing code to export exactly the information we need in exactly the correct format, which gives us a ton of flexibility in how we export the data. That said, it also requires extensive knowledge of SQL scripting along with Drupal’s database architecture. Here’s a sample export script for blog posts with a couple of custom fields and taxonomies.
SELECT
    `node`.nid AS drupal_id,
    `node`.vid AS drupal_revision_id,
    FROM_UNIXTIME(`node`.created) AS created_date,
    FROM_UNIXTIME(`node`.changed) AS last_modified_date,
    `node`.title,
    `field_revision_field_summary`.`field_summary_value` AS excerpt,
    `field_revision_body`.`body_value` AS main_content,
    `field_revision_field_date`.`field_date_value` AS published_date,
    `url_alias`.alias AS drupal_url,
    GROUP_CONCAT(DISTINCT tax1.name ORDER BY tax1.name ASC SEPARATOR ', ') AS categories,
    GROUP_CONCAT(DISTINCT program_node.title ORDER BY program_node.title ASC SEPARATOR ', ') AS programs
FROM `node`
LEFT JOIN `field_revision_field_summary`
    ON `node`.vid = `field_revision_field_summary`.`revision_id`
    AND `node`.nid = `field_revision_field_summary`.`entity_id`
LEFT JOIN `field_revision_body`
    ON `node`.vid = `field_revision_body`.`revision_id`
    AND `node`.nid = `field_revision_body`.`entity_id`
LEFT JOIN `field_revision_field_date`
    ON `node`.vid = `field_revision_field_date`.`revision_id`
    AND `node`.nid = `field_revision_field_date`.`entity_id`
LEFT JOIN `url_alias`
    ON CONCAT('node/', `node`.`nid`) = `url_alias`.`source`
LEFT JOIN `field_revision_field_categories`
    ON `node`.vid = `field_revision_field_categories`.`revision_id`
    AND `node`.nid = `field_revision_field_categories`.`entity_id`
LEFT JOIN `taxonomy_term_data` AS tax1
    ON `field_revision_field_categories`.`field_categories_tid` = tax1.tid
LEFT JOIN `field_revision_field_programs`
    ON `node`.vid = `field_revision_field_programs`.`revision_id`
    AND `node`.nid = `field_revision_field_programs`.`entity_id`
LEFT JOIN `node` AS program_node
    ON `field_revision_field_programs`.`field_programs_target_id` = program_node.nid
WHERE `node`.type = 'blog_post'
    AND `node`.status = 1
    AND FROM_UNIXTIME(`node`.created) >= '2015-01-01'
GROUP BY `node`.nid;
Each Drupal installation will be very different so this approach will require a lot of manual setup work to get the correct data, but if you’re good at writing SQL queries, it can be a terrific way to export the information.
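Once a query like this returns the right rows, the results still need to land in a file an importer can read. Here is a minimal Python sketch, assuming you already have a DB-API cursor from a MySQL driver of your choice (PyMySQL, for example). The function name and the output path are hypothetical.

```python
import csv


def export_query_to_csv(cursor, sql, out_path, batch_size=500):
    """Run sql on a DB-API cursor and write the result set to a CSV file.

    Column aliases from the query (drupal_id, excerpt, ...) become the
    header row. Returns the number of data rows written.
    """
    cursor.execute(sql)
    headers = [column[0] for column in cursor.description]
    written = 0
    with open(out_path, "w", newline="", encoding="utf-8") as handle:
        writer = csv.writer(handle)
        writer.writerow(headers)
        # Fetch in batches so a large archive is never fully held in memory.
        while True:
            rows = cursor.fetchmany(batch_size)
            if not rows:
                break
            writer.writerows(rows)
            written += len(rows)
    return written
```

Because the `csv` module quotes fields that contain commas, quotes, or line breaks, long HTML bodies survive the trip, although the XML route with CDATA is still easier to inspect by eye.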
4. Use a content scraping tool to pull it directly from the public website
One of the potential drawbacks of options 2 and 3 is that they give you what is stored in Drupal’s database. Sometimes that includes Drupal code that won’t work on WordPress. For example, if you use an image management tool like Scald, your content may include coded references to images that Drupal converts on page load, rather than the images themselves.
What we need is the converted output that appears on the actual page. The manual cut/paste approach will get us that, but what if there were a way to automate the process? Enter ScrapeStorm, an AI-powered web scraping tool. We can essentially feed it a list of URLs and select which area(s) of each page to grab (some Drupal template modifications to add classes or IDs can help here), and it will load each page and copy the desired content for us.
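To make the idea concrete, here is a minimal Python sketch of what a scraper does under the hood, using only the standard library. It is not how ScrapeStorm works internally, just the same principle: load each page and keep the rendered HTML inside one element. The class and function names are our own, and the sketch assumes the content wrapper has a unique `id` attribute and reasonably well-formed markup.

```python
from html.parser import HTMLParser
from urllib.request import urlopen

# Elements that never have a closing tag, so they must not change nesting depth.
VOID_TAGS = {"area", "base", "br", "col", "embed", "hr", "img", "input",
             "link", "meta", "source", "track", "wbr"}


class ContentExtractor(HTMLParser):
    """Collect the inner HTML of the first element with the given id."""

    def __init__(self, target_id):
        super().__init__(convert_charrefs=False)
        self.target_id = target_id
        self.depth = 0      # 0 means we are outside the target element
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if self.depth == 0:
            if tag not in VOID_TAGS and dict(attrs).get("id") == self.target_id:
                self.depth = 1
            return
        self.chunks.append(self.get_starttag_text())
        if tag not in VOID_TAGS:
            self.depth += 1

    def handle_startendtag(self, tag, attrs):
        # Self-closing tags such as <img ... /> are copied without nesting.
        if self.depth > 0:
            self.chunks.append(self.get_starttag_text())

    def handle_endtag(self, tag):
        if self.depth == 0 or tag in VOID_TAGS:
            return
        self.depth -= 1
        if self.depth > 0:
            self.chunks.append(f"</{tag}>")

    def handle_data(self, data):
        if self.depth > 0:
            self.chunks.append(data)

    def handle_entityref(self, name):
        if self.depth > 0:
            self.chunks.append(f"&{name};")

    def handle_charref(self, name):
        if self.depth > 0:
            self.chunks.append(f"&#{name};")


def extract_content(html, target_id):
    """Return the inner HTML of the element whose id is target_id."""
    parser = ContentExtractor(target_id)
    parser.feed(html)
    parser.close()
    return "".join(parser.chunks).strip()


def scrape(urls, target_id):
    """Fetch each URL and return a dict of url -> extracted inner HTML."""
    results = {}
    for url in urls:
        with urlopen(url) as response:
            charset = response.headers.get_content_charset() or "utf-8"
            html = response.read().decode(charset, errors="replace")
        results[url] = extract_content(html, target_id)
    return results
```

Calling `scrape(list_of_urls, "main-content")` (with `main-content` standing in for whatever ID your Drupal template uses) returns the rendered markup for each page, ready to be written into an import file. A dedicated tool adds the parts this sketch skips, such as rate limiting, retries, and JavaScript-rendered pages.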
Getting your content into WordPress
As we mentioned at the beginning of this post, a manual approach is going to be best for the main pages on your site (the homepage, about us page, main program areas, etc.) because it allows for maximum flexibility and those pages will likely have much more customization.
For the bulk of the legacy content from your Drupal site (press releases, resources, blog posts, news items, etc.), you’ll want to automate that import. For that, we highly recommend WP All Import. It’s an industry-leading plugin that makes it easy to set up a batch import. It will pull in the content, create taxonomy terms, download images from your old site and import them into the WordPress media library, and it integrates with other WordPress plugins you’re probably already using, like Advanced Custom Fields, Yoast SEO, or WPML.
WP All Import is also very developer friendly so if you have data that needs to be transformed (taxonomy terms being re-named in the new site architecture, for example), it will allow you to write some simple code to update the data during the import process. You can also do this for recurring imports if you routinely need to sync data from an external system, like a grants database.
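As an illustration of the kind of transformation we mean, here is a minimal Python sketch that renames taxonomy terms in a comma-separated field, such as the `categories` column from the SQL export above. Inside WP All Import itself this logic would be written as a small PHP function. The Python version is a stand-in that could be run over the export file before importing, and the term names in `RENAMES` are made up.

```python
# Hypothetical mapping of old Drupal term names to new WordPress term names.
RENAMES = {
    "News": "Press Releases",
    "Blog": "Insights",
}


def remap_terms(value, renames, separator=", "):
    """Rename terms in a comma-separated string, dropping duplicates.

    Terms without an entry in renames are kept unchanged. This assumes
    term names never contain a comma themselves.
    """
    remapped = []
    for term in value.split(","):
        term = term.strip()
        if not term:
            continue
        new_term = renames.get(term, term)
        if new_term not in remapped:
            remapped.append(new_term)
    return separator.join(remapped)
```

For example, `remap_terms("Blog, Events, News", RENAMES)` gives `"Insights, Events, Press Releases"`, and two old terms that map to the same new name collapse into one.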
Preserving your search engine rankings
When you move content from one site to another, it is imperative that you pay attention to the actual URLs for each page. If they change in any way, you need to create a 301 redirect from the old URL to the new one so that Google and other search engines know where the content has moved. If you don’t do this, you may end up losing any search engine reputation you’ve built over the years.
There are a couple of different ways to manage this. We typically import the old Drupal URL into a custom field during the migration process. We can then use WP All Import’s sister plugin, WP All Export, to export a list of every page’s old and new URLs, which can then be imported into the Redirection plugin to create the actual redirects.
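The step between the export and the Redirection import is simple enough to script. Here is a minimal Python sketch that takes old/new URL pairs and keeps only the ones that actually changed. The function names are ours, and the two-column `source,target` CSV layout is an assumption, so check it against the Redirection plugin’s import documentation before relying on it.

```python
import csv
from urllib.parse import urlparse


def comparable(url_or_path):
    """Reduce a URL or path to '/path' form for comparison purposes."""
    return "/" + urlparse(url_or_path).path.strip("/")


def build_redirects(pairs):
    """Return (source, target) paths for every page whose URL changed.

    pairs holds (old_drupal_url, new_wordpress_url) tuples. Drupal aliases
    usually have no leading slash ("blog/my-post"), while WordPress
    permalinks are full URLs, so both are reduced to paths first.
    """
    redirects = []
    seen = set()
    for old, new in pairs:
        if not old or not new:
            continue                      # nothing to redirect from or to
        source = comparable(old)
        if source == comparable(new) or source in seen:
            continue                      # unchanged URL or duplicate source
        seen.add(source)
        target = urlparse(new).path or "/"   # keep WordPress's trailing slash
        redirects.append((source, target))
    return redirects


def write_redirect_csv(redirects, out_path):
    """Write the redirects to a CSV file with a source,target header."""
    with open(out_path, "w", newline="", encoding="utf-8") as handle:
        writer = csv.writer(handle)
        writer.writerow(["source", "target"])
        writer.writerows(redirects)
```

Filtering out unchanged URLs keeps the redirect table short, and skipping duplicate sources avoids two rules fighting over the same old address.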
If you take one thing away from this post, it should be this: have a plan for redirecting any URLs that change during a website redesign. No matter how you do the content migration, this step absolutely cannot be skipped.
There’s no magic recipe for this. We often find a combination of these techniques is the best approach. Work with your website agency to make sure that they have a plan for how best to accomplish this.
We obviously spend a lot of time thinking about things like this, so if you’re considering a Drupal migration or website redesign project, we’d be happy to see if we can help. Contact Us to set up a time to chat.