WPMUDEV Autoblog Feed processing causing load, duplicate issues when network activated

At its best, the Autoblog plugin for WordPress does exactly what it sets out to do. However, a recent update has caused some significant issues when the plugin is “Network Activated” on WordPress Multisite installations.

For those who aren’t familiar with it, Autoblog is essentially a feed reposter. Users can configure any number of RSS feeds to be imported into the blog and turned into posts. It’s a great way to automatically populate a site with fresh content.

The Cause

In previous versions, the plugin could be activated per blog, but the configured list of feeds was stored in a central table. This behavior changed in the latest version: unless the plugin was “Network Activated,” which forces it to be enabled on all blogs, settings would now be kept in separate tables for each blog. Existing users who hadn’t activated the plugin across their network would suddenly find their feeds missing, since Autoblog was now looking for them in a new table that had not previously existed.
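To make the table change concrete, here is a rough sketch of how a per-blog versus network-wide table name could be resolved. The function name, the `autoblog` table suffix, and the `wp_` prefix are assumptions for illustration and are not taken from the plugin's source. The only part that is standard WordPress behavior is the Multisite prefix convention: the main site (ID 1) uses the base prefix, and other sites append their blog ID.

```php
<?php
// Hypothetical helper: picks the table that Autoblog-like code would read feeds from.
function example_feed_table( string $base_prefix, int $blog_id, bool $network_activated ): string {
    // Network activation (or the main site) resolves to the shared, central table.
    if ( $network_activated || 1 === $blog_id ) {
        return $base_prefix . 'autoblog';
    }
    // Otherwise each blog gets its own table, e.g. wp_3_autoblog.
    return $base_prefix . $blog_id . '_autoblog';
}

echo example_feed_table( 'wp_', 3, true ), "\n";  // wp_autoblog
echo example_feed_table( 'wp_', 3, false ), "\n"; // wp_3_autoblog
```

Under these assumptions, a blog that configured its feeds before the update would have them in the first table, while the updated plugin would look in the second, which starts out empty.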

Users could either recreate the list or network activate Autoblog, which restored their feeds to their previous state. However, blogs with multiple feeds would start to see problems almost immediately. The root of these issues is a mismatch between how WordPress’ WP-Cron scheduler guards against concurrent runs and how Autoblog behaves network-wide.

WP-Cron was designed with the understanding that it would often be used to execute long-running, high-load tasks. Therefore, it takes steps to prevent multiple instances from running at the same time. However, it is designed to do this within the context of a single blog, even when running on a Multisite network.
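The difference between a per-blog lock and a network-wide lock can be sketched with a simplified, in-memory stand-in for transient storage. This is not WordPress code: `example_try_lock()`, the scope names, and the 60-second timeout are all hypothetical, chosen only to show why two blogs never see each other's lock.

```php
<?php
// In-memory stand-in for transient storage, keyed by lock scope.
$store = array();

// Hypothetical helper: take the lock for a scope unless a fresh one already exists.
function example_try_lock( array &$store, string $scope, float $now, int $timeout = 60 ): bool {
    if ( isset( $store[ $scope ] ) && $store[ $scope ] + $timeout > $now ) {
        return false; // Another run holds a fresh lock in this scope.
    }
    $store[ $scope ] = $now;
    return true;
}

$now = microtime( true );

// Per-blog scope: each blog checks only its own lock, so both runs proceed.
$blog_2 = example_try_lock( $store, 'blog_2', $now );
$blog_3 = example_try_lock( $store, 'blog_3', $now );

// Network scope: the second caller sees the first caller's lock and backs off.
$first  = example_try_lock( $store, 'network', $now );
$second = example_try_lock( $store, 'network', $now );

var_dump( $blog_2, $blog_3, $first, $second ); // true, true, true, false
```

With per-blog scoping both blogs get the lock, which is harmless as long as each blog only touches its own data.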

When activated across a network, Autoblog’s feed processor uses WP-Cron to process any feed for any blog in the shared table. Because of this and other glitches in the current codebase, scheduled tasks are started by multiple blogs at practically the same time. Oblivious to each other, they begin to process the same feed listings simultaneously. This potentially causes:

  • High load and RAM use
  • Fewer HTTP threads for use by real visitors
  • Data corruption and duplicate posts

One of the more esoteric data corruption bugs is potentially caused by different timezone settings between the blog that owns the feed and the blog whose context is processing the new posts.
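As an illustration of how a timezone mismatch could skew post data, the sketch below formats one instant under two different timezone settings. The timestamp and both timezones are made up for the example, and this is not a reproduction of the plugin's actual date handling.

```php
<?php
// One publication instant, stored in UTC.
$published = new DateTimeImmutable( '2012-06-15 02:30:00', new DateTimeZone( 'UTC' ) );

// Hypothetical settings: the blog that owns the feed vs. the blog running the cron task.
$feed_blog_tz = new DateTimeZone( 'America/New_York' );
$cron_blog_tz = new DateTimeZone( 'Europe/Berlin' );

$as_feed_blog = $published->setTimezone( $feed_blog_tz )->format( 'Y-m-d H:i' );
$as_cron_blog = $published->setTimezone( $cron_blog_tz )->format( 'Y-m-d H:i' );

echo $as_feed_blog, "\n"; // 2012-06-14 22:30
echo $as_cron_blog, "\n"; // 2012-06-15 04:30
```

If local times like these were saved or compared without converting to a common timezone, the same feed item could end up with a different date, or look like a new item, depending on which blog happened to process it.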

The Fix

A partial fix that I am currently testing adds a network-wide check for other Autoblog scheduled tasks. The code is similar to WP-Cron’s own lock mechanism, but altered to work across the network.

  // Guard Autoblog's feed processing with a network-wide lock on Multisite.
  if ( is_multisite() ) {
      add_action( 'autoblog_pre_process_feed', 'autoblog_site_cron_lock' );
      add_action( 'autoblog_post_process_feed', 'autoblog_site_cron_unlock' );
  }

  function autoblog_site_cron_lock() {
      $gmt_time = microtime( true );
      // Site transients are shared by every blog in the network.
      $doing_cron_transient = get_site_transient( 'autoblog_doing_cron' );

      // Another blog holds a lock that has not timed out yet: stop this run.
      if ( $doing_cron_transient && ( $doing_cron_transient + WP_CRON_LOCK_TIMEOUT > $gmt_time ) ) {
          exit();
      }

      // Claim the lock by recording the current time.
      $doing_wp_cron = sprintf( '%.22F', microtime( true ) );
      set_site_transient( 'autoblog_doing_cron', $doing_wp_cron );
  }

  function autoblog_site_cron_unlock() {
      // Release the lock once the feed has been processed.
      delete_site_transient( 'autoblog_doing_cron' );
  }

If you find this helpful or have some feedback, please let me know via Twitter, @deltafactory.