Hello!
One of the projects that we are currently working on is a WordPress plugin that integrates with the Toronto Real Estate Board to pull new listings from their systems and import them into WordPress as posts.
The mechanics required to connect to TREB are very basic. One might call TREB’s systems out of date. They put their listing data in a downloadable CSV and store all the listing images on an FTP site. This hasn’t changed for years and likely will not change in the near future without some serious technological overhaul.
Because of this restriction, connecting a modern content management system such as WordPress to this data means writing a WordPress plugin that pulls the CSV data, processes it and imports each listing as a post. The listing images can be retrieved with PHP’s cURL functions and stored in the media library. The problem with a transport like FTP is that you end up waiting a long time for the process to complete.
The problem with waiting in the context of a web server is that the server usually enforces strict timeouts (often 30 seconds, sometimes longer or shorter). Because this execution timeout differs from system to system, an FTP-based retrieval of images from a notoriously slow FTP server like TREB’s becomes unreliable. It can take upwards of 15-20 seconds per file to retrieve!
How would you deal with this type of bottleneck? You would run the work as a background process, ideally concurrently with the other processes. In PHP this is not easy to do.
What we found was a very interesting class that allows you to do just that: run asynchronous, non-blocking tasks in the background.
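To make that arithmetic concrete, here is a small sketch. The helper name `treb_files_per_request` is hypothetical, and the numbers are only the rough figures quoted above; it estimates how many sequential downloads fit inside one request's time limit.

```php
<?php
/**
 * Estimate how many sequential downloads fit in one web request.
 * Hypothetical helper, for illustration only.
 *
 * @param int   $time_limit       Time limit in seconds (0 or less means no limit).
 * @param float $seconds_per_file Observed time to fetch one file.
 * @return int|null Number of files that fit, or null when there is no limit.
 */
function treb_files_per_request( $time_limit, $seconds_per_file ) {
    if ( $time_limit <= 0 ) {
        return null; // no limit to divide by
    }
    return (int) floor( $time_limit / $seconds_per_file );
}

// With a 30 second limit and 15 to 20 seconds per image:
echo treb_files_per_request( 30, 15 ), "\n"; // 2
echo treb_files_per_request( 30, 20 ), "\n"; // 1
```

In other words, a single request has room for one or two images at best, while a day's listings can involve far more than that.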
PHP Functions that take too long to process will usually fail
Let’s look at our PHP function that uses cURL to retrieve images via FTP:
function treb_get_images( $remote_url, $remote_user, $remote_pass, $local_file ) {
    $ch = curl_init();
    $fp = fopen( $local_file, 'w' );

    curl_setopt_array(
        $ch,
        array(
            CURLOPT_URL            => $remote_url,
            // Log in with the FTP credentials passed to the function
            CURLOPT_USERPWD        => $remote_user . ':' . $remote_pass,
            CURLOPT_HEADER         => 0,
            CURLOPT_VERBOSE        => 0,
            CURLOPT_CONNECTTIMEOUT => 140,
            CURLOPT_TIMEOUT        => 300,
            CURLOPT_NOSIGNAL       => 1,
            // Set cURL to write to disk
            CURLOPT_FILE           => $fp,
        )
    );

    // Execute download
    $response = curl_exec( $ch );
    $error    = null;
    if ( false === $response ) {
        $error = sprintf( 'Curl failed with error #%d: %s', curl_errno( $ch ), curl_error( $ch ) );
    }

    // Release both handles before reporting, because E_USER_ERROR halts the script
    curl_close( $ch );
    fclose( $fp );

    if ( null !== $error ) {
        trigger_error( $error, E_USER_ERROR );
    }
}
All the above function does is a simple curl_exec of a remote URL, requesting a remote file. Executing this function in a PHP web process will likely fail and generate a fatal error, something along the lines of “Maximum execution time of 30 seconds exceeded”. To get it to work, you could raise max_execution_time in your web environment’s php.ini file to something abnormally high. This is not really recommended, so instead we wrap functions that take too long in the WP Background Process class.
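The general idea behind that approach can be sketched without any library: do a little work, stop once a time budget is used up, and leave the remainder for a later run. The snippet below is a generic illustration under that assumption, not the library's implementation; `example_run_with_budget` and the simulated 20 ms "download" are made up for the example.

```php
<?php
/**
 * Run $task on queued items until the time budget is spent.
 * Always handles at least one item so every run makes progress.
 *
 * @return array Items that were not processed in this run.
 */
function example_run_with_budget( array $items, callable $task, $budget_seconds ) {
    $started = microtime( true );

    while ( $items ) {
        $task( array_shift( $items ) );

        if ( microtime( true ) - $started >= $budget_seconds ) {
            break; // out of time: leave the rest for the next run
        }
    }

    return $items;
}

// Simulate ten slow downloads (20 ms each) with a 50 ms budget per run.
$pending = range( 1, 10 );
$runs    = 0;

while ( $pending ) {
    $pending = example_run_with_budget(
        $pending,
        function ( $n ) {
            usleep( 20000 ); // stand-in for a slow FTP transfer
        },
        0.05
    );
    $runs++;
}

echo "Finished in {$runs} runs\n";
```

No single run goes much past its budget, yet the whole list still gets processed across several runs.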
Import & Use the WordPress Background Process Class
Again, take a look at the GitHub project for WP Background Processing. What we want to do is create a wrapper function that will handle this particular function, along with any other functions that may take a long time to process.
You need to hook into WordPress’ init action and attach the process handler function:
add_action( 'init', 'process_handler' );
function process_handler() {
    $treb_import                      = new stdClass();
    $treb_import->treb_import_process = new Treb_Import_Process();

    // Only run the import when it is explicitly requested
    if ( isset( $_GET['process'] ) && 'treb_images' === $_GET['process'] ) {
        // Parse date, otherwise assign current date
        if ( ! empty( $_GET['date'] ) ) {
            $date = explode( '-', sanitize_text_field( wp_unslash( $_GET['date'] ) ) );
        } else {
            $date = explode( '-', date( 'd-m-Y' ) );
        }

        $treb_data  = treb_get_csv( $date );
        $loop_count = 0;

        foreach ( $treb_data as $item ) {
            $loop_count++;

            // Prep multidimensional array: total count, listing row, position
            $item_array = array(
                count( $treb_data ),
                $item,
                $loop_count,
            );

            // Queue the import
            $treb_import->treb_import_process->push_to_queue( $item_array );
        }

        // Save the queue and start processing it in the background
        $treb_import->treb_import_process->save()->dispatch();
    }
}
What’s happening in the above function? The only lines of code you need to worry about are the following:
$treb_import = new StdClass;
$treb_import->treb_import_process = new Treb_Import_Process();
$treb_import->treb_import_process->push_to_queue($item_array);
$treb_import->treb_import_process->save()->dispatch();
The first two lines initialize the class. The third line pushes an item onto the queue, in this case listing data. The last line saves the queue and dispatches it.
In the referenced Treb_Import_Process class we have all the functions and tasks that take time; these run asynchronously in the background. Hopefully by now you can see the benefits of this. Large batches of jobs can be run safely in the background, and WP Background Process works through them in small batches until the whole job is complete. This means the import no longer depends on raising server settings such as max_execution_time.
I hope you find this helpful! Eventually we will be releasing the TREB WordPress plugin to the WordPress community; for now you can view our GitHub project to see the full code examples illustrated above.
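Our actual Treb_Import_Process is in the GitHub project and is not reproduced here. As a rough sketch of the pattern, a WP Background Processing subclass sets an `$action` name and implements `task()`, which receives one queued item at a time and returns `false` once the item can be removed from the queue. The class name `Example_Import_Process`, the action name, the `mls` key and the log messages are assumptions made for illustration. The guarded parent class is only a simplified stand-in so the sketch runs outside WordPress; it executes tasks inline and is not the library's code.

```php
<?php
// Stand-in so this sketch runs outside WordPress. In a real plugin the
// WP Background Processing library provides WP_Background_Process.
if ( ! class_exists( 'WP_Background_Process' ) ) {
    abstract class WP_Background_Process {
        protected $queue = array();

        public function push_to_queue( $data ) {
            $this->queue[] = $data;
            return $this;
        }

        public function save() {
            return $this;
        }

        // Simplified: runs every task inline instead of in a background request.
        public function dispatch() {
            while ( $this->queue ) {
                $result = $this->task( array_shift( $this->queue ) );
                if ( false !== $result ) {
                    $this->queue[] = $result; // re-queue for another pass
                }
            }
            $this->complete();
        }

        abstract protected function task( $item );

        protected function complete() {}
    }
}

class Example_Import_Process extends WP_Background_Process {
    protected $action = 'example_treb_import'; // hypothetical action name

    public $log = array();

    protected function task( $item_array ) {
        // Same shape that process_handler() queues: total, listing row, position.
        list( $total, $item, $position ) = $item_array;

        // A real task would create the post and download its images here.
        $this->log[] = sprintf( 'Imported listing %d of %d', $position, $total );

        return false; // done: remove this item from the queue
    }

    protected function complete() {
        parent::complete();
        $this->log[] = 'Import complete';
    }
}

// Usage: queue two made-up listings, then save and dispatch.
$process = new Example_Import_Process();
$process->push_to_queue( array( 2, array( 'mls' => 'X1' ), 1 ) );
$process->push_to_queue( array( 2, array( 'mls' => 'X2' ), 2 ) );
$process->save()->dispatch();

print_r( $process->log );
```

Because each listing is its own queue item, one slow or failed image download only affects that item rather than the whole import.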