03/03/2017

How to use PHP to recursively transfer files in parallel over HTTP

Clone your site with PHP Curl

Hello!

There may be some scenarios where you might want to clone your site or push files to a remote location completely and 100% using PHP as a web service, without touching the command line.

There are many console or command line utilities to help complete this type of job such as rsync, scp, ncftp, ftp or any of the wide assortment of network file copy utilities that are available today.

But what if you want a system that migrates your site files on a web host where the console or command line is not available? For example, someone on a shared hosting plan with GoDaddy or a similar service will not be allowed command line access to Linux utilities like rsync without paying for a VPS or similar plan that affords such access.

Rest easy, there are ways to synchronize your files even on a basic shared web hosting platform! As long as you are able to host PHP files, you should be able to do this. I’ll try to go over the process on the sender and receiver end first. Then I’ll go over the restrictions put in place by some web hosts and how to possibly accommodate those restrictions.

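Ideally this should be run on an environment with PHP >= 5.5, since the sending code relies on the CURLFile class and the CURLOPT_SAFE_UPLOAD option, both of which were introduced in PHP 5.5.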

Send files to a remote location with PHP over HTTP

No console, no shell_exec needed! As a systems administrator, writing PHP code that executes Linux utilities just doesn’t feel right. No matter how secure you make your code, you can very rarely (if ever) be 100% certain that it cannot be circumvented, or privileges escalated, to execute arbitrary commands and compromise your system.

Furthermore, why not keep everything within the web services? Pure web services, as opposed to web services that interface with the command line, are safer and more compatible across the wide assortment of hosting environments. It’s more feasible to write a utility that can run across many platforms and environments than a php -> shell_exec utility.

The code I will be demonstrating leverages the PHP Curl library. Don’t worry if your hosting environment doesn’t have that module installed! I will go over workarounds for some of the more common shared hosting environments.
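If you want to find out up front whether a host has what the sending code needs, a minimal sketch could look like the one below. The function name transfer_requirements_met is made up for this example; the checks themselves only use standard PHP functions.

```php
<?php
// Hypothetical helper: reports whether the curl features used in this post are available.
function transfer_requirements_met()
{
    return extension_loaded('curl')
        && function_exists('curl_multi_init')
        && class_exists('CURLFile');
}

if (!transfer_requirements_met()) {
    echo "The curl extension (with CURLFile support) is not available on this host.\n";
}
```

Running this once on both the sending and the receiving host tells you early whether a workaround is needed at all.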

So what’s the objective? Let’s define a root directory in your web accessible site and recursively transfer every file and folder to a remote location, effectively cloning your entire site to the remote server.

Define your variables to send files remotely over HTTP with PHP

First of all, we want to define the root directory so that we can recursively iterate over the sub folders and files and transfer them.

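// Remote script that will receive the files
$target_url = 'https://destination.com/receive.php';
// Local directory to clone
$src = '/your/full/path/to/your/web/directory/and/root/path/';
// Resolve symlinks and drop the trailing slash (realpath returns false if $src does not exist)
$path = realpath($src);
// The iterator is rooted at $path, so relative sub folders are measured from the same path
$basepath = $path;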

All self-explanatory, right? The only thing I want to note is that we are making our connection over HTTPS with PHP Curl. The reasons behind this are obvious, but I should probably note that transferring your web files (which may include database config files and other sensitive items) over non-secure protocols like HTTP is not advisable. HTTPS is ideal.

Use PHP’s RecursiveIteratorIterator class to build a list of all the files, sub files and sub folders

We want to build a (potentially) substantial list that contains all the files and folders under the $src variable. The most ideal way to do this is to use a built-in PHP class called RecursiveIteratorIterator, wrapped around a RecursiveDirectoryIterator.

$objects = new RecursiveIteratorIterator(new RecursiveDirectoryIterator($path), RecursiveIteratorIterator::SELF_FIRST);
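To get a feel for what the iterator hands back, here is a small sketch that runs it against a throwaway directory tree instead of a real site. The $demo_ names are made up for this example, and the SKIP_DOTS flag is an optional addition that leaves out the "." and ".." entries. It assumes a Unix-style filesystem with forward slashes.

```php
<?php
// Build a small throwaway tree so the example does not depend on a real site.
$demo_root = sys_get_temp_dir() . '/iterator_demo_' . uniqid();
mkdir($demo_root . '/sub/deeper', 0755, true);
file_put_contents($demo_root . '/index.php', 'demo');
file_put_contents($demo_root . '/sub/deeper/style.css', 'body {}');

// Same construction as above, with SKIP_DOTS added.
$demo_objects = new RecursiveIteratorIterator(
    new RecursiveDirectoryIterator($demo_root, FilesystemIterator::SKIP_DOTS),
    RecursiveIteratorIterator::SELF_FIRST
);

// $key is the full path name, $file is an SplFileInfo object.
$demo_files = array();
foreach ($demo_objects as $key => $file) {
    if ($file->isDir()) continue;
    $demo_files[] = substr($key, strlen($demo_root));
}
sort($demo_files);
print_r($demo_files);
```

Nothing is loaded into memory up front; the directory tree is walked lazily as the foreach loop advances.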

Iterate over all the files

Before we talk about transferring the files, which is going to be fundamentally part of the iteration, I’ll just quickly go over how we iterate over the files and folders first.

foreach ($objects as $key => $file) {
        if ($file->isDir()) continue;
        $realfile = realpath($key);
}

Obviously the snippet above doesn’t do everything yet. In this example, we are simply iterating over everything the iterator returns, skipping folders outright and assigning each file’s resolved path to the $realfile variable. Why aren’t we telling curl to make the folder remotely? Because directory creation will be handled on the receiving end.

Use Curl Multi to send multiple files in parallel

Since PHP is inherently single threaded, we don’t want to sit there and upload every single file one at a time. For a basic WordPress site, that could take a while because of all the sub files, sub folders, includes and everything in between.

So we want to use curl_multi_init to create a multi handle and run the transfers of all the iterated files in parallel.

If you want to learn more about Curl’s multi handler, or if you have never even heard about it before, here are two guides that may help you wrap your head around it. Useful reads!

So what we want to do is call curl_multi_init before the foreach loop:

$mh = curl_multi_init();

Then we can go back to that foreach loop and add every file to the multi handle before sending anything:

$count = 0;
$ch = array();
foreach ($objects as $key => $file) {
        // Folders are created on the receiving end, so only files are sent
        if ($file->isDir()) continue;
        $ch[$count] = curl_init();
        $realfile = realpath($key);
        // The file itself, plus the sub folder it belongs in relative to the source root
        $curl_file = array();
        $curl_file['file'] = new CURLFile($realfile, '');
        $curl_file['basepath'] = substr(dirname($key), strlen($basepath)) . '/';
        curl_setopt($ch[$count], CURLOPT_URL, $target_url);
        curl_setopt($ch[$count], CURLOPT_RETURNTRANSFER, true);
        // Verify the receiving server's certificate; turning this off defeats the purpose of HTTPS
        curl_setopt($ch[$count], CURLOPT_SSL_VERIFYPEER, true);
        curl_setopt($ch[$count], CURLOPT_SSL_VERIFYHOST, 2);
        // Verbose output is handy while debugging
        curl_setopt($ch[$count], CURLOPT_VERBOSE, 1);
        curl_setopt($ch[$count], CURLOPT_POST, 1);
        curl_setopt($ch[$count], CURLOPT_SAFE_UPLOAD, true);
        curl_setopt($ch[$count], CURLOPT_POSTFIELDS, $curl_file);
        curl_setopt($ch[$count], CURLOPT_FOLLOWLOCATION, 1);
        curl_setopt($ch[$count], CURLOPT_HEADER, 0);
        curl_multi_add_handle($mh, $ch[$count]);
        $count++;
}
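The basepath field in the loop above is what tells the receiving end which sub folder a file belongs in. Here is the same substr and dirname logic pulled out into a standalone sketch so you can see what it produces. The function name relative_folder and the sample paths are made up for this example.

```php
<?php
// Hypothetical helper mirroring the substr(dirname(...)) logic from the sending loop.
function relative_folder($file_path, $base_path)
{
    $base_path = rtrim($base_path, '/');
    // Everything after the base path; the cast covers older PHP versions
    // where substr() returns false instead of an empty string.
    $relative = (string) substr(dirname($file_path), strlen($base_path));
    return $relative . '/';
}

echo relative_folder('/var/www/site/wp-content/themes/style.css', '/var/www/site/'), "\n";
echo relative_folder('/var/www/site/index.php', '/var/www/site'), "\n";
```

The first call gives /wp-content/themes/ and the second gives a bare /, which is what a file sitting directly in the root folder should send.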

Initialize the multi file transfer with Curl

Once all the files have been added to the multi handle with curl_multi_add_handle, we want to transfer the files! This can be done with a simple do-while loop:

$running=null;
do {
        curl_multi_exec($mh,$running);
        curl_multi_select($mh);
} while($running > 0);
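If you would like the loop to be a little more defensive, one possible variation is sketched below. It checks the status code that curl_multi_exec returns and pauses briefly when curl_multi_select reports -1, so the loop does not spin at full CPU. The function name run_multi_transfer and the one second timeout are assumptions for this example.

```php
<?php
// Hypothetical wrapper around the do-while loop; returns false on a multi handle error.
function run_multi_transfer($mh, $timeout = 1.0)
{
    $running = 0;
    do {
        $status = curl_multi_exec($mh, $running);
        // Older libcurl versions may ask to be called again straight away.
        if ($status !== CURLM_OK && $status !== CURLM_CALL_MULTI_PERFORM) {
            return false;
        }
        // Wait for activity on any transfer; back off briefly if select fails.
        if ($running > 0 && curl_multi_select($mh, $timeout) === -1) {
            usleep(100000);
        }
    } while ($running > 0);
    return true;
}
```

You would call it as run_multi_transfer($mh) in place of the loop above. A false return means the multi handle itself reported a problem, which is separate from an individual file failing to upload.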

Close all those file handles!

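Once everything has been transferred, we want to collect the responses and detach every curl handle that we added with curl_multi_add_handle! For this we use the curl_multi_remove_handle function, followed by curl_multi_close for the multi handle itself: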

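// Collect each response, then detach all the handles
// $count is the number of handles, so the valid indexes are 0 to $count - 1
for ($i = 0; $i < $count; $i++) {
        $result[$i]['data'] = curl_multi_getcontent($ch[$i]);
        $result_info[] = curl_getinfo($ch[$i]);
        curl_multi_remove_handle($mh, $ch[$i]);
}
curl_multi_close($mh);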
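The $result_info array collected above can also tell you which uploads did not go through. A minimal sketch, assuming each entry is an array as returned by curl_getinfo; the function name failed_transfers and the sample data are made up for this example.

```php
<?php
// Hypothetical helper: list the transfers that did not come back with a 2xx status.
function failed_transfers(array $result_info)
{
    $failed = array();
    foreach ($result_info as $index => $info) {
        $code = isset($info['http_code']) ? (int) $info['http_code'] : 0;
        if ($code < 200 || $code >= 300) {
            $failed[$index] = $code;
        }
    }
    return $failed;
}

// Made-up data shaped like curl_getinfo() output.
$sample = array(
    array('url' => 'https://destination.com/receive.php', 'http_code' => 200),
    array('url' => 'https://destination.com/receive.php', 'http_code' => 413),
    array('url' => 'https://destination.com/receive.php', 'http_code' => 0),
);
print_r(failed_transfers($sample));
```

The keys of the returned array line up with the indexes of $ch, so you can map a failure back to the file that was sent. A code of 0 usually means the request never got an HTTP response at all.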

Before we get into the receiving end, we wouldn't mind getting some output after everything is completed, preferably a confirmation message from the receiving end. You can obviously expand on all of this, once you get the fundamentals working.

echo print_r($result,1);

Receive the files sent over HTTP with PHP Curl

I want to keep these examples dead simple. The onus is on you to ensure things like security checks, authorization, encryption (HTTPS) and all those other best practices are followed.

To receive the file, I just want to iterate over each received file, create the sub folder if it doesn't exist, and return a message indicating that the process is complete.
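As one example of such a check, here is a rough sketch of a shared secret comparison. The function name is_authorized, the X-Transfer-Token header mentioned below and the sample values are all assumptions for this example, not something the rest of the code depends on. Note that hash_equals needs PHP 5.6 or newer.

```php
<?php
// Hypothetical check: compare a token supplied by the sender against a
// secret stored on the receiving server.
function is_authorized($provided, $expected)
{
    // hash_equals() compares in constant time, which avoids timing leaks.
    return is_string($provided)
        && $expected !== ''
        && hash_equals($expected, $provided);
}

var_dump(is_authorized('s3cret', 's3cret'));
var_dump(is_authorized('guess', 's3cret'));
var_dump(is_authorized(null, 's3cret'));
```

On the receiving end you could pass in $_SERVER['HTTP_X_TRANSFER_TOKEN'] and answer with a 403 when the function returns false. The sender would add the matching header to each handle with CURLOPT_HTTPHEADER.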

$uploaddir = '/the/receiving/folder/you/want/to/save/your/files';
// Pair each uploaded file ($value[0]) with the POST field sent alongside it ($value[1], the basepath)
$mi = new MultipleIterator();
$mi->attachIterator(new ArrayIterator($_FILES));
$mi->attachIterator(new ArrayIterator($_POST));

foreach($mi as $value) {
    // Create the sub folder, including any missing parent folders
    if (!is_dir($uploaddir . $value[1])) {
        mkdir($uploaddir . $value[1], 0755, true);
    }
    move_uploaded_file($value[0]['tmp_name'], $uploaddir . $value[1] . $value[0]['name']);
}

echo 'File transfer complete!';

We are using PHP's MultipleIterator class to iterate over the $_FILES and $_POST arrays in step, so each uploaded file is paired with the POST field that was sent alongside it.

Then we just iterate over the MultipleIterator, checking whether the sub folder exists and creating it if it doesn't, before moving the received file into that sub folder with the move_uploaded_file function.

At the end of the file we are echoing "File transfer complete!" which you should see when you run the sending code. If you wanted to output more verbose information for debugging you could try echoing the following within the foreach loop :

echo print_r($value[1],1);
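Since the sub folder comes straight out of the POST data, it is worth cleaning it up before handing it to mkdir. The following is only a sketch of one way to do that. The function name safe_subfolder is made up for this example, and it simply refuses any path that contains a ".." segment.

```php
<?php
// Hypothetical helper: normalise the posted sub folder and reject anything
// that tries to climb out of the upload directory.
function safe_subfolder($posted)
{
    $parts = array();
    foreach (explode('/', str_replace('\\', '/', (string) $posted)) as $part) {
        if ($part === '' || $part === '.') {
            continue; // ignore empty segments and "current folder" markers
        }
        if ($part === '..') {
            return false; // refuse path traversal outright
        }
        $parts[] = $part;
    }
    return '/' . ($parts ? implode('/', $parts) . '/' : '');
}

var_dump(safe_subfolder('/wp-content/themes/'));
var_dump(safe_subfolder('/../../etc/'));
```

In the receiving loop you could run $value[1] through this function and skip the file when it returns false. Wrapping $value[0]['name'] in basename would be a sensible companion check for the file name itself.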

There you have it! Recursively transferring files over HTTPS with PHP-Curl, no command line tools needed whatsoever.

Web Host Environment Restrictions

Many web hosting environments implement restrictions to keep scripts and end users from overusing resources. These restrictions are especially strict in shared web hosting environments, where many end users share the same resources in order to serve websites.

Common restrictions include execution time limits, file upload size limits and other limits in PHP's configuration. Similar restrictions can be implemented in web servers like Apache or Nginx, such as limits on how big an HTTP POST can be.
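To see which PHP limits a particular host enforces, you can read them with ini_get. The sketch below converts the shorthand values (like 8M) into bytes so they can be compared against file sizes. The function name ini_bytes is made up for this example, and it only covers the PHP side, not any limits set in Apache or Nginx.

```php
<?php
// Hypothetical helper: convert php.ini shorthand such as "8M" into bytes.
function ini_bytes($value)
{
    $value = trim((string) $value);
    if ($value === '') {
        return 0;
    }
    $number = (int) $value;
    switch (strtolower(substr($value, -1))) {
        case 'g': $number *= 1024; // fall through
        case 'm': $number *= 1024; // fall through
        case 'k': $number *= 1024;
    }
    return $number;
}

$limits = array(
    'upload_max_filesize' => ini_bytes(ini_get('upload_max_filesize')),
    'post_max_size'       => ini_bytes(ini_get('post_max_size')),
    'max_execution_time'  => (int) ini_get('max_execution_time'),
    'max_file_uploads'    => (int) ini_get('max_file_uploads'),
);
print_r($limits);
```

Run this on the receiving host, since that is where the upload limits apply. The execution time limit matters on both ends.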

How would you get around these types of things? Well you have to wrap your head around writing your code to anticipate some of these restrictions and work around them.

For example, with execution time limits you could implement libraries to run your PHP code as asynchronous background tasks (see this example done with WordPress).

You could also split files that exceed the max POST size or max file upload size restrictions into chunks and transfer them one chunk at a time with PHP. You have to get creative, but it's completely possible to write a dynamic, purely web based application that can clone your website files to a remote location with secure and clean code 🙂
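As a starting point for the chunking idea, here is a minimal sketch that reads a file in fixed size pieces using a generator. The function name file_chunks and the 1000 byte chunk size are assumptions for this example. How the chunks get posted and reassembled on the other end is left open.

```php
<?php
// Hypothetical helper: yield a file's contents in pieces of at most $chunk_size bytes.
function file_chunks($path, $chunk_size)
{
    $handle = fopen($path, 'rb');
    if ($handle === false) {
        return;
    }
    $index = 0;
    while (!feof($handle)) {
        $data = fread($handle, $chunk_size);
        if ($data === false || $data === '') {
            break;
        }
        yield $index++ => $data;
    }
    fclose($handle);
}

// Demo on a throwaway 2500 byte file, split into 1000 byte chunks.
$demo_file = tempnam(sys_get_temp_dir(), 'chunk');
file_put_contents($demo_file, str_repeat('x', 2500));
foreach (file_chunks($demo_file, 1000) as $index => $data) {
    echo $index, ': ', strlen($data), " bytes\n";
}
unlink($demo_file);
```

Each chunk could then be sent as its own POST along with its index, and the receiving script could append the chunks in order to rebuild the file. Only one chunk is held in memory at a time.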

Lastly, you could integrate an AJAX long poll to leverage the browser to keep restarting the transfer, continuing where the last transfer left off until it's complete. That might take some more research to implement, but could possibly be more reliable than other workarounds.
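The server side of that idea could be as simple as handing out one batch of files per request, together with the offset to resume from. This is only a sketch of the bookkeeping. The function name next_batch, the batch size and the sample file list are assumptions for this example.

```php
<?php
// Hypothetical helper: return one batch of files plus the offset to resume from.
function next_batch(array $files, $offset, $batch_size)
{
    $batch = array_slice($files, $offset, $batch_size);
    $next  = $offset + count($batch);
    return array(
        'files' => $batch,
        'next'  => $next,
        'done'  => $next >= count($files),
    );
}

$all  = array('/a.php', '/b.php', '/c.php', '/d.php', '/e.php');
$step = next_batch($all, 0, 2);
echo json_encode($step), "\n";
```

The browser would keep calling the script with the last "next" value it received until "done" comes back true, so each request stays well inside the execution time limit.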

I hope this was helpful and informative! Coming soon: How to migrate your database with PHP to a remote location over HTTPS!

shift8 web toronto – 416-479-0685
203A-116 geary ave. toronto, on M6H 4H1, Canada
© 2023. All Rights Reserved by Star Dot Hosting Inc.
