How to export and import field collection data in Drupal

Hello!

There are many great tools out there that make importing and exporting content into Drupal nodes very easy. We deal with custom Drupal content often and have made extensive use of many of the (amazing) tools available, such as node export, views data export and feed import, the last one specifically with the field collection feeds add-on.

With all of these solutions, we had limited to no success exporting our custom content together with its field collection data. Our field collections contain fields that are configured to store multiple values as an array, which may be why. We did not go on to test the failing tools against other combinations of field options in our field collections; they simply didn't work for our setup.

So we looked into how hard it would be to write our own drush command that would integrate easily into our code push system (the system through which code and data are propagated from our staging to production environments). All we needed was a simple way to export Drupal content to a CSV and then import from the CSV back into Drupal whenever needed.

We thought we could save other people most of the trouble we went through putting this together by sharing our process and the code we used to create the drush command. Ultimately we could release this as a Drupal module for a wider audience if there's enough demand for it.

Export data to a CSV

The easier part of this process involves pulling all the node data and exporting it as a CSV for later use. Putting a CSV together in PHP is relatively straightforward, but I'll paste the export function below as a reference and explain it further down:

function star_export() {

        $fc_name = drush_get_option('c');

        // we will use SPL to send our data to STDOUT formatted as CSV
        $fout = new \SplFileObject("php://stdout");
        // write our headers
        $fout->fputcsv([
                'nid', 'title', 'path_alias', 'field_collection_name', 'assigned_nodes'
        ]);
        // use a generator to loop through nodes
        foreach (nodes_generator($fc_name) as $node) {
                $fout->fputcsv([
                // nid
                $node[0]->nid,
                // title
                $node[0]->title,
                // path alias
                url('node/' . $node[0]->nid),
                // field collection name
                $fc_name,
                // assigned nodes (empty string when the node has no field collection items)
                isset($node[1]) ? $node[1] : ''
                ]);
        }
}

This is the function triggered by an "export" command within drush. You can see we're looking for an option called "c", passed as --c="field_collection_name". Because this module was written for our own use, we didn't put much emphasis on implementing a dynamic command that reads all field collections for a content type and then parses all their fields. If someone wanted to fork our code on GitHub (link at bottom) and contribute that kind of dynamic functionality, that would be most helpful.
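For readers who haven't registered a drush command before, here is a minimal sketch of what the command definition could look like in a Drupal 7 command file. The file name (star.drush.inc), the command names star-export and star-import, and the descriptions are assumptions made for illustration; only the callbacks (star_export, star_importer) and the option names (c, f) come from the code in this post.

```php
<?php
/**
 * Implements hook_drush_command().
 *
 * Hypothetical definition for a "star.drush.inc" command file. Command names
 * and descriptions are illustrative assumptions; the callbacks and the
 * "c" / "f" options match the functions shown in this post.
 */
function star_drush_command() {
  $items = array();

  $items['star-export'] = array(
    'description' => 'Print nodes and one field collection as CSV on STDOUT.',
    // Point drush at our function instead of the default drush_star_export().
    'callback' => 'star_export',
    'options' => array(
      'c' => 'Machine name of the field collection field to export.',
    ),
    'examples' => array(
      'drush star-export --c=field_collection_grp1 > export.csv' => 'Save the export to export.csv.',
    ),
  );

  $items['star-import'] = array(
    'description' => 'Create nodes and field collection items from a CSV file.',
    'callback' => 'star_importer',
    'options' => array(
      'f' => 'Path to the CSV file to import.',
    ),
  );

  return $items;
}
```

With a definition along these lines, the export would be started with something like drush star-export --c=field_collection_grp1 and redirected into a file, since the export function writes to STDOUT.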

You can also see that we have a simple format for our CSV :

$fout->fputcsv([
                'nid', 'title', 'path_alias', 'field_collection_name', 'assigned_nodes'
        ]);

Everything's pretty self-explanatory, right? Well, you'll notice in the code snippet that I'm calling a function called nodes_generator. This is where most of the magic happens when exporting the data.

function nodes_generator($fc_name) {
        static $count; // running tally of loaded nodes (not used further in this version)
        // query for all published nodes of type "custom_content"
        $query = new EntityFieldQuery();
        $query->entityCondition('entity_type', 'node')
                ->propertyCondition('type', 'custom_content')
                ->propertyCondition('status', 1);
        $result = $query->execute();
        if (!empty($result)) {
                foreach ($result['node'] as $nid => $row) {
                        $count++;
                        // TRUE will reset static cache to keep memory usage low
                        $node = node_load($row->nid, null, TRUE);
                        $collections = array();
                        $field_collection_fields = field_get_items('node', $node, $fc_name);
                        $collection_item_final = array();

                        if (!empty($field_collection_fields)) {
                                $cnt = 0;
                                foreach ($field_collection_fields as $field_collection_field) {
                                        $item = field_collection_field_get_entity($field_collection_field);
                                        $collection_item = $item->field_first_option['und'][0]['value'] . ',' .  $item->field_second_option['und'][0]['value'];
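                                        // only add a "^" divider when the node has more than one collection item
                                        $field_divider = (count($field_collection_fields) >= 2) ? '^' : '';
                                        // the outer loop runs once per language key of the field (normally just 'und')
                                        for ($i = 0;$i < count($item->field_grp_assigned);$i++) {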
                                                $collection_items = array();
                                                for ($j = 0;$j < count($item->field_grp_assigned['und']);$j++) {
                                                        $collection_items[] = $item->field_grp_assigned['und'][$j]['target_id'];
                                                }
                                                // dont add divider if last iteration of loop
                                                if ($cnt == count($field_collection_fields) - 1) {
                                                        $collection_item_final[] = $collection_item . ',' . implode('|', $collection_items);
                                                } else {
                                                        $collection_item_final[] = $collection_item . ',' . implode('|', $collection_items) . $field_divider;
                                                }

                                        }
                                        $cnt++;
                                }
                                // return assigned collection
                                yield array($node, implode($collection_item_final));
                        }  else {
                                // return node without assignments
                                yield array($node);
                        }
                }
        }
        return;
}

The bulk of the work done in the above nodes_generator function is pulling the fields from the provided field collection (passed from the export function that triggers the nodes_generator function) and dealing with the data by way of a few nested for-loops.
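To make the flattening step easier to follow outside of Drupal, here is a simplified sketch that works on plain arrays shaped like Drupal 7 field data (field name, then 'und', then delta, then column). The helper names star_flatten_item and star_flatten_items are made up for this example, and it is a simplification rather than a drop-in replacement for the loops above.

```php
<?php
/**
 * Flattens one field collection item into "first,second,id|id|id".
 *
 * $item is a plain array that mimics Drupal 7 field storage; a real export
 * would read the same keys from the field collection entity instead.
 */
function star_flatten_item(array $item) {
  $first = $item['field_first_option']['und'][0]['value'];
  $second = $item['field_second_option']['und'][0]['value'];
  // Collect every target_id of the multi-value reference field, if any.
  $assigned = isset($item['field_grp_assigned']['und'])
    ? array_column($item['field_grp_assigned']['und'], 'target_id')
    : array();
  return $first . ',' . $second . ',' . implode('|', $assigned);
}

/**
 * Joins all flattened items of one node with the "^" divider.
 */
function star_flatten_items(array $items) {
  return implode('^', array_map('star_flatten_item', $items));
}

// Two sample collection items; the second one has no assigned nodes.
$items = array(
  array(
    'field_first_option' => array('und' => array(array('value' => 'option 1'))),
    'field_second_option' => array('und' => array(array('value' => 'option 2'))),
    'field_grp_assigned' => array('und' => array(
      array('target_id' => 3),
      array('target_id' => 4),
    )),
  ),
  array(
    'field_first_option' => array('und' => array(array('value' => 'a'))),
    'field_second_option' => array('und' => array(array('value' => 'b'))),
    'field_grp_assigned' => array(),
  ),
);

echo star_flatten_items($items), PHP_EOL; // option 1,option 2,3|4^a,b,
```

Letting implode() place the "^" between items avoids tracking whether the current item is the last one.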

Again, this function is not very dynamic. It would either have to be adapted to your custom content, or extended so that additional arguments and options on the drush command control the output. When run, the command prints CSV to STDOUT (redirect it to a file to keep it), with output that looks similar to the following:

nid,title,path_alias,field_collection_name,assigned_nodes
0001,"node title 1",/en/node/0001,,
0002,"node title 2",/en/node/0002,field_collection_grp1,"option 1,option 2,0003|0004|0005|0006"

The "assigned_nodes" column holds the field collection data I'm pulling and exporting. The "field_collection_name" column is self-explanatory. The values "option 1" and "option 2" come from fields in the collection group that contain only one value each (not multiple values, as with the last field).

The last field in the CSV holds the assigned NIDs from an Entity reference views widget that allows the user to search for nodes and "Assign" them to the field. All I want to export for this is the node IDs that serve as the "target_id" reference in the backend. You can see each field in the CSV is separated by a comma. Within the last field, the individual node IDs are further separated by the pipe character so they can be split again during import, and when a node has more than one field collection item, the items are separated by a caret (^).
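Since the collection data itself contains commas, it helps to see how fputcsv keeps it inside a single column. The sketch below writes one row to an in-memory file (SplTempFileObject stands in for php://stdout so the row can be read back) and parses it again with str_getcsv. The sample values are illustrative.

```php
<?php
// Write to a temporary in-memory file instead of STDOUT so we can read it back.
$out = new SplTempFileObject();
$out->fputcsv(array('nid', 'title', 'path_alias', 'field_collection_name', 'assigned_nodes'));
$out->fputcsv(array(2, 'node title 2', '/en/node/2', 'field_collection_grp1', 'option 1,option 2,3|4|5|6'));

// Read the two lines back as raw text.
$out->rewind();
$header_line = rtrim($out->fgets(), "\n");
$row_line = rtrim($out->fgets(), "\n");

// fputcsv wrapped the last value in quotes because it contains commas and spaces.
echo $row_line, PHP_EOL;
// 2,"node title 2",/en/node/2,field_collection_grp1,"option 1,option 2,3|4|5|6"

// Parsing the line restores five columns, with the collection data intact.
$row = str_getcsv($row_line);
echo count($row), PHP_EOL; // 5
echo $row[4], PHP_EOL;     // option 1,option 2,3|4|5|6
```

The quoting is what lets the import split on commas a second time, inside the last column only.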

Import data from the CSV

The export process again is much simpler than the import process. For the import process, we need to specify a source file and then we need to loop through all the CSV items in order to prepare the data to create a new node and save the data. All the data creation and manipulation is done using Drupal’s Entity metadata wrappers.

function star_importer() {

        $file = drush_get_option('f');
        if (!empty($file) && file_exists($file)) {
                $csv_data = array_map('str_getcsv', file($file));
                // loop through array and skip first line
                for ($i = 1; $i < count($csv_data);$i++) {
                        $nid = $csv_data[$i][0];
                        $title = $csv_data[$i][1];
                        $path_alias = $csv_data[$i][2];
                        $fc_name = $csv_data[$i][3];
                        $fc = $csv_data[$i][4];

                        // fail if key fields are not present
                        if (empty($nid) || empty($title)) {
                                return 'Key fields are missing from array '.$i.'. Check the file and try again';
                        }

                        // create a new node and assign the data
                        $values = array(
                                'type' => 'custom_content',
                                'uid' => 1,
                                'status' => 0,
                                );
                        $entity = entity_create('node', $values);
                        $ewrapper = entity_metadata_wrapper('node', $entity);
                        $ewrapper->title->set($title);
                        // if there's field collection defined
                        if (!empty($fc)) {
                                $fc_items = explode('^', $fc);
                                foreach ($fc_items as $fc_item) {
                                        $fc_array = explode(',', $fc_item);
                                        $first_opt = $fc_array[0];
                                        $second_opt = isset($fc_array[1]) ? $fc_array[1] : '';
                                        // node IDs assigned to this collection item (empty entries dropped)
                                        $fc_nodes = isset($fc_array[2]) ? array_filter(explode('|', $fc_array[2])) : array();
                                        if (empty($first_opt) || empty($second_opt)) {
                                                echo "This assignment is missing either the first or second field option, skipping ..\n";
                                                continue;
                                        }
                                        $collection = entity_create('field_collection_item', array('field_name' => $fc_name));
                                        $collection->setHostEntity('node', $entity);
                                        $cwrapper = entity_metadata_wrapper('field_collection_item', $collection);
                                        // same field names as read by the export (field_first_option, field_second_option)
                                        $cwrapper->field_first_option->set($first_opt);
                                        $cwrapper->field_second_option->set($second_opt);
                                        $cwrapper->field_grp_assigned->set($fc_nodes);
                                        $cwrapper->save();
                                }
                        }

                        // save
                        $ewrapper->save();
                }
        } else {
                return 'No file given to import or file does not exist';
        }
}

There's quite a lot going on in the above function! Sometimes nested for-loops can make one's head spin. However, it's the most direct way we found to deal with the multi-dimensional arrays as well as the multitude of arrays themselves. There could be many more sanity checks and whatnot, but again, remember this is just a proof of concept.
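As an idea of what such sanity checks could look like, here is a small sketch that validates one parsed CSV row before anything is created in Drupal. The function name star_validate_row and the specific rules are assumptions for illustration; the column order follows the export header.

```php
<?php
/**
 * Returns a list of problems found in one parsed CSV row (empty if none).
 *
 * Hypothetical helper; expects the columns
 * nid, title, path_alias, field_collection_name, assigned_nodes.
 */
function star_validate_row(array $row) {
  $errors = array();
  if (count($row) < 5) {
    // Without all columns the remaining checks would read missing offsets.
    $errors[] = 'expected 5 columns, got ' . count($row);
    return $errors;
  }
  // The path alias (third column) is not checked here.
  list($nid, $title, , $fc_name, $fc) = $row;
  if (!preg_match('/^\d+$/', (string) $nid)) {
    $errors[] = 'nid is not numeric';
  }
  if (trim($title) === '') {
    $errors[] = 'title is empty';
  }
  if ($fc !== '' && $fc_name === '') {
    $errors[] = 'field collection data without a field collection name';
  }
  return $errors;
}

// A well-formed row passes, a row with a blank title is reported.
var_dump(star_validate_row(array('2', 'node title 2', '/en/node/2', 'field_collection_grp1', 'a,b,3|4')));
var_dump(star_validate_row(array('2', ' ', '/en/node/2', '', '')));
```

Collecting the problems per row, rather than returning on the first bad row, would let the import skip bad rows and report them all at the end.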

The main thing that should be discussed here within the context of the above function is the field collection portion of the function. Basically what took a bit of troubleshooting was how to take the data saved from the CSV and then create a field collection reference for the newly created node and save all the field collection data, including the multi-valued field_grp_assigned collection field.

Remember that CSV field that is separated by the pipe character? Well, we use PHP's explode function to convert it into a proper array and then use the entity wrapper's set function to assign the array to the field. The other values we set are either single-value fields in the collection or regular fields on the node itself.
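The parsing side can be tried out on its own, without Drupal. The sketch below turns the content of the "assigned_nodes" column back into an array per collection item; the function name star_parse_assignments and the array keys are made up for this example.

```php
<?php
/**
 * Parses "first,second,id|id^first,second,id" into one array per item.
 *
 * Hypothetical helper mirroring the explode() calls of the importer.
 */
function star_parse_assignments($fc) {
  $items = array();
  if ($fc === '') {
    return $items;
  }
  foreach (explode('^', $fc) as $fc_item) {
    // Split into at most 3 parts and pad, so missing parts become ''.
    $parts = array_pad(explode(',', $fc_item, 3), 3, '');
    list($first, $second, $ids) = $parts;
    // Drop the empty string that explode() returns when no IDs were exported.
    $nodes = array_values(array_filter(explode('|', $ids), 'strlen'));
    $items[] = array(
      'first' => $first,
      'second' => $second,
      'nodes' => array_map('intval', $nodes),
    );
  }
  return $items;
}

$parsed = star_parse_assignments('option 1,option 2,3|4|5^a,b,');
print_r($parsed);
```

Each resulting 'nodes' array is what would be handed to the wrapper's set call for the multi-value field. Note that this format only works as long as the option values themselves contain no comma or caret.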

This took some time to put together, so I hope it's of use to someone out there! Feel free to check out the GitHub repo for the above code. Contribute or fork it into a functional + dynamic module! 🙂

GitHub Repository