Thursday, 6 November 2014

Magento Part 5 - Bulk Importing



This part isn't really related to Magento performance, but it does provide some helpful recommendations about efficiency when it comes to importing bulk products/images into Magento.

Magmi (Magento Mass Importer)

If you need to import products in bulk (once-off or frequently), then this is the extension for you; if not, feel free to skip this section. I have set up many different Magento sites using MAGMI, from sites that import over 80,000 product updates each night to sites that update products from their ERP system constantly throughout the day.

MAGMI essentially uses SQL to import your products into Magento REALLY FAST. I have been able to import around 15,000 products in 10-15 seconds.

Some handy plugins to enable in MAGMI are:
  • On the fly category creator/importer (This will automatically create categories for you)
  • On the fly indexer v0.1.5 (This will index as you import products)

Bulk image import

I don't want to go into too many details, but here are 2 options to import bulk images into Magento.

Use Magmi to import images

Pros

  • Can import images from a URL (remotely scrape and download images from URLs).
  • It is quite fast.

Cons

  • It's really fiddly to get working.
  • If you only have 1 image, you need to import it 3 times (for the base_image, small_image & thumbnail image).
  • I had problems with it correctly setting the 'Base Image', 'Thumbnail', 'Image' radio buttons for the media gallery.
  • I found it would only import the image to the filesystem (I couldn't get it to upload to the database file storage).
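
One way around the "import it 3 times" con is to pre-process the CSV so that the single image value is copied into the other image columns before Magmi reads it. The following is only a minimal sketch in plain PHP, not a Magmi feature: the function name `replicateImageColumn` and the file paths are hypothetical, and it assumes the input has an "image" column and does not already contain "small_image" or "thumbnail" columns.

```php
<?php
/**
 * Copy each row's "image" value into new "small_image" and "thumbnail" columns.
 * Assumption: the column names match the image attribute codes your import expects.
 *
 * @param string $inputCsv  Path to a CSV whose header row contains "image"
 * @param string $outputCsv Path of the CSV to write
 * @return int Number of data rows written
 */
function replicateImageColumn($inputCsv, $outputCsv)
{
    $in = fopen($inputCsv, 'r');
    $out = fopen($outputCsv, 'w');
    if ($in === false || $out === false) {
        throw new RuntimeException('Could not open the input or output file');
    }

    // Locate the "image" column in the header row.
    $header = fgetcsv($in);
    $imageIndex = is_array($header) ? array_search('image', $header, true) : false;
    if ($imageIndex === false) {
        throw new RuntimeException('Input CSV has no "image" column');
    }

    fputcsv($out, array_merge($header, array('small_image', 'thumbnail')));

    $rows = 0;
    while (($row = fgetcsv($in)) !== false) {
        if ($row === array(null)) {
            continue; // fgetcsv returns array(null) for a blank line
        }
        $image = isset($row[$imageIndex]) ? $row[$imageIndex] : '';
        fputcsv($out, array_merge($row, array($image, $image)));
        $rows++;
    }

    fclose($in);
    fclose($out);

    return $rows;
}
```

Calling `replicateImageColumn('products.csv', 'products-with-images.csv')` would write a second file with the two extra columns appended. Magmi's own plugins may be able to do this replication for you, so treat this as an alternative rather than the recommended route.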

DataFlow Profiles Image Import

Pros

  • Magento built-in functionality.
  • Can import images.
  • Can store images in the database file storage (if enabled in Magento admin).

Cons

  • It's a little slow.
  • Same as MAGMI, you cannot import a single image and set it for 'Base image', 'Small image' & 'Thumbnail' in the gallery of a product.
  • It automatically triggers a re-index after uploading (this could be a 'pro' in some scenarios).

An approach I've used

I ended up using Magento dataflow profiles to import a single image for a product, and then used a custom script to set that image correctly in the media gallery for all image types. A little hacky, but I'm sure there are a few of you out there who may want to do something similar.

Basic steps

  1. Create a new DataFlow profile in System -> Import/Export -> DataFlow Profiles.
  2. Create a CSV with 2 columns, with the headers "sku","image".
  3. Create a record for each SKU you want to import an image for, e.g. "1023819","/1023819.jpg". Note the '/' at the start of the image filename.
    Hint: Test with just 1 image first.
  4. Upload all of the images that correspond to what you entered in the CSV to the /media/import/ directory in Magento.
  5. Upload the CSV file in the profile's 'Upload File' tab.
  6. Once uploaded, run the profile in the 'Run Profile' tab.
  7. Once the import has completed, run the script below to set the 'Base image', 'Small image' & 'Thumbnail image' to the first image in the gallery (if not already set).
Create a PHP file in the root Magento directory called 'script-fix-product-images.php' and copy this content in (run at your own risk and always test first):
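<?php

// Bootstrap Magento; the path assumes this file lives in the Magento root directory.
require_once 'app/Mage.php';
Mage::app()->setCurrentStore(Mage_Core_Model_App::ADMIN_STORE_ID);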

$entityTypeId = Mage::getModel('eav/entity')->setType('catalog_product')->getTypeId();
$mediaGalleryAttribute = Mage::getModel('catalog/resource_eav_attribute')->loadByCode($entityTypeId, 'media_gallery');

// All products
$products = Mage::getModel('catalog/product')->getCollection()->addAttributeToSelect('thumbnail');

$productCount = count($products);

foreach ($products as $product) {

    $product = Mage::getModel('catalog/product')->load($product->getId());

    $gallery = $product->getMediaGalleryImages();

    if( NULL === $product->getThumbnail() || 'no_selection' === $product->getThumbnail() ||
         NULL === $product->getImage() || 'no_selection' === $product->getImage()) {

        $paths = array();
        foreach ($gallery as $image) {
            $paths[] = $image->getFile();
        }
        sort($paths);

        $path = array_shift($paths);

        if( NULL !== $path ) {
            try {
                $product->setSmallImage($path)
                        ->setThumbnail($path)
                        ->setImage($path)
                        ->save();
            } catch (Exception $e) {
                echo $e->getMessage();
            }
        } else {
            var_dump('No Base Image ' . $product->getId() . ' of ' . $productCount . ' [' . $product->getSku() . '] [' . $product->getThumbnail() . ']');
        }
    }

    // If product has multiple images, delete all but the first one (optional if you only ever have 1 image per product like one of my clients):
    if( count($gallery->getItems()) > 1 ) {
        $index = 1;
        foreach( $gallery->getItems() as $item ) {
            if( $index > 1 ) {
                $mediaGalleryAttribute->getBackend()->removeImage($product, $item->getFile());
                $product->save();
            }
            $index++;
        }
    }
}

print 'Images Done';
exit();

This isn't the most straightforward process, but it does work and gets around some of the frustrations of trying to bulk upload images in Magento. There are extensions out there that apparently do this.
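
If you have a lot of images, the "sku","image" CSV from the basic steps can be generated rather than typed by hand. The sketch below is a rough illustration only: the function name `buildImageCsv` is hypothetical, and it assumes every file in the import directory is named after its SKU with a lowercase .jpg extension (as in the 1023819.jpg example), which may not match your data.

```php
<?php
/**
 * Write a "sku","image" CSV with one row per .jpg file in the import directory.
 * Assumption: each file is named <sku>.jpg, so the SKU is the filename minus extension.
 *
 * @param string $importDir Directory holding the images (e.g. Magento's media/import)
 * @param string $outputCsv Path of the CSV to write
 * @return int Number of image rows written
 */
function buildImageCsv($importDir, $outputCsv)
{
    $files = glob(rtrim($importDir, '/') . '/*.jpg');
    if ($files === false) {
        $files = array();
    }
    sort($files);

    $out = fopen($outputCsv, 'w');
    if ($out === false) {
        throw new RuntimeException('Could not open the output file');
    }

    fputcsv($out, array('sku', 'image'));

    $count = 0;
    foreach ($files as $file) {
        $name = basename($file);
        $sku = pathinfo($name, PATHINFO_FILENAME);
        // The leading '/' on the image value matches the format used in the steps above.
        fputcsv($out, array($sku, '/' . $name));
        $count++;
    }

    fclose($out);

    return $count;
}
```

For example, `buildImageCsv('media/import', 'images.csv')` would produce a file ready to upload to the profile. Note that `fputcsv` only quotes fields when it needs to, so the output will not have quotes around every value; that is still valid CSV.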


1 comment:

  1. You can configure Magmi to replicate the "image" value automatically to "small_image" & "thumbnail", using either the column mapper plugin or the value replacer plugin,
    so your input data only needs to contain a single "image" column.
