Thursday, 6 November 2014

Magento Part 2 - Prepare for Scalability



Use a continuous integration/deployment server

Whenever you deploy a Magento store, there are a number of scripts and processes you need to run to ensure the environment is set up and running correctly (you really don't want to be doing this manually on each release). I use Capistrano for all code deployments; it's lightweight and relatively easy to set up. Below are some of the tasks the deployment process performs:
  • Swaps in production or staging configuration files (discussed below).
  • Clears the file cache.
  • Clears memcache.
  • Clears the Varnish cache.
  • Removes previous builds.
  • Sets correct file permissions on directories (for uploads, imports, etc.).

How to set up

deploy.rb

set :application, "APPLICATION_NAME"
set :scm, :git
set :repository, "YOUR_GIT_REPOSITORY"
set :user, "USER"
set :use_sudo, false
set :deploy_via, :remote_cache
set :copy_exclude, ['.git']
set :ssh_options, {:forward_agent => true}
set :keep_releases, 5

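set :stages, ["staging", "production"]
set :default_stage, "staging"
# Loads multistage support from capistrano-ext (skip if your Capfile already requires it)
require 'capistrano/ext/multistage'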

default_run_options[:pty] = true

namespace :cache do
  desc "Clear Magento cache\nUsage: cap [stage] cache:clear -s type=[all|image|data|stored|js_css|files]"
  task :clear do
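    # `-s type=...` is optional; fetch supplies a default instead of failing when it is omitted
    requested = fetch(:type, "all").to_s
    if requested.empty? || requested == "all"
      cache_type = "all"
    else
      cache_type = "--clean #{requested}"
    end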
    run "if [ -e '#{current_path}/magento/shell/clearCache.php' ]; then cd #{current_path}/magento/shell && php clearCache.php -- #{cache_type}; else rm -rf #{current_path}/magento/var/cache/* ; fi "
  end

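  task :flush do
    # Note: this calls cleanCache.php, whereas cache:clear above calls clearCache.php;
    # check that each script name matches what exists in your shell/ directory.
    run "cd #{current_path}/magento/shell && php cleanCache.php -- flush"
  end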

  task :varnish do
    #run "if [ -e '#{current_path}/magento/shell/varnish.php' ]; then cd #{current_path}/magento/shell && php varnish.php -- apply; fi "
  end
end

# Clean up old releases and refresh the caches after each deploy
# (comment out deploy:cleanup if you want to keep every release):
after "deploy:restart", "deploy:cleanup"
after "deploy:restart", "cache:varnish"
after "deploy:restart", "cache:flush"

deploy/production.rb

server "localhost", :app, :web, :db, :primary => true
set :deploy_to, "DEPLOY_DIRECTORY"
set :branch, "master"

namespace :deploy do
  task :restart, :roles => :web do
    # Copy production config into local.xml
    run "cp #{ current_path }/magento/app/etc/local.xml.production #{ current_path }/magento/app/etc/local.xml"
    run "cp #{ current_path }/magento/errors/local.xml.sample #{ current_path }/magento/errors/local.xml"
    #run "cp #{ current_path }/magento/downloader/connect.production.cfg #{ current_path }/magento/downloader/connect.cfg"
    run "cp #{ current_path }/varnish/default.vcl.production #{ current_path }/varnish/default.vcl"
    run "cp #{ current_path }/varnish/secret.production #{ current_path }/varnish/secret"
    run "chmod 755 #{ current_path }/magento"
    run "chmod 755 #{ current_path }/magento/media"
    run "mkdir -p #{ current_path }/magento/media/catalog/product"
    run "mv #{ current_path }/magento/robots.txt.production #{ current_path }/magento/robots.txt"
    run "php -f #{ current_path }/magento/shell/compiler.php -- compile"
  end
end
Now check out this repository onto your production servers.

To set up

Run the following command in the root directory (the stage name is needed because the default stage is staging):
cap production deploy:setup

To deploy

Run:
cap production deploy

Support for multiple environments

Magento doesn't make it easy to support multiple environments (development, testing, production), but it can be achieved without much effort if you use a deployment server as described above.
In the /app/etc/ directory you can set up something like:

Config files

  • local.xml.development.example (Example config for dev)
  • local.xml.development (use .gitignore so this file is not checked in)
  • local.xml.staging (For staging environment)
  • local.xml.testing (For unit testing environment)
  • local.xml.production (For production environment)
Use a .gitignore to stop people from committing local.xml.

The continuous integration server swaps in the correct config file based on the environment (as you can see in the deploy scripts above).
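The swap itself is just a file copy keyed on the stage name. Here is a minimal sketch of that idea in plain Ruby, assuming the file naming from the list above; `config_swap_command` and `KNOWN_STAGES` are hypothetical names of mine, not part of Capistrano or Magento.

```ruby
require "shellwords"

# Stage names mirror the config files listed above.
KNOWN_STAGES = %w[development staging testing production].freeze

# Hypothetical helper: returns the shell command that copies
# local.xml.<stage> over local.xml in the given app/etc directory.
def config_swap_command(app_etc_dir, stage)
  stage = stage.to_s
  raise ArgumentError, "unknown stage: #{stage}" unless KNOWN_STAGES.include?(stage)

  source = File.join(app_etc_dir, "local.xml.#{stage}")
  target = File.join(app_etc_dir, "local.xml")
  "cp #{Shellwords.escape(source)} #{Shellwords.escape(target)}"
end

# Example path only; substitute your own deploy directory.
puts config_swap_command("/var/www/shop/current/magento/app/etc", :staging)
```

Inside a Capistrano task the returned string could be handed to `run`, which would let one task serve every stage instead of hard-coding the production file name.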

System logging

Magento stores its errors in a variety of places (/var/report/, the PHP error log, the Apache error log). When error logs are scattered like this it takes longer to debug issues, and when something goes wrong, time isn't something you have on your side. If you are running Magento on multiple servers it's even harder to debug!

I have built a "system_log" extension which stores every single error message in a system_log MySQL table using delayed writes (so as to have minimal impact on the live site). This database table aggregates the errors from all of your Magento servers, making it much easier to debug issues. You could look at pushing these errors to external providers if you don't have the in-house resources to run higher-spec servers (PaperTrailApp and Loggly are worth a look).

Another helpful feature is a simple cron that runs every minute and emails the software team if any error messages have occurred (excluding debug messages). This saves someone checking the system_log table constantly. Because this is an asynchronous service, even if something goes horribly wrong and you trigger thousands of errors, you will only be sent one combined email of errors per minute (so it won't bring your server down).
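The batching step can be sketched roughly as follows. This is not the extension's code, only an illustration in Ruby: rows are plain hashes keyed by the system_log column names shown in the DB migrations section of this post, and the "DEBUG" priority name is an assumption of mine.

```ruby
# Hypothetical helper: turns one minute's worth of log rows into a single
# email body, or nil when there is nothing worth sending.
def build_digest(rows, ignored_priorities = ["DEBUG"])
  # Drop debug-level rows so they never trigger an email.
  errors = rows.reject do |row|
    ignored_priorities.include?(row["PriorityName"].to_s.upcase)
  end
  return nil if errors.empty?

  lines = errors.map do |row|
    "[#{row["Created"]}] #{row["PriorityName"]}: #{row["Message"]}"
  end
  "#{errors.size} error(s) in the last minute\n" + lines.join("\n")
end

sample = [
  { "Message" => "Payment gateway timeout", "PriorityName" => "ERR",   "Created" => "2014-11-06 10:00:01" },
  { "Message" => "Cache warmed",            "PriorityName" => "DEBUG", "Created" => "2014-11-06 10:00:02" }
]
puts build_digest(sample)
```

However many rows arrive in a minute, the cron sends at most one message, which is what keeps an error storm from turning into an email storm.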

If you are interested in this 'system_log' extension send me a message.

DB migrations

You may already know this, but you don't need to create standalone SQL files to manage migrations in Magento. For database/SQL migrations set up a directory:

app/code/local/ORGANISATION/MODULE/sql/ORGANISATION_MODULE_setup

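Inside this directory create a PHP setup script (for example install-0.1.0.php, where the version matches the one declared in the module's config.xml) that runs the migration's SQL.

e.g.:
<?php
$installer = $this;
$installer->startSetup();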

$installer->run("
CREATE TABLE IF NOT EXISTS {$this->getTable('system_log')} (
  `SystemLogId` int(10) NOT NULL AUTO_INCREMENT,
  `Message` text NOT NULL,
  `PriorityName` varchar(100) DEFAULT NULL,
  `PriorityLevel` int(5) DEFAULT NULL,
  `UserIp` varchar(50) DEFAULT NULL,
  `UserHost` varchar(250) DEFAULT NULL,
  `SectionId` int(10) NOT NULL DEFAULT '1',
  `Attachment` text,
  `Created` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP,
  PRIMARY KEY (`SystemLogId`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8;

CREATE TABLE IF NOT EXISTS {$this->getTable('system_log_section')} (
  `SystemLogSectionId` int(10) NOT NULL AUTO_INCREMENT,
  `Name` varchar(50) NOT NULL,
  PRIMARY KEY (`SystemLogSectionId`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8;

INSERT INTO {$this->getTable('system_log_section')} (`SystemLogSectionId`, `Name`)
VALUES
 (1,'Default'),
 (2,'Magmi');
");

$installer->endSetup();


I found very little documentation about these SQL migrations, but nearly all modules use them in the Magento world, so it's fairly easy to find some examples.
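One thing worth knowing is that these scripts are versioned: in Magento 1 the file names follow the pattern install-<version>.php and upgrade-<from>-<to>.php, and the scripts that run depend on the installed version versus the version in the module's config.xml. The sketch below is a simplified Ruby illustration of that selection idea, not Magento's actual resolver; `pending_upgrades` is a hypothetical name.

```ruby
require "rubygems" # provides Gem::Version for version comparison

# Hypothetical helper: given the script file names in a setup directory,
# returns the upgrade scripts needed to move from `installed` to `target`,
# ordered by the version each one upgrades to.
def pending_upgrades(filenames, installed, target)
  installed_v = Gem::Version.new(installed)
  target_v    = Gem::Version.new(target)

  candidates = filenames.map do |name|
    match = name.match(/\Aupgrade-([\d.]+)-([\d.]+)\.php\z/)
    next unless match

    from = Gem::Version.new(match[1])
    to   = Gem::Version.new(match[2])
    [to, name] if from >= installed_v && to <= target_v
  end

  candidates.compact.sort_by(&:first).map(&:last)
end

files = ["install-0.1.0.php", "upgrade-0.2.0-0.3.0.php", "upgrade-0.1.0-0.2.0.php"]
puts pending_upgrades(files, "0.1.0", "0.3.0")
```

In practice this means a schema change is shipped by adding a new upgrade script and bumping the module version, rather than by editing the original install script.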

File storage, sessions and images

For any system to scale, you need to store your data (or file assets in this case) in a central location (a single server or a cluster of servers). Why? If you move from a single-server environment to a multi-server environment while keeping images, session files, etc. on local disk, those files will exist on only one server rather than on all of them, which causes problems.

Sessions

In Magento you should use database session storage so your users' sessions persist across multiple servers. Within your local.xml file, set database storage for sessions:

<config>
    <global>
        <session_save>db</session_save>
    </global>
</config>

A good article about the performance of the different session storage options:
http://magebase.com/magento-tutorials/magento-session-storage-which-to-choose-and-why/

Image Assets (product images)

I typically store all product images in the Magento database (or a central storage platform that isn't the filesystem). Before you worry about performance, read on. Magento has 'partial' functionality to store images in the database, but it is nowhere near complete. A team I was working with ended up developing our own extension to enhance the Magento core so we had a more seamless integration with database storage for product images.

If you upload and store images on the filesystem only, every deployment will wipe out those images, unless you store them in a folder outside the Magento directory and symlink them in (but then you have another type of backup to worry about).
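For anyone who does go the symlink route, the deploy step amounts to three shell commands. A rough Ruby sketch follows, assuming a shared directory that lives outside the releases and the magento/media layout used in the deploy scripts above; `media_symlink_commands` is a hypothetical helper, not a Capistrano API.

```ruby
require "shellwords"

# Hypothetical helper: returns the shell commands that replace a release's
# media directory with a symlink to a shared directory that survives deploys.
def media_symlink_commands(shared_path, release_path)
  shared_media  = Shellwords.escape(File.join(shared_path, "media"))
  release_media = Shellwords.escape(File.join(release_path, "magento", "media"))
  [
    "mkdir -p #{shared_media}",              # make sure the shared folder exists
    "rm -rf #{release_media}",               # remove the directory checked out with the release
    "ln -s #{shared_media} #{release_media}" # point the release at the shared folder
  ]
end

# Example paths only; substitute your own deploy directories.
puts media_symlink_commands("/var/www/shop/shared", "/var/www/shop/releases/20141106")
```

Each string could be passed to `run` from a task hooked in after the code is updated. The shared folder still lives on one machine, so this only helps on a single server.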

Storing all product images in the database gives you a central place where all images live. You can copy databases from production to testing or development and have everything continue to work, without needing to copy images between servers. However, most of you are probably thinking about performance: rendering images from the database is seriously slow.

The good news is, Magento automatically writes the images to the filesystem on first request. This means on first request the image may take around 1 second to load, but on every subsequent request it is being read directly from Apache and the filesystem which is very fast with no application overheads.
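The pattern at work here is a read-through cache: only a miss on the local filesystem falls through to the slow central store. Below is a minimal sketch of the pattern in Ruby, not Magento's implementation; `fetch_image` is a hypothetical name and the block stands in for the database read.

```ruby
require "fileutils"
require "tmpdir"

# Hypothetical helper: returns the image bytes, reading from the local
# filesystem when present and otherwise loading them via the given block
# (the slow, central store) and writing them to disk for next time.
def fetch_image(media_root, relative_path)
  full_path = File.join(media_root, relative_path)
  return File.binread(full_path) if File.exist?(full_path)

  data = yield(relative_path)                 # slow path, taken once per file
  FileUtils.mkdir_p(File.dirname(full_path))
  File.binwrite(full_path, data)
  data
end

Dir.mktmpdir do |media_root|
  slow_reads = 0
  loader = lambda do |path|
    slow_reads += 1
    "bytes-of-#{path}"
  end

  2.times { fetch_image(media_root, "catalog/product/a.jpg", &loader) }
  puts "slow reads: #{slow_reads}"
end
```

In a multi-server setup each server fills its own local copy on its first request, so no image has to be copied between servers by hand.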

Go a step further and implement a CDN and you have completely offloaded product image loading to your CDN servers, where images will load within milliseconds.

To implement the above requires a custom Magento extension, so send me a message if you are interested.

CDN (Content Delivery Network)

As mentioned above, with Magento you need to offload as many resources as possible from your servers to make it perform well. Serving all of your product images from a CDN is a great way to do this.

Grab this extension to start:
http://www.magentocommerce.com/magento-connect/onepica-imagecdn-1.html

A team I worked with patched this extension to support origin pull. This is a far simpler way to use a CDN: your application doesn't push content to the CDN; instead, the CDN pulls the content from your server on first request.
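With origin pull, the only thing the application has to do is emit image URLs that point at the CDN host; there is no upload step. A small Ruby sketch of that rewrite follows, where `cdn_url` and the example host name are assumptions for illustration and not part of the extension.

```ruby
# Hypothetical helper: maps a media path on the origin server to the same
# path on the CDN host. The CDN fetches the file from the origin itself the
# first time the URL is requested.
def cdn_url(cdn_base, media_path)
  "#{cdn_base.chomp('/')}/#{media_path.sub(%r{\A/+}, '')}"
end

puts cdn_url("https://cdn.example.com", "/media/catalog/product/a.jpg")
```

Because the path is unchanged, switching the CDN on or off is only a matter of which base URL the store uses for media.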

If you want to learn more about this extension send me a message.


1 comment:

  1. Hi, I'm trying to migrate to a scalable architecture for my Magento project. I currently have 2 identical servers running Magento behind a load balancer. I've installed onepica-imagecdn in order to have all my images centralized in Rackspace's CloudFiles. Everything was working well until I noticed that sometimes the product images were broken (404). The problem occurs when the load balancer directs you to the server that doesn't have that image in its filesystem because it was uploaded to the other server (which I guess is a common problem). I thought that implementing the CDN would fix this problem because all the images were supposed to be centralized regardless of which server you were on. Is there something wrong with my logic, or maybe something wrong with the configuration?
