After going through all the CDN options available for Drupal 6, I wasn't totally satisfied with any of them. My requirements were thus:
- Move user uploaded files to S3
- Rewrite image URL to S3/CloudFront IF the file has been moved
- Imagecache support
The CDN integration module seemed the most promising, but I wasn't crazy about patching core and running a daemon to handle file processing and uploading. Most of the other approaches I saw assume that the file already exists on the CDN, which is an assumption I'd rather not make when dealing with user uploads.
Media Mover!
I was first introduced to Media Mover at the session at DrupalCon DC, and I've been anxious to use it ever since.
In a nutshell, Media Mover works like this:
- Find some files
- Do some stuff to them (optional)
- Stick them somewhere
There are many more configuration options, but that's probably the most common workflow Media Mover performs. It's one of those modules that can get really powerful once you explore the possibilities.
So what does this have to do with CDN integration with Drupal? Media Mover keeps track of the files that it moves and stores the filepath/URL at each step. This gives you a place to check whether the file has been moved AND retrieve the external URL you want to use. Here's an example record from the media_mover_files table of a file harvested from a CCK imagefield and then moved to S3:
| mmfid | 2585 |
| nid | 14105 |
| fid | 3665 |
| cid | 25 |
| harvest_file | sites/default/files/filename.jpg |
| process_file | sites/default/files/filename.jpg |
| storage_file | http://s3.amazonaws.com/bucketname/sites/default/files/filename.jpg |
| complete_file | http://s3.amazonaws.com/bucketname/sites/default/files/filename.jpg |
| status | 8 |
| date | 1263317207 |
| data | N; |
Notice the nid association. When the node gets deleted or the file is deleted from the node, Media Mover will also optionally delete the copy on S3!
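To make the "check if it moved, then grab the URL" idea concrete, here's a rough sketch in plain PHP. It assumes you've already fetched one row of media_mover_files as an associative array (for example with db_fetch_array()). The function name example_mm_resolve_url() is made up for illustration and isn't part of Media Mover.

```php
<?php
/**
 * Hypothetical helper (not part of Media Mover): pick the URL to render.
 *
 * $record is one row of media_mover_files as an associative array, or
 * FALSE/NULL when no row was found. $local_path is the fallback.
 */
function example_mm_resolve_url($record, $local_path) {
  // No row (or no complete_file) means the file has not been moved yet.
  if (!is_array($record) || empty($record['complete_file'])) {
    return $local_path;
  }
  // Only treat the file as moved if complete_file is an external URL.
  if (preg_match('#^https?://#', $record['complete_file'])) {
    return $record['complete_file'];
  }
  return $local_path;
}

// Sample row shaped like the record above.
$record = array(
  'fid' => 3665,
  'harvest_file' => 'sites/default/files/filename.jpg',
  'complete_file' => 'http://s3.amazonaws.com/bucketname/sites/default/files/filename.jpg',
);
echo example_mm_resolve_url($record, 'sites/default/files/filename.jpg'), "\n";
echo example_mm_resolve_url(FALSE, 'sites/default/files/other.jpg'), "\n";
```

The point of the fallback is that a freshly uploaded file keeps working from the local path until Media Mover has actually pushed it.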
The imagecache problem
It's great that Media Mover can harvest from a CCK field, but what about the imagecache derivatives? When the user submits the node entry form, the original will get pushed to S3, but the imagecache derivatives won't get generated until they are viewed. I spoke a bit with Arthur about this, and we came to the conclusion that Media Mover needs an imagecache module. He went ahead and threw one together, and after a bit of cleaning up, you can now select which presets you need per MM config, and the imagecache derivatives will be generated and pushed to S3 when the node is saved!
Rewriting the image path
This still doesn't solve the problem of rewriting the image URL before the page is rendered, but if you're using imagecache, you can do that in a theme override. I know, it feels dirty, but it works.
Here's an example of what you could use:
function THEMENAME_imagecache($presetname, $path, $alt = '', $title = '', $attributes = NULL, $getsize = TRUE) {
  if (is_null($attributes)) {
    $attributes = array('class' => 'imagecache imagecache-'. $presetname);
  }
  // Add width/height attributes when the derivative already exists locally.
  if ($getsize && ($image = image_get_info(imagecache_create_path($presetname, $path)))) {
    $attributes['width'] = $image['width'];
    $attributes['height'] = $image['height'];
  }
  $attributes = drupal_attributes($attributes);
  $imagecache_url = imagecache_create_url($presetname, $path);
  // Ask Media Mover whether this derivative has been pushed to S3.
  $s3_url = db_result(db_query("SELECT storage_file FROM media_mover_files WHERE harvest_file = '%s' AND process_file = '%s'", $path, $imagecache_url));
  if ($s3_url) {
    // Moved: point the image at S3.
    return '<img src="'. $s3_url .'" alt="'. check_plain($alt) .'" title="'. check_plain($title) .'" '. $attributes .' />';
  }
  else {
    // Not moved (yet): fall back to the local imagecache URL.
    return '<img src="'. $imagecache_url .'" alt="'. check_plain($alt) .'" title="'. check_plain($title) .'" '. $attributes .' />';
  }
}
BUT S3 ISN'T A CDN!!
Yeah, I know. I made a custom module where I keep S3 bucket to CloudFront URL/CNAME mappings, and I check that in my imagecache theme override. If I get some time here in the near future, I'd like to throw together a Media Mover CloudFront module that could store the CloudFront URL in the "complete" step, so you'd have it on the initial SQL query.
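For the curious, the bucket-to-CloudFront lookup could look something like the sketch below. This is not my actual module, just a minimal illustration. The function name example_s3_to_cloudfront() and the cdn.example.com hostname are made up, and it assumes path-style S3 URLs like the one in the record above.

```php
<?php
/**
 * Hypothetical sketch: swap a path-style S3 URL for its CloudFront/CNAME host.
 *
 * $bucket_map maps bucket names to CloudFront hostnames or CNAMEs.
 * URLs that don't match, or buckets without a mapping, are returned unchanged.
 */
function example_s3_to_cloudfront($s3_url, array $bucket_map) {
  $prefix = 'http://s3.amazonaws.com/';
  // Only handle path-style S3 URLs: http://s3.amazonaws.com/BUCKET/KEY
  if (strpos($s3_url, $prefix) !== 0) {
    return $s3_url;
  }
  // Split "BUCKET/KEY" into its two parts.
  $parts = explode('/', substr($s3_url, strlen($prefix)), 2);
  if (count($parts) < 2 || !isset($bucket_map[$parts[0]])) {
    return $s3_url;
  }
  return 'http://' . $bucket_map[$parts[0]] . '/' . $parts[1];
}

// Assumed mapping; in a real module this would live in configuration.
$bucket_map = array('bucketname' => 'cdn.example.com');
echo example_s3_to_cloudfront('http://s3.amazonaws.com/bucketname/sites/default/files/filename.jpg', $bucket_map), "\n";
```

Returning the S3 URL untouched when there is no mapping means a bucket without a CloudFront distribution still serves files, just not from the edge.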
So far, my only complaint about this setup is that I'm still keeping a local copy of all the files I'm pushing to S3. Media Mover already provides an option to delete the local file after it's been processed, but if you change your imagecache preset and need to regenerate, you'd need to pull them all back down, which doesn't sound fun. The severity of this problem will obviously vary from site to site, but it could be pretty inconvenient.
For anyone who wants to try this setup, if there's not another beta or a proper release after today (1/28/09), I'd try the 6.x-1.x-dev branch, because that's where I've been committing most of my CDN-related features/bugfixes. Also, I think the mm_imagecache module only lives there at the moment.
Great post. An alternative
Great post,
I was also blown away by Media mover when I first tried it.
However, I use a separate function to do what you're trying to do above. Early on in my node preprocess function, I check whether any of my files have been moved by calling a simple function:
THEMENAME_get_media_mover_files($variables['field_file'][0], $variables['media_mover'][3]);
The [3] at the end of the media mover array is the ID of the Media Mover configuration.
Here's the function:
/**
 * Takes a file array and a media_mover element array and sets the file path
 * to its media moved path on Amazon S3 or wherever it was moved to.
 *
 * It uses the unique file ID (fid) to match the file with the media_mover file.
 *
 * $file = $variables['file_image'][0];
 * $media_mover = $variables['media_mover'][{id of media mover configuration}];
 *
 * @param &$file A reference to a Drupal file array
 * @param $media_mover A media_mover file/element array
 */
function THEMENAME_get_media_mover_files(&$file, $media_mover) {
  if (module_exists('media_mover_api') && $media_mover) { // If media mover is installed...
    foreach ($media_mover as $media) { // Loop through each media moved file...
      if ($media['fid'] == $file['fid']) { // If they match (file ID is a unique identifier)...
        $file['filepath'] = $media['complete_file']; // Replace the attached file path with the media moved file path.
      }
    }
  }
} // THEMENAME_get_media_mover_files()
Notice that the file is passed by reference. This function takes the file you've passed it, checks whether there is a media moved file with the same ID, and if there is, sets the filepath of the file to the path of the S3 file.
The benefit of this is that you can call this function for all filefields in your content type and there are no extra database calls.
Sorry about the code formatting, but that's as good as I can get it in a comment.
CDN integration module
I'm glad it seemed the most promising :) And I of course understand that it
seemed too daunting because of the daemon. Note again that the daemon is
only required if you want to use push CDNs like S3. If you're using an
origin pull CDN, no daemon is required.
Also, the core patch may seem evil, but it's absolutely necessary to perform
proper CDN integration without resorting to at least equally ugly theme
overrides.
I'm looking forward to making the CDN integration module use the Media Mover
module to provide a more friendly alternative to the daemon for using push
CDNs :)
Path rewrite code change
I had to change the following:
$imagecache_url = imagecache_create_url($presetname, $path);
$s3_url = db_result(db_query("SELECT storage_file FROM media_mover_files WHERE harvest_file = '%s' AND process_file = '%s'", $path, $imagecache_url));
to this in order to get it to work:
$imagecache_url = imagecache_create_path($presetname, $path);
$s3_url = db_result(db_query("SELECT complete_file FROM media_mover_files WHERE harvest_file = '%s' AND process_file = '%s'", $path, $imagecache_url));
For the query to match the Amazon file, it needed the imagecache path rather than the full URL in process_file. I also selected the complete_file field, which holds the full S3 URL.
Thank you for the tutorial as it helped us greatly!
latest
To get this working with the latest version, I had to use the following theme function (essentially replacing $presetname with $namespace):
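function THEME_imagecache($namespace, $path, $alt = '', $title = '', $attributes = NULL, $getsize = TRUE) {
  if (is_null($attributes)) {
    $attributes = array('class' => 'imagecache imagecache-'. $namespace);
  }
  // image_get_info() needs the local file path, not the URL.
  if ($getsize && ($image = image_get_info(imagecache_create_path($namespace, $path)))) {
    $attributes['width'] = $image['width'];
    $attributes['height'] = $image['height'];
  }
  $attributes = drupal_attributes($attributes);
  $imagecache_url = imagecache_create_path($namespace, $path);
  $s3_url = db_result(db_query("SELECT storage_file FROM media_mover_files WHERE harvest_file = '%s' AND process_file = '%s'", $path, $imagecache_url));
  if ($s3_url) {
    print "S3"; // Debug output.
    return '<img src="'. $s3_url .'" alt="'. check_plain($alt) .'" title="'. check_plain($title) .'" '. $attributes .' />';
  }
  else {
    print "regular"; // Debug output.
    return '<img src="'. imagecache_create_url($namespace, $path) .'" alt="'. check_plain($alt) .'" title="'. check_plain($title) .'" '. $attributes .' />';
  }
}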