Of course, a user-mode file system like s3fs over FUSE can always elect to shard a large file into smaller S3 objects, say 4MB each. It would need to hide the individual chunk objects and present only one coherent file. While this might seem adequate, hiding objects means that whatever names are reserved for the chunks are no longer available as file names.
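To make the sharding idea concrete, here is a minimal sketch of how a byte range in a file could map onto 4MB chunk objects. The `.chunk.NNNNNNNN` key suffix and the function names are hypothetical, not something s3fs does. The suffix also shows the naming problem: under this scheme a user file literally named `fs.img.chunk.00000000` would collide with a chunk.

```shell
#!/bin/sh
# Hypothetical sketch: map a byte range of a file to the chunk objects holding it.
CHUNK_SIZE=$((4 * 1024 * 1024))   # 4MB per chunk, as in the text

# chunk_key NAME INDEX -> object key for one chunk (naming scheme is an assumption)
chunk_key() {
    printf '%s.chunk.%08d\n' "$1" "$2"
}

# chunks_for_range NAME OFFSET LENGTH -> one object key per line
chunks_for_range() {
    name=$1 offset=$2 length=$3
    [ "$length" -gt 0 ] || return 0
    first=$((offset / CHUNK_SIZE))
    last=$(((offset + length - 1) / CHUNK_SIZE))
    i=$first
    while [ "$i" -le "$last" ]; do
        chunk_key "$name" "$i"
        i=$((i + 1))
    done
}

# A 10-byte read that straddles the first chunk boundary touches two objects:
chunks_for_range fs.img 4194300 10
# fs.img.chunk.00000000
# fs.img.chunk.00000001
```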
Another issue is that S3 objects are stored inside buckets that have no directory structure. Properly supporting directory structure means storing directory listing meta-info as an object in the bucket as well.
What if the user wants files to be encrypted transparently?
These issues all signify that, in order to build a proper network drive on top of S3, we need to implement a lot of file system features from scratch. It would be much easier if we could store a file on S3 and use it as a block device. Let's go back to the drawing board and redo the sketch.
- First, we'll have a simple sharded S3 file system over FUSE that can store large files as chunked objects.
- Then we create a large file on it to be used as a loopback device, format it, and mount it.
- dd if=/dev/zero of=/mnt/s3/fs.img bs=1M seek=$((device_size_in_MB - 1)) count=1
- mkfs.ext4 /mnt/s3/fs.img
- mount -o loop /mnt/s3/fs.img /mnt/fs
This also works similarly on Mac OS X: just use Disk Utility to create a disk image.
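The `dd` command above writes only the last megabyte of the image, after seeking past the rest. On a local file system that supports sparse files, that leaves a file whose apparent size is the full device size while very little is actually allocated. The sketch below shows this in a temporary directory. The 64MB size is arbitrary, and whether a sharded S3 file system would treat the unwritten region the same way is an open assumption.

```shell
#!/bin/sh
# Sketch: show that dd with seek= yields a sparse image file.
device_size_in_MB=64                      # small size chosen for the demo
workdir=$(mktemp -d)
img="$workdir/fs.img"

# Same command as in the text, with bs spelled out in bytes for portability.
dd if=/dev/zero of="$img" bs=1048576 seek=$((device_size_in_MB - 1)) count=1 2>/dev/null

apparent_bytes=$(($(wc -c < "$img")))     # size the file claims to have
allocated_kb=$(du -k "$img" | cut -f1)    # space actually allocated on disk

echo "apparent size:  $((apparent_bytes / 1048576)) MB"
echo "allocated size: ${allocated_kb} KB"

rm -rf "$workdir"
```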
If you want encryption, no problem.
- dd if=/dev/zero of=/mnt/s3/crypt.img bs=1M seek=$((device_size_in_MB - 1)) count=1
- losetup /dev/loop0 /mnt/s3/crypt.img
- cryptsetup luksFormat /dev/loop0
- cryptsetup luksOpen /dev/loop0 crypt    # "crypt" is an arbitrary mapping name; it appears as /dev/mapper/crypt
- mkfs.ext4 /dev/mapper/crypt
- mount /dev/mapper/crypt /mnt/crypt
One problem with using S3 as a block device is that the image file size is fixed, but depending on the filesystem you use, it should be possible to resize the filesystem online. It might even be possible to use ZFS by adding block storage to the pool.
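For ext4 on a loop device, growing online might look like the sketch below: extend the backing file, tell the loop device to re-read its size, then grow the filesystem. The `grow_image` and `run` helpers are made up for illustration, and the sketch assumes the FUSE layer underneath lets the image file be extended. With `DRY_RUN` set it only prints the commands, since the real ones need root.

```shell
#!/bin/sh
# Sketch: grow an image-backed ext4 filesystem while it stays mounted.
# With DRY_RUN set, commands are printed instead of executed.

run() {
    if [ -n "$DRY_RUN" ]; then echo "$*"; else "$@"; fi
}

# grow_image IMAGE LOOPDEV EXTRA_MB  (hypothetical helper)
grow_image() {
    image=$1 loopdev=$2 extra_mb=$3
    # Extend the backing file (GNU truncate; "+" means relative to current size).
    run truncate -s "+${extra_mb}M" "$image" || return 1
    # Have the loop device pick up the new size of its backing file.
    run losetup -c "$loopdev" || return 1
    # Grow the filesystem to fill the device; ext4 supports this while mounted.
    run resize2fs "$loopdev"
}

DRY_RUN=1
grow_image /mnt/s3/fs.img /dev/loop0 1024
# truncate -s +1024M /mnt/s3/fs.img
# losetup -c /dev/loop0
# resize2fs /dev/loop0
```

For the encrypted variant, the LUKS mapping would also need a `cryptsetup resize` before the filesystem is grown, and `resize2fs` would target the mapped device rather than the loop device.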
If you're not happy with transferring 4MB per chunk, you can easily reduce the transfer amount. Instead of mounting the block device image over S3 on your own machine, do the following:
- Build an Amazon EC2 AMI to mount the disk image on S3 as a block device.
- Make sure ssh and rsync are installed on the machine image.
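The post stops at installing ssh and rsync. Presumably the next step is to sync files to the instance with rsync over ssh, so that only the changed parts of files cross your own link while the 4MB chunk traffic stays between EC2 and S3. A minimal sketch, where the host name, the paths, and the `sync_up` helper are all placeholders:

```shell
#!/bin/sh
# Sketch: push local changes to the EC2 instance that has the image mounted.
# REMOTE and the paths are placeholders, not values from the article.
REMOTE="user@ec2-host.example.com"

run() {
    if [ -n "$DRY_RUN" ]; then echo "$*"; else "$@"; fi
}

# sync_up LOCAL_DIR REMOTE_DIR  (hypothetical helper)
sync_up() {
    # -a preserves file metadata, -z compresses in transit, -e ssh runs over ssh
    run rsync -az -e ssh "$1/" "$REMOTE:$2/"
}

DRY_RUN=1                      # print the command instead of running it
sync_up /tmp/docs /mnt/fs/docs
# rsync -az -e ssh /tmp/docs/ user@ec2-host.example.com:/mnt/fs/docs/
```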
Now, the only missing piece is a FUSE module for sharding large files into smaller objects on S3. Anyone want to take that on?