boto-rsync – Limitations and Workarounds

boto-rsync is a great tool for interacting with object storage systems like S3, but it's not without limitations. We all know about the 5 GB limit for a single PUT, which isn't a problem for clients that can handle multipart upload. Sadly, boto-rsync doesn't handle that, and until someone patches it, we need a way to break up large objects. This can crudely be done with split:
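As a minimal sketch of the same chunking idea in Python (rather than the split utility itself), the following breaks a file into fixed-size parts and joins them back together. The function names, the `.partNNNN` suffix, and the buffer size are assumptions made for illustration; none of this is part of boto-rsync.

```python
import shutil


def split_file(path, chunk_size, buf_size=1024 * 1024):
    """Split `path` into numbered parts of at most `chunk_size` bytes each.

    Returns the list of part paths in order. The ".partNNNN" naming is a
    hypothetical convention chosen so the parts sort lexically.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    parts = []
    index = 0
    with open(path, "rb") as src:
        while True:
            remaining = chunk_size
            data = src.read(min(buf_size, remaining))
            if not data:
                break  # nothing left to read, so no further part is created
            part_path = f"{path}.part{index:04d}"
            with open(part_path, "wb") as dst:
                # Copy in small buffers so a multi-gigabyte part is never
                # held in memory all at once.
                while data:
                    dst.write(data)
                    remaining -= len(data)
                    if remaining == 0:
                        break
                    data = src.read(min(buf_size, remaining))
            parts.append(part_path)
            index += 1
    return parts


def join_files(parts, dest):
    """Concatenate `parts`, in the order given, into `dest`."""
    with open(dest, "wb") as out:
        for part in parts:
            with open(part, "rb") as src:
                shutil.copyfileobj(src, out)
```

For example, `split_file("backup.tar", 4 * 1024**3)` would keep every part under the 5 GB single-PUT limit, and `join_files(sorted(parts), "backup.tar")` would rebuild the file after retrieval.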

The disadvantage of this approach is that retrieved pieces need to be manually catted back together, which obviously isn't always a good solution.

boto-rsync's other weakness is in handling UTF-8 filenames. Improperly encoded filenames will throw a 400 Bad Request and cause the script to choke and die, rather than gracefully skipping the failing file and moving on. Re-encoding the filenames as proper UTF-8 fixes this:
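One way to do such a rename pass can be sketched in Python as follows. This is a sketch under assumptions, not necessarily the command used here: it assumes the bad names are Latin-1 (the `fallback` parameter), and `fix_name` and `reencode_tree` are hypothetical helper names. It works on raw bytes paths, so it is intended for POSIX filesystems that accept arbitrary byte sequences in names.

```python
import os


def fix_name(raw, fallback="latin-1"):
    """Return `raw` unchanged if it is valid UTF-8, else re-encode it.

    `fallback` is the encoding the bad names are ASSUMED to be in.
    """
    try:
        raw.decode("utf-8")
        return raw
    except UnicodeDecodeError:
        return raw.decode(fallback).encode("utf-8")


def reencode_tree(top, fallback="latin-1"):
    """Rename every entry under `top` (a bytes path) to a valid UTF-8 name.

    Returns a list of (old_path, new_path) pairs for the renames performed.
    """
    renamed = []
    for root, dirnames, filenames in os.walk(top, topdown=True):
        # Directories are handled first. Updating `dirnames` in place makes
        # os.walk descend into the new name instead of the stale one.
        for i, name in enumerate(dirnames):
            new = fix_name(name, fallback)
            if new != name:
                old_path = os.path.join(root, name)
                new_path = os.path.join(root, new)
                if os.path.exists(new_path):
                    continue  # do not clobber an existing entry
                os.rename(old_path, new_path)
                dirnames[i] = new
                renamed.append((old_path, new_path))
        # Then the files in this directory.
        for name in filenames:
            new = fix_name(name, fallback)
            if new != name:
                old_path = os.path.join(root, name)
                new_path = os.path.join(root, new)
                if os.path.exists(new_path):
                    continue
                os.rename(old_path, new_path)
                renamed.append((old_path, new_path))
    return renamed
```

Calling `reencode_tree(b"/path/to/data")` before a sync renames only the entries whose names fail to decode as UTF-8 and leaves everything else untouched.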

Not pretty, but it works. Note that directories need to be checked and renamed first, before the files inside them are handled.

UPDATE – These issues have both been addressed in https://github.com/dreamhost/boto_rsync

Coordinated Spam Effort

Yesterday we saw a large number of infected domains sending a massive spam run in what appeared to be a coordinated effort. Signs of a large-scale targeted effort included:

  • Sharp upticks of outbound spam requests occurring at once across multiple domains
  • The same script being used across disparate victims
  • A large sampling of IPs submitting malicious requests (a sample has been submitted to the ISC)
