Thursday, January 31, 2008

Ruby Hack: In-Memory Files with Metadata and Better .open() Support

PDF::Writer is probably the closest thing there is to "the" library for generating PDF docs in Ruby. But there are some other nice, tiny libraries for generating a quick PDF doc, like Ruby FPDF.

The one thing I needed recently from FPDF that it didn't offer was the ability to add an image from an in-memory data blob (the FPDF Image method reads from a file). Making FPDF read a memory stream was a fun bit of Ruby meta-style hackery, which I offer here for streaming images into your PDFs and also as a neat example of how you can usefully and easily change a module's behavior without touching its source code.

The basic plan:

  1. Since FPDF wants to open a file, and I have a String, StringIO is my friend, since it's very file-like.
  2. Since StringIO doesn't have file metadata, like a filename, and FPDF looks at the filename to decide how to parse the data, I'm gonna need to make StringIO seem like it has metadata.
  3. Since FPDF uses only the raw Kernel.open, it's looking for a file path. I need to make it a little more cosmopolitan by giving it the Kernel.open in open-uri, which among other things calls open on openable objects. Like StringIO.
  4. Lastly, since StringIO.open behaves like "new," and I want to pass an object to open that already exists, I need to change the way open behaves.
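A quick way to see points 1, 2, and 4 for yourself is to poke at a plain StringIO. This is only an illustrative sketch with made-up data; nothing here touches FPDF.

```ruby
require 'stringio'

data = "fake image bytes"

# StringIO wraps a String and answers the usual IO reading calls.
io = StringIO.new(data)
first_four = io.read(4)   # reads the first four bytes
io.rewind                 # back to the start, like a real file handle
everything = io.read      # reads the whole string

# StringIO.open is a class method: it builds a brand-new StringIO
# instead of reopening an instance we already hold.
opened = StringIO.open(data)
same_object = opened.equal?(io)

# A plain StringIO carries no filename-style metadata.
has_name = io.respond_to?(:name)

puts first_four, everything, same_object, has_name
```

So the object reads like a file, but it has no name to inspect and no instance-level open to call, which is what the rest of the hack supplies.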

First, let's make FPDF use the enhanced 'open':

FPDF.module_eval { require 'open-uri' }
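The dispatch that step 3 relies on can be pictured roughly like this. The helper name `open_anything` is hypothetical, and this is a sketch of the idea rather than open-uri's actual source; open-uri's handling of Kernel#open has also changed across Ruby versions, so check what your Ruby does before depending on it.

```ruby
require 'stringio'

# Hypothetical helper (not part of FPDF or open-uri): if the argument
# knows how to open itself, let it; otherwise treat it as a file path.
def open_anything(target, *args, &block)
  if target.respond_to?(:open)
    target.open(*args, &block)
  else
    File.open(target, *args, &block)
  end
end

blob = StringIO.new("in-memory bytes")

# Give this one object an instance-level open that rewinds and
# yields itself, so it counts as "openable".
def blob.open(*_mode)
  rewind
  yield self if block_given?
  self
end

seen = nil
returned = open_anything(blob) { |f| seen = f.read }
puts seen
```

A String path does not respond to a public open, so it falls through to File.open, while the doctored StringIO handles the call itself.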

Now, we'll define a function that takes a String of binary data and a pretend filename, and produces a hacked StringIO object (a description of what's going on is in the comments):

require 'stringio'

def in_memory_file(data, filename)
  # load up some data
  file = StringIO.new(data)

  # tell the class that it knows about a "name" property,
  # and assign the filename to it
  # (note: this adds the accessor to every StringIO, not just this one)
  file.class.class_eval { attr_accessor :name }
  file.name = filename

  # FPDF uses the rindex and [] functions on the "filename",
  # so we'll make our in-memory file object act like a filename
  # with respect to these functions:
  def file.rindex(arg)
    name.rindex(arg)
  end

  # this same pattern could be used to add other metadata
  # to the file (e.g., creation time)
  def file.[](arg)
    name[arg]
  end

  # change open so that it follows the formal behavior
  # of the original (call a block with data, return
  # the file-like object, etc.) but alter it so that
  # it doesn't create a new instance and can be
  # called multiple times (rewind)
  def file.open(*mode, &block)
    rewind
    block.call(self) if block
    self
  end

  file
end
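If you'd rather not add a name accessor to every StringIO in the process, the same hacks can be bundled into a module and applied to a single object with extend. This is a variation on the function above, not what I actually used; the module name `InMemoryFile` and the sample data are made up for illustration.

```ruby
require 'stringio'

# Hypothetical module: the same filename and open behavior, scoped to
# whichever object extends it instead of the whole StringIO class.
module InMemoryFile
  attr_accessor :name

  # Act like the filename for the string methods FPDF calls on it.
  def rindex(arg)
    name.rindex(arg)
  end

  def [](arg)
    name[arg]
  end

  # Rewind instead of creating a new instance, yield if asked,
  # and return the file-like object itself.
  def open(*_mode)
    rewind
    yield self if block_given?
    self
  end
end

def in_memory_file(data, filename)
  file = StringIO.new(data).extend(InMemoryFile)
  file.name = filename
  file
end

data = "fake PNG bytes"
logo = in_memory_file(data, "logo.png")

# Pull the extension out the way a filename-sniffing library might.
ext = logo[(logo.rindex('.') + 1)..-1]

# Opening twice reads the full data both times, thanks to the rewind.
first_pass = nil
second_pass = nil
logo.open { |f| first_pass = f.read }
logo.open { |f| second_pass = f.read }

# Only the extended object gained a name; other StringIOs are untouched.
plain_has_name = StringIO.new("x").respond_to?(:name)

puts ext, first_pass == second_pass, plain_has_name
```

Either way, the resulting object is what gets handed to FPDF's Image method in place of a file path.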

In this case, I had the FPDF source so I could see exactly what I needed to do. With Ruby, I'm likely to have the source pretty much all the time (unless the classes are pre-compiled to target another VM, e.g. CLR). But an interesting next step is to add hooks/instrumentation to a "black box" library, and use that output to try and make adjustments to the behavior of the library.
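As a rough idea of what that instrumentation could look like, here is a sketch that wraps one method of a stand-in class and records how it gets called. `BlackBox`, `load_image`, and `CallLogger` are all invented for the example, and Module#prepend is just one way to hook in (it needs a reasonably modern Ruby).

```ruby
# Hypothetical stand-in for a library class whose source we can't read.
class BlackBox
  def load_image(source)
    "loaded #{source}"
  end
end

# Records each call to load_image (argument classes only),
# then hands off to the original method via super.
module CallLogger
  def self.calls
    @calls ||= []
  end

  def load_image(*args)
    CallLogger.calls << [:load_image, args.map(&:class)]
    super
  end
end

# Prepending puts CallLogger ahead of BlackBox in method lookup.
BlackBox.prepend(CallLogger)

result = BlackBox.new.load_image("logo.png")
puts result
p CallLogger.calls
```

Seeing which argument types the library passes around is the kind of output that tells you which methods a stand-in object would need to answer.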
