Create a webcam manager using pyGTK and Gstreamer


I recently joined the Strongsteam project for a 6-month internship. Our main goal is to provide some "artificial intelligence and data mining APIs to let you pull interesting information out of images, video and audio." We will be giving a presentation at Pycon 2012, on the 9th of March, during the Startup Row weekend. For this occasion, I had to implement a desktop GUI that displays a webcam video stream and captures snapshots, with the following constraints:

How to handle the webcam?

My initial research led me to consider two different solutions: PyGame and Gstreamer.

I quickly turned to PyGame, because of the simplicity of the snapshot operation: all we have to do is to use the function. However, integrating the PyGame surface into a pyGTK interface turned out to be pretty complicated. I found a couple of StackOverflow posts stating that even though this integration was possible, it was not advised: erratic behaviour has apparently been observed across different operating systems.

I thus considered Gstreamer, and quickly found this encouraging project. Its code could start and stop a webcam video stream embedded in a pyGTK interface: I was definitely in the right place!

Why doesn't it work with my webcam?

If you run into problems testing the project introduced in the previous section (black screen, a successful first run followed by a black screen on later runs, ...), check whether your webcam is UVC (USB Video Class) compliant under Linux. To do that, type

$ lsusb

in a terminal and locate the line describing your webcam.

My laptop's integrated webcam was listed as Bus 001 Device 003: ID 05ca:1814 Ricoh Co., Ltd HD Webcam. The ID 05ca:1814 doesn't appear on the UVC website, which could explain why I experienced so many problems with it (Ricoh webcams appear to be poorly UVC compliant).
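
To avoid scanning the lsusb output by eye, the vendor:product ID can be pulled out with a few lines of Python. This is only a sketch: the function name parse_lsusb_line is made up for illustration, and it assumes the usual "Bus ... Device ...: ID vvvv:pppp description" line layout. It is written to run under both Python 2 and 3.

```python
import re

# Assumed layout of one lsusb line:
# "Bus 001 Device 003: ID 05ca:1814 Ricoh Co., Ltd HD Webcam"
LSUSB_LINE = re.compile(
    r"^Bus (\d+) Device (\d+): ID ([0-9a-fA-F]{4}):([0-9a-fA-F]{4})\s*(.*)$"
)


def parse_lsusb_line(line):
    """Return (usb_id, description) for one lsusb line, or None if it does not match."""
    match = LSUSB_LINE.match(line.strip())
    if match is None:
        return None
    vendor, product, description = match.group(3), match.group(4), match.group(5)
    return "%s:%s" % (vendor.lower(), product.lower()), description


line = "Bus 001 Device 003: ID 05ca:1814 Ricoh Co., Ltd HD Webcam"
usb_id, description = parse_lsusb_line(line)
print(usb_id)
```

The resulting ID is the value to look up in the list of UVC supported devices.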

I therefore bought a Logitech QuickCam Pro 9000, known for being well supported. Everything ran smoothly with this one.

How to use Gstreamer?

If you don't know how to use Gstreamer, I'd advise you to have a look at these pages:

The main idea is to construct a pipeline by connecting various data sources, sinks and processing blocks (elements) in a data flow graph.

In our case, we are going to use the following pipeline to display the webcam stream:

v4l2src ! video/x-raw-yuv,width=640,height=480,framerate=30/1 ! xvimagesink

Let's see how to do that in Python:

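import pygst
pygst.require("0.10")
import gst

def create_video_pipeline(self):
    """Set up the video pipeline and the communication bus between the video stream and the gtk DrawingArea"""
    video_pipeline = 'v4l2src device=/dev/video1 ! video/x-raw-yuv,width=640,height=480,framerate=30/1 ! xvimagesink'
    self.video_player = gst.parse_launch(video_pipeline)  # create pipeline

    bus = self.video_player.get_bus()
    bus.add_signal_watch()              # needed for the "message" signal to be emitted
    bus.enable_sync_message_emission()  # needed for "sync-message::element"
    bus.connect("message", self.on_message)
    bus.connect("sync-message::element", self.on_sync_message)

    # Start the video stream only once the handlers are connected,
    # so that the prepare-xwindow-id message cannot be missed
    self.video_player.set_state(gst.STATE_PLAYING)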

def on_message(self, bus, message):
    """Gst message bus. Closes the pipeline in case of error or end of stream message"""
    t = message.type
    if t == gst.MESSAGE_EOS:
        print "MESSAGE EOS"
        self.video_player.set_state(gst.STATE_NULL)  # stop the pipeline
    elif t == gst.MESSAGE_ERROR:
        print "MESSAGE ERROR"
        err, debug = message.parse_error()
        print "Error: %s" % err, debug
        self.video_player.set_state(gst.STATE_NULL)  # stop the pipeline

def on_sync_message(self, bus, message):
    """Set up the Webcam <--> GUI messages bus"""
    if message.structure is None:
        return
    message_name = message.structure.get_name()
    if message_name == "prepare-xwindow-id":
        # Assign the viewport
        imagesink = message.src
        imagesink.set_property("force-aspect-ratio", True)
        # Sending video stream to gtk DrawingArea (self.movie_window)
        imagesink.set_xwindow_id(self.movie_window.window.xid)
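
The pipeline description in create_video_pipeline hard-codes the device and the capture format. A small helper could build the same string from parameters, which makes it easier to switch webcams or resolutions. The helper name and its defaults are illustrative assumptions, not part of the original project; the returned string is meant to be passed to gst.parse_launch.

```python
def build_pipeline_description(device="/dev/video1", width=640, height=480, framerate=30):
    """Build a gst-launch style description: webcam source, caps filter, X video sink."""
    caps = "video/x-raw-yuv,width=%d,height=%d,framerate=%d/1" % (width, height, framerate)
    elements = ["v4l2src device=%s" % device, caps, "xvimagesink"]
    # Elements are linked with " ! ", exactly as in the pipeline shown earlier
    return " ! ".join(elements)


print(build_pipeline_description())
```

With the default arguments, the helper returns the same description as the one used in create_video_pipeline.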

Now we have a live video stream displayed in a pyGTK interface, but still no way of capturing a snapshot.

How do we capture a snapshot?

I came across many open StackOverflow questions about this part, but no satisfactory answer...

At first, I wanted to use Gstreamer for that too, but I couldn't find any way to dynamically modify the pipeline to add a frame extraction, a JPEG encoding and a filesink (to save the snapshot). I thus tried this ugly hack: when the 'take snapshot' button is clicked

That was of course ugly, and resulted in a ~2s flicker when taking the snapshot... Back to square one.

I'll spare you the suspense: the right solution is to grab the pixels directly from the gtk.DrawingArea window, using its get_colormap() method together with gtk.gdk.Pixbuf.get_from_drawable(), as shown here:

import gtk
from os.path import relpath

def take_snapshot(self):
    """Capture a snapshot from the DrawingArea and save it into an image file"""
    # self.movie_window is of type gtk.DrawingArea()
    drawable = self.movie_window.window
    colormap = drawable.get_colormap()
    width, height = drawable.get_size()
    pixbuf = gtk.gdk.Pixbuf(gtk.gdk.COLORSPACE_RGB, False, 8, width, height)
    # Copy the pixels currently displayed in the window into the pixbuf
    pixbuf = pixbuf.get_from_drawable(drawable, colormap, 0, 0, 0, 0, width, height)
    # We resize from actual window size to wanted resolution
    # gtk.gdk.INTERP_HYPER is the slowest and highest quality reconstruction function
    # More info here :
    pixbuf = pixbuf.scale_simple(self.W, self.H, gtk.gdk.INTERP_HYPER)  # resize
    filename = 'snap.jpg'
    filepath = relpath(filename)
    # Write the file; self.snap_format is assumed to be a GdkPixbuf format name such as 'jpeg'
    pixbuf.save(filepath, self.snap_format)

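This snippet does the following operations: it gets the window underlying the DrawingArea, copies the pixels currently displayed into a gtk.gdk.Pixbuf, rescales the pixbuf to the wanted resolution and saves it into an image file.

And that's done, without even a teeny-tiny flicker! Yay! We now have a perfectly functional snapshot operation.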
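
One detail worth a note in take_snapshot: the format string held in self.snap_format has to be a name GdkPixbuf knows, and that name is not always the file extension (JPEG files use "jpeg", not "jpg"). Here is a hedged sketch of a helper that derives it from the filename. The function name and the mapping table are assumptions made for illustration, and the table only lists a few formats GdkPixbuf can usually write.

```python
import os.path

# Assumed mapping from file extension to GdkPixbuf format name
PIXBUF_FORMATS = {".jpg": "jpeg", ".jpeg": "jpeg", ".png": "png", ".bmp": "bmp"}


def pixbuf_format_for(filename):
    """Return the GdkPixbuf format name matching the extension of filename."""
    extension = os.path.splitext(filename)[1].lower()
    try:
        return PIXBUF_FORMATS[extension]
    except KeyError:
        raise ValueError("unsupported snapshot extension: %r" % extension)


print(pixbuf_format_for("snap.jpg"))
```

Raising an error on unknown extensions is a design choice: it fails before the snapshot is taken rather than when the file is written.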

Project source code & Git repository

All the code can be found on my GitHub.