using google image search from within a cocoa app

January 25th, 2009

I’ve been looking for a way to use Google’s image search from within a Cocoa application but found nothing ready to use, so I had to do it myself using regular expressions. After some research I found a .NET project here which got me started, and I finally came up with a working solution. The only drawback: if Google changes anything in the HTML of the search results, the regular expression has to be adapted.

Prerequisites

You will need to download and install the excellent RegexKit framework here (full version, NOT lite!).

The project was created with Xcode 3.1.2 and requires Leopard because it uses some Obj-C 2.0 features. It can be downloaded here.

How Google Image Search works

When you go to http://images.google.com/ and type in a search query, a page with exactly 20 result images is shown. You’ll also notice a bunch of parameters added to the URL on the results page. Most important for us are the q and start parameters.

  • q defines the search term. So q=apple for example, searches for the term “apple”. Spaces must be replaced with “+”.
  • start defines the results’ zero-based starting index. So a query with start=42 would return 20 results, starting from index 42.
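As a rough sketch, the two parameters could be combined into a request URL like this. The helper name ImageSearchURLString, the constant kResultsPerPage and the /images path are my assumptions for illustration, not something taken from the project. Only spaces are handled here; a term containing other reserved characters (such as "&") would need proper percent-escaping on top of this.

```objc
#import <Foundation/Foundation.h>

// Number of results per page, as described above.
static const NSUInteger kResultsPerPage = 20;

// Hypothetical helper: builds the search URL string for a term and a
// zero-based page number. The "/images" path is an assumption.
static NSString *ImageSearchURLString(NSString *term, NSUInteger page)
{
    // Spaces must be replaced with "+".
    NSString *q = [term stringByReplacingOccurrencesOfString:@" "
                                                   withString:@"+"];
    // start is the zero-based index of the first result on the page.
    NSUInteger start = page * kResultsPerPage;
    return [NSString stringWithFormat:
            @"http://images.google.com/images?q=%@&start=%lu",
            q, (unsigned long)start];
}
```

Calling ImageSearchURLString(@"red apple", 2) would ask for the third page of results, that is, the results starting at index 40.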

The Regular Expression

#define IMAGES_REGEX @"/imgres\x3Fimgurl=(?<imgurl>[^&>]*)[>&]{1}imgrefurl=(?<imgrefurl>[^&>]*)[>&]{1}usg=[^&>]*[>&]{1}h=(?<height>[^&>]*)[>&]{1}w=(?<width>[^&>]*)[>&]{1}sz=(?<sz>[^&>]*)[>&]{1}hl=(?<hl>[^&>]*)[>&]{1}start=(?<start>[^&>]*)[>&]{1}tbnid=(?<tbnid>[^&>]*)[>&]{1}tbnh=(?<tbnh>[^&>]*)[>&]{1}tbnw=(?<tbnw>[^&>]*)[>&]{1}prev=(?<prev>[^&>]*)[>&]{1}<img src=(?<tbnurl>[^ ]*)[ ]"

The expression looks for the interesting parts of the HTML response: the URLs and dimensions of the preview (thumbnail) and of the actual image. These are stored in named groups (recognizable by ?<groupname>), which can then easily be extracted into an NSString object.

Code

To build a search query in Obj-C there are 3 basic steps:

  1. Build the URL string
  2. Create an NSURLRequest and an NSURLConnection to receive the HTTP response
  3. Parse the HTTP response
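For step 2, a minimal delegate sketch might look like the following. The class name SearchLoader and the method loadURLString: are hypothetical, and the code uses manual reference counting like the rest of this post. It only collects the response bytes; a real application would hand the finished data over to the parsing step.

```objc
#import <Foundation/Foundation.h>

// Hypothetical class that acts as the NSURLConnection delegate.
@interface SearchLoader : NSObject
{
    NSMutableData *receivedData;
    NSURLConnection *connection;
}
- (void)loadURLString:(NSString *)urlString;
- (NSData *)receivedData;
@end

@implementation SearchLoader

- (id)init
{
    self = [super init];
    if (self != nil) {
        receivedData = [[NSMutableData alloc] init];
    }
    return self;
}

- (void)loadURLString:(NSString *)urlString
{
    NSURL *url = [NSURL URLWithString:urlString];
    NSURLRequest *request = [NSURLRequest requestWithURL:url];
    [receivedData setLength:0];
    [connection release];
    // Starts loading right away; the callbacks below are delivered on the
    // current run loop, so this needs a running run loop (as in any Cocoa app).
    connection = [[NSURLConnection alloc] initWithRequest:request
                                                 delegate:self];
}

- (NSData *)receivedData
{
    return receivedData;
}

- (void)connection:(NSURLConnection *)conn
didReceiveResponse:(NSURLResponse *)response
{
    // May be sent more than once for a single load (for example after a
    // redirect), so start over with an empty buffer.
    [receivedData setLength:0];
}

- (void)connection:(NSURLConnection *)conn didReceiveData:(NSData *)data
{
    // The body arrives in chunks; append each one.
    [receivedData appendData:data];
}

- (void)connectionDidFinishLoading:(NSURLConnection *)conn
{
    // receivedData is complete here and can be converted and parsed.
    NSLog(@"Received %lu bytes", (unsigned long)[receivedData length]);
}

- (void)connection:(NSURLConnection *)conn didFailWithError:(NSError *)error
{
    NSLog(@"Search request failed: %@", [error localizedDescription]);
}

- (void)dealloc
{
    [connection release];
    [receivedData release];
    [super dealloc];
}

@end
```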

Steps 1 and 2 are basic stuff (well, 2 not so much… read Apple’s URL Loading System guide). So let’s assume the response has finished loading and we have it stored in an NSData object. First we have to convert this object to an NSString in order to parse it:

NSString *result = [[NSString alloc] initWithData:receivedData
                                         encoding:NSUTF8StringEncoding];
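One caveat with this conversion: initWithData:encoding: returns nil when the bytes are not valid for the given encoding. A possible safeguard is sketched below, assuming that falling back to ISO Latin 1 is acceptable for the purpose of running the regular expression. The function name CreateStringFromResponseData is hypothetical. Under manual reference counting the caller owns the returned string, just as with the alloc/init call above.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper: decodes the response body as UTF-8 and falls back
// to ISO Latin 1, which maps every possible byte to a character, if the
// data is not valid UTF-8. The caller owns the returned string.
static NSString *CreateStringFromResponseData(NSData *data)
{
    NSString *text = [[NSString alloc] initWithData:data
                                           encoding:NSUTF8StringEncoding];
    if (text == nil) {
        text = [[NSString alloc] initWithData:data
                                     encoding:NSISOLatin1StringEncoding];
    }
    return text;
}
```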

Then RegexKit is used to parse the string:


RKEnumerator *matchEnum = [result matchEnumeratorWithRegex:IMAGES_REGEX];
// loop over all results
while([matchEnum nextRanges] != NULL)
{
    double width = 0.0;
    double height = 0.0;

    // image
    NSString *imageUrl = [matchEnum stringWithReferenceFormat:@"${imgurl}"];
    [matchEnum getCapturesWithReferences:@"${width:%lf}", &width, nil];
    [matchEnum getCapturesWithReferences:@"${height:%lf}", &height, nil];
    NSSize imageSize = NSMakeSize(width, height);

    // thumbnail
    NSString *tbnUrl = [matchEnum stringWithReferenceFormat:@"${tbnurl}"];
    [matchEnum getCapturesWithReferences:@"${tbnw:%lf}", &width, nil];
    [matchEnum getCapturesWithReferences:@"${tbnh:%lf}", &height, nil];
    NSSize tbnSize = NSMakeSize(width, height);

    MyImageObject *imgObj = [[MyImageObject alloc] init];
    imgObj.thumbnailUrl = tbnUrl;
    imgObj.imageUrl = imageUrl;
    imgObj.thumbnailSize = tbnSize;
    imgObj.imageSize = imageSize;

    [self.imageObjects addObject:imgObj];
    [imgObj release];
}

Here is what happens: the loop jumps from match to match. Essential data is extracted from the current match using the regex groups mentioned earlier. Then a MyImageObject instance storing this data is created and added to an NSMutableArray.

After this loop is done, we have successfully sent a search request to Google’s image search, received and parsed the response, and stored all essential data into custom objects, ready to be processed for display.

Displaying the Images

To display the images nicely I chose the new IKImageBrowserView from Image Kit. Basically I followed Apple’s Image Kit programming guide for IKImageBrowserView. Because we are dealing with image URLs, the only thing I changed was MyImageObject’s imageRepresentationType, which is now IKImageBrowserNSURLRepresentationType. Setting up the data source was really easy and the result looks pleasing.
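To illustrate, here is a sketch of what MyImageObject could look like with the three required methods of the IKImageBrowserItem informal protocol. The property names are the ones used in the parsing loop. Using the thumbnail URL as the representation and the full image URL as the unique ID are my assumptions for this example, and the code uses manual reference counting.

```objc
#import <Cocoa/Cocoa.h>
#import <Quartz/Quartz.h>

@interface MyImageObject : NSObject
{
    NSString *thumbnailUrl;
    NSString *imageUrl;
    NSSize thumbnailSize;
    NSSize imageSize;
}
@property (copy) NSString *thumbnailUrl;
@property (copy) NSString *imageUrl;
@property (assign) NSSize thumbnailSize;
@property (assign) NSSize imageSize;
@end

@implementation MyImageObject

@synthesize thumbnailUrl, imageUrl, thumbnailSize, imageSize;

// Identifier the browser uses to tell items apart.
// Assumption: the full image URL is unique per result.
- (NSString *)imageUID
{
    return self.imageUrl;
}

// Tells the browser that imageRepresentation returns an NSURL.
- (NSString *)imageRepresentationType
{
    return IKImageBrowserNSURLRepresentationType;
}

// Assumption: the browser shows the thumbnail, not the full image.
- (id)imageRepresentation
{
    return [NSURL URLWithString:self.thumbnailUrl];
}

- (void)dealloc
{
    [thumbnailUrl release];
    [imageUrl release];
    [super dealloc];
}

@end
```

With this in place, the data source only needs to implement numberOfItemsInImageBrowser: and imageBrowser:itemAtIndex:, returning the count and the elements of the imageObjects array.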


mhoeller Development

new blog

January 23rd, 2009

To whomever it may concern,

Welcome to my brand new blog! In fact, this is my first blog ever… So I don’t know exactly what I will put out to the world yet. It will mostly deal with things about being a developer for the Mac and iPhone platforms, both technical and business-wise.

Random (but defining) facts:

  • I stopped by in 1981 and decided to stay
  • I did this in Graz, Austria
  • I got my MSc in computer science at the Graz Technical University
  • Music is one of the most important parts of my life (I play guitar, keyboards and piano in approximately 2 bands)
  • I started programming at the age of 13 (QBASIC, good times…)
  • I got my first Mac in 2005 (a PPC Mac Mini)
  • I use Windows only for occasional gaming
  • I got my first MacBook Pro in 2006, but drowned it in mineral water about a year later (RIP)
  • My current MacBook Pro has stayed dry so far…
  • I wrote the time tracking application Project Calculator
  • I wrote My Things, Your Things, an application for managing all your borrowings
  • I co-founded the software company Nimblo (website under construction)

mhoeller General