Parsing Craigslist for an Item Across Multiple Cities

A friend of mine wanted something better than the curl-and-pipe awesomeness he wrote to parse out motorcycles in our surrounding area (see his blog).

Zach’s Code

curl '…' --silent | grep 'dc:title' | sed -e 's/<.*\[//g' -e 's/\&#.*$//g' | grep -v 'by owner search'

So I wrote this little snippet to give him a more maintainable, if quick and dirty (no exception handling), Python script to run from a cron job or whatnot. Anyway, take a look and fork it for your own Craigslist shenanigans, even though it's against their TOS.

My Code

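#!/usr/bin/env python

import re
from io import BytesIO

import pycurl

# Capture the title text between the CDATA opener and the last '&'
TITLE_PATTERN = re.compile(r"<!\[CDATA\[(.*)&")


def parse_listing(text):
    """Print the title of every listing in the feed text."""
    # Items follow the closing </channel> tag. The first chunk of the split
    # is whatever precedes the first item, so skip it.
    listings = text.split("</channel>")[1].split("<item rdf:")[1:]
    for item in listings:
        messy_name = item.split("<title>")[1].split("</title>")[0]
        name = TITLE_PATTERN.match(messy_name)
        # Fall back to the raw title when the pattern does not match
        print(name.group(1) if name else messy_name)


# All the listings we want to pull
cities = ['austin', 'collegestation', 'houston', 'killeen', 'sanantonio', 'sanmarcos', 'waco']
for city in cities:
    url = 'http://' + city + ''
    buffer = BytesIO()  # pycurl writes raw bytes
    c = pycurl.Curl()
    c.setopt(c.URL, url)
    c.setopt(c.WRITEDATA, buffer)
    c.perform()  # actually run the request
    c.close()
    # Assumes the feed is UTF-8; undecodable bytes are replaced
    body = buffer.getvalue().decode('utf-8', 'replace')
    parse_listing(body)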
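
If the string splitting ever gets too brittle, the same titles can probably be pulled out with a real XML parser. Here is a minimal sketch, assuming the feed is RSS 1.0 (RDF), which the "<item rdf:" split above suggests. The sample feed, the example.org URLs, and the listing_titles name are made up for illustration and are not from the actual feed.

```python
import xml.etree.ElementTree as ET

# Default namespace used by RSS 1.0 (RDF) feeds
RSS_NS = {"rss": "http://purl.org/rss/1.0/"}

# Hypothetical sample feed, trimmed to the parts the parser touches
SAMPLE_FEED = """<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns="http://purl.org/rss/1.0/">
  <channel rdf:about="http://example.org/feed">
    <title>example search</title>
  </channel>
  <item rdf:about="http://example.org/1">
    <title><![CDATA[2005 Honda CBR600 &#x0024;3500]]></title>
  </item>
  <item rdf:about="http://example.org/2">
    <title><![CDATA[1999 Suzuki SV650]]></title>
  </item>
</rdf:RDF>
"""


def listing_titles(feed_text):
    """Return the title of every <item> in an RSS 1.0 feed."""
    root = ET.fromstring(feed_text)
    titles = []
    # Items are direct children of the rdf:RDF root in RSS 1.0
    for item in root.findall("rss:item", RSS_NS):
        title = item.findtext("rss:title", default="", namespaces=RSS_NS)
        # Cut at the first "&#" entity (the escaped price), if there is one
        titles.append(title.split("&#")[0].strip())
    return titles


if __name__ == "__main__":
    for title in listing_titles(SAMPLE_FEED):
        print(title)
```

To use it with the script above, pass the downloaded body to listing_titles instead of parse_listing. Unlike the regex, it cuts at the first "&#" rather than the last "&", and it keeps titles that have no price at all.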

