NakedPunch visual navigation

This project was started some time ago, hit some snags, and is now back on track. The idea was to categorize the articles and create a visual navigation page.

This Python script scrapes the article data off the NakedPunch site and prints it to the terminal as ‘%’-separated text with URL, Title, Author and Blurb columns. Most of the work is done by the BeautifulSoup library.

#!/usr/bin/python

import requests
from BeautifulSoup import BeautifulSoup

prefix = 'http://www.nakedpunch.com/site/archives'

print 'url%title%author%blurb'  # header row

# Walk archive pages 1 through 19.
for page in range(1, 20):
    page_url = prefix+"?page="+str(page)

    response = requests.get(page_url)
    html = response.content
    soup = BeautifulSoup(html)
    # Each article sits in a div with class 'article-summary'.
    alldivs = soup.findAll('div', 'article-summary')

    for div in alldivs:
        url = div.h4.a['href'].encode('utf8').strip()
        title = div.h4.text.encode('utf8').strip()
        author = div.div.text[3:].encode('utf8').strip() # remove leading 'by '
        blurb = '"'+div.p.text.encode('utf8').strip()+'"' # quoted, since blurbs may contain '%'
        print "%".join([url, title, author, blurb])
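
To check that the output can be read back in, here is a minimal sketch of parsing it with the standard csv module. Note that it is written for Python 3, unlike the script above, and that the sample row, the `sample` string and the `read_entries` helper are made up for illustration; they are not real data from the site.

```python
import csv
import io

# Hypothetical sample in the same shape the scraper prints; not real site data.
sample = (
    'url%title%author%blurb\n'
    '/site/contents/1%First Title%Jane Doe%"A blurb with 100% certainty"\n'
)

def read_entries(stream):
    """Parse '%'-separated rows into dicts keyed by the header row."""
    reader = csv.DictReader(stream, delimiter='%', quotechar='"')
    return list(reader)

entries = read_entries(io.StringIO(sample))
for entry in entries:
    print(entry['title'], '-', entry['author'])
```

The quotes around the blurb are what let a '%' inside it survive. The scraper adds those quotes by hand and does not escape anything, so a blurb that itself contains a double quote, or a title that contains a '%', would probably not parse cleanly this way.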

Each entry could then be tagged…