Keyword Distribution Over RSS Syndication

A quick and easy way to create content is syndication.  RSS is among the most popular methods of syndication, and the best part about RSS is that you have full control over the content that you are “collecting”.  Mark Pilgrim developed a nice RSS parser for Python: Universal Feed Parser.

Let’s “subscribe” to a feed of 10 results from Google’s blog search, with our target keyword being “python”.

Parsing Google Blogsearch

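import feedparser

# Fetch and parse the blog search results for "python" (10 entries, RSS output).
d = feedparser.parse('http://blogsearch.google.com/blogsearch_feeds?hl=en&q=python&ie=utf-8&num=10&output=rss')
for entry in d['entries']:
    print(entry['link'])
    print(entry['title'])
    # Not every feed supplies a 'content' element, so fall back to the summary.
    print(entry.get('content') or entry.get('summary', ''))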
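If you plan to pull feeds for more than one keyword, it may help to build the feed URL instead of typing it by hand. Here is a minimal sketch, assuming Python 3 and the same query parameters as the URL above; the function name blogsearch_feed_url is made up for this example and is not part of feedparser.

```python
from urllib.parse import urlencode

# Base address and parameters are copied from the feed URL used above.
BASE_URL = 'http://blogsearch.google.com/blogsearch_feeds'

def blogsearch_feed_url(keyword, num=10):
    """Build a blog search RSS URL for one keyword (hypothetical helper)."""
    params = [
        ('hl', 'en'),
        ('q', keyword),      # spaces are encoded as '+'
        ('ie', 'utf-8'),
        ('num', num),
        ('output', 'rss'),
    ]
    return BASE_URL + '?' + urlencode(params)

print(blogsearch_feed_url('pay day loans'))
```

The resulting string can be handed straight to feedparser.parse().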

At this point we can pick a part of speech to replace with our keyword.  Doing so will build up our likelihood of dominating the SERPs.  This is a good moment to point out a few things.

1)  Be frugal with the use of your keyword(s).
2)  Randomizing the order of the content that you’ve scraped makes it more difficult to flag you as a “scraper”.
3)  Change the titles!  Bloggers are ALWAYS reporting “scrapers”, so mix it up a bit.
4)  When you parse feed URLs, use relevant topics.  If you are trying to dominate “pay day loans”, don’t search for blogs on “potato salad”.  Try to stay on point with your blog searching.  I realize that you are going to be collecting lots of feeds, but there is a finite number of blogs, seriously!
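Points 1 and 2 are easy to sketch in code. The snippet below is only an illustration, assuming Python 3 and entries held in a plain list; the helper names shuffle_entries and keyword_density are invented for this example.

```python
import random

def shuffle_entries(entries, seed=None):
    """Return a new list with the entries in random order; the input is left untouched."""
    rng = random.Random(seed)   # pass a seed only if you want a repeatable order
    shuffled = list(entries)
    rng.shuffle(shuffled)
    return shuffled

def keyword_density(text, keyword):
    """Fraction of whitespace-separated words equal to the keyword (case-insensitive)."""
    words = text.lower().split()
    if not words:
        return 0.0
    return words.count(keyword.lower()) / len(words)

entries = ['post one', 'post two', 'post three']
print(shuffle_entries(entries))
print(keyword_density('python is python', 'python'))
```

A density check like this gives you a rough number to watch so the keyword does not take over the text.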

For the next chunk of code, you’ll need Oliver Steele’s PyWordNet, the Python interface to WordNet.  I suggest reading up on this project; I’ve used it for many of my own projects.  Let’s take a string and replace all nouns with the keyword “python”.

Substituting Nouns With A Keyword With WordNet

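from wordnet import *
from wntools import *

def substituteKeyword(description, keyword):
    # Swap every word that WordNet recognizes as a noun for the keyword.
    words = description.split(' ')
    # morphy() returns None when the word has no noun form in WordNet.
    return ' '.join([keyword if morphy(word, 'noun') is not None else word for word in words])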
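To see what the substitution does without installing PyWordNet, here is a rough sketch that swaps the WordNet lookup for a tiny hand-made noun list. The names SAMPLE_NOUNS, fake_morphy and substitute_keyword are assumptions made up for this example; the real morphy() consults the WordNet database and will recognize far more words.

```python
# Hypothetical stand-in for wntools.morphy: returns the word if it is in a
# small hand-made noun list, otherwise None.
SAMPLE_NOUNS = {'cat', 'mat', 'dog'}

def fake_morphy(word, pos):
    if pos == 'noun' and word.lower() in SAMPLE_NOUNS:
        return word
    return None

def substitute_keyword(description, keyword, lookup=fake_morphy):
    """Replace every word the lookup reports as a noun with the keyword."""
    words = description.split(' ')
    return ' '.join(
        keyword if lookup(word, 'noun') is not None else word
        for word in words
    )

print(substitute_keyword('the cat sat on the mat', 'python'))
# prints: the python sat on the python
```

Passing the lookup in as a parameter makes it simple to drop in the real morphy() once PyWordNet is available.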
