Proxies are easy to find, but they often don’t work. I’ve put together a simple script to test whether a proxy is up or down.
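import os
from urllib2 import urlopen

# example proxy; urllib2 picks it up from the environment
os.environ['HTTP_PROXY'] = 'http://169.229.50.9:3128'
try:
    # fetch a known page through the proxy
    f = urlopen('http://www.google.com')
    data = f.read()
    f.close()
    print 'Pass'
except Exception:
    # any failure to connect or read counts as a dead proxy
    print 'Fail'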
It doesn’t get much simpler than this. Granted, the read is probably overkill: if the URL cannot be opened, urlopen raises an exception and the script prints 'Fail' anyway.
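For anyone on Python 3, where urllib2 no longer exists, the same check can be sketched as a function. This is only a rough equivalent under a few assumptions: the name proxy_is_up, the timeout default and the choice to pass the proxy explicitly instead of through the environment are mine, not part of the script above.

```python
import http.client
import urllib.error
import urllib.request


def proxy_is_up(proxy_url, test_url='http://www.google.com', timeout=10):
    """Return True if test_url can be fetched through proxy_url.

    proxy_is_up is a hypothetical helper name; the proxy is handed to
    urllib explicitly rather than through the HTTP_PROXY variable.
    """
    handler = urllib.request.ProxyHandler({'http': proxy_url})
    opener = urllib.request.build_opener(handler)
    try:
        with opener.open(test_url, timeout=timeout) as response:
            # One byte is enough to prove the proxy relayed a response.
            response.read(1)
        return True
    except (urllib.error.URLError, http.client.HTTPException, OSError):
        # Refused connections, timeouts and HTTP errors all count as down.
        return False


if __name__ == '__main__':
    # Same example proxy as above; it may well be dead by now.
    print('Pass' if proxy_is_up('http://169.229.50.9:3128') else 'Fail')
```

Returning a boolean instead of printing makes it easy to loop over a whole list of proxies and keep only the live ones.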
I believe in community, I believe in groups, I believe in support. Having said that, I’m working on a *platform independent* framework that will allow you to automate any “form filling”/“web submission” process, with user agent emulation.
The “engine” parses an XML rule-set for a given site. Ex. myspace.com.xml
You supply a configuration XML.
The script runs, and follows the rules in the XML rule-set.
This project will be limited by captchas. I’m very excited to have a framework that will do automatic submissions; however, time doesn’t allow for a captcha decoder. More news to come later.
Update!
I started working on the configuration framework. Here’s a sample configuration file.
<config domain="value" proxy="value" stdout="value">
  <define>
    <user_defined_x>
      <user_data_variable_1>value</user_data_variable_1>
      <user_data_variable_2>value</user_data_variable_2>
    </user_defined_x>
  </define>
  <sequence>
    <form define="value">
      <input type="value" name="value" id="value" class="value">value</input>
      <input type="value" name="value" id="value" class="value">value</input>
      <click type="value" name="value" id="value" class="value"/>
    </form>
    <navigate>
      <click type="value" id="value" class="value"/>
    </navigate>
  </sequence>
</config>
In the define tag, a user will be able to “describe” a form. For example, if I’m using a login form, I could do something like this:
<define>
  <login>
    <username>MyUserName</username>
    <password>MyPassword</password>
  </login>
</define>
It is important to note that the tags “username” and “password” are spelled exactly the same way as they are on the form. So, when I call the sequence tag, it would look like this.
<sequence>
  <form define="login">
    <input type="text" name="username"/>
    <input type="password" name="password"/>
    <click type="submit" name="submit"/>
  </form>
</sequence>
In the form tag, I reference the login tag from define *the example just before this one*. Calling the input tags without a value tells the engine to use the data provided from the define section.
The navigate tag is used to automatically click links. More to come later.
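The lookup described above, where an input tag without a value falls back to the matching tag in the define section, can be sketched with the standard library. This is not the engine itself, just an illustration: resolve_form_fields is a made-up name, and the domain, proxy and stdout values in the sample are placeholders.

```python
import xml.etree.ElementTree as ET

# Sample configuration; the attribute values on <config> are placeholders.
CONFIG = """
<config domain="example.com" proxy="" stdout="true">
  <define>
    <login>
      <username>MyUserName</username>
      <password>MyPassword</password>
    </login>
  </define>
  <sequence>
    <form define="login">
      <input type="text" name="username"/>
      <input type="password" name="password"/>
      <click type="submit" name="submit"/>
    </form>
  </sequence>
</config>
"""


def resolve_form_fields(config_xml):
    """Return one {field name: value} dict per <form> in the sequence.

    Hypothetical helper: an <input> with its own text keeps that text,
    an empty one takes the value of the same-named tag under <define>.
    """
    root = ET.fromstring(config_xml)
    forms = []
    for form in root.findall('./sequence/form'):
        define_name = form.get('define')
        defined = root.find('./define/%s' % define_name) if define_name else None
        fields = {}
        for field in form.findall('input'):
            name = field.get('name')
            if not name:
                continue  # nothing to look up without a name
            value = (field.text or '').strip()
            if not value and defined is not None:
                value = defined.findtext(name, default='')
            fields[name] = value
        forms.append(fields)
    return forms


if __name__ == '__main__':
    print(resolve_form_fields(CONFIG))
```

Keeping the user data in define and the page structure in sequence means the same rule-set can be reused with different credentials.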
Not cars.
I’m talking about a blog that blogs for you. The idea is fairly simple; however, there’s a fine line between “stealing” and “syndicating”. My understanding of this principle is that you can “reblog” anything in an RSS feed, as long as you provide a backlink. So, let’s continue.
Pick a niche, I’m going to pick the “Pittsburgh Pirates”.
Now that I’ve identified a niche, I register a domain with “Pittsburgh Pirates” in the url. Always include your main keyword in your domain.
Content time! Head on over to http://blogsearch.google.com/ and grab some feeds. Do some searches: “Pittsburgh Pirates”, “Pirates”, “Pirates Baseball”, “Pittsburgh Baseball”…you get the idea. Make sure the RSS feeds have enough information in them to keep people coming back.
Next step, install WP for your blog. Grab a plugin that allows you to post syndicated content. “FeedWordPress” or “WP-Autoblog” will do the trick.
Insert rss feeds into the plugin that you’ve downloaded, and let it do the magic.
This is the most important part of the whole process; skip this step and you can expect to make $0. Optimize your site for search engines. Change the footer, and add some anchor tags for your topic and your tags. Before all of your posts, write one static, SEO-optimized post that includes every search term that you are trying to own. Don’t get too aggressive on this, just pick 5 or so. Anchor link. Make images that link to your categories. Include a cool image on your main page. All of a sudden, people spend more time on your site, and you’ve got people clicking links.
Last thing, socially submit your site, and submit your feeds to RSS aggregators. Do ~1 per day. Initially, be very slow to add your URL anyplace; Google will penalize you for adding your domain too fast.
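One way to judge whether a feed has “enough information” before adding it to the plugin is to count its items and look at how long the descriptions are. The sketch below assumes a plain RSS 2.0 feed that has already been downloaded into a string; feed_summary and the sample feed are made up for illustration.

```python
import xml.etree.ElementTree as ET

# Tiny made-up RSS 2.0 document standing in for a downloaded feed.
SAMPLE_FEED = """
<rss version="2.0">
  <channel>
    <title>Example feed</title>
    <item><title>First</title><description>abcd</description></item>
    <item><title>Second</title><description>ab</description></item>
  </channel>
</rss>
"""


def feed_summary(rss_xml):
    """Count the items in an RSS 2.0 feed and average their description length.

    Hypothetical helper; Atom feeds use different tag names and are not handled.
    """
    root = ET.fromstring(rss_xml)
    items = root.findall('./channel/item')
    lengths = [len((item.findtext('description') or '').strip()) for item in items]
    average = sum(lengths) / len(lengths) if lengths else 0
    return {'items': len(items), 'avg_description_chars': average}


if __name__ == '__main__':
    print(feed_summary(SAMPLE_FEED))
```

A feed with only a handful of items, or descriptions a few characters long, probably won’t keep anyone coming back.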
I hope you make millions!
Freedom. Whatever that means. But it’s been beaten into my head that we have it. Freedom hints at the notion of “anything”. Wow…Anything!? I can be/do/go anything I want, as long as I’m not hurting anybody, and it’s cool. Sounds awesome?
Every time I get to know somebody, I find that they are a lover of substance. Whether they are a drunk, a pill popper or a hard core street junkie, they NEED it. They really don’t know why. Nobody knows why, but we need a crutch. Maybe we need to rethink our social structure. Maybe “anything” isn’t so awesome.
Capitalism! Democracy! Freedom! Let’s revisit “anything”. I’m brought up to believe that I’ll be rich/famous/beautiful/awesome/INSERT WORD HERE/ and I’m eating it up. But wait! I get my first job, and realize that I’m but a pawn in a huge SCHEME. Yes, a SCHEME. It appears that my hard work goes to furnish a SWEET LIFE for a very small percentage of the population. Bizarre? Yeah, that’s not fair. I was the one that developed that flagship product, and you gave me a $0.23 raise, while making yourself millions. But you didn’t forget about your FLUNKIES. You found it in your heart to give them $250,000. The “taker of the progress report” is a HIGHLY paid position. I like how you have a special lunch in my team’s honor. You assure us that WE MADE THE COMPANY WHAT IT IS! I wonder how our names were omitted from the publications. Oh wait, there it is. Is that a footnote? Well, at least it’s there.
Let me give you 100%. Take your 20% cut of the profits. I’ll take my <1% cut of the profits. I question you about the fairness, and you retaliate: “YOU ARE LUCKY TO HAVE A JOB!” Wow, I feel greedy. Why do I ask for so much? I’m lucky, I’m fortunate, that’s what you tell me. In the back of my mind, I KNOW that I can go anywhere else and they’ll pay me what I’m worth. I must have forgotten about capitalism. Did I forget that I was but a pawn in a far “greater” scheme? I’m not worth a percentage, but rather the “competitive salary”. Translating to “We don’t pay much, but if we word it like this, you’ll think we do”.
So, let’s go to our jobs, work our best, and make a grand future for the “haves” of society. For without us, they’d be one of us.
This is a basic MediaWiki scraper that pulls out all readable strings in “p” tags. Since MediaWiki disallows scrapers, I used Mechanize.
Usage: python mediaWikiScrape.py http://en.wikipedia.org/wiki/High-level_programming_language
Download mediaWikiScrape.py
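The “readable strings in p tags” part can be sketched without any third-party library. This is not the contents of mediaWikiScrape.py: it skips the fetching step entirely and assumes the page HTML is already in a string, and the names ParagraphText and paragraph_strings are mine.

```python
from html.parser import HTMLParser


class ParagraphText(HTMLParser):
    """Collect the visible text found inside <p> elements."""

    def __init__(self):
        super().__init__()
        self._inside = False
        self._chunks = []
        self.paragraphs = []

    def handle_starttag(self, tag, attrs):
        if tag == 'p':
            # A new <p> implicitly closes any paragraph still open.
            self.flush_paragraph()
            self._inside = True

    def handle_endtag(self, tag):
        if tag == 'p':
            self.flush_paragraph()
            self._inside = False

    def handle_data(self, data):
        if self._inside:
            self._chunks.append(data)

    def flush_paragraph(self):
        # Join the pieces and collapse runs of whitespace.
        text = ' '.join(''.join(self._chunks).split())
        if text:
            self.paragraphs.append(text)
        self._chunks = []


def paragraph_strings(html):
    """Return the text of every <p> in html, in document order."""
    parser = ParagraphText()
    parser.feed(html)
    parser.close()
    parser.flush_paragraph()  # catch a final <p> that was never closed
    return parser.paragraphs


if __name__ == '__main__':
    page = '<h1>Title</h1><p>A <b>high-level</b> language.</p><div>menu</div>'
    print(paragraph_strings(page))
```

Text inside inline tags such as b or a is kept, since it is still part of the paragraph; anything outside a p tag is ignored.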
Growing up, I always made the assumption that people were fairly intelligent and able to make good decisions. This was my non-media, friends of the family bias. I realized that people made a decision and followed through with it, based on some form of decision making. My observations proved that people were intelligent.
As I began to mature, I noticed that people had extreme views on certain issues. I didn’t understand why people got so angry when they talked about these things. I began noticing that the media (printed, radio, TV) played a huge role in people’s lives. I was able to conclude that people who like “X” flocked together. I noticed that “X” was often a very unappealing subject, or at least in my small mind. The bad thing about “X” was that “X” could be something very stupid, racist, hateful and so on. The most amazing thing about “X” was that it created groups and subcultures.
Past my teenage years, I really started to see the truth about “X”. I noticed that I was able to draw many conclusions about the masses. I hypothesized that the more shallow “X” is, the more people are drawn to it. This hypothesis contradicts my childhood belief in my conceived notion of intelligence. The media speaks to the world, and the masses are led by an “X”.
Somebody once said, “Nobody ever went broke underestimating the stupidity of the American People” *or something to that tune*. The internet has given every individual the ability to be their own media outlet. There are billions and billions *Carl Sagan* of dollars to be made from the many “X”s. Just turn on the TV at prime-time, and get out a notebook. You’ll be amazed at your findings.
When you sign up for a website, there’s a good chance that you need to validate your email account. I have a junk email address that I use for such purposes. Using the alternate email saves me from spam, but I still need to physically log into the email account and grab the confirmation. I’ve automated this function, minus the link click. The following code will log into your Gmail account and grab your confirmation link. Please note that you can change the folder from the inbox to junk, since confirmation emails sometimes end up there.
Download Email Search
Requires:
libgmail
Beautiful Soup
Sample:
>>> from EmailSearch import getConfirmationLink
>>> getConfirmationLink('google@gmail.com', 'password', 'blackcodeseo.com')
u'http://www.blackcodeseo.com/validate.php?id=218393923'
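The last step, picking the confirmation link out of a message, can be sketched on its own. This is not the EmailSearch code and does no logging in: it assumes you already have the HTML body of the email as a string, and first_link_for_domain is a hypothetical name.

```python
from html.parser import HTMLParser
from urllib.parse import urlparse


class LinkCollector(HTMLParser):
    """Collect the href of every <a> tag, in document order."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == 'a':
            href = dict(attrs).get('href')
            if href:
                self.links.append(href)


def first_link_for_domain(html_body, domain):
    """Return the first link pointing at domain (or a subdomain), else None."""
    collector = LinkCollector()
    collector.feed(html_body)
    collector.close()
    domain = domain.lower()
    for href in collector.links:
        host = (urlparse(href).hostname or '').lower()
        if host == domain or host.endswith('.' + domain):
            return href
    return None


if __name__ == '__main__':
    body = ('<p>Welcome!</p>'
            '<a href="http://www.blackcodeseo.com/validate.php?id=218393923">Confirm</a>')
    print(first_link_for_domain(body, 'blackcodeseo.com'))
```

Matching on the host name rather than searching the raw text avoids picking up links that merely mention the domain in a query string.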
I’ve started a forum: Black Hat SEO Forum
It’s open to the public, please become a member and share your knowledge.
Remember a few years ago when Google had an API allowing for searches from within an application? Then they decided to ditch the project. I wrote an implementation of the old Google search API with one modification: I put no limit on the number of results. Thank you, Andy Pavlo, for your help with this project.
You’ll need:
Beautiful Soup
Mechanize
Download Google Search API For Python
Sample implementation:
>>> from Google import Google, search
>>> results = search('blackcodeseo.com', 3)
>>> for result in results:
...     print 'Title: %s' % (result.title())
...     print 'Url: %s' % (result.url())
...     print 'Description: %s' % (result.description())
...     print
...
Title: Black Code SEO
Url: http://blackcodeseo.com/
Description: Oct 29, 2008 … Black Code SEO. Programatically Automating SEO … Download BlackCodeSeo Navigator. Run python setup.py install …
Title: Have A Question?
Url: http://blackcodeseo.com/have-a-question/
Description: If you have any questions about anything, you can reach me at matt@blackcodeseo. com and I will be happy to reply. Your questions may be posted on the site …
Title: SpiderMonkey « Didier Stevens
Url: http://blog.didierstevens.com/programs/spidermonkey/
Description: The exact post is http://blackcodeseo.com/python-spidermonkey-navigator/. Comment by Matt — Wednesday 29 October 2008 @ 20:56. Thanks. …
>>>
I’ve implemented a very simple automated comment poster. If you don’t make seemingly useful comments, don’t plan on getting too far, as most people moderate their comments. The code WILL fail if you do not note the following:
At the bottom of the file you’ll see a few variables that you’ll need to set.
blogUrl = """HTTP://WWW.YOUR_URL.COM"""
Your blogUrl should be in the http://DOMAIN.com format.
keyword = """YOUR KEYWORD"""
You need to change that to the keyword that you want. This will find all WordPress blogs similar to yours.
results = 50
You can turn this up to 100 or more.
{'author' : """YOUR AUTHOR NAME""", 'email' : """YOUR@EMAIL.COM""", 'comment' : """YOUR COMMENT"""},
This line can be edited to your liking, and duplicated as many times as you like. I would suggest making at least 20 of these, all a bit different from the last one. Your posts will NOT work if your ‘email’ isn’t in the form of blah@blah.com.
Download the script commentonwordpressblogs.py