Archive for the ‘Google’ Category
WebDriver for logging into Twitter
No real reason for choosing Twitter apart from it’s cool.
The code below uses unittest to run. It creates a new WebDriver object and uses it to fetch http://twitter.com and submit a username and password. Once the details are added, it “clicks” the “Sign In” button to log in to Twitter.
#!/usr/bin/env python
import unittest
import logging
from webdriver_firefox.webdriver import FirefoxLauncher
from webdriver_firefox.webdriver import WebDriver


class TwitterTests(unittest.TestCase):

    def test_login_twitter(self):
        driver = WebDriver()
        driver.get("http://twitter.com")

        # find our elements - the html on the page with same ids
        username_element = driver.find_element_by_id('username')
        password_element = driver.find_element_by_id('password')
        # use this to toggle the remember me box
        remember_me_element = driver.find_element_by_id('remember')

        # type into the boxes
        username_element.send_keys('yourusername')
        password_element.send_keys('yourpassword')
        remember_me_element.toggle()

        # click Sign In and we should be logged in
        driver.find_element_by_id('signin_submit').click()

        # check that the title of the page is correct to see if we logged in
        self.assertEqual(driver.get_title(), 'Twitter / Home')

        # Extract from the html using xpath to find username and updates of the people on the screen
        updates = driver.find_elements_by_xpath("//span[@class='entry-content']")
        user = driver.find_elements_by_xpath("//a[@class='screen-name']")

        # display in the terminal the name and the update
        for i, update in enumerate(updates):
            print user[i].get_text() + ": " + update.get_text()

        # uncomment the following to close the window and finish
        #driver.quit()


if __name__ == "__main__":
    logging.basicConfig(level=logging.INFO)
    unittest.main()
The docs are not great for WebDriver but reading the source is pretty simple. Being able to mentally parse Java to Python is also a big advantage!
Odd Google App Engine Issue
I was having issues getting a URL with urlfetch.fetch(url). It kept failing with:
[snip]
  File "/home/channam/Code/python/google_appengine/google/appengine/api/urlfetch.py", line 241, in fetch
    return rpc.get_result(allow_truncated)
  File "/home/channam/Code/python/google_appengine/google/appengine/api/urlfetch.py", line 388, in get_result
    self.check_success(allow_truncated)
  File "/home/channam/Code/python/google_appengine/google/appengine/api/urlfetch.py", line 356, in check_success
    raise DownloadError(str(e))
DownloadError: ApplicationError: 2
A little bit of poking found that the issue was caused by having a space in the URL, something which I’m fairly certain was OK on early versions of GAE. Oh well, you live and learn.
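One way to guard against this is to percent-encode the path and query before handing the URL to urlfetch.fetch. The following is only a sketch: it is written for Python 3's urllib.parse (on the Python 2 runtime the equivalent function is urllib.quote), and the helper name escape_url and the example URL are made up for illustration.

```python
from urllib.parse import quote, urlsplit, urlunsplit


def escape_url(url):
    """Percent-encode unsafe characters (such as spaces) in the path and query."""
    parts = urlsplit(url)
    # "%" is kept safe so that already-escaped sequences are not escaped twice.
    path = quote(parts.path, safe="/%")
    query = quote(parts.query, safe="=&%+")
    return urlunsplit((parts.scheme, parts.netloc, path, query, parts.fragment))


# Hypothetical URL containing spaces in both the path and the query string.
safe_url = escape_url("http://example.com/some band/events?name=iron maiden")
```

The result can then be passed to urlfetch.fetch in place of the raw string.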
bit.ly for the win
I got my Google App Engine library featured on the list of entries for bit.ly’s competition; see the bit.ly competition. Admittedly it’s a small bit of code, but I hope someone might find a use for it.
But I’m still waiting for swag.
Forms in App Engine
A handy hint from an on-the-ball App Engine fella: how to extend the StringProperty class so that it will render as a password field.
App Engine and utf-8 Encoding
You may or may not have seen the error:
<type 'exceptions.UnicodeDecodeError'>: 'ascii' codec can't decode byte 0xc3 in position 2223: ordinal not in range(128)
    args = ('ascii', '<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Tra… Engine" />\n\t\t</div>\n\n\t</div>\n\n</body>\n\n</html>\n\n', 2223, 2224, 'ordinal not in range(128)')
    encoding = 'ascii'
    end = 2224
    message = ''
    object = '<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Tra… Engine" />\n\t\t</div>\n\n\t</div>\n\n</body>\n\n</html>\n\n'
    reason = 'ordinal not in range(128)'
    start = 2223
This had me foxed, as I was fetching band names which sometimes had a fancy character in them: Motörhead for example.
To allow the string to be rendered, use the following:
unicode_string = unicode(string_with_char_init)
self.response.out.write(unicode_string.encode('utf-8'))
That’s it! For App Engine, that works both for rendering to the page and for use in urlfetch.fetch.
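To see where byte 0xc3 comes from, here is a small sketch in Python 3 syntax (where str is what Python 2 calls unicode). It is not App Engine code, just an illustration of the encode and decode steps using the band name from above.

```python
name = "Mot\u00f6rhead"  # the band name from above, with o-umlaut

# Encoding to UTF-8 turns the o-umlaut into the two bytes 0xc3 0xb6.
encoded = name.encode("utf-8")

# Decoding those bytes as ASCII fails on 0xc3, the same complaint as above.
try:
    encoded.decode("ascii")
    ascii_error = None
except UnicodeDecodeError as error:
    ascii_error = error

# Decoding with the matching codec round-trips cleanly.
decoded = encoded.decode("utf-8")
```

The position reported in the error is simply the offset of the first non-ASCII byte in the data being decoded.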
bit.ly Competition Entry
Below is the raw code to make a Google App Engine application with the really basic bit.ly API. There’s currently no error checking etc.
If you just want the functionality, use the BitLy class in your App Engine code. Currently it just returns the parsed JSON for you to use. So, for example, with shorten, to access the hash you would use: json['results'][urlentered]['hash']. Replace urlentered with the URL you supplied.
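Because there is no error checking, a missing key in that lookup raises a KeyError. A rough sketch of a more forgiving lookup, assuming only the results / URL / hash nesting described above; extract_hash, the sample dictionary and its "abc123" value are hypothetical placeholders, not real bit.ly output.

```python
def extract_hash(response, long_url):
    """Return the hash for long_url, or None if any level is missing."""
    entry = response.get("results", {}).get(long_url, {})
    return entry.get("hash")


# Hypothetical, trimmed-down response; a real one carries more fields.
sample = {"results": {"http://www.chrishannam.co.uk": {"hash": "abc123"}}}

found = extract_hash(sample, "http://www.chrishannam.co.uk")
missing = extract_hash({}, "http://www.chrishannam.co.uk")
```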
Any questions please email ch at chrishannam dot co dot uk
import cgi
from django.utils import simplejson
from google.appengine.ext import webapp
from google.appengine.ext.webapp.util import run_wsgi_app
from google.appengine.ext.webapp import template
from google.appengine.api import urlfetch


class Index(webapp.RequestHandler):
    """Demo handler: writes the result of each API call to the page."""

    def get(self):
        EXPAND = "expand"
        SHORTEN = "shorten"
        INFO = "info"
        STATS = "stats"
        ERRORS = "errors"
        bitly = BitLy('your_login', 'your_apikey')
        self.response.out.write('')
        self.response.out.write(bitly.expand('31IqMl'))
        self.response.out.write(bitly.shorten('http://www.chrishannam.co.uk'))
        self.response.out.write(bitly.info('31IqMl'))
        self.response.out.write(bitly.stats('http://bit.ly/31IqMl'))
        self.response.out.write(bitly.errors())
        self.response.out.write('')


class BitLy(object):
    """Thin wrapper around the bit.ly API; each method returns the parsed JSON."""

    def __init__(self, login, apikey):
        self.login = login
        self.apikey = apikey

    def expand(self, param):
        # param is the hash part of a short URL, e.g. '31IqMl'
        request = "http://api.bit.ly/expand?version=2.0.1&shortUrl=http://bit.ly/"
        request += param
        request += "&login=" + self.login + "&apiKey=" + self.apikey
        result = urlfetch.fetch(request)
        json = simplejson.loads(result.content)
        return json

    def shorten(self, param):
        # only add the scheme if the caller left it off, so that
        # 'http://www.example.com' does not become 'http://http://...'
        url = param
        if not param.startswith("http://"):
            url = "http://" + param
        request = "http://api.bit.ly/shorten?version=2.0.1&longUrl="
        request += url
        request += "&login=" + self.login + "&apiKey=" + self.apikey
        result = urlfetch.fetch(request)
        json = simplejson.loads(result.content)
        return json

    def info(self, param):
        # param is the hash part of a short URL
        request = "http://api.bit.ly/info?version=2.0.1&hash="
        request += param
        request += "&login=" + self.login + "&apiKey=" + self.apikey
        result = urlfetch.fetch(request)
        json = simplejson.loads(result.content)
        return json

    def stats(self, param):
        # param is the full short URL, e.g. 'http://bit.ly/31IqMl'
        request = "http://api.bit.ly/stats?version=2.0.1&shortUrl="
        request += param
        request += "&login=" + self.login + "&apiKey=" + self.apikey
        result = urlfetch.fetch(request)
        json = simplejson.loads(result.content)
        return json

    def errors(self):
        # plain assignment here: request has no earlier value to append to
        request = "http://api.bit.ly/errors?version=2.0.1&login=" + self.login + "&apiKey=" + self.apikey
        result = urlfetch.fetch(request)
        json = simplejson.loads(result.content)
        return json


application = webapp.WSGIApplication(
    [('/', Index)],
    debug=True)


def main():
    run_wsgi_app(application)


if __name__ == "__main__":
    main()
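The five methods differ only in the endpoint name and one parameter, and they paste that parameter into the URL unescaped. A possible tidy-up, sketched in Python 3 syntax (on Python 2, urlencode lives in the urllib module): a single helper that builds the request URL and lets urlencode do the escaping. The name build_request and the two constants are my own; the endpoint names, version and parameter names are taken from the code above.

```python
from urllib.parse import urlencode

API_ROOT = "http://api.bit.ly/"
API_VERSION = "2.0.1"


def build_request(method, login, apikey, **params):
    """Build the request URL for one bit.ly API method."""
    query = [("version", API_VERSION)]
    # Method-specific parameters, e.g. longUrl, shortUrl or hash.
    query.extend(sorted(params.items()))
    query.extend([("login", login), ("apiKey", apikey)])
    # urlencode percent-escapes every value, so a long URL containing
    # "&", "?" or spaces cannot break the surrounding query string.
    return API_ROOT + method + "?" + urlencode(query)


shorten_url = build_request("shorten", "your_login", "your_apikey",
                            longUrl="http://www.chrishannam.co.uk")
errors_url = build_request("errors", "your_login", "your_apikey")
```

Each method in BitLy could then shrink to building the URL this way, fetching it and parsing the JSON.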
Woe is pylast.py
Well, it tested OK on my local machine, but deploying it to Google created some issues. In short, it’s just CPU hungry:
01-04 03:39PM 50.602
This request used a high amount of CPU, and was roughly 2.1 times over the average request CPU limit. High CPU requests have a small quota, and if you exceed this quota, your app will be temporarily disabled.
I removed some of the XML processing, as by default it gets everything you might need. This sped it up slightly, but the fatal 500 error still appeared.
Well for once it appears I was right to reinvent the wheel.
Making pylast.py play with Google App Engine
I have been playing with Google App Engine and last.fm’s API for a while now. I made the standard mistake of not checking if anyone else had written a library in Python to do the hard work for me. So, after a little googling I found pyLast, which is a great piece of work by Amr Hassan. After a little playing I found that it didn’t play well with App Engine. This was down to it not using urlfetch, which is no big surprise as that’s a feature unique to App Engine. I also noticed it was missing the ability to fetch the date and start time of an event.
So below is a patch to make the code work on App Engine and fetch the date/time of an event. There is a slight oddity I have yet to figure out: the time gets appended to the date. I can’t see any sane reason why currently.
Be warned: this breaks the module for standard Python use unless you have google.appengine.api kicking around in your module path.
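One way the breakage could be avoided is to import urlfetch only when it is available and fall back to the standard library otherwise. This is a sketch of that idea, not part of the patch: it is written in Python 3 syntax, and fetch_body is a made-up helper name.

```python
import urllib.request

try:
    # Only importable when the App Engine SDK is on the module path.
    from google.appengine.api import urlfetch
except ImportError:
    urlfetch = None


def fetch_body(url):
    """Return the raw response body, using urlfetch when it is available."""
    if urlfetch is not None:
        return urlfetch.fetch(url).content
    # Plain Python fallback using the standard library.
    with urllib.request.urlopen(url) as response:
        return response.read()
```

With something like this in place, the same module could be imported both on App Engine and from a normal Python install.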
If you wish to try out my App Engine app, it’s over at Cassandra. Just enter the name of the artist to find out where they are playing, displayed on Google Maps. It’s very much an ongoing project…
diff pylast.py pylast.py.orig
37d36
< from google.appengine.api import urlfetch
286,287c285,292
< request = 'http://' + API_SERVER + API_SUBDIR + '?method=' + '&'.join(data)
< response = urlfetch.fetch(request)
---
> conn = httplib.HTTPConnection(API_SERVER)
> headers = {
> "Content-type": "application/x-www-form-urlencoded",
> 'Accept-Charset': 'utf-8',
> 'User-Agent': __name__ + '/' + __version__
> }
> conn.request('POST', API_SUBDIR, '&'.join(data), headers)
> response = conn.getresponse()
292c297
< doc = minidom.parseString(response.content)
---
> doc = minidom.parse(response)
404a410
>
1391,1392d1396
< data['date'] = self._extract(doc, 'startDate')
< data['time'] = self._extract(doc, 'startTime')
1482,1497c1486
<
< def getStartDate(self):
< """Returns the start date of the event """
<
< return self._getCachedInfo('date')
<
< def getStartTime(self):
< """Returns the start time of the event """
<
< return self._getCachedInfo('time')
<
< def getReviewCount(self):
< """Returns the number of available reviews for this event. """
<
< return self._getCachedInfo('reviews')
<
---
>
Banshee’s Database
I have been playing with Banshee, the media player for Linux.
Its database is just a sqlite3 database. This makes getting data out very simple, e.g.
:~$ sqlite3 ~/.config/banshee-1/banshee.db
SQLite version 3.5.9
Enter ".help" for instructions
sqlite> select name from CoreArtists;
Cradle of Filth
ACDC
Alice in Chains
A Perfect Circle
I used -1 on my banshee path as I’m running the latest version, which is not available from the standard repos for Ubuntu.
To talk to the database from Python, use the following:
:~$ python
Python 2.5.2 (r252:60911, Oct 5 2008, 19:24:49)
[GCC 4.3.2] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import sqlite3
>>> conn = sqlite3.connect('.config/banshee-1/banshee.db')
>>> c = conn.cursor()
>>> c.execute("""select name from CoreArtists""")
>>> print c.fetchall()
[(u'Cradle of Filth',), (u'ACDC',), (u'Alice in Chains',), (u'A Perfect Circle',)]
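The session above only works when Python is started from the home directory, because the path is relative. A small sketch of the same query wrapped in a function that expands ~ itself; list_artists is a made-up name, and the table and column names are the ones used above.

```python
import os
import sqlite3


def list_artists(db_path):
    """Return every artist name stored in the CoreArtists table."""
    conn = sqlite3.connect(os.path.expanduser(db_path))
    try:
        rows = conn.execute("select name from CoreArtists").fetchall()
    finally:
        conn.close()
    # Each row is a one-element tuple, so unpack it to get plain strings.
    return [name for (name,) in rows]
```

Calling list_artists("~/.config/banshee-1/banshee.db") should then return the names as a flat list rather than a list of tuples.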
GWT on Ubuntu
Had a hard job getting this to work. But as usual the solution is quite simple:
- Download and unzip http://google-web-toolkit.googlecode.com/svn/tools/redist/mozilla/mozilla-1.7.13.tar.gz
- Add $INSTALLED_MOZILLA (e.g. /code/mozilla-1.7.13) to mozilla-hosted-browser.conf in the GWT directory
- Add $INSTALLED_MOZILLA (e.g. /code/mozilla-1.7.13) to /etc/ld.so.conf.d/libc.conf (can be any file in that dir ending in conf)
- Run ldconfig
- Start your app and have fun!
If you see gwt-linux-1.5.2/libgwt-ll.so: undefined symbol: JS_PropertyStub then your mozilla-hosted-browser.conf needs to be set to the downloaded mozilla from step 1. It looks like the newer Firefox and Seamonkey don’t have this symbol.

