Wednesday, June 22, 2011

Python: Increasing the Timeouts for urlfetch in Google App Engine

Google App Engine provides a function, google.appengine.api.urlfetch.fetch, for fetching URLs. I believe the other HTTP client libraries are all monkey patched to use that function, which is written to take advantage of Google's infrastructure. The fetch function has a default timeout of 5 seconds. You can set a higher timeout by passing a deadline parameter, but the maximum is 10 seconds. Unfortunately, passing a deadline keyword parameter is difficult when the call to fetch is made by a third-party library, for instance the GData client library.
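
When you control the call site, passing the deadline directly is straightforward. Here is a minimal sketch; the helper name fetch_with_deadline is hypothetical, and the guarded import is only there so the snippet can be loaded outside the App Engine SDK:

```python
try:
    from google.appengine.api import urlfetch
except ImportError:
    urlfetch = None  # Not running under the App Engine SDK.


def fetch_with_deadline(url, deadline=10):
    """Hypothetical helper: fetch url, asking for a longer deadline (seconds)."""
    if urlfetch is None:
        raise RuntimeError("google.appengine.api.urlfetch is not available")
    return urlfetch.fetch(url, deadline=deadline)
```

This only helps for calls you make yourself, which is why it doesn't solve the third-party library case.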

I looked for a way to set the deadline parameter in a more global way, but I couldn't find one by mere inspection of the code. I came up with the following HACK in order to work around this problem:
    # HACK: Monkeypatch google.appengine.api.urlfetch.fetch to increase the
    # deadline. This is used by the various client libraries.
    def _fetch(*args, **kargs):
        from google.appengine.api.urlfetch import _orig_fetch  # Import late.
        kargs["deadline"] = 10
        return _orig_fetch(*args, **kargs)
    _fetch.this_is_the_wrapper = True

    # HACK: Because of the way the dev app server works when reloading code,
    # things are a little tricky here.
    from google.appengine.api import urlfetch
    if not hasattr(urlfetch.fetch, "this_is_the_wrapper"):
        urlfetch._orig_fetch = urlfetch.fetch
        urlfetch.fetch = _fetch
    else:
        assert hasattr(urlfetch, "_orig_fetch")

2 comments:

Peter said...

To increase the timeout of GData urlfetch calls you can always use the run_on_appengine function call.
For details see the documentation:

http://gdata-python-client.googlecode.com/svn/trunk/pydocs/gdata.alt.appengine.html

Shannon -jj Behrens said...

> To increase the timeout of GData urlfetch calls you can always use the run_on_appengine function call.
> For details see the documentation:

Thanks. That's what we ended up doing. However, we had multiple libraries, each making HTTP calls, so we had to figure out how to tweak it for each of the client libraries. I've officially asked the GAE team to add some way to set the value globally.