Friday, September 22, 2006

Python: I Love Genshi!

I’ve totally fallen in love with Genshi! It's an XML templating engine for Python. I learned it between midnight and 2 AM one night. By the next day, I was totally productive and totally loving it! I like the fact that template inheritance works so easily, and I love the XPath stuff. It's nice to be mostly free of XSS vulnerabilities, since Genshi escapes substituted values by default. I really didn't like TAL, so I was surprised to find that Genshi was so nice. It’s weird--Genshi is like a superset of all the templating engines, but in a way that is conceptually simple and elegant.

More about Genshi

3 comments:

mike bayer said...

One thing I can say I like about Genshi is that it isn't offensive to me; it has a lot of resemblance to using JSP taglibs, and if I had to use it on a job, I would probably be OK with it. It has a lot of intelligence to it.

However, while I really am hoping for one Python template language that is both the most popular and my favorite (and if it's not Myghty, so be it), I still think Myghty is way ahead of this one in several areas. I have several big concerns:

1. There are two entirely different syntaxes for generating HTML and for generating text. That seems awfully reinvention-y, and it's also no solution at all for a single document that contains XML markup *and* literal text.
2. It says "templates are executed directly", which says to me "this template is parsed every time it is executed" and/or "this template is parsed every time it's loaded from the filesystem". Myghty has it beat, as it generates real Python modules once per template, just the same way large-scale engines like JSP and HTML::Mason do it for the fastest performance amongst arbitrarily large sets of files.
3. I don't see anything about encodings. What do I do with a template that's in cp1250 and wants to output as ISO-8859? And you can't say "XML handles it", because they have a non-XML template language as well.
4. No inheritance. I'll plead ignorance here that I don't exactly understand the py:match directive (although I'm not sure how they can say it's "more powerful" when it's just something different). But once again, it's only in the XML language. I get the impression that they just don't like the inheritance model... which makes me a little antsy, since I think nobody likes the inheritance idea until they play with it for a while (like me).
5. Filesystem hardwiredness. When you do an include, the file is included based on a direct search of the underlying filesystem in relation to the file location of the current document. For starters, this would make it hard to put template includes in another template that was programmatically generated. There doesn't seem to be any way to abstract away the resolution of templates that are called from other templates, or to install a custom resolution model that is used to locate both top-level templates and included templates. This is another area where Myghty has a ton of functionality (it actually has too much...).
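
To make point 2 concrete, here is a minimal sketch of what "generate a real Python module once per template" can look like. This is not Myghty's actual code: the `${name}` placeholder syntax and the names `generate_module_source` and `load_as_module` are hypothetical, invented for illustration, and only the standard library is used.

```python
import importlib.util
import os
import re
import tempfile

# Hypothetical placeholder syntax: ${name}, where name is a plain identifier.
_PLACEHOLDER = re.compile(r"\$\{(\w+)\}")


def generate_module_source(template_text):
    """Translate template text into Python source defining render(context)."""
    lines = ["def render(context):", "    out = []"]
    pos = 0
    for match in _PLACEHOLDER.finditer(template_text):
        literal = template_text[pos:match.start()]
        if literal:
            # Literal text becomes a string constant in the generated code.
            lines.append("    out.append(%r)" % literal)
        # A placeholder becomes a lookup in the context dict.
        lines.append("    out.append(str(context[%r]))" % match.group(1))
        pos = match.end()
    if template_text[pos:]:
        lines.append("    out.append(%r)" % template_text[pos:])
    lines.append("    return ''.join(out)")
    return "\n".join(lines) + "\n"


def load_as_module(name, template_text, directory):
    """Write the generated module once, then import it from disk."""
    path = os.path.join(directory, name + ".py")
    if not os.path.exists(path):  # generate only once per template
        with open(path, "w", encoding="utf-8") as handle:
            handle.write(generate_module_source(template_text))
    spec = importlib.util.spec_from_file_location(name, path)
    module = importlib.util.module_from_spec(spec)
    spec.loader.exec_module(module)
    return module


with tempfile.TemporaryDirectory() as workdir:
    module = load_as_module("hello_tmpl", "Hello, ${name}!", workdir)
    greeting = module.render({"name": "Mike"})
```

The appeal of this design is that rendering runs ordinary compiled Python with no template parsing at request time; the cost is an extra code-generation step to maintain.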

I think Genshi has a lot of appeal, I'm glad a higher-performance alternative to Kid is now available, and it seems like a lot of the things I don't like about it have to do with the environment in which a template runs, as opposed to its syntax.

I'm starting to feel the need for a true "template framework" system whereby everyone's pet syntax can be built on top of a common engine that provides all the hardcore features, such as component caching, module generation, encoding support, etc. In theory, Myghty's port from Mason was meant to do this, in that you could swap in your own Lexer/Compiler/ObjGen to make it happen (but it's only theoretical and in reality would need a ton of cleanup).

Christopher Lenz said...

Hey Mike,

in response to your concerns:

1. The syntaxes are different, but the underlying engine is the same. The syntax is different because the (mostly Kid-inherited) markup template language is designed for HTML/XML templates, and not for plain text. That is a feature: in my experience general-purpose text templating just doesn't work very well for markup. So Genshi's focus is on generating markup, but it also provides simple plain-text templating if you just need to generate a couple of simple plain text emails.

2. Kid compiles templates to Python byte-code, but that doesn't make it faster than Genshi. Part of my motivation for Genshi was to see whether templates really need to be compiled to code to have acceptable performance, and I think Genshi shows that that's not the case. Not having a code generation step makes the design a lot simpler. Genshi may not be the fastest template engine on the block, but the actual performance bottlenecks are in different areas. Also note that templates are only parsed the first time they're loaded, and Python expressions inside templates are compiled down to Python bytecode at parse time. If that's not enough (for example, if you're not using a long-running process), pickling parsed template objects may still be more effective than compiling them.

3. Encodings: yeah, that needs some improvement. But please note that we're only at version 0.3, and for many scenarios, having UTF-8 in-and-out is probably sufficient.

4. I have used inheritance in template engines. I still think that using includes combined with py:match gives you more flexibility. It's kind of like running an XSLT transformation on the template output, only that the transformation is integrated into the render process. While traditional inheritance makes it easy to reuse common template snippets, includes with py:match do that too *and* make it easy for people to customize the output for their sites, without the main template authors having to hardwire hooks for such customizations. The document structure itself provides those hooks.

5. You can use your own custom template loader, for example if you'd like to load templates from a database. I haven't tried that yet though, so it's entirely possible that some of the code would need to be changed to make that work nicely. If you need to programmatically generate snippets that are included in templates, you just generate the snippets themselves (as opposed to templates for those snippets). py:match rules would still pick up those snippets and transform them, as if they were an actual part of the template.
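
The loader idea in point 5 can be sketched roughly as follows. This is not Genshi's loader API: `DictLoader`, `expand`, and the `{% include name %}` syntax are hypothetical stand-ins, and the dict plays the role of a database. The point is only that top-level templates and included templates go through the same resolution object.

```python
import re

# Hypothetical include syntax, used only for this illustration.
_INCLUDE = re.compile(r"\{% include (\S+) %\}")


class DictLoader:
    """Resolves template names from a dict (standing in for a database)."""

    def __init__(self, templates):
        self.templates = templates

    def load(self, name):
        try:
            return self.templates[name]
        except KeyError:
            raise LookupError("template not found: %r" % name)


def expand(name, loader, _depth=0):
    """Load a template and resolve its includes through the same loader."""
    if _depth > 10:
        raise ValueError("includes nested too deeply (possible cycle)")
    source = loader.load(name)
    return _INCLUDE.sub(
        lambda match: expand(match.group(1), loader, _depth + 1), source
    )


loader = DictLoader({
    "page": "<p>{% include greeting %}</p>",
    "greeting": "Hello from the database",
})
page = expand("page", loader)
```

Swapping `DictLoader` for a filesystem-backed or generated-on-the-fly loader would not require touching `expand`, which is the kind of abstraction the question was about.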

All that is not to say that Genshi is the perfect template language. I think it should work pretty well for the majority of web-applications, especially those that need to be customized for different deployments---but that's just my personal opinion, and I am definitely biased ;-)
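
The customization argument from point 4 can be illustrated with a rough analogy using the standard library's ElementTree. This is not how Genshi implements py:match (Genshi applies match rules during rendering rather than on a finished tree), and `apply_match` and `add_site_footer` are hypothetical names. It only shows how the document structure itself can serve as the hook.

```python
import xml.etree.ElementTree as ET


def apply_match(root, path, transform):
    """Call transform(elem) on every element matching an ElementTree path."""
    for elem in root.findall(path):
        transform(elem)
    return root


def add_site_footer(body):
    """A site-specific rule: append a footer to whatever body it is given."""
    footer = ET.SubElement(body, "div", {"id": "footer"})
    footer.text = "Site-specific footer"


# The base page knows nothing about footers and defines no hook for them.
page_tree = ET.fromstring("<html><body><h1>Hello</h1></body></html>")
apply_match(page_tree, "./body", add_site_footer)
result = ET.tostring(page_tree, encoding="unicode")
```

The base page author never had to anticipate the footer; the rule simply targets the `body` element.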

mike bayer said...

hey christopher -

Thanks for the comments. I think you shouldn't close the door totally on Python module generation, though. That Genshi is faster than Kid is not enough of an argument, since Kid was notorious for extreme slowness, and there's even a message on the SourceForge list for it explaining that this is due to its usage of "exec()"... at which I just cringed! exec()?!

Also, if you compile the template and then hold onto it, that's fine. But a site that has many thousands of files (and they do, even though you might argue that they should make better use of templating) will not be able to hold all those files in memory (or it would eventually run out of memory) and will be re-compiling quite a bit. Which brings up the point: what happens if a filesystem has 10000 Genshi templates? Will the template engine just fill up with all those compiled templates in memory?
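
The "parse once, then hold onto it" approach under discussion can be sketched like this. It is not Genshi's implementation: `TinyTemplate`, `load`, and the `${expr}` syntax are hypothetical. Expressions are compiled to code objects with the built-in `compile()` at parse time, and the parsed object is kept in a plain dict that never shrinks, which is exactly the memory concern raised here.

```python
import re

_EXPR = re.compile(r"\$\{(.+?)\}")


class TinyTemplate:
    """Hypothetical mini-template: parsed once, rendered many times."""

    def __init__(self, source):
        self.parts = []  # literal strings and compiled expression code objects
        pos = 0
        for match in _EXPR.finditer(source):
            if match.start() > pos:
                self.parts.append(source[pos:match.start()])
            # Compile the expression to bytecode now, not at render time.
            self.parts.append(compile(match.group(1), "<template>", "eval"))
            pos = match.end()
        if pos < len(source):
            self.parts.append(source[pos:])

    def render(self, **context):
        out = []
        for part in self.parts:
            if isinstance(part, str):
                out.append(part)
            else:
                out.append(str(eval(part, {}, context)))
        return "".join(out)


_parsed = {}  # unbounded: every template ever loaded stays in memory


def load(name, source):
    """Parse a template the first time it is requested, then reuse it."""
    if name not in _parsed:
        _parsed[name] = TinyTemplate(source)
    return _parsed[name]


template = load("hello", "Hello, ${name.upper()}! You have ${n + 1} items.")
output = template.render(name="mike", n=2)
```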

While compilation does complicate things, it can be optional. Myghty does it optionally (though it very much prefers to). It also has a lot of functionality for keeping only a subset of templates in memory at a time. A lot of effort went into all those mechanisms (which are generally unrelated to the template syntax that most people judge a template language on).
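
One common way to keep only a subset of templates in memory is a least-recently-used cache. The sketch below is not Myghty's mechanism; `TemplateCache` and its `compile_func` parameter are hypothetical names, and the "compiler" in the usage lines is a stand-in function.

```python
from collections import OrderedDict


class TemplateCache:
    """Keeps at most max_size compiled templates, evicting the least recently used."""

    def __init__(self, compile_func, max_size=100):
        self.compile_func = compile_func
        self.max_size = max_size
        self.compilations = 0  # counts cache misses, for illustration
        self._items = OrderedDict()

    def get(self, name):
        if name in self._items:
            self._items.move_to_end(name)  # mark as most recently used
            return self._items[name]
        template = self.compile_func(name)
        self.compilations += 1
        self._items[name] = template
        if len(self._items) > self.max_size:
            self._items.popitem(last=False)  # drop the least recently used
        return template

    def __len__(self):
        return len(self._items)


# Stand-in "compiler": a real one would parse or compile the named template.
cache = TemplateCache(lambda name: name.upper(), max_size=2)
for requested in ["a", "b", "a", "c", "b"]:
    cache.get(requested)
```

With 10000 templates and a cache of a few hundred, memory stays bounded, at the price of recompiling templates that fall out of the cache.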

What do you think of the notion of a "common template runtime" on top of which we could ultimately build Myghty, Genshi, Django templates, whatever? You could then swap in whatever kind of backend you want (eval, generated py modules, exec() :) ...). I did put some thought into changing Myghty's component model to be all WSGI to enable that sort of thing, but I think WSGI is not quite appropriate for that.