How To Implement Tagging With TurboGears and SQLObject
I recently read Nadav Samet's nice tutorial, How To Implement Tagging With TurboGears and SQLAlchemy [1]. I have implemented a very similar tagging mechanism for CBlog [2], but with SQLObject [3] instead, so I want to show you that it can be done just as easily there.
In my explanations I keep the examples rather abstract to emphasise that this approach works for any kind of object that can be tagged. You can use it to implement tags for articles, bookmarks, address book entries or whatever you want.
The model
Tag objects
Let's look at the model first. In your application's model.py file, add the following SQLObject class to represent tags:
from sqlobject import SQLObject, SQLObjectNotFound, UnicodeCol, func

class Tag(SQLObject):
    class sqlmeta:
        defaultOrder = 'name'

    name = UnicodeCol(alternateID=True, length=100, notNull=True)

    @classmethod
    def byLabel(cls, value):
        """Retrieve tag in case-insensitive manner."""
        try:
            # compare lowercased versions of the stored and the given label
            return cls.select(func.LOWER(cls.q.name) == value.lower())[0]
        except IndexError:
            raise SQLObjectNotFound(value)
This is a very simple but sufficient representation of a tag. Tags have a single "name" attribute that acts as the tag label and also as the unique identifier, i.e. there can be no two tags with the same name/label.
We also want two labels that differ only in case, for example "Python" and "python", to be stored as the same tag. One solution would be to convert the label to lowercase when saving the tag, but I find it nicer if tags preserve case, so that we can have tag labels like "TurboGears". My solution therefore takes another approach: when a new tag is created, the label is saved with its exact case preserved, but saved tags can be looked up in a case-insensitive manner. This is what the class method byLabel() is for. Later on, when the tags of a taggable object are updated, we can re-use saved tags regardless of whether the user typed the tag label with the same capitalization or not.
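To illustrate the idea independently of SQLObject, here is a minimal sketch using Python's built-in sqlite3 module. The table layout and the helper name find_tag are assumptions made for this illustration only; the query merely approximates the kind of comparison byLabel() asks the database to perform.

```python
import sqlite3

# In-memory database with a hypothetical "tag" table (illustration only).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE tag (id INTEGER PRIMARY KEY, name VARCHAR(100) NOT NULL UNIQUE)"
)
# The label is stored with the capitalization the user typed.
conn.execute("INSERT INTO tag (name) VALUES (?)", ("TurboGears",))


def find_tag(label):
    """Return the stored tag name matching label, ignoring case, or None."""
    row = conn.execute(
        "SELECT name FROM tag WHERE LOWER(name) = ?", (label.lower(),)
    ).fetchone()
    return row[0] if row else None


print(find_tag("turbogears"))  # TurboGears (the stored spelling)
print(find_tag("Django"))      # None (no such tag)
```

Keep in mind that SQLite's built-in LOWER() only folds ASCII characters by default, so labels with non-ASCII letters may need extra care, depending on your database.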
Taggable objects
Now for the objects to which you can attach tags. I dubbed these "Taggable", but of course the class in your actual application will probably have a more descriptive name, e.g. "Article" or "Bookmark" or whatever. I'm also leaving out any other attributes your taggable objects will have (e.g. in the case of an article object, things like "author", "text", "date" etc.), since these depend on the type of object your application needs:
from sqlobject import SQLObject, SQLRelatedJoin

class Taggable(SQLObject):
    tags = SQLRelatedJoin('Tag', orderBy='name')
It's as simple as that. Tags and Taggables have a many-to-many relationship, i.e. a taggable object can have any number of tags attached to it, and a single tag can be attached to as many taggable objects as you like. I am using a SQLRelatedJoin column type here instead of RelatedJoin. It behaves like a normal related join, but when you access the tags attribute of a Taggable, you do not directly get a list of attached tags. You get a SelectResults object instead, which you can convert into a list with the list() function. Using SQLRelatedJoin allows us to do things like taggable.tags.count() to retrieve the number of attached tags without fetching each of these tags from the database. Read more about SQLRelatedJoin and SelectResults objects in the SQLObject documentation [4]. Unfortunately, SQLRelatedJoin (and also SQLMultipleJoin) is not handled correctly by CatWalk, but since we will now implement our own method of attaching tags to taggable objects, we can live with this loss.
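To make the many-to-many relationship more tangible, here is a rough sketch with Python's built-in sqlite3 module. The table and column names (tag_taggable, tag_id, taggable_id) and both helper functions are hypothetical and not necessarily what SQLObject generates; the point is only that counting attached tags can be answered from the intermediate table, without loading any tag rows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE tag (id INTEGER PRIMARY KEY, name TEXT NOT NULL UNIQUE);
    CREATE TABLE taggable (id INTEGER PRIMARY KEY);
    -- hypothetical intermediate table linking the two sides
    CREATE TABLE tag_taggable (
        tag_id INTEGER NOT NULL REFERENCES tag (id),
        taggable_id INTEGER NOT NULL REFERENCES taggable (id)
    );
    INSERT INTO tag (id, name) VALUES (1, 'Python'), (2, 'TurboGears');
    INSERT INTO taggable (id) VALUES (1);
    INSERT INTO tag_taggable (tag_id, taggable_id) VALUES (1, 1), (2, 1);
""")


def count_tags(taggable_id):
    """Count attached tags in the database without fetching any tag row."""
    row = conn.execute(
        "SELECT COUNT(*) FROM tag_taggable WHERE taggable_id = ?",
        (taggable_id,),
    ).fetchone()
    return row[0]


def tag_names(taggable_id):
    """Fetch the names of the attached tags, ordered by name."""
    rows = conn.execute(
        "SELECT tag.name FROM tag "
        "JOIN tag_taggable ON tag.id = tag_taggable.tag_id "
        "WHERE tag_taggable.taggable_id = ? ORDER BY tag.name",
        (taggable_id,),
    )
    return [name for (name,) in rows]


print(count_tags(1))  # 2
print(tag_names(1))   # ['Python', 'TurboGears']
```

The first helper corresponds loosely to taggable.tags.count(), the second to list(taggable.tags).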
The controller
We want to allow the user to add tags to a taggable object by simply entering a comma-separated list of tag labels. (I chose commas as separators rather than spaces, so that we can have tag labels with spaces in them, e.g. "Web development" or "Ruby On Rails".) Suppose we have a form where the user can edit a taggable object. We just add another text entry field with the name "tags" and, in controllers.py, add a few lines to the controller method that handles the form submission. In this example, I call this method update_taggable. It takes the ID of the taggable object and a string with the list of tag labels as arguments:
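from turbogears import controllers, expose
# Taggable and hub come from your application's model.py

class TaggableController(controllers.Controller):
    @expose()
    def update_taggable(self, id, tags='', *args, **kwargs):
        taggable = Taggable.get(id)
        # update any other properties of the taggable object here...
        hub.begin()
        # split the comma-separated string and strip surrounding whitespace
        taglist = [tag.strip() for tag in tags.split(',')]
        taggable.update_tags(taglist)
        hub.commit()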
What this method does is also very simple: the string with the comma-separated tag list is split into a real list of tag labels, and the whitespace around every label is stripped. The resulting list is passed to the update_tags method of the taggable object. We will add this method to the definition of the Taggable class in the next section.
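The splitting step can be tried out on its own. The following is a small sketch; parse_tags is a hypothetical helper name, and unlike the controller code above it drops empty labels right away instead of leaving that to the model.

```python
def parse_tags(raw):
    """Split a comma-separated string into stripped, non-empty labels."""
    labels = [label.strip() for label in raw.split(",")]
    return [label for label in labels if label]


print(parse_tags(" Python, Web development ,, TurboGears "))
# ['Python', 'Web development', 'TurboGears']
```

Because the separator is a comma, labels with spaces in them such as "Web development" survive intact.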
Updating tags
The list of tags given by the user may contain existing tags or new ones, and tags that were formerly attached to the taggable object might have been removed from the list. The following method handles all these cases. We have to add this method to the class Taggable in model.py:
def update_tags(self, taglist):
    """Update tags associated with entry to given iterable of tagnames."""
    old_tagset = set(list(self.tags))
    new_tagset = []
    for tagname in set(taglist):
        # skip empty tag labels
        if not tagname:
            continue
        try:
            # retrieves existing tag regardless of case
            tag = Tag.byLabel(tagname)
            # --> Tag reused
        except SQLObjectNotFound:
            tag = Tag(name=tagname)
            # ---> new Tag created
        # add tag to object if not already attached
        if tag not in old_tagset:
            self.addTag(tag)
        new_tagset.append(tag)
    # remove old tags
    for tag in old_tagset.difference(new_tagset):
        self.removeTag(tag)
The approach adopted by this method is the following:
1. Retrieve the set of tags that are presently attached to the object.
2. Convert the list of tag labels to a set to remove duplicates.
3. Iterate over the set of tag labels and for each label:
   - Look if the tag is already saved in the database and, if yes, use that.
   - If not, create a new tag object.
   - Add the tag to the object, if it's not already in the set of old tags created in step 1.
   - Add the tag to a list of new tags.
4. Compare the list of new tags (A) with the set of old tags (B) and remove every tag from the object that is in B but not in A.
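The same bookkeeping can be sketched without a database, using plain strings instead of Tag objects. This is only an illustration under that assumption: plan_tag_update is a hypothetical helper, and it compares labels case-insensitively in memory where the method above relies on Tag.byLabel(). As a small variation, it also collapses labels within the same input that differ only in case.

```python
def plan_tag_update(current, requested):
    """Return (to_add, to_remove) for moving from current to requested labels.

    Labels are compared case-insensitively and the spelling of an already
    attached label is kept. Both results are sorted lists.
    """
    # map lowercased label -> spelling that is currently attached
    attached = {label.lower(): label for label in current}
    wanted = {}
    for label in requested:
        label = label.strip()
        if not label:
            continue  # skip empty tag labels
        # keep the first spelling seen for each case-insensitive label
        wanted.setdefault(label.lower(), label)
    to_add = sorted(wanted[key] for key in wanted if key not in attached)
    to_remove = sorted(attached[key] for key in attached if key not in wanted)
    return to_add, to_remove


to_add, to_remove = plan_tag_update(
    ["Python", "TurboGears"], ["python", "SQLObject", "", "sqlobject"]
)
print(to_add)     # ['SQLObject']
print(to_remove)  # ['TurboGears']
```

Here "python" matches the attached "Python" and is left alone, "SQLObject" is new, and "TurboGears" is no longer requested, so it would be detached.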
Where to go from here
That's all the code needed to handle tags. Of course, there are several other tag-handling interfaces you can provide to make life easier for your users.
If you want to make it easier for the user to add and remove tags, you can generate a tag cloud from the list of saved tags. When the user clicks on a tag in the tag cloud, add or remove that tag in the "tags" entry field via JavaScript. To see an example of this kind of interface, head over to ma.gnolia [5]. It also sports a nice ajaxian auto-completion feature for the tags text entry field.
Also, you should provide an interface to remove tags from the database or rename them. It might also be useful to allow tagging of several objects at once. Since this can be achieved easily with the existing SQLObject facilities, I will not go further into this topic here.
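As a rough idea of the data side of a tag cloud, the sketch below counts how often each label is used and maps the counts onto a small range of size classes. Everything here is an assumption for illustration: the function name tag_cloud, the plain lists of labels as input, and the linear scaling are not taken from CBlog or any particular library.

```python
from collections import Counter


def tag_cloud(tag_lists, min_size=1, max_size=5):
    """Map each tag label to a size class between min_size and max_size.

    tag_lists is an iterable of label lists, one per tagged object.
    """
    counts = Counter(label for labels in tag_lists for label in labels)
    if not counts:
        return {}
    low, high = min(counts.values()), max(counts.values())
    spread = high - low
    cloud = {}
    for label, count in counts.items():
        if spread == 0:
            size = min_size  # all tags are used equally often
        else:
            # linear interpolation between the rarest and most used tag
            size = min_size + (count - low) * (max_size - min_size) // spread
        cloud[label] = size
    return cloud


cloud = tag_cloud([["Python", "TurboGears"], ["Python"], ["Python", "AJAX"]])
print(cloud["Python"], cloud["AJAX"])  # 5 1
```

In a template, the size class could then be turned into a CSS class or font size for each tag link.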
Lastly, you should add a back reference to your taggable objects to the Tag class, so that you can fetch all objects with a certain tag easily. Just add another attribute to the Tag class in model.py:
entries = SQLRelatedJoin('Taggable')
How you should name this attribute depends on the type of your taggable objects. Instead of the generic entries, something like articles or bookmarks might be more appropriate. Then, if you want to get all objects with the tag "AJAX", you can use this:
tag = Tag.byLabel('AJAX')
entries = list(tag.entries)
That's all for now. I will post the code for a small sample application that implements this pattern later, so stay tuned!
Reader comments
1 Mike said on 03.01.2007 @ 03:28 CET:
Thanks Chris! Very informative and I appreciate you taking the time to share with us.
Keep the posts coming and I’ll be checking back soon.