Showing posts with label ReferenceProperty. Show all posts

Thursday, December 25, 2008

Cached ReferenceProperty

Piece of cake

Earlier I wrote about my wish to subclass ReferenceProperty so the collection would not be fetched every time I iterate through it. Well, it was so easy I can post the whole implementation here.
from google.appengine.ext import db

class CachedReferenceProperty(db.ReferenceProperty):

  def __property_config__(self, model_class, property_name):
    super(CachedReferenceProperty, self).__property_config__(model_class,
                                                       property_name)
    #Just carelessly override what super made
    setattr(self.reference_class,
            self.collection_name,
            _CachedReverseReferenceProperty(model_class, property_name,
                self.collection_name))

class _CachedReverseReferenceProperty(db._ReverseReferenceProperty):

    def __init__(self, model, prop, collection_name):
        super(_CachedReverseReferenceProperty, self).__init__(model, prop)
        self.__collection_name = collection_name

    def __get__(self, model_instance, model_class):
        if model_instance is None:
            return self
        if self.__collection_name in model_instance.__dict__:# why does it get here at all?
            return model_instance.__dict__[self.__collection_name]

        query=super(_CachedReverseReferenceProperty, self).__get__(model_instance,
            model_class)
        #replace the attribute on the instance
        res=[c for c in query]
        model_instance.__dict__[self.__collection_name]=res
        return res

    def __delete__(self, model_instance):
        # Drop the cached list; pop() avoids a KeyError when nothing is cached yet.
        model_instance.__dict__.pop(self.__collection_name, None)
With these classes in place, the previous example can be rewritten as:
class Master(db.Model):
  pass

class Detail(db.Model):
  master=CachedReferenceProperty(Master)
Run the same loop and you will see it finishes almost instantly, even with 100,000 iterations instead of 1,000.
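To see the effect without App Engine, here is a minimal sketch in plain Python. CountingBackend and CachedCollection are made-up stand-ins for the datastore and for _CachedReverseReferenceProperty, not part of any library; the only point is to count how often the backend is hit.

```python
class CountingBackend(object):
    """Hypothetical stand-in for the datastore; counts how often it is queried."""

    def __init__(self):
        self.rows = []
        self.fetches = 0

    def fetch(self):
        self.fetches += 1
        return list(self.rows)


class CachedCollection(object):
    """Descriptor that fetches once per instance and caches in the instance __dict__."""

    def __init__(self, name):
        self.name = name

    def __get__(self, instance, owner):
        if instance is None:
            return self
        if self.name not in instance.__dict__:
            instance.__dict__[self.name] = instance.backend.fetch()
        return instance.__dict__[self.name]

    def __delete__(self, instance):
        # Mirrors the post's __delete__: forget the cached list.
        instance.__dict__.pop(self.name, None)


class Master(object):
    detail_set = CachedCollection("detail_set")

    def __init__(self, backend):
        self.backend = backend


backend = CountingBackend()
backend.rows.append("d1")
m = Master(backend)

for _ in range(1000):
    for _detail in m.detail_set:
        pass

fetches_after_loop = backend.fetches  # the backend was queried once, not 1000 times
```

The real speed-up depends on how expensive the datastore query is; the sketch only shows that the number of fetches drops to one per instance.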

Is it a free cake?

Not exactly. Try this:
m=Master()
m.put()
d1=Detail(master=m)
d1.put()
print m.detail_set
d2=Detail(master=m)
d2.put()
print m.detail_set
The second print returns a stale result that does not include d2. So we need a way to reset the cached value and fetch up-to-date values from the datastore. Fortunately, that is easy to do:
del m.detail_set
print m.detail_set
This is why I implemented _CachedReverseReferenceProperty.__delete__. When m.__dict__ has no key 'detail_set', m.detail_set is dispatched to type(m).__dict__['detail_set'], and there I call the base class to access the datastore. What surprised me is that even when m.__dict__['detail_set'] exists, m.detail_set is still dispatched to Master.__dict__['detail_set']. I don't understand why that happens, so I worked around the problem by checking the instance dictionary inside __get__. I need to learn Python better to answer that question.
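One plausible explanation is Python's descriptor lookup order: a descriptor that defines __set__ or __delete__ is a data descriptor, and data descriptors are consulted before the instance __dict__. Since _CachedReverseReferenceProperty defines __delete__, it falls into that category. A minimal sketch in plain Python (no App Engine; all class names here are made up for illustration):

```python
class NonDataDescriptor(object):
    """Defines only __get__, so an instance __dict__ entry shadows it."""

    def __get__(self, instance, owner):
        if instance is None:
            return self
        return "from descriptor"


class DataDescriptor(object):
    """Defines __get__ and __delete__, so it wins over the instance __dict__."""

    def __get__(self, instance, owner):
        if instance is None:
            return self
        return "from descriptor"

    def __delete__(self, instance):
        instance.__dict__.pop("attr", None)


class WithNonData(object):
    attr = NonDataDescriptor()


class WithData(object):
    attr = DataDescriptor()


a = WithNonData()
a.__dict__["attr"] = "from instance"
non_data_result = a.attr  # the instance __dict__ entry is used

b = WithData()
b.__dict__["attr"] = "from instance"
data_result = b.attr      # the descriptor's __get__ is still called
```

If that is indeed what happens here, the check inside __get__ is not really a workaround but the normal way for a data descriptor to keep a per-instance cache.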

Wednesday, December 24, 2008

AppEngine Datastore and memcache

I miss Hibernate collections. In the following code I access the collection a thousand times:

class Master(db.Model):
  pass

class Detail(db.Model):
  master=db.ReferenceProperty(Master)

m=Master()
m.put()
d=Detail(master=m)
d.put()

for i in range(1000):
  for tmp_d in m.detail_set:
    pass

The above code takes a few seconds to execute. The reason is that Datastore fetches the collection from storage on every access, whereas in Hibernate the collection would be fetched from the database only once per session. Oops, there are no sessions in Datastore. So the Datastore developers were right to fetch the collection every time: they have no way of knowing when the details change.

This is why Master cannot be put in memcache effectively: it would be stored without the Details, because Master.detail_set holds only the definition of the query needed to get the details. So I'm thinking of a way to decorate ReferenceProperty to make one-to-many relations suitable for memcache, so that big object trees are read from Datastore once and are then quickly accessible.
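The general shape of what I have in mind can be sketched in plain Python. Everything here is an assumption for illustration: a dict stands in for memcache, another dict stands in for Datastore, and get_master_tree is a made-up helper. The point is only that the details are materialized into a list before the tree is serialized, so the cached copy carries them along.

```python
import pickle

FAKE_DATASTORE = {"m1": ["d1", "d2"]}  # hypothetical: master key -> detail rows
fake_memcache = {}                      # hypothetical stand-in for memcache
datastore_reads = 0


def load_details(master_key):
    """Pretend datastore query; counts how often it runs."""
    global datastore_reads
    datastore_reads += 1
    return list(FAKE_DATASTORE[master_key])


def get_master_tree(master_key):
    """Return the master together with its details, hitting the datastore once."""
    cached = fake_memcache.get(master_key)
    if cached is not None:
        return pickle.loads(cached)
    # Materialize the details so they are stored with the master.
    tree = {"master": master_key, "details": load_details(master_key)}
    fake_memcache[master_key] = pickle.dumps(tree)
    return tree


first = get_master_tree("m1")   # reads the fake datastore
second = get_master_tree("m1")  # served from the fake cache
```

The cached tree has the same staleness problem as any cache: it would need to be dropped or refreshed whenever a Detail is added or changed.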