The json module supports encoding (aka serializing) for all the basic built-in Python types: strings, lists, dictionaries, tuples, etc. But if you have your own user-defined class that you want to store, I found the documentation to be pretty ambiguous. And since I also didn't see any complete examples out there of custom object encoding and decoding, I thought I would post mine here.
Encoding
For encoding, the documentation's not all that bad. It will tell you to implement the default() method in a subclass of json.JSONEncoder; default() takes your object as an argument and returns a serializable object. By serializable, they just mean something in the form of one of the basic serializable types. So, say you have a class with a few attributes as follows:
class MyClass:
    def __init__(self, my_int, my_list, my_dict):
        self.my_int = my_int
        self.my_list = my_list
        self.my_dict = my_dict
You could write a custom encode function by mapping all the class attributes you want to save to members of a dictionary. If there are helpful additional things you want to store as well, that's fine too. In this example, I use a string representation of a previously defined datetime object (the_date) to make note of when the object was saved. The only thing to remember is that when you later decode the object, you're going to be recreating a MyClass object from this data, and it will have to match. Specifically, you'll either be discarding the date information, storing it elsewhere, or annotating your object with it on the fly.
import json

class MyEncoder(json.JSONEncoder):
    ''' a custom JSON encoder for MyClass objects '''
    def default(self, my_class):
        if not isinstance(my_class, MyClass):
            # not a MyClass object: defer to the base class, which raises TypeError
            return json.JSONEncoder.default(self, my_class)
        return {'my_int': my_class.my_int,
                'my_list': my_class.my_list,
                'my_dict': my_class.my_dict,
                'save date': the_date.ctime()}
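To actually use the encoder, you pass the class to json.dumps() (or json.dump()) through the cls argument. Here is a minimal, self-contained sketch in Python 3 syntax. The sample values and the module-level the_date are assumptions made up for illustration; they stand in for whatever your real code defines.

```python
import datetime
import json


class MyClass:
    def __init__(self, my_int, my_list, my_dict):
        self.my_int = my_int
        self.my_list = my_list
        self.my_dict = my_dict


# Assumed for illustration: the "previously defined datetime object".
the_date = datetime.datetime.now()


class MyEncoder(json.JSONEncoder):
    def default(self, obj):
        if isinstance(obj, MyClass):
            return {'my_int': obj.my_int,
                    'my_list': obj.my_list,
                    'my_dict': obj.my_dict,
                    'save date': the_date.ctime()}
        # Anything else: let the base class raise TypeError.
        return super().default(obj)


obj = MyClass(7, [1, 2, 3], {'a': 'b'})
encoded = json.dumps(obj, cls=MyEncoder)  # cls selects the custom encoder
print(encoded)
```

default() is only consulted for objects the json module cannot serialize on its own, so plain lists and dictionaries passed through the same encoder are unaffected.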
Decoding
Decoding is less clear than encoding. There are two ways you can customize the results returned by the json load() or loads() functions. One is by writing an object hook, and the other is by subclassing JSONDecoder and overriding the decode() function.
When called, loads() calls the decoder's decode() function on the JSON string you pass to it (load() just reads the file pointer you pass and hands the contents to loads()). If object_hook is also specified, the function passed to object_hook is called during decoding with each dictionary the decoder builds, and whatever it returns is used in place of that dictionary.
The default behaviour of decode() is to return a Python object FOR EVERY SIMPLE OBJECT in that string. This means that if you have a hierarchy of such objects, for example a dictionary which contains several other dictionaries, then although you only call load() once, the object hook gets called once for each JSON object (dictionary) in that string, innermost first. Lists, strings and numbers are not passed to it. Here's an example to convince yourself of this, using the previously encoded object:
fp = open('myclass.json')

def custom_decode(json_thread):
    # print each object the hook receives, then hand it back unchanged
    print(json_thread)
    return json_thread

json.load(fp, object_hook=custom_decode)
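If you don't have a file handy, the same experiment works on a string. This is a small sketch in Python 3 syntax with a made-up JSON document (the nested "inner" key is an assumption for illustration, not part of the MyClass example). It records every value the hook receives so you can inspect them afterwards.

```python
import json

seen = []


def custom_decode(json_obj):
    # Record each dictionary the hook receives, then return it unchanged
    # so the decoded result is the same as without a hook.
    seen.append(json_obj)
    return json_obj


text = '{"my_int": 7, "my_list": [1, 2, 3], "my_dict": {"inner": {"x": 1}}}'
result = json.loads(text, object_hook=custom_decode)

for item in seen:
    print(item)
```

The hook runs three times here, once per dictionary and innermost first; the list in "my_list" never reaches it. Note that the hook has to return something, since its return value replaces the dictionary in the final result.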
If what you want is to recover a custom object (such as the original MyClass object), this isn't terribly useful. At this point, it becomes clear we probably have to override the default loads() behaviour. As mentioned above, we do this by subclassing JSONDecoder and overriding the decode() function. It's not clear why there is a lack of symmetry here with JSONEncoder: we override default() in one, and decode() in the other. But, OK.
Now, the object hook above took a Python object as its argument, but the decode() function of course receives the raw serialized string being decoded. The basic approach is to use the generic decode capability of the json module to parse the string that was stored on disk into a Python dictionary. But the decoder still doesn't know about your custom MyClass object, so what you do is create a new MyClass object, initializing it with the values in that dictionary (my_class_dict below).
class ThreadDecoder(json.JSONDecoder):
    def decode(self, json_string):
        # use json's generic decode capability to parse the serialized string
        # into a python dictionary.
        my_class_dict = json.loads(json_string)
        # rebuild the MyClass object; the stored 'save date' is discarded here
        return MyClass(my_class_dict['my_int'],
                       my_class_dict['my_list'],
                       my_class_dict['my_dict'])
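As with the encoder, the decoder is hooked in through the cls argument, this time of json.loads() or json.load(). The sketch below is in Python 3 syntax and calls the base class's decode() rather than json.loads(), which does the same generic parsing. The stored string, including its "save date" value, is a made-up sample for illustration.

```python
import json


class MyClass:
    def __init__(self, my_int, my_list, my_dict):
        self.my_int = my_int
        self.my_list = my_list
        self.my_dict = my_dict


class ThreadDecoder(json.JSONDecoder):
    def decode(self, json_string):
        # Generic parse first, giving a plain dictionary.
        my_class_dict = super().decode(json_string)
        # Then rebuild the custom object; 'save date' is simply dropped.
        return MyClass(my_class_dict['my_int'],
                       my_class_dict['my_list'],
                       my_class_dict['my_dict'])


# Hypothetical sample of what the encoder might have written to disk.
stored = ('{"my_int": 7, "my_list": [1, 2, 3], "my_dict": {"a": "b"}, '
          '"save date": "Mon Aug  3 12:00:00 2009"}')

restored = json.loads(stored, cls=ThreadDecoder)
print(restored.my_int, restored.my_list, restored.my_dict)
```

A decoder written this way assumes every string it is given describes a MyClass object, so it will raise KeyError on anything else.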
And there you have it. This is a simple example, but objects and types can be nested arbitrarily; you just have to be willing to unravel them as appropriate such that you are encoding and decoding basic python types.
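To give a rough idea of what the unravelling looks like with nesting, here is a sketch in Python 3 syntax. Point, Line, ShapeEncoder and LineDecoder are hypothetical names invented for this illustration. On the encoding side, default() can return a dictionary that still contains custom objects, because the encoder calls default() again for each one it runs into. On the decoding side, the nesting is rebuilt by hand.

```python
import json


class Point:
    def __init__(self, x, y):
        self.x = x
        self.y = y


class Line:
    def __init__(self, start, end):
        self.start = start  # a Point
        self.end = end      # a Point


class ShapeEncoder(json.JSONEncoder):
    def default(self, obj):
        if isinstance(obj, Point):
            return {'x': obj.x, 'y': obj.y}
        if isinstance(obj, Line):
            # The Points are returned as-is; default() is called
            # again for each of them.
            return {'start': obj.start, 'end': obj.end}
        return super().default(obj)


class LineDecoder(json.JSONDecoder):
    def decode(self, json_string):
        d = super().decode(json_string)
        # Unravel the nested dictionaries back into Points by hand.
        return Line(Point(d['start']['x'], d['start']['y']),
                    Point(d['end']['x'], d['end']['y']))


text = json.dumps(Line(Point(0, 1), Point(2, 3)), cls=ShapeEncoder)
line = json.loads(text, cls=LineDecoder)
print(text)
print(line.start.x, line.start.y, line.end.x, line.end.y)
```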
Happy serializing!
4 Comments
Thanks for the post. I found it while I was searching for JSON custom encoding.
However, I have a question. How do you tie MyEncoder into the serialization flow? For example, if my serialization code is:
res = serializers.serialize("json", questions)
Here questions is a QuerySet, and I want to use a custom encoder to encode DateTime objects in a different way than the default JSONEncoder does. How do I ensure that the above call uses the custom encoder instead of the default one?
–
Thanks
Parag
Hi Parag,
I assume from your terminology that you're working with Django. It looks like the bottom of the Django documentation on serialization has some clues. I haven't done this myself, but if your custom encoder is called MyEncoder, I would try something like:
json_serializer = serializers.get_serializer("json")()
json_serializer.serialize(queryset, cls=MyEncoder, stream=response)
'cls' is the argument you use to pass the custom encoder to the json.dump() function as well (that was originally in my post, but I think that paragraph was eaten by the pre-formatter, argh).
Good luck!
Hi Jessy,
Thanks for the response. I tried passing my custom encoder with the cls parameter as suggested, but I got an error msg:
dump() got multiple values for keyword argument ‘cls’
I will figure out the correct way to do this and post it back here.
Thanks for taking the time to help me.
–
Regards
Parag
Hi Jessy,
Here’s how I was able to use the custom encoder. Not sure if it is the best way though…
http://blog.adaptivesoftware.biz/2009/08/custom-json-encoder-in-django.html