Working on a Java-based project for Grooveshark, I ran into an issue with the way the Java JSON library parses numeric data types. My data source is a pretty run-of-the-mill MySQL database and, as you may know, if you have a table with an INT UNSIGNED column (very common for primary key ID fields), those values arrive in Java as Longs. My code basically takes data from MySQL, serializes it into JSON text, and stores it in memcached for later retrieval.

The problem came about as follows: I would load the JSON string from memcached (in this case a list of numbers from an INT UNSIGNED PRIMARY KEY column), deserialize it, and convert it from a JSON Array to a Java List<Long>. At the same time, I had another List<Long> of values coming in from the database, and I performed a simple check to see if one list contained the other as a subset. Say the JSON-based list was called jsonIDs and the list from the DB dbIDs; I called jsonIDs.containsAll(dbIDs). In this particular case, both lists had only one element, and by logging them to stdout I could *see* that they contained the same numerical value. Yet despite my pleas and screaming, jsonIDs.containsAll(dbIDs) returned false. Infuriating! So finally, I wound up writing my own nested for loops to do the comparisons by hand, and sure enough, when I did the comparison on individual elements pulled from the respective lists, I got an exception:

java.lang.ClassCastException: java.lang.Integer cannot be cast to java.lang.Long
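The mismatch is easy to reproduce in isolation. The following is a minimal sketch, not the original project code: the class name ContainsAllMismatch and its two helper methods are made up for illustration, and the lists are typed as List<Object> so the mixed element types are visible.

```java
import java.util.ArrayList;
import java.util.List;

class ContainsAllMismatch {
    // Hypothetical helper: a list holding a boxed Integer, mimicking what
    // the JSON parser handed back for a small whole number.
    static List<Object> jsonLikeIds() {
        List<Object> ids = new ArrayList<Object>();
        ids.add(Integer.valueOf(42));
        return ids;
    }

    // Hypothetical helper: a list holding a boxed Long, mimicking what
    // came back from the database for the same ID.
    static List<Object> dbLikeIds() {
        List<Object> ids = new ArrayList<Object>();
        ids.add(Long.valueOf(42L));
        return ids;
    }

    public static void main(String[] args) {
        List<Object> jsonIDs = jsonLikeIds();
        List<Object> dbIDs = dbLikeIds();

        // Both lists print as [42], so the log output looks identical.
        System.out.println(jsonIDs + " vs " + dbIDs);

        // Long.equals(Integer) is false even for the same numeric value,
        // so the subset check fails.
        System.out.println(jsonIDs.containsAll(dbIDs)); // false
    }
}
```

containsAll relies on equals, and the boxed types only compare equal to instances of their own class, which is why the printed values can match while the check fails.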
Apparently, my List<Long> was nothing of the sort. I thought and thought about where my objects were actually coming from. The DB IDs were clearly coming straight from the database, so there couldn't be an issue there. It had to be the JSON-based list. So I downloaded the source code from json.org and dug around. Sure enough, the JSONTokener class, responsible for parsing JSON text into a Java object, was the source of the problem. In parsing what it believes to be numeric text, it goes through a series of checks to determine the type of number: hex, octal, double, and finally long. Unfortunately, when the text is determined to be a whole number, and despite its being parsed as a long, the JSONTokener makes a final check to see if precision is lost by casting to an Integer. If not, the value is cast to an Integer and returned, and this was the source of my problem.

So I pulled the json.org source code into my project and made an easy one-line fix to prevent this cast altogether. Now my (hacked) JSON code will return a long no matter what the (whole-number) value, which may not be ideal in general (hence the original behavior) but does the trick for me. All the numeric values I'm serializing are going to be longs anyway.

What I still don't get is why, when loading an Integer into a List<Long>, the object is allowed to keep its Integer type. Shouldn't this object be cast before it can be added to the List? I'm stumped on this one, but hey, my code is working again so I'm movin' on. Any ideas?
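For reference, the narrowing behavior described above can be sketched roughly as follows. This is not the json.org source: parseWholeNumber is a hypothetical stand-in written from the description in this post, and toLongs is an assumed alternative to patching the library, which converts each element through Number.longValue() after parsing.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class WholeNumberNarrowing {
    // Hypothetical stand-in for the parser's behavior: parse as a long,
    // then return an Integer if the value survives a round trip through int.
    static Object parseWholeNumber(String text) {
        long value = Long.parseLong(text);
        if (value == (int) value) {
            return Integer.valueOf((int) value);
        }
        return Long.valueOf(value);
    }

    // Assumed workaround: normalize a list of mixed boxed numbers to Longs.
    static List<Long> toLongs(List<?> values) {
        List<Long> result = new ArrayList<Long>();
        for (Object v : values) {
            result.add(Long.valueOf(((Number) v).longValue()));
        }
        return result;
    }

    public static void main(String[] args) {
        Object small = parseWholeNumber("42");
        Object large = parseWholeNumber("4294967295"); // largest INT UNSIGNED value

        System.out.println(small.getClass().getSimpleName()); // Integer
        System.out.println(large.getClass().getSimpleName()); // Long

        // After normalization the subset check behaves as expected.
        List<Long> jsonIDs = toLongs(Arrays.asList(small, large));
        List<Long> dbIDs = Arrays.asList(42L, 4294967295L);
        System.out.println(jsonIDs.containsAll(dbIDs)); // true
    }
}
```

Normalizing at the boundary leaves the library untouched, at the cost of one extra pass over each deserialized list.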
Wednesday, May 20, 2009
Java/JSON numeric data types
2 comments:
Regarding why addition of Integer to List<Long> works (or vice versa): it may well be due to generics and type erasure. Fundamentally ArrayList et al really just have Object[] to store things in, and all type magic is really just implicit casting. Further, casting may not even occur when adding, leading to odd errors when getting elements (at which point cast is done, and would fail!).
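The type-erasure explanation in the comment above can be checked with a small experiment. This is only a sketch under assumptions: the class ErasureDemo and its methods are invented for illustration, and the raw-type assignment stands in for whatever untyped code path actually put the Integer into the list.

```java
import java.util.ArrayList;
import java.util.List;

class ErasureDemo {
    // Sneaks an Integer into a List<Long> through a raw-type reference.
    // Nothing checks the element type at runtime, because the backing
    // storage is just an Object[].
    @SuppressWarnings({"unchecked", "rawtypes"})
    static List<Long> listWithHiddenInteger() {
        List<Long> longs = new ArrayList<Long>();
        List raw = longs;             // raw view of the same list
        raw.add(Integer.valueOf(42)); // succeeds: no cast happens on add
        return longs;
    }

    // Returns true if reading the first element as a Long throws.
    static boolean readFailsWithClassCast(List<Long> longs) {
        try {
            Long first = longs.get(0); // compiler-inserted cast to Long runs here
            return false;
        } catch (ClassCastException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        List<Long> longs = listWithHiddenInteger();
        System.out.println(longs);                         // [42]
        System.out.println(readFailsWithClassCast(longs)); // true
    }
}
```

Printing the list works because toString only treats the elements as Objects; the failure appears only where the compiler inserts a cast to Long, which matches the exception showing up in the hand-written comparison loop rather than when the list was built.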
I am you at Modea.