Two ways to use Gson for JSON in Java

May 07, 2020

If you’re working in a statically-typed language like Java then dealing with JSON can be tricky. JSON doesn’t have type definitions and lacks some features we would like - the only primitive values are strings, numbers, booleans and null, so to store other types (like dates or times) we’re forced to use a string-based convention. Despite its shortcomings, JSON is the most common format for APIs on the web so we need a way to work with it in Java.

Gson is one of the most popular Java JSON libraries. In this post I’ll pick a fairly complex JSON document and three queries which I want to make using Gson. I’ll compare two different approaches:

  1. Tree model
  2. Data binding

All the code used in this post is in this repository. It’ll work with Java 8 onwards.

Other Java Libraries for working with JSON

The most popular Java libraries for working with JSON, as measured by usage in Maven Central and GitHub stars, are Jackson and Gson. In this post I will be using Gson. I also wrote an equivalent post with Jackson code examples.

You can see the Gson dependency for the examples here.

Example data and questions

To find some example data I read Tilde’s recent post 7 cool APIs you didn’t know you needed, and picked out the Near Earth Object Web Service API from the NASA APIs. This API is maintained by the brilliantly-named SpaceRocks team.

The NeoWS Feed API request returns a list of all asteroids whose closest approach to Earth is within the next 7 days. I’ll be showing how to answer the following questions in Java:

  • How many are there?
    This can be found by looking at the element_count key at the root of the JSON object.
  • How many of them are potentially hazardous?
    We need to loop through each NEO and check the is_potentially_hazardous_asteroid key, which is a boolean value in the JSON. (Spoiler: it’s not zero 😨)
  • What is the name and speed of the fastest Near Earth Object?
    Again, we need to loop through but this time the objects are more complex. We also need to be aware that speeds are stored as strings not numbers, e.g. "kilometers_per_second": "6.076659807". This is common in JSON documents as it avoids precision issues on very small or very large numbers.
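To illustrate the precision point, here is a minimal sketch using only the JDK. The class name SpeedStringSketch and the long made-up value are my own assumptions, not part of the example repo. It parses a numeric string both as a double and as a BigDecimal:

```java
import java.math.BigDecimal;

class SpeedStringSketch {

    // Parse a numeric string into a double: convenient, but a double only
    // holds roughly 15-17 significant decimal digits.
    static double asDouble(String value) {
        return Double.parseDouble(value);
    }

    // Parse the same string into a BigDecimal, which keeps every digit.
    static BigDecimal asBigDecimal(String value) {
        return new BigDecimal(value);
    }

    public static void main(String[] args) {
        String speed = "6.076659807"; // sample value quoted in this post
        System.out.println(asDouble(speed));
        System.out.println(asBigDecimal(speed));

        // Made-up value with more digits than a double can represent
        String manyDigits = "0.1234567890123456789012345";
        System.out.println(asDouble(manyDigits));     // trailing digits are lost
        System.out.println(asBigDecimal(manyDigits)); // all digits preserved
    }
}
```

Because the value travels as a string, the consumer gets to choose which of these representations is appropriate.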

A tree model for JSON

Gson allows you to read JSON into a tree model: Java objects that represent JSON objects, arrays and values. These objects are called things like JsonElement or JsonObject and are provided by Gson.

Pros:

  • You will not need to create any extra classes of your own
  • Gson can do some implicit and explicit type coercions for you

Cons:

  • Your code that works with Gson’s tree model objects can be verbose
  • It’s very tempting to mix Gson code with application logic which can make reading and testing your code hard

Gson tree model examples

Start by instantiating a JsonParser, then call .parse to get a JsonElement which can be descended through to retrieve nested values. JsonParser is thread-safe so it’s OK to use the same one in multiple places. The code to create a JsonElement is:

JsonParser parser = new JsonParser();
JsonElement neoJsonElement = parser.parse(SourceData.asString());

[this code in the example repo]
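If you are on Gson 2.8.6 or later, the JsonParser constructor and the instance .parse method are deprecated in favour of static methods such as JsonParser.parseString. A minimal sketch of that form, assuming a small inline JSON string as a stand-in for SourceData.asString() (the class name ParseStringSketch is hypothetical):

```java
import com.google.gson.JsonElement;
import com.google.gson.JsonParser;

class ParseStringSketch {

    // Static equivalent of new JsonParser().parse(json), available since Gson 2.8.6
    static JsonElement parse(String json) {
        return JsonParser.parseString(json);
    }

    public static void main(String[] args) {
        // Hand-written stand-in for the NASA document
        JsonElement root = parse("{\"element_count\": 3}");
        System.out.println(root.getAsJsonObject().get("element_count").getAsInt());
    }
}
```

The resulting JsonElement is used in exactly the same way as in the rest of this post.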

How many NEOs are there?

I find this code quite readable. I don’t really think I should need to know about the distinction between a JsonElement and a JsonObject, but it isn’t too bad here.

private static int getNeoCount(JsonElement neoJsonElement) {
   return neoJsonElement
       .getAsJsonObject()
       .get("element_count")
       .getAsInt();
}

[this code in the example repo]

If the node isn’t an Object, Gson will throw an IllegalStateException. If we attempt to .get() a node that doesn’t exist, Gson will return null, so we have to handle possible NullPointerExceptions ourselves.
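One way to guard against both failure modes is a small helper that checks before it descends. This is only a sketch: the class SafeLookupSketch and the method getIntOrDefault are hypothetical names, and the fallback-value approach is one design choice among several (returning an Optional would be another).

```java
import com.google.gson.JsonElement;
import com.google.gson.JsonParser;

class SafeLookupSketch {

    // Returns the int stored under key, or fallback when the root is not an
    // object, the key is missing, or the value is not a JSON primitive.
    // Note: a non-numeric primitive such as "abc" would still make getAsInt()
    // throw a NumberFormatException.
    static int getIntOrDefault(JsonElement root, String key, int fallback) {
        if (root == null || !root.isJsonObject()) {
            return fallback; // avoids the IllegalStateException from getAsJsonObject()
        }
        JsonElement value = root.getAsJsonObject().get(key); // null when the key is absent
        if (value == null || !value.isJsonPrimitive()) {
            return fallback; // covers missing keys, JSON null, objects and arrays
        }
        return value.getAsInt();
    }

    public static void main(String[] args) {
        // Hand-written stand-in for the NASA document
        JsonElement root = new JsonParser().parse("{\"element_count\": 3}");
        System.out.println(getIntOrDefault(root, "element_count", -1));
        System.out.println(getIntOrDefault(root, "no_such_key", -1));
    }
}
```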

How many potentially hazardous asteroids are there this week?

I admit that I expected the answer here to be zero but I was wrong 😱. It’s currently 19 - still, I’m not panicking (yet). Anyway, to calculate this from the root JsonElement we need to:

  • iterate through all the NEOs - there is a list of these for each date so we will need a nested loop
  • increment a counter if the is_potentially_hazardous_asteroid field is true

Here’s the code:

private static int getPotentiallyHazardousAsteroidCount(JsonElement neoJsonElement) {
   int potentiallyHazardousAsteroidCount = 0;
   JsonElement nearEarthObjects = neoJsonElement.getAsJsonObject().get("near_earth_objects");
   for (Map.Entry<String, JsonElement> neoClosestApproachDate : nearEarthObjects.getAsJsonObject().entrySet()) {
       for (JsonElement neo : neoClosestApproachDate.getValue().getAsJsonArray()) {
           if (neo.getAsJsonObject().get("is_potentially_hazardous_asteroid").getAsBoolean()) {
               potentiallyHazardousAsteroidCount += 1;
           }
       }
   }
   return potentiallyHazardousAsteroidCount;
}

[this code in the example repo]

All those calls to .getAsJsonObject() are adding up to a lot of code (and one of them is a .getAsJsonArray() too). JsonObject doesn’t implement Iterable, so an extra call to .entrySet() is needed for the for loop. JsonObject has a .keySet() method but not a .values(), which is what I actually wanted in this case. Overall, I think the intent of this code is obscured by the Gson API being rather verbose.
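One way to contain that verbosity is to write the nested traversal once, in a helper that stands in for the missing .values(), and keep the query logic separate. The following is a sketch under that assumption; FlattenNeosSketch, allNeos and the tiny sample document are hypothetical and not taken from the example repo.

```java
import com.google.gson.JsonElement;
import com.google.gson.JsonObject;
import com.google.gson.JsonParser;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

class FlattenNeosSketch {

    // Collects every NEO object from every date bucket into one flat list,
    // so callers need a single loop instead of a nested one.
    static List<JsonObject> allNeos(JsonElement root) {
        List<JsonObject> neos = new ArrayList<>();
        JsonObject byDate = root.getAsJsonObject().getAsJsonObject("near_earth_objects");
        for (Map.Entry<String, JsonElement> day : byDate.entrySet()) {
            for (JsonElement neo : day.getValue().getAsJsonArray()) {
                neos.add(neo.getAsJsonObject());
            }
        }
        return neos;
    }

    // The query itself is now a single loop over the flattened list
    static int hazardousCount(JsonElement root) {
        int count = 0;
        for (JsonObject neo : allNeos(root)) {
            if (neo.get("is_potentially_hazardous_asteroid").getAsBoolean()) {
                count++;
            }
        }
        return count;
    }

    // Tiny hand-written stand-in for the NASA document
    static String sampleJson() {
        return "{\"near_earth_objects\": {"
            + "\"2020-04-12\": [{\"is_potentially_hazardous_asteroid\": true}],"
            + "\"2020-04-13\": [{\"is_potentially_hazardous_asteroid\": false}]}}";
    }

    public static void main(String[] args) {
        System.out.println(hazardousCount(new JsonParser().parse(sampleJson())));
    }
}
```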

What is the name and speed of the fastest NEO?

The method of finding and iterating through each NEO is the same as the previous example, but each NEO has the speed nested a few levels deep so you need to descend through to pick out the kilometers_per_second value.

"close_approach_data": [
 {
    ...
     "relative_velocity": {
         "kilometers_per_second": "6.076659807",
         "kilometers_per_hour": "21875.9753053124",
         "miles_per_hour": "13592.8803223482"
   },
  ...
 }
]

I created a small class to hold both values called NeoNameAndSpeed. This could be a record in the future. The code creates one of those objects like this:

private static NeoNameAndSpeed getFastestNEO(JsonElement neoJsonElement) {
   NeoNameAndSpeed fastestNEO = null;
   JsonElement nearEarthObjects = neoJsonElement.getAsJsonObject().get("near_earth_objects");
   for (Map.Entry<String, JsonElement> neoClosestApproachDate : nearEarthObjects.getAsJsonObject().entrySet()) {
       for (JsonElement neo : neoClosestApproachDate.getValue().getAsJsonArray()) {
           double speed = neo.getAsJsonObject()
               .get("close_approach_data").getAsJsonArray()
               .get(0).getAsJsonObject()
               .get("relative_velocity").getAsJsonObject()
               .get("kilometers_per_second")
               .getAsDouble();

           if (fastestNEO == null || speed > fastestNEO.speed) {
               fastestNEO = new NeoNameAndSpeed(neo.getAsJsonObject().get("name").getAsString(), speed);
           }
       }
   }
   return fastestNEO;
}

[this code in the example repo]

Gson handles numbers stored as strings by calling Double.parseDouble, so that’s good. But this is still a lot more code than the equivalent version using Jackson. The code was fiddly to write, but for me the main demerit here is readability. It’s much harder to tell what this code does because of all the background noise.
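The NeoNameAndSpeed holder used in getFastestNEO isn’t shown in this post. Judging only from how it is used there (a two-argument constructor and a speed field), a minimal sketch could look like this; the exact shape of the class in the example repo may differ.

```java
// Assumed shape, inferred from the constructor call and field access in getFastestNEO
class NeoNameAndSpeed {
    final String name;
    final double speed; // kilometers per second

    NeoNameAndSpeed(String name, double speed) {
        this.name = name;
        this.speed = speed;
    }

    @Override
    public String toString() {
        return name + " at " + speed + " km/sec";
    }

    public static void main(String[] args) {
        // Sample values quoted elsewhere in this post, paired arbitrarily
        NeoNameAndSpeed neo = new NeoNameAndSpeed("(2020 GR1)", 6.076659807);
        System.out.println(neo);
    }
}
```

On a Java version with records, the same holder could be declared as record NeoNameAndSpeed(String name, double speed) {}, with callers reading speed() instead of the speed field.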

Data binding JSON to custom classes

If you have more complex queries of your data, or you need to create objects from JSON that you can pass to other code, the tree model isn’t a good fit. Gson offers another mode of operation called data binding, where JSON is parsed directly into objects of your design.

Pros:

  • JSON to object conversion is straightforward
  • reading values out of the objects can use any Java API
  • the objects are independent of Gson so can be used in other contexts
  • the mapping is customizable using Type Adapters

Cons:

  • Up-front work: you have to create classes whose structure matches the JSON objects, then have Gson read your JSON into these objects.

Data binding 101

Here’s a simple example based off a small subset of the NEO JSON:

{
 "id": "54016476",
 "name": "(2020 GR1)",
 "closeApproachDate": "2020-04-12"
}

We could imagine a class for holding that data like this:

public class NeoSummaryDetails {
    public int id;
    public String name;
    public LocalDate closeApproachDate;
}

Gson is almost able to map back and forth between JSON and matching objects like this out of the box. It copes fine with the int id actually being a string, but needs some help converting the String 2020-04-12 to a LocalDate object. This is done by creating a class which extends TypeAdapter<LocalDate> and implementing the .read method to call LocalDate.parse (TypeAdapter also requires a .write method for the reverse direction). You can see an example of this here.

Gson data binding - custom types

For data binding, the class to use is Gson. Once we have created our TypeAdapter we can register it on a GsonBuilder and create a Gson like this:

   Gson gson = new GsonBuilder()
       .registerTypeAdapter(LocalDate.class, new GsonLocalDateAdapter())
       .create();

[this code in the example repo]
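For reference, here is a minimal sketch of what such an adapter could look like. It is not the GsonLocalDateAdapter from the example repo: the class name LocalDateAdapterSketch, the null handling and the nested NeoSummaryDetails copy are my own assumptions, and it expects dates in ISO-8601 form (yyyy-MM-dd).

```java
import com.google.gson.Gson;
import com.google.gson.GsonBuilder;
import com.google.gson.TypeAdapter;
import com.google.gson.stream.JsonReader;
import com.google.gson.stream.JsonToken;
import com.google.gson.stream.JsonWriter;
import java.io.IOException;
import java.time.LocalDate;

class LocalDateAdapterSketch extends TypeAdapter<LocalDate> {

    // JSON -> Java: read the string token and hand it to LocalDate.parse
    @Override
    public LocalDate read(JsonReader in) throws IOException {
        if (in.peek() == JsonToken.NULL) {
            in.nextNull();
            return null;
        }
        return LocalDate.parse(in.nextString());
    }

    // Java -> JSON: LocalDate.toString() already produces ISO-8601
    @Override
    public void write(JsonWriter out, LocalDate date) throws IOException {
        if (date == null) {
            out.nullValue();
        } else {
            out.value(date.toString());
        }
    }

    // Same shape as the NeoSummaryDetails class shown earlier in this post
    static class NeoSummaryDetails {
        int id;
        String name;
        LocalDate closeApproachDate;
    }

    static Gson gson() {
        return new GsonBuilder()
            .registerTypeAdapter(LocalDate.class, new LocalDateAdapterSketch())
            .create();
    }

    public static void main(String[] args) {
        String json = "{\"id\": \"54016476\", \"name\": \"(2020 GR1)\", \"closeApproachDate\": \"2020-04-12\"}";
        NeoSummaryDetails neo = gson().fromJson(json, NeoSummaryDetails.class);
        System.out.println(neo.closeApproachDate.getDayOfWeek());
        System.out.println(gson().toJson(neo));
    }
}
```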

Gson data binding - custom field names

You might have noticed that I used closeApproachDate in my short example JSON above, where the data from NASA has close_approach_date. I did that because Gson will use Java’s reflection capabilities to match JSON keys to Java field names, and they need to match exactly.

Most times you can’t change your JSON - it’s usually coming from an API that you don’t control - but you still wouldn’t want fields in your Java classes written in snake_case. The mismatch can be handled with an annotation on the closeApproachDate field:

@SerializedName("close_approach_date")
public LocalDate closeApproachDate;

[this code in the example repo]
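Here is a self-contained sketch of the annotation in action, alongside an alternative that the post doesn’t use: a FieldNamingPolicy set on the GsonBuilder, which maps every camelCase field to snake_case without per-field annotations. The class names below are hypothetical, and the date is kept as a String so that no TypeAdapter is needed.

```java
import com.google.gson.FieldNamingPolicy;
import com.google.gson.Gson;
import com.google.gson.GsonBuilder;
import com.google.gson.annotations.SerializedName;

class FieldNamingSketch {

    // Option 1: annotate each field whose JSON key differs from the Java name
    static class AnnotatedApproach {
        @SerializedName("close_approach_date")
        String closeApproachDate;
    }

    // Option 2: no annotations; a naming policy on the Gson instance does the mapping
    static class PlainApproach {
        String closeApproachDate;
    }

    static String viaAnnotation(String json) {
        return new Gson().fromJson(json, AnnotatedApproach.class).closeApproachDate;
    }

    static String viaNamingPolicy(String json) {
        Gson gson = new GsonBuilder()
            .setFieldNamingPolicy(FieldNamingPolicy.LOWER_CASE_WITH_UNDERSCORES)
            .create();
        return gson.fromJson(json, PlainApproach.class).closeApproachDate;
    }

    public static void main(String[] args) {
        String json = "{\"close_approach_date\": \"2020-04-12\"}";
        System.out.println(viaAnnotation(json));
        System.out.println(viaNamingPolicy(json));
    }
}
```

The policy is less typing when every key follows the same convention, while the annotation is more explicit and copes with one-off exceptions.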

Creating your custom objects with JsonSchema2Pojo

Right now you are probably thinking that this can get very time-consuming. Field renaming, custom readers and writers, not to mention the sheer number of classes you might need to create. Well, you’re right! But fear not, there’s a great tool to create the classes for you.

JsonSchema2Pojo can take a JSON schema or (more usefully) a JSON document and generate matching classes for you. It knows about Gson annotations, and has tons of options, although the defaults are sensible. Usually I find it does 90% of the work for me, but the classes often need some finessing once they’re generated.

To use it for this project I removed all but one of the NEOs and selected the following options:

JsonSchema2Pojo screenshot

[generated code in the example repo]

Data stored in keys and values

The NeoWS JSON has a slightly awkward (but not unusual) feature - some data is stored in the keys rather than the values of the JSON objects. The near_earth_objects map has keys which are dates. This is a bit of a problem because the dates won’t always be the same, but of course jsonschema2pojo doesn’t know this. It created a field called _20200412. To fix this I renamed the class _20200412 to NeoDetails and the type of nearEarthObjects became Map<String, List<NeoDetails>> (see that here). I could then delete the now-unused NearEarthObjects class.

I also changed the types of numbers-in-strings from String to double and added LocalDate where appropriate.
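To show why a Map is a better fit than one field per date, here is a small JDK-only sketch that walks a Map<String, List<NeoDetails>> and turns each key into a LocalDate. The trimmed NeoDetails class, the method names and the sample entries are all hypothetical stand-ins for the generated code.

```java
import java.time.LocalDate;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.SortedMap;
import java.util.TreeMap;

class DateKeyedMapSketch {

    // Hypothetical, heavily trimmed stand-in for the generated NeoDetails class
    static class NeoDetails {
        final String name;

        NeoDetails(String name) {
            this.name = name;
        }
    }

    // Counts NEOs per day, converting each date-string key into a LocalDate.
    // A TreeMap keeps the result sorted by date, whatever dates the feed contains.
    static SortedMap<LocalDate, Integer> countPerDay(Map<String, List<NeoDetails>> nearEarthObjects) {
        SortedMap<LocalDate, Integer> counts = new TreeMap<>();
        for (Map.Entry<String, List<NeoDetails>> day : nearEarthObjects.entrySet()) {
            counts.put(LocalDate.parse(day.getKey()), day.getValue().size());
        }
        return counts;
    }

    // Made-up sample data in the same shape as near_earth_objects
    static Map<String, List<NeoDetails>> sample() {
        Map<String, List<NeoDetails>> byDate = new HashMap<>();
        byDate.put("2020-04-13", Arrays.asList(new NeoDetails("A")));
        byDate.put("2020-04-12", Arrays.asList(new NeoDetails("B"), new NeoDetails("C")));
        return byDate;
    }

    public static void main(String[] args) {
        System.out.println(countPerDay(sample()));
    }
}
```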

Gson data binding for the Near-Earth Object API

With the classes generated by JsonSchema2Pojo the whole big JSON document can be read with:

NeoWsDataGson neoWsDataGson = new GsonBuilder()
   .registerTypeAdapter(LocalDate.class, new GsonLocalDateAdapter())
   .create()
   .fromJson(SourceData.asString(), NeoWsDataGson.class);

Finding the data we want

Now that we have plain old Java objects we can use normal field access and the Streams API to find the data we want. This code is the same for Gson or Jackson (neoWsData here is the object parsed in the previous step):

System.out.println("NEO count: " + neoWsData.elementCount);


System.out.println("Potentially hazardous asteroids: " +
   neoWsData.nearEarthObjects.values()
       .stream().flatMap(Collection::stream) // this converts a Collection of Collections of objects into a single stream
       .filter(neo -> neo.isPotentiallyHazardousAsteroid)
       .count());


NeoDetails fastestNeo = neoWsData.nearEarthObjects.values()
   .stream().flatMap(Collection::stream)
   .max( Comparator.comparing( neo -> neo.closeApproachData.get(0).relativeVelocity.kilometersPerSecond ))
   .get();

System.out.println(String.format("Fastest NEO is: %s at %f km/sec",
   fastestNeo.name,
   fastestNeo.closeApproachData.get(0).relativeVelocity.kilometersPerSecond));

[this code in the example repo]

This code is more natural Java, and it doesn’t have Gson all through it so it would be easier to write unit tests for the logic. If you are working with the same format of JSON a lot, the investment of creating classes is likely to be well worth it.
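Because the logic no longer touches Gson, it can be exercised with hand-built objects. The sketch below also makes one change to the query above: it keeps the Optional returned by .max rather than calling .get(), which would throw a NoSuchElementException if the feed contained no NEOs. The flattened Neo class and the sample values are hypothetical, not the generated NeoDetails.

```java
import java.util.Arrays;
import java.util.Collection;
import java.util.Comparator;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Optional;

class FastestNeoSketch {

    // Hypothetical, flattened stand-in for the generated NeoDetails class
    static class Neo {
        final String name;
        final double kilometersPerSecond;

        Neo(String name, double kilometersPerSecond) {
            this.name = name;
            this.kilometersPerSecond = kilometersPerSecond;
        }
    }

    // Flattens the per-date lists and picks the NEO with the highest speed.
    // Returns an empty Optional when there are no NEOs at all.
    static Optional<Neo> fastest(Map<String, List<Neo>> nearEarthObjects) {
        return nearEarthObjects.values().stream()
            .flatMap(Collection::stream)
            .max(Comparator.comparingDouble(neo -> neo.kilometersPerSecond));
    }

    static String describeFastest(Map<String, List<Neo>> nearEarthObjects) {
        return fastest(nearEarthObjects)
            .map(neo -> String.format("Fastest NEO is: %s at %f km/sec", neo.name, neo.kilometersPerSecond))
            .orElse("No NEOs in this feed");
    }

    // Made-up sample data
    static Map<String, List<Neo>> sample() {
        Map<String, List<Neo>> byDate = new HashMap<>();
        byDate.put("2020-04-12", Arrays.asList(new Neo("A", 6.08), new Neo("B", 21.5)));
        byDate.put("2020-04-13", Arrays.asList(new Neo("C", 12.3)));
        return byDate;
    }

    public static void main(String[] args) {
        System.out.println(describeFastest(sample()));
        System.out.println(describeFastest(new HashMap<>()));
    }
}
```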

Path Queries with Gson

If you have read my post about working with JSON using Jackson, you might have read the section about retrieving individual values from a JSON document using JSON Pointers. Gson does not have support for JSON Pointers. I don’t consider this a huge loss though as it usually doesn’t save a lot of code and there are more flexible alternatives.

Summing up the different ways of using Gson

For simple queries, the tree model can serve you well but it’s hard to avoid mixing up JSON parsing and application logic which can make the code hard to maintain.

For more complex queries, and especially when your JSON parsing is part of a larger application, I recommend data binding. It’s usually easiest in the long run, remembering that JsonSchema2Pojo can do most of the work for you.

What are your favourite ways of working with JSON in Java? Let me know on Twitter, where I’m @MaximumGilliard, or by email at mgilliard@twilio.com. I can’t wait to see what {"you": "build"}.