Monday, June 26

JSON version of XMLTABLE example

My colleague Álvaro Herrera gave a terrific explanation of the new XMLTABLE feature from Pavel Stěhule that he’s put a huge amount of effort into, and finally recently committed. I thought it would be fun to see how the example he gave might work today with JSON.

First, I came up with a JSON equivalent version of his piece of XML. Notice that in JSON, unlike in XML, containers are nameless, so we have no “room”  tags, for example, a room is just an object in the relevant array. This is what I came up with:

CREATE TABLE IF NOT EXISTS hoteldata AS SELECT json
$$[
   {"id": "mancha", "name": "La Mancha",
    "rooms": [
      {"id": "201", "capacity": 3, "comment": "Great view of the Channel"},
      { "id": "202", "capacity": 5 }
    ],
   "personnel": [
      {"id": "1025","name": "Ferdinando Quijana",
       "salary": { "currency": "PTA", "amount": 45000}
      }
   ]
 },
 {"id": "valpo", "name": "Valparaíso",
  "rooms": [
    {"id": "201", "capacity": 2, "comment": "Very Noisy"},
    {"id": "202", "capacity": 2}
  ],
  "personnel": [
    {"id": "1026", "name": "Katharina Wuntz",
    "salary": { "currency": "EUR", "amount": 50000}
    },
    {"id": "1027", "name": "Diego Velázquez",
    "salary": { "currency": "CLP", "amount": 1200000}
    }
  ]
 }
]$$ AS hotels;

Now the problem in creating a query with this data that mimics Álvaro’s first query is that the individual room data is nested inside an array, while the hotel name is in the parent object of that array. XPATH is very good at combining data at different nesting levels, but we don’t have that here so we need to be a bit more creative. The strategy is to break the hotels data into individual rows, one for each hotel, and then with each hotel combine its name with each room. Note that combining values is currently a jsonb-only operation. I decided for now not to bother about the row numbering piece – that should be easily added if necessary with a call to the row_number() window function.

Here’s the first query I came up with to implement that strategy:

with the_hotels as
(
 select hotel from hoteldata, json_array_elements(hotels) h(hotel) 
),
flattened as
(
 select jsonb_build_object('hotel', hotel->>'name') || room::jsonb as rm
 from the_hotels, json_array_elements(the_hotels.hotel->'rooms') r(room)
)
select j.hotel as hotel_name, j.id as room_id, j.capacity, j.comment
from flattened, 
 jsonb_to_record(rm) as 
 j ("id" int, "capacity" int, "hotel" text, "comment" text);

And the result looks like this:

 hotel_name | room_id | capacity |          comment          
------------+---------+----------+---------------------------
 La Mancha  |     201 |        3 | Great view of the Channel
 La Mancha  |     202 |        5 | 
 Valparaíso |     201 |        2 | Very Noisy
 Valparaíso |     202 |        2 | 
(4 rows)

Not bad! Pretty close, but we’re missing the default values for the comments. Well, the json functions have support for that, too. Instead of using jsonb_to_record, we need to use a named type with jsonb_populate_record. Then we can supply a value for the record to populate that has the default value. Note that the named type needs to have the same column names that the jsonb data has.

Here’s how that looks:

create type hotelroom as 
("id" int, "capacity" int, "hotel" text, "comment" text);

with the_hotels as
(
    select hotel from hoteldata, json_array_elements(hotels) h(hotel) 
),
flattened as
(
    select jsonb_build_object('hotel', hotel->>'name') || room::jsonb as rm
    from the_hotels, json_array_elements(the_hotels.hotel->'rooms') r(room)
)
select j.hotel as hotel_name, j.id as room_id, j.capacity, j.comment
from flattened, 
    jsonb_populate_record(row(null,null,null,'A regular room')::hotelroom,
                          rm) as j;

And the result:

 hotel_name | room_id | capacity |          comment          
------------+---------+----------+---------------------------
 La Mancha  |     201 |        3 | Great view of the Channel
 La Mancha  |     202 |        5 | A regular room
 Valparaíso |     201 |        2 | Very Noisy
 Valparaíso |     202 |        2 | A regular room
(4 rows)

And there we are.

There is more json functionality coming to Postgres, but you can do quite a lot right now.

By the way, just to emphasize, XMLTABLE is a cool feature.

4 Comments

  • And wouldn’t it be nice to have a xml2jsonb conversion function so we could have the best of both worlds?

    Current postgresql jsonb toolkit is already impressive and keep growing. It would be nice to have a way to convert XML data object to jsonb for thoses who chose to work primarily with jsonb going forward.

    I found several xml2json projects in various languages but I really think this would be another plus for postgresql to have this built-in, don’t you think so?

    • Andrew Dunstan

      Well, that’s a bit off-topic. This should be fairly doable as an extension. Or you could write a plperl or plpython or possibly plv8 function to do it fairly easily too.

Leave a Reply

Your email address will not be published. Required fields are marked *