Strange errors in chunk data when exporting JSON compressed

I have a very specific use case for reading Tiled Map Editor data that doesn’t require the entire libtiled library.

JSON is my preferred format, since it’s really easy to work with in Java. The code I have works perfectly for raw, uncompressed chunk data that is base64-encoded. As soon as I change the compression setting in Tiled to gzip or zlib, I start getting completely invalid numbers after the first few correct tile GIDs are extracted from the base64-decoded data:

RAW/UNCOMPRESSED (Correct data):

Layer Name: Floor
Width, Height: 96,48 
Start X,Y: -80,-48
Encoding: base64
Compression: 
0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,
0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,
0,0,1099,1100,1101,691,692,693,694,695,696,697,1207,1208,700,701,
0,0,708,709,1243,1244,1245,713,714,1297,1298,1299,718,719,720,721,

GZIP:

Layer Name: Floor
Width, Height: 96,48 
Start X,Y: -80,-48
Encoding: base64
Compression: gzip
0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,
0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,
0,0,1099,1100,1101,45989871,-1074855936,701,45989871,-1074855936,701,45989871,-1074855936,701,45989871,-1074855936,
1213,79544303,-1074855936,701,45989871,0,0,-1074855936,701,45989871,-1074855936,1213,79544303,-1074855936,1213,45989871,

ZLIB:

Layer Name: Floor
Width, Height: 96,48 
Start X,Y: -80,-48
Encoding: base64
Compression: zlib
0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,
0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,
0,0,1099,1100,1101,45989871,-1074855936,701,45989871,-1074855936,701,45989871,-1074855936,701,45989871,-1074855936,
1213,79544303,-1074855936,701,45989871,0,0,-1074855936,701,45989871,-1074855936,1213,79544303,-1074855936,1213,45989871,

Working source code is here: Paste.ee - Working code for Tiled JSON reader

Exported Raw, gzip and zlib JSON data are here: Paste.ee - Tiled json - raw, gzip and zlib

Could you also provide a CSV version? “CSV” in JSON maps produces a native JSON array. That would give us a human-readable version to compare to. It’s entirely possible that there are problems with the uncompressed base64 parse too.

I don’t have a working JSON parser handy, but I know the TMX encodings all work correctly, and I just compared the outputs of a map in TMX and TMJ and the layer data is identical, save for JSON escaping the slashes. So, it’s much more likely that your parser is wrong somewhere than that Tiled is wrong.

Here you go:

I’m hoping it’s just a stupid mistake that I don’t realize I’ve made.

What I can’t wrap my head around is if there was a parsing error in the code, it should affect the chunk data regardless of being compressed or not compressed. It also seems odd that I get the same results with both gzip and zlib compression, so it doesn’t seem to be a decompression issue in Java.

Furthermore, why would the first couple of GIDs be correct before all hell breaks loose? I’m stumped.

Thanks! So there are no flipped tiles, which are the typical explanation for weirdly large GIDs, and indeed the uncompressed base64 decoding is fine.
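
For context on the flipped-tile remark: Tiled stores flip and rotation flags in the highest bits of each 32-bit GID, so a flipped tile looks like a huge number until those bits are masked off. A minimal sketch of that masking follows; the class and method names are hypothetical, and the flag values are the ones given in the Tiled map format documentation.

```java
// Hypothetical helper, not part of the original reader code.
final class GidFlags {
    // Flag bits as documented for the Tiled map format.
    static final long FLIPPED_HORIZONTALLY  = 0x80000000L;
    static final long FLIPPED_VERTICALLY    = 0x40000000L;
    static final long FLIPPED_DIAGONALLY    = 0x20000000L;
    static final long ROTATED_HEXAGONAL_120 = 0x10000000L;
    static final long ALL_FLAGS = FLIPPED_HORIZONTALLY | FLIPPED_VERTICALLY
            | FLIPPED_DIAGONALLY | ROTATED_HEXAGONAL_120;

    // Returns the GID with the flag bits cleared, treating the input as unsigned 32-bit.
    static long tileId(long rawGid) {
        return rawGid & 0xFFFFFFFFL & ~ALL_FLAGS;
    }

    static boolean isFlippedHorizontally(long rawGid) {
        return (rawGid & FLIPPED_HORIZONTALLY) != 0;
    }

    public static void main(String[] args) {
        long raw = FLIPPED_HORIZONTALLY | 701L;   // tile 701, flipped horizontally
        System.out.println(raw + " -> " + tileId(raw));
    }
}
```

None of the bad values in this thread decode to a sensible tile this way, which fits the conclusion that flipping is not the cause here.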

This bit: chunkData.getBytes(StandardCharsets.UTF_8) concerns me a bit. base64 strings are not UTF8 strings, they are ASCII strings. I don’t think this should be an issue as UTF8 encodes ASCII characters identically to ASCII, but still.

I thought maybe the presence of \/ in the base64 strings could be a source of problems (this is meant to be / in the final base64 string, the \ just escapes the / in JSON), but it looks like those are in your uncompressed base64 too, and not causing problems, so I guess your JSON parser takes care of them correctly.

I wonder if the fact that you’re treating the decompressed data as strings could be a problem. Have you tried keeping the data as a byte array the entire time, and then converting that to long? Strings should probably not be involved at any point after decoding the base64. This is just a guess though - I don’t know anything about the libraries you’re using or anything about how Java deals with Strings.
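
The original reader code is not reproduced in this thread, so the following is only a sketch of how a string round trip could produce numbers like the ones above. It assumes the decompressed bytes were at some point turned into a UTF-8 `String` and back; the class and method names are hypothetical. Any byte of 0x80 or higher that is not valid UTF-8 gets replaced by U+FFFD, which re-encodes as the three bytes EF BF BD. Tile 691 is 0x02B3, so its first byte (0xB3) is the first one in the row to be hit, while 1099 to 1101 contain only bytes below 0x80 and survive.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical demo class, not the code from the thread.
final class StringRoundTripDemo {
    // Packs GIDs as 32-bit little-endian values, the layout of Tiled's base64 layer data.
    static byte[] pack(int... gids) {
        ByteBuffer buf = ByteBuffer.allocate(gids.length * 4).order(ByteOrder.LITTLE_ENDIAN);
        for (int gid : gids) {
            buf.putInt(gid);
        }
        return buf.array();
    }

    // Reads every complete 32-bit little-endian value; left signed to mirror the output above.
    static int[] unpack(byte[] data) {
        ByteBuffer buf = ByteBuffer.wrap(data).order(ByteOrder.LITTLE_ENDIAN);
        int[] gids = new int[data.length / 4];
        for (int i = 0; i < gids.length; i++) {
            gids[i] = buf.getInt();
        }
        return gids;
    }

    // The assumed mistake: binary data pushed through a UTF-8 String and back.
    static byte[] throughString(byte[] data) {
        return new String(data, StandardCharsets.UTF_8).getBytes(StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        byte[] clean = pack(1099, 1100, 1101, 691, 692);
        System.out.println(Arrays.toString(unpack(clean)));
        // prints [1099, 1100, 1101, 691, 692]
        System.out.println(Arrays.toString(unpack(throughString(clean))));
        // prints [1099, 1100, 1101, 45989871, -1074855936, 701]
    }
}
```

The second line reproduces the start of the corrupted gzip and zlib rows, which is why keeping everything in byte arrays after base64 decoding is the safer approach.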

BOOM! Thank you for pointing me in the right direction.

First I changed the unzip method to use bytes instead, but that still returned the same result. I paid very close attention this time and noticed that the unzip method was returning only 526 bytes of the 1024 bytes I was expecting (16 x 16 x 4). This seemed to indicate improper handling of the unzipped data, so I googled around to find the proper way to unzip gzipped data.

Two quick source code changes later and it’s working fine now!

First, have the method properly use only byte arrays, and an actual buffer:

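    import java.io.ByteArrayInputStream;
    import java.io.ByteArrayOutputStream;
    import java.io.IOException;
    import java.io.UncheckedIOException;
    import java.util.Base64;
    import java.util.zip.GZIPInputStream;

    // Decodes a base64 string and gunzips the result, staying in byte arrays throughout.
    public static byte[] unzip(String b64) {
        byte[] compressed = Base64.getDecoder().decode(b64);
        try (GZIPInputStream gis = new GZIPInputStream(new ByteArrayInputStream(compressed));
             ByteArrayOutputStream os = new ByteArrayOutputStream()) {
            byte[] buffer = new byte[1024];
            int len;
            // A single read() may return fewer bytes than requested, so loop until end of stream.
            while ((len = gis.read(buffer)) != -1) {
                os.write(buffer, 0, len);
            }
            return os.toByteArray();
        } catch (IOException e) {
            // Surface the failure instead of returning a dummy array that would decode as garbage.
            throw new UncheckedIOException(e);
        }
    }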

Next, change the variable that receives the result of the unzip call so its type matches the method’s new byte[] return type:

    case "gzip":
        try {
            decoded = unzip(chunkData);
        } catch (Exception e) {
            e.printStackTrace();
        }
        break;
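
The fix above covers the gzip case only. Since the zlib export showed the same symptoms, here is a hedged sketch of one way to handle both settings with the same read loop. The class name, method name and the use of an empty string for "no compression" are assumptions for illustration; `GZIPInputStream` and `InflaterInputStream` are the standard `java.util.zip` classes for the gzip and zlib formats.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.util.Base64;
import java.util.zip.GZIPInputStream;
import java.util.zip.InflaterInputStream;

// Hypothetical helper, not the code from the thread.
final class ChunkDecompressor {
    // Base64-decodes the chunk data, then decompresses it according to the layer's setting.
    static byte[] decode(String b64, String compression) throws IOException {
        InputStream in = new ByteArrayInputStream(Base64.getDecoder().decode(b64));
        switch (compression) {
            case "gzip":
                in = new GZIPInputStream(in);
                break;
            case "zlib":
                in = new InflaterInputStream(in);   // default Inflater expects the zlib wrapper
                break;
            case "":
                break;                              // uncompressed: base64 only
            default:
                throw new IllegalArgumentException("Unsupported compression: " + compression);
        }
        try (InputStream src = in; ByteArrayOutputStream out = new ByteArrayOutputStream()) {
            byte[] buffer = new byte[1024];
            int len;
            // Loop until end of stream; one read() is not guaranteed to fill the buffer.
            while ((len = src.read(buffer)) != -1) {
                out.write(buffer, 0, len);
            }
            return out.toByteArray();
        }
    }
}
```

For a 16 x 16 chunk the returned array should be 1024 bytes long, so comparing its length against width x height x 4 is a cheap check that the whole stream was read.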