NFC Storage: There's plenty of room at the bottom

Compressing data on NFC tags

Recently I've been working with Near Field Communication tags. These are the tiny, un-powered circuits that, along with RFID, make contactless payments possible. The bank card you have in your purse or wallet probably includes an NFC tag.

NFC tags are really small electrical circuits that are powered entirely by induction from a nearby device. When you put your phone near one, the electromagnetic field from your phone provides just enough power to fire up the NFC chip. This allows the NFC tag to respond with a small electrical signal of its own. NFC tags are smarter than RFID chips and they can hold short, secure conversations.

But they can also be used as un-powered storage devices. Say you want to keep track of a bunch of equipment. You attach an NFC tag to each device with data saying who owns it, when it was last serviced and so on. They are a great, and relatively cheap, way of keeping track of your stuff.

But here's what makes them especially fun to work with: they have really, really small amounts of storage. An NFC tag will typically have between 40 bytes and 4KB of storage. The more storage you have, the more you pay. The prices are changing all the time, but looking on eBay, the 4K cards are currently going for just over $2, and the smaller tags are around 30-40 cents each. If you tag $10 products with a 4KB tag, you are probably throwing away your profit. So getting the smaller, cheaper tags to work can make a lot of financial sense.

That's why most people choose tags like the NTAG213, which comes with just 144 bytes of storage. That's a tiny amount of space. What's worse, 7 bytes of that storage is already taken up with the unique ID number. If you want the tag to fire up a custom reader app on your phone, you'll need to include a thing called an AAR record. That'll take up around 40 bytes. So you'll have around 100 bytes left to have fun with. 100 bytes is really not much space.

For example, 100 bytes on a tag is not even enough storage to record the contents of this sentence.

For that reason, you'd have to be just insane to try and store anything useful on an NFC tag. Well, you can't be just insane. You also need to do a little work.

1 Custom binary formats

The obvious place to start is with some custom binary format. That's fine if you need to write a big pile of NFC tags that are all pretty much the same. If your data's all very similar, you can have a single data scheme:

  • Grommet size (1 byte)
  • Reciprocating flange description (4 bytes)
  • … and so on…

With a binary format, you really get the maximum possible storage. The downside? It doesn't work so well if you want to store heterogeneous data. The more varied your data, the more problems you'll have.
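As a rough sketch of such a fixed scheme in Python, assuming just the two fields listed above: the field order, the little-endian layout and the sample values are all made up for illustration, not taken from any real tag format.

```python
import struct

# Hypothetical fixed layout: 1 unsigned byte for the grommet size,
# followed by 4 raw bytes for the flange description.
RECORD = struct.Struct("<B4s")


def encode_record(grommet_size: int, flange: bytes) -> bytes:
    """Pack one record into exactly RECORD.size bytes."""
    return RECORD.pack(grommet_size, flange)


def decode_record(payload: bytes) -> tuple:
    """Unpack a record written by encode_record."""
    return RECORD.unpack(payload)


payload = encode_record(12, b"RF-7")
print(len(payload))            # 5
print(decode_record(payload))  # (12, b'RF-7')
```

Every byte on the tag carries data, but the reader and the writer have to agree on the layout in advance.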

Let's say you manufacture tools, and you want to record the weight of the hammer, or the maximum strain of the torque wrench or the amount of power the portable sander will take before killing you. Dozens of possible data items you might want to record on the tag.

For that kind of situation you can:

  • Make a lot of stuff optional. The downside is that your binary format needs to record some piece of data that means "I don't have a value for this". If you want to record 3 of 50 potential data items, you've already blown a lot of your space just saying which 47 things you don't have.
  • Record some sort of version number for the particular scheme being used. A version for hand drills. A version for electrical sanders. A version for electrical drills. The number of versions you would need will expand exponentially. Things will become unmanageable.
  • Try to include the scheme within the data. So include extra information alongside each item of data to say what it means. Yeah. That sounds like a good idea…

None of these options look good if you need to support a lot of data schemes. Optional data wastes space. Multiple versions will fry your brain. And obviously storing the scheme alongside the data in that tiny, tiny amount of space on an NFC chip is absolutely insane. Only a fool would try to store the scheme alongside the data.

Let's store the scheme alongside the data.

2 Putting JSON on an NFC chip

The two main data formats that include their own scheme are XML and JSON. In both cases, they include field names to say what the stuff means. You don't have to say what you don't have. But you do need to say what you do have. Nobody likes XML any more, so we'll look at JSON. Let's say we want to record this on a tag:

{
   "tests":[
      {
         "date":"2018-01-15",
         "type":"Electrical"
      },
      {
         "date":"2019-06-17",
         "type":"Safety"
      },
      {
         "date":"2020-09-20",
         "type":"Functional"
      },
      {
         "date":"2022-08-11",
         "type":"Scrap"
      }
   ],
   "weight":"4",
   "torque":"4",
   "capacity":"4"
}

This is data that tells you something about when the gadget needs to be serviced… and stuff. You would obviously never store JSON formatted like this. You'd remove all of the whitespace and get this:

{"tests":[{"date":"2018-01-15","type":"Electrical"},{"date":"2019-06-17","type":"Safety"},{"date":"2020-09-20","type":"Functional"},{"date":"2022-08-11","type":"Scrap"}],"weight":"4","torque":"4","capacity":"4"}

I make that 211 bytes of goodness. Over twice the size that we actually have space for on the NFC tag. So we better do some compression.
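If you want to check that figure yourself, here is a minimal Python sketch. It assumes the record is built in the same key order as the example above and that everything is plain ASCII.

```python
import json

record = {
    "tests": [
        {"date": "2018-01-15", "type": "Electrical"},
        {"date": "2019-06-17", "type": "Safety"},
        {"date": "2020-09-20", "type": "Functional"},
        {"date": "2022-08-11", "type": "Scrap"},
    ],
    "weight": "4",
    "torque": "4",
    "capacity": "4",
}

# separators=(",", ":") drops the spaces json.dumps normally adds
# after commas and colons.
minified = json.dumps(record, separators=(",", ":"))
size = len(minified.encode("ascii"))
print(size)  # 211
```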

Target size: 100 bytes
Current size: 211 bytes.

3 Compression

If we take the JSON payload and run it through zip-compression we get… 286 bytes. That's right: zip compression actually makes the data bigger. That's because a zip archive wraps the compressed data in file headers and a directory listing. That overhead hardly matters if you're dealing with larger amounts of data. For tiny amounts of data, zip is pretty terrible. But of course, zip isn't the only available compression format. If you run it through gzip you get 158 bytes. That's better, but it's still some way from our goal of 100 bytes. If we want to shrink stuff more, we're going to have to think again.
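A quick way to try this comparison is with Python's standard zipfile and gzip modules. This is only a sketch: the archive member name "tag.json" is an arbitrary choice, and the exact byte counts depend on the tool, the compression level and the file name, so they may not match the figures quoted above.

```python
import gzip
import io
import zipfile

RAW = (
    b'{"tests":[{"date":"2018-01-15","type":"Electrical"},'
    b'{"date":"2019-06-17","type":"Safety"},'
    b'{"date":"2020-09-20","type":"Functional"},'
    b'{"date":"2022-08-11","type":"Scrap"}],'
    b'"weight":"4","torque":"4","capacity":"4"}'
)


def zip_size(data: bytes, name: str = "tag.json") -> int:
    """Size of a one-file zip archive holding data (headers included)."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as archive:
        archive.writestr(name, data)
    return len(buffer.getvalue())


def gzip_size(data: bytes) -> int:
    """Size of data as a gzip stream (header and trailer included)."""
    return len(gzip.compress(data))


print("raw :", len(RAW))
print("zip :", zip_size(RAW))
print("gzip:", gzip_size(RAW))
```

Both formats normally use the same DEFLATE compression underneath, so most of the gap between them is container overhead rather than better or worse compression.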

Target size: 100 bytes
Current size: 158 bytes.

4 RJSON

RJSON is a tool that allows you to re-write JSON into a more compact form. You can find it here: http://www.cliws.com/p/rjson/ RJSON identifies repeating groups within your data, and uses JSON arrays to store the additional data. It allows us to turn this:

{
   "tests":[
      {
         "date":"2018-01-15",
         "type":"Electrical"
      },
      {
         "date":"2019-06-17",
         "type":"Safety"
      },
      {
         "date":"2020-09-20",
         "type":"Functional"
      },
      {
         "date":"2022-08-11",
         "type":"Scrap"
      }
   ],
   "weight":"4",
   "torque":"4",
   "capacity":"4"
}

Into this:

{
    "capacity": "4",
    "tests": [
        {
            "date": "2018-01-15",
            "type": "Electrical"
        },
        [
            2,
            "2019-06-17",
            "Safety",
            "2020-09-20",
            "Functional",
            "2022-08-11",
            "Scrap"
        ]
    ],
    "torque": "4",
    "weight": "4"
}
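The core idea can be sketched in a few lines of Python. To be clear, this is not RJSON's code or its full algorithm: pack_uniform_objects is a hypothetical helper that only handles one list of objects sharing the same keys, and the schema index is simply passed in (2 here, to match the output above).

```python
def pack_uniform_objects(rows, schema_index):
    """Keep the first object as-is and flatten the rest into one array.

    The flattened array starts with schema_index, followed by the values
    of each remaining object in the key order of the first object.
    Lists that do not share one set of keys are returned unchanged.
    """
    if len(rows) < 2:
        return rows
    keys = list(rows[0])
    if any(list(row) != keys for row in rows[1:]):
        return rows
    flat = [schema_index]
    for row in rows[1:]:
        flat.extend(row[key] for key in keys)
    return [rows[0], flat]


tests = [
    {"date": "2018-01-15", "type": "Electrical"},
    {"date": "2019-06-17", "type": "Safety"},
    {"date": "2020-09-20", "type": "Functional"},
    {"date": "2022-08-11", "type": "Scrap"},
]
packed = pack_uniform_objects(tests, schema_index=2)
print(packed)
```

The field names "date" and "type" now appear once instead of four times, which is where the saving comes from.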

If you skip all of the whitespace, you'll find that your 211-byte JSON object shrinks down to 169 bytes. Nowhere near small enough to fit on the NFC tag, but of course, it's still JSON. So we can still run it through compression:

  • Zip compression: 283 bytes. Zip has just given up at this point.
  • Gzip compression: 155 bytes. That's right. We just saved 3 bytes.

So even though we've reformatted the JSON, we've now got to a point where even gzip is struggling. So what do we do? Give up? Throw in the towel? Crumple like the whimpering fool we always suspected we were and go home for the night? Of course we do. And in the morning, we look at how else we can change the format.

Target size: 100 bytes
Current size: 155 bytes.

5 The walk format: leaving JSON behind

We're building this compression chain and we've now got to the point where we need to think of switching JSON for something else. We'll still use JSON with the outside world, but inside our compression code we'll switch to something a little more compact. A JSON object really just describes a tree of data. Conceptually, our data is this:

+-tests:+ (array)
|       |
|       +-object:+
|       |        |
|       |        +- date: 2018-01-15
|       |        |
|       |        +- type: Electrical
|       |
|       +-array:+
|               |
|               +- 2
|               |
|               +- 2019-06-17
|               |
|               +- Safety
|               |
|               +- 2020-09-20
|               |
|               +- Functional
|               |
|               +- 2022-08-11
|               |
|               +- Scrap
|
+-weight: 4
|
+-torque: 4
|
+-capacity: 4

But JSON's formatting, whilst it's nice and readable, takes up quite a bit of space. Instead, we can replace the JSON format with something smaller. Imagine you're a robot, walking that tree of data. To visit the whole thing, at each step you will do one of four things:

  • Go down into an object sub-tree
  • Go down into an array sub-tree
  • Move sideways, from one branch to the one next to it, or
  • Move back up a branch to its parent.

If we represent each of these four movements by the symbols +, *, > and ^ respectively, and if we then output the text content of each node we visit, we can convert JSON data like this (whitespace removed):

{"tests":[{"date":"2018-01-15","type":"Electrical"},["2","2019-06-17","Safety","2020-09-20","Functional","2022-08-11","Scrap"]],"weight":"4","torque":"4","capacity":"4"}

Into this:

tests*+date>2018-01-15>type>Electrical^*2>2019-06-17>Safety>2020-09-20>Functional>2022-08-11>Scrap^^weight>4>torque>4>capacity>4
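Here is a minimal Python sketch of that walk. The rules are inferred from the example string rather than taken from the original code: the root object gets no opening symbol, and ">" is only written between two neighbouring pieces of text, because the other three symbols already act as separators. Escaping is left out, so the sketch simply refuses text that contains a reserved symbol.

```python
DOWN_OBJECT, DOWN_ARRAY, SIDEWAYS, UP = "+", "*", ">", "^"
RESERVED = set("+*>^")


def walk_tokens(node, is_root=False):
    """Yield ("move", symbol) and ("text", value) pairs in visiting order."""
    if isinstance(node, dict):
        if not is_root:
            yield ("move", DOWN_OBJECT)
        for key, value in node.items():
            yield ("text", str(key))
            yield from walk_tokens(value)
        if not is_root:
            yield ("move", UP)
    elif isinstance(node, list):
        yield ("move", DOWN_ARRAY)
        for item in node:
            yield from walk_tokens(item)
        yield ("move", UP)
    else:
        yield ("text", str(node))


def encode_walk(root: dict) -> str:
    """Serialise a JSON-style object (the root must be a dict)."""
    out = []
    previous_kind = None
    for kind, value in walk_tokens(root, is_root=True):
        if kind == "text":
            if RESERVED & set(value):
                raise ValueError("escaping is not implemented in this sketch")
            if previous_kind == "text":
                out.append(SIDEWAYS)  # sideways step between two text nodes
        out.append(value)
        previous_kind = kind
    return "".join(out)


data = {
    "tests": [
        {"date": "2018-01-15", "type": "Electrical"},
        ["2", "2019-06-17", "Safety", "2020-09-20",
         "Functional", "2022-08-11", "Scrap"],
    ],
    "weight": "4",
    "torque": "4",
    "capacity": "4",
}
walk = encode_walk(data)
print(walk)
print(len(walk))  # 128
```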

It's the same data. It represents the same thing. Of course, if we have text that includes any of the symbols +, *, > or ^, we'd need to escape them with an extra character. But chances are, we won't get many of them. So now that we've got rid of the JSON formatting, how long is our string? 128 bytes. It's still not short enough, but it's shorter than gzip compression gave us. And far shorter than anything zip managed. And of course, we're still using uncompressed text, so we can still squeeze it a bit more, right? Wrong.

  • Zip compression: 262 bytes. And the crowd laughed.
  • Gzip compression: 134 bytes.

Oh $*#. Even gzip has given up now. Gzipping this string actually makes it 6 bytes longer. So we're 28 bytes short of our goal and we can't even compress. Or can we?

Target size: 100 bytes
Current size: 128 bytes.

6 We don't need no stinking 8-bit

At this point, we're going to assume that the JSON data is ASCII. I know this is a terrible sin. Not only will we be unable to support foreign character sets, but we won't even be able to embed emojis. However, if we restrict our data to ASCII, that means we only need 128 characters. That's 7 bits. And bytes are… 1, 2, 3,… 8 bits. Awesome.

So we can take bytes that look like this:

01001111 01111001

And then remove the highest order bits:

1001111 1111001

Looks OK to me (Geddit? Geddit? Oh I don't know why I bother). Skipping those useless bits reduces our payload by 12.5%. Which means our payload is now… (drum-roll please) 113 bytes. So close!

But. There's more. Since the shift key was invented, most people have used two cases: upper and lower. And unless you're in a very SHOUTY MOOD, you probably create data that is mostly lowercase. That can help us. If we prefix each of the uppercase letters with some symbol, say the ";" symbol (and yes, that means we probably need to convert any existing ";" symbols into ";;") we can then uppercase the whole string, like this:

TESTS*+DATE>2018-01-15>TYPE>;ELECTRICAL^*2>2019-06-17>;SAFETY>2020-09-20>;FUNCTIONAL>2022-08-11>;SCRAP^^WEIGHT>4>TORQUE>4>CAPACITY>4
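The case trick is easy to sketch in Python. The function names are made up for this example, the input is assumed to be ASCII, and the decoder assumes well-formed input (it never ends on a lone ";").

```python
def fold_case(text: str, marker: str = ";") -> str:
    """Uppercase text, prefixing original capitals with marker.

    A literal marker in the input is doubled so it can be told apart.
    """
    out = []
    for ch in text:
        if ch == marker:
            out.append(marker * 2)
        elif ch.isupper():
            out.append(marker + ch)
        else:
            out.append(ch.upper())
    return "".join(out)


def unfold_case(folded: str, marker: str = ";") -> str:
    """Reverse fold_case."""
    out = []
    i = 0
    while i < len(folded):
        ch = folded[i]
        if ch == marker:
            i += 1
            out.append(folded[i])  # a capital letter or a literal marker
        else:
            out.append(ch.lower())
        i += 1
    return "".join(out)


walk = (
    "tests*+date>2018-01-15>type>Electrical^"
    "*2>2019-06-17>Safety>2020-09-20>Functional>2022-08-11>Scrap^^"
    "weight>4>torque>4>capacity>4"
)
folded = fold_case(walk)
print(folded)
print(len(walk), len(folded))  # 128 132
```

The string grows by one character for each capital letter, which is why the trick only pays off when the data is mostly lowercase.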

So we only need to use characters ASCII 32 to ASCII 95. That's 64 characters. And 64 characters can be represented by 6 bits. So if we subtract 32 from each of the bytes, we can then get rid of the 2 most significant bits:

01001111 01011001 –> 00101111 00111001 –> 101111 111001

12 bits where there were 16 before. A 25% saving. That means our final score is a suspiciously convenient: 100 bytes.

To re-assemble our original JSON data, we read it off the NFC tag, and do the reverse of the above.
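Both bit-squeezing steps can be sketched with one pair of Python helpers. The names pack_bits and unpack_bits are invented for this example, and the sketch stores nothing but the packed bits, so the decoder has to be told how many characters to expect. Packed this way, the 128-character string comes to 112 bytes at 7 bits, and the 132-character uppercased string comes to 99 bytes at 6 bits. That is one byte less than the figures quoted in the text each time. A stored length or marker byte would account for the difference, but that is a guess.

```python
def pack_bits(text: str, width: int, offset: int = 0) -> bytes:
    """Pack each character into width bits after subtracting offset."""
    acc = 0
    nbits = 0
    for ch in text:
        code = ord(ch) - offset
        if not 0 <= code < (1 << width):
            raise ValueError(f"{ch!r} does not fit in {width} bits")
        acc = (acc << width) | code
        nbits += width
    pad = (-nbits) % 8  # zero bits added to reach a whole byte
    return (acc << pad).to_bytes((nbits + pad) // 8, "big")


def unpack_bits(data: bytes, width: int, count: int, offset: int = 0) -> str:
    """Reverse pack_bits; count is the number of characters stored."""
    acc = int.from_bytes(data, "big")
    total = len(data) * 8
    mask = (1 << width) - 1
    chars = []
    for i in range(count):
        shift = total - width * (i + 1)
        chars.append(chr(((acc >> shift) & mask) + offset))
    return "".join(chars)


WALK = (
    "tests*+date>2018-01-15>type>Electrical^"
    "*2>2019-06-17>Safety>2020-09-20>Functional>2022-08-11>Scrap^^"
    "weight>4>torque>4>capacity>4"
)
FOLDED = (
    "TESTS*+DATE>2018-01-15>TYPE>;ELECTRICAL^"
    "*2>2019-06-17>;SAFETY>2020-09-20>;FUNCTIONAL>2022-08-11>;SCRAP^^"
    "WEIGHT>4>TORQUE>4>CAPACITY>4"
)

seven_bit = pack_bits(WALK, width=7)
six_bit = pack_bits(FOLDED, width=6, offset=32)
print(len(seven_bit))  # 112
print(len(six_bit))    # 99
```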

Target size: 100 bytes
Current size: 100 bytes.
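For the return trip, once the bits are unpacked and the case is restored, the walk string has to be turned back into a tree. Below is a minimal Python sketch of that last step, written against the example string rather than the original code. It assumes the root is an object, that there is no escaping, and that there are no empty strings, and every value comes back as text.

```python
def decode_walk(walk: str) -> dict:
    """Rebuild nested dicts and lists from a walk string."""
    root = {}
    stack = [root]         # containers currently being filled
    pending_keys = [None]  # for each container: a key awaiting its value

    def add(value):
        top = stack[-1]
        if isinstance(top, list):
            top.append(value)
        elif pending_keys[-1] is None:
            pending_keys[-1] = value  # first text of a pair is the key
        else:
            top[pending_keys[-1]] = value
            pending_keys[-1] = None

    text = ""
    for ch in walk:
        if ch not in "+*>^":
            text += ch
            continue
        if text:  # any symbol ends the current text node
            add(text)
            text = ""
        if ch in "+*":
            child = {} if ch == "+" else []
            add(child)
            stack.append(child)
            pending_keys.append(None)
        elif ch == "^":
            stack.pop()
            pending_keys.pop()
    if text:
        add(text)
    return root


walk = (
    "tests*+date>2018-01-15>type>Electrical^"
    "*2>2019-06-17>Safety>2020-09-20>Functional>2022-08-11>Scrap^^"
    "weight>4>torque>4>capacity>4"
)
decoded = decode_walk(walk)
print(decoded["tests"][0])  # {'date': '2018-01-15', 'type': 'Electrical'}
print(decoded["capacity"])  # 4
```

The result is still in the RJSON-style packed form, so it needs one more pass through RJSON to get back the original JSON.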

7 Conclusion

If you are writing data to NFC chips and the following are true:

  • Using larger chips makes no financial sense, and
  • You want to store data on the chips rather than the server, and
  • Your data will be structured in many different ways, then

a little work will mean you can store data on an NFC tag in an accessible and flexible format.

You can get the code here.