How many bytes can a Reed-Solomon codeword bite?

In my previous article on DOCSIS Codeword Errors, reader “Sandy” asked how many Reed-Solomon codewords would be required to transport 1518 bytes of data.  This is a very good question and one that I had never considered until I first faced it when designing DOCSIS data-layer models in MATLAB.  In the example I used in my previous article I chose a word that was 16 bytes long.  I did this because it made it easy to draw an 8×8 Reed-Solomon matrix to demonstrate the Reed-Solomon codeword algorithm.  The algorithm is actually much more complex than my simple illustration in the previous article, but the complex math was unnecessary to convey the basic concept.  So we can skip the complexities in this article as well and make some basic assumptions.

The first assumption is that our word size is 1518 bytes.  Let’s also assume that our upstream modulation is 16-QAM.  To start to understand how our codewords are built we need to look at the configuration of the CMTS, specifically the modulation profile, which lives in the running-config of the CMTS.  If you are not familiar with this, don’t worry: many people who work on CMTSs are confused by some of the settings in modulation profiles, so this will be fun for all.

DOCSIS Cable Modulation Profiles

This first section of code is taken directly from a CMTS running-config.  The running-config is the instruction set that tells a CMTS how to operate on a specific plant.  In particular we are giving this CMTS the information it needs for upstream transmission of data in 16-QAM.  There are eight (8) lines of code below.  Each one defines the parameters for a different type of transmission that the cable modem can use to send data.  For instance, the first line of code tells the cable modem how to transmit a REQuest message.  A REQuest message tells the CMTS that the cable modem has data to transmit.  Maybe it’s that 1518-byte packet whose Reed-Solomon FEC encoding we want to understand.  To transmit a 1518-byte packet the cable modem would use a ‘long’ data grant, which is used for large upstream transmissions.  Notice there are two types of long profiles, long and a-long.  Long is used by DOCSIS 1.0 and 1.1 cable modems while a-long is used by DOCSIS 2.0 and DOCSIS 3.0 cable modems.  So this is a mixed-mode TDMA/ATDMA modulation profile which supports both DOCSIS 1.x and later cable modems.

The long data grant profile contains the FEC information for how many bytes of data are stuffed into each Reed-Solomon codeword.  If you are familiar with looking at modulation profiles you would know that it is the second number after the profile type (long or a-long), which is 232.  That is the k value from my first article on codeword errors, which follows as:

t<\frac{q-k+1}{2}

where t = number of correctable bytes (Reed-Solomon symbols) per codeword, q = codeword length in bytes, k = data length in bytes.

cable modulation-profile 143 request 0 16 0 8 16qam scrambler 152 no-diff 176 fixed 
cable modulation-profile 143 initial 5 34 0 48 16qam scrambler 152 no-diff 224 fixed 
cable modulation-profile 143 station 5 34 0 48 16qam scrambler 152 no-diff 224 fixed 
cable modulation-profile 143 short 5 78 19 17 16qam scrambler 152 no-diff 200 shortened
cable modulation-profile 143 long 9 232 139 77 16qam scrambler 152 no-diff 216 shortened
cable modulation-profile 143 a-short 5 78 19 17 16qam scrambler 152 no-diff 100 shortened qpsk1 1 2048 
cable modulation-profile 143 a-long 9 232 139 77 16qam scrambler 152 no-diff 108 shortened qpsk1 1 2048 
cable modulation-profile 143 a-ugs 9 232 139 77 16qam scrambler 152 no-diff 108 shortened qpsk1 1 2048
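As a rough check of the formula against the long profile line above, here is a minimal Python sketch. The helper names (parse_fec, max_correctable) are hypothetical, and the field positions are an assumption: the first number after the profile type is taken as FEC T and the second as FEC k. The sketch also assumes the usual Reed-Solomon arrangement of 2 x T parity bytes per codeword, so that q = k + 2T.

```python
# Hypothetical helpers for illustration only; not a CMTS or vendor API.
LONG_PROFILE = (
    "cable modulation-profile 143 long 9 232 139 77 16qam "
    "scrambler 152 no-diff 216 shortened"
)

def parse_fec(line):
    """Pull the profile number, type, FEC T and FEC k out of one config line.

    Assumption: after 'cable modulation-profile <id> <type>' the first
    number is FEC T and the second is FEC k.
    """
    fields = line.split()
    return {
        "profile": int(fields[2]),
        "type": fields[3],
        "fec_t": int(fields[4]),
        "fec_k": int(fields[5]),
    }

def max_correctable(q, k):
    """Largest integer t that satisfies t < (q - k + 1) / 2."""
    return (q - k) // 2

fec = parse_fec(LONG_PROFILE)
# Assumption: each codeword carries 2 * T parity bytes on top of k data bytes.
q = fec["fec_k"] + 2 * fec["fec_t"]
print(fec["fec_t"], fec["fec_k"], q, max_correctable(q, fec["fec_k"]))
```

Under those assumptions a full long codeword is 250 bytes, and the formula gives back the same T of 9 that the profile configures.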

A helpful feature on both Cisco and Arris CMTSs is that you can use the ‘show cable modulation-profile’ command to display the modulation profiles with a header on top.  The header tells you what each element of the modulation profile represents, so that even if the order changes on a CMTS, the headers keep you straight.

volpefirm#show cable modulation-profile | include 143
Mod  IUC     Type  Pre Diff FEC  FEC  Scrmb  Max Guard Last Scrmb Pre   Pre   RS
                   len enco T    k    seed   B   time  CW         offst Type
143  request 16qam 176 no   0x0  0x10 0x152  0   8     no   yes   0     16qam na 
143  initial 16qam 224 no   0x5  0x22 0x152  0   48    no   yes   0     16qam na 
143  station 16qam 224 no   0x5  0x22 0x152  0   48    no   yes   0     16qam na 
143  short   16qam 200 no   0x5  0x4E 0x152  19  17    yes  yes   0     16qam na 
143  long    16qam 216 no   0x9  0xE8 0x152  139 77    yes  yes   0     16qam na 
143  a-short 16qam 100 no   0x5  0x4E 0x152  19  17    yes  yes   0     qpsk1 no 
143  a-long  16qam 108 no   0x9  0xE8 0x152  139 77    yes  yes   0     qpsk1 no 
143  a-ugs   16qam 108 no   0x9  0xE8 0x152  139 77    yes  yes   0     qpsk1 no

Now looking at the output above, you can see the FEC k values for the long and a-long mod-profiles are both 0xE8.  That is a hex value.  Using the Hex to Decimal converter on Google we find that E8 = 232.  So now we know for certain that each full codeword in a long data grant contains 232 bytes of data.  So if we have 1518 bytes, how many codewords do we have?  That’s pretty simple.  Just divide 1518 bytes by the FEC k bytes.

1518 / 232 ≈ 6.5 codewords
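If you would rather not reach for an online converter, the same conversion and division can be sketched in a few lines of Python. The row is copied from the ‘long’ line of the show output above; the column positions (FEC T sixth, FEC k seventh) are read off that header and are an assumption about this particular output layout.

```python
# The 'long' row copied from the show output above.
ROW = "143  long    16qam 216 no   0x9  0xE8 0x152  139 77    yes  yes   0     16qam na"

fields = ROW.split()
# Assumption: columns follow the header order, so FEC T and FEC k are
# the 6th and 7th whitespace-separated fields, both printed in hex.
fec_t = int(fields[5], 16)
fec_k = int(fields[6], 16)

frame_bytes = 1518
ratio = frame_bytes / fec_k

print(fec_t, fec_k, round(ratio, 2))
```

The exact ratio is a little over six and a half, which is where the fractional codeword comes from.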

Shortened Last Codeword

Wait!  How can we have 0.5 codewords?  What is a half codeword?

Notice the 11th column over to the right in the second box above.  It says ‘Last CW’, which is an abbreviation for shortened last codeword.  The DOCSIS standard takes into account that data lengths will frequently be random, resulting in words that don’t evenly fill the FEC k value provided in the modulation profile.  So you will end up with some number of bytes left over.  As long as there are a minimum of 16 bytes (you need that many to make up a codeword) you can have a shortened codeword and call it a day.  So you are not required to have the full 232 bytes.  In our case, six full codewords carry 6 × 232 = 1392 bytes, which leaves 1518 − 1392 = 126 bytes for the last one, comfortably above the minimum.  This means that you still end up with a complete codeword, just not one holding the full FEC k bytes defined by the modulation profile.  So the total number of codewords for 1518 bytes is seven (7) and not six and a half (6.5).
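The split described above can be sketched in Python as follows. The function name split_into_codewords is hypothetical, and the sketch only flags a leftover smaller than 16 bytes rather than modeling how a real modem would handle it.

```python
MIN_LAST_CW_BYTES = 16  # minimum data bytes for a shortened last codeword, per the text above

def split_into_codewords(payload_bytes, fec_k, min_last=MIN_LAST_CW_BYTES):
    """Return (data bytes per codeword, flag for a too-short leftover).

    Full codewords carry fec_k data bytes; any remainder goes into one
    shortened last codeword. The flag is True when the remainder is
    non-zero but below min_last, a case this sketch does not handle.
    """
    full, remainder = divmod(payload_bytes, fec_k)
    sizes = [fec_k] * full
    if remainder:
        sizes.append(remainder)
    too_short = 0 < remainder < min_last
    return sizes, too_short

sizes, too_short = split_into_codewords(1518, 232)
print(len(sizes), sizes, too_short)
```

For 1518 bytes and k = 232 this gives six codewords of 232 data bytes plus one shortened codeword of 126 data bytes, seven in total.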

Galois Fields

How does all of this data get structured?  It relies on a branch of math called Galois Fields (also known as finite fields), which is a very impressive term for the arithmetic behind how the complex matrices are put together.  I described these matrices in very simple terms in my previous article, but will leave it as homework for anyone who would like to explore the math behind this further to start with the link I provided.  I do recommend that if you become an expert in Galois Fields you don’t bring it up at your next social hour.  People will look at you funny.  Trust me. 🙂