Sockets: Bytes and Message Terminators
Sockets are fun. They're a cross-platform mechanism for opening a black hole to another dimension. Of course, you're the one that defines the other end and you're in full control of what gets sent into the hole.
Hopefully you've also written the code at the other end... that way you're also in full control of what comes back through the hole. Note that, like any good black hole, the specks of data flying back at you won't always be complete, nor in sequence. Fortunately, if you're using TCP, the underlying network stack will put the packets back in order for you, although a single read may still hand you only part of a message. UDP, on the other hand, will pass each datagram to you as it arrives, in whatever order that happens to be, and some may never arrive at all.
Text or Bytes?
It does get better though... especially when you're trying to send through real data, we'll call them bytes [valued from 0 to 255], as opposed to just fancy textual strings in the ASCII format. The latter is great if you actually want to send legible messages; a chat program perhaps? But useless if you want to send large numbers or other binary data quickly and efficiently.
A quick example... say you want to send the value of an HTML colour. Let's choose the colour off-white, of RGB value (250,250,250). You could send the string "250,250,250" and the receiver could parse the ASCII values. You could save further chars by passing in "250250250" and, as long as your receiver understands the format, it would still be understood. What you need to realise though is that, for each colour component, you're sending 3 bytes down the pipe when you really only need to send one. The characters '2', '5' and '0' each occupy a whole byte, and every one of those bytes has the ability to store a numeric value from 0 through to 255.
The first byte, the ASCII character '2', is actually represented by the numeric raw value 50. The basic idea is that ASCII is a big lookup table. The number 50 is the index in the table that tells the computer to display the character '2' on the screen. What you can then realise is that, in the buffer, you've actually placed the value 50 in the first byte, not 2. If you knew how to read that value as 50 and not 2, then you're already saving 1 byte (as '50' would take two ASCII characters!)
It gets better though... each byte can store up to the value 255, so instead of trying to jam the string "250" into three bytes, you simply need to store the value 250 into the first byte.
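To make the saving concrete, here is a small Python sketch of the three encodings discussed above. The variable names are just for illustration; the point is the byte counts and the fact that the character '2' travels as the value 50.

```python
# Three ways to put the colour (250, 250, 250) on the wire.
as_csv = "250,250,250".encode("ascii")    # text with separators
as_digits = "250250250".encode("ascii")   # text, fixed-width digits
as_raw = bytes([250, 250, 250])           # one byte per component

print(len(as_csv), len(as_digits), len(as_raw))  # byte counts for each form

# The first byte of the text form is the ASCII code for '2', not the number 2.
print(as_digits[0], ord("2"))

# The raw form carries the actual values.
print(list(as_raw))
```

The raw form is less than a third the size of the comma-separated string, and the receiver doesn't have to parse any text to get the numbers back.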
Signed or Unsigned?
A byte has 8 bits. Of these 8 bits, all 0s equals the numerical value zero and all 1s equals 255. You can therefore store 256 values in a byte. Computers, to represent negative numbers, use the most significant bit as a flag to indicate if a number is negative. (Endianness, by the way, is about the order of the bytes in a multi-byte value, not the order of the bits inside a byte.) Unfortunately, this takes one bit off your value, allowing you to store -128 to +127. Why not stop at -127? That would waste a bit pattern and would allow both -0 and +0. See Two's complement to understand more.
In the end, unless you're using C# (where Microsoft likes to 'help' you and limit ASCII usage) then it'll be up to the receiving end as to how to read the data. All the bits will be there; it's just a matter of casting them to a format you desire.
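As a rough illustration of "the bits are all there, it's just how you read them", the Python sketch below interprets the same single byte both ways using the standard `struct` module. The value 250 is just the colour component from the earlier example.

```python
import struct

raw = bytes([250])  # bit pattern 1111 1010

# "B" reads the byte as unsigned (0..255), "b" as signed two's complement (-128..127).
unsigned_value = struct.unpack("B", raw)[0]
signed_value = struct.unpack("b", raw)[0]

print(unsigned_value, signed_value)

# The extremes of the signed range: 0x7F is the largest, 0x80 the smallest.
largest = int.from_bytes(bytes([0x7F]), "big", signed=True)
smallest = int.from_bytes(bytes([0x80]), "big", signed=True)
print(largest, smallest)
```

Nothing changes on the wire between the two readings; the sender and receiver simply have to agree on which one is meant.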
Message Terminators
Firstly, I'm using the term 'message' here to describe a block of data sent from the server to a client. This block is formulated by the server with known start and end indicators and thrown down the tubes. If the client isn't listening hard enough, then they may well miss the start of the message and have no idea how to recover and process the rest of the data. The goal is to create unique tokens in your message stream to allow a client to discard data it can't deal with and get back to a known starting point. It can then process the next message in the queue.
Choosing a terminator can be difficult. A unique byte, or sequence of bytes, can be hard to determine if you are expecting to send arbitrary data in the message content. Human-readable characters can be used, if sending strings where those characters can't possibly be included. The 'pipe' | is a good choice, even a comma if you're in total control of message content and can replace/remove them from the middle. The issue will be that as soon as a terminating character is found in the middle of the string, then the client will expect that the message is complete and pass on the truncated data for further processing. It then gets worse when the client retrieves the next message which happens to be the second half of what should have been a complete message.
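The failure described above is easy to reproduce. This is a minimal Python sketch, assuming a client that simply splits the incoming bytes on the pipe character; `split_messages` is a made-up helper for the demonstration, not part of any library.

```python
from typing import List

TERMINATOR = b"|"

def split_messages(stream: bytes) -> List[bytes]:
    """Naively split a byte stream on the terminator, ignoring empty pieces."""
    return [part for part in stream.split(TERMINATOR) if part]

# Two well-behaved messages: the terminator never appears in the content.
safe_stream = b"hello|world|"

# One message whose content contains a pipe, followed by a second message.
unsafe_stream = b"cat file | grep foo|next|"

print(split_messages(safe_stream))
print(split_messages(unsafe_stream))
```

The second stream was meant to carry two messages, but the client sees three: the first message is cut in half at the embedded pipe and its tail is treated as a message in its own right.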
The best way to get around this is to have a header at the start of your messages. First and foremost this header needs to indicate the length of the message it describes. From then on you can have whatever message content you want, making sure the byte count matches the length you have set. In a recent application, I limited myself to 255 byte messages, so the first byte of any data sent was a numeric value that described the number of following bytes that made up the message. I then also put a terminating character at the end, the pipe, as a check so the client could confirm that the end was really there.
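Here is one way that scheme could look in Python. It is a sketch under assumptions rather than the code from the application mentioned: in particular, it assumes the length byte counts only the content and not the trailing pipe, and the function names are invented for the example.

```python
TERMINATOR = ord("|")
MAX_PAYLOAD = 255  # the largest value a single length byte can hold

def encode_message(payload: bytes) -> bytes:
    """Wrap payload as: [length byte][payload][pipe]."""
    if len(payload) > MAX_PAYLOAD:
        raise ValueError("payload too long for a one-byte length header")
    return bytes([len(payload)]) + payload + bytes([TERMINATOR])

def decode_message(frame: bytes) -> bytes:
    """Check the header and terminator of one complete frame, return the payload."""
    if len(frame) < 2:
        raise ValueError("frame too short to hold a header and terminator")
    length = frame[0]
    if len(frame) != length + 2:
        raise ValueError("frame size does not match the length header")
    if frame[-1] != TERMINATOR:
        raise ValueError("terminator missing where the header says it should be")
    return frame[1:-1]

# A pipe inside the content is harmless now, because the length says where to stop.
frame = encode_message(b"a|b")
print(frame)
print(decode_message(frame))
```

Because the receiver counts bytes instead of hunting for the pipe, the content can contain any byte value at all; the pipe at the end only serves as a sanity check.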
Putting it all together...
Once you've defined a message structure, your listeners and receivers should be able to decipher the string of bytes coming down the line with a little more ease. Of course, if they get held up and fail to read packets in time then, with UDP at least, those packets are lost. Your next step would be to ensure each end is in a known state and that, if a state hasn't progressed, the data gets re-sent.
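As a sketch of what the receiving side might look like, the Python class below collects whatever chunks arrive and hands back complete messages once they're available. It assumes the one-byte length header and trailing pipe described earlier, with the length counting only the content. The class name and the "drop one byte and retry" recovery are my own illustration, not a standard technique from any library.

```python
from typing import List

TERMINATOR = ord("|")

class FrameReader:
    """Accumulates received bytes and extracts [length][payload][pipe] frames."""

    def __init__(self) -> None:
        self._buffer = bytearray()

    def feed(self, chunk: bytes) -> List[bytes]:
        """Add a chunk from the socket; return any messages now complete."""
        self._buffer.extend(chunk)
        messages = []
        while self._buffer:
            frame_size = self._buffer[0] + 2  # header + payload + terminator
            if len(self._buffer) < frame_size:
                break  # rest of the frame hasn't arrived yet
            if self._buffer[frame_size - 1] == TERMINATOR:
                messages.append(bytes(self._buffer[1:frame_size - 1]))
                del self._buffer[:frame_size]
            else:
                # Terminator isn't where the header promised: assume we're out
                # of step, drop one byte and try again from the next position.
                del self._buffer[0]
        return messages

reader = FrameReader()
print(reader.feed(b"\x05hel"))           # only part of a message so far
print(reader.feed(b"lo|\x02hi|"))        # the rest, plus a second message
```

In real use each chunk would come from something like `sock.recv(4096)`. The recovery step is best-effort only: a stray byte with a large value can make the reader wait for a long frame that never comes, which is one more reason to have each end confirm its state as described above.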
I'll post again in the future with code samples for the theories above.