E1.20 RDM (Remote Device Management) Protocol Forums


eldoMS August 17th, 2011 02:39 PM

Handling incorrect messages
 
Hello,

I am working on implementing the correct actions, hopefully in line with the standard, for a responder to take when it receives an incorrect message.

After reading the standard several times I am still left with the following questions on how to do things properly:
1. A message count must be 0x00 as received by a responder. What if a non-zero value is received, this should generate a format error?
2. What error to give with a wrong message class value -> format error?
3. When the packet data length mismatches with data length -> format error?
4. When a PID requested via PARAMETER_DESCRIPTION is not available -> format error or range error?

Greetings

Marc

sblair August 17th, 2011 05:19 PM

Quote:

1. A message count must be 0x00 as received by a responder. What if a non-zero value is received, this should generate a format error?

In many respects this is up to you. I generally advocate handling errors as gracefully as possible. In this case, you could NACK the request message, but from a responder's point of view it really doesn't matter what that field is set to. Yes, the controller is supposed to send 0x00, but you could easily ignore whatever value it sent and still handle and respond to the message correctly, which is what I would suggest.

Quote:

2. What error to give with a wrong message class value -> format error?
If you get a message with an invalid COMMAND_CLASS, or one that you just don't support for that message, then the correct response would be to NACK with NR_UNSUPPORTED_COMMAND_CLASS.

Quote:

3. When the packet data length mismatches with data length -> format error?
It depends. If you were able to decode the message and had all the data necessary to handle it, I'd just go ahead and handle it. Otherwise NR_FORMAT_ERROR would probably make the most sense.

Quote:

4. When a PID requested via PARAMETER_DESCRIPTION is not available -> format error or range error?
In this case, NR_DATA_OUT_OF_RANGE is what you should use.
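
The answers above can be sketched for a PARAMETER_DESCRIPTION request roughly as follows. This is only a minimal sketch: the function name, its arguments, and the `known_pids` set are hypothetical, and the numeric constants are the values E1.20 assigns as far as I know, so check them against your copy of the standard.

```python
from typing import Optional, Set

# Values as assigned in ANSI E1.20; verify against the standard.
GET_COMMAND = 0x20
NR_FORMAT_ERROR = 0x0001
NR_UNSUPPORTED_COMMAND_CLASS = 0x0005
NR_DATA_OUT_OF_RANGE = 0x0006


def nack_reason_for_parameter_description(
    command_class: int,
    message_count: int,
    param_data: bytes,
    known_pids: Set[int],
) -> Optional[int]:
    """Return a NACK reason code, or None if the request can be answered.

    Hypothetical helper: known_pids is the set of PIDs this responder
    is able to describe.
    """
    # The message count is deliberately ignored: the responder does not
    # need it to handle the request.
    del message_count

    # This sketch only supports PARAMETER_DESCRIPTION as a GET.
    if command_class != GET_COMMAND:
        return NR_UNSUPPORTED_COMMAND_CLASS

    # The request must carry exactly one 16-bit PID. Without it there is
    # not enough data to handle the message.
    if len(param_data) != 2:
        return NR_FORMAT_ERROR

    requested_pid = int.from_bytes(param_data, "big")
    if requested_pid not in known_pids:
        return NR_DATA_OUT_OF_RANGE
    return None
```

A return value of None means the responder goes on to build its normal ACK response.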

ericthegeek August 18th, 2011 09:44 AM

Quote:

Originally Posted by eldoMS (Post 2209)
2. What error to give with a wrong message class value -> format error?

This is a bit of a gray area. It depends on how you interpret table 6-7.

Technically, NACK is only allowed for GET_COMMAND_RESPONSE and SET_COMMAND_RESPONSE. The table doesn't address whether you can NACK an unknown command class. Scott is correct that if you do decide to NACK it, NR_UNSUPPORTED_COMMAND_CLASS is the correct NACK reason code to use.

One quirk of the standard: Per table 6-7 you can never NACK a discovery request. That means if you get an unknown PID with a discovery command class, all you can do is drop it since NACK is not allowed. I'm not sure this was the intended purpose, but that's how it's written. (The original motivation was to ensure that Discover Unique Branch, Mute, and Unmute were always handled immediately).
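
One way to express this reading of table 6-7 is a small decision function. It is a sketch under assumptions: `Action`, `action_for_request`, and the `nack_unknown_class` switch are hypothetical names, a single `supported_pids` set stands in for separate GET and SET support, and the command class and PID values are the ones E1.20 assigns as far as I know.

```python
from enum import Enum
from typing import Set

# Values as assigned in ANSI E1.20; verify against the standard.
DISCOVERY_COMMAND = 0x10
GET_COMMAND = 0x20
SET_COMMAND = 0x30
DISC_UNIQUE_BRANCH = 0x0001
DISC_MUTE = 0x0002
DISC_UN_MUTE = 0x0003
DISCOVERY_PIDS = {DISC_UNIQUE_BRANCH, DISC_MUTE, DISC_UN_MUTE}


class Action(Enum):
    HANDLE = "handle"
    NACK = "nack"
    DROP = "drop"


def action_for_request(
    command_class: int,
    pid: int,
    supported_pids: Set[int],
    nack_unknown_class: bool = True,
) -> Action:
    """Decide what to do with a request (hypothetical helper)."""
    if command_class == DISCOVERY_COMMAND:
        # A discovery request can never be NACKed, so an unknown PID
        # can only be dropped.
        return Action.HANDLE if pid in DISCOVERY_PIDS else Action.DROP

    if command_class in (GET_COMMAND, SET_COMMAND):
        return Action.HANDLE if pid in supported_pids else Action.NACK

    # Unknown command class: the gray area. If you NACK, the reason code
    # would be NR_UNSUPPORTED_COMMAND_CLASS; otherwise drop the packet.
    return Action.NACK if nack_unknown_class else Action.DROP
```

Making the gray-area case a switch keeps both interpretations available without changing the rest of the dispatch logic.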

Quote:

Originally Posted by eldoMS (Post 2209)
3. When the packet data length mismatches with data length -> format error?

I disagree with Scott on this one. A responder needs to be very strict about enforcing the length fields. If the lengths don't match, the packet is almost certainly corrupt, and you should treat it like a checksum error and ignore the packet.

There are three lengths in RDM:
1: The "Message Length" Field in the packet.
2: The PDL field in the packet (should be exactly "Message Length" minus 24).
3: The number of bytes you actually receive with your UART (should be "Message Length" plus 2).

If the three lengths don't all agree, then some part of the packet has been corrupted in transit. It's common to see this happen. The checksum used in RDM is relatively weak. Because it's an additive checksum, if a 0x00 gets added or dropped from the packet, the checksum won't change. It also can't detect many two-bit errors.
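
The weakness is easy to reproduce. The sketch below assumes the checksum is the plain 16-bit sum of the bytes; the sample bytes are arbitrary and are not a complete RDM packet, and the helper names are made up for illustration.

```python
def rdm_checksum(data: bytes) -> int:
    """16-bit additive checksum: the sum of all bytes, modulo 0x10000."""
    return sum(data) & 0xFFFF


def drop_byte(data: bytes, index: int) -> bytes:
    """Return a copy of data with the byte at index removed."""
    return data[:index] + data[index + 1:]


# Arbitrary illustration bytes, not a full RDM packet.
sample = bytes([0xCC, 0x01, 0x10, 0x00, 0x01, 0x7F])

# Case 1: a 0x00 byte is lost in transit.
dropped_zero = drop_byte(sample, 3)

# Case 2: a two-bit error. Bit 0 is set in one byte (0x10 -> 0x11)
# and cleared in another (0x01 -> 0x00), so the sum is unchanged.
two_bit_error = bytes([0xCC, 0x01, 0x11, 0x00, 0x00, 0x7F])
```

All three byte strings produce the same value from `rdm_checksum`, even though two of them are corrupt.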

That means if a communication error occurs (RF noise, loose connection, UART Error), or a splitter/hub drops one or more bytes, the checksum can't reliably detect it. The only way you have to know if this has happened is to verify that the lengths all match.
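
A minimal sketch of that three-way length check, assuming the standard packet layout with "Message Length" at byte offset 2 and PDL at byte offset 23. The function and constant names are hypothetical.

```python
RDM_HEADER_LEN = 24         # start code through the PDL field
RDM_CHECKSUM_LEN = 2
MESSAGE_LENGTH_OFFSET = 2   # assumed offset of the "Message Length" field
PDL_OFFSET = 23             # assumed offset of the PDL field


def lengths_consistent(received: bytes) -> bool:
    """Return True only if all three RDM lengths agree.

    received is everything the UART captured for one packet,
    including the two checksum bytes.
    """
    if len(received) < RDM_HEADER_LEN + RDM_CHECKSUM_LEN:
        return False

    message_length = received[MESSAGE_LENGTH_OFFSET]
    pdl = received[PDL_OFFSET]

    return (
        pdl == message_length - RDM_HEADER_LEN
        and len(received) == message_length + RDM_CHECKSUM_LEN
    )
```

A packet that fails this check would be treated like a checksum error and silently ignored, not NACKed.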

nomis52 August 18th, 2011 06:02 PM

I agree with Eric. Responders should be strict in what they accept. Anything else encourages sloppy programming on the controller side.

mike_k September 1st, 2011 11:13 PM

Quote:

Originally Posted by nomis52 (Post 2214)
I agree with Eric. Responders should be strict in what they accept. Anything else encourages sloppy programming on the controller side.

I agree with you, to some degree...

The problem is that out in the real world, the controller is not the piece of equipment that gets the blame if your responder does not work. The user just knows that all the other responders he has do work, so he will blame the one responder that does not.

When all responders behave the same way (that is when the standard is no longer open for interpretation), then the controller will get the blame.

Compare with the web browser issues. We who know the details know that when a W3C-compliant site does not show up correctly in IE but does in WebKit-based browsers, it is IE that is at fault, but the general user does not understand that.

ericthegeek September 1st, 2011 11:51 PM

The matter of how strict a device should be is a philosophical one. It's a matter of strict compliance vs. "I can figure out what they meant to do so I will accept it". I've seen lots of devices that have the wrong PDL for things like the Mute response. They send perfectly valid, properly formatted RDM packets; the PDL and PDATA fields just don't match what's specified for that particular PID.

But from my perspective the length fields are *not* part of the same philosophical debate. Because the checksum is so weak, if the lengths don't match then the packet is corrupt. If you ignore the lengths, interesting things can happen. Imagine if a 0x00 byte got dropped from a packet such that the checksum ended up in the PDATA fields.
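
That scenario can be simulated with two toy parsers. This is a sketch under assumptions: `build_frame` produces a simplified frame with every header field zeroed except the start codes and lengths, the checksum is taken to be a 16-bit additive sum sent high byte first, and `lax_parse` and `strict_parse` are hypothetical names.

```python
from typing import Optional

HEADER_LEN = 24
MESSAGE_LENGTH_OFFSET = 2
PDL_OFFSET = 23
CHECKSUM_LEN = 2


def build_frame(param_data: bytes) -> bytes:
    """Build a simplified frame; UIDs and other header fields are zeroed."""
    header = bytearray(HEADER_LEN)
    header[0] = 0xCC  # start code
    header[1] = 0x01  # sub-start code
    header[MESSAGE_LENGTH_OFFSET] = HEADER_LEN + len(param_data)
    header[PDL_OFFSET] = len(param_data)
    body = bytes(header) + param_data
    checksum = sum(body) & 0xFFFF
    return body + checksum.to_bytes(CHECKSUM_LEN, "big")


def checksum_ok(frame: bytes) -> bool:
    """Compare the last two bytes with the sum of everything before them."""
    if len(frame) < HEADER_LEN + CHECKSUM_LEN:
        return False
    return sum(frame[:-2]) & 0xFFFF == int.from_bytes(frame[-2:], "big")


def lax_parse(frame: bytes) -> Optional[bytes]:
    """Check only the checksum, then trust PDL to find the parameter data."""
    if not checksum_ok(frame):
        return None
    pdl = frame[PDL_OFFSET]
    return frame[HEADER_LEN:HEADER_LEN + pdl]


def strict_parse(frame: bytes) -> Optional[bytes]:
    """Also require all three lengths to agree before returning data."""
    if not checksum_ok(frame):
        return None
    message_length = frame[MESSAGE_LENGTH_OFFSET]
    if len(frame) != message_length + CHECKSUM_LEN:
        return None
    if frame[PDL_OFFSET] != message_length - HEADER_LEN:
        return None
    return frame[HEADER_LEN:message_length]
```

For example, take `build_frame(b"\x00\x05")` and delete the 0x00 at offset 24. The checksum still verifies, so `lax_parse` returns two bytes of "parameter data" whose second byte is really the high byte of the checksum, while `strict_parse` rejects the frame.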

mike_k September 2nd, 2011 12:13 AM

I totally agree that if the lengths do not match, something is wrong (especially if message length is not consistent with PDL). But there are just so many things that a responder could cope with just fine, for instance the message count field.

If the message count field is != 0, it is probably because the controller used the same memory buffer for the request as for a response but did not set the field correctly.
This should be caught in testing, not out in the field. If someone has equipment from one manufacturer that ignores this fault in the controller, and then gets a load of moving heads from a manufacturer that drops these packets, I would be surprised if the user called the controller manufacturer and yelled at them. I rather think they will throw out the equipment they see as "non-working".

I'm not saying we should encourage sloppy implementations on the controller side, but we have to keep the real world in mind. And the most important part of the real world: the end user.

sblair September 2nd, 2011 01:55 PM

As has been said, there is a balance. My own design philosophy is to make it as generally forgiving as possible. This in itself doesn't force non-compliant products to behave better, but it does at least help the end-user accomplish what they need to.

As said, when end-users are faced with issues, they are unlikely to understand which product is the non-compliant one, or to blame it.

