GenBoard/BinaryProtocol

For developers

Quintessence continued on http://wiki.x-dsl.hu/cgi-bin/w/telemetry.cgi?UartPacketComm.

Discussion below is history


In the initial discussion, mcell and hackish worked out ideas for how frames should be defined.

TODO: search for opendiag and see if there's something (standard) we could use (they apparently don't have a wiki, so it'll suck a bit to catch up); someone could subscribe and keep an eye?


Aims

SerialComm/SIPR is already implemented (by Hackish). SIPR is much better than the MegaTune protocol, since it supports reliability. Unfortunately, SIPR does not support flexibility (multiple devices on a bus, or a complex topology).

Real CAN supports flexibility, but it is not compact (the 8-byte payload is a bit small for logging, for which the 6-byte overhead is a bit prohibitive), and a real CAN controller is not available on many controllers.


Solution

To make the system error-resilient, we have decided on the following.

CAN-IDs

The CAN-ID is a 29-bit address that identifies the data. The beauty of CAN is that its usage is not defined: the CAN-ID can identify the source of the data or the destination, or it can be a command that has some parameters (max. 8 bytes of payload).

We propose to use

This way a display can be configured to accept a certain datatype (such as WBO2 lambda) from a certain device (e.g. the WBO2 controller of cyl4), while ignoring the same datatype from another similar device (e.g. the WBO2 controller of cyl7).

When packing 29bit CAN-IDs into UART frames (this part consumes 4 bytes = 32 bits), we use the remaining 3 bits to mark gigapacks:
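As a rough illustration of fitting a 29-bit CAN-ID plus 3 spare bits into the 4-byte field, here is a minimal Python sketch. The bit layout (spare bits in the top 3 bits, big-endian byte order) and the names `pack_header` and `unpack_header` are assumptions made for this example only; the actual assignment of the 3 bits is not given here.

```python
import struct

CAN_ID_BITS = 29
CAN_ID_MASK = (1 << CAN_ID_BITS) - 1   # 0x1FFFFFFF
FLAG_MASK = 0x7                        # the 3 bits left over in a 32-bit word


def pack_header(can_id, flags=0):
    """Pack a 29-bit CAN-ID and 3 flag bits into 4 bytes.

    Assumed layout (not from the protocol discussion): flags occupy the
    top 3 bits, the CAN-ID the low 29 bits, transmitted big-endian.
    """
    if not 0 <= can_id <= CAN_ID_MASK:
        raise ValueError("CAN-ID must fit in 29 bits")
    if not 0 <= flags <= FLAG_MASK:
        raise ValueError("flags must fit in 3 bits")
    return struct.pack(">I", (flags << CAN_ID_BITS) | can_id)


def unpack_header(data):
    """Return (can_id, flags) from the first 4 bytes of a frame."""
    (word,) = struct.unpack(">I", data[:4])
    return word & CAN_ID_MASK, word >> CAN_ID_BITS
```

Whatever layout is finally chosen, range-checking both fields before packing keeps a bad ID from silently spilling into the flag bits.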


Arbitration

Some devices are slave-only. A slave can only talk when requested by the master. However, it is then permitted to send its queued messages, including messages to other slaves.

Some devices (e.g. ARM and PC) can act as master. Only one master is active at a time. If a master-capable device cannot hear any other master for master_existence_timeout (approx. 20..100 msec), it takes over the bus (with some randomness to decrease the chance of collision) and grants bandwidth to each slave.
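The takeover rule can be sketched as a small state machine. This is only an illustration under stated assumptions: the class name `MasterCandidate`, the jitter range, and the idea of passing the current time in as an argument are hypothetical and not part of the discussed design. Granting bandwidth to slaves is not covered.

```python
import random


class MasterCandidate:
    """Master-capable device that takes over after a period of silence."""

    def __init__(self, timeout_s=0.05, max_jitter_s=0.01, rng=None):
        self.timeout_s = timeout_s        # master_existence_timeout
        self.max_jitter_s = max_jitter_s  # assumed random extra delay
        self.rng = rng or random.Random()
        self.is_master = False
        self._deadline = None

    def _arm(self, now):
        # Random jitter makes two candidates unlikely to take over at once.
        jitter = self.rng.uniform(0.0, self.max_jitter_s)
        self._deadline = now + self.timeout_s + jitter

    def on_master_heard(self, now):
        """Another master was heard: stay passive and restart the timer."""
        self.is_master = False
        self._arm(now)

    def poll(self, now):
        """Call periodically; returns True once this device acts as master."""
        if self._deadline is None:
            self._arm(now)
        elif not self.is_master and now >= self._deadline:
            self.is_master = True
        return self.is_master
```

Passing `now` in explicitly keeps the logic independent of any particular clock source, which also makes it easy to exercise on a PC.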

example

PC wants to save a config value to v3.x.


Secondary problems

Bootloader HW-indication changes

Shaking the TX in the hope that TX is looped back to RX is not desirable. The bootloader will wait (listening to the bus) for bootloader_inactive_timeout (approx. 200 msec) and then start the application, unless:

The special commands above can be addressed selectively using an address/mask pair.
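One common reading of address/mask addressing is the acceptance-filter rule used by CAN controllers: only the bits set in the mask are compared. The sketch below assumes that interpretation; the function name `matches` and the example values are made up for illustration.

```python
def matches(can_id, address, mask):
    """Return True if can_id equals address on every bit set in mask.

    Assumption: mask bits of 1 mean "must match", 0 means "don't care",
    as in typical CAN acceptance filters.
    """
    return (can_id & mask) == (address & mask)
```

With a mask of 0 every ID is accepted, and with an all-ones mask only the exact address is.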


Gigapacks

TODO: clean this

A frame_type_description payload is defined as

and a repetition of:

The frame_type_description is packaged in a frame (using a special frame_type) when transmitted or stored (tricky?).


Byte stuffing or not?

Byte stuffing (escaping the frame_marker byte) is not needed.

It seems we get better overall performance without byte stuffing.


Acknowledge

We better support

Some sequence number is required for this.

We must support acknowledges piggybacked on a useful frame (which may be empty) to avoid extreme acknowledge overhead. Overhead is one cost of robustness and security. It'll always be a trade-off - we need not even use a frame... a single bit or marker byte is plenty.

Let's look at two use cases for updating a value from tuning software. Steps 1 & 2 are about all that's done right now, but there's no verification that anything worked (correct me if I'm wrong!). That's not good enough.

Use case without ack:

  1. Software sends command to set VE table [1][2] to 35.
  2. ECU receives command, and updates VE table [1][2].
  3. Software sends command to read VE table [1][2].
  4. ECU sends back VE table [1][2].
  5. Software tests that command worked.

Use case with ack:

  1. Software sends command to set VE table [1][2] to 35.
  2. ECU receives command, updates VE table [1][2], and sends ack.
  3. Software receives ack and knows command worked.
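The use case with ack can be sketched as follows, with the ack carried as a field of the reply frame and matched by sequence number. The `Frame` fields, the `Ecu` class and the `set_ve` helper are hypothetical names for this example; they do not describe the actual firmware or frame format.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Frame:
    seq: int                          # sender's own sequence number
    ack: Optional[int] = None         # sequence number being acknowledged
    command: Optional[tuple] = None   # e.g. ("set_ve", row, col, value)


class Ecu:
    """Toy ECU that applies a command and acknowledges it in its reply."""

    def __init__(self):
        self.ve = {}
        self.seq = 0

    def receive(self, frame):
        if frame.command and frame.command[0] == "set_ve":
            _, row, col, value = frame.command
            self.ve[(row, col)] = value
        self.seq += 1
        # The ack rides on the reply frame, which may carry nothing else.
        return Frame(seq=self.seq, ack=frame.seq)


def set_ve(ecu, seq, row, col, value):
    """Steps 1-3: send the command, then check the ack matches our seq."""
    reply = ecu.receive(Frame(seq=seq, command=("set_ve", row, col, value)))
    return reply.ack == seq
```

Compared with the read-back variant, this needs one round trip instead of two, at the cost of one extra field per frame.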

Are we trying too hard to reinvent a reliable message passing protocol? If we're talking about using CRC functions and ack and so on anyway... CAN already works, and can be implemented over a 1-wire interface (see SAE J2411). And we need it for next gen hardware. We should get it figured out on some level now.


Some bytes are lost forever - this condition must be handled gracefully in any case.

We expect there to be a defined timeout after which the receiver gives up and requests that the sender re-send the frame (issuing the same - idempotent - command again). Note that for PC-GenBoard communication the PC will take care of this. The GenBoard doesn't care if the PC does not get the reply (the PC will request again).
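On the PC side, the timeout-and-repeat behaviour might look like the sketch below. The callables `send` and `wait_for_reply`, the default timeout and the attempt limit are assumptions for illustration; re-sending is only safe because the command is idempotent.

```python
def send_with_retry(send, wait_for_reply, frame, timeout_s=0.1, max_attempts=3):
    """Send frame and repeat it until a reply arrives or attempts run out.

    send(frame) transmits the frame; wait_for_reply(timeout_s) returns the
    reply, or None if nothing arrived in time. Both are supplied by the
    caller and are hypothetical interfaces for this sketch.
    """
    for _ in range(max_attempts):
        send(frame)
        reply = wait_for_reply(timeout_s)
        if reply is not None:
            return reply
    raise TimeoutError("no reply after %d attempts" % max_attempts)
```

A non-idempotent command would need extra protection here, for example a sequence number the receiver uses to drop duplicates.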

Please list any operations (firmware commands) that are not idempotent (issuing again can have side-effects).

---

See also: