AVOIDING COLLISIONS: ADDING DETERMINISM TO MY NETWORK
Dr. Mathias Bohge
Prof. James Gross
In the previous blog post, we dug into wireless propagation effects, how they challenge reliable communication, and how EchoRing™ deals with them. We learned that electromagnetic waves attenuate and suffer from shadowing and fading, which makes reception at the receiving node considerably harder. In a way, we could say that these effects are caused by nature, and their explanation lies in physics. However, having several electronic devices within the same wireless network adds an extra challenge: managing collisions. In this blog post, we explain how collisions affect wireless communications and the most effective way to overcome them while delivering the performance required by time-critical automation applications.
The human (transmitter), the robot (receiver) and the forklift (colliding station)
Collisions, the impact from simultaneously sending stations
The easiest way to understand collisions is by analogy with interruptions during a voice conversation: whenever more than one person talks at the same time, the listener struggles to understand what is being said. In wireless communications the same principle applies: whenever two nodes in the network transmit at the same time, a collision might happen. Naturally, for a collision to occur the two nodes also need to transmit in the same frequency band, which is always the case when the two nodes are in the same network. In such a case, the receiving antenna picks up an entirely different signal, which is a combination of the two signals being transmitted.
The Rx node receives a combined signal from the Tx node and the colliding node
At this point, the reader might have thought of another crucial word in wireless communications: interference. As a matter of fact, the effect that the receiving antenna experiences during a collision is the same as during interference. For this reason, collisions could be interpreted as self-interference. However, we explain the two phenomena in different blog posts for two reasons. First, collisions are caused by nodes within the same network, whereas interference is caused by nodes in other networks (or even by other technologies using the same frequency band). Second, the techniques used to overcome these challenges are also different.
Impact of Collisions
From the previous blog post, we know that a signal 100 times stronger than the noise is sufficient for reliable reception. In a wider sense, we can think of a collision as a second source of noise. The central question is how much additional power this collision carries. The impact of a collision therefore depends crucially on the position of the colliding source relative to the receiver. If the source is not too far away from the receiver, its signal is likely to be received at a much higher power than the basic noise floor. This immediately changes the power gap between the incoming signal-of-interest and the “new noise” power. The consequence is a sudden, much worse reception situation that leads to packet loss. As for the duration of the disturbance, a typical colliding Wi-Fi node might only try to transmit a very short packet, so that the unfortunate transmission situation changes after one millisecond, or even faster. Nevertheless, it is clear that collisions are in general random effects as well, especially in non-coordinated systems such as Wi-Fi networks (more on that below). We therefore cannot rely on predictions to overcome these disturbances, as is also the case with fading and shadowing.
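To put numbers on the power gap described above, here is a minimal sketch in Python. It treats the colliding signal as additional noise and computes the resulting signal-to-interference-plus-noise ratio. The function name and the power values are assumptions chosen for illustration; only the factor of 100 between signal and noise comes from the text.

```python
import math

def sinr_db(signal, noise, collider=0.0):
    """Ratio of signal power to (noise + colliding power), in dB.

    All inputs are linear power values in the same (arbitrary) unit.
    """
    return 10 * math.log10(signal / (noise + collider))

noise = 1.0      # assumed noise floor
signal = 100.0   # signal 100 times stronger than the noise
collider = 50.0  # hypothetical collider received at half the signal power

print(round(sinr_db(signal, noise), 1))            # 20.0 (no collision)
print(round(sinr_db(signal, noise, collider), 1))  # 2.9 (during the collision)
```

In this made-up scenario, a nearby collider shrinks the gap from a factor of 100 to a factor of about 2, which is the kind of sudden degradation that leads to packet loss.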
Standard solutions to overcome collisions, and why they do not work in an industrial set-up
To avoid collisions, any wireless network needs to establish rules about how its nodes share the wireless channel. A straightforward solution is to define a transmission sequence, but establishing and maintaining such a sequence is not always practical or easy. That has led to alternative transmission rules in which a node first listens to check whether the wireless channel is in use and, if it is not, transmits immediately. This technique is known as Listen-Before-Talk; one example is Carrier Sense Multiple Access (CSMA), used by Wi-Fi. The clear advantage of this technique is that it requires little coordination, which makes it very flexible. The catch, of course, is that two nodes might listen at the same time (or slightly shifted relative to each other), both detect that the wireless channel is free, and thus transmit at the same time. Even though each node has followed the transmission rule perfectly, the result is a collision. The obvious consequence of collisions is lost packets, and if too many packets are lost, the system becomes unreliable. Moreover, whenever a packet is lost, another technique must step in to deal with the loss (in the previous blog post we learned that ARQ systems retransmit the message), which adds waiting time to the transmission. In that case, the system is not real-time capable either.
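The Listen-Before-Talk catch can be illustrated with a toy model. This is a simplified sketch, not the actual Wi-Fi channel access procedure: it assumes that a transmission only becomes detectable to other nodes after a fixed delay, so any node that senses the channel within that window still finds it idle. The function name, the node names and all timing values are hypothetical.

```python
def csma_transmitters(sense_times_us, detect_delay_us):
    """Return the nodes that find the channel idle and start transmitting.

    sense_times_us: dict mapping node name to the time (in microseconds)
        at which that node listens to the channel.
    detect_delay_us: assumed time until an ongoing transmission becomes
        detectable by the other nodes.
    """
    ordered = sorted(sense_times_us.items(), key=lambda item: item[1])
    first_time = ordered[0][1]
    # Every node that listens before the first transmission is detectable
    # sees a free channel and transmits as well; later nodes defer.
    return [name for name, t in ordered if t - first_time < detect_delay_us]

print(csma_transmitters({"A": 0, "B": 2, "C": 50}, detect_delay_us=4))
# ['A', 'B']  -> two transmitters, i.e. a collision; C defers
print(csma_transmitters({"A": 0, "B": 10}, detect_delay_us=4))
# ['A']       -> B hears A and defers, no collision
```

Both A and B follow the rule correctly in the first call, yet they still collide, because their listening times are too close together.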
Coordination: adding determinism to deal with collisions
As discussed above, collisions are essentially transmission attempts of multiple nodes happening at the same time. It is thus clear that a suitable transmission sequence can avoid this source of unreliability. However, coordinating the transmission sequence of the nodes requires a more involved management scheme. For instance, one needs to synchronise the clocks of all nodes regularly and, based on the assumption of synchronised clocks, define time slots that are exclusively reserved for each node. If the time slot of a node is coming up but the node happens to have no data to transmit, it is quite challenging to reallocate that time slot, so in most cases the transmission opportunity is lost. On the other hand, allowing nodes to send data whenever they have a backlog simply frees the system design from these concerns. The price, though, as explained in the previous section, is collisions that happen from time to time.
Transmission sequences can be realized in different ways. A first scheme has been introduced above: it uses synchronised clocks and transmission schedules that need to be announced to all nodes in the system, as in TDMA systems. A second option is polling. A master station periodically sends a polling command to each node in the system, conveying payload but also granting the receiving node permission to transmit payload back, if it has any. Both of these solutions (time schedules and polling) typically require a designated node that manages the system. This node is a single point of failure: if it fails, the system operation breaks down.
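As a minimal sketch of the time-slot idea, the following Python function maps a point in time to the node that owns the corresponding slot. It assumes perfectly synchronised clocks, equally long slots and a fixed round-robin schedule; the function name, the node names and the slot length are illustrative assumptions, not parameters of any particular TDMA system.

```python
def slot_owner(time_us, nodes, slot_us):
    """Return the node whose exclusive time slot contains time_us.

    nodes: list of node names in schedule order.
    slot_us: length of one time slot in microseconds.
    """
    slot_index = time_us // slot_us          # which slot since time 0
    return nodes[slot_index % len(nodes)]    # schedule repeats cyclically

schedule = ["A", "B", "C"]
print(slot_owner(0, schedule, slot_us=1000))     # A
print(slot_owner(2500, schedule, slot_us=1000))  # C
print(slot_owner(3000, schedule, slot_us=1000))  # A (next cycle)
```

If node B has nothing to send during its slot, that slot simply passes unused in this sketch, which mirrors the lost transmission opportunity of fixed schedules.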
A third way to realize a dedicated transmission sequence – which does not introduce a single point of failure – is given by token-passing, precisely the technique used by EchoRing™.
EchoRing™, determinism through token-passing
In token-passing, a special control packet (the token) is passed from node to node. Only the node that currently holds the token is allowed to transmit payload. Hence, the token can be seen as a transmission right that is handed from node to node. In this way, only one station holds the transmission right at any time, and no collision happens.
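The principle can be sketched in a few lines of Python. This is a generic illustration of token-passing, not the EchoRing™ protocol: it assumes a fixed ring order, no packet loss, and a hypothetical rule that a node sends at most one packet per token visit.

```python
from collections import deque

def run_token_ring(queues, rounds):
    """Circulate a token and log who transmits what, in order.

    queues: dict mapping node name to a list of pending packets;
        the dict order defines the ring order.
    rounds: number of complete token rotations to simulate.
    """
    ring = list(queues)
    pending = {node: deque(packets) for node, packets in queues.items()}
    log = []
    holder = 0  # index of the node currently holding the token
    for _ in range(rounds * len(ring)):
        node = ring[holder]
        if pending[node]:
            # Only the token holder may transmit, so no two nodes
            # ever transmit in the same step.
            log.append((node, pending[node].popleft()))
        holder = (holder + 1) % len(ring)  # pass the token on
    return log

print(run_token_ring({"A": ["a1", "a2"], "B": [], "C": ["c1"]}, rounds=2))
# [('A', 'a1'), ('C', 'c1'), ('A', 'a2')]
```

Node B has nothing to send, so it passes the token straight on, and the transmission opportunity moves to the next node instead of being lost.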
To guarantee real-time operation of the system, one merely introduces a timer per node, which is started each time the token is received at that node. By forcing the node to pass the token on once a maximum token holding time is reached, it can be guaranteed that each node can transmit data within a certain worst-case latency (which is obviously also easily achievable in scheduled or polled systems). Still, this is realized in a distributed fashion: if one of the nodes fails, the token-passing sequence has its own recovery mechanism that enables the network to adapt to the new topology and status. In other words, the system detects which station has failed, and a different node takes over the sequence. Not only does the network avoid a major failure, it also recovers fast enough that the real-time operation of all other stations is not significantly compromised. Furthermore, this approach enables devices to roam from network to network, or in R3 Communications’ language, from EchoRing™ to EchoRing™. When roaming, a station associates with a new network, usually for connectivity reasons. Both networks then need to adapt to the new situation (one network now has fewer nodes, the other has more). Performing this adaptation quickly relies on the same mechanism that is used when a node fails.
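The two ideas above, a bounded token rotation time and a ring that skips a failed node, can be sketched as follows. This is a simplified illustration under stated assumptions, not the actual EchoRing™ recovery mechanism: the bound ignores retransmissions and recovery time, the set of alive nodes is assumed to be known, and all names and numbers are hypothetical.

```python
def worst_case_rotation_ms(num_nodes, max_hold_ms, pass_ms):
    """Upper bound on one full token rotation.

    Assumes every node holds the token for its maximum holding time
    and every token hand-over takes pass_ms.
    """
    return num_nodes * (max_hold_ms + pass_ms)

def next_holder(ring, current, alive):
    """Return the next alive node after `current` in ring order."""
    start = ring.index(current)
    for step in range(1, len(ring) + 1):
        candidate = ring[(start + step) % len(ring)]
        if candidate in alive:
            return candidate
    raise RuntimeError("no alive node left in the ring")

print(worst_case_rotation_ms(num_nodes=5, max_hold_ms=1.0, pass_ms=0.25))
# 6.25

ring = ["A", "B", "C"]
print(next_holder(ring, "A", alive={"A", "B", "C"}))  # B
print(next_holder(ring, "A", alive={"A", "C"}))       # C (B has failed)
```

A node that has just passed the token on waits at most one full rotation before it may transmit again, which is why capping the holding time at every node caps the latency of all nodes.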
EchoRing™ Token-passing Technique
With this deterministic approach, EchoRing™ reaches two essential objectives: avoiding all collisions and guaranteeing a certain latency. As for the latter, latencies down to 2 ms are achievable with EchoRing™, a performance that enables industrial automation to go wireless.
In the next blog entry, Blog Post III “Interference: Other Wireless Networks Affect Mine”, we will explain how EchoRing™ coexists with other networks operating in the same frequency band.