How to detect disassociation by AP reboot within station in PS mode - wifi

I'm writing a fairly low-level driver for a wireless card, and while most of the spec is fairly straightforward, I haven't wrapped my head around a single question yet:
If my station is in power-save mode and its receiver is turned off for an extended period (say, 10 seconds) between DTIM frames, and the access point is rebooted in the meantime so my association is lost, how can I detect this?
I'm aware that the most common case will be that synchronisation is lost thoroughly enough that I will miss a number of beacons and simply go back to the AP search afterwards, but if by some lucky chance I get to see beacons, is there some way to find out that this is a new "instance" of the same AP?
I can think of
a short(er) TIM field -- however I believe APs are allowed to shorten the TIM information if no traffic is waiting
the AP timestamp changing unexpectedly.
the "number of beacons to next DTIM" field changing unexpectedly.
Being a perfectionist, I'd like to know if there is an entirely reliable way to detect that the AP has been rebooted, rather than just putting together clues.

I would suggest that you look at the TSF in received beacon frames and
if it differs too much from the TSF you expected you send a NULL-data
frame to the AP. If the AP was rebooted it should respond with a
deauthenticate frame with reason "Class 2 frame received from
nonauthenticated STA".

I don't have any knowledge of wireless cards at that level, but I'd take a practical route and analyze the communication from the AP just leading up to the disconnect for a pattern that matches a typical shutdown sequence; for example, "no more traffic, a sudden loss of DTIM sync, and then an AP announcement".
Off the top of my head: maybe look into Kismet's AP detection and analysis code for an idea or two. I'd bet someone else has encountered this problem before.
Cheers!

Related

CAN J1939 device stops responding after communication timeout

I'm a higher layer guy, I don't and don't want to know much about can-bus, j1939 or even particular ECUs. I just don't like the software solution, so I'd like to ask, if customer's requirements are legitimate.
If particular ECU doesn't receive CAN frame within 300 ms timeout after powerup, it stops responding to any further frames and must be power cycled. This is a information from customer's technicians, I have to just believe it.
It is possible to powerup ECU after CAN driver thread is ready, but it would require some extra wiring by end customers.
Software solutions are all bad or worse, like running FreeRTOS before important checks, put CAN driver code to code common with other products, or start CAN periphery in the bootloader and left running without software control until driver starts.
The sensitive part is, that we have no explicit demand to start CAN driver within such a short time in specification. Customer says, that it's part of J1939 specification.
Can someone confirm or disprove, that J1939 allows devices to unrecoverably stop receiving after 300 ms of silence or requires devices to start transmitting within 300 ms after powerup? Or at least guide me to parts of J1939 standard, which could possibly regard this?
Thank you
If particular ECU doesn't receive CAN frame within 300 ms timeout after powerup, it stops responding to any further frames and must be power cycled. This is a information from customer's technicians, I have to just believe it.
This does of course entirely depend on what task it is performing.
Generally, an ECU, as in an automotive computer in a car/truck etc is never allowed to hang up/latch up. The normal course of action would be for the ECU to either reboot/reset itself or revert to a fail-safe mode.
But in case of tractors and heavy machinery the normal safe mode is "stop everything".
It is possible to powerup ECU after CAN driver thread is ready, but it would require some extra wiring by end customers.
I don't know what this is supposed to mean. What is "extra wiring"? Something to keep other nodes in common mode while one is rebooting? Terminating resistors? Some dirty power-up delay circuit?
Software solutions are all bad or worse, like running FreeRTOS before important checks, put CAN driver code to code common with other products, or start CAN periphery in the bootloader and left running without software control until driver starts.
Generally speaking, it's custom to initialize critical hardware like clocks, watchdogs, prescalers, pull resistors etc very early on. Initializing hardware peripherals may or may not be critical. It's custom to do this after the CRT has been executed, at the beginning of main() and the order of initialization usually matters a lot.
If you have a delay longer than 300ms from power-on reset to the start of main(), something is terribly wrong with the program.
The sensitive part is, that we have no explicit demand to start CAN driver within such a short time in specification. Customer says, that it's part of J1939 specification.
I haven't worked much with J1939 and I don't remember what it says specifically, but 300ms is an eternity in a real-time system! It's not a "short time".
In general, correctly designed mission-/safety-critical CAN control systems in automotive/industrial settings work like this:
All data is sent repeatedly in fixed intervals, regardless of if it has changed or not. Commonly once per 10ms or once per 100ms.
A node which has not received new data will use the previously received data for now.
There is a timeout from the point of when last valid data was received, when the receiving node must stop using old data and revert to a fail-safe mode. This time is often relative to how fast the controlled object can move. It's common to have timeouts after some multiple of 100ms.
I would say that your customer's requirements are very reasonable, it's nothing out of the ordinary.
My colleague answered, that there's no such demand, only vague 300 ms timeout.

Establishing synchronized music streaming across devices

I am attempting to stream audio files from a server to iOS devices and play them completely synchronized. For example on my phone I might be 20 secs into a song and then my friend next to me should also be 20 secs into the song as well. I know this is not an easy problem to solve, but I am attempting to do so.
I can currently get them within one second of each other by calculating the difference in time between the devices and then have them sync up, however that is not good enough because the human ear can detect a major difference in a second and this is over WIFI.
My next approach is going to be to unicast the one file from the server and then have the all devices pick it up directly from the server and then implement some type of buffer system similar to netflix so that network connectivity would be a limiting factor. http://www.wowza.com/ is what I would use to help with that.
I know this can be done, because http://lysn.in/ is does it with their app and I want to be able to do something similar.
Any other recommendations after I try my unicast option?
Would implementing firebase help solve a lot of the heavy lifting problems?
(1) In answer to ONE of your questions (the final one):
Firebase is not "realtime" in "that sense" -- PubNub is probably (almost certainly) the fastest "realtime" messaging for and between apps/browser/etc.
But they don't mean real-time in the sense of real-time, say, as race game engineers mean it or indeed in your use-case.
So firebase is not relevant to you here and won't help.
(2) Regarding your second general question: "how to synchronise time on two or more devices, given that we have communications delays."
Now, this is a really well-travelled problem in computer science.
It would be pointless outlining it here, because it is fully explained here http://www.ntp.org/ntpfaq/NTP-s-algo.htm if you click on "How is time synchronised"?
So in fact, to get a good time base on both machines, you should use that! Have both machines really accurately set a time to NTP using the existing (perfected for decades) NTP synchronisation.
(So for example https://stackoverflow.com/a/6744978/294884 )
In fact are you doing this?
It's possible that doing that will solve all your problems; then just agree to start at a certain exact time.
Hope it helps!
I would recommend against using the data movement to synchronize the playback. This should be straightforward to do with a buffer and a periodic "sync" signal that is sent at a period of < 1/2 the buffer size. Worst case this should generate a small blip on devices that get ahead or behind relative to the sync signal.

Does a CAN bus device need to be kick started to start sending messages?

I am a complete CAN bus newbie. I'm hoping someone with CAN experience can point me in the right direction. I was given a Vector VN1610 USB to CAN adapter and a Continental ARS-308 radar sensor. The goal is to read some velocity and distance information from the sensor. Right now I am just trying to see any data but all I get are messages with an id of 0 or 0x80000000. The data payloads all report as 8 bytes of 0.
What Works
I have been able to use the sample .NET code provided and set up the VN1610. The ARS-308 has a single CAN channel so in the Vector Hardware Config for my application I just map "CAN 1" to VN16101 Channel 1. (I leave CAN 2 unassigned) I then assume I use that one channel for both transmit and receive. The code reports that the channel sets up an activates and no errors are reported.
I then have a thread looking for incoming messages. If I don't debug out the two IDs mentioned above I can actually process all of them and then I get XL_ERR_QUEUE_IS_EMPTY messages. So it looks like its all working, I'm just not getting any real data.
What Doesn't
I would think a slew of data messages in the 0x200 - 0x702 range would be coming in for the Continental ARS device. Now I'm more used to ethernet type protocols where I would send a command and then read a response. None of my docs talk about how CAN works so I am ASSUMING that in CAN the device just sends data. I certainly can't find any commands that tell the device to send me the particular msg ID I'm interested in.
Am I missing some basic CAN configuration step that informs the device it should start sending data? Any suggestions at all would be appreciated.
If it matters I'm writing in VS2013, .NET on a Win 7 64 Ultimate machine.
The answer is No. It turns out that CAN devices will indeed just start streaming out messages when you turn them on (well at least this one does). The messages with ids of 0x0 and 0x8000000 are bogus. Even with the radar sensor turned off I continued to see those messages.
It turns out I had a hardware problem. The CAN bus requires a 120 Ohm resistor which was installed. The problem was when the shell was put back on the cable the resistor got cracked. Once we repaired this, everything started working as expected.

PIC32 becomes unresponsive after a few hours

I have a PIC32MX340F512 board developed by another company for us, The board has a DS1338 RTCC and 24LC32A eeprom, and display unit on an I2C bus, on this bus i included a TSL2561 I2C light sensor, i wrote code in c to poll the light sensor continously , when the light sensor reaches a certain level i save the time and date and light sensor value on SD card. This all works fine but if i leave the system without exposure to light inside tunnel where incident light on one end of the tunnel is ought to be monitored the system becomes unresponsive no matter how much amount of light you apply and then if i switch power off and back on again everything starts to work normal. i am a one man development team and have been trying to find out the problem for months, i activated the watchdog timer to prevent the system from hanging but the problem still persisted. i then decided to find out if the problem is with the sensor by including a push button to activate light measurement but still when 4-5 hours elapse the PIC cant even detect a change in the the input pin. Under the impression that a hardware reset overrides anything going on i included a reset button and it also works ok for the first few hours after that the PIC doesn't seem to be responding to anything including a reset. I was getting convinced that there is nothing wrong with the firmware but also with all this happening the display unit (pic16f1933 and lcd) on the I2C shares power with the main unit and doesn't seem to be affected as it alternates between different messages constantly Does anybody have an idea what could be wrong (hardware/firmware or my sensor). I am using a 24v DC power supply purchased seperately. The PIC seems to go into a deep sleep although i dd not implement any kind of SLEEP mode in my code. Nb We use the same board for many other projects and i haven't come across such a problem . Thanks in advance.
I think you need to (if you haven't already) explore the wonderful world of in-circuit-debugging (such as with the ICD3 or PICkit 2/3). It allows you to run the processor in a special mode that lets you pause execution, see exactly which line of code is being executed, inspect variable values, and step through the code to see which parts are running and not running, or see exactly where execution takes a wrong turn. If the problem takes hours to reproduce, that's okay. You can just leave it overnight running in debug mode and hopefully it will be locked-up or 'sleeping' in the morning. At this point, you will be able to pause the processor and poke around to see if you got caught in some kind of infinite loop or something. This is often the only way to dig inside a running piece of code to see why things aren't working as you expect. But as you say, those bugs that take hours or days to manifest are the trickiest. Good luck!
It sounds like you can break up your design into two main parts, sd card interfacing, reading the rtc and reading the light sensor. If it were me I would upload a version of the code that mimics reading the light sensor but only returns fake data and see if that cures the problem. Additionally do the same with the other two modules separately and see if any of the three versions of your project not show this problem. From there just keep narrowing it down until you find the block of code thats causing problems.
if Two or more versions of your debug code show the same problem then my guess is it has to do with one of the communication protocols. I had a problem with a Pic32 silicon version blocking when using the DMA in conjunction with the SPI peripherals. So I would suggest checking the errata for your chip.
If you still cant find the problem, my only suggestion would be to check for memory leaks or arrays that are growing into reserved memory.
Hope that helps, good luck!

What happens if a bus-off error occurs in a CAN controller while a car is in motion?

I know that in a CAN controller if the error count reaches some threshold (say 255), bus off will occur which means that a particular CAN node will get switched off from the CAN network. So there won't be any communication at all. But what if the above said scenario happens while the car is moving which contains the ECU (includes the CAN controller)?
Is there any auto-recovery mechanism in a CAN controller to avoid any of the above situations?
During bus off, the node will be isolated.
CAN waits for the mandatory time period, 128 x 11 bits (1408 bits - 5.6 ms for a 250 kbit/s system) of time, and then tries to re-initialize the node.
Yes, if a CAN Tx error count reaches 255, a node will turn off and potentially reset itself. A good implementation will not continue resetting a node if the problem persists.
In addition to this safety mechanism, ECU's (electric control units) also time the duration between valid transmissions of the messages they expect to receive. Therefore, if the engine controller goes offline, nearly every ECU in the vehicle will report "Lost Communication with the Engine Controller."
Typically, these type of CAN problems are identified by DTC's (diagnostic trouble codes) beginning with U, like this one: http://www.obd-codes.com/u0115
Depending on the severity of the issue, the vehicle might enter a "limp home" mode, or might be totally disabled. Problems with the CAN bus on a vehicle are extremely rare, unless there has been some tampering.
The recovery mechanism depends on the software stack that's being used. Most new vehicles have AUTOSAR compliant software implementations. In the AUTOSAR communication stack, the CanSM (state manager) module has configurable BusOff Monitoring and Recovery. You can read more at http://autosar.org .
A BusOff however, is a serious situation in a running vehicle. How this is handled at the vehicle level is very specific to the system design. But, in most cases the system would go into a safe mode of operation and all parameters would take pre-set fault values to let the vehicle run with a reduced functionality. You would see the warning lamps on the dash go off to alert the driver. ECUs typically comply with some level of ASIL (https://en.wikipedia.org/wiki/Automotive_Safety_Integrity_Level) standard. This makes sure that such situations are thought of and taken care of during design and development.
Nothing spectacular will happen, even if the Engine Control Unit looses CAN communication. The car will continue running.
When bus-off occurs, the CAN network isolates that node and then resets that node which can able to start communication.
As you mentioned, after reaching a specific error count, that node gets disconnected/prohibited from transmitting anything on the bus. This is a description for the bus side.
On the controller side, every CAN controller generates an interrupt on BUS_OFF. It is the controller's responsibility that it should reset the CAN controller and bring it back to the normal state.
This is strictly followed for every CAN controller in any car. And this all happens in a few milliseconds... So for the driver, nothing happens!
When the ECU detects a BUS_OFF fault, the ECU should stop its emissions so this is a good question to ask.
There is an auto-recovery mechanism:
For the first three detections, the CAN controller resets its registers without a delay
For the next detections, the ECU waits 1 second before the reset
There is something called limp-home mode for the cars. That is the condition when all the ECUs fail in the car network. Then a set of default parameters for the ECUs are initialized and then the system, i.e., your car can continue running only for some time before it is properly serviced by the OEM.
I know this is an old thread, but the answers are a bit different from the situation I have observed, in relation to the OP question.
From experience, I'm have an issue where my ECU stops communicating with the diagnostic tools while the engine is running, apparantly it has entered the CAN off state. The only reason I know is I have a OBD 2 plug in monitor for engine parameters. I don't get ANY DTC, well most of the time anyways.. sometimes I get DTCs that are not applicable to my vehcile, and some U codes.
That said, the vehicle continues to run just fine, and if I didn't have the plug-in monitor, I would have no idea there was a problem! I'm now pretty sure the ECU for the Engine is having communication problems, and hitting the error counter and shutting off, it's the only thing that makes sense. I checked the CAN signals with a 2 channel O-scope, and they are a bit noisy compared to one of my other cars, so my next step is to swap the ECU and see if that fixes it. I already swapped out the TIPM (Total Integrated Power Module), it serves as a router of sorts between the 2 CAN networks, to the OBD2 port. That apparantly wasn't it.
if a CAN Tx or RX error counter reaches 255 , the node will turn off and be isolated
What happens if a bus-off error occurs in a CAN controller while a car is in motion?
1)HARD SWAPPING can be done in can network.
eg: Assume four(4) nodes(ECUS) are connected in can bus network.if we disconnected one
ecus then also can bus works properly.
2)In BUSS OFF condition it can hear every signal on the bus network but it cant transmit
mssgs(signal). If the car in motion or in rest position.
eg: Ecus(ABS) are using for better performance but actual work is done by actuator(DISK BRAKE).

Resources