...
Backtrace generation: When an OpenSync process crashes, the stack trace used for crash reason reporting is acquired in a standard way:
Installing a signal handler for fatal signals (such as SIGSEGV, SIGBUS, SIGILL, etc.).
Unwinding the stack at the point of execution where the signal is raised (the point of the crash).
Reporting the crash info (i.e., backtrace, fatal signal) to the system logs and possibly elsewhere, and then re-raising the same signal to ensure the program actually crashes (the crash is inevitable, and ignoring the signal would leave the process in an undefined state).
Backtrace summary example:
...
The following JSON format is used:
Field name | Data type | Description
---|---|---
locationId | string | Location ID
nodeId | string | Node ID
model | string | Node model string
firmwareVersion | string | Firmware version string
pid | string | Process ID of the crashed process
name | string | Name of the crashed process
reason | string | Crash reason description
timestamp | number (long) | Timestamp of the crash – milliseconds since epoch
backtrace | string | Backtrace string
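To make the field list concrete, the following C sketch serializes one report. It assumes the fields form a single flat JSON object in table order, which the table does not state explicitly. The struct, the helper names, and `crash_report_to_json` are hypothetical and are not taken from OpenSync.

```c
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* One member per table row; pid is a string, as in the table. */
struct crash_report {
    const char *location_id;
    const char *node_id;
    const char *model;
    const char *firmware_version;
    const char *pid;
    const char *name;
    const char *reason;
    long long   timestamp_ms;   /* milliseconds since epoch */
    const char *backtrace;
};

/* Bounded output buffer; overflow is recorded, never written past the end. */
struct out {
    char  *buf;
    size_t cap;
    size_t len;
    int    overflow;
};

static void put_char(struct out *o, char c)
{
    if (o->len + 1 < o->cap)
        o->buf[o->len++] = c;   /* always leaves room for the final NUL */
    else
        o->overflow = 1;
}

static void put_raw(struct out *o, const char *s)
{
    while (*s != '\0')
        put_char(o, *s++);
}

/* Emit s with JSON escapes for quotes, backslashes and control characters. */
static void put_escaped(struct out *o, const char *s)
{
    for (; *s != '\0'; s++) {
        unsigned char c = (unsigned char)*s;
        char esc[8];

        if (c == '"' || c == '\\') {
            put_char(o, '\\');
            put_char(o, (char)c);
        } else if (c == '\n') {
            put_raw(o, "\\n");
        } else if (c < 0x20) {
            snprintf(esc, sizeof(esc), "\\u%04x", (unsigned)c);
            put_raw(o, esc);
        } else {
            put_char(o, (char)c);
        }
    }
}

static void put_field(struct out *o, const char *key, const char *val)
{
    put_char(o, '"');
    put_raw(o, key);
    put_raw(o, "\":\"");
    put_escaped(o, val);
    put_char(o, '"');
}

/* Returns the JSON length, or -1 if buf is too small. */
int crash_report_to_json(const struct crash_report *r, char *buf, size_t cap)
{
    struct out o = { buf, cap, 0, 0 };
    char num[32];

    if (cap == 0)
        return -1;
    put_char(&o, '{');
    put_field(&o, "locationId", r->location_id);            put_char(&o, ',');
    put_field(&o, "nodeId", r->node_id);                    put_char(&o, ',');
    put_field(&o, "model", r->model);                       put_char(&o, ',');
    put_field(&o, "firmwareVersion", r->firmware_version);  put_char(&o, ',');
    put_field(&o, "pid", r->pid);                           put_char(&o, ',');
    put_field(&o, "name", r->name);                         put_char(&o, ',');
    put_field(&o, "reason", r->reason);                     put_char(&o, ',');
    snprintf(num, sizeof(num), "%lld", r->timestamp_ms);
    put_raw(&o, "\"timestamp\":");
    put_raw(&o, num);                                       put_char(&o, ',');
    put_field(&o, "backtrace", r->backtrace);
    put_char(&o, '}');
    buf[o.len] = '\0';
    return o.overflow ? -1 : (int)o.len;
}
```

Escaping matters mostly for the backtrace field, which typically spans several lines. The timestamp is the only field emitted as a bare number.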
...
At the time of a crash, if the target uses the BTRACE_FILE_LOG option (report to file, logs, and controller) and CONFIG_DM_OSYNC_CRASH_REPORTS is enabled in Kconfig, the flow is:
The signal handler writes a crash report to a temporary file (or a set of files) in a dedicated directory under /tmp/, for instance /tmp/osync_crash_reports/.
DM monitors the contents of that temporary directory.
When DM detects a new crash report, it sends the report (via MQTT) to the controller and deletes the entry for that crash report in the temporary directory.
Stripping backtraces: The most important function is usually the one immediately following __default_sa_restorer, together with the next 1 to 3 functions. Backtraces in these short reports should be stripped from both directions: omit the lines up to and including __default_sa_restorer, and keep only the first few lines after it (3 is suggested, 5 at most), since these carry most of the information.
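The stripping rule can be sketched in C as follows. The function name `strip_backtrace` is hypothetical, and the sketch assumes the backtrace has already been split into one string per frame, innermost frame first. If the marker is absent, it falls back to keeping the first frames, which is an assumption rather than documented behavior.

```c
#include <stddef.h>
#include <string.h>

#define BT_MARKER "__default_sa_restorer"

/* Copy into out[] at most max_out frame pointers, starting right after
 * the frame that contains BT_MARKER. Returns the number of frames kept. */
size_t strip_backtrace(const char *const *frames, size_t n_frames,
                       const char **out, size_t max_out)
{
    size_t start = 0;
    size_t kept = 0;
    size_t i;

    /* Leading cut: drop everything up to and including the marker. */
    for (i = 0; i < n_frames; i++) {
        if (strstr(frames[i], BT_MARKER) != NULL) {
            start = i + 1;
            break;
        }
    }

    /* Trailing cut: keep only the first max_out frames after the marker. */
    for (i = start; i < n_frames && kept < max_out; i++)
        out[kept++] = frames[i];

    return kept;
}
```

Calling it with `max_out` set to 3 follows the suggested default, and raising it to 5 gives the upper bound mentioned above.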