I'm running InfluxDB from a remote file system (CIFS) on a Raspberry Pi.
After startup it works well for a few hours, and then when I try to list the series (using show series;) I get the following error:
ERR: SHOW SERIES [panic:runtime error: index out of range]
After restarting the application, everything works fine again for a few hours.
I tried enabling more logging, but the logs contain no errors or useful info entries.
I'm quite puzzled why this happens; googling the message mostly returns generic Go (golang) errors.
EDIT: Actually I found some logs for this:
lvl=error msg="SHOW SERIES [panic:runtime error: index out of range] goroutine 10617 [running]:\nruntime/debug.Stack(0x31e91d0, 0x30ecf00, 0x
b)\n\t/usr/local/go/src/runtime/debug/stack.go:24 +0x80\ngithub.com/influxdata/influxdb/query.(*Executor).recover(0x2ebfe60, 0x31e91d0, 0x33da5c0)\n\t/go/src/github.com/influxdata/influxdb/query/executor.go:3
94 +0x88\npanic(0xad41a8, 0x15fd528)\n\t/usr/local/go/src/runtime/panic.go:513 +0x194\nencoding/binary.bigEndian.Uint16(...)\n\t/usr/local/go/src/encoding/binary/binary.go:100\ngithub.com/influxdata/influxdb/
tsdb.ReadSeriesKeyMeasurement(...)\n\t/go/src/github.com/influxdata/influxdb/tsdb/series_file.go:338\ngithub.com/influxdata/influxdb/tsdb.parseSeriesKey(0x63e190ed, 0x1, 0x3fff13, 0x0, 0x0, 0x0, 0x1, 0x1, 0x3
0b7620, 0x0, ...)\n\t/go/src/github.com/influxdata/influxdb/tsdb/series_file.go:375 +0x350\ngithub.com/influxdata/influxdb/tsdb.ParseSeriesKey(0x63e190ed, 0x1, 0x3fff13, 0x30, 0x0, 0x0, 0x1ce08, 0x2cf6108, 0x
2cbb9e0)\n\t/go/src/github.com/influxdata/influxdb/tsdb/series_file.go:358 +0x40\ngithub.com/influxdata/influxdb/tsdb.(*seriesPointIterator).Next(0x2f18000, 0x9e4448, 0x9e4448, 0x2cf60d0)\n\t/go/src/github.co
m/influxdata/influxdb/tsdb/index.go:842 +0x40\ngithub.com/influxdata/influxdb/query.(*floatInterruptIterator).Next(0x348c190, 0x3988640, 0x2cbb9e0, 0x1ad84)\n\t/go/src/github.com/influxdata/influxdb/query/ite
rator.gen.go:941 +0x44\ngithub.com/influxdata/influxdb/query.(*floatFastDedupeIterator).Next(0x348c1a0, 0x39886c0, 0x30eced8, 0x4)\n\t/go/src/github.com/influxdata/influxdb/query/iterator.go:1301 +0x28\ngithu
b.com/influxdata/influxdb/query.(*bufFloatIterator).Next(0x348c1c0, 0x363ae4, 0x39886c0, 0x30eced8)\n\t/go/src/github.com/influxdata/influxdb/query/iterator.gen.go:90 +0x74\ngithub.com/influxdata/influxdb/que
ry.(*bufFloatIterator).peek(0x348c1c0, 0x2e61080, 0x1, 0x1)\n\t/go/src/github.com/influxdata/influxdb/query/iterator.gen.go:65 +0x1c\ngithub.com/influxdata/influxdb/query.(*floatIteratorScanner).Peek(0x39886a
0, 0
Apr 02 19:30:20 pi2 influxd[23263]: x2, 0x2, 0x30eced8, 0x4, 0xa8ff70, 0x2e61080, 0x1)\n\t/go/src/github.com/influxdata/influxdb/query/iterator.gen.go:516 +0x2c\ngithub.com/influxdata/influxdb/query.(*scanner
Cursor).scan(0x2d4a4b0, 0x39886c0, 0x0, 0x0, 0x3f9e74, 0x10, 0xb180c0, 0x1, 0x1ad84)\n\t/go/src/github.com/influxdata/influxdb/query/cursor.go:235 +0x28\ngithub.com/influxdata/influxdb/query.(*scannerCursor).
scan-fm(0x39886c0, 0xd53478, 0x348c130, 0x4, 0x1, 0x328a8cf, 0x18, 0x8)\n\t/go/src/github.com/influxdata/influxdb/query/cursor.go:230 +0x24\ngithub.com/influxdata/influxdb/query.(*scannerCursorBase).Scan(0x2d
4a4b8, 0x328a8d0, 0x0)\n\t/go/src/github.com/influxdata/influxdb/query/cursor.go:175 +0x38\ngithub.com/influxdata/influxdb/query.(*Emitter).Emit(0x3d78340, 0xc1e884, 0x3d78340, 0x3d78340, 0x440a0bc)\n\t/go/sr
c/github.com/influxdata/influxdb/query/emitter.go:41 +0x40\ngithub.com/influxdata/influxdb/coordinator.(*StatementExecutor).executeSelectStatement(0x2c36a40, 0x316e880, 0x2e231f0, 0x0, 0x0)\n\t/go/src/github.
com/influxdata/influxdb/coordinator/statement_executor.go:561 +0xfc\ngithub.com/influxdata/influxdb/coordinator.(*StatementExecutor).ExecuteStatement(0x2c36a40, 0xd558d8, 0x316e880, 0x2e231f0, 0x1, 0x1)\n\t/g
o/src/github.com/influxdata/influxdb/coordinator/statement_executor.go:64 +0x1f90\ngithub.com/influxdata/influxdb/query.(*Executor).executeQuery(0x2ebfe60, 0x31e91d0, 0x440a0bc, 0xa, 0x0, 0x0, 0xd561f8, 0x161
9f60, 0x2710, 0x0, ...)\n\t/go/src/github.com/influxdata/influxdb/query/executor.go:334 +0xb50\ncreated by github.com/influxdata/influxdb/query.(*Executor).ExecuteQuery\n\t/go/src/github.com/influxdata/influx
db/query/executor.go:236 +0x6c\n" log_id=0EZUK5qG000 service=query
but they don't help much :(
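CIFS is the likely problem. The InfluxDB files are probably corrupted: the panic is raised in tsdb.ReadSeriesKeyMeasurement while decoding a series key from the series file, which is what you would expect from damaged data. See https://github.com/influxdata/influxdb/issues/10309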
InfluxDB isn't supported on network file systems (e.g. NFS, GlusterFS).
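To check whether a given data directory sits on a network file system, you can look it up in the mount table. The sketch below is only an illustration: the helper names and the NETWORK_FS set are assumptions made for this example, not something InfluxDB provides.

```python
import os

# File system types treated as "network" here (an illustrative, incomplete list).
NETWORK_FS = {"cifs", "smb3", "nfs", "nfs4", "glusterfs", "fuse.glusterfs"}


def parse_mounts(text):
    """Return (mount_point, fs_type) pairs from /proc/mounts-style text."""
    entries = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) >= 3:
            entries.append((fields[1], fields[2]))
    return entries


def fs_type_for(path, mounts):
    """Return the fs type of the longest mount point that contains path."""
    path = os.path.normpath(path) + "/"
    best_point, best_type = "", None
    for mount_point, fs_type in mounts:
        prefix = mount_point.rstrip("/") + "/"
        # Later entries win ties because later mounts shadow earlier ones.
        if path.startswith(prefix) and len(mount_point) >= len(best_point):
            best_point, best_type = mount_point, fs_type
    return best_type


def is_on_network_fs(path, mounts):
    """True if path lives on one of the file system types in NETWORK_FS."""
    return fs_type_for(path, mounts) in NETWORK_FS
```

On the Pi you would feed it the real table, for example is_on_network_fs("/var/lib/influxdb", parse_mounts(open("/proc/mounts").read())), where /var/lib/influxdb is an assumed data directory. Mount points containing spaces are escaped in /proc/mounts and are not handled here.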
I am working with the Songhe Mega2560 + WiFi R3 Mega2560 + ESP8266 4MB Memory integrated circuit for a project involving connecting to a WiFi signal and reading the RSSI value.
Below is a basic sketch that I uploaded to the Mega2560 to communicate with the ESP8266 through Serial3 and test the firmware:
#include "WiFiEsp.h"

// Emulate Serial3 on pins 6/7 if not present
#ifndef HAVE_HWSERIAL3
#include "SoftwareSerial.h"
SoftwareSerial Serial3(6, 7); // RX, TX
#endif

void setup() {
  // initialize serial for debugging
  Serial.begin(115200);
  // initialize serial for ESP module
  Serial3.begin(115200);
  // initialize ESP module
  WiFi.init(&Serial3);
  // check for the presence of the shield
  if (WiFi.status() == WL_NO_SHIELD) {
    Serial.println("WiFi shield not present");
    // don't continue
    while (true);
  }
  // Print WiFi MAC address
  printMacAddress();
}

void loop() {
  // do nothing
}
I flashed different versions of Espressif's AT firmware, but the serial monitor keeps showing this:
22:28:07.009 -> [WiFiEsp] Initializing ESP module
22:28:08.023 -> [WiFiEsp] >>> TIMEOUT >>>
22:28:10.026 -> [WiFiEsp] >>> TIMEOUT >>>
22:28:12.022 -> [WiFiEsp] >>> TIMEOUT >>>
22:28:14.023 -> [WiFiEsp] >>> TIMEOUT >>>
22:28:16.020 -> [WiFiEsp] >>> TIMEOUT >>>
22:28:17.006 -> [WiFiEsp] Cannot initialize ESP module
22:28:23.017 -> [WiFiEsp] >>> TIMEOUT >>>
22:28:23.017 -> [WiFiEsp] No tag found
22:28:23.017 -> WiFi shield not present
I am not sure whether it is a firmware issue, so I have tried multiple versions of the AT firmware. The baud rate I have set is 115200. I have looked at many other sources online, but I cannot get WiFiEsp to initialize the WiFi module, and I would really appreciate some help on this matter.
I have been following these steps for flashing and testing:
1. Toggle DIP switches 5,6,7 to ON and all else OFF, and set RXD/TXD to RXD0.
2. Connect the USB cable from port COM3 (on my computer) to the integrated PCB with the Mega2560 + ESP8266 WiFi.
3. Use esptool.py to flash the firmware to the ESP8266.
   The latest released firmware for the ESP8266 is "ESP8266-IDF-AT_V2.2.1.0.zip", downloadable at espressif.com.
   I flash the factory_xxx.bin to address 0, since I read that it contains all hardware configurations for the ESP module. Below is the command I ran:
   esptool.py --chip auto --port COM3 --baud 115200 --before default_reset --after hard_reset write_flash -z --flash_mode dio --flash_size 4MB 0x0 factory_WROOM-02.bin
4. Disconnect the USB cable.
5. Toggle DIP switches 1,2,3,4 to ON and all else OFF, and set RXD/TXD to RXD3.
6. Connect the USB cable, upload the sketch with the Arduino IDE, and read the serial monitor.
This is the procedure I have been trying to debug. If any more information is required, please let me know and I will do my best to provide it.
Below is another command I ran, which flashes successfully (I think), along with its output:
C:\Users\[MY NAME]\Downloads\ESP8266_NONOS_SDK-3.0.5\ESP8266_NONOS_SDK-3.0.5\bin>esptool.py write_flash --flash_mode dout --flash_size 4MB-c1 0x0 boot_v1.7.bin 0x01000 at/1024+1024/user1.2048.new.5.bin 0x1fb000 blank.bin 0x1fc000 esp_init_data_default_v08.bin 0xfe000 blank.bin 0x1fe000 blank.bin
esptool.py v4.4
Found 1 serial ports
Serial port COM3
Connecting....
Detecting chip type... Unsupported detection protocol, switching and trying again...
Connecting....
Detecting chip type... ESP8266
Chip is ESP8266EX
Features: WiFi
Crystal is 26MHz
MAC: a4:e5:7c:b6:77:c0
Uploading stub...
Running stub...
Stub running...
Configuring flash size...
Flash will be erased from 0x00000000 to 0x00000fff...
Flash will be erased from 0x00001000 to 0x00065fff...
Flash will be erased from 0x001fb000 to 0x001fbfff...
Flash will be erased from 0x001fc000 to 0x001fcfff...
Flash will be erased from 0x000fe000 to 0x000fefff...
Flash will be erased from 0x001fe000 to 0x001fefff...
Flash params set to 0x0360
Compressed 4080 bytes to 2936...
Wrote 4080 bytes (2936 compressed) at 0x00000000 in 0.4 seconds (effective 92.5 kbit/s)...
Hash of data verified.
Compressed 413556 bytes to 296987...
Wrote 413556 bytes (296987 compressed) at 0x00001000 in 26.2 seconds (effective 126.1 kbit/s)...
Hash of data verified.
Compressed 4096 bytes to 26...
Wrote 4096 bytes (26 compressed) at 0x001fb000 in 0.1 seconds (effective 373.3 kbit/s)...
Hash of data verified.
Compressed 128 bytes to 75...
Wrote 128 bytes (75 compressed) at 0x001fc000 in 0.1 seconds (effective 11.8 kbit/s)...
Hash of data verified.
Compressed 4096 bytes to 26...
Wrote 4096 bytes (26 compressed) at 0x000fe000 in 0.1 seconds (effective 365.1 kbit/s)...
Hash of data verified.
Compressed 4096 bytes to 26...
Wrote 4096 bytes (26 compressed) at 0x001fe000 in 0.1 seconds (effective 357.1 kbit/s)...
Hash of data verified.
Leaving...
Hard resetting via RTS pin...
After this, I test it with Arduino's SerialPassthrough sketch, replacing Serial1 with Serial3, and I get no response when I send the AT command.
I would appreciate any help on how to resolve this and where I could possibly be going wrong. Thanks!
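One thing that may be worth double-checking in the NONOS SDK flash command above is the flash layout. As far as I know, the ESP8266 NONOS SDK keeps the RF calibration sector, esp_init_data_default and the system parameters in the last few 4 KB sectors of the flash, so their addresses depend on the flash size. The sketch below computes those addresses under that assumption; the function name and dictionary keys are made up for illustration, and this is not a confirmed diagnosis.

```python
SECTOR = 0x1000  # 4 KB flash sector


def nonos_system_addresses(flash_size_bytes):
    """Addresses of the trailing system sectors, assuming the usual NONOS layout."""
    return {
        "rf_cal_blank": flash_size_bytes - 5 * SECTOR,
        "esp_init_data_default": flash_size_bytes - 4 * SECTOR,
        "system_param_blank": flash_size_bytes - 2 * SECTOR,
    }


def format_addresses(flash_size_bytes):
    """Render the addresses as hex strings for comparison with a flash command."""
    layout = nonos_system_addresses(flash_size_bytes)
    return {name: hex(address) for name, address in layout.items()}
```

For a 2 MB flash this gives 0x1fb000, 0x1fc000 and 0x1fe000, which are the addresses used in the command above. For 4 MB, which is what --flash_size 4MB-c1 declares, it gives 0x3fb000, 0x3fc000 and 0x3fe000.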
I want to ask how we can calculate:
the MBR (memory buffer register) size,
the MAR (memory address register) size,
the number of data lines,
when the number of address lines and the memory capacity are known?
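Under the usual textbook assumptions (every address line is used, and each address selects exactly one word), the MAR is as wide as the number of address lines, and the MBR is as wide as one word, which is also the number of data lines. The word size follows from capacity = 2^(address lines) x word size. A minimal sketch of that arithmetic, with a hypothetical function name:

```python
def memory_geometry(address_lines, capacity_bits):
    """Derive register widths from the address line count and capacity in bits."""
    locations = 2 ** address_lines
    if capacity_bits % locations != 0:
        raise ValueError("capacity is not a whole number of words")
    word_bits = capacity_bits // locations
    return {
        "locations": locations,    # number of addressable words
        "mar_bits": address_lines, # MAR holds one address
        "mbr_bits": word_bits,     # MBR holds one word
        "data_lines": word_bits,   # one data line per bit of a word
    }
```

For example, a 64 KB memory (64 * 1024 * 8 bits) with 16 address lines has 65536 locations, so each word is 8 bits: a 16-bit MAR, an 8-bit MBR and 8 data lines.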
I am working with a BeagleBone Black and I would like to use the mmc2 slot.
According to the AM335x TRM, a BeagleBone Black should have three MMC interfaces available:
mmc0 (SD card),
mmc1 (2 GB flash),
mmc2.
I am trying to enable mmc2 via the device tree (and I am quite sure I have the right pin settings), but when I run
dmesg
I get:
/ocp/mmc#47810000: can't find DMA channel
omap_hsmmc mmc.11: unable to obtain RX DMA engine channel 65
When I put an oscilloscope probe on the header (e.g. on the mmc2 clk signal), I do not see any transitions.
I have already removed R160 to make mmc2 cmd accessible, but I do not see any transitions there either.
I tried enabling it both with
echo > /sys/devices/..../slots
and with
capemgr.enable_partno
with no success:
I can see it in
/sys/devices/..../slots
(with the L, meaning loaded), but I cannot see any signal on the header.
I have already googled it, but the answers I found are not clear at all.
Any ideas?
My
uname -a
is:
Linux beaglebone 3.8.13 #1 SMP Tue Jun 18 02:11:09 EDT 2013 armv7l GNU/Linux
Thanks for your help.
You need to map the mmc2 DMA events to DMA channels, since these events are not direct mapped (they go through the EDMA crossbar).
I was not able to do this successfully using device tree overlays, so I changed
am335x-bone-common.dtsi directly (not sure this is the best way, though):
&edma {
        ti,edma-xbar-event-map = <32 12>,   /* gpevt2 -> 12 */
                                 <30 20>,   /* xdma_event_intr2 -> 20 */
+                                <1 32>,
+                                <2 33>;
};
In the example above, event 1 (SDTXEVT2) is mapped to channel 32 and event 2 (SDRXEVT2) to channel 33.
If you want to pick other free DMA channels, see Table 11-23 (Direct Mapped) and Table 11-24 (Crossbar Mapped) in the Technical Reference Manual, Rev. J.
In your device tree overlay file, add these channels to the mmc3 node (the device tree labels are 1-based, so mmc3 is the MMC2 controller at 0x47810000):
dmas = <&edma 32
&edma 33>;
dma-names = "tx", "rx";
Context (probably not needed):
As a learning exercise, I'm trying to implement a mini "OS" for the Raspberry Pi.
I'm currently implementing a very dumb memory management system. I already have the MMU enabled, and I'm in the process of getting a usable kmalloc.
It can already allocate chunks of memory from a pre-existing little kernel heap, mapped after the code and data segments. I'm trying to get it to grow as needed by mapping more pages. It must also be able to produce physically contiguous chunks.
The code is hosted on GitHub; there's a branch dedicated to this question with debug code. Note that it's not an example of well-organized, well-commented, or very clever code. :)
Actual question:
While trying to debug a data abort, I found something very strange.
This is a piece of code from my kmalloc:
next->prev_size = chunk->size;
next->size = -1;
term_printf(term, "A chunk->next_free = 0x%x\n", chunk->next_free);
term_printf(term, "B chunk->next_free = 0x%x\n", chunk->next_free);
*prev_list = next;
next->next_free = chunk->next_free;
term_printf(term, "next_free = 0x%x, chunk 0x%x\n", next->next_free, chunk->next_free);
term_printf(term, "next_free = 0x%x, chunk 0x%x\n", next->next_free, chunk->next_free);
I ran it three times. Here are the results:
# 1st
A chunk->next_free = 0x0
B chunk->next_free = 0x0
next_free = 0x0, chunk 0x0
next_free = 0x0, chunk 0x0
# 2nd
A chunk->next_free = 0xffffffff
B chunk->next_free = 0x0
next_free = 0x0, chunk 0xffffffff
next_free = 0x0, chunk 0x0
# 3rd
A chunk->next_free = 0xffffffff
B chunk->next_free = 0xffffffff
next_free = 0xffffffff, chunk 0xffffffff
next_free = 0xffffffff, chunk 0xffffffff
The first and third iterations look normal (though next_free is supposed to have the value 0, the data abort occurs because it's got 0xffffffff). But what is my code doing during the second? O_o What kind of black sorcery can make my printf output two different values for chunk->next_free when read four times in a row? O_o
The data is well aligned, the pages are cacheable and non-bufferable (making them non-cacheable doesn't help), and I get the same result whether compiler optimizations are turned on or off. I tried throwing a data memory barrier in there, but it does nothing. I also checked the assembly produced; it looks ok.
I thought it could be caused by corrupted TLBs. I'm issuing "Invalidate unified single entry" (mcr p15, 0, %[addr], c8, c7, 1) after each new page mapping. Is it enough?
I tried debugging with qemu but it gets a data abort earlier when setting a bitmap of used physical pages, though this part works fine on the Pi.
I'm just looking for clues about what can cause this behavior. If you need more context please ask, though my code is for the moment rapidly changing and messy with lots of printf.
ETA:
The disassembly with -O0 for the first two printf:
c00025e4: e51b3018 ldr r3, [fp, #-24]
c00025e8: e5933008 ldr r3, [r3, #8]
c00025ec: e59b0004 ldr r0, [fp, #4]
c00025f0: e59f10a0 ldr r1, [pc, #160] ; c0002698 <kmalloc_wilderness+0x2c0>
c00025f4: e1a02003 mov r2, r3
c00025f8: eb000238 bl c0002ee0 <term_printf>
c00025fc: e51b3018 ldr r3, [fp, #-24]
c0002600: e5933008 ldr r3, [r3, #8]
c0002604: e59b0004 ldr r0, [fp, #4]
c0002608: e59f1088 ldr r1, [pc, #140] ; c000269c <kmalloc_wilderness+0x2c4>
c000260c: e1a02003 mov r2, r3
c0002610: eb000232 bl c0002ee0 <term_printf>
So it loads the address of chunk into r3, then performs an ldr to get next_free. It does the same again before the second printf. There's only one core and the DMA is not running, so there's nothing changing the value in memory between the calls.
With -O2:
c0001c38: e1a00006 mov r0, r6
c0001c3c: e59f10d8 ldr r1, [pc, #216] ; c0001d1c <kmalloc_wilderness+0x1b8>
c0001c40: e5942008 ldr r2, [r4, #8]
c0001c44: eb000278 bl c000262c <term_printf>
c0001c48: e1a00006 mov r0, r6
c0001c4c: e59f10cc ldr r1, [pc, #204] ; c0001d20 <kmalloc_wilderness+0x1bc>
c0001c50: e5942008 ldr r2, [r4, #8]
c0001c54: eb000274 bl c000262c <term_printf>
So it still fetches the value with ldr. That's why I get the same thing with both optimization levels.
New edit: I added more printfs, and it seems that the singularity happens at this point:
next->size = -1;
After this line, chunk->next_free turns into Heisenberg's cat. Before, it reads as 0.
The structure is defined as such:
struct kheap_chunk {
    size_t prev_size;
    size_t size; // -1 for wilderness chunk, bit 0 high if free
    struct kheap_chunk *next_free;
};
chunk and next don't overlap.
If I move the "singularity line" below the next->next_free = chunk->next_free, it stops alternating between two values, but it's still weird: chunk->next_free is 0 before *prev_list = next, 0xffffffff after that. But next->next_free is still set to 0.
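One detail that may be relevant: on a 32-bit target, storing -1 in a size_t field writes the bit pattern 0xffffffff, which is exactly the value that shows up in chunk->next_free. The sketch below only illustrates that bit pattern and the field offsets; it does not reproduce the bug. It mimics the structure with Python's struct module, assuming 4-byte little-endian fields with no padding, which is consistent with the ldr from offset #8 in the disassembly above.

```python
import struct

# Assumed layout of struct kheap_chunk: prev_size, size, next_free (4 bytes each).
CHUNK = struct.Struct("<III")
FIELDS = ("prev_size", "size", "next_free")


def offset_of(field):
    """Byte offset of a field under the assumed 4-byte-per-field layout."""
    return FIELDS.index(field) * 4


def as_size_t(value):
    """Bit pattern of a C int converted to a 32-bit size_t."""
    return value & 0xFFFFFFFF


def read_field(memory, base, field):
    """Read one 32-bit field of a chunk located at byte offset base."""
    return struct.unpack_from("<I", memory, base + offset_of(field))[0]


def write_field(memory, base, field, value):
    """Write one 32-bit field of a chunk located at byte offset base."""
    struct.pack_into("<I", memory, base + offset_of(field), as_size_t(value))
```

With a bytearray standing in for memory, write_field(memory, base, "size", -1) followed by read_field(memory, base, "size") yields 0xffffffff, so a store intended for next->size that somehow lands on the same physical word as chunk->next_free would produce the value observed.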
When using copy-on-write semantics to share memory among processes, how can you test whether a memory page is writable or marked as read-only? Can this be done by executing a specific assembly instruction, by reading a certain spot in memory, or through the OS's API?
On Linux you can examine /proc/pid/maps:
$ cat /proc/self/maps
002b3000-002cc000 r-xp 00000000 68:01 143009 /lib/ld-2.5.so
002cc000-002cd000 r-xp 00018000 68:01 143009 /lib/ld-2.5.so
002cd000-002ce000 rwxp 00019000 68:01 143009 /lib/ld-2.5.so
002d0000-00407000 r-xp 00000000 68:01 143010 /lib/libc-2.5.so
00407000-00409000 r-xp 00137000 68:01 143010 /lib/libc-2.5.so
00409000-0040a000 rwxp 00139000 68:01 143010 /lib/libc-2.5.so
0040a000-0040d000 rwxp 0040a000 00:00 0
00c6f000-00c70000 r-xp 00c6f000 00:00 0 [vdso]
08048000-0804d000 r-xp 00000000 68:01 379298 /bin/cat
0804d000-0804e000 rw-p 00004000 68:01 379298 /bin/cat
08326000-08347000 rw-p 08326000 00:00 0
b7d1b000-b7f1b000 r--p 00000000 68:01 226705 /usr/lib/locale/locale-archive
b7f1b000-b7f1c000 rw-p b7f1b000 00:00 0
b7f28000-b7f29000 rw-p b7f28000 00:00 0
bfe37000-bfe4d000 rw-p bfe37000 00:00 0 [stack]
The first column is the virtual address range and the second column contains the permissions (read, write, execute, and p for private, i.e. copy-on-write, or s for shared). Columns 3-6 contain the offset, the major and minor device numbers, the inode, and the name of the memory-mapped file.
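As a rough sketch of how this could be used programmatically, the following parses lines in the format shown above and looks up the mapping that contains a given address. The function names are made up for this example.

```python
def parse_maps_line(line):
    """Parse one /proc/<pid>/maps line into a dictionary."""
    fields = line.split(None, 5)
    start, end = (int(part, 16) for part in fields[0].split("-"))
    perms = fields[1]
    return {
        "start": start,
        "end": end,
        "readable": perms[0] == "r",
        "writable": perms[1] == "w",
        "executable": perms[2] == "x",
        "private": perms[3] == "p",
        "path": fields[5].strip() if len(fields) > 5 else "",
    }


def find_mapping(address, maps_text):
    """Return the parsed mapping containing address, or None if unmapped."""
    for line in maps_text.splitlines():
        if not line.strip():
            continue
        mapping = parse_maps_line(line)
        if mapping["start"] <= address < mapping["end"]:
            return mapping
    return None
```

A process could call find_mapping(addr, open("/proc/self/maps").read()) and inspect the "writable" and "private" entries. Note that these are the permissions of the whole mapping: a page in a private writable mapping that has not been copied yet is still reported as rw-p.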
On Win32, the best way is to use VirtualQuery. It returns a MEMORY_BASIC_INFORMATION for the page an address falls in. One of the members is Protect, which holds one of the memory protection constants (such as PAGE_READONLY, PAGE_READWRITE or PAGE_WRITECOPY), possibly combined with modifier flags. The function also tells you whether the memory is free, committed or reserved, and whether it is private, part of an image, or part of a shared memory section.
The OS's API is the best way to determine a page's protection. The CPU reads the protection mode from a page descriptor, which is only accessible from kernel mode.
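To illustrate how the Protect value could be interpreted once VirtualQuery has returned it, here is a small sketch. The constants are the documented Windows memory protection values; the helper functions are hypothetical, and the call to VirtualQuery itself is left out.

```python
# Memory protection constants as defined in the Windows SDK (winnt.h).
PAGE_NOACCESS = 0x01
PAGE_READONLY = 0x02
PAGE_READWRITE = 0x04
PAGE_WRITECOPY = 0x08
PAGE_EXECUTE = 0x10
PAGE_EXECUTE_READ = 0x20
PAGE_EXECUTE_READWRITE = 0x40
PAGE_EXECUTE_WRITECOPY = 0x80
PAGE_GUARD = 0x100  # modifier flag, combined with one of the values above

BASE_MASK = 0xFF  # strips modifier flags such as PAGE_GUARD


def is_writable(protect):
    """True if the base protection allows writes (including copy-on-write)."""
    return (protect & BASE_MASK) in (
        PAGE_READWRITE,
        PAGE_WRITECOPY,
        PAGE_EXECUTE_READWRITE,
        PAGE_EXECUTE_WRITECOPY,
    )


def is_copy_on_write(protect):
    """True if a write to the page would trigger a private copy."""
    return (protect & BASE_MASK) in (PAGE_WRITECOPY, PAGE_EXECUTE_WRITECOPY)
```

A page marked with PAGE_GUARD still raises a guard-page exception on first access, so is_writable only describes the underlying protection.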
Are you talking about the variety of shared memory allocated via shmget (on Unix)? I.e.
int shmget(key_t, size_t, int);
If so, you can query that memory using
int shmctl(int, int, struct shmid_ds *);
For example:
#include <sys/ipc.h>
#include <sys/shm.h>

key_t key = IPC_PRIVATE;      /* or your choice of key, e.g. from ftok() */
int flag = IPC_CREAT | 0600;  /* set of flags for your app */
int shmid = shmget(key, 4096, flag);
struct shmid_ds buf;
int result = shmctl(shmid, IPC_STAT, &buf);
/* on success (result == 0), buf.shm_perm.mode contains the permissions for the memory segment */
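The permission bits in that mode field follow the usual owner/group/other pattern in the low nine bits. As an illustration only, here is a sketch that decodes such a value; the function name is made up, and the mode would come from the shmctl call above.

```python
import stat


def describe_shm_mode(mode):
    """Decode the read/write permission bits of a System V segment mode."""
    perms = mode & 0o777  # only the low nine bits carry permissions
    return {
        "owner_read": bool(perms & stat.S_IRUSR),
        "owner_write": bool(perms & stat.S_IWUSR),
        "group_read": bool(perms & stat.S_IRGRP),
        "group_write": bool(perms & stat.S_IWGRP),
        "other_read": bool(perms & stat.S_IROTH),
        "other_write": bool(perms & stat.S_IWOTH),
    }
```

For a segment created with mode 0600, only owner_read and owner_write are set. These bits describe who may attach the segment for reading or writing; whether a particular attachment is read-only also depends on whether SHM_RDONLY was passed to shmat.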
If you're using Win32, there are the calls IsBadReadPtr and IsBadWritePtr. However, their use is discouraged:
"The general consensus is that the IsBad family of functions (IsBadReadPtr, IsBadWritePtr, and so forth) is broken and should not be used to validate pointers."
The title of Raymond Chen's take on this says it all: "IsBadXxxPtr should really be called CrashProgramRandomly"
Chen also has some helpful advice about how to deal with this issue.
The upshot is that you shouldn't be testing this kind of thing at run time. Write your code so that you know what you're being handed, and if it's not what's expected, treat it as a bug. If you really have no choice, look into structured exception handling (SEH) for handling the exception.