Is it possible to dump and investigate shared memory content on Linux? I spotted some strange shared memory segments in the output of "ipcs -m" and want to see what is in there.
Also, is it possible to determine the creator of such a segment? "nattch" seems to be always zero.
Have a look at this tool
Shmcat
It's a good tool for your purpose.
What do you mean by creator? Do you mean the PID of the creating process? In that case you can use
ipcs -mp
You'll get this output:
------ Shared Memory Creator/Last-op --------
shmid owner cpid lpid
3211265 root 1857 1866
where
CPID
The process ID of the job that created the shared memory segment.
and
LPID
The process ID of the last job to attach or detach from the shared memory segment or change the semaphore value.
Edit:
I don't think it is possible to log that information with standard tools.
I think we can do it this way.
Suppose we execute the command:
ipcs -m
and get these results
------ Shared Memory Segments --------
key shmid owner perms bytes nattch status
0x00000000 3211265 root 644 80 2
Then, with the command grep 3211265 /proc/*/maps, we obtain:
/proc/1862/maps:bla bla bla rw-s 00000000 00:09 3211265 /SYSV00000000 (deleted)
/proc/1863/maps:bla bla bla rw-s 00000000 00:09 3211265 /SYSV00000000 (deleted)
This gives us the processes that are attached to the segment.
By scanning the entries in /proc/*/maps, you can discover the PIDs that are currently attached to a given segment.
You can use a bash script to log this information.
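The /proc scan described above could also be scripted roughly as follows. This is a minimal sketch, assuming the segment id shows up in the inode column (fifth field) of each maps line and the path starts with /SYSV, as in the grep output above. The function name pids_attached and the proc_root parameter are made up for illustration.

```python
import glob
import os


def pids_attached(shmid, proc_root="/proc"):
    """Return the sorted PIDs whose maps file lists a SysV segment with this shmid."""
    pids = set()
    for maps_path in glob.glob(os.path.join(proc_root, "[0-9]*", "maps")):
        try:
            with open(maps_path) as maps_file:
                lines = maps_file.readlines()
        except OSError:
            # The process exited or we lack permission to read it; skip it.
            continue
        for line in lines:
            fields = line.split()
            # Assumed layout: address perms offset dev inode [pathname ...]
            if (len(fields) >= 6
                    and fields[4] == str(shmid)
                    and fields[5].startswith("/SYSV")):
                pids.add(int(os.path.basename(os.path.dirname(maps_path))))
    return sorted(pids)
```

Calling pids_attached(3211265) as root should list the same PIDs that the grep shows. Matching on the inode field instead of the whole line avoids false hits when the number happens to appear in an address.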
I'm using u-boot on raspberry pi 4, A/B booting from USB attached SSD, integrated with mender without yocto. Everything works fine except the env saving: initially configured to use MMC and offsets, fw_printenv complained about a bad CRC and output the default config instead. I changed the env saving to FAT files on the boot partition, and I'm now troubleshooting 2 issues:
uboot.env does not get written by issuing saveenv in the u-boot prompt
uboot-redund.env gets written but its CRC is incorrect.
I'm checking the CRC by issuing the fw_printenv command from linux. Its config file states:
/boot/u-boot/uboot-redund.env 0x0000 0x4000
U-Boot is compiled with 0x4000 as env size, and using hexdump to check the file shows a correct file with a length of 0x4000.
When booting up, u-boot outputs the following log over TTL serial:
U-Boot 2021.07-rc2-00246-gd64b3c608d-dirty (Jun 16 2021 - 12:16:24 +0200)
DRAM: 7.9 GiB
RPI 4 Model B (0xd03114)
MMC: mmcnr@7e300000: 1, emmc2@7e340000: 0
Loading Environment from FAT... In: serial
Out: vidconsole
Err: vidconsole
Net: eth0: ethernet@7d580000
PCIe BRCM: link up, 5.0 Gbps x1 (SSC)
starting USB...
Bus xhci_pci: Register 5000420 NbrPorts 5
Starting the controller
USB XHCI 1.00
scanning bus xhci_pci for devices... 3 USB Device(s) found
scanning usb for storage devices... 1 Storage Device(s) found
Hit any key to stop autoboot: 0
which shows that u-boot does load the env from FAT correctly.
How can I investigate further why u-boot does not write uboot.env, and why when it does write uboot-redund.env, writes it with a wrong CRC from fw_printenv's point of view?
env save
The function env_fat_save writes only one file per call. If CONFIG_SYS_REDUNDAND_ENVIRONMENT is defined, successive saves alternate between CONFIG_ENV_FAT_FILE_REDUND and CONFIG_ENV_FAT_FILE.
You have to execute the command env save twice if you want to write both files.
The command env info indicates which instance was written last via the field env_valid:
=> env select FAT
Select Environment on FAT: OK
=> env info
env_valid = invalid
env_ready = true
env_use_default = false
=> env save
Saving Environment to FAT... OK
=> env info
env_valid = redundant
env_ready = true
env_use_default = false
=> env save
Saving Environment to FAT... OK
=> env info
env_valid = valid
env_ready = true
env_use_default = false
=>
fw_printenv
When using fw_printenv you must specify the configuration which matches your U-Boot build. By default file /etc/fw_env.config is used. You can pass the configuration file on the command line:
tools/env/fw_printenv -c fw_env.config
This is what your configuration file could look like:
uboot.env 0x0000 0x2000
uboot-redund.env 0x0000 0x2000
There is an example file in tools/env/fw_env.config explaining the available fields.
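To see whether a saved file is internally consistent, independent of fw_printenv, the CRC can be recomputed by hand. The sketch below rests on assumptions that should be checked against your U-Boot build: the image starts with a 4-byte CRC32 stored little-endian, a redundant environment adds one flags byte after it, and the CRC covers everything after that header. The function name check_env_crc is made up for illustration.

```python
import struct
import zlib


def check_env_crc(blob, redundant=True):
    """Return (stored_crc, computed_crc) for a raw U-Boot environment image."""
    # Assumed header: 4-byte CRC, plus 1 flags byte when the env is redundant.
    header_len = 5 if redundant else 4
    # Assumes a little-endian target such as the Raspberry Pi.
    (stored,) = struct.unpack_from("<I", blob, 0)
    computed = zlib.crc32(blob[header_len:]) & 0xFFFFFFFF
    return stored, computed
```

For example, `check_env_crc(open("uboot-redund.env", "rb").read())` returns the two values to compare. If they match only with redundant=True, the file itself is probably fine and the mismatch is more likely a disagreement between the fw_env.config layout and the U-Boot build.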
I'm running dask over slurm via jobqueue and I have been getting 3 errors pretty consistently...
Basically my question is what could be causing these failures? At first glance the problem is that too many workers are writing to disk at once, or my workers are forking into many other processes, but it's pretty difficult to track that. I can ssh into the node but I'm not seeing an abnormal number of processes, and each node has a 500gb ssd, so I shouldn't be writing excessively.
Everything below this is just information about my configurations and such
My setup is as follows:
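from dask.distributed import Client
from dask_jobqueue import SLURMCluster

# This snippet runs inside an async function (note the await below).
cluster = SLURMCluster(cores=1, memory=f"{args.gbmem}GB", queue='fast_q', name=args.name,
                       env_extra=["source ~/.zshrc"])
cluster.adapt(minimum=1, maximum=200)
client = await Client(cluster, processes=False, asynchronous=True)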
I suppose I'm not even sure whether processes=False should be set.
I run this starter script via sbatch with 4 GB of memory, 2 cores (-c) (even though I expect to need only 1), and 1 task (-n). This sets off all of my jobs via the SLURMCluster config from above. I dumped my Slurm submission scripts to files and they look reasonable.
Each job is not complex: it is a subprocess.call() of a compiled executable that takes 1 core and 2-4 GB of memory. I require the client call and further calls to be asynchronous because I have a lot of conditional computations. So each worker, when loaded, should consist of 1 Python process, 1 running executable, and 1 shell.
Imposed by the scheduler we have
>> ulimit -a
-t: cpu time (seconds) unlimited
-f: file size (blocks) unlimited
-d: data seg size (kbytes) unlimited
-s: stack size (kbytes) 8192
-c: core file size (blocks) 0
-m: resident set size (kbytes) unlimited
-u: processes 512
-n: file descriptors 1024
-l: locked-in-memory size (kbytes) 64
-v: address space (kbytes) unlimited
-x: file locks unlimited
-i: pending signals 1031203
-q: bytes in POSIX msg queues 819200
-e: max nice 0
-r: max rt priority 0
-N 15: unlimited
Each node has 64 cores, so I don't really think I'm hitting any limits.
I'm using a jobqueue.yaml file that looks like this:
slurm:
  name: dask-worker
  cores: 1 # Total number of cores per job
  memory: 2 # Total amount of memory per job
  processes: 1 # Number of Python processes per job
  local-directory: /scratch # Location of fast local storage like /scratch or $TMPDIR
  queue: fast_q
  walltime: '24:00:00'
  log-directory: /home/dbun/slurm_logs
I would appreciate any advice at all! Full log is below.
FORK BLOCKING IO ERROR
distributed.nanny - INFO - Start Nanny at: 'tcp://172.16.131.82:13687'
Traceback (most recent call last):
File "<string>", line 1, in <module>
File "/home/dbun/.local/share/pyenv/versions/3.7.0/lib/python3.7/multiprocessing/forkserver.py", line 250, in main
pid = os.fork()
BlockingIOError: [Errno 11] Resource temporarily unavailable
distributed.dask_worker - INFO - End worker
Aborted!
CANT START NEW THREAD ERROR
https://pastebin.com/ibYUNcqD
BLOCKING IO ERROR
https://pastebin.com/FGfxqZEk
EDIT:
Another piece of the puzzle:
It looks like dask_worker is running multiple multiprocessing.forkserver calls. Does that sound reasonable?
https://pastebin.com/r2pTQUS4
This problem was caused by having ulimit -u too low.
As it turns out, each worker has a few processes associated with it, and the Python ones have multiple threads. In the end, each worker contributes approximately 14 threads to your ulimit -u count. Mine was set to 512, and on a 64-core system I was likely hitting ~896 (64 x 14). With that limit, the maximum number of threads per worker I could have had was 8 (512 / 64).
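The arithmetic behind those numbers, as a small sketch. It assumes one worker per core and uses the rough figure of 14 threads per worker from above; the function names are made up for illustration.

```python
def estimate_nproc_usage(workers_per_node, threads_per_worker):
    """Threads counted against `ulimit -u` when every core runs one worker."""
    return workers_per_node * threads_per_worker


def max_threads_per_worker(nproc_limit, workers_per_node):
    """Largest per-worker thread count that still fits under the limit."""
    return nproc_limit // workers_per_node


usage = estimate_nproc_usage(64, 14)
print(usage, usage > 512)               # 896 True
print(max_threads_per_worker(512, 64))  # 8
```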
Solution:
In .zshrc (or .bashrc) I added the line
ulimit -u unlimited
Haven't had any problems since.
I am working with the Yocto Project to create images for the BBB. I cloned the project with git clone -b pyro git://git.yoctoproject.org/poky and then started the build. I baked it with bitbake core-image-sato and got the build directory with files.
I created 2 partitions on the SD card: 64M for root and the rest (15+ GB) for boot.
Copied MLO and u-boot-beaglebone.img to the root partition.
Untarred core-image-sato-beaglebone.tar.bz2 on the root partition and then copied zImage-beaglebone.bin, zImage-am335x-bone.dtb, zImage-am335x-boneblack.dtb to the boot partition.
When I tried to boot the BBB, I found that u-boot expects uEnv.txt and gets stuck there. The Yocto build directory doesn't contain a uEnv.txt, so how do I write my own uEnv.txt? This is the u-boot prompt.
Hit any key to stop autoboot: 0
gpio: pin 53 (gpio 53) value is 1
mmc0 is current device
micro SD card found
mmc0 is current device
gpio: pin 54 (gpio 54) value is 1
SD/MMC found on device 0
reading uEnv.txt
** Unable to read file uEnv.txt **
gpio: pin 55 (gpio 55) value is 1
** File not found /boot/uImage **
U-Boot#
I added a uEnv.txt to the root partition with the following content:
mmcdev=0
mmcpart=1
bootpart=0:1
This time u-boot tries to read a uImage from the /boot directory, but I have a zImage. What is causing this conflict, and how should I resolve it?
SD/MMC found on device 0
reading uEnv.txt
32 bytes read in 4 ms (7.8 KiB/s)
Loaded environment from uEnv.txt
Importing environment from mmc ...
gpio: pin 55 (gpio 55) value is 1
reading /boot/uImage
** Unable to read file /boot/uImage **
U-Boot#
It seems that U-boot is not able to find the uEnv.txt file. Try these configurations. You might need to modify some of the configuration based on your environment.
sudo vim uEnv.txt
kernel_file=zImage
bootdir=/boot
mmcdev=0
mmcpart=2
loadzimage=load mmc ${mmcdev}:${mmcpart} ${loadaddr} ${bootdir}/${kernel_file}
loadfdt=load mmc ${mmcdev}:${mmcpart} ${fdtaddr} ${bootdir}/${fdtfile}
console=ttyO0,115200n8
mmcroot=/dev/mmcblk0p2 ro
mmcrootfstype=ext4 rootwait fixrtc
mmcargs=setenv bootargs console=${console} root=${mmcroot} rootfstype=${mmcrootfstype} ${optargs}
uenvcmd=run loadzimage; run loadfdt; run mmcargs; bootz ${loadaddr} - ${fdtaddr}
Copy zImage and dtb to the boot partition:
sudo cp -v /<path_to_kernel>/arch/arm/boot/zImage <path_to_boot>/boot/
sudo cp -v /<path_to_kernel>/arch/arm/boot/dts/am335x-boneblack.dtb <path_to_boot>/boot/
I have used spiffsimg to create a single file containing multiple lua files:
# ./spiffsimg -f lua.img -c 262144 -r lua.script
f 4227 init.lua
f 413 cfg.lua
f 2233 setupWifi.lua
f 7498 configServer.lua
f 558 cfgForm.htm
f 4255 setupConfig.lua
f 14192 main.lua
#
I then use esptool.py to flash the NodeMCU firmware and the file containing the lua files to the esp8266 (NodeMCU dev kit):
c:\esptool-master>c:\Python27\python esptool.py -p COM7 write_flash -fs 32m -fm dio 0x00000 nodemcu-dev-9-modules-2016-07-18-12-06-36-integer.bin 0x78000 lua.img
esptool.py v1.0.2-dev
Connecting...
Running Cesanta flasher stub...
Flash params set to 0x0240
Writing 446464 @ 0x0... 446464 (100 %)
Wrote 446464 bytes at 0x0 in 38.9 seconds (91.9 kbit/s)...
Writing 262144 @ 0x78000... 262144 (100 %)
Wrote 262144 bytes at 0x78000 in 22.8 seconds (91.9 kbit/s)...
Leaving...
I then run ESPlorer to check the status and get:
PORT OPEN 115200
Communication with MCU..Got answer! AutoDetect firmware...
Can't autodetect firmware, because proper answer not received.
NodeMCU custom build by frightanic.com
branch: dev
commit: b21b3e08aad633ccfd5fd29066400a06bb699ae2
SSL: true
modules: file,gpio,http,net,node,rtctime,tmr,uart,wifi
build built on: 2016-07-18 12:05
powered by Lua 5.1.4 on SDK 1.5.4(baaeaebb)
lua: cannot open init.lua
>
----------------------------
No files found.
----------------------------
>
Total : 3455015 bytes
Used : 0 bytes
Remain: 3455015 bytes
The NodeMCU firmware flashed correctly, but the lua files can't be located.
I have tried flashing to other locations (0x84000, 0x7c000), but I am just guessing at these locations based on reading threads on github.
I used the NodeMCU file.fscfg() routine to get the flash address and size. If I only flash the NodeMCU firmware I get the following:
print (file.fscfg())
524288 3653632
524288 is 0x80000, so I tried flashing only the spiffsimg file (lua.img) to 0x80000, then ran the same print statement and got:
print (file.fscfg())
786432 3391488
The flash address increased by exactly the number of bytes in lua.img (786432 - 524288 = 262144), which I don't understand. Why would the flash address change? Is the first number returned by file.fscfg not the starting flash address, but the ending flash address?
What is the correct address for flashing an image file, containing Lua files, that was created by spiffsimg?
The version of spiffsimg found here will provide the correct address for flashing an image file that contains lua files.
Do not use this version of spiffsimg as it is out of date.
To install the spiffsimg utility, you need to download and install the entire nodemcu-firmware package (into a Linux environment, using make to install). Note: make on my Debian Linux box generated an error, but I was able to go to the ../tools/spiffsimg subdirectory and run make on the Makefile found in that directory to create the utility.
The spiffsimg instructions found here are quite clear, with one exception: the file name you specify, with the -f parameter, needs to include the characters %x. The %x will be replaced with the address that the image file should be flashed to.
For example, the command
spiffsimg -f %x-luaFiles.img -S 4MB -U 465783 -r lua.script
will create a file in the local directory with a name like 80000-luaFiles.img, which means you should flash that image file at address 0x80000 on the ESP8266.
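If the flashing step is scripted, the offset can be read back from the generated file name. A minimal sketch, assuming the name starts with the hex offset followed by a dash, as in 80000-luaFiles.img; it also tolerates an optional 0x prefix. The helper name flash_offset_from_name is made up for illustration.

```python
import re


def flash_offset_from_name(filename):
    """Extract the hex offset that spiffsimg substituted for %x in the name."""
    match = re.match(r"(?:0x)?([0-9a-fA-F]+)-", filename)
    if match is None:
        raise ValueError(f"no hex offset prefix in {filename!r}")
    return int(match.group(1), 16)


offset = flash_offset_from_name("80000-luaFiles.img")
print(hex(offset), offset)  # 0x80000 524288
```

The decimal value 524288 is the same number that file.fscfg() reports as the start of the file system in the question.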
I've never done that myself but I'm reasonably confident the correct answer can be extracted from the docs.
-f specifies the filename for the disk image. '%x' will be replaced
by the calculated offset of the file system.
And a bit further down
The disk image file is placed into the bin directory and it is named
0x<offset>-<size>.bin where the offset is the location where it
should be flashed, and the size is the size of the flash part.
However, there's a slight mismatch between the two statements. We may have a bug in the docs. If "'%x' will be replaced..." is accurate, then I'd expect the final name not to contain 0x anymore.
Furthermore, it is possible to define a fixed SPIFFS position when you build the firmware.
#define SPIFFS_FIXED_LOCATION 0x100000
This specifies that the SPIFFS filesystem starts at 1Mb from the start of the flash. Unless
otherwise specified, it will run to the end of the flash (excluding
the 16k of space reserved by the SDK).
Let's say I have opened two tabs in Konsole (Tab1 and Tab2).
When I run tty in both of them I have:
Tab1:
~$ tty
/dev/pts/23
Tab2:
~$ tty
/dev/pts/24
If I run a simple program hello.c with a printf("Hello") in Tab1, how does the system get from writing to stdout (file descriptor 1) to writing to /dev/pts/23, which is read by Konsole and then displayed in Tab1?
How does the system know it has to give the "Hello" string to /dev/pts/23 and not to /dev/pts/24? And how does it do that?
Is there a parameter given by bash to the program so that it knows which pseudoterminal to send the "Hello" to? Or does the program send the string back to bash (how?), which knows which pseudoterminal to send the data to?
Thank you for your help
If you look at your process's open files, you can see that stdin, stdout, and stderr point to the specific pseudoterminal that you already identified using tty in your question:
root@hello:~# ls -l /proc/self/fd
total 0
lrwx------ 1 root root 64 May 21 02:18 0 -> /dev/pts/3
lrwx------ 1 root root 64 May 21 02:18 1 -> /dev/pts/3
lrwx------ 1 root root 64 May 21 02:18 2 -> /dev/pts/3
As you might know, a process is created by a fork system call, which duplicates the open file descriptors of the parent. So basically, your process gets its file descriptors from its parent.
How did the parent get these associated with it? Well, Konsole already dealt with that.
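The inheritance can be demonstrated with a pipe standing in for the pseudoterminal. This is only a sketch of the mechanism, not what Konsole actually runs: the parent plays the terminal, and the child installs the inherited write end as fd 1 before writing. The function name is made up for illustration.

```python
import os


def child_writes_to_inherited_stdout(message=b"Hello"):
    """Fork a child whose fd 1 is a pipe and return what it wrote there."""
    read_end, write_end = os.pipe()
    pid = os.fork()
    if pid == 0:
        try:
            # Child: it inherited both pipe ends across fork().
            os.close(read_end)
            os.dup2(write_end, 1)   # make the pipe the child's stdout
            os.close(write_end)
            os.write(1, message)    # the child only knows "fd 1"
        finally:
            os._exit(0)             # never fall through into parent code
    # Parent: reads the other side, like a terminal emulator would.
    os.close(write_end)
    received = b""
    while True:
        chunk = os.read(read_end, 4096)
        if not chunk:
            break
        received += chunk
    os.close(read_end)
    os.waitpid(pid, 0)
    return received
```

The child never learns where fd 1 leads. It just writes to the descriptor it was given, which is why hello.c needs no parameter telling it which pseudoterminal to use.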