Issue increasing memsize to desired value in SAS - memory

I am running a proc mixed program for which the default 2GB of memory is insufficient.
I changed the memsize in the config file to 4G, and it did show as 4GB when I checked with proc options; run;. However, that is still not enough for proc mixed to execute.
When I changed it to 8G and ran proc options; run; again, the memsize was unfortunately still stuck at 4GB.
My computer has 16GB of RAM, so I did not expect to run into this issue. Is there a workaround?

Are you running 64-bit SAS? The only reasons I can think of for it being limited are that you're using a 32-bit version of SAS or that you changed the wrong config file. The standard config file location is:
C:\Program Files\SASHome\SASFoundation\9.4\nls\en\sasv9.cfg
If you're running UTF-8, it is located in:
C:\Program Files\SASHome\SASFoundation\9.4\nls\u8\sasv9.cfg
Changing -MEMSIZE to 8G in that file should work.
One test would be to invoke SAS with the option set explicitly on the command line:
sas.exe -MEMSIZE 8G
and then check it from that session:
proc options group=memory;
run;
This should show:
MEMSIZE=8589934592
If it does not show 8GB, you must be running a 32-bit version of SAS, which caps the maximum memory at 4GB even if the config file specifies a larger value.
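As a quick check from within a session, the following minimal example reports the value the session actually picked up (GETOPTION and PROC OPTIONS are standard; nothing here is specific to your setup):
/* print the effective memory ceiling for the current session */
%put NOTE: MEMSIZE is currently %sysfunc(getoption(MEMSIZE));

/* or limit PROC OPTIONS to just this option */
proc options option=memsize;
run;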

Related

Request larger than allowed

I have an OpenWhisk deployment running on Docker and I want to increase the maximum memory limit. I think this can be done with environment variables; which environment variable should I use to fix my problem?
I get the following error:
Request larger than allowed: 3600245 > 1048576 bytes. (code 9RKHKJtJVqWK7jz3o608HcQagzLHFfvt)

mmap error : cannot allocate memory. how to allocate enough default sized huge pages as admin?

I compiled and ran this program but received 'mmap error : cannot allocate memory'.
The comment at the top reads:
/*
* Example of using hugepage memory in a user application using the mmap
* system call with MAP_HUGETLB flag. Before running this program make
* sure the administrator has allocated enough default sized huge pages
* to cover the 256 MB allocation.
*
* For ia64 architecture, Linux kernel reserves Region number 4 for hugepages.
* That means the addresses starting with 0x800000... will need to be
* specified. Specifying a fixed address is not required on ppc64, i386
* or x86_64.
*/
I want to check whether the administrator has allocated enough default sized huge pages to cover the 256 MB allocation, but I am the system administrator. What should I do? I'm on an Ubuntu 20.04 x86_64 machine. (A side question: does mmap use the heap area?)
ADDED: please see my comment (I temporarily added a boot command-line argument in the GRUB menu and the code works), but I wish I could make this take effect every time the computer boots, e.g. via an init script.
There seem to be two methods:
1. Add vm.nr_hugepages = 16 to /etc/sysctl.conf and reboot. I've checked that this works (see the commands below).
2. (As Nate Eldredge commented) add hugepages=16 inside the quotes of the GRUB_CMDLINE_LINUX="" line in /etc/default/grub and run update-grub.
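For the sysctl route, something like the following should apply the setting without waiting for a reboot and then confirm it took effect (runtime allocation can fail if memory is already fragmented, in which case the GRUB route is more reliable). The count of 16 matches the example above; with the default 2 MB huge pages on x86_64, a 256 MB mapping needs 128 pages, so adjust the number to cover your allocation.
# append the setting and reload sysctl immediately
echo 'vm.nr_hugepages = 16' | sudo tee -a /etc/sysctl.conf
sudo sysctl -p

# verify: HugePages_Total should now report the requested count
grep Huge /proc/meminfo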

ARM: Safe physical memory position (to reserve) for my ARM hypervisor in relation to a Linux/Android guest

I am developing a basic hypervisor on ARM (using the board Arndale Exynos 5250).
I want to load Linux(ubuntu or smth else)/Android as the guest. Currently I'm using a Linaro distribution.
I'm almost there, most of the big problems have already been dealt with, except for the last one: reserving memory for my hypervisor such that the kernel does not try to OVERWRITE it BEFORE parsing the FDT or the kernel command line.
The problem is that my Linaro distribution's U-Boot passes a FDT in R2 to the linux kernel, BUT the kernel tries to overwrite my hypervisor's memory before seeing that I reserved that memory region in the FDT (by decompiling the DTB, modifying the DTS and recompiling it). I've tried to change the kernel command-line parameters, but they are also parsed AFTER the kernel tries to overwrite my reserved portion of memory.
Thus, what I need is a safe memory location in the physical RAM where I can put my hypervisor's code such that the Linux kernel won't try to access (r/w) it BEFORE parsing the FDT or its kernel command line.
Context details:
The system RAM layout on Exynos 5250 is: physical RAM starts at 0x4000_0000 (=1GB) and is 0x8000_0000 (=2GB) long.
The Linux kernel is loaded (by U-Boot) at 0x4000_7000, its size (uncompressed uImage) is less than 5MB, and its entry point is set to be at 0x4000_8000;
uInitrd is loaded at 0x4200_0000 and is less than 2MB in size;
The FDT (board.dtb) is loaded at 0x41f0_0000 (passed in R2) and is less than 35KB in size;
I currently load my hypervisor at 0x40C0_0000 and I want to reserve 200MB (0x0C80_0000) starting from that address, but the kernel tries to write there (a stage 2 HYP trap tells me that) before looking in the FDT or in the command line to see that the region is actually reserved. If instead I load my hypervisor at 0x5000_0000 (without even modifying the original DTB or the command line), it does not try to overwrite me!
The FDT is passed directly, not through ATAGs
Since when loading my hypervisor at 0x5000_0000 the kernel does not try to overwrite it whatsoever, I assume there are memory regions that Linux does not touch before parsing the FDT/command-line. I need to know whether this is true or not, and if true, some details regarding these memory regions.
Thanks!
RELATED QUESTION:
Does anyone happen to know what the priority is between the following: ATAGs / kernel command line / FDT? For instance, if I reserve memory through the kernel command line, but not in the FDT (.dtb), should it work, or is the command line overridden by the FDT? Is there some kind of concatenation between these three?
As per
https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/tree/Documentation/arm/Booting, safe locations start 128MB from the start of RAM (assuming the kernel is loaded in that region, which it should be). If a zImage was loaded lower in memory than what is likely to be the end address of the decompressed image, it might relocate itself higher up before it starts decompressing. But in addition to this, the kernel has a .bss region beyond the end of the decompressed image in memory.
(Do also note that your FDT and initrd locations already violate this specification, and that the memory block you want to reserve covers the locations of both of these.)
Effectively, your reserved area should go after the FDT and initrd in memory - which 0x50000000 is. But anything > 0x08000000 from the start of RAM (i.e. above 0x4800_0000 on this board) should work, portably, so long as it doesn't overwrite the FDT, initrd or U-Boot in memory.
The priority of kernel/FDT/bootloader command line depends on the kernel configuration - do a menuconfig and check under "Boot options". You can combine ATAGS with the built-in command lines, but not FDT - after all, the FDT chosen node is supposed to be generated by the bootloader - U-boot's FDT support is OK so you should let it do this rather than baking it into the .dts if you want an FDT command line.
The kernel is pretty conservative before it's got its memory map since it has to blindly trust the bootloader has laid things out as specified. U-boot on the other hand is copying bits of itself all over the place and is certainly the culprit for the top end of RAM - if you #define DEBUG in (I think) common/board_f.c you'll get a dump of what it hits during relocation (not including the Exynos iRAM SPL/boot code stuff, but that won't make a difference here anyway).

Why is proc upload so slow?

I've also posted this question on runsubmit.com, a site outside the SE network for SAS-related questions.
At work there are two SAS servers I use. When I transfer a SAS dataset from one to the other via proc upload, it goes at about 2.5MB/s. However, if I map a drive on one server as a network drive on the other and copy and paste the file across, it runs much faster, around 80MB/s (over the same gigabit connection).
Could anyone suggest what might be causing this and what I can do either to fix it or as a workaround?
There is also a third server I use that cannot map network drives on the other two: SAS is the only available means of transferring files from that one, so I need a SAS-based solution. Although individual transfers from this one run at 2.5MB/s, I've found that it's possible to have several transfers all going in parallel, each at 2.5MB/s.
Would SAS FTP via filenames and a data step be any faster than using proc upload? I might try that next, but I would prefer not to: we only have SAS 9.1.3, so SFTP isn't available.
Update - Further details:
I'm connecting to a spawner, and I think it uses 'SAS proprietary encryption' (based on what I recall seeing in the logs).
The uploads are Windows client -> Windows remote in the first case and Unix client -> Windows remote in the second case.
The SAS datasets in question are compressed (i.e. by SAS, not some external compression utility).
The transfer rate is similar when using proc upload to transfer external files (.bz2) in binary mode.
All the servers have very fast disk arrays handled by enterprise-grade controllers (minimum 8 drives in RAID 10)
Potential solutions
Parallel PROC UPLOAD - potentially fast enough, but extremely CPU-heavy
PROC COPY - much faster than PROC UPLOAD, much less CPU overhead
SAS FTP - not secure, unknown speed, unknown CPU overhead
Update - test results
Parallel PROC UPLOAD: involves quite a lot of setup* and a lot of CPU, but works reasonably well.
PROC COPY: exactly the same transfer rate per session as proc upload, and far more CPU time used.
FTP: About 20x faster, minimal CPU (100MB/s vs. 2.5MB/s per parallel proc upload).
*I initially tried the following:
local session -> remote session on source server -> n remote sessions
on destination server -> Recombine n pieces on destination server
Although this resulted in n simultaneous transfers, they each ran at 1/n of the original rate, probably due to a CPU bottleneck on the source server. To get it to work with n times the bandwidth of a single transfer, I had to set it up as:
local session -> n remote sessions on source server -> 1 remote
session each on destination server -> Recombine n pieces on destination server
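The exact multi-hop signon code is very site-specific, but the mechanics that make it parallel are asynchronous RSUBMIT blocks. Below is a stripped-down sketch of just that part, driving two uploads at once from a single client session; the host name, port, staging path and piece datasets are placeholders, not the actual three-tier setup described above.
options comamid=tcp;                 /* SAS/CONNECT over TCP */

/* two connections to the same destination spawner (host/port are placeholders) */
%let dest1=destserver 7551;
%let dest2=destserver 7551;
signon dest1;
signon dest2;

/* work.piece1/piece2 are assumed to be pre-split halves of the source dataset */
rsubmit dest1 wait=no;
    libname stage 'C:\staging';      /* hypothetical shared library on the destination */
    proc upload data=work.piece1 out=stage.piece1; run;
endrsubmit;

rsubmit dest2 wait=no;
    libname stage 'C:\staging';
    proc upload data=work.piece2 out=stage.piece2; run;
endrsubmit;

waitfor _all_ dest1 dest2;           /* block until both transfers have finished */

rsubmit dest1;                       /* recombine the pieces on the destination */
    data stage.combined;
        set stage.piece1 stage.piece2;
    run;
endrsubmit;

signoff _all_;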
SAS FTP code
filename source ftp '\dir1\dir2'
    host='servername'
    binary dir
    user="&username" pass="&password";

/* point a fileref at the local WORK directory */
%let work = %sysfunc(pathname(work));
filename target "&work";

/* copy the remote file record by record into WORK */
data _null_;
    infile source('dataset.sas7bdat') truncover;
    input;
    file target('dataset.sas7bdat');
    put _infile_;
run;
My understanding of PROC UPLOAD is that it performs a record-by-record upload of the file along with some conversions and checks, which is helpful in some ways, but not particularly fast. PROC COPY, on the other hand, will happily copy the file without being quite as careful to maintain things like indexes and constraints, but it will be much faster. You just have to define a libref for your server's files.
For example, I sign on to my server and assign it the 'unix' nickname. Then I define a library on it:
libname uwork server=unix slibref=work;
Then I execute the following PROC COPY code, using a randomly generated 1e7 row datafile. Following that, I also RSUBMIT a PROC UPLOAD for comparison purposes.
48 proc copy in=work out=uwork;
NOTE: Writing HTML Body file: sashtml.htm
49 select test;
50 run;
NOTE: Copying WORK.TEST to UWORK.TEST (memtype=DATA).
NOTE: There were 10000000 observations read from the data set WORK.TEST.
NOTE: The data set UWORK.TEST has 10000000 observations and 1 variables.
NOTE: PROCEDURE COPY used (Total process time):
real time 13.07 seconds
cpu time 1.93 seconds
51 rsubmit;
NOTE: Remote submit to UNIX commencing.
3 proc upload data=test;
4 run;
NOTE: Upload in progress from data=WORK.TEST to out=WORK.TEST
NOTE: 80000000 bytes were transferred at 1445217 bytes/second.
NOTE: The data set WORK.TEST has 10000000 observations and 1 variables.
NOTE: Uploaded 10000000 observations of 1 variables.
NOTE: The data set WORK.TEST has 10000000 observations and 1 variables.
NOTE: PROCEDURE UPLOAD used:
real time 55.46 seconds
cpu time 42.09 seconds
NOTE: Remote submit to UNIX complete.
PROC COPY is still not quite as fast as OS copying, but it's much closer in speed. PROC UPLOAD is actually quite a bit slower than even a regular data step, because it's doing some checking; in fact, here the data step is comparable to PROC COPY due to the simplicity of the dataset (and probably the fact that I have a 64k block size, meaning that a data step is using the server's 16k block size while PROC COPY presumably does not).
52 data uwork.test;
53 set test;
54 run;
NOTE: There were 10000000 observations read from the data set WORK.TEST.
NOTE: The data set UWORK.TEST has 10000000 observations and 1 variables.
NOTE: DATA statement used (Total process time):
real time 12.60 seconds
cpu time 1.66 seconds
In general, in 'real world' situations, PROC COPY is faster than a data step, but both are faster than PROC UPLOAD - unless you need PROC UPLOAD because of complexities in your situation (I have never seen a reason to, but I know it is possible). I think PROC UPLOAD was more necessary in older versions of SAS but is largely unneeded now; that said, my experience covers a fairly limited range of hardware setups, so this may not apply to your situation.
FTP, if available from the source server, is much faster than proc upload or proc copy. These both operate on a record-by-record basis and can be CPU-bound over fast network connections, especially for very wide datasets. A single FTP transfer will attempt to use all available bandwidth, with negligible CPU cost.
This assumes that the destination server can use the unmodified transferred file - if not, the time required to make it usable might negate the increased transfer speed of FTP.

Finding what hard drive sectors occupy a file

I'm looking for a nice, easy way to find which sectors a given file occupies. My language preference is C#.
From my A-Level Computing class I was taught that a hard drive has a lookup table in the first few KB of the disk. In this table there is a linked list for each file detailing which sectors that file occupies. So I'm hoping there's a convenient way to look in this table for a certain file and see which sectors it occupies.
I have tried Googling but I am finding nothing useful. Maybe I'm not searching for the right thing, but I can't find anything at all.
Any help is appreciated, thanks.
About Drives
The physical geometry of modern hard drives is no longer directly accessible by the operating system. Early hard drives were simple enough that it was possible to address them according to their physical structure: cylinder-head-sector. Modern drives are much more complex and use systems like zone bit recording, in which not all tracks have the same number of sectors. It's no longer practical to address them according to their physical geometry.
from the fdisk man page:
If possible, fdisk will obtain the disk geometry automatically. This is not necessarily the physical disk geometry (indeed, modern disks do not really have anything
like a physical geometry, certainly not something that can be described in simplistic Cylinders/Heads/Sectors form)
To get around this problem, modern drives are addressed using Logical Block Addressing (LBA), which is what the operating system knows about. LBA is an addressing scheme where the entire disk is represented as a linear set of blocks, each block being a uniform number of bytes (usually 512 or larger).
About Files
In order to understand where a "file" is located on a disk (at the LBA level) you will need to understand what a file is. This is going to be dependent on what file system you are using. In Unix style file systems there is a structure called an inode which describes a file. The inode stores all the attributes a file has and points to the LBA location of the actual data.
Ubuntu Example
Here's an example of finding the LBA location of file data.
First get your file's inode number
$ ls -i
659908 test.txt
Run the file system debugger. "yourPartition" will be something like sda1; it is the partition that your file system is located on.
$ sudo debugfs /dev/yourPartition
debugfs: stat <659908>
Inode: 659908 Type: regular Mode: 0644 Flags: 0x80000
Generation: 3039230668 Version: 0x00000000:00000001
...
...
Size of extra inode fields: 28
EXTENTS:
(0): 266301
The number under "EXTENTS", 266301, is the logical block in the file system where your file's data is located. If your file is large, there will be multiple blocks listed. There's probably an easier way to get that number, but I couldn't find one.
To validate that we have the right block, use dd to read that block off the disk. To find out your file system's block size, use dumpe2fs:
dumpe2fs -h /dev/yourPartition | grep "Block size"
Then put your block size in the ibs= parameter, and the extent logical block in the skip= parameter, and run dd like this:
sudo dd if=/dev/yourPartition of=success.txt ibs=4096 count=1 skip=266301
success.txt should now contain the original file's contents.
sudo hdparm --fibmap file
This works for ext, vfat and NTFS... maybe more.
FIBMAP is also available programmatically on Linux, as an ioctl.
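As a minimal sketch of that ioctl route (assuming root privileges and a filesystem that still supports FIBMAP; on extent-based filesystems the newer FIEMAP ioctl is preferred):
/* Map logical block 0 of a file to a filesystem block number via FIBMAP. */
#include <stdio.h>
#include <fcntl.h>
#include <unistd.h>
#include <sys/ioctl.h>
#include <linux/fs.h>   /* FIBMAP, FIGETBSZ */

int main(int argc, char **argv)
{
    if (argc < 2) {
        fprintf(stderr, "usage: %s <file>\n", argv[0]);
        return 1;
    }

    int fd = open(argv[1], O_RDONLY);
    if (fd < 0) { perror("open"); return 1; }

    int blksz = 0;
    if (ioctl(fd, FIGETBSZ, &blksz) < 0) { perror("FIGETBSZ"); return 1; }

    int block = 0;   /* in: logical block index of the file; out: filesystem block */
    if (ioctl(fd, FIBMAP, &block) < 0) { perror("FIBMAP"); return 1; }

    printf("block size %d bytes, first extent at filesystem block %d\n", blksz, block);
    close(fd);
    return 0;
}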
