Avoiding excessive memory allocation in golang when using an io.Writer

I am working on a command line tool in Go called redis-mass that converts a bunch of redis commands into redis protocol format.
The first step was to port the node.js version, almost literally to Go. I used ioutil.ReadFile(inputFileName) to get a string version of the file and then returned an encoded string as output.
When I ran this on a file with 2,000,000 redis commands, it took about 8 seconds, compared to about 16 seconds with the node version. I guessed that the reason it was only twice as fast was because it was reading the whole file into memory first, so I changed my encoding function to accept a pair (raw io.Reader, enc io.Writer), and it looks like this:
func EncodeStream(raw io.Reader, enc io.Writer) {
    var args []string
    var length int
    scanner := bufio.NewScanner(raw)
    for scanner.Scan() {
        command := strings.TrimSpace(scanner.Text())
        args = parse(command)
        length = len(args)
        if length > 0 {
            io.WriteString(enc, fmt.Sprintf("*%d\r\n", length))
            for _, arg := range args {
                io.WriteString(enc, fmt.Sprintf("$%d\r\n%s\r\n", len(arg), arg))
            }
        }
    }
}
However, this took 12 seconds on the 2 million line file, so I used github.com/pkg/profile to see how it was using memory, and it looks like the number of memory allocations is huge:
# Alloc = 3162912
# TotalAlloc = 1248612816
# Mallocs = 46001048
# HeapAlloc = 3162912
Can I constrain the io.Writer to use a fixed-size buffer and avoid all those allocations?
More generally, how can I avoid excessive allocations in this method? Here's the full source for more context.
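For reference, the numbers above came from the heap profile written by github.com/pkg/profile; enabling it looks roughly like this (a sketch, the exact invocation in redis-mass may differ):

package main

import "github.com/pkg/profile"

func main() {
    // Collect a memory profile for the whole run and write it out on exit.
    defer profile.Start(profile.MemProfile, profile.ProfilePath(".")).Stop()

    // ... open the input and output files and call EncodeStream here ...
}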

Reduce allocations by working with []byte instead of strings, and use fmt.Fprintf to write directly to the output instead of building strings with fmt.Sprintf and writing them with io.WriteString. (This assumes parse is adapted to accept and return []byte values.)
func EncodeStream(raw io.Reader, enc io.Writer) {
    var args [][]byte
    var length int
    scanner := bufio.NewScanner(raw)
    for scanner.Scan() {
        command := bytes.TrimSpace(scanner.Bytes())
        args = parse(command)
        length = len(args)
        if length > 0 {
            fmt.Fprintf(enc, "*%d\r\n", length)
            for _, arg := range args {
                fmt.Fprintf(enc, "$%d\r\n%s\r\n", len(arg), arg)
            }
        }
    }
}
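Going one step further than the answer above, you can also wrap the destination in a bufio.Writer (which addresses the fixed-size buffer part of the question) and build the "*N\r\n" / "$N\r\n" headers with strconv.AppendInt into a reused scratch slice, so the hot loop allocates nothing per line. This is only a sketch of the idea, not the project's actual code; parseBytes below is a stand-in for its parse function:

package main

import (
    "bufio"
    "bytes"
    "io"
    "os"
    "strconv"
)

// parseBytes stands in for the project's parse function; here it simply
// splits the command on whitespace, which is enough to show the pattern.
func parseBytes(command []byte) [][]byte {
    return bytes.Fields(command)
}

// EncodeStreamBuffered writes the encoded form through a bufio.Writer and
// builds the length headers in a reused scratch slice, so the loop body
// performs no per-line allocations of its own.
func EncodeStreamBuffered(raw io.Reader, enc io.Writer) error {
    w := bufio.NewWriter(enc) // fixed-size internal buffer, 4096 bytes by default
    scanner := bufio.NewScanner(raw)
    scratch := make([]byte, 0, 32)
    for scanner.Scan() {
        args := parseBytes(bytes.TrimSpace(scanner.Bytes()))
        if len(args) == 0 {
            continue
        }
        scratch = append(scratch[:0], '*')
        scratch = strconv.AppendInt(scratch, int64(len(args)), 10)
        scratch = append(scratch, '\r', '\n')
        w.Write(scratch)
        for _, arg := range args {
            scratch = append(scratch[:0], '$')
            scratch = strconv.AppendInt(scratch, int64(len(arg)), 10)
            scratch = append(scratch, '\r', '\n')
            w.Write(scratch)
            w.Write(arg)
            w.WriteString("\r\n")
        }
    }
    if err := scanner.Err(); err != nil {
        return err
    }
    return w.Flush()
}

func main() {
    if err := EncodeStreamBuffered(os.Stdin, os.Stdout); err != nil {
        panic(err)
    }
}

With the writer buffered, the remaining allocations should mostly come from the scanner growing its internal buffer and from whatever parse itself does.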

Related

writing to flash memory dspic33e

I have some questions regarding the flash memory with a dspic33ep512mu810.
I'm aware of how it should be done:
set all the registers for address, latches, etc., then do the sequence to start the write procedure or call the builtin functions.
But I find there are some small differences between what I'm experiencing and what is in the documentation.
When writing the flash in WORD mode, the documentation is pretty straightforward. Following is the example code from the documentation:
int varWord1L = 0xXXXX;
int varWord1H = 0x00XX;
int varWord2L = 0xXXXX;
int varWord2H = 0x00XX;
int TargetWriteAddressL; // bits<15:0>
int TargetWriteAddressH; // bits<22:16>
NVMCON = 0x4001; // Set WREN and word program mode
TBLPAG = 0xFA; // write latch upper address
NVMADR = TargetWriteAddressL; // set target write address
NVMADRU = TargetWriteAddressH;
__builtin_tblwtl(0,varWord1L); // load write latches
__builtin_tblwth(0,varWord1H);
__builtin_tblwtl(0x2,varWord2L);
__builtin_tblwth(0x2,varWord2H);
__builtin_disi(5); // Disable interrupts for NVM unlock sequence
__builtin_write_NVM(); // initiate write
while(NVMCONbits.WR == 1);
But that code doesn't work, depending on the address where I want to write. I found a fix to write one WORD, but I can't write 2 WORDs where I want. I store everything in the aux memory, so the upper address (NVMADRU) is always 0x7F for me. NVMADR is the address I can change. What I'm seeing is that if the address where I want to write modulo 4 is not 0, then I have to put my value in the last 2 latches; otherwise I have to put the value in the first latches.
If the address modulo 4 is not zero, it doesn't work like the doc code above: the value that ends up at the address is whatever is in the second set of latches.
I fixed it for writing only one word at a time like this:
if(Address % 4)
{
    __builtin_tblwtl(0, 0xFFFF);
    __builtin_tblwth(0, 0x00FF);
    __builtin_tblwtl(2, ValueL);
    __builtin_tblwth(2, ValueH);
}
else
{
    __builtin_tblwtl(0, ValueL);
    __builtin_tblwth(0, ValueH);
    __builtin_tblwtl(2, 0xFFFF);
    __builtin_tblwth(2, 0x00FF);
}
1) I want to know why I'm seeing this behavior.
2) I also want to write a full row.
That also doesn't seem to work for me, and I don't know why, because I'm doing what is in the documentation.
I tried a simple row-write and at the end I just read back the first 3 or 4 elements that I wrote to see if it worked:
NVMCON = 0x4002; //set for row programming
TBLPAG = 0x00FA; //set address for the write latches
NVMADRU = 0x007F; //upper address of the aux memory
NVMADR = 0xE7FA;
int latchoffset;
latchoffset = 0;
__builtin_tblwtl(latchoffset, 0);
__builtin_tblwth(latchoffset, 0); //current = 0, available = 1
latchoffset+=2;
__builtin_tblwtl(latchoffset, 1);
__builtin_tblwth(latchoffset, 1); //current = 0, available = 1
latchoffset+=2;
.
. all the way to 127(I know I could have done it in a loop)
.
__builtin_tblwtl(latchoffset, 127);
__builtin_tblwth(latchoffset, 127);
INTCON2bits.GIE = 0; //stop interrupt
__builtin_write_NVM();
while(NVMCONbits.WR == 1);
INTCON2bits.GIE = 1; //start interrupt
int testaddress;
testaddress = 0xE7FA;
status = NVMemReadIntH(testaddress);
status = NVMemReadIntL(testaddress);
testaddress += 2;
status = NVMemReadIntH(testaddress);
status = NVMemReadIntL(testaddress);
testaddress += 2;
status = NVMemReadIntH(testaddress);
status = NVMemReadIntL(testaddress);
testaddress += 2;
status = NVMemReadIntH(testaddress);
status = NVMemReadIntL(testaddress);
What I see is that the value stored at address 0xE7FA is 125, at 0xE7FC it is 126, and at 0xE7FE it is 127. The rest are all 0xFFFF.
Why is it taking only the last 3 latches and writing them to the first 3 addresses?
Thanks in advance for your help people.
The dsPIC33 program memory space is treated as 24 bits wide, it is
more appropriate to think of each address of the program memory as a
lower and upper word, with the upper byte of the upper word being
unimplemented
(dsPIC33EPXXX datasheet)
There is a phantom byte every two program words.
Your code
if(Address % 4)
{
    __builtin_tblwtl(0, 0xFFFF);
    __builtin_tblwth(0, 0x00FF);
    __builtin_tblwtl(2, ValueL);
    __builtin_tblwth(2, ValueH);
}
else
{
    __builtin_tblwtl(0, ValueL);
    __builtin_tblwth(0, ValueH);
    __builtin_tblwtl(2, 0xFFFF);
    __builtin_tblwth(2, 0x00FF);
}
...will be fine for writing a bootloader if generating values from a valid Intel HEX file, but doesn't make it simple for storing data structures because the phantom byte is not taken into account.
If you create a uint32_t variable and look at the compiled HEX file, you'll notice that it in fact uses the least significant words of two 24-bit program words. That is, the 32-bit value is placed into a 64-bit range, but only 48 of those 64 bits are programmable; the others are phantom bytes (or zeros). That leaves three bytes per address modulo 4 that are actually programmable.
What I tend to do when writing data is to keep everything 32-bit aligned and do the same as the compiler does.
Writing:
UINT32 value = ....;
:
__builtin_tblwtl(0, value.word.word_L); // least significant word of 32-bit value placed here
__builtin_tblwth(0, 0x00); // phantom byte + unused byte
__builtin_tblwtl(2, value.word.word_H); // most significant word of 32-bit value placed here
__builtin_tblwth(2, 0x00); // phantom byte + unused byte
Reading:
UINT32 *value
:
value->word.word_L = __builtin_tblrdl(offset);
value->word.word_H = __builtin_tblrdl(offset+2);
UINT32 structure:
typedef union _UINT32 {
    uint32_t val32;
    struct {
        uint16_t word_L;
        uint16_t word_H;
    } word;
    uint8_t bytes[4];
} UINT32;

How to find the byte size of a table in Lua

I'm writing a log aggregator and I want to send the logs once they reach a maximum byte size. So, is there a way in Lua to find out the byte size of a variable (the active_batch size)?
local batch = {
    flush_timeout
    retry_count
    batch_max_size
    batch_count
    batch_to_execute = {},
    active_batch = { entries = {}, count = 0, retries = 0 }
}
You can only get the total memory used by Lua, via collectgarbage("count").
In this case I think that storing the length of each string and keeping a running sum will work.

Hash value of String that would be stable across iOS releases?

In the documentation for String.hash on iOS it says:
You should not rely on this property having the same hash value across
releases of OS X.
(strange that they speak of OS X in the iOS documentation)
Well, I need a hashing function that will not change with iOS releases. It can be simple; I do not need anything like SHA. Is there a library for that?
There is another question about this here, but the accepted (and only) answer there simply states that we should respect the note in the documentation.
Here is a non-crypto hash, for Swift 3:
func strHash(_ str: String) -> UInt64 {
    var result = UInt64(5381)
    let buf = [UInt8](str.utf8)
    for b in buf {
        result = 127 * (result & 0x00ffffffffffffff) + UInt64(b)
    }
    return result
}
It was derived somewhat from a C++11 constexpr
constexpr uint64_t str2int(char const *input) {
    return *input                            // test for null terminator
        ? (static_cast<uint64_t>(*input) +   // add char to end
           127 * ((str2int(input + 1)        // prime 127 shifts left almost 7 bits
                   & 0x00ffffffffffffff)))   // mask right 56 bits
        : 5381;                              // start with prime number 5381
}
Unfortunately, the two don't yield the same hash. To do that you'd need to reverse the iterator order in strHash:
for b in buf.reversed() {...}
But that will run 13x slower, somewhat comparable to the djb2hash String extension that I got from https://useyourloaf.com/blog/swift-hashable/
Here are some benchmarks, for a million iterations:
hashValue execution time: 0.147760987281799
strHash execution time: 1.45974600315094
strHashReversed time: 18.7755110263824
djb2hash execution time: 16.0091370344162
sdbmhash crashed
For C++, str2int is roughly as fast as Swift 3's hashValue:
str2int execution time: 0.136421

Continuous memory increase when declaring very large array and iterating over stdin

The following code declares two arrays and then iterates over stdin (it just blindly iterates over the file; there is no interaction with the arrays).
This causes a continuous increase in memory.
However, if I just declare the two arrays and sleep, there is no increase in memory.
Similarly, if I just iterate over stdin, there is no increase in memory.
But together (apart from the memory allocated for the arrays) there is a continuous increase.
I measure this by looking at the RES memory in the top tool.
I have commented out the first few lines in func doSomething() to show that there is no memory increase when they are commented. Uncommenting the lines and running will cause an increase.
NOTE: This was run on Go 1.4.2, 1.5.3 and 1.6.
NOTE: You will need to reproduce this on a machine with at least 16 GB of RAM, as I have only observed it with an array size of 1 billion.
package main

import (
    "bufio"
    "fmt"
    "io"
    "os"
)

type MyStruct struct {
    arr1 []int
    arr2 []int
}

func (ms *MyStruct) Init(size int, arr1 []int, arr2 []int) error {
    fmt.Printf("initializing mystruct arr1...\n")
    ms.arr1 = arr1
    if ms.arr1 == nil {
        ms.arr1 = make([]int, size, size)
    }
    fmt.Printf("initializing mystruct arr2...\n")
    ms.arr2 = arr2
    if ms.arr2 == nil {
        ms.arr2 = make([]int, size, size)
    }
    fmt.Printf("done initializing ...\n")
    for i := 0; i < size; i++ {
        ms.arr1[i] = 0
        ms.arr2[i] = 0
    }
    return nil
}

func doSomething() error {
    fmt.Printf("starting...\n")
    fmt.Printf("allocating\n")
    /* NOTE WHEN UNCOMMENTED CAUSES MEMORY INCREASE
    ms := &MyStruct{}
    size := 1000000000
    ms.Init(size, nil, nil)
    */
    fmt.Printf("finished allocating..%d %d\n", len(ms.arr1), len(ms.arr2))
    fmt.Printf("reading from stdin...\n")
    reader := bufio.NewReader(os.Stdin)
    var line string
    var readErr error
    var lineNo int = 0
    for {
        if lineNo%1000000 == 0 {
            fmt.Printf("read %d lines...\n", lineNo)
        }
        lineNo++
        line, readErr = reader.ReadString('\n')
        if readErr != nil {
            fmt.Printf("break at %s\n", line)
            break
        }
    }
    if readErr == io.EOF {
        readErr = nil
    }
    if readErr != nil {
        return readErr
    }
    return nil
}

func main() {
    if err := doSomething(); err != nil {
        panic(err)
    }
    fmt.Printf("done...\n")
}
Is this an issue with my code? Or is the Go runtime doing something unintended?
If it's the latter, how can I go about debugging this?
To make it easier to replicate, here are pastebin files for the good case (with the portion of the code above commented out) and the bad case (with it uncommented):
wget http://pastebin.com/raw/QfG22xXk -O badcase.go
yes "1234567890" | go run badcase.go
wget http://pastebin.com/raw/G9xS2fKy -O goodcase.go
yes "1234567890" | go run goodcase.go
Thank you Volker for your above comments. I wanted to capture the process of debugging this as an answer.
RES in top/htop just tells you, at the process level, what is going on with memory. GODEBUG="gctrace=1" gives you more insight into how the memory is being handled.
A simple run with gctrace set gives the following:
root@localhost ~ # yes "12345678901234567890123456789012" | GODEBUG="gctrace=1" go run badcase.go
starting...
allocating
initializing mystruct arr1...
initializing mystruct arr2...
gc 1 @0.050s 0%: 0.19+0.23+0.068 ms clock, 0.58+0.016/0.16/0.25+0.20 ms cpu, 7629->7629->7629 MB, 7630 MB goal, 8 P
done initializing ...
gc 2 @0.100s 0%: 0.070+2515+0.23 ms clock, 0.49+0.025/0.096/0.24+1.6finished allocating..1000000000 1000000000
ms cpu, 15258->15258reading from stdin...
->15258 MB, 15259read 0 lines...
MB goal, 8 P
gc 3 @2.620s 0%: 0.009+0.32+0.23 ms clock, 0.072+0/0.20/0.11+1.8 ms cpu, 15259->15259->15258 MB, 30517 MB goal, 8 P
read 1000000 lines...
read 2000000 lines...
read 3000000 lines...
read 4000000 lines...
....
read 51000000 lines...
read 52000000 lines...
read 53000000 lines...
read 54000000 lines...
What does this mean?
As you can see, the GC hasn't been called for a while. This means that all the garbage generated by reader.ReadString hasn't been collected and freed.
Why isn't the garbage collector collecting this garbage?
From the Go GC documentation:
Instead we provide a single knob, called GOGC. This value controls
the total size of the heap relative to the size of reachable objects.
The default value of 100 means that total heap size is now 100% bigger
than (i.e., twice) the size of the reachable objects after the last
collection.
Since GOGC wasn't set, the default was 100%. So it would have collected the garbage only when the heap reached ~32 GB (since the two arrays initially give you ~16 GB of heap, the GC only triggers once the heap doubles).
How can I change this?
Try setting GOGC=25.
With GOGC set to 25:
root@localhost ~ # yes "12345678901234567890123456789012" | GODEBUG="gctrace=1" GOGC=25 go run badcase.go
starting...
allocating
initializing mystruct arr1...
initializing mystruct arr2...
gc 1 @0.051s 0%: 0.14+0.30+0.11 ms clock, 0.42+0.016/0.31/0.094+0.35 ms cpu, 7629->7629->7629 MB, 7630 MB goal, 8 P
done initializing ...
finished allocating..1000000000 1000000000
gc 2 @0.102s reading from stdin...
12%: 0.058+2480+0.26 ms clock, 0.40+0.022/2480/0.10+1.8 ms cpu, 15258->15258->15258 MB, 15259 MB goal, 8 P
read 0 lines...
gc 3 @2.584s 12%: 0.009+0.20+0.22 ms clock, 0.075+0/0.24/0.046+1.8 ms cpu, 15259->15259->15258 MB, 19073 MB goal, 8 P
read 1000000 lines...
read 2000000 lines...
read 3000000 lines...
read 4000000 lines...
....
read 19000000 lines...
read 20000000 lines...
gc 4 @6.539s 4%: 0.019+2.3+0.23 ms clock, 0.15+0/2.1/12+1.8 ms cpu, 17166->17166->15258 MB, 19073 MB goal, 8 P
As you can see, another gc was triggered.
But top/htop shows it stable at ~20 GB instead of the calculated 16 GB.
The garbage collector doesn't "have" to give memory back to the OS; it will sometimes keep it to use efficiently in the future. It doesn't have to keep taking from the OS and giving back: the extra 4 GB is in its pool of free space, ready to use before asking the OS again.
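As a side note (my addition, not part of the original answer), the same tuning can also be done from inside the program with the runtime/debug package instead of the GOGC environment variable. A minimal sketch:

package main

import (
    "fmt"
    "runtime/debug"
)

func main() {
    // Equivalent to running the program with GOGC=25: trigger a collection
    // once the heap grows 25% beyond the live set of the previous cycle.
    old := debug.SetGCPercent(25)
    fmt.Printf("GC target changed from %d%% to 25%%\n", old)

    // ... allocate the big slices and read stdin as in badcase.go ...

    // Optionally force a collection and return as much memory to the OS as
    // possible right away, rather than waiting for the runtime to do it lazily.
    debug.FreeOSMemory()
}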

insert numerical sequence in large text file

I need to create a file in this format:
item0000001
item0000002
item0000003
item0000004
item0000005
I was doing this with UltraEdit, which has a column mode including insert number (start + increment, with leading zeros).
Unfortunately, UltraEdit bombs out above 1 million rows.
Does anyone know of a text editor with large file capacity that has a similar operation?
BaltoStar has not written which version of UltraEdit was used or how exactly he tried to create the file.
However, here is an UltraEdit script which can be used to create a file with lines containing an incrementing number, padded with leading zeros according to the last number.
To use this script with UltraEdit v14.20 or UEStudio v9.00 or any later version, copy the code block of the script and paste it into a new ASCII file with DOS line terminations in UltraEdit/UEStudio. Save the file, for example as CreateLinesWithIncrementingNumber.js, into your preferred directory of UE/UES scripts.
Now run the script by clicking on the menu item Run Active Script in the Scripting menu.
The script prompts the user for the first and last value of the incrementing number, and for the strings left and right of the incrementing number, both of which can also be empty.
Then lean back and watch how the script writes the lines with the incrementing number into a new file in blocks. I created a file of more than 150 MB with an incrementing number from 0 to 5,000,000 within a few seconds using this UltraEdit script.
if (typeof(UltraEdit.clipboardContent) == "string")
{
    // Write in blocks of not more than 4 MB into the file. Do not increase
    // this value too much as during the script execution much more free
    // RAM in a continuous block is necessary than the value used here for
    // joining the lines in user clipboard 9. A too large value results
    // in a memory exception during script execution and the user of the
    // script also does not see for several seconds what is going on.
    var nBlockSize = 4194304;

    // Create a new file and make sure it uses DOS/Windows line terminations
    // independent on the user configuration for line endings of new files.
    UltraEdit.newFile();
    UltraEdit.activeDocument.unixMacToDos();
    var sLineTerm = "\r\n"; // Type of line termination is DOS/Windows.

    // Ask user of script for the first value to write into the file.
    do
    {
        var nFirstNumber = UltraEdit.getValue("Please enter first value of incrementing number:",1);
        if (nFirstNumber < 0)
        {
            UltraEdit.messageBox("Sorry, but first value cannot be negative.");
        }
    }
    while (nFirstNumber < 0);

    // Ask user of script for the last value to write into the file.
    do
    {
        var nLastNumber = UltraEdit.getValue("Please enter last value of incrementing number:",1);
        if (nFirstNumber >= nLastNumber)
        {
            UltraEdit.messageBox("Sorry, but last value must be greater than "+nFirstNumber.toString(10)+".");
        }
    }
    while (nFirstNumber >= nLastNumber);

    var sBeforeNumber = UltraEdit.getString("Please enter string left of the incrementing number:",1);
    var sAfterNumber = UltraEdit.getString("Please enter string right of the incrementing number:",1);

    // http://stackoverflow.com/questions/16378849/ultraedit-how-do-i-pad-out-a-string-with-leading-blanks
    // Convert the highest number to a decimal string and get a copy
    // of this string with every character replaced by character '0'.
    // With last number being 39428 the created string is "00000".
    var sLeadingZeros = nLastNumber.toString(10).replace(/./g,"0");

    // Instead of writing the string with the incrementing number line
    // by line to file which would be very slow and which would create
    // lots of undo records, the lines are collected first in an array of
    // strings whereby the number of strings in the array is determined
    // by value of variable nBlockSize. The lines in the array are
    // concatenated into user clipboard 9 and written as block to the
    // file using paste command. That is much faster and produces just
    // a few undo records even on very large files.
    UltraEdit.selectClipboard(9);
    UltraEdit.clearClipboard();

    // Calculate number of lines per block which depends on the
    // lengths of the 4 strings which build a line in the file.
    var nLineLength = sBeforeNumber.length + sLeadingZeros.length +
                      sAfterNumber.length + sLineTerm.length;
    var nRemainder = nBlockSize % nLineLength;
    var nLinesPerBlock = (nBlockSize - nRemainder) / nLineLength;

    var asLines = [];
    var nCurrentNumber = nFirstNumber;

    while (nLastNumber >= nCurrentNumber)
    {
        // Convert integer number to decimal string.
        var sNumber = nCurrentNumber.toString(10);
        // Has the decimal string of the current number less
        // characters than the decimal string of the last number?
        if (sNumber.length < sLeadingZeros.length)
        {
            // Build decimal string new with X zeros from the alignment string
            // and concatenate this leading zero string with the number string.
            sNumber = sLeadingZeros.substr(0,sLeadingZeros.length-sNumber.length) + sNumber;
        }
        asLines.push(sBeforeNumber + sNumber + sAfterNumber);
        if (asLines.length >= nLinesPerBlock)
        {
            asLines.push(""); // Results in a line termination at block end.
            UltraEdit.clipboardContent = asLines.join(sLineTerm);
            UltraEdit.activeDocument.paste();
            UltraEdit.clearClipboard();
            asLines = [];
        }
        nCurrentNumber++;
    }

    // Output also the last block.
    if (asLines.length)
    {
        asLines.push("");
        UltraEdit.clipboardContent = asLines.join(sLineTerm);
        UltraEdit.activeDocument.paste();
        UltraEdit.clearClipboard();
    }

    // Reselect the system clipboard and move caret to top of new file.
    UltraEdit.selectClipboard(0);
    UltraEdit.activeDocument.top();
}
else if (UltraEdit.messageBox)
{
    UltraEdit.messageBox("Sorry, but you need a newer version of UltraEdit/UEStudio for this script.");
}
else
{
    UltraEdit.newFile();
    UltraEdit.activeDocument.write("Sorry, but you need a newer version of UltraEdit/UEStudio for this script.");
}
