Stream of millions of objects takes too much memory - dart

I'm generating a large number of coordinates (each made of 3 numbers) within a geographical area. However, using Streams (which should be much more memory-efficient than Lists) fills up the app's memory very quickly, according to Observatory.
I need a structure where events can go in, and be read out one by one, and when this happens, removed from the structure. As far as I understand, that is what a Stream is. When you add a value, the old one is removed.
Unfortunately, this doesn't appear to be happening. Instead, the stream just grows larger and larger, or at least something reading it does, even though all I do is call .length on the returned stream.
Here's the function that starts the Isolate that returns the stream of coordinate tiles. I'll omit the actual generator, as it's not important: it just sends a Coord to the SendPort.
static Stream<Coords<num>> _generateTilesComputer(
DownloadableRegion region,
) async* {
List<List<double>> serialiseOutline(l) => (l as List)
.cast<LatLng>()
.map((e) => [e.latitude, e.longitude])
.toList();
final port = ReceivePort();
final tilesCalc = await Isolate.spawn(
region.type == RegionType.rectangle
? rectangleTiles
: region.type == RegionType.circle
? circleTiles
: lineTiles,
{
'port': port.sendPort,
'rectOutline': region.type != RegionType.rectangle
? null
: serialiseOutline(region.points),
'circleOutline': region.type != RegionType.circle
? null
: serialiseOutline(region.points),
'lineOutline': region.type != RegionType.line
? null
: (region.points as List<List<LatLng>>)
.chunked(4)
.map((e) => e.map(serialiseOutline)),
'minZoom': region.minZoom,
'maxZoom': region.maxZoom,
'crs': region.crs,
'tileSize': region.options.tileSize,
},
);
await for (final Coords<num>? coord in port
.skip(region.start)
.take(((region.end ?? double.maxFinite) - region.start).toInt())
.cast()) {
if (coord == null) {
port.close();
tilesCalc.kill();
return;
}
yield coord;
}
}
}
How can I prevent this memory leak? Happy to add more info if needed, but the full source code can be found at https://github.com/JaffaKetchup/flutter_map_tile_caching.

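Does this help a bit? It replaces the loop at the bottom of your function. Values are read in batches of 500 (change the size if needed), and the count variable keeps track of how many values have been processed; once it reaches the limit, the port is closed and the isolate is killed. Note that dart:async has no built-in batch method on Stream, so this assumes a batch extension that groups events into lists, written by you or taken from a package.
int count = 0;
final limit = ((region.end ?? double.maxFinite) - region.start).toInt();
// `batch` is assumed to be an extension returning Stream<List<Coords<num>>>.
await for (final List<Coords<num>> batch in port
    .skip(region.start)
    .cast<Coords<num>>()
    .batch(500)) {
  if (count >= limit) {
    port.close();
    tilesCalc.kill();
    return;
  }
  count += batch.length;
  // Emit the tiles of this batch one by one.
  yield* Stream.fromIterable(batch);
}
}
}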

To keep values from piling up once they have been read, you can route them through a StreamController. A StreamController has no method for removing queued events; it releases each event as soon as it has been delivered to its listener, so on the controller side it is enough to forward every value and close the controller when the source is done.
Here's an example implementation:
static Stream<Coords<num>> _generateTilesComputer(
  DownloadableRegion region,
) async* {
  List<List<double>> serialiseOutline(l) => (l as List)
      .cast<LatLng>()
      .map((e) => [e.latitude, e.longitude])
      .toList();
  final port = ReceivePort();
  final controller = StreamController<Coords<num>>();
  final tilesCalc = await Isolate.spawn(
    region.type == RegionType.rectangle
        ? rectangleTiles
        : region.type == RegionType.circle
            ? circleTiles
            : lineTiles,
    {
      'port': port.sendPort,
      'rectOutline': region.type != RegionType.rectangle
          ? null
          : serialiseOutline(region.points),
      'circleOutline': region.type != RegionType.circle
          ? null
          : serialiseOutline(region.points),
      'lineOutline': region.type != RegionType.line
          ? null
          : (region.points as List<List<LatLng>>)
              .chunked(4)
              .map((e) => e.map(serialiseOutline)),
      'minZoom': region.minZoom,
      'maxZoom': region.maxZoom,
      'crs': region.crs,
      'tileSize': region.options.tileSize,
    },
  );
  port
      .skip(region.start)
      .take(((region.end ?? double.maxFinite) - region.start).toInt())
      .cast<Coords<num>?>()
      .listen(
    (coord) {
      if (coord == null) {
        // A null message marks the end of generation.
        controller.close();
        port.close();
        tilesCalc.kill();
        return;
      }
      // Forward the value; the controller lets go of it once it is delivered.
      controller.add(coord);
    },
    // Also clean up if the take() limit ends the source first.
    onDone: () {
      if (!controller.isClosed) controller.close();
      port.close();
      tilesCalc.kill();
    },
  );
  yield* controller.stream;
}
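One possible reason the memory keeps growing, assuming the generator isolate sends coordinates faster than the consumer reads them, is that undelivered messages queue up on the ReceivePort. Here is a minimal sketch of a pull-based alternative in which the producer sends one value per request, so at most one value is in flight. The names producer and pulledValues are hypothetical, and plain ints stand in for the Coords objects.

```dart
import 'dart:async';
import 'dart:isolate';

// Hypothetical producer: waits for a request before sending each value.
void producer(SendPort toMain) {
  final requests = ReceivePort();
  // Hand the main isolate a port it can use to ask for values.
  toMain.send(requests.sendPort);
  var next = 0;
  requests.listen((message) {
    if (message == null) {
      // A null request means the consumer is finished.
      requests.close();
      return;
    }
    toMain.send(next++);
  });
}

// Yields `count` values, requesting each one only when it is needed.
Stream<int> pulledValues(int count) async* {
  final fromProducer = ReceivePort();
  final isolate = await Isolate.spawn(producer, fromProducer.sendPort);
  final iterator = StreamIterator<dynamic>(fromProducer);
  // The first message is the producer's request port.
  await iterator.moveNext();
  final requestPort = iterator.current as SendPort;
  try {
    for (var i = 0; i < count; i++) {
      requestPort.send(true);
      await iterator.moveNext();
      yield iterator.current as int;
    }
  } finally {
    // Runs on normal completion and when the consumer cancels early.
    requestPort.send(null);
    await iterator.cancel(); // cancelling the subscription closes the port
    isolate.kill();
  }
}

Future<void> main() async {
  await for (final value in pulledValues(3)) {
    print(value);
  }
}
```

Because the consumer asks for each value, a slow reader slows the producer down instead of filling a queue. The cost is one round trip per value; requesting small chunks instead of single values would amortise that.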

Related

Dart Stream: How to merge emitted items, when they're short after eachother?

Let's assume I have a Stream<int> emitting integers in different time deltas i.e. between 5ms and 1000ms.
When the delta is <= 50ms I want to merge them. for example:
3, (delta:100) 5, (delta:27) 6, (delta:976) 3
I want to consume: 3, 11(merged using addition), 3.
Is this possible?
You can use the debounceBuffer stream transformer from the stream_transform package.
stream
.transform(debounceBuffer(const Duration(milliseconds: 50)))
.map((list) => list.fold(0, (t, e) => t + e))
You can write that easily enough yourself:
Stream<int> debounce(
    Stream<int> source, Duration limit, int Function(int a, int b) combine) async* {
  // Assigned on the first event, before it is ever read.
  late int prev;
  Stopwatch? stopwatch;
  await for (var event in source) {
    if (stopwatch == null) {
      // First event.
      prev = event;
      stopwatch = Stopwatch()..start();
    } else {
      if (stopwatch.elapsed < limit) {
        // Arrived soon after the previous event: merge.
        prev = combine(prev, event);
      } else {
        yield prev;
        prev = event;
      }
      stopwatch.reset();
    }
  }
  // If any event, yield prev.
  if (stopwatch != null) yield prev;
}
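Note that an await for loop can only emit a merged value when the next event arrives (or the source ends). If the merged value should be emitted as soon as the quiet period has passed, a Timer can do the flushing instead. This is only a sketch under that assumption; mergeClose is a made-up name, not part of any package.

```dart
import 'dart:async';

/// Merges events that arrive within [window] of the previous event and
/// emits the merged value once no further event arrived for [window].
Stream<int> mergeClose(
    Stream<int> source, Duration window, int Function(int a, int b) combine) {
  final controller = StreamController<int>();
  int? pending;
  Timer? timer;

  // Emits the value collected so far, if there is one.
  void flush() {
    final value = pending;
    if (value != null) {
      controller.add(value);
      pending = null;
    }
  }

  controller.onListen = () {
    final sub = source.listen((event) {
      // Every event restarts the quiet-period timer.
      timer?.cancel();
      pending = pending == null ? event : combine(pending!, event);
      timer = Timer(window, flush);
    }, onError: controller.addError, onDone: () {
      timer?.cancel();
      flush();
      controller.close();
    });
    controller.onCancel = () {
      timer?.cancel();
      return sub.cancel();
    };
  };
  return controller.stream;
}

Future<void> main() async {
  // These events arrive back to back, so they are merged into one value.
  final merged = await mergeClose(Stream.fromIterable([1, 2, 3]),
          const Duration(milliseconds: 50), (a, b) => a + b)
      .toList();
  print(merged);
}
```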

An exception of type 'System.OutOfMemoryException' occurred in itextsharp.dll but was not handled in user code

I am using iTextSharp to create a PDF. I have 100k records, but I am getting the following exception:
An exception of type 'System.OutOfMemoryException' occurred in itextsharp.dll but was not handled in user code
At the line:
bodyTable.AddCell(currentProperty.GetValue(lst, null).ToString());
Code is:
var doc = new Document(pageSize);
PdfWriter.GetInstance(doc, stream);
doc.Open();
//Get exportable count
int columns = 0;
Type currentType = list[0].GetType();
//PREPARE HEADER
//foreach visible columns check if current object has property
//else search in inner properties
foreach (var visibleColumn in visibleColumns)
{
if (currentType.GetProperties().FirstOrDefault(p => p.Name == visibleColumn.Key) != null)
{
columns++;
}
else
{
//check child property objects
var childProperties = currentType.GetProperties();
foreach (var prop in childProperties)
{
if (prop.PropertyType.BaseType == typeof(BaseEntity))
{
if (prop.PropertyType.GetProperties().FirstOrDefault(p => p.Name == visibleColumn.Key) != null)
{
columns++;
break;
}
}
}
}
}
//header
var headerTable = new PdfPTable(columns);
headerTable.WidthPercentage = 100f;
foreach (var visibleColumn in visibleColumns)
{
if (currentType.GetProperties().FirstOrDefault(p => p.Name == visibleColumn.Key) != null)
{
//headerTable.AddCell(prop.Name);
headerTable.AddCell(visibleColumn.Value);
}
else
{
//check child property objects
var childProperties = currentType.GetProperties();
foreach (var prop in childProperties)
{
if (prop.PropertyType.BaseType == typeof(BaseEntity))
{
if (prop.PropertyType.GetProperties().FirstOrDefault(p => p.Name == visibleColumn.Key) != null)
{
//headerTable.AddCell(prop.Name);
headerTable.AddCell(visibleColumn.Value);
break;
}
}
}
}
}
doc.Add(headerTable);
var bodyTable = new PdfPTable(columns);
bodyTable.Complete = false;
bodyTable.WidthPercentage = 100f;
//PREPARE DATA
foreach (var lst in list)
{
int col = 1;
foreach (var visibleColumn in visibleColumns)
{
var currentProperty = currentType.GetProperties().FirstOrDefault(p => p.Name == visibleColumn.Key);
if (currentProperty != null)
{
if (currentProperty.GetValue(lst, null) != null)
bodyTable.AddCell(currentProperty.GetValue(lst, null).ToString());
else
bodyTable.AddCell(string.Empty);
col++;
}
else
{
//check child property objects
var childProperties = currentType.GetProperties().Where(p => p.PropertyType.BaseType == typeof(BaseEntity));
foreach (var prop in childProperties)
{
currentProperty = prop.PropertyType.GetProperties().FirstOrDefault(p => p.Name == visibleColumn.Key);
if (currentProperty != null)
{
var currentPropertyObjectValue = prop.GetValue(lst, null);
if (currentPropertyObjectValue != null)
{
bodyTable.AddCell(currentProperty.GetValue(currentPropertyObjectValue, null).ToString());
}
else
{
bodyTable.AddCell(string.Empty);
}
break;
}
}
}
}
}
doc.Add(bodyTable);
doc.Close();
A back-of-the-envelope computation of the memory requirements, given the figures you provided, gives 100000 * 40 * (2*20+4) = 176,000,000 bytes (about 168 MiB). That is well within your memory limit, but it is just a lower bound. I imagine each Cell object is pretty big. If each cell had a 512-byte overhead, you could well be looking at 2 GB taken. I reckon it might be even more, as PDF is a complex beast.
So you might realistically be looking at a situation where you are actually running out of memory. If not your computer's, then at least the part the C# runtime has set aside for your process.
I would do one thing first: check the program's memory consumption. You might even do well to try with 10, 100, 1000, 10000, and 100000 rows and see up to what number of rows the program works.
You could perhaps try a different thing altogether. If you're trying to print a nicely formatted table with a lot of data, perhaps you could output an HTML document, which can be done incrementally by just writing to a file rather than using a third-party library. You can then "print" that HTML document to PDF; Stack Overflow has answers covering that step too.

In Firebase, how do I handle new children added after I statically loaded the latest N?

Here's my pagination/infinite scrolling scenario:
1. Load the initial N with startAt().limit(N).once('value'). Populate a list items.
2. On scroll, load the next N. (I pass a priority to startAt() but that's tangential.)
3. When a new item is added, I'd like to pop it to the top of items.
If I use a .onChildAdded listener for step 3, it finds all the items, including those I've already pulled in, thus creating duplicates. Is there a better way?
Another method would be to use the .onChildAdded listener for the initial N in step 1 instead of .once. When the initial N items come in, I do items.add(item) to append them one after the other, as they are already in order. But for a new item that comes in after the fact, I need to somehow know it's new so I can do items.insert(0, item) to force it to the top of the list. I'm not sure how to set this up, or if I'm off the mark here.
EDIT: Still in flux, see: https://groups.google.com/forum/#!topic/firebase-talk/GyYF7hfmlEM
Here's a working solution I came up with:
class FeedViewModel extends Observable {
int pageSize = 20;
@observable bool reloadingContent = false;
@observable bool reachedEnd = false;
var snapshotPriority = null;
bool isFirstRun = true;
FeedViewModel(this.app) {
loadItemsByPage();
}
/**
* Load more items pageSize at a time.
*/
loadItemsByPage() {
reloadingContent = true;
var itemsRef = f.child('/items_by_community/' + app.community.alias)
.startAt(priority: (snapshotPriority == null) ? null : snapshotPriority).limit(pageSize+1);
int count = 0;
// Get the list of items, and listen for new ones.
itemsRef.once('value').then((snapshot) {
snapshot.forEach((itemSnapshot) {
count++;
// Don't process the extra item we tacked onto pageSize in the limit() above.
print("count: $count, pageSize: $pageSize");
// Track the snapshot's priority so we can paginate from the last one.
snapshotPriority = itemSnapshot.getPriority();
if (count > pageSize) return;
// Insert each new item into the list.
// TODO: This seems weird. I do it so I can separate out the method for adding to the list.
items.add(toObservable(processItem(itemSnapshot)));
// If this is the first item loaded, start listening for new items.
// By using the item's priority, we can listen only to newer items.
if (isFirstRun == true) {
listenForNewItems(snapshotPriority);
isFirstRun = false;
}
});
// If we received less than we tried to load, we've reached the end.
if (count <= pageSize) reachedEnd = true;
reloadingContent = false;
});
// When an item changes, let's update it.
// TODO: Does pagination mean we have multiple listeners for each page? Revisit.
itemsRef.onChildChanged.listen((e) {
Map currentData = items.firstWhere((i) => i['id'] == e.snapshot.name);
Map newData = e.snapshot.val();
newData.forEach((k, v) {
if (k == "createdDate" || k == "updatedDate") v = DateTime.parse(v);
if (k == "star_count") v = (v != null) ? v : 0;
if (k == "like_count") v = (v != null) ? v : 0;
currentData[k] = v;
});
});
}
listenForNewItems(endAtPriority) {
// If this is the first item loaded, start listening for new items.
var itemsRef = f.child('/items').endAt(priority: endAtPriority);
itemsRef.onChildAdded.listen((e) {
print(e.snapshot.getPriority());
print(endAtPriority);
if (e.snapshot.getPriority() != endAtPriority) {
print(e.snapshot.val());
// Insert new items at the top of the list.
items.insert(0, toObservable(processItem(e.snapshot)));
}
});
}
void paginate() {
if (reloadingContent == false && reachedEnd == false) loadItemsByPage();
}
}
1. Load the initial N with startAt().limit(N).once('value'). Populate a list items.
2. On the first run, note the first item's priority, then start an onChildAdded listener that has an endAt() with that priority. This means it'll only listen to stuff from there and above.
3. In that listener, ignore the first event, which is the topmost item we already have, and add everything else to the top of the list.
4. Of course, on scroll, load the next N.
EDIT: Updated w/ some fixes, and including the listener for changes.
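Another way to avoid the duplicates, sketched here without any Firebase calls, is to remember which item keys have already been shown and ignore child-added events for those. The Feed class and its method names are hypothetical, and the string ids stand in for snapshot names.

```dart
/// Hypothetical feed model: ids are kept newest-first and never repeated.
class Feed {
  final List<String> ids = [];
  final Set<String> _seen = {};

  /// Appends an id from a page load (pages arrive already ordered).
  void addFromPage(String id) {
    if (_seen.add(id)) ids.add(id);
  }

  /// Handles a child-added event: unseen ids go to the top of the list.
  /// Returns false when the id was already present.
  bool addLive(String id) {
    if (!_seen.add(id)) return false;
    ids.insert(0, id);
    return true;
  }
}

void main() {
  final feed = Feed();
  ['c', 'b', 'a'].forEach(feed.addFromPage);
  feed.addLive('c'); // replayed by the listener, ignored
  feed.addLive('d'); // genuinely new, goes to the top
  print(feed.ids); // [d, c, b, a]
}
```

This trades a little memory (one set entry per loaded item) for not having to reason about priorities when deciding whether an event is new.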

async.Future async.Completer - how to "continue" if an error

Some help with the following would be appreciated. I am writing some console test programs, and I want to be able to enter some parameters from the terminal (I don't want to use command line arguments - too many parameters). I have tried some variations, but I cannot find how to accomplish this. The following is the latest version of my test for terminal input. The problem with this program is that if an error is encountered, the Completer closes automatically, and I want to continue from either the Main() or from fGetNumber() function. While I can see why this program doesn't work, it illustrates what I need to achieve - re-enter the number, but I cannot find how to achieve that. If a valid number is entered, there is no problem. If an invalid number is entered, I cannot find out how to re-enter the number.
The code is as follows, and the problem I have is highlighted by "//////////" :
import "dart:async" as async;
import "dart:io";
void main() {
fGetNumber("Enter Nr of Iterations : ", 0, 999999)
.then((int iIters){
print ("In Main : Iterations selected = ${iIters}");
if (iIters == null) {
print ("In Main: Invalid Number of iterations : ${iIters}.");
} else {
fProcessData(iIters);
}
print ("Main Completed");
});
}
async.Future<int> fGetNumber(String sPrompt, int iMin, int iMax) {
print ("In fGetNumber");
int iIters = 0;
async.Completer<int> oCompleter = new async.Completer();
while (!oCompleter.isCompleted) { /////////// This loop does not work ///////
return fGetUserInput(sPrompt).then((String sIters) {
iIters = int.parse(sIters);
if (iIters < iMin || iIters > iMax) throw new Exception("Invalid");
oCompleter.complete(iIters);
return oCompleter.future;
}).catchError((_) => print ("Invalid - number must be from ${iMin} to ${iMax}")
).whenComplete(() => print ("fGetNumber - whenComplete"));// always gets here
}
print ("In fGetNumber (at end of function)"); //// it never gets here
}
async.Future<String> fGetUserInput(String sPrompt) {
print ("In fGetUserInput");
async.Completer<String> oCompleter = new async.Completer();
stdout.write(sPrompt);
async.Stream<String> oStream = stdin.transform(new StringDecoder());
async.StreamSubscription oSub;
oSub = oStream.listen((String sData) {
oCompleter.complete("$sData");
oSub.cancel();
});
return oCompleter.future;
}
void fProcessData(int iIters) {
print ("In fProcessData");
for (int iPos = 1; iPos <= iIters; iPos++ ) {
if (iPos%100 == 0) print ("Processed = ${iPos}");
}
print ("In fProcessData - completed ${iIters}");
}
// This loop does not work
Of course it does - you enter it exactly once, where you immediately return and therefore leave the loop and method.
// always gets here
That's because whenComplete() always gets called, on success or on error.
// it never gets here
Because you already returned out of the method.
So what can be done?
The easiest way would be to not rely on fGetUserInput(). Listen to stdin in fGetNumber and only complete the completer / cancel the subscription if the input is valid:
async.Future<int> fGetNumber(String sPrompt, int iMin, int iMax) {
print ("In fGetNumber");
async.Completer<int> oCompleter = new async.Completer();
stdout.write(sPrompt);
async.Stream<String> oStream = stdin.transform(new StringDecoder());
async.StreamSubscription oSub;
oSub = oStream.listen((String sData) {
try {
int iIters = int.parse(sData);
if (iIters < iMin || iIters > iMax) throw new Exception("Invalid");
oCompleter.complete(iIters);
oSub.cancel();
} catch(e) {
print("Invalid - number must be from ${iMin} to ${iMax}");
stdout.write(sPrompt);
}
});
return oCompleter.future;
}
Are there alternatives?
Of course. There are likely many, many ways to do this. This one for example:
async.Future<int> fGetNumber(String sPrompt, int iMin, int iMax) {
print ("In fGetNumber");
async.Completer<int> oCompleter = new async.Completer();
fGetUserInput(sPrompt, oCompleter, (String sIters) {
try {
int iIters = int.parse(sIters);
if (iIters < iMin || iIters > iMax) throw new Exception("Invalid");
return iIters;
} catch(e) {
print ("Invalid - number must be from ${iMin} to ${iMax}");
stdout.write(sPrompt);
}
return null;
});
return oCompleter.future;
}
void fGetUserInput(String sPrompt, async.Completer oCompleter, dynamic inputValidator(String sData)) {
print ("In fGetUserInput");
stdout.write(sPrompt);
async.Stream<String> oStream = stdin.transform(new StringDecoder());
async.StreamSubscription oSub;
oSub = oStream.listen((String sData) {
var d = inputValidator(sData);
if(d != null) {
oCompleter.complete(d);
oSub.cancel();
}
});
}
If you really feel there should be something addressed by the Dart team, you could write a feature request. But the Completer is designed to only be completed once. Whatever code you write, you can't just loop to complete it again and again.
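Since a Completer can only complete once, the retry has to live outside it. With async/await that becomes an ordinary loop over input lines. The following is a sketch, not the code from the question: readNumber is a made-up name, and a fixed list of strings stands in for the lines typed at the terminal.

```dart
import 'dart:async';

/// Reads lines until one parses as an integer within [min]..[max].
Future<int> readNumber(StreamIterator<String> lines, int min, int max) async {
  while (await lines.moveNext()) {
    final value = int.tryParse(lines.current.trim());
    if (value != null && value >= min && value <= max) return value;
    // Invalid input: report it and wait for the next line.
    print('Invalid - number must be from $min to $max');
  }
  throw StateError('Input ended before a valid number was entered');
}

Future<void> main() async {
  // Stand-in for terminal input (hypothetical data).
  final lines = StreamIterator(Stream.fromIterable(['abc', '1000000', '42']));
  final iterations = await readNumber(lines, 0, 999999);
  print('Iterations selected = $iterations');
  await lines.cancel();
}
```

For real terminal input, the same StreamIterator could wrap stdin.transform(utf8.decoder).transform(const LineSplitter()) from dart:io and dart:convert. Keeping one iterator for the whole program also means stdin is listened to only once.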

async Future StreamSubscription Error

Could someone please explain what's wrong with the following code. I'm making two calls to the function fInputData. The first works ok, the second results in an error :
"unhandled exception"
"Bad state: Stream already has subscriber"
I need to write a test console program that inputs multiple parameters.
import "dart:async" as async;
import "dart:io";
void main() {
fInputData ("Enter Nr of Iterations : ")
.then((String sResult){
int iIters;
try {
iIters = int.parse(sResult);
if (iIters < 0) throw new Exception("Invalid");
} catch (oError) {
print ("Invalid entry");
exit(1);
}
print ("In Main : Iterations selected = ${iIters}");
fInputData("Continue Processing? (Y/N) : ") // this call bombs
.then((String sInput){
if (sInput != "y" && sInput != "Y")
exit(1);
fProcessData(iIters);
print ("Main Completed");
});
});
}
async.Future<String> fInputData(String sPrompt) {
async.Completer<String> oCompleter = new async.Completer();
stdout.write(sPrompt);
async.Stream<String> oStream = stdin.transform(new StringDecoder());
async.StreamSubscription oSub;
oSub = oStream.listen((String sInput) {
oCompleter.complete(sInput);
oSub.cancel();
});
return oCompleter.future;
}
void fProcessData(int iIters) {
print ("In fProcessData");
print ("iIters = ${iIters}");
for (int iPos = 1; iPos <= iIters; iPos++ ) {
if (iPos%100 == 0) print ("Processed = ${iPos}");
}
print ("In fProcessData - completed ${iIters}");
}
Some background reading:
Streams comes in two flavours: single or multiple (also known as
broadcast) subscriber. By default, our stream is a single-subscriber
stream. This means that if you try to listen to the stream more than
once, you will get an exception, and using any of the callback
functions or future properties counts as listening.
You can convert the single-subscriber stream into a broadcast stream
by using the asBroadcastStream() method.
So you've got two options. Either re-use a single subscription object, i.e. call listen once and keep the subscription object alive.
Or use a broadcast stream. Note that there are a number of differences between broadcast streams and single-subscriber streams; you'll need to read about those and make sure they suit your use case.
Here's an example of reusing a subscriber to ask multiple questions:
import 'dart:async';
import 'dart:io';
main() {
var console = new Console();
var loop;
loop = () => ask(console).then((_) => loop());
loop();
}
Future ask(Console console) {
print('1 + 1 = ...');
return console.readLine().then((line) {
print(line.trim() == '2' ? 'Yup!' : 'Nope :(');
});
}
class Console {
StreamSubscription<String> _subs;
Console() {
var input = stdin
.transform(new StringDecoder())
.transform(new LineTransformer());
_subs = input.listen(null);
}
Future<String> readLine() {
var completer = new Completer<String>();
_subs.onData(completer.complete);
return completer.future;
}
}
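To see the difference between the two kinds of stream in isolation, here is a small sketch that does not touch stdin. The helper name secondListenThrows is made up for this example.

```dart
import 'dart:async';

/// Listens to [stream] twice and reports whether the second listen throws.
bool secondListenThrows(Stream<int> stream) {
  final first = stream.listen((_) {});
  try {
    stream.listen((_) {}).cancel();
    return false;
  } on StateError {
    // Single-subscription streams reject a second listener.
    return true;
  } finally {
    first.cancel();
  }
}

void main() {
  final single = StreamController<int>();
  final broadcast = StreamController<int>.broadcast();
  print(secondListenThrows(single.stream)); // true
  print(secondListenThrows(broadcast.stream)); // false
  // asBroadcastStream() wraps a single-subscription stream.
  print(secondListenThrows(StreamController<int>().stream.asBroadcastStream())); // false
}
```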

Resources