Can you limit the size of data that can be deserialized in Ktor? - json-deserialization

In Ktor, is there a way to limit the size of data that can be deserialized from JSON? The context is defending against denial-of-service attacks, where a malicious client might send a huge payload to cause out-of-memory issues.
I've used a similar capability in Play before (https://www.playframework.com/documentation/2.8.x/ScalaBodyParsers#Max-content-length). You can set the maximum globally, and also specifically override on individual routes.

Unfortunately, there is no built-in functionality in Ktor to limit the size of a body for deserialization. Still, you can write an interceptor for the ApplicationReceivePipeline to replace the ByteReadChannel (the object the body bytes are read from) with another channel. The replacement channel counts the total number of bytes read and throws an exception if that count exceeds some limit. Here is an example:
import io.ktor.application.*
import io.ktor.features.*
import io.ktor.request.*
import io.ktor.routing.*
import io.ktor.serialization.*
import io.ktor.server.engine.*
import io.ktor.server.netty.*
import io.ktor.util.pipeline.*
import io.ktor.utils.io.*
import kotlinx.coroutines.GlobalScope
fun main() {
    embeddedServer(Netty, port = 2222) {
        // Run the size check just before the body is transformed (deserialized).
        val beforeSerialization = PipelinePhase("")
        receivePipeline.insertPhaseBefore(ApplicationReceivePipeline.Transform, beforeSerialization)
        receivePipeline.intercept(beforeSerialization) { receive ->
            if (subject.value is ByteReadChannel) {
                val readChannel = subject.value as ByteReadChannel
                val limit = 1024
                var totalBytes = 0
                // Copy the body into a new channel, failing once the limit is exceeded.
                val channel = GlobalScope.writer {
                    val byteArray = ByteArray(4088)
                    while (true) {
                        val bytesRead = readChannel.readAvailable(byteArray)
                        if (bytesRead < 0) break // the source channel is closed: end of body
                        totalBytes += bytesRead
                        if (totalBytes > limit) {
                            throw IllegalStateException("Limit ($limit) for receiving exceeded. Read $totalBytes.")
                        }
                        channel.writeFully(byteArray, 0, bytesRead)
                    }
                }.channel
                proceedWith(ApplicationReceiveRequest(receive.typeInfo, channel, reusableValue = true))
            }
        }
        install(ContentNegotiation) {
            json()
        }
        routing {
            post("/") {
                val r = call.receive<MyData>()
                println(r)
            }
        }
    }.start(wait = true)
}
@kotlinx.serialization.Serializable
data class MyData(val x: String)
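The counting idea itself is independent of Ktor. As a minimal, language-neutral sketch in Python (the names read_limited and BodyTooLarge are made up for illustration and are not part of any framework), the body is read in chunks and the read is aborted as soon as the running total exceeds the limit, so an oversized payload is never fully buffered:

```python
import io


class BodyTooLarge(Exception):
    """Raised when a request body exceeds the configured limit (hypothetical name)."""


def read_limited(stream, limit, chunk_size=4096):
    """Read all of `stream`, failing as soon as more than `limit` bytes were seen."""
    total = 0
    chunks = []
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:  # empty read: end of stream
            break
        total += len(chunk)
        if total > limit:
            raise BodyTooLarge(f"Limit ({limit}) for receiving exceeded. Read {total}.")
        chunks.append(chunk)
    return b"".join(chunks)


if __name__ == "__main__":
    # A small body passes through unchanged.
    print(read_limited(io.BytesIO(b'{"x": "hello"}'), limit=1024))
```

The important property is that the check runs per chunk rather than after the whole body has been read, which is also what the looped copy in the interceptor does.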

Related

How to run 2 process and exchange input/output between them multiple times?

I'm trying to do something like this:
Future<String> getOutAndAnswer(testcase) async {
Process python = await Process.start('python', ['tasks/histogram/run.py']);
Process java = await Process.start('java', ['solutions/Histogram.java']);
String results = "";
for(int i = 0; i < testcase; i++){
final String out = await python.stdout.transform(utf8.decoder).first;
java.stdin.writeln(out);
final String answer = await java.stdout.transform(utf8.decoder).first;
python.stdin.writeln(answer);
results += "($out, $answer)";
}
return results;
}
Basically, the python program is responsible for generating the input of each test case, then the java program will take the input and return the answer, which is sent to the python program to check if it's correct or not, and so on for every test case.
But when I try to use the above code I get an error saying I've already listened to the stream once:
Exception has occurred.
StateError (Bad state: Stream has already been listened to.)
Python program:
import os
CASE_DIR = os.path.join(os.path.dirname(__file__), "cases")
test_cases = next(os.walk(CASE_DIR))[2]
print(len(test_cases))
for case in sorted(test_cases):
    with open(os.path.join(CASE_DIR, case), 'r') as f:
        print(f.readline(), end='', flush=True)
        f.readline()
        expected_output = f.readline()
        user_output = input()
        if expected_output != user_output:
            raise ValueError("Wrong answer!")
print("EXIT", flush=True)
Java program:
public class Histogram {
public static void main(String[] args) {
Scanner scanner = new Scanner(System.in);
int t = scanner.nextInt();
for (int i = 0; i < t; i++) {
String input = scanner.nextLine();
String answer = calculateAnswer(input);
System.out.println(answer);
}
}
}
Your issue is with .first, which listens to the stream, gets the first element, and then immediately stops listening. A process's stdout is a single-subscription stream, so the second call to .first in the loop tries to listen again and throws. See the documentation here: https://api.dart.dev/stable/2.17.3/dart-async/Stream/first.html
You should instead listen once and define an onData method to perform the steps. See the documentation for .listen() here: https://api.dart.dev/stable/2.17.3/dart-async/Stream/listen.html
You could try wrapping the stdout streams in StreamIterator<String>. You will have to give it a try to verify, but I think this is what you are looking for.
Future<String> getOutAndAnswer(int testcase) async {
  Process python = await Process.start('python', ['tasks/histogram/run.py']);
  Process java = await Process.start('java', ['solutions/Histogram.java']);
  String results = "";
  StreamIterator<String> pythonIterator = StreamIterator(
      python.stdout.transform(utf8.decoder).transform(LineSplitter()));
  StreamIterator<String> javaIterator = StreamIterator(
      java.stdout.transform(utf8.decoder).transform(LineSplitter()));
  for (int i = 0; i < testcase; i++) {
    if (await pythonIterator.moveNext()) {
      final String out = pythonIterator.current;
      if (out == 'EXIT') {
        break;
      }
      java.stdin.writeln(out);
      if (await javaIterator.moveNext()) {
        final String answer = javaIterator.current;
        python.stdin.writeln(answer);
        results += "($out, $answer)";
      }
    }
  }
  await pythonIterator.cancel();
  await javaIterator.cancel();
  return results;
}
You may need to add the following imports:
import 'dart:async';
import 'dart:convert';

Import big set in Core data with relations in batches

I'm trying to import a large data set of around 80k objects. I'm trying to follow Apple's example.
I have two issues:
In the code example there is a comment:
// taskContext.performAndWait runs on the URLSession's delegate queue
// so it won’t block the main thread.
But in my case I'm not using URLSession to fetch the JSON; the file is bundled with the app. In this case, how can I make sure the import won't block the main thread? Should I create a custom queue? Is there an example?
The example just imports an array of entities. But in my case I need to import a single entity that has 70k objects in a to-many relationship.
So what I want to achieve is:
If there is a ContactBook don't import anything because we have already imported the JSON.
If there is no ContactBook create one and import all the 70k Contact object to the contacts relation of the ContactBook. This should happen in batches like in the example and should not block the UI.
What I have tried:
private func insertContactbookIfNeeded() {
let fetch: NSFetchRequest<ContactBook> = ContactBook.fetchRequest()
let contactBookCount = (try? context.count(for: fetch)) ?? 0
if contactBookCount > 0 {
return
}
let contacts = Bundle.main.decode([ContactJSON].self, from: "contacts.json")
// Process records in batches to avoid a high memory footprint.
let batchSize = 256
let count = contacts.count
// Determine the total number of batches.
var numBatches = count / batchSize
numBatches += count % batchSize > 0 ? 1 : 0
for batchNumber in 0 ..< numBatches {
// Determine the range for this batch.
let batchStart = batchNumber * batchSize
let batchEnd = batchStart + min(batchSize, count - batchNumber * batchSize)
let range = batchStart..<batchEnd
// Create a batch for this range from the decoded JSON.
let contactsBatch = Array(contacts[range])
// Stop the entire import if any batch is unsuccessful.
if !importOneBatch(contactsBatch) {
assertionFailure("Could not import batch number \(batchNumber) range \(range)")
return
}
}
}
private func importOneBatch(_ contactsBatch: [ContactJSON]) -> Bool {
var success = false
// Create a private queue context.
let taskContext = container.newBackgroundContext()
taskContext.mergePolicy = NSMergeByPropertyObjectTrumpMergePolicy
// NOT TRUE IN MY CASE: (Any suggestion ??)
// taskContext.performAndWait runs on the URLSession's delegate queue
// so it won’t block the main thread.
print("isMainThread: \(Thread.isMainThread)") // prints true
taskContext.performAndWait {
let fetchRequest: NSFetchRequest<ContactBook> = ContactBook.fetchRequest()
fetchRequest.returnsObjectsAsFaults = true
fetchRequest.includesSubentities = false
let contactBookCount = (try? taskContext.count(for: fetchRequest)) ?? 0
var contactBook: ContactBook?
if contactBookCount > 0 {
do {
contactBook = try taskContext.fetch(fetchRequest).first
} catch let error as NSError {
assertionFailure("can't fetch the contactBook \(error)")
}
} else {
contactBook = ContactBook(context: taskContext)
}
guard let book = contactBook else {
assertionFailure("Could not fetch the contactBook")
return
}
// Create a new record for each contact in the batch.
for contactJSON in contactsBatch {
// Create a Contact managed object on the private queue context.
let contact = Contact(context: taskContext)
// Populate the Contact's properties using the raw data.
contact.name = contactJSON.name
contact.subContacts = NSSet(array: contactJSON.subContacts.map { subC -> Contact in
let subContact = Contact(context: taskContext)
subContact.name = subC.name
return subContact
})
book.addToContacts(contact)
}
// Save all insertions and deletions from the context to the store.
if taskContext.hasChanges {
do {
try taskContext.save()
} catch {
print("Error: \(error)\nCould not save Core Data context.")
return
}
// Reset the taskContext to free the cache and lower the memory footprint.
taskContext.reset()
}
success = true
}
return success
}
The problem is that this is very slow, because in each batch I fetch the contact book (which gets bigger with each iteration) to be able to insert the new batch of contacts into it. Is there an efficient way to avoid fetching the contact book in each batch? Also, any suggestions to make this faster? Increase the batch size? Create a background queue?
Update:
I have tried creating the ContactBook once in insertContactbookIfNeeded and passing it to importOneBatch on each iteration, but I get:
Thread 1: Exception: "Illegal attempt to establish a relationship
'contactBook' between objects in different contexts

TCP input data packets are combined in Dart

I have a simple TCP client in dart:
import 'dart:io';
void main() {
const sendData = "\$I,Z,0.5,5,0*\r\n";
final socket = Socket.connect("192.168.1.100", int.parse("8008"))
.timeout(Duration(seconds: 5))
.whenComplete(() {
print("Connected!");
}).catchError((_) {
print("Error!");
});
socket.then((soc) {
soc.write(sendData);
soc.listen((data) {
print(String.fromCharCodes(data).trim());
});
});
}
This program sends a special message to server and after that, the server sends back a bunch of data every 10 ms. The output is as follows:
$I,1,250,0,206*$I,1,248,0,192*$I,1,246,0,178*$I,1,245,0,165*
$I,1,244,0,153*$I,1,244,0,141*$I,1,244,0,131*$I,1,245,0,121*
$I,1,246,0,113*$I,1,248,0,105*$I,1,250,0,98*
$I,1,253,0,92*$I,2,0,0,86*$I,2,4,0,82*$I,2,8,0,79*$I,2,12,0,76*
$I,2,18,0,74*$I,2,23,0,74*$I,2,29,0,74*$I,2,36,0,75*$I,2,42,0,77*$I,2,50,0,80*$I,2,58,0,84*$I,2,66,0,89*$I,2,74,0,94*
$I,2,83,0,101*$I,2,93,0,109*$I,2,103,0,117*$I,2,113,0,126*$I,2,124,0,136*$I,2,134,0,147*
$I,2,146,0,159*$I,2,157,0,171*$I,2,169,0,185*$I,2,182,0,199*$I,2,194,0,214*$I,2,207,0,230*$I,2,220,0,246*$I,2,233,1,8*$I,2,247,1,26*$I,3,5,1,44*$I,3,19,1,64*$I,3,33,1,84*
$I,3,48,1,105*$I,3,63,1,126*$I,3,77,1,148*$I,3,93,1,171*$I,3,108,1,194*$I,3,123,1,217*
$I,3,138,1,242*$I,3,154,2,10*$I,3,169,2,35*$I,3,185,2,61*$I,3,201,2,87*$I,3,216,2,113*$I,3,232,2,140*$I,3,248,2,167*$I,4,7,2,195*
$I,4,23,2,223*$I,4,39,2,251*$I,4,54,3,23*$I,4,70,3,51*
The server sends data in the $I,1,250,0,206* format, i.e. each packet starts with $ and ends with *. As you can see, several consecutive packets arrive concatenated in a single chunk.
Whenever I increase the interval, for example to 200 ms, everything is ok.
What should I do?
UPDATE
Besides BrettSutton's answer, which is correct, the contributors on the Dart GitHub gave a more complete answer here.
I decided to parse the packets in the Socket listen handler: split the received data and append the results to a list. As I want to show the data on a chart, I reset the list after 100 samples. However, the data could be logged as well.
soc.listen((data) {
var av = data.length;
if (av != 0) {
var stList = String.fromCharCodes(data).trim().split("\$");
stList.forEach((str) {
if (str.isNotEmpty) {
var strS = str.split(",");
if (strS != null) y = parseData(strS);
sampleList.add(y);
}
});
print(sampleList);
if (sampleList.length > 100) {
sampleList.clear();
}
print("==========");
}
});
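One thing the handler above does not cover is a packet that is cut in half between two data events, which TCP allows because it delivers a byte stream, not messages. A minimal sketch of a buffering decoder, written in Python for illustration (the class name FrameDecoder is hypothetical, and the framing is assumed to be exactly "starts with $, ends with *" as described in the question), keeps the incomplete tail until the rest arrives:

```python
class FrameDecoder:
    """Accumulates raw bytes and returns only complete `$...*` frames."""

    def __init__(self):
        self._buffer = bytearray()

    def feed(self, data: bytes):
        """Add newly received bytes and return the list of complete frames."""
        self._buffer += data
        frames = []
        while True:
            start = self._buffer.find(b"$")
            if start == -1:
                # No frame start in the buffer: drop stray bytes such as "\r\n".
                self._buffer.clear()
                break
            end = self._buffer.find(b"*", start)
            if end == -1:
                # Frame started but not finished: keep it for the next chunk.
                del self._buffer[:start]
                break
            frames.append(bytes(self._buffer[start:end + 1]).decode("ascii"))
            del self._buffer[:end + 1]
        return frames


if __name__ == "__main__":
    decoder = FrameDecoder()
    # Two chunks as they might arrive: the second frame is split across them.
    print(decoder.feed(b"$I,1,250,0,206*$I,1,24"))
    print(decoder.feed(b"8,0,192*\r\n"))
```

The same structure carries over to the Dart listen callback: append the incoming data to a buffer, extract every complete frame, and leave the remainder in the buffer.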

How can I merge multiple Streams into a higher level Stream?

I have two streams, Stream<A> and Stream<B>. I have a constructor for a type C that takes an A and a B. How do I merge the two Streams into a Stream<C>?
import 'dart:async' show Stream;
import 'package:async/async.dart' show StreamGroup;
main() async {
var s1 = stream(10);
var s2 = stream(20);
var s3 = StreamGroup.merge([s1, s2]);
await for(int val in s3) {
print(val);
}
}
Stream<int> stream(int min) async* {
int i = min;
while(i < min + 10) {
yield i++;
}
}
See also http://news.dartlang.org/2016/03/unboxing-packages-async-part-2.html
prints
10
20
11
21
12
22
13
23
14
24
15
25
16
26
17
27
18
28
19
29
You can use StreamZip in package:async to combine two streams into one stream of pairs, then create the C objects from that.
import "package:async/async.dart" show StreamZip;
...
Stream<C> createCs(Stream<A> as, Stream<B> bs) =>
new StreamZip([as, bs]).map((ab) => new C(ab[0], ab[1]));
If you need to react when either Stream<A> or Stream<B> emits an event and use the latest value from both streams, use combineLatest from package:stream_transform.
Stream<C> merge(Stream<A> streamA, Stream<B> streamB) {
return streamA
.combineLatest(streamB, (a, b) => new C(a, b));
}
If you need to combine more than two streams of different types and get all the latest values on each update of any stream, you can use combineLatestAll:
import 'package:stream_transform/stream_transform.dart';
Stream<List> combineLatest(Iterable<Stream> streams) {
final Stream<Object> first = streams.first.cast<Object>();
final List<Stream<Object>> others = [...streams.skip(1)];
return first.combineLatestAll(others);
}
The combined stream will produce:
streamA: a----b------------------c--------d---|
streamB: --1---------2-----------------|
streamC: -------&----------%---|
combined: -------b1&--b2&---b2%---c2%------d2%-|
Why not StreamZip? Because StreamZip would produce:
streamA: a----b------------------c--------d---|
streamB: --1---------2-----------------|
streamC: -------&----------%---|
combined: -------a1&-------b2%--|
Usage:
Stream<T> sA;
Stream<K> sB;
Stream<Y> sC;
combineLatest([sA, sB, sC]).map((data) {
T resA = data[0];
K resB = data[1];
Y resC = data[2];
return D(resA, resB, resC);
});
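The difference between the two diagrams can be checked with a small simulation. The sketch below is in Python and is only a model: it replays the events from the diagrams in arrival order as (stream index, value) pairs, ignores stream completion, and the function names combine_latest and zip_streams are made up for illustration.

```python
from collections import deque


def combine_latest(events, n):
    """Emit the latest value of every stream on each event, once all have emitted."""
    latest = [None] * n
    seen = [False] * n
    out = []
    for src, value in events:
        latest[src] = value
        seen[src] = True
        if all(seen):
            out.append("".join(latest))
    return out


def zip_streams(events, n):
    """Emit only when every stream has an unconsumed value, consuming one from each."""
    queues = [deque() for _ in range(n)]
    out = []
    for src, value in events:
        queues[src].append(value)
        if all(queues):
            out.append("".join(q.popleft() for q in queues))
    return out


# Events from the diagrams, in arrival order: 0 = streamA, 1 = streamB, 2 = streamC.
events = [(0, "a"), (1, "1"), (0, "b"), (2, "&"),
          (1, "2"), (2, "%"), (0, "c"), (0, "d")]

if __name__ == "__main__":
    print(combine_latest(events, 3))  # ['b1&', 'b2&', 'b2%', 'c2%', 'd2%']
    print(zip_streams(events, 3))     # ['a1&', 'b2%']
```

Zipping pairs the i-th events of every stream, so it stalls on the slowest stream, while combining the latest values emits on every update.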
To combine two streams when the second one depends on a result from the first, use asyncExpand:
Stream<UserModel?> getCurrentUserModelStream() {
return FirebaseAuth.instance.authStateChanges().asyncExpand<UserModel?>(
(currentUser) {
if (currentUser == null) {
return Stream.value(null);
}
return FirebaseFirestore.instance
.collection('users')
.doc(currentUser.uid)
.snapshots()
.map((doc) {
final userData = doc.data();
if (userData == null) {
return null;
}
return UserModel.fromJson(userData);
});
},
);
}
Using rxdart, you can use CombineLatestStream to achieve what you want. Note that the new stream doesn't emit any value until all streams have emitted at least one event.
You can create the combined stream using CombineLatestStream.list():
import 'package:rxdart/rxdart.dart';
Stream<A> s1 = Stream.fromIterable([A()]);
Stream<B> s2 = Stream.fromIterable([B()]);
Stream<dynamic> s3 = CombineLatestStream.list<dynamic>([s1, s2])
..listen(
(value) {
A a = value[0];
B b = value[1];
},
);
Since Dart doesn't support union types, a downside of CombineLatestStream.list() is that events from streams of different types must be cast afterwards (because the result is a List<dynamic>). Another approach is to use CombineLatestStream.combine2() (e.g. with a combiner that creates a Tuple2) to keep the types.

async Future StreamSubscription Error

Could someone please explain what's wrong with the following code? I'm making two calls to the function fInputData. The first works OK; the second results in an error:
"unhandled exception"
"Bad state: Stream already has subscriber"
I need to write a test console program that inputs multiple parameters.
import "dart:async" as async;
import "dart:io";
void main() {
fInputData ("Enter Nr of Iterations : ")
.then((String sResult){
int iIters;
try {
iIters = int.parse(sResult);
if (iIters < 0) throw new Exception("Invalid");
} catch (oError) {
print ("Invalid entry");
exit(1);
}
print ("In Main : Iterations selected = ${iIters}");
fInputData("Continue Processing? (Y/N) : ") // this call bombs
.then((String sInput){
if (sInput != "y" && sInput != "Y")
exit(1);
fProcessData(iIters);
print ("Main Completed");
});
});
}
async.Future<String> fInputData(String sPrompt) {
async.Completer<String> oCompleter = new async.Completer();
stdout.write(sPrompt);
async.Stream<String> oStream = stdin.transform(new StringDecoder());
async.StreamSubscription oSub;
oSub = oStream.listen((String sInput) {
oCompleter.complete(sInput);
oSub.cancel();
});
return oCompleter.future;
}
void fProcessData(int iIters) {
print ("In fProcessData");
print ("iIters = ${iIters}");
for (int iPos = 1; iPos <= iIters; iPos++ ) {
if (iPos%100 == 0) print ("Processed = ${iPos}");
}
print ("In fProcessData - completed ${iIters}");
}
Some background reading:
Streams comes in two flavours: single or multiple (also known as
broadcast) subscriber. By default, our stream is a single-subscriber
stream. This means that if you try to listen to the stream more than
once, you will get an exception, and using any of the callback
functions or future properties counts as listening.
You can convert the single-subscriber stream into a broadcast stream
by using the asBroadcastStream() method.
So you've got two options. Either reuse a single subscription object, i.e. call listen once and keep the subscription object alive.
Or use a broadcast stream. Note that there are a number of differences between broadcast streams and single-subscriber streams; you'll need to read about those and make sure they suit your use case.
Here's an example of reusing a subscriber to ask multiple questions:
import 'dart:async';
import 'dart:io';
main() {
var console = new Console();
var loop;
loop = () => ask(console).then((_) => loop());
loop();
}
Future ask(Console console) {
print('1 + 1 = ...');
return console.readLine().then((line) {
print(line.trim() == '2' ? 'Yup!' : 'Nope :(');
});
}
class Console {
StreamSubscription<String> _subs;
Console() {
var input = stdin
.transform(new StringDecoder())
.transform(new LineTransformer());
_subs = input.listen(null);
}
Future<String> readLine() {
var completer = new Completer<String>();
_subs.onData(completer.complete);
return completer.future;
}
}
