I found the following code sample in BlackBerry Java Development, Best Practices. Could somebody explain what the sample code below means? What is the this in the code sample pointing to?
Avoiding StringBuffer.append (StringBuffer)
To append a String buffer to another, a BlackBerry® Java Application should use net.rim.device.api.util.StringUtilities.append( StringBuffer dst, StringBuffer src[, int offset, int length ] ).
Code sample
public synchronized StringBuffer append(Object obj) {
if (obj instanceof StringBuffer) {
StringBuffer sb = (StringBuffer)obj;
net.rim.device.api.util.StringUtilities.append( this, sb, 0, sb.length() );
return this;
}
return append(String.valueOf(obj));
}
StringBuffer does not offer an overload for the append() method that takes another StringBuffer. This means developers are likely to use StringBuffer.append(String str) and call .toString() on the second StringBuffer. This requires the second buffer to be turned into a string, which is immutable, and then the characters from the string are appended to the first StringBuffer. Thus every character in the second buffer is touched twice, and there is the unnecessary allocation of the String just to transfer the characters to the first StringBuffer.
The efficient way of doing this would copy each character from the second buffer onto the end of the first. However, StringBuffer does not provide any easy way of doing this. Thus the recommendation is to use StringUtilities.append(StringBuffer, StringBuffer) which is able to directly read the characters from the second buffer without copying them into an intermediate collection.
This saves the runtime of the extra copying, the runtime needed to allocate a temporary String, and the memory needed to allocate a temporary string.
It means that the StringBuffer class is not implemented efficiently for this case. Java Strings are immutable, which is why StringBuffer exists for building text incrementally. However, the StringBuffer class you're using has no efficient way to append another StringBuffer, so you need to use net.rim.device.api.util.StringUtilities instead. That's what the code is doing: it encapsulates the use of that class in a new append() method. Inside that method, this is the StringBuffer that append() was called on, i.e. the destination buffer that the characters of sb are appended to.
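To make the difference concrete, here is a minimal sketch in standard Java. It does not use the BlackBerry API (net.rim.device.api.util.StringUtilities is only available on that platform); the class name AppendDemo and the helpers appendViaString and appendDirect are hypothetical and only illustrate the idea of moving characters from one buffer to the other without building a temporary String.

```java
public class AppendDemo {
    // The pattern the best-practice note warns about: src is first turned
    // into a temporary String, which is then copied into dst.
    static StringBuffer appendViaString(StringBuffer dst, StringBuffer src) {
        return dst.append(src.toString());
    }

    // Hypothetical helper: reads each character straight out of src and
    // appends it to dst, so no intermediate String is allocated.
    static StringBuffer appendDirect(StringBuffer dst, StringBuffer src) {
        int n = src.length();                  // read the length once
        dst.ensureCapacity(dst.length() + n);  // grow dst at most once
        for (int i = 0; i < n; i++) {
            dst.append(src.charAt(i));
        }
        return dst;
    }

    public static void main(String[] args) {
        StringBuffer viaString = appendViaString(new StringBuffer("Hello, "), new StringBuffer("world"));
        StringBuffer direct = appendDirect(new StringBuffer("Hello, "), new StringBuffer("world"));
        // Both approaches produce the same text; they differ only in the work done.
        System.out.println(viaString.toString().equals(direct.toString()));
    }
}
```

Both helpers give the same result; the second one simply avoids the temporary String. A platform routine such as StringUtilities.append can presumably go further and copy the characters in bulk, which is why the note recommends it.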
I want to read the contents of a file piece by piece through an interface (instead of reading the whole file at once with readAsBytes()). openRead() seems to do the trick, but the chunks it delivers are of type List<int>. I expected them to be Uint8List, because I want to do block operations on some of the contents.
If you convert the returned List<int> to Uint8List, it seems to make a copy of the contents, which is a big loss in efficiency.
Is this how it was designed?
Historically Dart used List<int> for sequences of bytes before a more specific Uint8List class was added. A Uint8List is a subtype of List<int>, and in most cases where a Dart SDK function returns a List<int> for a list of bytes, it's actually a Uint8List object. You therefore usually can just cast the result:
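import 'dart:io';
import 'dart:typed_data';

Future<void> main() async {
  var file = File('/path/to/some/file');
  var stream = file.openRead();
  await for (var chunk in stream) {
    // In practice the chunk is already a Uint8List, so the cast does not copy.
    var bytes = chunk as Uint8List;
  }
}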
If you are uncomfortable relying on the cast, you can create a helper function that falls back to creating a copy if and only if necessary.
There have been efforts to change the Dart SDK function signatures to use Uint8List types explicitly, and that has happened in some cases (e.g. File.readAsBytes). Such changes would be breaking API changes, so they cannot be done lightly. I don't know why File.openRead was not changed, but it's quite likely that the amount of breakage was deemed to be not worth the effort. (At a minimum, the SDK documentation should be updated to indicate whether it is guaranteed to return a Uint8List object. Also see https://github.com/dart-lang/sdk/issues/39947)
Alternatively, instead of using File.openRead, you could use File.open and then use RandomAccessFile.read, which is declared to return a Uint8List.
In C, before using the scanf or gets "stdio.h" functions to get and store user input, the programmer has to manually allocate memory for the data that's read to be stored in. In Rust, the std::io::Stdin.read_line function can seemingly be used without the programmer having to manually allocate memory prior. All it needs is for there to be a mutable String variable to store the data it reads in. How does it do this seemingly without knowledge about how much memory will be required?
Well, if you want a detailed explanation, you can dig a bit into the read_line method, which is part of the BufRead trait. Heavily simplified, the function looks like this:
// Simplified: the real code also handles I/O errors and end of input,
// validates UTF-8, and marks the bytes it has used as consumed.
fn read_line(&mut self, target: &mut String) {
    loop {
        // fill_buf() fills the internal buffer of the reader (here stdin)
        // and returns a slice reference to whatever part of the buffer was filled.
        // That buffer is actually what you need to allocate in advance in C.
        let available = self.fill_buf();
        match memchr(b'\n', available) {
            Some(i) => {
                // A '\n' was found, we can extend the string and return.
                target.push_str(&available[..=i]);
                return;
            }
            None => {
                // No '\n' found, we just have to extend the string.
                target.push_str(available);
            },
        }
    }
}
So basically, that method extends the string as long as it does not find a \n character in stdin.
If you want to allocate a bit of memory in advance for the String that you pass to read_line, you can create it using String::with_capacity. This will not prevent the String from reallocating if it is not large enough, though.
I have my own class (nBuffer) like wxMemoryBuffer and I use it to load/save custom data, it's more convenient than using streams because I have a lot of overloaded methods for different data types based on these:
class nBuffer
{ // ...
bool wr(void* buf, long unsigned int length);// write
bool rd(void* buf, long unsigned int length);// read
};
I'm trying to implement methods to load/save a wxString from/to this buffer.
With wxWidgets 2.8 I used the following code (simplified):
bool nBuffer::wrString(wxString s)
{ // save string:
int32 lng=s.Length()*4;
wr(&lng,4);// length
wr(s.GetData(),lng);// string itself
return true;
}
bool nBuffer::rdString(wxString &s)
{ // load string:
uint32 lng;
rd(&lng,4);// length
s.Alloc(lng);
rd(s.GetWriteBuf(lng),lng);// string itself
s.UngetWriteBuf();
s=s.Left(lng/4);
return true;
}
This code is not good because:
It assumes there are 4 bytes of data for each string character (it might be less),
With wxWidgets 3.0, wxString.GetData() returns wxCStrData instead of void*, so the compiler fails on wr(s.GetData(),lng); and I have no idea how to convert it to a simple byte buffer.
Strange, but I found nothing googling that for hours... Also I've found nothing useful in wxWidgets docs.
The questions are:
What is the preferred, correct and safe way to convert wxString to a byte buffer,
The same about converting the byte buffer back to wxString.
For arbitrary wxStrings you need to serialize them in either UTF-8 or UTF-16 format. The former is a de facto standard for data exchange, so I advise using it, but you could prefer UTF-16 if you know that your data is biased towards the sort of characters that take less space in it than in UTF-8 and if saving space is important to you.
Assuming you use UTF-8, serializing is done using utf8_str() method:
wxScopedCharBuffer const utf8 = s.utf8_str();
wr(utf8.data(), utf8.length());
Deserializing is as simple as using wxString::FromUTF8(data, length).
For UTF-16 you would use general mb_str(wxMBConvUTF16) and wxString(data, wxMBConvUTF16, length) methods, which could also be used with wxMBConvUTF8, but the UTF-8-specific methods above are more convenient and, in some build configurations, more efficient.
I'm using ANTLR4 to create a parse tree for my grammar, what I want to do is modify certain nodes in the tree. This will include removing certain nodes and inserting new ones. The purpose behind this is optimization for the language I am writing. I have yet to find a solution to this problem. What would be the best way to go about this?
While there is currently no real support or tools for tree rewriting, it is very possible to do. It's not even that painful.
The ParseTreeListener or your MyBaseListener can be used with a ParseTreeWalker to walk your parse tree.
From here, you can remove nodes with ParserRuleContext.removeLastChild(), however when doing this, you have to watch out for ParseTreeWalker.walk:
public void walk(ParseTreeListener listener, ParseTree t) {
if ( t instanceof ErrorNode) {
listener.visitErrorNode((ErrorNode)t);
return;
}
else if ( t instanceof TerminalNode) {
listener.visitTerminal((TerminalNode)t);
return;
}
RuleNode r = (RuleNode)t;
enterRule(listener, r);
int n = r.getChildCount();
for (int i = 0; i<n; i++) {
walk(listener, r.getChild(i));
}
exitRule(listener, r);
}
You must replace removed nodes with something if the walker has already visited the parents of those nodes; I usually pick empty ParserRuleContext objects (this is because of the cached value of n in the method above). This prevents the ParseTreeWalker from throwing a NullPointerException.
When adding nodes, make sure to set the mutable parent on the ParserRuleContext to the new parent. Also, because of the cached n in the method above, a good strategy is to detect where the changes need to be made before the walk reaches the place where you want them to go, so the ParseTreeWalker will walk over them in the same pass (otherwise you might need multiple passes).
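The effect of the cached child count can be reproduced without ANTLR. The following is a minimal sketch in plain Java: Node, Listener, walk and run are hypothetical stand-ins written for this illustration, not ANTLR classes. The walk method only copies the shape of the loop above, with the child count read once before the loop and a getChild that returns null for an index that is out of range.

```java
import java.util.ArrayList;
import java.util.List;

public class CachedCountDemo {
    // Hypothetical stand-in for a parse tree node; not an ANTLR class.
    static class Node {
        final String name;
        final List<Node> children = new ArrayList<>();

        Node(String name, Node... kids) {
            this.name = name;
            for (Node k : kids) children.add(k);
        }

        // Returns null when the index is out of range.
        Node getChild(int i) {
            return (i >= 0 && i < children.size()) ? children.get(i) : null;
        }
    }

    interface Listener { void enter(Node n); }

    // Same shape as the walk loop: the child count is read once, up front.
    static void walk(Listener listener, Node t) {
        listener.enter(t);
        int n = t.children.size();
        for (int i = 0; i < n; i++) {
            walk(listener, t.getChild(i));
        }
    }

    // Walks root(a, b). While entering "a", the sibling "b" is either removed
    // from the already-entered parent or swapped for an empty placeholder.
    static List<String> run(boolean usePlaceholder) {
        Node root = new Node("root", new Node("a"), new Node("b"));
        List<String> visited = new ArrayList<>();
        walk(n -> {
            visited.add(n.name); // dereferences n, so a null child fails here
            if (n.name.equals("a")) {
                if (usePlaceholder) root.children.set(1, new Node("<empty>"));
                else root.children.remove(1);
            }
        }, root);
        return visited;
    }

    public static void main(String[] args) {
        System.out.println(run(true));
        try {
            run(false);
        } catch (NullPointerException e) {
            System.out.println("plain removal failed with a NullPointerException");
        }
    }
}
```

With the placeholder the walk completes and simply visits the empty node. With plain removal the parent's loop still runs for the old count, receives null for the missing child, and fails on the first dereference.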
Your pseudo code should look like this:
public void enterRewriteTarget(@NotNull MyParser.RewriteTargetContext ctx){
if(shouldRewrite(ctx)){
ArrayList<ParseTree> nodesReplaced = replaceNodes(ctx);
addChildTo(ctx, createNewParentFor(nodesReplaced));
}
}
I've used this method to write a transpiler that compiled a synchronous internal language into asynchronous javascript. It was pretty painful.
Another approach would be to write a ParseTreeVisitor that converts the tree back to a string. (This can be trivial in some cases, because you are only calling TerminalNode.getText() and concatenating the results in aggregateResult(..).)
You then add the modifications to this visitor so that the resulting string representation contains the modifications you try to achieve.
Then parse the string and you get a parse tree with the desired modifications.
This is certainly hackish in some ways, since you parse the string twice. On the other hand the solution does not rely on antlr implementation details.
I needed something similar for simple transformations. I ended up using a ParseTreeWalker and a custom ...BaseListener where I overrode the enter... methods. Inside these methods, ParserRuleContext.children is available and can be manipulated.
class MyListener extends ...BaseListener {
@Override
public void enter...(...Context ctx) {
super.enter...(ctx);
ctx.children.add(...);
}
}
new ParseTreeWalker().walk(new MyListener(), parseTree);
I am looking for a Java DataOutputStream equivalent for Dart where I can write arbitrary types (int, string, float, byte array etc). There is RandomAccessFile but it does not provide byte array or float-double values. ByteArray seems to have some necessary functions but I am not sure how to write it to a file or an OutputStream.
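Here is some simple code showing how to write a ByteArray into an OutputStream (it uses the early, pre-1.0 Dart API; #import, dart:scalarlist and OutputStream no longer exist in current SDKs):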
#import('dart:io');
#import('dart:scalarlist');
main() {
File file = new File("c:\\temp\\foo.txt");
OutputStream os = file.openOutputStream();
os.onNoPendingWrites = () {
print('Finished writing. Closing.');
os.flush();
os.close();
};
Uint8List byteList = new Uint8List(64);
ByteArray byteArray = byteList.asByteArray();
int offset = 0;
offset = byteArray.setUint8(offset, 72);
offset = byteArray.setUint8(offset, 101);
offset = byteArray.setUint8(offset, 108);
offset = byteArray.setUint8(offset, 108);
offset = byteArray.setUint8(offset, 111);
offset = byteArray.setUint8(offset, 0);
byteArray.setFloat32(offset, 1.0);
os.write(byteList);
}
This has been around for a while, but I searched and didn't find good DataInput/OutputStream interoperability classes. I wanted a version that works with streams, so I could process files that don't comfortably fit in RAM. So I wrote one.
It's published over at https://pub.dev/packages/jovial_misc in io_streams, or if you prefer, https://github.com/zathras/misc/tree/master/dart/jovial_misc. I made it so it interoperates with java.io.DataInputStream and java.io.DataOutputStream. Code using it looks a little like this:
import 'package:convert/convert.dart';
import 'package:jovial_misc/io_utils.dart';
void main() async {
final acc = ByteAccumulatorSink();
final out = DataOutputSink(acc);
out.writeUTF8('Hello, world.');
out.close();
final stream = Stream<List<int>>.fromIterable([acc.bytes]);
final dis = DataInputStream(stream);
print(await dis.readUTF8());
await dis.close();
}
The Stream<List<int>> would of course typically come from a socket, or File.openRead(), etc. There's also a DataInputStream variant that is synchronous and takes an Iterable, if you do have all the byte data available up front.
DataInputStream and DataOutputSink are pretty much the obvious mapping of the java.io classes. The tricky part is the buffer management, since a stream shoves data at you in List<int> instances that probably aren't lined up with the data you want. And, of course, it's necessary to do everything asynchronously.
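For comparison, this is roughly what the java.io side of such an exchange looks like. It is a minimal sketch using only JDK classes; the class name JavaSideDemo and the helpers encode and decode are made up for this illustration and are not part of jovial_misc. Whether a particular Dart method produces exactly the bytes of writeUTF is something to check against the package documentation.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

public class JavaSideDemo {
    // Writes an int, a double and a string the way java.io.DataOutputStream does.
    static byte[] encode(int i, double d, String s) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(bytes)) {
            out.writeInt(i);     // 4 bytes, big-endian
            out.writeDouble(d);  // 8 bytes, big-endian
            out.writeUTF(s);     // 2-byte length prefix, then modified UTF-8
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    // Reads the same three values back, in the order they were written.
    static String decode(byte[] data) {
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(data))) {
            int i = in.readInt();
            double d = in.readDouble();
            String s = in.readUTF();
            return i + " " + d + " " + s;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        byte[] data = encode(42, 1.5, "Hello, world.");
        System.out.println(data.length + " bytes: " + decode(data));
    }
}
```

The reader has to know the order and types of the fields, since the format carries no type tags. That fixed, big-endian layout is what any interoperating implementation has to reproduce.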
HTH.
You are essentially asking for arbitrary object serialization. And while the Dart VM has one, it isn't exposed to programmers (it is only used for snapshotting and message passing). I'd say that it would be a mistake to expose it -- in different situations, we have different requirements for serialization and "one true solution" isn't gonna work (Java showed us that already).
For example, I'm working on a MsgPack implementation for Dart, I know that Protobuf port is also in the works, maybe someone will start a Thrift port... the possibilities are endless.
The closest thing I could find is this package: https://github.com/TomCaserta/dart_io/ . Unfortunately there is a bug when reading to the end of the byte array - see my pull request in GitHub.
You could use this class:
https://github.com/TomCaserta/dart_io/blob/master/lib/data_output.dart
Unfortunately (a) it doesn't handle streams; (b) writeLong doesn't take a single integer. I have raised an issue for the Dart SDK: https://github.com/dart-lang/sdk/issues/31166
Edit: I have forked the dart_io package and fixed the two problems described above. My new package is published as dart_data_io:
https://github.com/markmclaren2/dart_data_io