String Split in DXL - ibm-doors

I have a string
Ex: "We prefer questions that can be answered; not just discussed "
now i want to split this string from ";"
like
We prefer questions that can be answered
and
not just discussed
is this possible in DXL.
i am learning DXL, so i don't have any idea whether we can split or not.
Note : This is not a home work.

I'm sorry for necroing this post. Being new to DXL I spent some time with the same challenge. I noticed that the implementations available on the have different specifications of "splitting" a string. Loving the Ruby language, I missed an implementation which comes at least close to the Ruby version of String#split.
Maybe my findings will be helpful to anybody.
Here's a functional comparison of
Variant A: niol's implementation (which at a first glance, appears to be the same implementation which is usually found at Capri Soft,
Variant B: PJT's implementation,
Variant C: Brett's implementation and
Variant D: my implementation (which provides the correct functionality imo).
To eliminate structural difference, all implementations were implemented in functions, returning a Skip list or an Array.
Splitting results
Note that all implementations return different results, depending on their definition of "splitting":
string mellow yellow; delimiter ello
splitVariantA returns 1 elements: ["mellow yellow" ]
splitVariantB returns 2 elements: ["m" "llow yellow" ]
splitVariantC returns 3 elements: ["w" "w y" "" ]
splitVariantD returns 3 elements: ["m" "w y" "w" ]
string now's the time; delimiter
splitVariantA returns 3 elements: ["now's" "the" "time" ]
splitVariantB returns 2 elements: ["" "now's the time" ]
splitVariantC returns 5 elements: ["time" "the" "" "now's" "" ]
splitVariantD returns 3 elements: ["now's" "the" "time" ]
string 1,2,,3,4,,; delimiter ,
splitVariantA returns 4 elements: ["1" "2" "3" "4" ]
splitVariantB returns 2 elements: ["1" "2,,3,4,," ]
splitVariantC returns 7 elements: ["" "" "4" "3" "" "2" "" ]
splitVariantD returns 7 elements: ["1" "2" "" "3" "4" "" "" ]
Timing
Splitting the string 1,2,,3,4,, with the pattern , for 10000 times on my machine gives these timings:
splitVariantA() : 406 ms
splitVariantB() : 46 ms
splitVariantC() : 749 ms
splitVariantD() : 1077 ms
Unfortunately, my implementation D is the slowest. Surprisingly, the regular expressions implementation C is pretty fast.
Source code
// niol, modified
Array splitVariantA(string splitter, string str){
Array tokens = create(1, 1);
Buffer buf = create;
int str_index;
buf = "";
for(str_index = 0; str_index < length(str); str_index++){
if( str[str_index:str_index] == splitter ){
array_push_str(tokens, stringOf(buf));
buf = "";
}
else
buf += str[str_index:str_index];
}
array_push_str(tokens, stringOf(buf));
delete buf;
return tokens;
}
// PJT, modified
Skip splitVariantB(string s, string delimiter) {
int offset
int len
Skip skp = create
if ( findPlainText(s, delimiter, offset, len, false)) {
put(skp, 0, s[0 : offset -1])
put(skp, 1, s[offset +1 :])
}
return skp
}
// Brett, modified
Skip splitVariantC (string s, string delim) {
Skip skp = create
int i = 0
Regexp split = regexp "^(.*)" delim "(.*)$"
while (split s) {
string temp_s = s[match 1]
put(skp, i++, s[match 2])
s = temp_s
}
put(skp, i++, s[match 2])
return skp
}
Skip splitVariantD(string str, string pattern) {
if (null(pattern) || 0 == length(pattern))
pattern = " ";
if (pattern == " ")
str = stringStrip(stringSqueeze(str, ' '));
Skip result = create;
int i = 0; // index for searching in str
int j = 0; // index counter for result array
bool found = true;
while (found) {
// find pattern
int pos = 0;
int len = 0;
found = findPlainText(str[i:], pattern, pos, len, true);
if (found) {
// insert into result
put(result, j++, str[i:i+pos-1]);
i += pos + len;
}
}
// append the rest after last found pattern
put(result, j, str[i:]);
return result;
}

Quick join&split I could come up with. Seams to work okay.
int array_size(Array a){
int size = 0;
while( !null(get(a, size, 0) ) )
size++;
return size;
}
void array_push_str(Array a, string str){
int array_index = array_size(a);
put(a, str, array_index, 0);
}
string array_get_str(Array a, int index){
return (string get(a, index, 0));
}
string str_join(string joiner, Array str_array){
Buffer joined = create;
int array_index = 0;
joined += "";
for(array_index = 0; array_index < array_size(str_array); array_index++){
joined += array_get_str(str_array, array_index);
if( array_index + 1 < array_size(str_array) )
joined += joiner;
}
return stringOf(joined)
}
Array str_split(string splitter, string str){
Array tokens = create(1, 1);
Buffer buf = create;
int str_index;
buf = "";
for(str_index = 0; str_index < length(str); str_index++){
if( str[str_index:str_index] == splitter ){
array_push_str(tokens, stringOf(buf));
buf = "";
}else{
buf += str[str_index:str_index];
}
}
array_push_str(tokens, stringOf(buf));
delete buf;
return tokens;
}

If you only split the string once this is how I would do it:
string s = "We prefer questions that can be answered; not just discussed"
string sub = ";"
int offset
int len
if ( findPlainText(s, sub, offset, len, false)) {
/* the reason why I subtract one and add one is to remove the delimiter from the out put.
First print is to print the prefix and then second is the suffix.*/
print s[0 : offset -1]
print s[offset +1 :]
} else {
// no delimiter found
print "Failed to match"
}
You could also use regular expressions refer to the DXL reference manual. It would be better to use regular expressions if you want to split up the string by multiple delimiters such as str = "this ; is an;example"

ACTUALLY WORKS:
This solution will split as many times as needed, or none, if the delimiter doesn't exist in the string.
This is what I have used instead of a traditional "split" command.
It actually skips the creation of an array, and just loops through each string that would be in the array and calls "someFunction" on each of those strings.
string s = "We prefer questions that can be answered; not just discussed"
// for this example, ";" is used as the delimiter
Regexp split = regexp "^(.*);(.*)$"
// while a ";" exists in s
while (split s) {
// save the text before the last ";"
string temp_s = s[match 1]
// call someFunction on the text after the last ";"
someFunction(s[match 2])
// remove the text after the last ";" (including ";")
s = temp_s
}
// call someFunction again for the last (or only) string
someFunction(s)
Sorry for necroing an old post; I just didn't find the other answers useful.

Perhaps someone would find handy this fused solution as well. It splits string in Skip, based on delimiter, which can actually have length more then one.
Skip splitString(string s1, string delimit)
{
int offset, len
Skip splited = create
while(findPlainText(s1, delimit, offset, len, false))
{
put(splited, s1[0:offset-1], s1[0:offset-1])
s1 = s1[offset+length(delimit):length(s1)-1]
}
if(length(s1)>0)
{
put (splited, s1, s1)
}
return splited
}

I tried this out and worked out for me...
string s = "We prefer questions that can be answered,not just discussed,hiyas"
string sub = ","
int offset
int len
string s1=s
while(length(s1)>0){
if ( findPlainText(s1, sub, offset, len, false)) {
print s1[0 : offset -1]"\n"
s1= s1[offset+1:length(s1)]
}
else
{
print s1
s1=""
}
}

Here is a better implementation. This is a recursive split of the string by searching a keyword.
pragma runLim, 10000
string s = "We prefer questions that can be answered,not just discussed,hiyas;
Next Line,Var1,Nemesis;
Next Line,Var2,Nemesis1;
Next Line,Var3,Nemesis2;
New,Var4,Nemesis3;
Next Line,Var5,Nemesis4;
New,Var5,Nemesis5;"
string sub = ","
int offset
int len
string searchkey=null
string curr=s
string nxt=s
string searchline=null
string Modulename=""
string Attributename=""
string Attributevalue=""
while(findPlainText(curr,"Next Line", offset,len,false))
{
int intlen=offset
searchkey=curr[offset:length(curr)]
if(findPlainText(searchkey,"Next Line",offset,len,false))
{
curr=searchkey[offset+1:length(searchkey)]
}
if(findPlainText(searchkey,";",offset,len,false))
{
searchline=searchkey[0:offset]
}
int counter=0
while(length(searchline)>0)
{
if (findPlainText(searchline, sub, offset, len, false))
{
if(counter==0)
{
Modulename=searchline[0 : offset -1]
counter++
}
else if(counter==1)
{
Attributename=searchline[0 : offset -1]
counter++
}
searchline= searchline[offset+1:length(searchline)]
}
else
{
if(counter==2)
{
Attributevalue=searchline[0:length(searchline)-2]
counter++
}
searchline=""
}
}
print "Modulename="Modulename " Attributename=" Attributename " Attributevalue= "Attributevalue "\n"
}

Related

Check if element is the last value in fold function

I am using fold on an array which hasn't been assign to a variable and want to check whether the element is the last value. With a conventional for loop I can do this:
List<int> ints = [1, 2, 3];
int sum = 0;
for (int num in ints]) {
if (num != ints.last) {
sum = sum + num;
}
}
print(sum);
Is it possible to do this with fold instead?
int foldSum = [1, 2, 3].fold(0, (int prev, element) => prev + element);
print(foldSum);
I can't find any way of check when fold is at the last value. Note: this is a simplified example of my problem and the reason the list isn't assigned to a variable (allowing me to use .last) is because it is the result of a call to .map().
For completeness, below is the actual code (which won't obviously won't be runnable in isolation but will help illustrate my problem) I am trying to convert to use .map and .fold:
String get fieldsToSqlInsert {
String val = "";
for (Column column in columns) {
if (data.containsKey(column.name)) {
val = '$val "${data[column.name]}"';
} else {
val = "$val NULL";
}
if (column != columns.last) {
val = "$val,";
}
}
return val;
}
But it doesn't work because I don't know how to check when fold is at the final element:
String get fieldsToSqlInsert => columns
.map((column) =>
data.containsKey(column.name) ? data[column.name] : "NULL")
.fold("", (val, column) => column != columns.last ? "$val," : val);
If you simply want to exclude the last element from further calculation, you can just use take to do so:
String get fieldsToSqlInsert => columns.take(columns.length - 1)...

Dart: How to iterate digits of integer?

How to iterate digits of integer? for example sum of digits here, it works, but is any way to right way?
int sumOfDigits(int num) {
int sum = 0;
String numtostr = num.toString();
for (var i = 0; i < numtostr.length; i++) {
sum = sum + int.parse(numtostr[i]);
}
return sum;
}
If you're looking for a shorter way to do this, you can combine split, map and reduce
int sum = num.split('').map((e) => int.parse(e)).reduce((t, e) => t + e);
You can even do this:
int sum = num.split('').map(int.parse).reduce((t, e) => t + e);
Thank you #julemand101
It's fairly inefficient to create a string, then split the string, and parse the individual digits back to integers.
How about something like:
Iterable<int> digitsOf(int number) sync* {
do {
yield = number.remainder(10);
number ~/= 10;
} while (number != 0);
}
This iterates the digits of the (non-negative) number in base 10, from least significant to most significant, without allocating any strings along the way.
If you want the digits in the reverse order, you can either create a list from the iterable above and reverse it, or use a different approach:
Iterable<int> digitsHighToLow(int number) sync* {
var base = 1;
while (base * 10 < number) {
base = base * 10;
}
do {
var digit = number ~/ base;
yield digit;
number = (number - digit * base) * 10;
} while (number != 0);
}
(again, only works on non-negative numbers, you'll have to figure out what you want for negative numbers, either throw, or try negating the number, it's the same digits after all, or something else).

String Newline not displaying in Doors

I have csv file containing some data like:
374,Test Comment multiplelines \n Here's the 2nd line,Other_Data
Where 374 is the object ID from doors, then some commentary and then some other data.
I have a piece of code that reads the data from the CSV file, stores it in the appropriate variables and then writes it to the doors Object.
Module Openend_module = edit("path_to_mod", true,true,true)
Object o ;
Column c;
string attrib;
string oneLine ;
string OBJECT_ID = "";
string Comment = "";
String Other_data = "";
int offset;
string split_text(string s)
{
if (findPlainText(s, sub, offset, len, false))
{
return s[0 : offset -1]
}
else
{
return ""
}
}
Stream input = read("Path_to_Input.txt");
input >> oneLine
OBJECT_ID = split_text(oneLine)
oneLine = oneLine[offset+1:]
Comment = split_text(oneLine)
Other_data = oneLine[offset+1:]
When using print Comment the output in the DXL console is : Test Comment multiplelines \n Here's the 2nd line
for o in Opened_Module do
{
if (o."Absolute Number"""==OBJECT_ID ){
attrib = "Result_Comment " 2
o.attrib = Comment
}
}
But after writing to the doors object, the \n is not taken into consideration and the result is as follows:
I've tried putting the string inside a Buffer and using stringOf() but the escape character just disappeared.
I've also tried adding \r\n and \\n to the input csv text but still no luck
This isn't the most efficient way of handling this, but I have a relatively straightforward fix.
I would suggest adding the following:
Module Openend_module = edit("path_to_mod", true,true,true)
Object o ;
Column c;
string attrib;
string oneLine ;
string OBJECT_ID = "";
string Comment = "";
String Other_data = "";
int offset;
string split_text(string s)
{
if (findPlainText(s, sub, offset, len, false))
{
return s[0 : offset -1]
}
else
{
return ""
}
}
Stream input = read("Path_to_Input.txt");
input >> oneLine
OBJECT_ID = split_text(oneLine)
oneLine = oneLine[offset+1:]
Comment = split_text(oneLine)
Other_data = oneLine[offset+1:]
//Modification to comment string
int x
int y
while ( findPlainText ( Comment , "\\n" , x , y , false ) ) {
Comment = ( Comment [ 0 : x - 1 ] ) "\n" ( Comment [ x + 2 : ] )
}
This will run the comment string through a parser, replacing string "\n" with the char '\n'. Be aware- this will ignore any trailing spaces at the end of a line.
Let me know if that helps.

Nom Parser To Unescape String

I'm writing a Nom parser for RCS. RCS Files tend to be ISO-8859-1 encoded. One of the grammar productions is for a String. This is #-delimited and literal # symbols are escaped as ##.
#A String# -> A String
#A ## String# -> A # String
I have a working function (see end). IResult is from Nom, you either return the parsed thing, plus the rest of the unparsed input, or an Error/Incomplete. Cow is used to return a reference built on the original input slice if no unescaping was required, or an owned string if it was.
Are there any built in Nom macros that could have helped with this parse?
#[macro_use]
extern crate nom;
use std::str;
use std::borrow::Cow;
use nom::*;
/// Parse an RCS String
fn string<'a>(input: &'a[u8]) -> IResult<&'a[u8], Cow<'a, str>> {
let len = input.len();
if len < 1 {
return IResult::Incomplete(Needed::Unknown);
}
if input[0] != b'#' {
return IResult::Error(Err::Code(ErrorKind::Custom(0)));
}
// start of current chunk. Chunk is a piece of unescaped input
let mut start = 1;
// current char index in input
let mut i = start;
// FIXME only need to allocate if input turned out to need unescaping
let mut s: String = String::new();
// Was the input escaped?
let mut escaped = false;
while i < len {
// Check for end delimiter
if input[i] == b'#' {
// if there's another # then it is an escape sequence
if i + 1 < len && input[i + 1] == b'#' {
// escaped #
i += 1; // want to include the first # in the output
s.push_str(str::from_utf8(&input[start .. i]).unwrap());
start = i + 1;
escaped = true;
} else {
// end of string
let result = if escaped {
s.push_str(str::from_utf8(&input[start .. i]).unwrap());
Cow::Owned(s)
} else {
Cow::Borrowed(str::from_utf8(&input[1 .. i]).unwrap())
};
return IResult::Done(&input[i + 1 ..], result);
}
}
i += 1;
}
IResult::Incomplete(Needed::Unknown)
}
It looks like the way to use the nom library is using the macro combinators. A quick browse of the source code gives some nice examples of parsers, including parsing of strings with escape characters. This is what I came up with:
#[macro_use]
extern crate nom;
use nom::*;
named!(string< Vec<u8> >, delimited!(
tag!("#"),
fold_many0!(
alt!(
is_not!(b"#") |
map!(
complete!(tag!("##")),
|_| &b"#"[..]
)
),
Vec::new(),
|mut acc: Vec<u8>, bytes: &[u8]| {
acc.extend(bytes);
acc
}
),
tag!("#")
));
#[test]
fn it_works() {
assert_eq!(string(b"#string#"), IResult::Done(&b""[..], b"string".to_vec()));
assert_eq!(string(b"#string with ## escapes#"), IResult::Done(&b""[..], b"string with # escapes".to_vec()));
assert_eq!(string(b"#invalid string"), IResult::Incomplete(Needed::Size(16)));
}
As you can see, I simply copy the bytes into a vector using Vec::extend - you could be more sophisticated here and return a Cow byte slice if you wanted.
The escaped! macro does not appear to be of use in this case unfortunately, as it can't seem to work when the terminator is the same as the escape character (which is actually a pretty common case).

How to generate unique (short) URL folder name on the fly...like Bit.ly

I'm creating an application which will create a large number of folders on a web server, with files inside of them.
I need the folder name to be unique. I can easily do this with a GUID, but I want something more user friendly. It doesn't need to be speakable by users, but should be short and standard characters (alphas is best).
In short: i'm looking to do something like Bit.ly does with their unique names:
www.mydomain.com/ABCDEF
Is there a good reference on how to do this? My platform will be .NET/C#, but ok with any help, references, links, etc on the general concept, or any overall advice to solve this task.
Start at 1. Increment to 2, 3, 4, 5, 6, 7,
8, 9, a, b...
A, B, C...
X, Y, Z, 10, 11, 12, ... 1a, 1b,
You get the idea.
You have a synchronized global int/long "next id" and represent it in base 62 (numbers, lowercase, caps) or base 36 or something.
I'm assuming that you know how to use your web server's redirect capabilities. If you need help, just comment :).
The way I would do it would be generating a random integer (between the integer values of 'a' and 'z'); converting it into a char; appending it to a string; and repeating until we reach the needed length. If it generates a value already in the database, repeat the process. If it was unique, store it in the database with the name of the actual location and the name of the alias.
This is a bit hack-like because it assumes that 'a' through 'z' are actually in sequence in their integer values.
Best I could think of :(.
In Perl, without modules so you can translate more easly.
sub convert_to_base {
my ($n, $b) = #_;
my #digits;
while ($n) {
my $digits = $n % $b;
unshift #digits, $digit;
$n = ($n - $digit) / $b;
}
unshift #digits, 0 if !#digits;
return #digits;
}
# Whatever characters you want to use.
my #digit_set = ( '0'..'9', 'a'..'z', 'A'..'Z' );
# The id of the record in the database,
# or one more than the last id you generated.
my $id = 1;
my $converted =
join '',
map { $digit_set[$_] }
convert_to_base($id, 0+#digits_set);
I needed something similar to what you're trying to accomplish. I retooled my code to generate folders so try this. It's setup for a console app, but you can use it in a website also.
private static void genRandomFolders()
{
string basepath = "C:\\Users\\{username here}\\Desktop\\";
int count = 5;
int length = 8;
List<string> codes = new List<string>();
int total = 0;
int i = count;
Random rnd = new Random();
while (i-- > 0)
{
string code = RandomString(rnd, length);
if (!codes.Exists(delegate(string c) { return c.ToLower() == code.ToLower(); }))
{
//Create directory here
System.IO.Directory.CreateDirectory(basepath + code);
}
total++;
if (total % 100 == 0)
Console.WriteLine("Generated " + total.ToString() + " random folders...");
}
Console.WriteLine();
Console.WriteLine("Generated " + total.ToString() + " total random folders.");
}
public static string RandomString(Random r, int len)
{
//string str = "ABCDEFGHIJKLMNOPQRSTUVWXYZ1234567890"; //uppercase only
//string str = "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz1234567890"; //All
string str = "abcdefghjkmnpqrstuvwxyz123456789"; //Lowercase only
StringBuilder sb = new StringBuilder();
while ((len--) > 0)
sb.Append(str[(int)(r.NextDouble() * str.Length)]);
return sb.ToString();
}

Resources