In the z3 tutorial, str.indexof and seq.indexof are mentioned separately. However, in the Z3 C API, there is only one relevant function, Z3_mk_seq_index. Does that function do special-case handling for the case of sequences of 8-bit BitVecs? Meaning, if my sequences are all strings, will Z3_mk_seq_index still use the z3str3 codebase? Or will it fall back on z3's sequence solver?
There's no difference between a sequence of 8-bit values and a string in z3. In fact, all strings are treated as sequences of 8-bit values.
The only difference between seq.indexof and str.indexof is their type. The latter is simply a specialization of the former.
To switch between the regular string solver and z3str3, simply use the appropriate command line switch smt.string_solver. Valid options are seq, z3str3, and auto.
Related
I would like to write a function to check if the input is a string or not like this:
is_string(Input) ->
case check_if_string(Input) of
true -> {ok, Input};
false -> error
end.
But I found it is tricky to check whether the input is a string in Erlang.
The string definition in Erlang is here: http://erlang.org/doc/man/string.html.
Any suggestions?
Thanks in advance.
In Erlang a string can be actually quite a few things, so there are a few ways to do this depending on exactly what you mean by "a string". It is worth bearing in mind that every sort of string in Erlang is a list of character or lexeme values of some sort.
Encodings are not simple things, particularly when Unicode is involved. Characters can be almost arbitrarily high values, lexemes are globbed together in deep lists of integers, and Erlang iolist()s (which are super useful) are deep lists of mixed integer and binary values that get automatically flattened and converted during certain operations. If you are dealing with anything other than flat lists of printable ASCII values then I strongly recommend you read these:
Unicode module docs
String module docs
IO Library module docs
So... this is not a very simple question.
What to do about all the confusion?
Quick answer that always works: Consider the origin of the data.
You should know what kind of data you are dealing with, whether it is coming over a socket or from a file, or especially if you are generating it yourself. On the edges of your system you may need some help purifying data, though, because network clients send all sorts of random trash from time to time.
Some helper functions for the most common cases live in the io_lib module:
io_lib:char_list/1: Returns true if the input is a list of characters in the unicode range.
io_lib:deep_char_list/1: Returns true if the input is a deep list of legal chars.
io_lib:deep_latin1_char_list/1: Returns true if the input is a deep list of Latin-1 (your basic printable ASCII values from 32 to 126).
io_lib:latin1_char_list/1: Returns true if the input is a flat list of Latin-1 characters (90% of the time this is what you're looking for)
io_lib:printable_latin1_list/1: Returns true if the input is a list of printable Latin-1 (If the above isn't what you wanted, 9% of the time this is the one you want)
io_lib:printable_list/1: Returns true if the input is a flat list of printable chars.
io_lib:printable_unicode_list/1: Returns true if the input is a flat list of printable unicode chars (for that 1% of the time that this is your problem -- except that for some of us, myself included here in Japan, this covers 99% of my input checking cases).
For more particular cases you can either use a regex from the re module or write your own recursive function that zips through a string for those special cases where a regex either doesn't fit, is impossible, or could make you vulnerable to regex attacks.
In erlang, string can be represented by list or binary.
If string is used as list then you can use following function to check:
is_string([C|T]) when (C >= 0) and (C =< 255) ->
is_string(T);
is_string([]) ->
true;
is_string(_) ->
false.
If string is used as binary in code then is_binary(Term) in build function can be used.
Using C++ APIs, how can you extract the decimal value of a bit-vector constant from a model.
There are several C-Functions that allow you to extract different types of values, depending on the expected size of the numerals: Z3_get_numeral_int, Z3_get_numeral_uint, Z3_get_numeral_uint64, Z3_get_numeral_int64. For numbers that don't fit into those basic types, we can use the Z3_get_numeral_string function to get a string representation that can be parsed into your preferred big-int representation.
Note that these functions are C-functions, not C++ functions, but those two APIs mix nicely. (See e.g, also z3 C++ API & ite).
There's a bunch of ways you can compare strings in modern Delphi (say 2010-XE3):
'<=' operator which resolves to UStrCmp / LStrCmp
CompareStr
AnsiCompareStr
Can someone give (or point to) a description of what those methods do, in principle?
So far I've figured that AnsiCompareStr calls CompareString on Windows, which is a "textual" comparison (i.e. takes into account unicode combined characters etc). Simple CompareStr does not do that and seems to do a binary comparison instead.
But what is the difference between CompareStr and UStrCmp? Between UStrCmp and LStrCmp? Do they all produce identical results? Do those results change between versions of Delphi?
I'm asking because I need a comparison which will always produce the same results, so that indexes in app built with one version of Delphi remain consistent with code built with another.
AnsiCompareStr is specified as taking locale into account, and should return identical results regardless of Delphi version, but may return different results based on Windows version and/or settings.. CompareStr is a pure binary comparison: "The comparison operation is based on the 16-bit ordinal value of each character and is not affected by the current locale" (for the CompareStr(const S1, S2: string) overload). UStrCmp also uses a pure binary comparison: "Strings are compared according to the ordinal values that make up the characters that make up the string." So there should not be a difference between the latter two. The way they return the result is different, so two implementations are needed (although it would be possible to make one rely on the other).
As for the differences between LStrCmp and UStrCmp, LStrCmp takes AnsiStrings, UStrCmp takes UnicodeStrings. It's entirely possible that two characters (let's say A and B) are ordered in the misnamed "ANSI" code page as A < B, but are ordered in Unicode as A > B. You should almost always just use the comparison appropriate for the data you have.
In the Erlang shell, I can do the following:
A = 300.
300
<<A:32>>.
<<0, 0, 1, 44>>
But when I try the following:
B = term_to_binary({300}).
<<131,104,1,98,0,0,1,44>>
<<B:32>>
** exception error: bad argument
<<B:64>>
** exception error: bad argument
In the first case, I'm taking an integer and using the bitstring syntax to put it into a 32-bit field. That works as expected. In the second case, I'm using the term_to_binary BIF to turn the tuple into a binary, from which I attempt to unpack certain bits using the bitstring syntax. Why does the first example work, but the second example fail? It seems like they're both doing very similar things.
The difference between in a binary and a bitstring is that the length of a binary is evenly divisible by 8, i.e. it contains no 'partial' bytes; a bitstring has no such restriction.
This difference is not your problem here.
The problem you're facing is that your syntax is wrong. If you would like to extract the first 32 bits from the binary, you need to write a complete matching statement - something like this:
<<B1:32, _/binary>> = B.
Note that the /binary is important, as it will match the remnant of the binary regardless of its length. If omitted, the matched length defaults to 8 (i.e. one byte).
You can read more about binaries and working with them in the Erlang Reference Manual's section on bit syntax.
EDIT
To your comment, <<A:32>> isn't just for integers, it's for values. Per the link I gave, the bit syntax allows you to specify many aspects of binary matching, including data types of bound variables - while the default type is integer, you can also say float or binary (among others). The :32 part indicates that 32 bits are required for a match - that may or may not be meaningful depending on your data type, but that doesn't mean it's only valid for integers. You could, for example, say <<Bits:10/bitstring>> to describe a 10-bit bitstring. Hope that helps!
The <<A:32>> syntax constructs a binary. To deconstruct a binary, you need to use it as a pattern, instead of using it as an expression.
A = 300.
% Converts a number to a binary.
B = <<A:32>>.
% Converts a binary to a number.
<<A:32>> = B.
I'm slightly confused and hoping for enlightenment.
I'm using Delphi 2010 for this project and I'm trying to compare 2 strings.
Using the code below fails
if AnsiStrIComp(PAnsiChar(sCatName), PAnsiChar(CatNode.CatName)) = 0 then...
because according to the debugger only the first character of each string is being compared (i.e. if sCatName is "Automobiles", PAnsiChar(sCatName) is "A").
I want to be able to compare strings that may be in different languages, for example English vs Japanese.
In this case I am looking for a match, but I have other functions used for sorting, etc. where I need to know how the strings compare (less than, equal, greater than).
I assume that sCatName and CatNode.CatName are defined as strings (= UnicodeStrings)?. They should be.
There is no need to convert the strings to null-terminated strings! This you (mostly) only need to do when working with the Windows API.
If you want to test equality of two strings, use SameStr(S1, S2) (case sensitive matching) or SameText(S1, S2) (case insensitive matching), or simply S1 = S2 in the first case. All three options return true or false, depending on the strings equality.
If you want to get a numerical value based on the ordinal values of the characters (as in sorting), then use CompareStr(S1, S2) or CompareText(S1, S2). These return a negative integer, zero, or a positive integer.
(You might want to use the Ansi- functions: AnsiSameStr, AnsiSameText, AnsiCompareStr, and AnsiCompareText; these functions will use the current locale. The non Ansi- functions will accept a third, optional parameter, explicitly specifying the locale to use.)
Update
Please read Remy Lebeau's comments regarding the cause of the problem.
What about simple sCatName=CatNode.CatName? If they are strings it should work.