I'm trying to add a few extra features to my ejabberd mod_muc_room, but jlib:now_to_utc_string doesn't seem to accept Unix timestamps and requires them to be in Erlang's built-in format. Trying to use "1519633372486003" instead of "{1519,633372,486003}" makes mod_muc_room crash.
I found several ways to convert an Erlang timestamp into a Unix timestamp, but I can't find a way to do the reverse conversion.
Is there a way to do that without converting integer to binary and binary to tuple before concatenating the numbers together and converting them back into numbers?
You can use div and rem to extract the three values:
1> M = 1000000.
1000000
2> T = 1519633372486003.
1519633372486003
3> {T div M div M, T div M rem M, T rem M}.
{1519,633372,486003}
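The same arithmetic can be checked in both directions. Below is a small sketch in Python (used here only for illustration; the function names to_now_tuple and from_now_tuple are made up for this example) that splits a microsecond Unix timestamp into the (MegaSecs, Secs, MicroSecs) triple and rebuilds the integer from it:

```python
MILLION = 1_000_000

def to_now_tuple(micros):
    """Split microseconds since the epoch into (megasecs, secs, microsecs)."""
    total_secs, microsecs = divmod(micros, MILLION)   # like T div M, T rem M
    megasecs, secs = divmod(total_secs, MILLION)      # split the seconds again
    return (megasecs, secs, microsecs)

def from_now_tuple(now):
    """Rebuild the microsecond timestamp from the triple."""
    megasecs, secs, microsecs = now
    return (megasecs * MILLION + secs) * MILLION + microsecs

print(to_now_tuple(1519633372486003))  # (1519, 633372, 486003)
```

The round trip from_now_tuple(to_now_tuple(T)) gives back T for any non-negative integer T, which is a quick way to convince yourself the div/rem expression is right.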
This code is an excerpt from this book.
count_characters(Str) ->
    count_characters(Str, #{}).

count_characters([H|T], #{ H => N }=X) ->
    count_characters(T, X#{ H := N+1 });
count_characters([H|T], X) ->
    count_characters(T, X#{ H => 1 });
count_characters([], X) ->
    X.
So,
1> count_characters("hello").
#{101=>1,104=>1,108=>2,111=>1}
What I understand from this is that count_characters() takes the argument "hello" and passes it to the first function, i.e. count_characters(Str).
What I don't understand is how the characters of the string are converted into ASCII values without using $, and how the counts get incremented. I am very new to Erlang and would really appreciate it if you could help me understand the above. Thank you.
In Erlang, the string literal "hello" is just a more convenient way of writing the list [104,101,108,108,111]. The string syntax is syntactic sugar; Erlang has no separate string type internally. A string is stored as a list of integers, one character code per element.
This also becomes confusing when printing lists where the values happen to be within the ascii range:
io:format("~p~n", [[65,66]]).
will print
"AB"
even if you didn't expect a string as a result.
As said previously, there is no string data type in Erlang; a string is represented internally as a list of integers, so
"hello" == [$h,$e,$l,$l,$o] == [104|[101|[108|[108|[111|[]]]]]]
Each of these is a valid way of writing the same integer list.
To count the characters, the function uses a newer Erlang data type: the map (available only since R17).
A map is a collection of key/value pairs; in your case the keys are the characters and the values are the number of occurrences of each character.
The function is first called with an empty map: count_characters(Str, #{}).
Then it goes recursively through the list, and for each head H, two cases are possible:
The character H was already found: the current map X matches the pattern #{ H => N }, telling us that H has already been seen N times, so we continue the recursion with the rest of the list and a new map where the value associated with H is now N+1: count_characters(T, X#{ H := N+1 }).
The character H is found for the first time: we continue the recursion with the rest of the list and a new map where the key/value pair H/1 is added: count_characters(T, X#{ H => 1 }).
When the end of the list is reached, simply return the map: count_characters([], X) -> X.
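For comparison, the two cases can be sketched with a loop and a dictionary in Python (an illustration only, not the book's code). Note that Python needs an explicit ord() call to get the character code, whereas in Erlang the string already is a list of integers, which is why no $ appears in the original:

```python
def count_characters(s):
    """Count occurrences of each character code, like the Erlang map version."""
    counts = {}
    for ch in s:
        code = ord(ch)                       # Erlang strings are already integers
        if code in counts:
            counts[code] = counts[code] + 1  # key exists: like X#{ H := N+1 }
        else:
            counts[code] = 1                 # new key: like X#{ H => 1 }
    return counts

print(count_characters("hello"))  # {104: 1, 101: 1, 108: 2, 111: 1}
```

The result holds the same pairs as the Erlang output; only the printing order differs, because the Erlang shell prints small maps sorted by key while a Python dict keeps insertion order.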
I am trying to create a file descriptor using the command:
$ MAHOUT_HOME/core/target/mahout-core--job.jar org.apache.mahout.classifier.df.tools.Describe -p testdata/KDDTrain+.arff -f testdata/KDDTrain+.info -d N 3 C 2 N C 4 N C 8 N 2 C 19 N L
from the link:
https://mahout.apache.org/users/classification/partial-implementation.html on my data file, but whatever file I use and however I change the attribute string N 3 C 2 N C 4 N C 8 N 2 C 19 N L,
I get the following exception:
Exception in thread "main" java.lang.IllegalArgumentException: Wrong number of attributes in the string
Please help!
There are a couple of reasons for which you might get an error like that...
Wrong descriptor: I am listing this for the sake of completeness, as you have probably already checked it. The descriptor you gave may not match the data. Re-check the number and type of the columns and make sure the descriptor lists them correctly.
Bad separator: Re-check the delimiter used in the data. Some records may contain a misplaced delimiter, which changes the number of fields on those lines.
Special characters: In my few experiments, I have noticed that Mahout has trouble with certain special characters, or with data containing characters from languages other than English (unless, of course, you tweak the code). So make sure you have a way of handling them, and you should be good to go.
All of this effort is just so you can create a descriptor for the data. All the best.
Old question, but I have a more specific answer that I discovered after landing here with the same problem.
In this particular case, the problem I found was that the format of data file (from http://nsl.cs.unb.ca/NSL-KDD/) seems to have changed from the example as listed on the Mahout Random Forest example page.
The example lists a line format with the specifier
N 3 C 2 N C 4 N C 8 N 2 C 19 N L
but there's an extra element at the end of the lines; for example:
13,tcp,telnet,SF,118,2425,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,1,1,0.00,0.00,0.00,0.00,1.00,0.00,0.00,26,10,0.38,0.12,0.04,0.00,0.00,0.00,0.12,0.30,guess_passwd,2
which has one more field. Adding another numeric field (N) to the end of the specifier fixed the error for me:
N 3 C 2 N C 4 N C 8 N 2 C 19 N L N
I also had luck using just the plain .txt file format instead of the .arff file format.
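A quick way to check a descriptor against a data line before running Describe is to expand it and compare the counts. The following Python sketch assumes, as I understand the Mahout page, that a number in the descriptor repeats the attribute type that follows it; expand_descriptor is a made-up helper and not part of Mahout:

```python
def expand_descriptor(descriptor):
    """Expand e.g. 'N 3 C L' into ['N', 'C', 'C', 'C', 'L']."""
    expanded = []
    repeat = 1
    for token in descriptor.split():
        if token.isdigit():
            repeat = int(token)              # count applies to the next type
        else:
            expanded.extend([token] * repeat)
            repeat = 1
    return expanded

line = ("13,tcp,telnet,SF,118,2425,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,1,1,"
        "0.00,0.00,0.00,0.00,1.00,0.00,0.00,26,10,0.38,0.12,0.04,0.00,"
        "0.00,0.00,0.12,0.30,guess_passwd,2")
old_descriptor = "N 3 C 2 N C 4 N C 8 N 2 C 19 N L"

print(len(line.split(",")))                            # 43 fields in the data
print(len(expand_descriptor(old_descriptor)))          # 42 attributes described
print(len(expand_descriptor(old_descriptor + " N")))   # 43 with the extra N
```

The mismatch of 43 fields against 42 described attributes is what the "Wrong number of attributes in the string" exception is complaining about.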
I have many text files of this format
....
<snip>
'FOP' 0.19 1 24 1 25 7 8 /
'FOP' 0.18 1 24 1 25 9 11 /
/
TURX
560231
300244
70029
200250
645257
800191
900333
600334
770291
300335
220287
110262 /
SUBTRACT
'TURX' 'TURY'/
</snip>
......
where the portions I snipped off contain various other data in various formats. The file format is inconsistent (machine generated); the only thing one is assured of is the keyword TURX, which may appear more than once. If it appears alone on one line, then the next few lines will contain numbers that I need to fetch into an array. The last number will be followed by a space and then a forward slash (/). I can then use this array in other operations afterwards.
How do I "search" or parse a file of unknown format in Fortran, and how do I get a loop to fetch the rest of the data, please? I am really new to this and I HAVE to use Fortran. Thanks.
Fortran 95 / 2003 have a lot of string and file handling features that make this easier.
For example, this code fragment processes a file of unknown length:
use iso_fortran_env

character (len=100) :: line
integer :: ReadCode

ReadLoop: do
   read (75, '(A)', iostat=ReadCode ) line
   if ( ReadCode /= 0 ) then
      if ( ReadCode == iostat_end ) then
         exit ReadLoop
      else
         write ( *, '( / "Error reading file: ", I0 )' ) ReadCode
         stop
      end if
   end if
   ! code to process the line ....
end do ReadLoop
Then the "process the line" code can contain several sections depending on a logical variable "Have_TURX". If Have_TURX is false you are "seeking": test whether the line contains "TURX". You could use a plain "==" if TURX is always at the start of the string, or for more generality you could use the intrinsic function "index" to test whether the string "line" contains TURX.
Once the program is in the mode where Have_TURX is true, you use "internal I/O" to read the numeric value from the string. Since the integers have varying lengths and are left-justified, the easiest way is to use "list-directed I/O". Combining these:
read (line, *) integer_variable
Then you could use the intrinsic function "index" again to test whether the string also contains a slash, in which case you set Have_TURX to false and end reading mode.
If you need to put the numbers into an array, it might be necessary to read the file twice, or to backspace the file, because you will have to allocate the array, and you can't do that until you know the size of the array. Or you could pop the numbers into a linked list, then when you hit the slash allocate the array and fill it from the linked list. Or if there is a known maximum number of values you could use a temporary array, then transfer the numbers to an allocatable output array. This assumes that you want the output argument of the subroutine to be an allocatable array of the correct length, and that it returns one group of numbers per call:
integer, dimension (:), allocatable, intent (out) :: numbers
allocate (numbers (1: HowMany) )
P.S. There is a brief summary of the language features at http://en.wikipedia.org/wiki/Fortran_95_language_features and the gfortran manual has a summary of the intrinsic procedures, from which you can see what built in functions are available for string handling.
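The seek/read logic described above is independent of the language, so here is a compact sketch of the state machine in Python, purely to show the control flow before you write it in Fortran. The function name read_turx_blocks and the sample lines are made up for this illustration; "current is None" plays the role of Have_TURX being false:

```python
def read_turx_blocks(lines):
    """Return one list of integers per TURX block found in `lines`."""
    blocks = []
    current = None                    # None: seeking; a list: reading numbers
    for raw in lines:
        line = raw.strip()
        if current is None:
            if line == "TURX":        # keyword alone on its line
                current = []
            continue
        before_slash, slash, _ = line.partition("/")
        current.extend(int(tok) for tok in before_slash.split())
        if slash:                     # a slash ends the block
            blocks.append(current)
            current = None
    return blocks

sample = ["'FOP' 0.18 1 24 1 25 9 11 /", "/", "TURX", "560231", "300244",
          "110262 /", "SUBTRACT", "'TURX' 'TURY'/"]
print(read_turx_blocks(sample))  # [[560231, 300244, 110262]]
```

In this sketch a block that is never closed by a slash is dropped, and a non-numeric token inside a block raises an error; decide what you want in those cases before porting it.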
I'll give you a nudge in the right direction so that you can finish your project.
Some basics:
Do/While, as you'll need some sort of loop structure to loop through the file and then over the numbers. There's no for loop in Fortran, so use this type.
Read, to read the strings.
To start you need something like this:
program readlines
  implicit none
  character (len=30) :: rdline
  integer,dimension(1000) :: array
  ! This sets up a character string of length 30 and an integer array with 1000 elements
  !
  open(18,file='fileread.txt')
  do
    read(18,*) rdline
    if (trim(rdline).eq.'TURX') exit !loop until the trimmed line matches TURX
  end do
See this thread for a way to turn your strings into integers.
Final edit: Looks like MSB has got most of what I just found out. The iostat argument of the read is the key to it. See this site for a sample program.
Here was my final way around it.
PROGRAM fetchnumbers
  implicit none
  character (len=50) :: line, numdata
  logical :: is_numeric
  integer :: I, iost, iost2, counter=0, number
  integer, dimension(1000) :: numbers   !assumes at most 1000 numbers in total
  open(20,file='inputfile.txt')   !assuming file is in the same location as program
  ReadLoop: do
    read(20,'(A)',iostat=iost) line   !read data line by line
    if (iost /= 0) exit   !end of file (or read error) before TURX was found
    if (len_trim(line)==0) cycle ReadLoop   !ignore empty lines
    if (adjustl(line) == 'TURX') then   !TURX alone on a line: begin capturing
      GetNumbers: do
        read(20,*,iostat=iost2) numdata   !read in the numbers one by one
        if (iost2 /= 0) exit   !end of file reached while fetching numbers
        if (.NOT.is_numeric(numdata)) exit   !no more numbers to read
        if (counter == size(numbers)) exit   !array is full
        read (numdata,*) number   !read string value into a number
        counter = counter + 1
        numbers(counter) = number   !storing data into array
      end do GetNumbers
    end if
  end do ReadLoop
  write(*,*) "Numbers are:"
  do I=1,counter
    write(*,'(I14)') numbers(I)
  end do
END PROGRAM fetchnumbers

FUNCTION is_numeric(string)
  IMPLICIT NONE
  CHARACTER(len=*), INTENT(IN) :: string
  LOGICAL :: is_numeric
  REAL :: x
  INTEGER :: e
  is_numeric = .FALSE.
  READ(string,*,IOSTAT=e) x   !succeeds only if the string starts with a number
  IF (e == 0) is_numeric = .TRUE.
END FUNCTION is_numeric
I have a list M containing a binary in which the prefix 34= is always present; the rest may have any number of digits but will always be an integer.
M = [<<"34=21">>]
When I run this command I get an answer like
hd([X || <<"34=", X/binary >> <- M])
Answer -> <<"21">>
How can I get this to be an integer with the most care taken to make it as efficient as possible?
[<<"34=",X/binary>>] = M,
list_to_integer(binary_to_list(X)).
That yields the integer 21.
As of R16B, the BIF binary_to_integer/1 can be used:
OTP-10300
Added four new bifs, erlang:binary_to_integer/1,2,
erlang:integer_to_binary/1, erlang:binary_to_float/1 and
erlang:float_to_binary/1,2. These bifs work similarly to how
their list counterparts work, except they operate on
binaries. In most cases converting from and to binaries is
faster than converting from and to lists.
These bifs are auto-imported into erlang source files and can
therefore be used without the erlang prefix.
So that would look like:
[<<"34=",X/binary>>] = M,
binary_to_integer(X).
A single ASCII digit character N can be converted to its numeric value with N-48 (48 is the character code of $0). For multi-digit numbers you can fold over the binary, multiplying each digit by the power of ten given by its position:
-spec to_int(binary()) -> integer().
to_int(Bin) when is_binary(Bin) ->
    to_int(Bin, {size(Bin), 0}).

to_int(_, {0, Acc}) ->
    erlang:trunc(Acc);
to_int(<<N/integer, Tail/binary>>, {Pos, Acc}) when N >= 48, N =< 57 ->
    to_int(Tail, {Pos-1, Acc + ((N-48) * math:pow(10, Pos-1))}).
In my tests this was around 100 times slower than the list_to_integer(binary_to_list(X)) option.
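A variation on the same idea is to accumulate from left to right, multiplying the running total by 10 at each step (Horner's scheme). That keeps everything in integer arithmetic, so no floating-point power function and no final truncation are needed. Here is a minimal sketch in Python rather than Erlang, for illustration only; it is not the code from the answer above and makes no claim about relative speed:

```python
def to_int(data):
    """Convert a bytes object of ASCII digits to an integer, digit by digit."""
    acc = 0
    for byte in data:                 # iterating over bytes yields integers
        if not 48 <= byte <= 57:      # same guard as N >= 48, N =< 57
            raise ValueError("not an ASCII digit: %r" % byte)
        acc = acc * 10 + (byte - 48)  # shift previous digits left, add new one
    return acc

print(to_int(b"21"))  # 21
```

Because the accumulator stays an integer, this also works for values too large to be represented exactly as a float. In this sketch an empty input returns 0; reject it explicitly if that is not what you want.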
By definition the integer division returns the quotient.
Why does 4613.9145 div 100. give an error ("bad argument")?
For div the arguments need to be integers. / accepts arbitrary numbers as arguments, including floats. So for your example, the following would work:
1> 4613.9145 / 100.
46.139145
To contrast the difference, try:
2> 10 / 10.
1.0
3> 10 div 10.
1
Documentation: http://www.erlang.org/doc/reference_manual/expressions.html
Update: Integer division, sometimes denoted \, can be defined as:
a \ b = floor(a / b)
So you'll need a floor function, which wasn't in the standard lib when this was written (floor/1 and ceil/1 were added as built-in functions in OTP 20).
% intdiv.erl
-module(intdiv).
-export([floor/1, idiv/2]).

floor(X) when X < 0 ->
    T = trunc(X),
    case X - T == 0 of
        true -> T;
        false -> T - 1
    end;
floor(X) ->
    trunc(X).

idiv(A, B) ->
    floor(A / B).
Usage:
$ erl
...
Eshell V5.7.5 (abort with ^G)
> c(intdiv).
{ok,intdiv}
> intdiv:idiv(4613.9145, 100).
46
Integer division in Erlang, div, is defined to take two integers as input and return an integer. The link you give in an earlier comment, http://mathworld.wolfram.com/IntegerDivision.html, only uses integers in its examples, so it is not really useful in this discussion. Using trunc and round will allow you to use any arguments you wish.
I don't know quite what you mean by "definition." Language designers are free to define operators however they wish. In Erlang, they have defined div to accept only integer arguments.
If it is the design decisions of Erlang's creators that you are interested in knowing, you could email them. Also, if you are curious enough to sift through the (remarkably short) grammar, you can find it here. Best luck!
Not sure what you're looking for, @Bertaud. Regardless of how it's defined elsewhere, Erlang's div only works on integers. You can convert the arguments to integers before calling div:
trunc(4613.9145) div 100.
or you can use / instead of div and convert the quotient to an integer afterward:
trunc(4613.9145 / 100).
And trunc may or may not be what you want: you may want round, or floor or ceiling (which were not in Erlang's standard library at the time, but aren't hard to define yourself, as miku did with floor above). That's part of the reason Erlang doesn't assume something and do the conversion for you. But in any case, if you want an integer quotient from two non-integers in Erlang, you have to have some sort of explicit conversion step somewhere.
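To see why the choice matters, compare the conversions on a positive and a negative quotient. This is a small Python sketch in which math.trunc, math.floor, math.ceil and the built-in round stand in for the corresponding Erlang functions; the helper name conversions is made up for this example:

```python
import math

def conversions(a, b):
    """Return the float quotient a / b converted to an integer four ways."""
    q = a / b
    return {"trunc": math.trunc(q), "floor": math.floor(q),
            "round": round(q), "ceil": math.ceil(q)}

print(conversions(4613.9145, 100))
# {'trunc': 46, 'floor': 46, 'round': 46, 'ceil': 47}
print(conversions(-4613.9145, 100))
# {'trunc': -46, 'floor': -47, 'round': -46, 'ceil': -46}
```

For positive quotients trunc and floor agree, but for negative ones they differ by one. One caveat about the stand-ins: Python's round sends exact halves to the nearest even integer, while Erlang's round rounds them away from zero, so the two only match when the quotient is not exactly halfway.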