test_1:
addi $sp, $sp, -4
sw $ra, 0($sp)
jal test_2
lw $ra, 0($sp)
addi $sp, $sp, 4
jr $ra
test_2:
addi $sp, $sp, -4
sw $ra, 4($sp)
jal test_3
lw $ra, 4($sp)
addi $sp, $sp, 4
jr $ra
Error: Error in : invalid program counter value: 0x00000000. Go: execution terminated with errors.
I don't understand why I get this error. My understanding is that test_1 makes space for one word on the stack and saves its $ra. After the jal test_2 instruction we jump to test_2, where test_2 makes space for the second $ra (-4 - 4 = -8).
What am I doing wrong?
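(Editorial note, hedged: the offset mismatch in test_2 looks like the culprit. After test_2 decrements $sp by 4, the address 4($sp) is the word test_1 just used, so the sw overwrites test_1's saved $ra; test_1 later reloads a corrupted return address, and repeated bad returns eventually jump to 0x00000000. A minimal corrected sketch of test_2, saving into its own newly allocated slot:)
test_2:
addi $sp, $sp, -4
sw $ra, 0($sp) # store into the word just allocated, not test_1's slot
jal test_3
lw $ra, 0($sp)
addi $sp, $sp, 4
jr $ra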
Sorry, I don't know how to formulate this question correctly, but I will try with a small example.
I'm searching for a GAS directive that adds a specific number of padding instructions (NOPs) up to the next label. The following example is an extract of the vector table for the AArch64 architecture. Each handler entry should be 32 instructions (128 bytes) long.
My goal:
_vec_tbl_cur_sp0:
_vec_tbl_cur_sp0_syn:
mov x0, #0x0
bl _vec_handler_syn
eret
nop
nop
nop
nop
nop
nop
nop
nop
nop
nop
nop
nop
nop
nop
nop
nop
nop
nop
nop
nop
nop
nop
nop
nop
nop
nop
nop
nop
nop
_vec_tbl_cur_sp0_irq:
Here I want GAS to fill the table with the correct number of NOPs after the ERET instruction, up to the next label _vec_tbl_cur_sp0_irq.
Is there a directive or macro in GAS to do this?
I have found the solution. The required directive is .balign.
.balign[wl] abs-expr, abs-expr, abs-expr
Pad the location counter (in the current subsection) to a particular
storage boundary. The first expression (which must be absolute) is the
alignment request in bytes. For example `.balign 8' advances the
location counter until it is a multiple of 8. If the location counter
is already a multiple of 8, no change is needed.
The second expression (also absolute) gives the fill value to be
stored in the padding bytes. It (and the comma) may be omitted. If it
is omitted, the padding bytes are normally zero. However, on some
systems, if the section is marked as containing code and the fill
value is omitted, the space is filled with no-op instructions.
The third expression is also absolute, and is also optional. If it is
present, it is the maximum number of bytes that should be skipped by
this alignment directive. If doing the alignment would require
skipping more bytes than the specified maximum, then the alignment is
not done at all. You can omit the fill value (the second argument)
entirely by simply using two commas after the required alignment; this
can be useful if you want the alignment to be filled with no-op
instructions when appropriate.
The .balignw and .balignl directives are variants of the .balign
directive. The .balignw directive treats the fill pattern as a two
byte word value. The .balignl directive treats the fill pattern as a
four byte longword value. For example, .balignw 4,0x368d will align to
a multiple of 4. If it skips two bytes, they will be filled in with
the value 0x368d (the exact placement of the bytes depends upon the
endianness of the processor). If it skips 1 or 3 bytes, the fill value
is undefined.
Quote from the GAS manual
The following works as expected:
_vec_tbl_s:
_vec_tbl_cur_sp0:
_vec_tbl_cur_sp0_syn:
mov x0, #0x0
bl _vec_handler_syn
eret
.balign 128
_vec_tbl_cur_sp0_irq:
mov x0, #0x0
bl _vec_handler_irq
eret
.balign 128
_vec_tbl_cur_sp0_fiq:
mov x0, #0x0
bl _vec_handler_fiq
eret
.balign 128
_vec_tbl_cur_sp0_err:
mov x0, #0x0
bl _vec_handler_err
eret
.balign 128
The disassembled binary:
$ aarch64-none-elf-objdump -d vector.o
vector.o: file format elf64-littleaarch64
Disassembly of section .text:
0000000000000000 <_vec_tbl_cur_sp0>:
0: d2800000 mov x0, #0x0 // #0
4: 940001ff bl 800 <_vec_handler_syn>
8: d69f03e0 eret
c: d503201f nop
10: d503201f nop
14: d503201f nop
18: d503201f nop
1c: d503201f nop
20: d503201f nop
24: d503201f nop
28: d503201f nop
2c: d503201f nop
30: d503201f nop
34: d503201f nop
38: d503201f nop
3c: d503201f nop
40: d503201f nop
44: d503201f nop
48: d503201f nop
4c: d503201f nop
50: d503201f nop
54: d503201f nop
58: d503201f nop
5c: d503201f nop
60: d503201f nop
64: d503201f nop
68: d503201f nop
6c: d503201f nop
70: d503201f nop
74: d503201f nop
78: d503201f nop
7c: d503201f nop
0000000000000080 <_vec_tbl_cur_sp0_irq>:
80: d2800000 mov x0, #0x0 // #0
84: 940001e1 bl 808 <_vec_handler_irq>
88: d69f03e0 eret
8c: d503201f nop
90: d503201f nop
94: d503201f nop
98: d503201f nop
9c: d503201f nop
a0: d503201f nop
a4: d503201f nop
a8: d503201f nop
ac: d503201f nop
b0: d503201f nop
b4: d503201f nop
b8: d503201f nop
bc: d503201f nop
c0: d503201f nop
c4: d503201f nop
c8: d503201f nop
cc: d503201f nop
d0: d503201f nop
d4: d503201f nop
d8: d503201f nop
dc: d503201f nop
e0: d503201f nop
e4: d503201f nop
e8: d503201f nop
ec: d503201f nop
f0: d503201f nop
f4: d503201f nop
f8: d503201f nop
fc: d503201f nop
0000000000000100 <_vec_tbl_cur_sp0_fiq>:
100: d2800000 mov x0, #0x0 // #0
104: 940001c3 bl 810 <_vec_handler_fiq>
108: d69f03e0 eret
10c: d503201f nop
110: d503201f nop
114: d503201f nop
118: d503201f nop
11c: d503201f nop
120: d503201f nop
124: d503201f nop
128: d503201f nop
12c: d503201f nop
130: d503201f nop
134: d503201f nop
138: d503201f nop
13c: d503201f nop
140: d503201f nop
144: d503201f nop
148: d503201f nop
14c: d503201f nop
150: d503201f nop
154: d503201f nop
158: d503201f nop
15c: d503201f nop
160: d503201f nop
164: d503201f nop
168: d503201f nop
16c: d503201f nop
170: d503201f nop
174: d503201f nop
178: d503201f nop
17c: d503201f nop
0000000000000180 <_vec_tbl_cur_sp0_err>:
180: d2800000 mov x0, #0x0 // #0
184: 940001a5 bl 818 <_vec_handler_err>
188: d69f03e0 eret
18c: d503201f nop
190: d503201f nop
194: d503201f nop
198: d503201f nop
19c: d503201f nop
1a0: d503201f nop
1a4: d503201f nop
1a8: d503201f nop
1ac: d503201f nop
1b0: d503201f nop
1b4: d503201f nop
1b8: d503201f nop
1bc: d503201f nop
1c0: d503201f nop
1c4: d503201f nop
1c8: d503201f nop
1cc: d503201f nop
1d0: d503201f nop
1d4: d503201f nop
1d8: d503201f nop
1dc: d503201f nop
1e0: d503201f nop
1e4: d503201f nop
1e8: d503201f nop
1ec: d503201f nop
1f0: d503201f nop
1f4: d503201f nop
1f8: d503201f nop
1fc: d503201f nop
Sorry, asked too quickly.
Trying to parse this file with Ruby CSV.
https://www.sec.gov/files/data/broker-dealers/company-information-about-active-broker-dealers/bd070219.txt
However, I am getting an error.
CSV.open(file_name, "r", { :col_sep => "\t", :row_sep => "\n\r" }).each do |row|
  puts row
end
CSV::MalformedCSVError: New line must be <"\n\r"> not <"\r"> in line 1.
Windows row_sep is "\r\n", not "\n\r". However, this CSV is malformed: looking at it in a hex editor, it appears to use "\r\r\n".
It's tab-delimited.
In addition, it does not use proper quoting; line 247 contains 600 "B" STREET STE. 2204, so you need to turn off the quote character.
quote_char: nil, col_sep: "\t", row_sep: "\r\r\n"
There's an extra tab on the end, each line ends with \t\r\r\n. You can also look at it as using a row_sep of "\r\n" with an extra \r field.
quote_char: nil, col_sep: "\t", row_sep: "\r\n"
Or you can view it as having a row_sep of \t\r\r\n and no extra field.
quote_char: nil, col_sep: "\t", row_sep: "\t\r\r\n"
Either way, it's a mess.
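Putting those options together, a minimal sketch of the first interpretation (this assumes a csv gem recent enough to accept quote_char: nil, i.e. csv 3.x / Ruby 2.6+):
CSV.open(file_name, "r", col_sep: "\t", row_sep: "\r\r\n", quote_char: nil).each do |row|
  puts row.inspect # note the extra trailing field produced by the final tab
end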
I used a hex editor to look at the file as text and raw data side by side. This let me see what's truly at the end of the line.
87654321 0011 2233 4455 6677 8899 aabb ccdd eeff 0123456789abcdef
00000000: 3030 3030 3030 3139 3034 0941 4252 4148 0000001904.ABRAH
00000010: 414d 2053 4543 5552 4954 4945 5320 434f AM SECURITIES CO
00000020: 5250 4f52 4154 494f 4e09 3030 3832 3934 RPORATION.008294
00000030: 3532 0933 3732 3420 3437 5448 2053 5452 52.3724 47TH STR
00000040: 4545 5420 4354 2e20 4e57 0920 0947 4947 EET CT. NW. .GIG
00000050: 2048 4152 424f 5209 5741 0939 3833 3335 HARBOR.WA.98335
00000060: 090d 0d0a 3030 3030 3030 3233 3033 0950 ....0000002303.P
^^^^^^^^^
Hex 09 0d 0d 0a is \t\r\r\n.
Alternatively, you can print the lines with p and any invisible characters will be revealed.
f = File.open(file_name)
p f.readline
"0000001904\tABRAHAM SECURITIES CORPORATION\t00829452\t3724 47TH STREET CT. NW\t \tGIG HARBOR\tWA\t98335\t\r\r\n"
Use :row_sep => :auto instead of :row_sep => "\n\r":
CSV.open(file_name, "r", { :col_sep => "\t", :row_sep => :auto }).each do |row|
  puts row
end
I have a Google Cloud Dataflow job with which I would like to extract named entities from text using a specific spaCy model, neural coref.
Running the extraction without Beam I can extract entities, but when I try to run it with the DirectRunner the job fails due to a serialisation error from msgpack. I am not sure how to proceed in debugging this problem.
My requirements file is quite barebones:
apache-beam[gcp]==2.4
spacy==2.0.12
ujson==1.35
The issue might be related to how spaCy and Beam interact, as the stacktrace shows spaCy dumping out some of its internal functions, which it shouldn't be doing (see the weird spaCy log behaviour at the top of the full stacktrace below).
My current hypothesis is that there is some problem with my setup.py, but I am not sure what is causing the issue.
The full stacktrace is:
/Users/chris/coref_entity_extraction/venv/lib/python2.7/site-packages/msgpack_numpy.py:183: DeprecationWarning: encoding is deprecated, Use raw=False instead.
return _unpackb(packed, **kwargs)
/Users/chris/coref_entity_extraction/venv/lib/python2.7/site-packages/msgpack_numpy.py:132: DeprecationWarning: encoding is deprecated.
use_bin_type=use_bin_type)
T4: <class 'entity.extract_entities.EntityExtraction'>
# T4
D2: <dict object at 0x1126c0398>
T4: <class 'spacy.lang.en.English'>
# T4
D2: <dict object at 0x1126b54b0>
D2: <dict object at 0x1126d1168>
F2: <function is_alpha at 0x11266d320>
# F2
F2: <function is_ascii at 0x112327c08>
# F2
F2: <function is_digit at 0x11266d398>
# F2
F2: <function is_lower at 0x11266d410>
# F2
F2: <function is_punct at 0x112327b90>
# F2
F2: <function is_space at 0x11266d488>
# F2
F2: <function is_title at 0x11266d500>
# F2
F2: <function is_upper at 0x11266d578>
# F2
F2: <function like_url at 0x11266d050>
# F2
F2: <function like_num at 0x110d55140>
# F2
F2: <function like_email at 0x112327f50>
# F2
Fu: <functools.partial object at 0x11266c628>
F2: <function _create_ftype at 0x1070af500>
# F2
T1: <type 'functools.partial'>
F2: <function _load_type at 0x1070af398>
# F2
# T1
F2: <function is_stop at 0x11266d5f0>
# F2
D2: <dict object at 0x1126b7168>
T4: <type 'set'>
# T4
# D2
# Fu
F2: <function is_oov at 0x11266d668>
# F2
F2: <function is_bracket at 0x112327cf8>
# F2
F2: <function is_quote at 0x112327d70>
# F2
F2: <function is_left_punct at 0x112327de8>
# F2
F2: <function is_right_punct at 0x112327e60>
# F2
F2: <function is_currency at 0x112327ed8>
# F2
Fu: <functools.partial object at 0x110d49ba8>
F2: <function _get_attr_unless_lookup at 0x1106e26e0>
# F2
F2: <function lower at 0x11266d140>
# F2
D2: <dict object at 0x112317c58>
# D2
D2: <dict object at 0x110e38168>
# D2
D2: <dict object at 0x112669c58>
# D2
# Fu
F2: <function word_shape at 0x11266d0c8>
# F2
F2: <function prefix at 0x11266d1b8>
# F2
F2: <function suffix at 0x11266d230>
# F2
F2: <function get_prob at 0x11266d6e0>
# F2
F2: <function cluster at 0x11266d2a8>
# F2
F2: <function _return_en at 0x11266f0c8>
# F2
# D2
B2: <built-in function unpickle_vocab>
# B2
T4: <type 'spacy.strings.StringStore'>
# T4
Traceback (most recent call last):
File "/Users/chris/.pyenv/versions/2.7.14/lib/python2.7/runpy.py", line 174, in _run_module_as_main
"__main__", fname, loader, pkg_name)
File "/Users/chris/.pyenv/versions/2.7.14/lib/python2.7/runpy.py", line 72, in _run_code
exec code in run_globals
File "/Users/chris/coref_entity_extraction/main.py", line 29, in <module>
run()
File "/Users/chris/coref_entity_extraction/main.py", line 24, in run
entities = records | 'ExtractEntities' >> beam.ParDo(EntityExtraction())
File "/Users/chris/coref_entity_extraction/venv/lib/python2.7/site-packages/apache_beam/transforms/core.py", line 784, in __init__
super(ParDo, self).__init__(fn, *args, **kwargs)
File "/Users/chris/coref_entity_extraction/venv/lib/python2.7/site-packages/apache_beam/transforms/ptransform.py", line 638, in __init__
self.fn = pickler.loads(pickler.dumps(self.fn))
File "/Users/chris/coref_entity_extraction/venv/lib/python2.7/site-packages/apache_beam/internal/pickler.py", line 204, in dumps
s = dill.dumps(o)
File "/Users/chris/coref_entity_extraction/venv/lib/python2.7/site-packages/dill/dill.py", line 259, in dumps
dump(obj, file, protocol, byref, fmode, recurse)#, strictio)
File "/Users/chris/coref_entity_extraction/venv/lib/python2.7/site-packages/dill/dill.py", line 252, in dump
pik.dump(obj)
File "/Users/chris/.pyenv/versions/2.7.14/lib/python2.7/pickle.py", line 224, in dump
self.save(obj)
File "/Users/chris/.pyenv/versions/2.7.14/lib/python2.7/pickle.py", line 331, in save
self.save_reduce(obj=obj, *rv)
File "/Users/chris/.pyenv/versions/2.7.14/lib/python2.7/pickle.py", line 425, in save_reduce
save(state)
File "/Users/chris/.pyenv/versions/2.7.14/lib/python2.7/pickle.py", line 286, in save
f(self, obj) # Call unbound method with explicit self
File "/Users/chris/coref_entity_extraction/venv/lib/python2.7/site-packages/apache_beam/internal/pickler.py", line 172, in new_save_module_dict
return old_save_module_dict(pickler, obj)
File "/Users/chris/coref_entity_extraction/venv/lib/python2.7/site-packages/dill/dill.py", line 841, in save_module_dict
StockPickler.save_dict(pickler, obj)
File "/Users/chris/.pyenv/versions/2.7.14/lib/python2.7/pickle.py", line 655, in save_dict
self._batch_setitems(obj.iteritems())
File "/Users/chris/.pyenv/versions/2.7.14/lib/python2.7/pickle.py", line 692, in _batch_setitems
save(v)
File "/Users/chris/.pyenv/versions/2.7.14/lib/python2.7/pickle.py", line 331, in save
self.save_reduce(obj=obj, *rv)
File "/Users/chris/.pyenv/versions/2.7.14/lib/python2.7/pickle.py", line 425, in save_reduce
save(state)
File "/Users/chris/.pyenv/versions/2.7.14/lib/python2.7/pickle.py", line 286, in save
f(self, obj) # Call unbound method with explicit self
File "/Users/chris/coref_entity_extraction/venv/lib/python2.7/site-packages/apache_beam/internal/pickler.py", line 172, in new_save_module_dict
return old_save_module_dict(pickler, obj)
File "/Users/chris/coref_entity_extraction/venv/lib/python2.7/site-packages/dill/dill.py", line 841, in save_module_dict
StockPickler.save_dict(pickler, obj)
File "/Users/chris/.pyenv/versions/2.7.14/lib/python2.7/pickle.py", line 655, in save_dict
self._batch_setitems(obj.iteritems())
File "/Users/chris/.pyenv/versions/2.7.14/lib/python2.7/pickle.py", line 687, in _batch_setitems
save(v)
File "/Users/chris/.pyenv/versions/2.7.14/lib/python2.7/pickle.py", line 331, in save
self.save_reduce(obj=obj, *rv)
File "/Users/chris/.pyenv/versions/2.7.14/lib/python2.7/pickle.py", line 401, in save_reduce
save(args)
File "/Users/chris/.pyenv/versions/2.7.14/lib/python2.7/pickle.py", line 286, in save
f(self, obj) # Call unbound method with explicit self
File "/Users/chris/.pyenv/versions/2.7.14/lib/python2.7/pickle.py", line 568, in save_tuple
save(element)
File "/Users/chris/.pyenv/versions/2.7.14/lib/python2.7/pickle.py", line 306, in save
rv = reduce(self.proto)
File "vectors.pyx", line 108, in spacy.vectors.Vectors.__reduce__
File "vectors.pyx", line 409, in spacy.vectors.Vectors.to_bytes
File "/Users/chris/coref_entity_extraction/venv/lib/python2.7/site-packages/spacy/util.py", line 485, in to_bytes
serialized[key] = getter()
File "vectors.pyx", line 404, in spacy.vectors.Vectors.to_bytes.serialize_weights
File "/Users/chris/coref_entity_extraction/venv/lib/python2.7/site-packages/msgpack_numpy.py", line 165, in packb
return Packer(**kwargs).pack(o)
File "msgpack/_packer.pyx", line 282, in msgpack._cmsgpack.Packer.pack
File "msgpack/_packer.pyx", line 288, in msgpack._cmsgpack.Packer.pack
File "msgpack/_packer.pyx", line 285, in msgpack._cmsgpack.Packer.pack
File "msgpack/_packer.pyx", line 232, in msgpack._cmsgpack.Packer._pack
File "msgpack/_packer.pyx", line 279, in msgpack._cmsgpack.Packer._pack
TypeError: can not serialize 'buffer' object
I have no idea how to debug this issue with Beam. To reproduce the whole issue I have set up a repo with instructions on how to set everything up: https://github.com/swartchris8/coref_barebones
Are you able to run the same code from a regular Python program (not from a Beam DoFn)?
If it runs fine standalone, check whether you are storing any non-serializable state in your Beam DoFn (or in any other function that will be serialized by Beam). Such state prevents Beam runners from serializing these functions (to be sent to workers), so it should be avoided.
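If the model is that non-serializable state (the trace suggests dill is walking the whole spacy.lang.en.English object), one common workaround is to load it lazily on the worker rather than storing it at construction time. A sketch only, not the poster's actual code: the class name is taken from the trace above, the body is hypothetical, and it relies on Beam's DoFn start_bundle lifecycle hook:

import apache_beam as beam

class EntityExtraction(beam.DoFn):
    def __init__(self):
        # Keep the constructor free of heavy objects: whatever is stored on
        # the instance is pickled when Beam ships the DoFn to workers.
        self._nlp = None

    def start_bundle(self):
        # Import and load on the worker, at bundle start, so the spaCy
        # model itself never passes through dill/pickle.
        if self._nlp is None:
            import spacy
            self._nlp = spacy.load('en')

    def process(self, element):
        doc = self._nlp(element)
        for ent in doc.ents:
            yield (ent.text, ent.label_)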
In the end I got rid of the issue by changing the installed package versions. I do think debugging the Beam setup process is quite painful; my approach was just to manually try different package permutations.
How do I print to multiple files simultaneously in Julia? Is there a cleaner way than:
for f in [open("file1.txt", "w"), open("file2.txt", "w")]
    write(f, "content")
    close(f)
end
From your question I assume that you do not mean writing in parallel (which probably would not speed things up anyway, since the operation is most likely IO bound).
Your solution has one small problem: it does not guarantee that f is closed if write throws an exception.
Here are three alternative ways to do it, each making sure the file is closed even on error:
for fname in ["file1.txt", "file2.txt"]
    open(fname, "w") do f
        write(f, "content")
    end
end

for fname in ["file1.txt", "file2.txt"]
    open(f -> write(f, "content"), fname, "w")
end

foreach(fn -> open(f -> write(f, "content"), fn, "w"),
        ["file1.txt", "file2.txt"])
They give the same result, so the choice is a matter of taste (you can derive more variations along similar lines).
All of them are based on the following method of the open function:
open(f::Function, args...; kwargs...)
Apply the function f to the result of open(args...; kwargs...)
and close the resulting file descriptor upon completion.
Observe that the processing will still be terminated if an exception is actually thrown somewhere (it is only guaranteed that the file descriptor will be closed). In order to ensure that every write operation is actually attempted you can do something like:
for fname in ["file1.txt", "file2.txt"]
    try
        open(fname, "w") do f
            write(f, "content")
        end
    catch ex
        # here decide what should happen on error
        # you might want to investigate the value of ex here
    end
end
See https://docs.julialang.org/en/latest/manual/control-flow/#The-try/catch-statement-1 for the documentation of try/catch.
If you really want to write in parallel (using multiple processes) you can do it as follows:
using Distributed
addprocs(4) # using, say, 4 worker processes
function ppwrite()
    @sync @distributed for i in 1:10
        open("file$(i).txt", "w") do f
            write(f, "content")
        end
    end
end
For comparison, the serial version would be
function swrite()
    for i in 1:10
        open("file$(i).txt", "w") do f
            write(f, "content")
        end
    end
end
On my machine (ssd + quadcore) this leads to a ~70% speedup:
julia> @btime ppwrite();
  3.586 ms (505 allocations: 25.56 KiB)

julia> @btime swrite();
  6.066 ms (90 allocations: 6.41 KiB)
However, be aware that these timings might change drastically for real content, which might have to be transferred to the different processes. Also, they probably won't scale, as IO will typically become the bottleneck at some point.
Update: larger (string) content
julia> using Distributed, Random, BenchmarkTools
julia> addprocs(4);
julia> global const content = [string(rand(1000,1000)) for _ in 1:10];
julia> function ppwrite()
           @sync @distributed for i in 1:10
               open("file$(i).txt", "w") do f
                   write(f, content[i])
               end
           end
       end
ppwrite (generic function with 1 method)
julia> function swrite()
           for i in 1:10
               open("file$(i).txt", "w") do f
                   write(f, content[i])
               end
           end
       end
swrite (generic function with 1 method)
julia> @btime swrite()
  63.024 ms (110 allocations: 6.72 KiB)

julia> @btime ppwrite()
  23.464 ms (509 allocations: 25.63 KiB) # ~2.7x speedup
Doing the same thing with string representations of larger 10000x10000 matrices (3 instead of 10) results in:
julia> @time swrite()
  7.189072 seconds (23.60 k allocations: 1.208 MiB)
julia> @time swrite()
  7.293704 seconds (37 allocations: 2.172 KiB)
julia> @time ppwrite();
 16.818494 seconds (2.53 M allocations: 127.230 MiB) # > 2x slowdown of first call
julia> @time ppwrite(); # ~30% slowdown of second call
  9.729389 seconds (556 allocations: 35.453 KiB)
Just to add a coroutine version that does IO in parallel like the multiple-process one, but also avoids the data duplication and transfer.
julia> using Distributed, Random
julia> global const content = [randstring(10^8) for _ in 1:10];
julia> function swrite()
           for i in 1:10
               open("file$(i).txt", "w") do f
                   write(f, content[i])
               end
           end
       end
swrite (generic function with 1 method)
julia> @time swrite()
  1.339323 seconds (23.68 k allocations: 1.212 MiB)
julia> @time swrite()
  1.876770 seconds (114 allocations: 6.875 KiB)
julia> function awrite()
           @sync for i in 1:10
               @async open("file$(i).txt", "w") do f
                   write(f, "content")
               end
           end
       end
awrite (generic function with 1 method)
julia> @time awrite()
  0.243275 seconds (155.80 k allocations: 7.465 MiB)
julia> @time awrite()
  0.001744 seconds (144 allocations: 14.188 KiB)
julia> addprocs(4)
4-element Array{Int64,1}:
2
3
4
5
julia> function ppwrite()
           @sync @distributed for i in 1:10
               open("file$(i).txt", "w") do f
                   write(f, "content")
               end
           end
       end
ppwrite (generic function with 1 method)
julia> @time ppwrite()
  1.806847 seconds (2.46 M allocations: 123.896 MiB, 1.74% gc time)
Task (done) @0x00007f23fa2a8010
julia> @time ppwrite()
  0.062830 seconds (5.54 k allocations: 289.161 KiB)
Task (done) @0x00007f23f8734010
If you only needed to read the files line by line, you could probably do something like this:
for (line_a, line_b) in zip(eachline("file_a.txt"), eachline("file_b.txt"))
    # do stuff
end
This works because eachline returns an iterable EachLine object with an I/O stream linked to it, so both files are read lazily, line by line.
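For instance, a made-up usage (assuming file_a.txt and file_b.txt exist) that pairs up corresponding lines:
for (line_a, line_b) in zip(eachline("file_a.txt"), eachline("file_b.txt"))
    println(line_a, " | ", line_b) # combine the paired lines however you like
end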
I have a text file t.txt and I want to calculate the sum of all the digits in it.
Example
--- t.txt ---
The rahul jumped in 2 the well. The water was cold at 1 degree Centigrade. There were 3 grip holes on the walls. The well was 17 feet deep.
--- EOF --
sum: 2 + 1 + 3 + 1 + 7 = 14
My Ruby code to calculate the sum is:
ruby -e "File.read('t.txt').split.inject(0){|mem, obj| mem += obj.to_f}"
But I am not getting any answer. Why?
str = "The rahul jumped in 2 the well. The water was cold at 1 degree Centigrade. There were 3 grip holes on the walls. The well was 17 feet deep."
To get the sum of all integers:
str.scan(/\d+/).sum(&:to_i)
# => 23
Or to get the sum of all digits, as in your example:
str.scan(/\d+?/).sum(&:to_i)
# => 14
PS: I used sum because of the Rails tag. If you are on plain Ruby (before 2.4, where Enumerable#sum was added) you can use inject instead.
Examples with inject:
str.scan(/\d/).inject(0) { |sum, a| sum + a.to_i }
# => 14
str.scan(/\d+/).inject(0) { |sum, a| sum + a.to_i }
# => 23
Your statement computes correctly; it just never prints anything. Add puts before File.read:
ruby -e "puts File.read('t.txt').split.inject(0){|mem, obj| mem += obj.to_f}"
# => 23.0
For summing single digits only:
ruby -e "puts File.read('t.txt').scan(/\d/).inject(0){|mem, obj| mem += obj.to_f}"
# => 14.0
Thanks