Object does not exist in the pdf file structure - parsing

%PDF-1.5
...
10737 0 obj
<</MarkInfo<</Marked true>>/Metadata 161 0 R/PageLayout/OneColumn/Pages 10732 0 R/StructTreeRoot 206 0 R/Type/Catalog>>
endobj
10738 0 obj
<</Contents[10740 0 R 10741 0 R 10747 0 R 10748 0 R 10749 0 R 10750 0 R 10751 0 R 10752 0 R]/CropBox[0.0 0.0 516.0 728.64]/MediaBox[0.0 0.0 516.0 728.64]/Parent 10733 0 R/Resources<</ColorSpace<</CS0 10771 0 R/CS1 10772 0 R>>/ExtGState<</GS0 10773 0 R>>/Font<</C2_0 10778 0 R/C2_1 10783 0 R/C2_2 10788 0 R/C2_3 10793 0 R/C2_4 10798 0 R/TT0 10800 0 R/TT1 10802 0 R/TT2 10804 0 R/TT3 10806 0 R/TT4 10808 0 R>>/XObject<</Im0 10769 0 R>>>>/Rotate 0/StructParents 0/Tabs/S/Type/Page>>
endobj
10739 0 obj
<</Filter/FlateDecode/First 410/Length 3756/N 38/Type/ObjStm>>stream
10771 0 10772 21 10773 42 10774 138 10775 190 10776 442 10777 741 10778 752 10779 869 10780 921 10781 1190 10782 2050 10783 2061 10784 2192 10785 2244 10786 2504 10787 3456 10788 3467 10789 3587 10790 3639 10791 3903 10792 6058 10793 6069 10794 6196 10795 6248 10796 6507 10797 8153 10798 8164 10799 8284 10800 8496 10801 9662 10802 9894 10803 11072 10804 11325 10805 11779 10806 11985 10807 13147 10808 13395
[/ICCBased 10753 0 R][/ICCBased 10754 0 R]
<</AIS false/BM/Normal/CA 1.0/OP false/OPM 1/SA true/SMask/None/Type/ExtGState/ca 1.0/op false>>
<</Ordering(Identity)/Registry(Adobe)/Supplement 0>><</Ascent 858/CIDSet 10757 0 R/CapHeight 719/Descent -148/Flags 4/FontBBox[-16 -148 1008 858]/FontFamily(\xfe\xff\x00H\x00Y\xc9\x11\xac\xe0\xb5\x15)/FontFile2 10758 0 R/FontName/YDRADB+H2gtrM/FontStretch/Normal/FontWeight 400/ItalicAngle 0/StemV 60/Type/FontDescriptor/XHeight 520>>
...
endstream
endobj
...
No. - Type
10732 - Pages
206 - StructTreeRoot
10771, 10772, 10773, 10778 ... - Font
Many indirect objects including 10732, 206, 10771 and 10772 do not exist in the pdf file.
But I think I found objects 10771~10808 in object 10739 stream.
Q1. Why are there no object 10732(Pages) and 206(StructTreeRoot) in the pdf file?
Q2. Why are indirect objects in stream?
I would be grateful if you would suggest any explanations or resources for reference.

Starting with version 1.5 PDF supports so called object streams, i.e. stream objects which contain other non-stream objects.
Your object 10739 is such an object stream as you can see in its Type ObjStm.
This allows those other objects to be compressed. In particular structure tree objects which otherwise can substantially increase the size of a PDF, can be compressed fairly well, reducing their impact on the document size.
For details please study the PDF specification, section 7.5.7 – Object Streams, in either the current PDF specification ISO 32000-2 or its predecessor ISO 32000-1.
Adobe has shared a copy of ISO 32000-1 on their web site which merely has its ISO page headers replaced. Simply google for "PDF32000_2008"; currently it is located at https://www.adobe.com/content/dam/acom/en/devnet/pdf/pdfs/PDF32000_2008.pdf but as far as I know this isn't a permalink.

Related

unmapped reads using bwa

i'm trying to use BWA MEM to align some WGS files, but i notice something strange.
When I used samtools flagstat to check these .bam files, I notice that most reads were unmapped.
76124692 + 0 in total (QC-passed reads + QC-failed reads)
308 + 0 secondary
0 + 0 supplementary
0 + 0 duplicates
708109 + 0 mapped (0.93% : N/A)
76124384 + 0 paired in sequencing
38062192 + 0 read1
38062192 + 0 read2
0 + 0 properly paired (0.00% : N/A)
12806 + 0 with itself and mate mapped
694995 + 0 singletons (0.91% : N/A)
11012 + 0 with mate mapped to a different chr
1682 + 0 with mate mapped to a different chr (mapQ>=5)
Previously, I used Samtofastq to convert my .bam file to .fastq. When I head this file, this is shown:
#SRR1513845.100000000/1
AACGAAACGAAAAGAAAAGAAAAGAAAGAAAAAGAAAGGAACAGAAAAG
+
AAA?=>'2&)&)&&))2(-'(,.%)&31%%'6/6,(1,501046124&6
#SRR1513845.100000000/2
AATTAATTAAGCCCCGAAGGAAGCGAGAAACACTG
+
AAA?B=AB#A#A=?A>AA#?.#?8<.1;><*17?<
#SRR1513845.100000001/1
TATAACCATATAACAAATCCAAGCCCAACAGAGAAGAGAAACAAAAAGA
+
>27<#>&856;.'.&9.%>%::-5194&:+'5);;%1&'/%%999%5(8
#SRR1513845.100000001/2
TCCAACTGATATCGTAATT
+
#3<#A>:8;?:383>=3:=
#SRR1513845.100000003/1
TATCGGTCTTGTTTAG
+
=1;=6?(4>4A13?0A
#SRR1513845.100000003/2
TTCAGGTGCCTCGAAGTTGGATAAGG
+
==>>9#;?3<A5>7);)<9-<25<9?
#SRR1513845.100000004/1
GTCATTTAGCCCAAGAGAATGGC
+
BB#ABA##A?</A>>25A;#4:5
#SRR1513845.100000004/2
GGAGATCGAGTCAAATTTTATGCTAGGTAT
+
%A:<#7A##=4AA?7<A5>#;3&?>>:;:>
#SRR1513845.100000012/1
GCGTCGTTATCCAAAA
+
>A:9:?88=<=0&>>9
#SRR1513845.100000012/2
TGGAAATATTTATTACCCCCCCCCCCCCCCCCCCCCCCCCC
+
A;>#A;4;=??8=:#;-4<?632;=:67;>=):9>9%88=9
#SRR1513845.100000016/1
CGTGGAATGGGGTGTGATTTAATTATCGAATGGCGTCCGATCCAGATT
These characters (<.#;:) are normal and influence in bwa's alignment?
Here is my bwa code:
bwa mem -M -t 38 -p hsa_GRCh38.fa SRR1513_fastqtosam.fq -o SRRR1513_aligned.bam
and my samtofastq code
java -Xmx8G -jar picard.jar SamToFastq \
I= SRR1513_fastqtosam.bam \
FASTQ= SRR1513_fastqtosam.fq \
CLIPPING_ATTRIBUTE=XT \
CLIPPING_ACTION=2 \
INTERLEAVE=true \
NON_PF=true TMP_DIR=./temp
I'm stuck in this from a few hours.
Thanks in advance!
UPDATE:
I just notice a flag during bwa mem alignment
[M::mem_pestat] skip orientation FF as there are not enough pairs
[M::mem_pestat] skip orientation RF as there are not enough pairs
[M::mem_pestat] skip orientation FR as there are not enough pairs
[M::mem_pestat] skip orientation RR as there are not enough pairs

Decode UDP message with LUA

I'm relatively new to lua and programming in general (self taught), so please be gentle!
Anyway, I wrote a lua script to read a UDP message from a game. The structure of the message is:
DATAxXXXXaaaaBBBBccccDDDDeeeeFFFFggggHHHH
DATAx = 4 letter ID and x = control character
XXXX = integer shows the group of the data (groups are known)
aaaa...HHHHH = 8 single-precision floating point numbers
The last ones is those numbers I need to decode.
If I print the message as received, it's something like:
DATA*{V???A?A?...etc.
Using string.byte(), I'm getting a stream of bytes like this (I have "formatted" the bytes to reflect the structure above.
68 65 84 65/42/20 0 0 0/237 222 28 66/189 59 182 65/107 42 41 65/33 173 79 63/0 0 128 63/146 41 41 65/0 0 30 66/0 0 184 65
The first 5 bytes are of course the DATA*. The next 4 are the 20th group of data. The next bytes, the ones I need to decode, and are equal to those values:
237 222 28 66 = 39.218
189 59 182 65 = 22.779
107 42 41 65 = 10.573
33 173 79 63 = 0.8114
0 0 128 63 = 1.0000
146 41 41 65 = 10.573
0 0 30 66 = 39.500
0 0 184 65 = 23.000
I've found C# code that does the decode with BitConverter.ToSingle(), but I haven't found any like this for Lua.
Any idea?
What Lua version do you have?
This code works in Lua 5.3
local str = "DATA*\20\0\0\0\237\222\28\66\189\59\182\65..."
-- Read two float values starting from position 10 in the string
print(string.unpack("<ff", str, 10)) --> 39.217700958252 22.779169082642 18
-- 18 (third returned value) is the next position in the string
For Lua 5.1 you have to write special function (or steal it from François Perrad's git repo )
local function binary_to_float(str, pos)
local b1, b2, b3, b4 = str:byte(pos, pos+3)
local sign = b4 > 0x7F and -1 or 1
local expo = (b4 % 0x80) * 2 + math.floor(b3 / 0x80)
local mant = ((b3 % 0x80) * 0x100 + b2) * 0x100 + b1
local n
if mant + expo == 0 then
n = sign * 0.0
elseif expo == 0xFF then
n = (mant == 0 and sign or 0) / 0
else
n = sign * (1 + mant / 0x800000) * 2.0^(expo - 0x7F)
end
return n
end
local str = "DATA*\20\0\0\0\237\222\28\66\189\59\182\65..."
print(binary_to_float(str, 10)) --> 39.217700958252
print(binary_to_float(str, 14)) --> 22.779169082642
It’s little-endian byte-order of IEEE-754 single-precision binary:
E.g., 0 0 128 63 is:
00111111 10000000 00000000 00000000
(63) (128) (0) (0)
Why that equals 1 requires that you understand the very basics of IEEE-754 representation, namely its use of an exponent and mantissa. See here to start.
See #Egor‘s answer above for how to use string.unpack() in Lua 5.3 and one possible implementation you could use in earlier versions.

Erlang application exit,, but vm is still running

My Erlang applicaation processed crashed and then exited, but found that the erlang VM is still running.
I could recieve pong when ping this "suspended node"
Types regs() and the results show below, there is not my app process.
(hub#192.168.1.140)4> regs().
** Registered procs on node 'hub#192.168.1.140' **
Name Pid Initial Call Reds Msgs
application_controlle <0.7.0> erlang:apply/2 30258442 1390
auth <0.20.0> auth:init/1 189 0
code_server <0.26.0> erlang:apply/2 1194028 0
erl_epmd <0.19.0> erl_epmd:init/1 138 0
erl_prim_loader <0.3.0> erlang:apply/2 2914236 0
error_logger <0.6.0> gen_event:init_it/6 49983527 0
file_server_2 <0.25.0> file_server:init/1 16185407 0
global_group <0.24.0> global_group:init/1 107 0
global_name_server <0.13.0> global:init/1 1385 0
gr_counter_sup <0.43.0> supervisor:gr_counter_sup 253 0
gr_lager_default_trac <0.70.0> gr_counter:init/1 121 0
gr_lager_default_trac <0.72.0> gr_manager:init/1 46 0
gr_lager_default_trac <0.69.0> gr_param:init/1 117 0
gr_lager_default_trac <0.71.0> gr_manager:init/1 46 0
gr_manager_sup <0.45.0> supervisor:gr_manager_sup 484 0
gr_param_sup <0.44.0> supervisor:gr_param_sup/1 253 0
gr_sup <0.42.0> supervisor:gr_sup/1 237 0
inet_db <0.16.0> inet_db:init/1 749 0
inet_gethost_native <0.176.0> inet_gethost_native:serve 4698517 0
inet_gethost_native_s <0.175.0> supervisor_bridge:inet_ge 41 0
init <0.0.0> otp_ring0:start/2 30799457 0
kernel_safe_sup <0.35.0> supervisor:kernel/1 278 0
kernel_sup <0.11.0> supervisor:kernel/1 47618 0
lager_crash_log <0.52.0> lager_crash_log:init/1 97712230 0
lager_event <0.50.0> gen_event:init_it/6 1813660437 0
lager_handler_watcher <0.51.0> supervisor:lager_handler_ 358 0
lager_sup <0.49.0> supervisor:lager_sup/1 327 0
net_kernel <0.21.0> net_kernel:init/1 110769667 0
net_sup <0.18.0> supervisor:erl_distributi 313 0
os_cmd_port_creator <0.582.0> erlang:apply/2 81 0
rex <0.12.0> rpc:init/1 15653480 0
standard_error <0.28.0> erlang:apply/2 9 0
standard_error_sup <0.27.0> supervisor_bridge:standar 41 0
timer_server <0.100.0> timer:init/1 59356077 0
user <0.31.0> group:server/3 23837008 0
user_drv <0.30.0> user_drv:server/2 12239455 0
** Registered ports on node 'hub#192.168.1.140' **
Name Id Command
ok
It rarely occurs, but anyone explains it?
System: CentOS 5.8
Erlang: R15B03

Getting a 2D histogram of a grayscale image in Julia

Using the Images package, I can open up a color image, convert it to Gray scale and then :
using Images
img_gld = imread("...path to some color jpg...")
img_gld_gs = convert(Image{Gray},img_gld)
#change from floats to Array of values between 0 and 255:
img_gld_gs = reinterpret(Uint8,data(img_gld_gs))
Now I've got a 1920X1080 array of Uint8's:
julia> img_gld_gs
1920x1080 Array{Uint8,2}
Now I want to get a histogram of the 2D array of Uint8 values:
julia> hist(img_gld_gs)
(0.0:50.0:300.0,
6x1080 Array{Int64,2}:
1302 1288 1293 1302 1297 1300 1257 1234 … 12 13 13 12 13 15 14
618 632 627 618 623 620 663 686 189 187 187 188 185 183 183
0 0 0 0 0 0 0 0 9 9 8 7 8 7 7
0 0 0 0 0 0 0 0 10 12 9 7 13 7 9
0 0 0 0 0 0 0 0 1238 1230 1236 1235 1230 1240 1234
0 0 0 0 0 0 0 0 … 462 469 467 471 471 468 473)
But, instead of 6x1080, I'd like 256 slots in the histogram to show total number of times each value has appeared. I tried:
julia> hist(img_gld_gs,256)
But that gives:
(2.0:1.0:252.0,
250x1080 Array{Int64,2}:
So instead of a 256x1080 Array, it's 250x1080. Is there any way to force it to have 256 bins (without resorting to writing my own hist function)? I want to be able to compare different images and I want the histogram for each image to have the same number of bins.
Assuming you want a histogram for the entire image (rather than one per row), you might want
hist(vec(img_gld_gs), -1:255)
which first converts the image to a 1-dimensional vector. (You can also use img_gld_gs[:], but that copies the data.)
Also note the range here: the hist function uses a left-open interval, so it will omit counting zeros unless you use something smaller than 0.
hist also accepts a vector (or range) as an optional argument that specifies the edge boundaries, so
hist(img_gld_gs, 0:256)
should work.

puppettop? How can I see what puppet is doing?

I'm getting started with Puppet. I added some lines to a manifest and when running Puppet now it's been at 100% cpu for a very longtime. Is there a good way to see what puppet is actually doing? puppettop?
top gives me this, which is quite useless:
top - 20:02:11 up 1 day, 2:30, 5 users, load average: 1.02, 1.12, 0.93
Tasks: 164 total, 2 running, 162 sleeping, 0 stopped, 0 zombie
Cpu(s): 12.5%us, 0.0%sy, 0.0%ni, 87.4%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
Mem: 32809072k total, 10412396k used, 22396676k free, 243832k buffers
Swap: 16768892k total, 0k used, 16768892k free, 6978500k cached
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
9761 root 20 0 646m 558m 3568 R 100 1.7 20:42.88 puppet
10509 guaka 20 0 17344 1352 972 R 0 0.0 0:00.28 top
1 root 20 0 24216 2192 1344 S 0 0.0 0:00.85 init
2 root 20 0 0 0 0 S 0 0.0 0:00.00 kthreadd
3 root 20 0 0 0 0 S 0 0.0 0:02.83 ksoftirqd/0
4 root 20 0 0 0 0 S 0 0.0 0:00.00 kworker/0:0
5 root 0 -20 0 0 0 S 0 0.0 0:00.00 kworker/0:0H

Resources