unmapped reads using bwa - alignment

I'm trying to use BWA MEM to align some WGS files, but I noticed something strange.
When I used samtools flagstat to check the resulting .bam files, I noticed that most reads were unmapped.
76124692 + 0 in total (QC-passed reads + QC-failed reads)
308 + 0 secondary
0 + 0 supplementary
0 + 0 duplicates
708109 + 0 mapped (0.93% : N/A)
76124384 + 0 paired in sequencing
38062192 + 0 read1
38062192 + 0 read2
0 + 0 properly paired (0.00% : N/A)
12806 + 0 with itself and mate mapped
694995 + 0 singletons (0.91% : N/A)
11012 + 0 with mate mapped to a different chr
1682 + 0 with mate mapped to a different chr (mapQ>=5)
Previously, I used SamToFastq to convert my .bam file to .fastq. When I run head on this file, this is what I see:
#SRR1513845.100000000/1
AACGAAACGAAAAGAAAAGAAAAGAAAGAAAAAGAAAGGAACAGAAAAG
+
AAA?=>'2&)&)&&))2(-'(,.%)&31%%'6/6,(1,501046124&6
#SRR1513845.100000000/2
AATTAATTAAGCCCCGAAGGAAGCGAGAAACACTG
+
AAA?B=AB#A#A=?A>AA#?.#?8<.1;><*17?<
#SRR1513845.100000001/1
TATAACCATATAACAAATCCAAGCCCAACAGAGAAGAGAAACAAAAAGA
+
>27<#>&856;.'.&9.%>%::-5194&:+'5);;%1&'/%%999%5(8
#SRR1513845.100000001/2
TCCAACTGATATCGTAATT
+
#3<#A>:8;?:383>=3:=
#SRR1513845.100000003/1
TATCGGTCTTGTTTAG
+
=1;=6?(4>4A13?0A
#SRR1513845.100000003/2
TTCAGGTGCCTCGAAGTTGGATAAGG
+
==>>9#;?3<A5>7);)<9-<25<9?
#SRR1513845.100000004/1
GTCATTTAGCCCAAGAGAATGGC
+
BB#ABA##A?</A>>25A;#4:5
#SRR1513845.100000004/2
GGAGATCGAGTCAAATTTTATGCTAGGTAT
+
%A:<#7A##=4AA?7<A5>#;3&?>>:;:>
#SRR1513845.100000012/1
GCGTCGTTATCCAAAA
+
>A:9:?88=<=0&>>9
#SRR1513845.100000012/2
TGGAAATATTTATTACCCCCCCCCCCCCCCCCCCCCCCCCC
+
A;>#A;4;=??8=:#;-4<?632;=:67;>=):9>9%88=9
#SRR1513845.100000016/1
CGTGGAATGGGGTGTGATTTAATTATCGAATGGCGTCCGATCCAGATT
Are these characters (<.#;:) normal, and do they influence bwa's alignment?
Here is my bwa code:
bwa mem -M -t 38 -p hsa_GRCh38.fa SRR1513_fastqtosam.fq -o SRRR1513_aligned.bam
and my SamToFastq code:
java -Xmx8G -jar picard.jar SamToFastq \
I= SRR1513_fastqtosam.bam \
FASTQ= SRR1513_fastqtosam.fq \
CLIPPING_ATTRIBUTE=XT \
CLIPPING_ACTION=2 \
INTERLEAVE=true \
NON_PF=true TMP_DIR=./temp
I've been stuck on this for a few hours.
Thanks in advance!
UPDATE:
I just noticed these messages during the bwa mem alignment:
[M::mem_pestat] skip orientation FF as there are not enough pairs
[M::mem_pestat] skip orientation RF as there are not enough pairs
[M::mem_pestat] skip orientation FR as there are not enough pairs
[M::mem_pestat] skip orientation RR as there are not enough pairs

Object does not exist in the pdf file structure

%PDF-1.5
...
10737 0 obj
<</MarkInfo<</Marked true>>/Metadata 161 0 R/PageLayout/OneColumn/Pages 10732 0 R/StructTreeRoot 206 0 R/Type/Catalog>>
endobj
10738 0 obj
<</Contents[10740 0 R 10741 0 R 10747 0 R 10748 0 R 10749 0 R 10750 0 R 10751 0 R 10752 0 R]/CropBox[0.0 0.0 516.0 728.64]/MediaBox[0.0 0.0 516.0 728.64]/Parent 10733 0 R/Resources<</ColorSpace<</CS0 10771 0 R/CS1 10772 0 R>>/ExtGState<</GS0 10773 0 R>>/Font<</C2_0 10778 0 R/C2_1 10783 0 R/C2_2 10788 0 R/C2_3 10793 0 R/C2_4 10798 0 R/TT0 10800 0 R/TT1 10802 0 R/TT2 10804 0 R/TT3 10806 0 R/TT4 10808 0 R>>/XObject<</Im0 10769 0 R>>>>/Rotate 0/StructParents 0/Tabs/S/Type/Page>>
endobj
10739 0 obj
<</Filter/FlateDecode/First 410/Length 3756/N 38/Type/ObjStm>>stream
10771 0 10772 21 10773 42 10774 138 10775 190 10776 442 10777 741 10778 752 10779 869 10780 921 10781 1190 10782 2050 10783 2061 10784 2192 10785 2244 10786 2504 10787 3456 10788 3467 10789 3587 10790 3639 10791 3903 10792 6058 10793 6069 10794 6196 10795 6248 10796 6507 10797 8153 10798 8164 10799 8284 10800 8496 10801 9662 10802 9894 10803 11072 10804 11325 10805 11779 10806 11985 10807 13147 10808 13395
[/ICCBased 10753 0 R][/ICCBased 10754 0 R]
<</AIS false/BM/Normal/CA 1.0/OP false/OPM 1/SA true/SMask/None/Type/ExtGState/ca 1.0/op false>>
<</Ordering(Identity)/Registry(Adobe)/Supplement 0>><</Ascent 858/CIDSet 10757 0 R/CapHeight 719/Descent -148/Flags 4/FontBBox[-16 -148 1008 858]/FontFamily(\xfe\xff\x00H\x00Y\xc9\x11\xac\xe0\xb5\x15)/FontFile2 10758 0 R/FontName/YDRADB+H2gtrM/FontStretch/Normal/FontWeight 400/ItalicAngle 0/StemV 60/Type/FontDescriptor/XHeight 520>>
...
endstream
endobj
...
No. - Type
10732 - Pages
206 - StructTreeRoot
10771, 10772, 10773, 10778 ... - Font
Many indirect objects, including 10732, 206, 10771 and 10772, do not exist in the PDF file.
But I think I found objects 10771~10808 in the stream of object 10739.
Q1. Why are objects 10732 (Pages) and 206 (StructTreeRoot) missing from the PDF file?
Q2. Why are indirect objects inside a stream?
I would be grateful if you would suggest any explanations or resources for reference.
Starting with version 1.5, PDF supports so-called object streams, i.e. stream objects which contain other non-stream objects.
Your object 10739 is such an object stream, as you can see from its Type, ObjStm.
This allows those other objects to be compressed. In particular, structure tree objects, which otherwise can substantially increase the size of a PDF, can be compressed fairly well, reducing their impact on the document size.
For details please study the PDF specification, section 7.5.7 – Object Streams, in either the current PDF specification ISO 32000-2 or its predecessor ISO 32000-1.
Adobe has shared a copy of ISO 32000-1 on their web site which merely has its ISO page headers replaced. Simply google for "PDF32000_2008"; currently it is located at https://www.adobe.com/content/dam/acom/en/devnet/pdf/pdfs/PDF32000_2008.pdf but as far as I know this isn't a permalink.

Decode UDP message with LUA

I'm relatively new to lua and programming in general (self taught), so please be gentle!
Anyway, I wrote a lua script to read a UDP message from a game. The structure of the message is:
DATAxXXXXaaaaBBBBccccDDDDeeeeFFFFggggHHHH
DATAx = 4-letter ID (DATA) followed by x, a control character
XXXX = integer identifying the data group (the groups are known)
aaaa...HHHH = 8 single-precision floating point numbers
Those last numbers are the ones I need to decode.
If I print the message as received, it's something like:
DATA*{V???A?A?...etc.
Using string.byte(), I'm getting a stream of bytes like this (I have "formatted" the bytes to reflect the structure above):
68 65 84 65/42/20 0 0 0/237 222 28 66/189 59 182 65/107 42 41 65/33 173 79 63/0 0 128 63/146 41 41 65/0 0 30 66/0 0 184 65
The first 5 bytes are of course the DATA*. The next 4 are the group number (group 20). The remaining bytes are the ones I need to decode, and they correspond to these values:
237 222 28 66 = 39.218
189 59 182 65 = 22.779
107 42 41 65 = 10.573
33 173 79 63 = 0.8114
0 0 128 63 = 1.0000
146 41 41 65 = 10.573
0 0 30 66 = 39.500
0 0 184 65 = 23.000
I've found C# code that does the decoding with BitConverter.ToSingle(), but I haven't found anything like this for Lua.
Any ideas?
What Lua version do you have?
This code works in Lua 5.3
local str = "DATA*\20\0\0\0\237\222\28\66\189\59\182\65..."
-- Read two float values starting from position 10 in the string
print(string.unpack("<ff", str, 10)) --> 39.217700958252 22.779169082642 18
-- 18 (third returned value) is the next position in the string
For Lua 5.1 you have to write a special function (or steal it from François Perrad's git repo):
local function binary_to_float(str, pos)
  local b1, b2, b3, b4 = str:byte(pos, pos+3)
  local sign = b4 > 0x7F and -1 or 1
  local expo = (b4 % 0x80) * 2 + math.floor(b3 / 0x80)
  local mant = ((b3 % 0x80) * 0x100 + b2) * 0x100 + b1
  local n
  if mant + expo == 0 then
    n = sign * 0.0
  elseif expo == 0xFF then
    n = (mant == 0 and sign or 0) / 0
  else
    n = sign * (1 + mant / 0x800000) * 2.0^(expo - 0x7F)
  end
  return n
end
local str = "DATA*\20\0\0\0\237\222\28\66\189\59\182\65..."
print(binary_to_float(str, 10)) --> 39.217700958252
print(binary_to_float(str, 14)) --> 22.779169082642
The floats are IEEE-754 single-precision binary values stored in little-endian byte order.
E.g., 0 0 128 63 is:
00111111 10000000 00000000 00000000
(63) (128) (0) (0)
To see why that equals 1, you need to understand the very basics of IEEE-754 representation, namely its use of an exponent and mantissa. See here to start.
See @Egor's answer above for how to use string.unpack() in Lua 5.3 and one possible implementation you could use in earlier versions.
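If it helps to see the exponent/mantissa arithmetic spelled out, here is a small worked sketch. It is written in Python rather than Lua purely for illustration, the name decode_le_float is made up for this example, and it deliberately handles normal numbers only (no zeros, denormals, infinities or NaNs). It reassembles the four little-endian bytes into one 32-bit pattern, pulls out the sign, exponent and mantissa fields, and compares the result with Python's struct module.

```python
import struct


def decode_le_float(four_bytes):
    """Decode 4 little-endian bytes as an IEEE-754 single (normal numbers only)."""
    b1, b2, b3, b4 = four_bytes
    # Little-endian: the first byte is the least significant one.
    bits = b1 | b2 << 8 | b3 << 16 | b4 << 24
    sign = -1.0 if bits >> 31 else 1.0
    expo = (bits >> 23) & 0xFF   # 8-bit biased exponent
    mant = bits & 0x7FFFFF       # 23-bit fraction
    # Implicit leading 1, exponent bias of 127.
    return sign * (1 + mant / 0x800000) * 2.0 ** (expo - 127)


# 0 0 128 63 -> bit pattern 0x3F800000: exponent 127, fraction 0 -> 1 * 2^0
print(decode_le_float((0, 0, 128, 63)))                 # 1.0
# 0 0 30 66 -> exponent 132, fraction 30/128 -> 1.234375 * 2^5
print(decode_le_float((0, 0, 30, 66)))                  # 39.5
# Cross-check against the standard library decoder.
print(struct.unpack("<f", bytes((0, 0, 184, 65)))[0])   # 23.0
```

The three printed values match the 1.0000, 39.500 and 23.000 entries listed in the question.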

Mysterious Out Of Memory Error In 6502 BASIC

I am trying to create a simple word search game that puts characters on a board, running on xpet in the VICE emulator using 6502 BASIC. It worked fine before I added the status indicator, but now it fails with an out of memory error at line 45 whenever I use more than 3 words. This is very strange, because the status indicator uses the same variables as the rest of the program, so it should not take up any more space. Does anyone have any idea what could be causing this error? Is there a hidden bug?
1 uu = 0 : goto 600 : rem So you can just type run. uu=0: (toggle indicator on)
2 rem This is a simple wordsearch game generator
3 rem that takes words and creates a board and key from it.
4 rem Words can only be read/written top to bottom or left to right.
5 rem The board is a square between 3x3 and 21x21. More information at 12000.
8 print "Zero Words, terminating game..." : rem This small piece
9 end : rem of code here is what happens if you put 0 for the number of words.
10 rem Below is the default data (it's random). Hopefully it is enough.
11 data "rat","clam","ten","sip","lie","ugly","not","yet","thou","if"
12 data "dog","cat","no","yes","maybe","truck","car","trap","lamb"
13 data "three","long","short","two","one","zero","cow","cart","house"
14 data "hair","hall","heart","head","who","is","truth","false","cake"
15 data "five","six","seven","eight","nine","ten","good","bad","rad"
16 data "thy","art","short","long","potatoes","tomato","ramble"
17 data "crumble","shambles","lion","turtle","beach","breach"
18 data "data","read","bamboozle","words","list","assasin","off"
19 data "are","viking","knight","sword","random","treacherous","..."
21 rem If it is not enough data, the game will reach the "..." and
22 rem your game will be cut short (have less words). There are
23 rem about 70 to 100 words in this default list.
24 rem Do not remove the "...", or it will crash when you ask for
25 rem too many words (more than the number provided).
40 rem This is substring from le to ri where st$ goes from 0 to len-1
41 rem Left is inclusive and right is not inclusive.
42 rem Assumes that st$ exists; returns sr$. le and ri must also exit.
43 rem This encapsulates the mid function for ease of use.
45 le = le+1: rem: remember basic strings go from 1 to len
46 sr$ = mid$(st$,le,ri-le+1)
47 le = le-1: rem: we try to avoid outside effects
48 return
49 rem: A subroutine that returns a random
50 rem: Boolean (tosses a coin).
51 rem: It will return 0 or 1 randomly; output is in mr.
52 mn = 0: mx = 1
53 gosub 500
54 gosub 10000 : return
55 rem Assumes strings named s0$ and s1$ exist.
56 rem (these are the strings we search for an intersection)
57 s2$ = "": rem returns a string of the intersections in s2$.
58 for i = 0 to len(s0$)-1 : rem: for all characters in s0$
59 st$ = s0$: le = i : ri = i+1
60 gosub 40
61 ch$ = sr$ : rem temp var
62 for j = 0 to len(s1$)-1
63 le = j: ri = j+1: st$ = s1$
64 gosub 40
65 if ch$ = sr$ then s2$ = s2$ + ch$
66 next j
67 next i
68 gosub 10000 : return
99 rem Gets a random float.
100 mr = rnd(0)*(mx-mn)+mn
101 gosub 10000 : return
300 rem Scrambles the order of ps$().
301 for i = 0 to ip-1
302 mx = ip-1
303 mn = 0
304 gosub 500
305 sp$ = ps$(mr)
307 ps$(mr) = ps$(i)
309 ps$(i) = sp$
311 next i
312 gosub 10000 : return
499 rem Random Integer subroutine.
500 mx = mx+1 : rem increment here so top is inclusive
501 gosub 100 : rem run exclusive floating point version
502 mr = int(mr) : rem we turn it into an integer via truncation
503 mx = mx - 1 : rem reset mx, to avoid side effects
504 gosub 10000 : return
505 goto 535
530 rem This subroutine returns gp$(), a two
531 rem dimensional array of all the words where
532 rem for each value of ps$() where a word is
533 rem the list in gp$() of that value has the words
534 rem that match/intersect with it.
535 dim gp$(ip-1,ip-1)
536 for l = 0 to ip-1
537 ix = 0
538 for cc = 0 to ip-1
539 s0$ = ps$(l)
540 s1$ = ps$(cc)
541 gosub 55
543 if s2$ = "" then goto 545
544 if s1s$ = ps$(l) then goto 545 :rem same word
545 gp$(l,ix) = ps$(cc)
549 ix = ix + 1
555 rem SKIP
557 next cc
559 next l
560 gosub 10000 : return
588 rem I highly recommend running this in warp mode to get
589 rem feel for the game, because it could be really slow.
590 rem You can speed it up, however, by not having a status indicator.
592 rem Near the end are guides for developers
593 rem and users. This is our main subroutine.
594 rem It will provide the user with helpful prompts
595 rem and take input from the user regarding how many words, and
596 rem what words to use.
598 rem The global variable "ma" is tracked as the length of the
599 rem longest word. sz is the number of words we play with.
600 restore : ma = 0 : print "How many words do you want?"
601 print "(not too many, or too long please)"
602 print "(we recommend that you put less than 12 words"
603 print "of lengths of 7 characters or below)"
604 input "How many words"; in$ : print "" : print "" : print ""
605 print "Note: spaces and periods are removed from the ends of words."
606 print "This is a good game for children."
607 print "You can only have one of each word."
608 print "Words aren't confirmed as real words."
609 print "So please enter real words to play with them."
610 print "However, we also have default words. To let ";
611 print "the computer finish filling the list, ";
612 print "just enter .... . Enter ... to finish early."
613 ip = val(in$) : if ip <= 0 then goto 8 : rem 0 words
614 dim pr$(ip-1) : sz = 0
615 for i = 0 to ip-1
616 input "Next word"; ii$ : rem Buggy emulator shows "?" regardless
617 ln = len(ii$): if ln > ma then ma = ln:rem update mx wrd lngth
619 if ii$ = "..." then if i = 0 then goto 8 : rem 0 words
620 if ii$ = "..." then sz = -1 : if ii$ = "..." then goto 635
621 if ii$ = "...." then if ip = 1 then goto 900
622 if ii$ = "...." then goto 628 : rem fill with default
623 gosub 2000
624 if cn = 1 then print "that word was already added"
625 if cn = 1 then goto 616 : rem: get the word again
626 gosub 9000 : pr$(i) = ii$ : sz = i
627 next i
628 if i >= ip-1 then goto 635
629 for j = i to ip-1 :rem: fill in the data
630 read ii$ : if ii$ = "..." then goto 1200
631 gosub 2000
632 if cn = 1 then goto 629
633 pr$(j) = ii$ : sz = j
634 next j
635 gosub 890 : rj = 0 : pa = 0 : dim re$(ip-1) : dim pl$(ip-1)
639 print "" : print "" : print ""
640 print "Would you like to have a status indicator? (y/n)"
645 print "Note that it slows you down considerably."
647 input ""; ii$
649 if ii$ = "n" then uu = 1 : if ii$ = "n" then goto 670
650 if ii$ = "y" then uu = 0 : if ii$ = "y" then goto 670
660 rem else
665 print "You must enter y or n. y is Yes and n is No."
668 goto 647
670 rem
671 print "Please wait. The game is loading..."
672 print "In some cases this could take a few minutes..." :
693 print "" : print "" : if ip = 1 then goto 910
694 gosub 4000 : gosub 301 : gosub 535
695 gosub 1500
696 gosub 6500
697 gosub 7000
698 gosub 1000
699 rem go
701 print "" : print "" : print ""
702 print "*********************"
703 print "***WORDSEARCH GAME***"
704 print "*********************"
710 gosub 5000 :rem: print
711 gosub 6600 :rem: the key array and answer list.
790 end
888 rem Turn pr$() into ps$() where the size is the number
889 rem of words (in case they terminate early with "...").
890 if sz = -1 then gosub 1300
891 dim ps$(sz)
892 for i = 0 to sz
893 ps$(i) = pr$(i)
894 next i
896 ip = sz+1
897 rem to make work with "ip-1" elsewhere
898 return
900 dim ps$(0)
901 ps$(0) = "rat"
902 rj = 0 : pa = 0 : dim re$(ip-1) : dim pl$(ip-1)
903 goto 910
906 rem This subroutine continues the running of the game
907 rem normally, except it deals with the case where
908 rem there is only one word. The normal code fails
909 rem in this case (which is why we need 910).
910 gosub 4000
911 gosub 6500
912 dr$ = ps$(0)
913 gosub 949
914 gosub 1001
931 print "" : print "" : print ""
932 print "*********************"
933 print "***WORDSEARCH GAME***"
934 print "*********************"
935 gosub 5000 : rem print the actual game
936 gosub 6600 : rem the key array
940 end
947 rem This subroutine places one word randomly on the board.
948 if ps$(0) = "" then goto 8 : rem (the case with only one word)
949 if len(ps$(0)) > sh then goto 981
950 gosub 49 : me = mr : dr$ = ps$(0)
951 if me = 0 then goto 960 : rem vertical
952 rem here is the code for horizontal
953 mn = 0
954 mx = a-len(dr$)
955 gosub 40 : x = mr
956 mn = 0
957 mx = a-1
958 gosub 40 : y = mr
959 goto 979
960 rem here is our code for vertical
962 mn = 0
963 mx = a-len(dr$)
964 gosub 40 : y = mr
965 mn = 0
967 mx = a-1
968 gosub 40 : x = mr
969 goto 979
979 gosub 6000 : pl$(0) = dr$
980 gosub 10000 : return
981 re$(0) = ps$(0)
982 return
998 rem This subroutine fills the board with random
999 rem characters, assuming that words have already been
1000 rem placed. It copies wo$() to wr$(), then replaces the dots.
1001 for i = 0 to a-1
1002 for j = 0 to a-1
1003 wr$(i,j) = wo$(i,j)
1007 next j
1008 next i
1009 for i = 0 to a-1
1010 for j = 0 to a-1
1012 if wr$(i,j) <> "." then goto 1020
1014 mn = 65: mx = 90
1015 gosub 500
1016 wr$(i,j) = chr$(mr)
1020 next j
1021 next i
1100 gosub 10000 : return
1197 rem This is a helper that will, in the read data loop,
1198 rem use the early termination functionality to avoid crashes.
1199 rem It is here to avoid clutter with ":" commands in main.
1200 sz = -1
1201 if pr$(0) = "..." then goto 8
1202 goto 635
1299 rem Finds our sz; fixes a bug using "...".
1300 sz = 0
1301 for po = 0 to ip-1
1302 if pr$(po) <> "" then sz = po
1303 next po
1304 return
1500 rem This subroutines orders words into an array
1501 rem called fd$(). It orders them in a way such
1502 rem that words next to each other will have an
1503 rem intersection if that is possible. Otherwise,
1504 rem they may not.
1505 rem It makes use of our previous 2D array gp$() to do this.
1510 dim fd$(ip-1) : rem list of words in order we place them
1511 fd$(0) = ps$(0)
1520 for k = 1 to ip-1
1521 st$ = fd$(k-1)
1522 gosub 3500
1523 f = 0 :rem number of words in this subarray
1524 for kc = 0 to ip-1
1525 if gp$(ni,f) = "" then goto 1528
1526 f = kc
1527 next kc
1528 rem now we will find a matching word that is new
1529 for d = 0 to f-1
1530 ii$ = gp$(ni,d) : rem dst word that is matching
1531 gosub 1700 :rem cn = whether it was contained
1532 if cn = 0 then fd$(k) = ii$
1533 if cn = 0 then goto 1649
1534 next d :rem check next subarray item
1535 rem at this point we need an entirely new word
1536 for d = 0 to ip-1
1537 ii$ = ps$(d)
1560 gosub 1700
1570 if cn = 0 then fd$(k) = ii$
1580 if cn = 0 then goto 1649
1590 rem else (haven't found it yet)
1600 next d
1610 rem this should work and is deterministic
1611 rem
1649 next k
1650 gosub 10000 : return
1700 rem Assumes fd$() and ii$ and ip exist. Returns cn.
1701 rem Checks whether ii$ is in fd$(), with 0 as no, and 1 as yes.
1702 cn = 0
1703 for l = 0 to ip-1
1704 if cn = 1 then gosub 10000 : return
1705 if fd$(l) = ii$ then cn = 1
1706 next l
1707 gosub 10000 : return
2000 rem Assumes pr$() and ii$ an ip exist. Outputs cn.
2001 rem returns whether ii$ is in pr$() (0 for no, 1 for yes).
2002 cn = 0
2003 for l = 0 to ip-1
2004 if cn = 1 then gosub 10000 : return
2005 if pr$(l) = ii$ then cn = 1
2006 next l
2007 return
3500 rem Finds the index, outputted as ni, at which st$ is in ps$
3501 rem If it is not present, then it returns -1.
3505 for l = 0 to ip - 1
3510 ni = l
3515 if ps$(ni) = st$ then gosub 10000 : return
3520 next l
3525 ni = -1
3530 gosub 10000 : return
3999 rem Set the board dimensions.
4000 sh = 21: if ma > sh then ma = sh : rem sh = screen height
4001 a = int(ma + ip/5+ ip/8 + ip/10 + ip/20): if a < 3 then a = 3
4002 rem This is an emperically determined subroutine.
4003 if a < ma+1 then a = ma+1
4005 if a > sh then a = sh
4006 dim wr$(a-1,a-1): dim wo$(a-1,a-1) : rem square grid
4007 gosub 10000 : return
4998 rem This subroutine prints the regular game
4999 rem board to the screen.
5000 rem
5001 for i = 0 to a-1
5002 for j = 0 to a-1 : rem now that it is a by a cant use b :P
5003 print wr$(i,j);
5004 next j
5005 print ""
5006 next i
5007 return
6000 rem This subroutine places a word, dr$, on the board & key
6002 rem at wr$/wo$(y,x). me is 1 for horizontal, 0 for vertical.
6003 rem It assumes that this is possible.
6004 for l = 0 to len(dr$)-1
6005 st$ = dr$
6006 le = l
6007 ri = l+1
6008 gosub 40
6009 wr$(y,x) = sr$: wo$(y,x) = sr$
6010 if me = 1 then x = x + 1
6011 if me = 0 then y = y + 1
6012 next l
6013 gosub 10000 : return
6497 rem This little subroutine fills the key board
6498 rem with dots, on top of which words will later be
6499 rem placed. It makes the game more usable.
6500 for po = 0 to a-1
6520 for pp = 0 to a-1
6530 wo$(po,pp) = "."
6540 next pp
6550 next po
6560 gosub 10000 : return
6567 rem This simple subroutine waits for any key to be
6578 rem pressed, and waits in the meantime.
6569 print "Press Any Key To Continue..." : rem ANY KEY
6570 get tm$ : if tm$ = "" then goto 6570
6571 return
6596 rem This subroutine prints out the words
6597 rem that were used and the words that were rejected.
6598 rem It prints the key out last, in case the user wants to search
6599 rem for a word they didn't realize was there once they see
6600 rem the list. (Some people are like that...)
6605 print "To check your answers, "
6611 gosub 6569
6612 print "Words:"
6613 for i = 0 to ip - 1
6614 print pl$(i); : print " ";
6615 next i
6616 print "": gosub 6569
6617 print "Rejected words:"
6618 for i = 0 to ip-1
6619 print re$(i); : print " ";
6620 next i
6621 print "" : gosub 6569 : print "Key:" : print ""
6640 for kl = 0 to a-1
6641 for lk = 0 to a-1
6642 print wo$(kl,lk);
6643 next lk
6644 print ""
6645 next kl
6646 gosub 6569 : print "*********************"
6647 return
6995 rem This subroutine is our smart placing subroutine.
6996 rem Its goal is to place words from fd$() in a way
6997 rem to maximize intersections of words and create
6998 rem the best possible game.
6999 rem Assumes we have fd$(), wo$() and wr$() all ready
7000 id = 0 : rem the initial word spot on the array
7003 if ip = 3 then goto 7500 : rem this somehow fixes a bug
7004 rem above is due to really bad emulator bug, nonsensical
7005 if id > ip-1 then return : rem we have placed all words
7006 if id = ip-1 then goto 7900 : rem 1 word
7010 rem else
7011 goto 7300
7200 rem This is the general smart placer
7300 se$ = fd$(id+1) : si$ = fd$(id)
7310 s0$ = si$ : s1$ = se$ : gosub 55 : fr = 0 : rf = 0
7311 if s2$ = "" then goto 7450
7315 le = 0 : ri = 1 : st$ = s2$ : gosub 40 : tm$ = sr$
7320 for pi = 0 to len(si$)-1
7325 le = pi : ri = pi+1 : st$ = si$ : gosub 40
7330 if sr$ = tm$ then goto 7350
7335 fr = fr + 1
7340 next pi
7350 for po = 0 to len(se$)-1
7355 le = po : ri = po+1 : st$ = se$ : gosub 40
7360 if sr$ = tm$ then goto 7375
7365 rf = rf + 1
7370 next po
7375 rem rf : si$ ;;;;; fr : se$
7380 rem
7385 mf = 1 : mk = 0 : rem 1st word is horizontal, second is vertical
7395 for xx = 0 to a-len(si$)
7400 for yy = rf to a-len(se$)+rf
7401 x = xx : y = yy : dr$ = si$
7402 if xx < 0 then goto 7430
7403 if xx > a-1 then 7430
7404 if yy < 0 then 7428
7405 if yy > a-1 then 7428
7407 kd = mf
7408 gosub 8300
7410 if kk = 0 then goto 7428
7412 x = xx + fr : y = yy - rf
7413 dr$ = se$ : kd = mk
7414 gosub 8300
7415 if kk = 0 then goto 7428
7417 dr$ = si$ : x = xx : y = yy : me = mf : gosub 6000
7420 dr$ = se$ : x = xx + fr : y = yy - rf : me = mk : gosub 6000
7425 pl$(pa) = si$ : pa = pa + 1 : pl$(pa) = se$ : pa = pa + 1
7426 goto 7461
7428 next yy
7430 next xx
7450 dr$ = si$ : gosub 8000
7460 dr$ = se$ : gosub 8000
7461 id = id + 2
7470 goto 7005
7500 dr$ = fd$(0) : gosub 8000
7501 dr$ = fd$(1) : gosub 8000
7502 dr$ = fd$(2) : gosub 8000
7503 return
7900 rem This places the last word
7901 dr$ = fd$(ip-1)
7902 gosub 8000
7926 return
7990 rem This subroutine tries to place a word called dr$
7991 rem anywhere possible on the board by testing
7992 rem all locations with all directions allowed in the game.
8000 rem It is meant for words not placed by our smart placer.
8002 if len(dr$) > sh then goto 8028
8003 for pn = 0 to a-1
8004 for pk = 0 to a-1
8006 kd = 1
8008 gosub 8300
8010 if kk = 1 then gosub 8100 : rem place
8012 if kk = 1 then gosub 10000 : return
8014 rem else (if kk = 0)
8016 kd = 0
8018 gosub 8300
8020 if kk = 1 then gosub 8100 : rem place
8022 if kk = 1 then gosub 10000 : return
8024 next pk
8026 next pn
8028 kk = 0 : re$(rj) = dr$ : rj = rj + 1 : rem add to rejects
8030 gosub 10000 : return
8099 rem This is a parameter converter from 8000 to 6000.
8100 rem It will place dr$ at pn, pk (format (y,x)).
8101 y = pn : x = pk: me = kd: gosub 6000
8102 pl$(pa) = dr$ : pa = pa + 1
8103 gosub 10000 : return
8260 rem this subroutine will check whether
8261 rem we can place a string called dr$
8262 rem in direction kd (0 is vertical, 1 is horizontal)
8263 rem at the position
8264 rem pn, pk (coords as (y,x))
8300 rem returns kk whther we can place
8301 rem (kk = 0 if we can't, = 1 if we can)
8303 lu = len(dr$)
8305 if kd = 1 then goto 8350 : rem horizontal
8307 kk = 1 : rem vertical
8309 if pn+lu-1 > a-1 then kk = 0
8310 for pc = pn to pn+lu-1
8312 le = pc-pn
8313 ri = le+1 : st$ = dr$
8314 gosub 40 : lc$ = sr$
8315 if wo$(pc,pk) <> "." then kk = 0
8316 if wo$(pc,pk) = lc$ then kk = 1
8317 if kk = 0 then gosub 10000 : return
8318 next pc
8319 gosub 10000 : return
8350 kk = 1
8351 if pk+lu-1 > a-1 then kk = 0
8352 if kk = 0 then gosub 10000 : return
8353 for pc = pk to pk+lu-1
8362 le = pc-pk
8363 ri = le+1 : st$ = dr$
8364 gosub 40 : lc$ = sr$
8374 if wo$(pn,pc) <> "." then kk = 0
8376 if wo$(pc,pk) = lc$ then kk = 1
8384 if kk = 0 then gosub 10000 : return
8388 next pc
8389 gosub 10000 : return
8999 rem This subroutine takes in string ii$ and removes the spaces
9000 rem from the end. It is meant to avoid disastrous user errors.
9008 goto 9010
9010 st$ = ii$
9015 tm$ = ""
9020 for ck = 0 to len(ii$)-1
9030 le = ck
9033 ri = ck+1
9035 gosub 40
9050 if sr$ = "" then goto 9200
9100 tm$ = tm$ + sr$
9200 next ck
9300 ii$ = tm$
9301 return
10000 if uu = 1 then return : rem Set uu to 0 on line 1 to not status indicate.
10001 rem This is our rudimentary status indicator
10002 rem function. It will print "." to the indicator line and delete it.
10005 print ".";
10006 for sz=0 to 50 : print ""; : next sz
10007 print chr$(20); : rem this deletes
10008 rem This function is called when stuff happens;
10009 rem that does not mean that it is called consistently, but
10010 rem instead it means that this function will be called
10011 rem as long as the system is not frozen/crashed. So it
10012 rem may take a while for the dot to appear, but if they do, wait.
10013 return
11001 rem Note I can use the same spot because its a comment.
11002 rem Welcome to the function/method guide!
11003 rem Functions will be written like this:
11004 rem <line number>(<arg 1>, <arg 2> , <...>): <quick description>=<output>
11005 rem Longer descriptions can be found at the functions themselves.
11006 rem Also, some functions are called at slightly earlier or later
11007 rem line numbers due to comments. They are still the same function.
11008 rem Some modules are not functions, but just extensions of other functions
11009 rem for readability. These are also here and clearly labeled as not being
11010 rem independent functions for the purposes of clearness.
11011 rem 10 DATA
11012 rem 40 (le, ri, st$): substring function = sr$
11013 rem 49 (none): toincoss function = mr
11014 rem 55 (s0$, s1$): intersection = s2$
11015 rem 100 (mn,mx): random float = mr
11016 rem 300 (ps$()): scrambles ps$() = none
11017 rem 500 (mn,mx): random int = mr
11018 rem 505 (ps$()): grouping finder = gp$()()
11019 rem 600 (none): main = none (the game)
11020 rem 890 (pr$()): creates ps$() = ps$()
11021 rem 900 CONTINUATION OF 600 (single word case)
11022 rem 948 CONTINUATION OF 900 (places one single word randomly)
11023 rem 1000 (wo$()(), wr$()()): fill the board = none
11024 rem 1200 EXTENSION OF 600
11025 rem 1300 (pr$()): helper counter for pr$() > ps$() = sz
11026 rem 1500 (ps$(), gp$()()): fd$() generator = fd$()
11027 rem 1700 (ii$, fd$()): fd$() contains = cn
11028 rem 2000 (ii$, pr$()): pr$() contains = cn
11029 rem 3500 (ps$(), st$): 1500 helper for finding index of value = ni
11030 rem 4000 (ma, ip): board dimensions = wo$()(), wr$()()
11031 rem 5000 (wr$()()): print the game = none
11032 rem 6000 (dr$, me, wo$()(), wr$()()): places a word = none
11033 rem 6500 (wo$()()): fill the key with dots = none
11034 rem 6569 (none): press and key = none
11035 rem 6596 (pl$(), re$(), wo$()()): prints out end game info = none
11036 rem 7000 (fd$(), wo$(), wr$()): smart placer = none
11037 rem 8000 (dr$, wo$(), wr$()): dumb placer = none
11038 rem 8100 CONTINUATION OF 8000
11039 rem 8300 (dr$, y, x, wo$(), wr$()): check if we can place = kk
11040 rem 9000 (ii$): removes end spaces (rstrip) = ii$
11041 rem 10000 (none): Status indicator = none
11101 rem ---Below is a simple guide to the game's inner workings---
11102 rem This is a simple word search game ideal for children and
11103 rem other such individuals. It runs very slowly, and scales
11104 rem badly with both space and time (though, it should scale worse
11105 rem with time, than with space relative to word size and number of
11106 rem words since it has at most two dimensional arrays). To play
11107 rem the game to maximum enjoyment, I recommend warp mode, then
11108 rem normal mode if you want to get some time to dash to manzanita
11109 rem and grab a cup of coffee in the morning (or something).
11110 rem Luckily it has a rough status indicator. If it fills the
11111 rem entirety of the screen do not be afraid, its just being called
11112 rem a lot because it is basically called any time anything at all
11113 rem happens, and a lot is happening.
11114 rem The game works as follows: it will find all combinations of words
11115 rem which can work, and then it will group the words into an order
11116 rem to be placed in, which is based on which words are best grouped
11117 rem with each other (though, it randomly chooses if two words both
11118 rem can be grouped with the same word, leading to different outcomes
11119 rem for the game; note that while some code may look purely
11120 rem deterministic, due to the fact that the entire array ps$() from which
11121 rem the input array pr$() was loaded, it is not since ps$() was scrambled
11122 rem randomly, meaning that any order may be found, and due to the inner
11123 rem workings of the combination finder, it will order the combinations
11124 rem in different ways; this is why every single possible combination of
11125 rem words is technically possible in this game). After loading the
11126 rem placement order list (fd$()), it will place in pairs that will form
11127 rem crosses. It will always make the first word horizontal and the second
11128 rem vertical, but due to the initial scramble, every single possibility
11129 rem can technically occur (thought it typically makes the game less
11130 rem "random"). If a pair can't form a cross (since it either has none
11131 rem of the same letters, or it can't fit it in, it will try to fit each
11132 rem word individually. If that fails, then it will add the word to
11133 rem the rejects list (placed words are added to a placed words list).
11134 rem After each word is placed or rejected, the game is printed, and
11135 rem after the player desires to see the solution it shows first the
11136 rem words, then the rejects, then the answer key. The key is last, in
11137 rem case players see words they want to now find, and don't want to
11138 rem have the key be revealed just yet.
11139 rem
11140 rem Crash Notes:
11141 rem Crashes after:
11142 rem Many, many big words (ie if you put 30+ words of length
11143 rem 12+ it will probably take forever, and might run out of memory).
11144 rem I have run it with 50 words of average length 4 and it actually works
11145 rem fine, just really slowly; remember you can speed up with uu = 1.
11146 rem It tanks especially if you add long words because
11147 rem that is the main factor of grid size.
11148 rem
11149 rem Note: "l" as an iterator is always the letter, not the number.
11150 rem (I know it can look confusing).
11154 rem
11155 rem Side note for users: if you put in tens or
11156 rem hundreds of words as the numbers and terminate early, you still
11157 rem allocate an array of that size; this will crash the game. Also,
11158 rem if you ever get suspicious crashes (ie out of memory with 5 words)
11159 rem try restarting or reloading. The emulator is buggy too. Other
11160 rem successful options to fix emulator bugs include printing items
11161 rem in random places and then deleting the print statements. It's
11162 rem nonsensical, but it worked for me.
11168 rem
11169 rem Empty strings as "words" can break the game on occasion.
11170 rem This is not dealt with because users entering empty strings
11171 rem as words are clearly not planning on playing a legitimate game
11172 rem so it does not matter for the general populace.
11173 rem
11174 rem Good Luck!
11175 rem Thank you for playing!
The problem is not with line 45, despite your ?OUT OF MEMORY AT LINE 45 error. That line in and of itself cannot trigger OOM, since it is merely incrementing a memory variable that would at worst trigger ?OVERFLOW eventually.
The problem is that your BASIC program is now sufficiently large that the PET cannot allocate enough memory for your runtime variables - which is why it was working until you added that last routine.
So: either optimise your memory usage (by trimming/optimising the code and/or refining variable use) or reconfigure xPET for more RAM at startup.
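To get a feel for how quickly the runtime variables grow with the word count, here is a rough Python sketch that totals only the string-array descriptors the listing allocates with DIM. The byte costs (a 5-byte array header, 2 bytes per dimension, 3 bytes per string element) are assumptions about Commodore BASIC's array layout rather than measured figures, and the function names are made up for illustration. String contents, simple variables, and the program text itself come on top of this.

```python
from math import prod


def board_size(longest_word, n_words, screen_height=21):
    """Mirror the listing's board-size rule (BASIC lines 4000-4005)."""
    ma = min(longest_word, screen_height)
    a = int(ma + n_words / 5 + n_words / 8 + n_words / 10 + n_words / 20)
    a = max(a, 3, ma + 1)
    return min(a, screen_height)


def string_array_bytes(*dims, header=5, per_dim=2, per_element=3):
    """Descriptor bytes for one string array (assumed layout, see lead-in)."""
    return header + per_dim * len(dims) + per_element * prod(dims)


def descriptor_bytes(n_words, longest_word):
    """Sum the descriptors of the arrays the listing DIMs for n_words words."""
    a = board_size(longest_word, n_words)
    total = string_array_bytes(n_words, n_words)   # gp$(ip-1,ip-1)
    total += 5 * string_array_bytes(n_words)       # pr$, ps$, re$, pl$, fd$
    total += 2 * string_array_bytes(a, a)          # wr$ and wo$ boards
    return total


if __name__ == "__main__":
    for n in (5, 12, 17, 30):
        print(n, descriptor_bytes(n, longest_word=7))
```

The gp$ term grows with the square of the word count, which is why adding words eats memory much faster than a linear estimate would suggest.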
I tried to optimize your code a little bit.
What I've done:
Used CBM prg Studio to edit the program listing. A useful feature of this IDE is smart comments: they are visible only in the editor and are removed from the actual program listing. I changed the longer comments into smart comments.
Combined shorter lines using the ':' operator.
The $in variable was used only once, so I changed all $ii variables to $in.
Simplified the status indicator routine and moved it to line 100. (If I recall correctly, subroutines with lower line numbers run faster.)
Reduced the number of status indicator calls.
I have tested the resulting code, and it works with up to 17 words on a PET 8032 with 32K.
1 rem This is a simple wordsearch game generator
2 rem that takes words and creates a board and key from it.
3 rem Words can only be read/written top to bottom or left to right.
4 rem The board is a square between 3x3 and 21x21.
6 uu = 1 : goto 600 : rem (toggle indicator on)
8 print "Zero Words, terminating game..." : rem This small piece
9 end : rem of code here is what happens if you put 0 for the number of words.
10 rem Below is the default data (it's random). Hopefully it is enough.
11 data "rat","clam","ten","sip","lie","ugly","not","yet","thou","if"
12 data "dog","cat","no","yes","maybe","truck","car","trap","lamb"
13 data "three","long","short","two","one","zero","cow","cart","house"
14 data "hair","hall","heart","head","who","is","truth","false","cake"
15 data "five","six","seven","eight","nine","ten","good","bad","rad"
16 data "thy","art","short","long","potatoes","tomato","ramble"
17 data "crumble","shambles","lion","turtle","beach","breach"
18 data "data","read","bamboozle","words","list","assasin","off"
19 data "are","viking","knight","sword","random","treacherous","..."
21 rem If it is not enough data, the game will reach the "..." and
22 rem your game will be cut short (have less words).
40 rem This is substring from le to ri where st$ goes from 0 to len-1
41 rem Left is inclusive and right is not inclusive.
42 rem Assumes that st$ exists; returns sr$. le and ri must also exit.
43 rem This encapsulates the mid function for ease of use.
45 le = le+1: rem: remember basic strings go from 1 to len
46 sr$ = mid$(st$,le,ri-le+1)
47 le = le-1: rem: we try to avoid outside effects
48 return
50 rem: A subroutine that returns a random Boolean (tosses a coin).
51 rem: It will return 0 or 1 randomly; output is in mr.
52 mn = 0: mx = 1
53 gosub 500
54 gosub 100 : return
55 rem Assumes strings named s0$ and s1$ exist.
56 rem (these are the strings we search for an intersection)
57 s2$ = "": rem returns a string of the intersections in s2$.
58 for i = 0 to len(s0$)-1 : rem: for all characters in s0$
59 st$ = s0$: le = i : ri = i+1
60 gosub 40
61 ch$ = sr$ : rem temp var
62 for j = 0 to len(s1$)-1
63 le = j: ri = j+1: st$ = s1$
64 gosub 40
65 if ch$ = sr$ then s2$ = s2$ + ch$
66 next j : next i
68 gosub 100 : return
99 rem Status Indicator
100 if uu=0 then return
110 on uu goto 120,130
120 print "."; : uu=2 :return
130 print chr$(20); : uu=1 :return
300 rem Scrambles the order of ps$().
310 for i = 0 to ip-1
320 mx=ip-1 : mn=0 : gosub 500
330 sp$ = ps$(mr) : ps$(mr) = ps$(i) : ps$(i) = sp$
350 next i
360 gosub 100 : return
499 rem Random Integer subroutine.
500 mr = rnd(0)*(mx+1-mn)+mn : rem Gets a random float.
502 mr = int(mr) : rem we turn it into an integer via truncation
504 return
530 rem This subroutine returns gp$(), a two
531 rem dimensional array of all the words where
532 rem for each value of ps$() where a word is
533 rem the list in gp$() of that value has the words
534 rem that match/intersect with it.
535 dim gp$(ip-1,ip-1)
536 for l=0 to ip-1 : ix=0 : for cc=0 to ip-1
539 s0$=ps$(l) : s1$=ps$(cc)
541 gosub 55
543 if s2$ = "" then goto 545
544 if s1s$ = ps$(l) then goto 545 :rem same word
545 gp$(l,ix) = ps$(cc)
549 ix = ix + 1
557 next cc : next l
560 gosub 100 : return
!-=======================================================================
!-I highly recommend running this in warp mode to get feel for the game
!-because it could be really slow. You can speed it up, however, by not
!-having a status indicator. Near the end are guides for developers and
!-users. This is our main subroutine. It will provide the user with
!- helpful prompts
!-and take input from the user regarding how many words, and what words
!-to use. The global variable "ma" is tracked as the length of the
!-longest word. sz is the number of words we play with.
!-=======================================================================
599 rem main subroutine
600 restore : ma = 0 : print "How many words do you want?"
601 print "(not too many, or too long please)"
602 print "(we recommend that you put less than 12 words"
603 print "of lengths of 7 characters or below)"
604 input "How many words"; in$ : print "" : print "" : print ""
605 print "Note: spaces and periods are removed from the ends of words."
606 print "This is a good game for children."
607 print "You can only have one of each word."
608 print "Words aren't confirmed as real words."
609 print "So please enter real words to play with them."
610 print "However, we also have default words. To let ";
611 print "the computer finish filling the list, ";
612 print "just enter .... . Enter ... to finish early."
613 ip = val(in$) : if ip <= 0 then goto 8 : rem 0 words
614 dim pr$(ip-1) : sz = 0
615 for i = 0 to ip-1
616 input "Next word"; in$
617 ln = len(in$): if ln > ma then ma = ln:rem update mx wrd lngth
619 if in$ = "..." then if i = 0 then goto 8 : rem 0 words
620 if in$ = "..." then sz = -1 : if in$ = "..." then goto 635
621 if in$ = "...." then if ip = 1 then goto 900
622 if in$ = "...." then goto 628 : rem fill with default
623 gosub 2000
624 if cn = 1 then print "that word was already added"
625 if cn = 1 then goto 616 : rem: get the word again
626 gosub 9000 : pr$(i) = in$ : sz = i
627 next i
628 if i >= ip-1 then goto 635
629 for j = i to ip-1 :rem: fill in the data
630 read in$ : if in$ = "..." then goto 1200
631 gosub 2000
632 if cn = 1 then goto 629
633 pr$(j) = in$ : sz = j
634 next j
635 gosub 890 : rj = 0 : pa = 0 : dim re$(ip-1) : dim pl$(ip-1)
639 print "" : print "" : print ""
640 print "Would you like to have a status indicator? (y/n)"
645 print "Note that it slows you down considerably."
647 input ""; in$
649 if in$ = "n" then uu = 0 : if in$ = "n" then goto 670
650 if in$ = "y" then uu = 1 : if in$ = "y" then goto 670
660 rem else
665 print "You must enter y or n. y is Yes and n is No."
668 goto 647
670 print "Please wait. The game is loading..."
671 print "In some cases this could take a few minutes..." :
693 print "" : print "" : if ip = 1 then goto 910
694 gosub 4000 : gosub 310 : gosub 535
695 gosub 1500
696 gosub 6500
697 gosub 7000
698 gosub 1000
699 rem go
701 print "" : print "" : print ""
702 print "*********************"
703 print "***WORDSEARCH GAME***"
704 print "*********************"
710 gosub 5000 :rem: print
711 gosub 6600 :rem: the key array and answer list.
790 end
888 rem Turn pr$() into ps$() where the size is the number
889 rem of words (in case they terminate early with "...").
890 if sz = -1 then gosub 1300
891 dim ps$(sz)
892 for i = 0 to sz
893 ps$(i) = pr$(i)
894 next i
896 ip = sz+1
897 rem to make work with "ip-1" elsewhere
898 return
900 dim ps$(0)
901 ps$(0) = "rat"
902 rj = 0 : pa = 0 : dim re$(ip-1) : dim pl$(ip-1)
903 goto 910
906 rem This subroutine continues the running of the game
907 rem normally, except it deals with the case where
908 rem there is only one word. The normal code fails
909 rem in this case (which is why we need 910).
910 gosub 4000
911 gosub 6500
912 dr$ = ps$(0)
913 gosub 949
914 gosub 1001
931 print "" : print "" : print ""
932 print "*********************"
933 print "***WORDSEARCH GAME***"
934 print "*********************"
935 gosub 5000 : rem print the actual game
936 gosub 6600 : rem the key array
940 end
947 rem This subroutine places one word randomly on the board.
948 if ps$(0) = "" then goto 8 : rem (the case with only one word)
949 if len(ps$(0)) > sh then goto 981
950 gosub 50 : me = mr : dr$ = ps$(0)
951 if me = 0 then goto 960 : rem vertical
952 rem here is the code for horizontal
953 mn = 0
954 mx = a-len(dr$)
955 gosub 40 : x = mr
956 mn=0 : mx=a-1
958 gosub 40 : y = mr
959 goto 979
960 rem here is our code for vertical
962 mn=0 : mx=a-len(dr$)
964 gosub 40 : y = mr
965 mn=0: mx=a-1
968 gosub 40 : x = mr
969 goto 979
979 gosub 6000 : pl$(0) = dr$
980 gosub 100 : return
981 re$(0) = ps$(0)
982 return
998 rem This subroutine fills the board with random
999 rem characters, assuming that words have already been
1000 rem placed. It copies wo$() to wr$(), then replaces the dots.
1001 for i=0 to a-1 : for j=0 to a-1
1003 wr$(i,j) = wo$(i,j)
1007 next j : next i
1010 for i=0 to a-1 : for j=0 to a-1
1012 if wr$(i,j) <> "." then goto 1020
1014 mn = 65: mx = 90
1015 gosub 500
1016 wr$(i,j) = chr$(mr)
1020 next j : next i
1100 gosub 100 : return
1197 rem This is a helper that will, in the read data loop,
1198 rem use the early termination functionality to avoid crashes.
1199 rem It is here to avoid clutter with ":" commands in main.
1200 sz = -1
1201 if pr$(0) = "..." then goto 8
1202 goto 635
1299 rem Finds our sz; fixes a bug using "...".
1300 sz = 0
1301 for po = 0 to ip-1
1302 if pr$(po) <> "" then sz = po
1303 next po
1304 return
1500 rem This subroutines orders words into an array
1501 rem called fd$(). It orders them in a way such
1502 rem that words next to each other will have an
1503 rem intersection if that is possible. Otherwise,
1504 rem they may not.
1505 rem It makes use of our previous 2D array gp$() to do this.
1510 dim fd$(ip-1) : rem list of words in order we place them
1511 fd$(0) = ps$(0)
1520 for k = 1 to ip-1
1521 st$ = fd$(k-1)
1522 gosub 3500
1523 f = 0 :rem number of words in this subarray
1524 for kc = 0 to ip-1
1525 if gp$(ni,f) = "" then goto 1528
1526 f = kc
1527 next kc
1528 rem now we will find a matching word that is new
1529 for d = 0 to f-1
1530 in$ = gp$(ni,d) : rem dst word that is matching
1531 gosub 1700 :rem cn = whether it was contained
1532 if cn = 0 then fd$(k) = in$
1533 if cn = 0 then goto 1649
1534 next d :rem check next subarray item
1535 rem at this point we need an entirely new word
1536 for d = 0 to ip-1
1537 in$ = ps$(d)
1560 gosub 1700
1570 if cn = 0 then fd$(k) = in$
1580 if cn = 0 then goto 1649
1590 rem else (haven't found it yet)
1600 next d
1610 rem this should work and is deterministic
1611 rem
1649 next k
1650 gosub 100 : return
1700 rem Assumes fd$() and in$ and ip exist. Returns cn.
1701 rem Checks whether in$ is in fd$(), with 0 as no, and 1 as yes.
1702 cn = 0
1703 for l = 0 to ip-1
1704 if cn = 1 then gosub 100 : return
1705 if fd$(l) = in$ then cn = 1
1706 next l
1707 gosub 100 : return
2000 rem Assumes pr$() and in$ an ip exist. Outputs cn.
2001 rem returns whether in$ is in pr$() (0 for no, 1 for yes).
2002 cn = 0
2003 for l = 0 to ip-1
2004 if cn = 1 then gosub 100 : return
2005 if pr$(l) = in$ then cn = 1
2006 next l
2007 return
3500 rem Finds the index, outputted as ni, at which st$ is in ps$
3501 rem If it is not present, then it returns -1.
3505 for l = 0 to ip - 1
3510 ni = l
3515 if ps$(ni) = st$ then gosub 100 : return
3520 next l
3525 ni = -1
3530 gosub 100 : return
3999 rem Set the board dimensions.
4000 sh = 21: if ma > sh then ma = sh : rem sh = screen height
4001 a = int(ma + ip/5+ ip/8 + ip/10 + ip/20): if a < 3 then a = 3
4002 rem This is an emperically determined subroutine.
4003 if a < ma+1 then a = ma+1
4005 if a > sh then a = sh
4006 dim wr$(a-1,a-1): dim wo$(a-1,a-1) : rem square grid
4007 gosub 100 : return
5000 rem Print the regular game board
5001 for i=0 to a-1 : for j=0 to a-1 : rem now that it is a by a cant use b :P
5003 print wr$(i,j);
5004 next j
5005 print ""
5006 next i
5007 return
6000 rem This subroutine places a word, dr$, on the board & key
6002 rem at wr$/wo$(y,x). me is 1 for horizontal, 0 for vertical.
6003 rem It assumes that this is possible.
6004 for l = 0 to len(dr$)-1
6005 st$=dr$ : le=l : ri=l+1
6008 gosub 40
6009 wr$(y,x) = sr$: wo$(y,x) = sr$
6010 if me=1 then x=x + 1
6011 if me=0 then y=y + 1
6012 next l
6013 gosub 100 : return
6497 rem This little subroutine fills the key board
6498 rem with dots, on top of which words will later be
6499 rem placed. It makes the game more usable.
6500 for po=0 to a-1 : for pp=0 to a-1
6530 wo$(po,pp) = "."
6540 next pp : next po
6560 gosub 100 : return
6568 rem Waits for any key to be pressed
6569 print "Press Any Key To Continue..." : rem ANY KEY
6570 get tm$ : if tm$ = "" then goto 6570
6571 return
6596 rem This subroutine prints out the words
6597 rem that were used and the words that were rejected.
6598 rem It prints the key out last, in case the user wants to search
6599 rem for a word they didn't realize was there once they see
6600 rem the list. (Some people are like that...)
6605 print "To check your answers, "
6611 gosub 6569
6612 print "Words:"
6613 for i = 0 to ip - 1
6614 print pl$(i); : print " ";
6615 next i
6616 print "": gosub 6569
6617 print "Rejected words:"
6618 for i = 0 to ip-1
6619 print re$(i); : print " ";
6620 next i
6621 print "" : gosub 6569 : print "Key:" : print ""
6640 for kl = 0 to a-1
6641 for lk = 0 to a-1
6642 print wo$(kl,lk);
6643 next lk
6644 print ""
6645 next kl
6646 gosub 6569 : print "*********************"
6647 return
6995 rem This subroutine is our smart placing subroutine.
6996 rem Its goal is to place words from fd$() in a way
6997 rem to maximize intersections of words and create
6998 rem the best possible game.
6999 rem Assumes we have fd$(), wo$() and wr$() all ready
7000 id = 0 : rem the initial word spot on the array
7003 if ip = 3 then goto 7500 : rem this somehow fixes a bug
7004 rem above is due to really bad emulator bug, nonsensical
7005 if id > ip-1 then return : rem we have placed all words
7006 if id = ip-1 then goto 7900 : rem 1 word
7010 rem else
7011 goto 7300
7200 rem This is the general smart placer
7300 se$ = fd$(id+1) : si$ = fd$(id)
7310 s0$ = si$ : s1$ = se$ : gosub 55 : fr = 0 : rf = 0
7311 if s2$ = "" then goto 7450
7315 le = 0 : ri = 1 : st$ = s2$ : gosub 40 : tm$ = sr$
7320 for pi = 0 to len(si$)-1
7325 le = pi : ri = pi+1 : st$ = si$ : gosub 40
7330 if sr$ = tm$ then goto 7350
7335 fr = fr + 1
7340 next pi
7350 for po = 0 to len(se$)-1
7355 le = po : ri = po+1 : st$ = se$ : gosub 40
7360 if sr$ = tm$ then goto 7385
7365 rf = rf + 1
7370 next po
7385 mf = 1 : mk = 0 : rem 1st word is horizontal, second is vertical
7395 for xx = 0 to a-len(si$)
7400 for yy = rf to a-len(se$)+rf
7401 x = xx : y = yy : dr$ = si$
7402 if xx < 0 then goto 7430
7403 if xx > a-1 then 7430
7404 if yy < 0 then 7428
7405 if yy > a-1 then 7428
7407 kd = mf
7408 gosub 8300
7410 if kk = 0 then goto 7428
7412 x = xx + fr : y = yy - rf
7413 dr$ = se$ : kd = mk
7414 gosub 8300
7415 if kk = 0 then goto 7428
7417 dr$ = si$ : x = xx : y = yy : me = mf : gosub 6000
7420 dr$ = se$ : x = xx + fr : y = yy - rf : me = mk : gosub 6000
7425 pl$(pa) = si$ : pa = pa + 1 : pl$(pa) = se$ : pa = pa + 1
7426 goto 7461
7428 next yy
7430 next xx
7450 dr$ = si$ : gosub 8000
7460 dr$ = se$ : gosub 8000
7461 id = id + 2
7470 goto 7005
7500 dr$ = fd$(0) : gosub 8000
7501 dr$ = fd$(1) : gosub 8000
7502 dr$ = fd$(2) : gosub 8000
7503 return
7900 rem This places the last word
7901 dr$ = fd$(ip-1)
7902 gosub 8000
7926 return
7990 rem This subroutine tries to place a word called dr$
7991 rem anywhere possible on the board by testing
7992 rem all locations with all directions allowed in the game.
8000 rem It is meant for words not placed by our smart placer.
8002 if len(dr$) > sh then goto 8028
8003 for pn = 0 to a-1
8004 for pk = 0 to a-1
8006 kd = 1
8008 gosub 8300
8010 if kk = 1 then gosub 8100 : rem place
8012 if kk = 1 then gosub 100 : return
8014 rem else (if kk = 0)
8016 kd = 0
8018 gosub 8300
8020 if kk = 1 then gosub 8100 : rem place
8022 if kk = 1 then gosub 100 : return
8024 next pk
8026 next pn
8028 kk = 0 : re$(rj) = dr$ : rj = rj + 1 : rem add to rejects
8030 gosub 100 : return
8099 rem This is a parameter converter from 8000 to 6000.
8100 rem It will place dr$ at pn, pk (format (y,x)).
8101 y = pn : x = pk: me = kd: gosub 6000
8102 pl$(pa) = dr$ : pa = pa + 1
8103 gosub 100 : return
8260 rem this subroutine will check whether
8261 rem we can place a string called dr$
8262 rem in direction kd (0 is vertical, 1 is horizontal)
8263 rem at the position
8264 rem pn, pk (coords as (y,x))
8300 rem returns kk whther we can place
8301 rem (kk = 0 if we can't, = 1 if we can)
8303 lu = len(dr$)
8305 if kd = 1 then goto 8350 : rem horizontal
8307 kk = 1 : rem vertical
8309 if pn+lu-1 > a-1 then kk = 0
8310 for pc = pn to pn+lu-1
8312 le = pc-pn
8313 ri = le+1 : st$ = dr$
8314 gosub 40 : lc$ = sr$
8315 if wo$(pc,pk) <> "." then kk = 0
8316 if wo$(pc,pk) = lc$ then kk = 1
8317 if kk = 0 then gosub 100 : return
8318 next pc
8319 gosub 100 : return
8350 kk = 1
8351 if pk+lu-1 > a-1 then kk = 0
8352 if kk = 0 then gosub 100 : return
8353 for pc = pk to pk+lu-1
8362 le = pc-pk
8363 ri = le+1 : st$ = dr$
8364 gosub 40 : lc$ = sr$
8374 if wo$(pn,pc) <> "." then kk = 0
8376 if wo$(pn,pc) = lc$ then kk = 1
8384 if kk = 0 then gosub 100 : return
8388 next pc
8389 gosub 100 : return
8999 rem This subroutine takes in string in$ and removes the spaces
9000 rem from the end. It is meant to avoid disastrous user errors.
9008 goto 9010
9010 st$ = in$
9015 tm$ = ""
9020 for ck = 0 to len(in$)-1
9030 le = ck
9033 ri = ck+1
9035 gosub 40
9050 if sr$ = "" then goto 9200
9100 tm$ = tm$ + sr$
9200 next ck
9300 in$ = tm$
9301 return
!-================================================================================
!-11001 rem Note I can use the same spot because its a comment.
!-11002 rem Welcome to the function/method guide!
!-11003 rem Functions will be written like this:
!-11004 rem <line number>(<arg 1>, <arg 2> , <...>): <quick description>=<output>
!-11005 rem Longer descriptions can be found at the functions themselves.
!-11006 rem Also, some functions are called at slightly earlier or later
!-11007 rem line numbers due to comments. They are still the same function.
!-11008 rem Some modules are not functions, but just extensions of other functions
!-11009 rem for readability. These are also here and clearly labeled as not being
!-11010 rem independent functions for the purposes of clearness.
!-11011 rem 10 DATA
!-11012 rem 40 (le, ri, st$): substring function = sr$
!-11013 rem 49 (none): toincoss function = mr
!-11014 rem 55 (s0$, s1$): intersection = s2$
!-11015 rem 100 (mn,mx): random float = mr
!-11016 rem 300 (ps$()): scrambles ps$() = none
!-11017 rem 500 (mn,mx): random int = mr
!-11018 rem 505 (ps$()): grouping finder = gp$()()
!-11019 rem 600 (none): main = none (the game)
!-11020 rem 890 (pr$()): creates ps$() = ps$()
!-11021 rem 900 CONTINUATION OF 600 (single word case)
!-11022 rem 948 CONTINUATION OF 900 (places one single word randomly)
!-11023 rem 1000 (wo$()(), wr$()()): fill the board = none
!-11024 rem 1200 EXTENSION OF 600
!-11025 rem 1300 (pr$()): helper counter for pr$() > ps$() = sz
!-11026 rem 1500 (ps$(), gp$()()): fd$() generator = fd$()
!-11027 rem 1700 (in$, fd$()): fd$() contains = cn
!-11028 rem 2000 (in$, pr$()): pr$() contains = cn
!-11029 rem 3500 (ps$(), st$): 1500 helper for finding index of value = ni
!-11030 rem 4000 (ma, ip): board dimensions = wo$()(), wr$()()
!-11031 rem 5000 (wr$()()): print the game = none
!-11032 rem 6000 (dr$, me, wo$()(), wr$()()): places a word = none
!-11033 rem 6500 (wo$()()): fill the key with dots = none
!-11034 rem 6569 (none): press and key = none
!-11035 rem 6596 (pl$(), re$(), wo$()()): prints out end game info = none
!-11036 rem 7000 (fd$(), wo$(), wr$()): smart placer = none
!-11037 rem 8000 (dr$, wo$(), wr$()): dumb placer = none
!-11038 rem 8100 CONTINUATION OF 8000
!-11039 rem 8300 (dr$, y, x, wo$(), wr$()): check if we can place = kk
!-11040 rem 9000 (in$): removes end spaces (rstrip) = in$
!-11041 rem 10000 (none): Status indicator = none
!-================================================================================
!-Comment
!-================================================================================
!-11101 rem ---Below is a simple guide to the game's inner workings---
!-11102 rem This is a simple word search game ideal for children and
!-11103 rem other such individuals. It runs very slowly, and scales
!-11104 rem badly with both space and time (though, it should scale worse
!-11105 rem with time, than with space relative to word size and number of
!-11106 rem words since it has at most two dimensional arrays). To play
!-11107 rem the game to maximum enjoyment, I recommend warp mode, then
!-11108 rem normal mode if you want to get some time to dash to manzanita
!-11109 rem and grab a cup of coffee in the morning (or something).
!-11110 rem Luckily it has a rough status indicator. If it fills the
!-11111 rem entirety of the screen do not be afraid, its just being called
!-11112 rem a lot because it is basically called any time anything at all
!-11113 rem happens, and a lot is happening.
!-11114 rem The game works as follows: it will find all combinations of words
!-11115 rem which can work, and then it will group the words into an order
!-11116 rem to be placed in, which is based on which words are best grouped
!-11117 rem with each other (though, it randomly chooses if two words both
!-11118 rem can be grouped with the same word, leading to different outcomes
!-11119 rem for the game; note that while some code may look purely
!-11120 rem deterministic, due to the fact that the entire array ps$() from which
!-11121 rem the input array pr$() was loaded, it is not since ps$() was scrambled
!-11122 rem randomly, meaning that any order may be found, and due to the inner
!-11123 rem workings of the combination finder, it will order the combinations
!-11124 rem in different ways; this is why every single possible combination of
!-11125 rem words is technically possible in this game). After loading the
!-11126 rem placement order list (fd$()), it will place in pairs that will form
!-11127 rem crosses. It will always make the first word horizontal and the second
!-11128 rem vertical, but due to the initial scramble, every single possibility
!-11129 rem can technically occur (thought it typically makes the game less
!-11130 rem "random"). If a pair can't form a cross (since it either has none
!-11131 rem of the same letters, or it can't fit it in, it will try to fit each
!-11132 rem word individually. If that fails, then it will add the word to
!-11133 rem the rejects list (placed words are added to a placed words list).
!-11134 rem After each word is placed or rejected, the game is printed, and
!-11135 rem after the player desires to see the solution it shows first the
!-11136 rem words, then the rejects, then the answer key. The key is last, in
!-11137 rem case players see words they want to now find, and don't want to
!-11138 rem have the key be revealed just yet.
!-11139 rem
!-11140 rem Crash Notes:
!-11141 rem Crashes after:
!-11142 rem Many, many big words (ie if you put 30+ words of length
!-11143 rem 12+ it will probably take forever, and might run out of memory).
!-11144 rem I have run it with 50 words of average length 4 and it actually works
!-11145 rem fine, just really slowly; remember you can speed up with uu = 1.
!-11146 rem It tanks especially if you add long words because
!-11147 rem that is the main factor of grid size.
!-11148 rem
!-11149 rem Note: "l" as an iterator is always the letter, not the number.
!-11150 rem (I know it can look confusing).
!-11154 rem
!-11155 rem Side note for users: if you put in tens or
!-11156 rem hundreds of words as the numbers and terminate EARLY, you still
!-11157 rem allocate an array of that size; this will crash the game. Also,
!-11158 rem if you ever get suspicious crashes (ie out of memory with 5 words)
!-11159 rem try restarting or reloading. The emulator is buggy too. Other
!-11160 rem successful options to fix emulator bugs include printing items
!-11161 rem in random places and then deleting the print statements. It's
!-11162 rem nonsensical, but it worked for me.
!-11168 rem
!-11169 rem Empty strings as "words" can break the game on occasion.
!-11170 rem This is not dealt with because users entering empty strings
!-11171 rem as words are clearly not planning on playing a legitimate game
!-11172 rem so it does not matter for the general populace.
!-11173 rem
!-11174 rem Good Luck!
!-11175 rem Thank you for playing!
!-================================================================================
In fact, the error is very simple!
It comes from lines of this form:
gosub 10000 : return
GOSUB keeps the return address on the stack, so after a few calls the stack will be full (the stack is 256 bytes on the 6502).
The solution is to replace with:
goto 10000

Getting a 2D histogram of a grayscale image in Julia

Using the Images package, I can open a color image and convert it to grayscale:
using Images
img_gld = imread("...path to some color jpg...")
img_gld_gs = convert(Image{Gray},img_gld)
#change from floats to Array of values between 0 and 255:
img_gld_gs = reinterpret(Uint8,data(img_gld_gs))
Now I've got a 1920x1080 array of Uint8 values:
julia> img_gld_gs
1920x1080 Array{Uint8,2}
Now I want to get a histogram of the 2D array of Uint8 values:
julia> hist(img_gld_gs)
(0.0:50.0:300.0,
6x1080 Array{Int64,2}:
1302 1288 1293 1302 1297 1300 1257 1234 … 12 13 13 12 13 15 14
618 632 627 618 623 620 663 686 189 187 187 188 185 183 183
0 0 0 0 0 0 0 0 9 9 8 7 8 7 7
0 0 0 0 0 0 0 0 10 12 9 7 13 7 9
0 0 0 0 0 0 0 0 1238 1230 1236 1235 1230 1240 1234
0 0 0 0 0 0 0 0 … 462 469 467 471 471 468 473)
But, instead of 6x1080, I'd like 256 slots in the histogram to show total number of times each value has appeared. I tried:
julia> hist(img_gld_gs,256)
But that gives:
(2.0:1.0:252.0,
250x1080 Array{Int64,2}:
So instead of a 256x1080 Array, it's 250x1080. Is there any way to force it to have 256 bins (without resorting to writing my own hist function)? I want to be able to compare different images and I want the histogram for each image to have the same number of bins.
Assuming you want a histogram for the entire image (rather than one per row), you might want
hist(vec(img_gld_gs), -1:255)
which first converts the image to a 1-dimensional vector. (You can also use img_gld_gs[:], but that copies the data.)
Also note the range here: the hist function uses a left-open interval, so it will omit counting zeros unless you use something smaller than 0.
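To make the left-open point concrete, here is a small Python sketch of that binning rule. It is not Julia's hist implementation; left_open_hist is a hypothetical helper that counts each value into the bin (edges[i], edges[i+1]] and ignores anything outside the edges.

```python
from bisect import bisect_left


def left_open_hist(values, edges):
    """Count values into left-open bins (edges[i], edges[i+1]]."""
    counts = [0] * (len(edges) - 1)
    for v in values:
        # bisect_left gives the index of the first edge >= v, so v belongs
        # to the bin that ends at that edge. Index 0 means v <= edges[0],
        # which a left-open first bin does not include.
        i = bisect_left(edges, v)
        if 1 <= i <= len(edges) - 1:
            counts[i - 1] += 1
    return counts


pixels = [0, 0, 1, 128, 255, 255, 255]

# Edges -1, 0, ..., 255 give 256 bins, and counts[v] is the count of value v.
counts = left_open_hist(pixels, list(range(-1, 256)))

# Edges 0, 1, ..., 256 also give 256 bins, but the zeros fall on the open
# left edge of the first bin and are not counted.
shifted = left_open_hist(pixels, list(range(0, 257)))
```

With fixed edges like these, every image produces the same 256 bins, so the histograms can be compared directly.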
hist also accepts a vector (or range) as an optional argument that specifies the edge boundaries, so
hist(img_gld_gs, 0:256)
should work.

Sort rps-blast results by position of the hit

I'm beginning with biopython and I have a question about parsing results. I used a tutorial to get involved in this and here is the code that I used:
from Bio.Blast import NCBIXML
for record in NCBIXML.parse(open("/Users/jcastrof/blast/pruebarpsb.xml")):
    if record.alignments:
        print "Query: %s..." % record.query[:60]
        for align in record.alignments:
            for hsp in align.hsps:
                print " %s HSP,e=%f, from position %i to %i" \
                    % (align.hit_id, hsp.expect, hsp.query_start, hsp.query_end)
Part of the result obtained is:
gnl|CDD|225858 HSP,e=0.000000, from position 32 to 1118
gnl|CDD|225858 HSP,e=0.000000, from position 1775 to 2671
gnl|CDD|214836 HSP,e=0.000000, from position 37 to 458
gnl|CDD|214836 HSP,e=0.000000, from position 1775 to 2192
gnl|CDD|214838 HSP,e=0.000000, from position 567 to 850
And what I want to do is to sort that result by position of the hit (Hsp_hit-from), like this:
gnl|CDD|225858 HSP,e=0.000000, from position 32 to 1118
gnl|CDD|214836 HSP,e=0.000000, from position 37 to 458
gnl|CDD|214838 HSP,e=0.000000, from position 567 to 850
gnl|CDD|225858 HSP,e=0.000000, from position 1775 to 2671
gnl|CDD|214836 HSP,e=0.000000, from position 1775 to 2192
My input file for rps-blast is a *.xml file.
Any suggestion to proceed?
Thanks!
The HSPs list is just a Python list, and can be sorted as usual. Try:
align.hsps.sort(key = lambda hsp: hsp.query_start)
However, you are dealing with a nested list (each match has a list of HSPs), and you want to sort over all of them. Here making your own list might be best - something like this:
for record in ...:
    print "Query: %s..." % record.query[:60]
    # The outer loop (alignments) must come first in the generator expression.
    hits = sorted((hsp.query_start, hsp.query_end, hsp.expect, align.hit_id)
                  for align in record.alignments for hsp in align.hsps)
    for q_start, q_end, expect, hit_id in hits:
        print " %s HSP,e=%f, from position %i to %i" \
            % (hit_id, expect, q_start, q_end)
Peter
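
The same flatten-then-sort idea can be tried without a BLAST file. The sketch below is Python 3 and uses namedtuple stand-ins (Hsp and Alignment are hypothetical, not Biopython classes) filled with the positions from the question's output; hsps_by_query_start is likewise an illustrative name.

```python
from collections import namedtuple

# Stand-ins for the attributes the question's code reads from Biopython objects.
Hsp = namedtuple("Hsp", "query_start query_end expect")
Alignment = namedtuple("Alignment", "hit_id hsps")


def hsps_by_query_start(alignments):
    """Flatten every HSP of every alignment, then sort by query start."""
    rows = [
        (hsp.query_start, hsp.query_end, hsp.expect, align.hit_id)
        for align in alignments
        for hsp in align.hsps
    ]
    # Sorting on the start alone is stable, so HSPs that share a start
    # keep the order in which they appeared in the input.
    return sorted(rows, key=lambda row: row[0])


alignments = [
    Alignment("gnl|CDD|225858", [Hsp(32, 1118, 0.0), Hsp(1775, 2671, 0.0)]),
    Alignment("gnl|CDD|214836", [Hsp(37, 458, 0.0), Hsp(1775, 2192, 0.0)]),
    Alignment("gnl|CDD|214838", [Hsp(567, 850, 0.0)]),
]

if __name__ == "__main__":
    for q_start, q_end, expect, hit_id in hsps_by_query_start(alignments):
        print(" %s HSP,e=%f, from position %i to %i"
              % (hit_id, expect, q_start, q_end))
```

Sorting whole tuples instead would break ties on query_end, which would put the 1775 to 2192 hit ahead of the 1775 to 2671 one; keying on the start alone keeps the tie in input order, as in the desired output shown in the question.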

Resources