I am trying to take a list of words that I have imported from a text file and build a dictionary in which the value is incremented each time the word is passed over in the loop. However, with my current code nothing is added, and only the value I add initially is there when I print the dictionary. What am I doing wrong?
import pymysql
from os import path
import re
db = pymysql.connect(host='127.0.0.1', port=3306, user='root', passwd='', db='db_cc')
cursor = db.cursor()
cursor.execute("SELECT id, needsprocessing, SchoolID, ClassID, TaskID FROM sharedata WHERE needsprocessing = 1")
r = cursor.fetchall()
print(r)
from os import path
import re
noentities = len(r)
a = r[0][1]
b = r[0][2]
c = r[0][3]
d = r[0][4]
filepath = "/codecompare/%s/%s/%s/%s.txt" %(a, b, c, d)
print(filepath)
foo = open(filepath, "r")
steve = foo.read()
rawimport = steve.split(' ')
dictionary = {"for":0}
foo.close()
for word in rawimport:
    if word in dictionary:
        dictionary[word] +=1
    else:
        dictionary[word] = 1
print dictionary
Some rawimport values are as follows:
print rawimport
['Someting', 'something', 'dangerzones', 'omething', 'ghg', 'sdf', 'hgiinsfg', '932wrtioarsjg', 'fghbyghgyug', 'sadiiilglj']
Additionally, when trying to print from the code, it throws
... print dictionary
File "<stdin>", line 3
print dictionary
^
SyntaxError: invalid syntax
However, if I run print dictionary by itself it prints:
{'for': 0}
Which is evidence that the for loop did nothing.
Any ideas?
Running Python 2.7.2
edit: updated to reflect closing of file and to make loop simpler
edit: added sample rawimport data
I received the same Traceback when working through this in the Python interpreter -- it arose from not leaving the context of the for loop:
>>> for word in rawimport:
...     if word in dictionary:
...         dictionary[word]+=1
...     else:
...         dictionary[word]=1
... print dictionary
  File "<stdin>", line 6
    print dictionary
        ^
The interpreter thinks your print statement belongs to the for loop, and errors because it's not appropriately indented. (If you did indent it, of course, it would print the dictionary during each pass). The solution to that (assuming you're doing this in the interpreter, which was how I reproduced your error) is hitting enter again:
>>> for word in rawimport:
...     if word in dictionary:
...         dictionary[word]+=1
...     else:
...         dictionary[word]=1
...
>>> print dictionary
{'for': 1, 'fghbyghgyug': 1, '932wrtioarsjg': 1, 'dangerzones': 1, 'sdf': 1, 'ghg': 1, 'Someting': 1, 'something': 1, 'omething': 1, 'sadiiilglj': 1, 'hgiinsfg': 1}
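The counting loop can also be written more compactly. The following is a minimal sketch, assuming Python 3 (where print is a function) and reusing the sample rawimport values from the question; the name counts is only illustrative.

```python
from collections import Counter

# Sample words taken from the question's rawimport list.
rawimport = ['Someting', 'something', 'dangerzones', 'omething', 'ghg',
             'sdf', 'hgiinsfg', '932wrtioarsjg', 'fghbyghgyug', 'sadiiilglj']

# Counter adds 1 for every occurrence, so no if/else is needed.
counts = Counter(rawimport)

# Equivalent plain-dict version using dict.get with a default of 0.
dictionary = {}
for word in rawimport:
    dictionary[word] = dictionary.get(word, 0) + 1

# Both approaches produce the same word counts.
print(dict(counts) == dictionary)  # prints True
```

In a script the final print sits at the top level after the loop; in the interactive interpreter a blank line is still needed to close the for block first.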
I would like to use dask.array.map_overlap to apply a scipy interpolation function. However, I keep running into errors that I cannot understand, and I am hoping someone can explain them to me.
Here is the error message I receive when I run .compute().
ValueError: could not broadcast input array from shape (1070,0) into shape (1045,0)
To investigate, I started using .to_delayed() to check each partition's output, and this is what I found.
The following is my Python code.
Step 1. Load the netCDF file through xarray, then convert it to dask.array with chunk size (400,400)
df = xr.open_dataset('./Brazil Sentinal2 Tile/' + data_file +'.nc')
lon, lat = df['lon'].data, df['lat'].data
slon = da.from_array(df['lon'], chunks=(400,400))
slat = da.from_array(df['lat'], chunks=(400,400))
data = da.from_array(df.isel(band=0).__xarray_dataarray_variable__.data, chunks=(400,400))
Step 2. Declare a function for da.map_overlap to use
def sumsum2(lon,lat,data, hex_res=10):
    hex_col = 'hex' + str(hex_res)
    lon_max, lon_min = lon.max(), lon.min()
    lat_max, lat_min = lat.max(), lat.min()
    b = box(lon_min, lat_min, lon_max, lat_max, ccw=True)
    b = transform(lambda x, y: (y, x), b)
    b = mapping(b)
    target_df = pd.DataFrame(h3.polyfill( b, hex_res), columns=[hex_col])
    target_df['lat'] = target_df[hex_col].apply(lambda x: h3.h3_to_geo(x)[0])
    target_df['lon'] = target_df[hex_col].apply(lambda x: h3.h3_to_geo(x)[1])
    tlon, tlat = target_df[['lon','lat']].values.T
    abc = lNDI(points=(lon.ravel(), lat.ravel()),
               values= data.ravel())(tlon,tlat)
    target_df['out'] = abc
    print(np.stack([tlon, tlat, abc],axis=1).shape)
    return np.stack([tlon, tlat, abc],axis=1)
Step 3. Apply the da.map_overlap
b = da.map_overlap(sumsum2, slon[:1200,:1200], slat[:1200,:1200], data[:1200,:1200], depth=10, trim=True, boundary=None, align_arrays=False, dtype='float64',
)
Step 4. Using to_delayed() to test output shape
print(b.to_delayed().flatten()[0].compute().shape, )
print(b.to_delayed().flatten()[1].compute().shape)
(1065, 3)
(1045, 0)
(1090, 3)
(1070, 0)
This shows that the trimmed output of da.map_overlap has zero columns, with shapes (1045,0) and (1070,0), while the array my function returns is 2-D with three columns, with shapes (1065,3) and (1090,3).
In addition, if I turn off the trim argument, which is
c = da.map_overlap(sumsum2,
slon[:1200,:1200],
slat[:1200,:1200],
data[:1200,:1200],
depth=10,
trim=False,
boundary=None,
align_arrays=False,
dtype='float64',
)
print(c.to_delayed().flatten()[0].compute().shape, )
print(c.to_delayed().flatten()[1].compute().shape)
The output becomes
(1065, 3)
(1065, 3)
(1090, 3)
(1090, 3)
Does this mean that with trim=True, everything gets cut out?
because...
#-- print out the values
b.to_delayed().flatten()[0].compute()[:10,:]
(1065, 3)
array([], shape=(1045, 0), dtype=float64)
while...
#-- print out the values
c.to_delayed().flatten()[0].compute()[:10,:]
array([[ -47.83683837, -18.98359832, 1395.01848583],
[ -47.8482856 , -18.99038681, 2663.68391094],
[ -47.82800624, -18.99207069, 1465.56517187],
[ -47.81897323, -18.97919009, 2769.91556363],
[ -47.82066663, -19.00712956, 1607.85927095],
[ -47.82696896, -18.97167714, 2110.7516765 ],
[ -47.81562653, -18.98302933, 2662.72112163],
[ -47.82176881, -18.98594465, 2201.83205114],
[ -47.84567 , -18.97512514, 1283.20631652],
[ -47.84343568, -18.97270783, 1282.92117225]])
Any thoughts for this?
Thank You.
I think I found the answer. Please let me know if I am wrong.
I cannot use trim=True because my function changes the shape of the output array (after searching online, I noticed that the output array should have the same shape as the input array). Since I change the shape, dask does not know how to handle it and returns an empty array (weird).
With trim=False, since I am not asking dask to cut out the buffer zone, the return values come through intact. (I still don't know why dask cannot concatenate the chunked arrays, but I believe that is also related to shape.)
The solution is to wrap da.concatenate in dask's delayed function:
delayed(da.concatenate)([e.to_delayed().flatten()[idx] for idx in range(len(e.to_delayed().flatten()))])
This way we do not rely on the concatenation inside map_overlap; we use our own concatenation to combine the outputs we want.
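The shapes reported in the question are consistent with trimming simply slicing depth elements off both ends of every axis of whatever the function returns. Here is a small sketch in plain Python, assuming that is how trim=True behaves here; the helper name trimmed_shape is made up for illustration and is not part of dask.

```python
def trimmed_shape(shape, depth):
    """Shape left after dropping `depth` items from both ends of each axis."""
    # range(n)[depth:-depth] follows ordinary slice semantics, so an axis
    # shorter than 2 * depth collapses to length 0.
    return tuple(len(range(n)[depth:-depth]) for n in shape)

print(trimmed_shape((1065, 3), 10))  # (1045, 0)
print(trimmed_shape((1090, 3), 10))  # (1070, 0)
```

Under that assumption the three-column axis is far shorter than 2 * depth, which would explain why the rows shrink by 20 while the columns vanish entirely.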
I want to use a variable, passed to a function, that contains a file path. However, I can't get it working.
For example, I have a path like "/samba-test/log_gen/log_gen/log_generator", and when I read this path into a variable it doesn't work as expected. Please refer to my explanation in the code. My comments are tagged with the string "VENK". Any help would be appreciated.
/* caller */
config_path = "/samba-test/log_gen/log_gen/log_generator"
ReadWrite_Config(config_path)
/*definition*/
def ReadWrite_Timeline(lp_readpath, lp_filterlist):
    current_parent_path = lp_readpath
    current_search_list = lp_filterlist
    print(current_parent_path)  # VENK - PATH prints fine here as expected
    strings_1 = ("2e88422c-4b61-41d7-9cf9-4650edaa4e56", "2017-11-27 16:1")
    for index in range(0,3):
        print (current_search_list[index])
    files=None
    filext=[".txt",".log"]
    #outputfile = open(wrsReportFileName, "a")
    for ext in filext:
        print("Current_Parent_Path",current_parent_path )  # VENK - Prints as expected
        #VENK - The above line prints as ('Current_Parent_Path', '/samba-test/log_gen/log_gen/log_generator') which is expected
        #The actual files are inside the 'varlog' where the 'varlog' folder is inside '/samba-test/log_gen/log_gen/log_generator'
        #possible problematic line below.
        varlogpath = "(current_parent_path/varlog)/*"+ext  # VENK - Unable to find the files if given in this format
        print("varlogpath",varlogpath)  # VENK - varlogpath doesn't print as expected
        #VENK - The above line prints as ('varlogpath', 'current_parent_path/varlog/*.txt') which I feel is problematic.
        #VENK - If I give the absolute path as below it works fine
        #varlogpath = "/samba-test/log_gen/log_gen/log_generator/varlog/*"+ext
        files = glob.glob(varlogpath)
        for file in files:
            fname_varlog = open(file, 'r')
            outputfile.write("\n")
            outputfile.write(file)
            outputfile.write("\n")
            for line in fname_varlog:
                #if any(s in line for s in strings):
                """
                #s1 searches the mandatory arguments
                #s2 searches the optional arguments
                """
                if all(s1 in line for s1 in strings_1):
                    #if all(s1 in line for s1 in strings_1) or all(s2 in line for s2 in strings_2):
                    #print (file, end="")
                    outputfile.write(line)
            fname_varlog.close()
    outputfile.write("\n")
    outputfile.write("10.[##### Summary of the Issue #####] -- Enter Data Manually \n")
    outputfile.write("\n")
    outputfile.close()
    #print (ext)
Joining the path onto the variable current_parent_path with os.path.join resolved the problem, as below.
varlogpath = os.path.join(current_parent_path, "*"+ext)
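For illustration, here is a self-contained sketch of building the glob pattern from the variable's value rather than embedding the variable's name in a string literal. It assumes the files live in a varlog subfolder, as the question describes, and uses a temporary directory in place of the real /samba-test path; find_logs is a hypothetical helper name.

```python
import glob
import os
import tempfile

def find_logs(parent_path, extensions=(".txt", ".log")):
    """Return files under parent_path/varlog matching the given extensions."""
    matches = []
    for ext in extensions:
        # os.path.join inserts the variable's value; a literal such as
        # "current_parent_path/varlog/*" would only contain the name as text.
        pattern = os.path.join(parent_path, "varlog", "*" + ext)
        matches.extend(glob.glob(pattern))
    return sorted(matches)

# Demo with a throwaway directory standing in for the real log folder.
with tempfile.TemporaryDirectory() as parent:
    os.mkdir(os.path.join(parent, "varlog"))
    for name in ("a.txt", "b.log", "c.dat"):
        open(os.path.join(parent, "varlog", name), "w").close()
    found = [os.path.basename(p) for p in find_logs(parent)]
    print(found)  # ['a.txt', 'b.log']
```

If the files sit directly in the parent folder instead, the "varlog" component can be dropped from the join.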
I started learning ReactiveCocoa a couple of days ago. Today I was playing with the flatten method of ReactiveCocoa (ReactiveSwift) and tried executing the snippet given for concat flattening in the Basic Operators documentation. Here's the snippet:
let (lettersSignal, lettersObserver) = Signal<String, NoError>.pipe()
let (numbersSignal, numbersObserver) = Signal<String, NoError>.pipe()
let (signal, observer) = Signal<Signal<String, NoError>, NoError>.pipe()
signal.flatten(.concat).observeValues { print($0) }
observer.send(value: lettersSignal)
observer.send(value: numbersSignal)
observer.sendCompleted()
numbersObserver.send(value: "1") // nothing printed
lettersObserver.send(value: "a") // prints "a"
lettersObserver.send(value: "b") // prints "b"
numbersObserver.send(value: "2") // nothing printed
lettersObserver.send(value: "c") // prints "c"
lettersObserver.sendCompleted() // prints "1, 2"
numbersObserver.send(value: "3") // prints "3"
numbersObserver.sendCompleted()
As per the documentation and the interactive visualization diagram (RAC marbles - flatten(.concat) visual diagram), the output should have been something like this:
First it should have printed letter stream i.e,
a, b, c
& once the letterStream has completed it should've printed the number stream i.e.
1, 2, 3
So the final output of this observation should've been
[a, b, c, 1, 2, 3]
However, the concatenated output I'm seeing is,
[a, b, c, 3]
Why is this so? Why is only the latest value of the number stream printed, instead of all of the number stream's values once the letter stream has completed?
Please let me know if I've misunderstood something. Cheers.
As mentioned in the ReactiveSwift Slack channel, that is the expected outcome.
Quoting the documentation:
The outer event stream is started observed. Each subsequent event stream is not observed until the preceeding one has completed.
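So numbersSignal is only observed once lettersSignal has completed. A Signal is hot and does not replay earlier values, so the "1" and "2" sent before that point are never seen, and only "3" is printed.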
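The behaviour can be mimicked with a toy model. The following is a rough sketch in Python, not ReactiveSwift's API: HotSignal and concat are hypothetical names, and the model assumes only that a hot signal delivers values solely to observers already attached when a value is sent.

```python
class HotSignal:
    """Toy stand-in for a hot signal: values are delivered only to
    observers that are already attached when send() is called."""

    def __init__(self):
        self._value_handlers = []
        self._completion_handlers = []

    def observe(self, on_value, on_completed):
        self._value_handlers.append(on_value)
        self._completion_handlers.append(on_completed)

    def send(self, value):
        for handler in list(self._value_handlers):
            handler(value)

    def complete(self):
        for handler in list(self._completion_handlers):
            handler()


def concat(signals, on_value):
    """Observe each inner signal only after the previous one completes."""
    pending = list(signals)

    def start_next():
        if pending:
            pending.pop(0).observe(on_value, start_next)

    start_next()


letters, numbers = HotSignal(), HotSignal()
received = []
concat([letters, numbers], received.append)

numbers.send("1")   # dropped: numbers is not observed yet
letters.send("a")
letters.send("b")
numbers.send("2")   # dropped for the same reason
letters.send("c")
letters.complete()  # concat now starts observing numbers
numbers.send("3")
numbers.complete()

print(received)  # ['a', 'b', 'c', '3']
```

The result matches the [a, b, c, 3] seen in the question: nothing buffers the numbers sent while the letters signal is still active.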
I have table data in string form. A sample is given below:
{"engName1":"HOLDER","validDurPeriod":3,"engName2":"INFORMATION","appStatus":2,"stayExpDate":"01/10/2012","engName3":"","appExpDate":"12/04/2010"}
How can I convert it into a proper table-type variable so that I can access the keys? I am new to Lua and am not aware of any existing method to do so.
There are plenty of JSON parsers available for Lua, for example dkjson:
local json = require ("dkjson")
local str = [[
{
"numbers": [ 2, 3, -20.23e+2, -4 ],
"currency": "\u20AC"
}
]]
local obj, pos, err = json.decode (str, 1, nil)
if err then
  print ("Error:", err)
else
  print ("currency", obj.currency)
  for i = 1,#obj.numbers do
    print (i, obj.numbers[i])
  end
end
Output:
currency €
1 2
2 3
3 -2023
4 -4
Try this code to start with
J=[[
{"engName1":"HOLDER","validDurPeriod":3,"engName2":"INFORMATION","appStatus":2,"stayExpDate":"01/10/2012","engName3":"","appExpDate":"12/04/2010"}
]]
J=J:gsub("}",",}")
L={}
for k,v in J:gmatch('"(.-)":(.-),') do
  L[k]=v
  print(k,v)
end
You'll still need to convert some values to number and remove quotes.
Alternatively, you can let Lua do the hard work, if you trust the source string. Just replace the loop by this:
J=J:gsub('(".-"):(.-),','[%1]=%2,\n')
L=loadstring("return "..J)()
What I'm currently trying to do is make a table of email addresses (as keys) that hold person_records (as values). Where the person_record holds 6 or so things in it. The problem I'm getting is that when I try to assign the email address as a key to a table it complains and says table index is nil... This is what I have so far:
random_record = split(line, ",")
person_record = {first_name = random_record[1], last_name = random_record[2], email_address = random_record[3], street_address = random_record[4], city = random_record[5], state = random_record[6]}
email_table[person_record.email_address] = person_record
I wrote my own split function that basically takes a line of input, pulls out the 6 comma-separated values, and stores them in a table (random_record).
I get an error when I try to say email_table[person_record.email_address] = person_record.
But when I print out person_record.email_address it's NOT nil, it prints out the string I stored in it.. I'm so confused.
function split(str, pat)
  local t = {} -- NOTE: use {n = 0} in Lua-5.0
  local fpat = "(.-)" .. pat
  local last_end = 1
  local s, e, cap = str:find(fpat, 1)
  while s do
    if s ~= 1 or cap ~= "" then
      table.insert(t,cap)
    end
    last_end = e+1
    s, e, cap = str:find(fpat, last_end)
  end
  if last_end <= #str then
    cap = str:sub(last_end)
    table.insert(t, cap)
  end
  return t
end
The following code is copied and pasted from your example and runs just fine:
email_table = {}
random_record = {"first", "second", "third"}
person_record = {first_name = random_record[1], last_name = random_record[1], email_address = random_record[1]}
email_table[person_record.email_address] = person_record
So your problem is in your split function.
BTW, Lua doesn't have "hashtables". It simply has "tables" which store key/value pairs. Whether these happen to use hashes or not is an implementation detail.
It looks like you are iterating over some lines that have comma-separated data.
Looking at your split function, it stops as soon as there are no more separator (,) symbols left to find in a given line. So feeding it anything with fewer than 3 comma-separated fields (a very common example: an empty line at the end of the file) will produce a table that doesn't go up to [3]. Indexing a missing table entry returns nil, so person_record.email_address will be set to nil on the 2nd line of your code. Then, when you attempt to use this nil as an index into email_table on the 3rd line, you get exactly the error you mentioned.