How to make my MapBasic code run faster? - buffer

I want to ask how to make my MapBasic code run faster.
I wrote a MapBasic program that works correctly, but it takes a long time to run and sometimes stops responding.
I have already tried modifying the code, but nothing changed :(
This is my code:
Sub ProsesBuffer
Create Table "Buffer2"
(Block_no Char(15),Remark Char(10),Type_Palm Char(10),Ha Decimal(6,2))
File "D:\Buffer2.TAB"
TYPE NATIVE Charset "WindowsLatin1"
Create Map For Buffer2 CoordSys Earth Projection 1, 104
Set Table Buffer2 FastEdit On Undo Off
Create Object As Buffer From sensus Width 5 Units "m" Type Spherical Resolution 100 Into Table Buffer2 Group by Rowid
Update Buffer2 Set Ha = Area(obj, "sq m")
Commit Table Buffer2
Create Index On Buffer2 (Block_no)
Add Map Layer Buffer2
Create Table "Check_Region"
(Block_no Char(15),Remark Char(10),Type_Palm Char(10),Ha Decimal(6,2))
File "D:\Check_Region.TAB"
TYPE NATIVE Charset "WindowsLatin1"
Create Map For Check_Region CoordSys Earth Projection 1, 104
Create Index On Check_Region (Block_no)
Add Map Layer Check_Region
Set Map Layer 1 Editable On
Set Table Buffer2 FastEdit On Undo Off
Objects Check From Buffer2 Into Table Check_Region Overlap Pen (1,2,0) Brush (2,16776960,0)
Commit Table Buffer2
Update Check_Region Set Ha = Area(obj, "sq m")
Commit Table Check_Region
Browse * From Check_Region
Set Map Layer 1 Editable Off Layer 2 Editable On
Add Column "Buffer2" (Ha) From Check_Region Set To sum(Ha) Where within
Browse * From Buffer2
'function stdev
Select count(*) "NL", sum(Ha) "Jlh_Ha",sum(Ha*Ha) "Sum_Sq", avg(Ha) "Mean" from Buffer2 into tbl_stdev
Browse * From tbl_stdev
dim numlines as integer
dim _sum_sq as float
dim _mean as float
dim _jlh_ha as float
fetch first from tbl_stdev
numlines=tbl_stdev.nl
_jlh_ha=tbl_stdev.jlh_ha
_sum_sq=tbl_stdev.sum_sq
_mean=tbl_stdev.mean
dim stdev as float
stdev= (_sum_sq - (_jlh_ha^2)/numlines)/(numlines-1)
print "stdev " + stdev
dim sd as float
sd=sqr(stdev)
dim stdev1 as float
dim stdev2 as float
dim stdev3 as float
stdev = _mean + 1 * sd
stdev2 = _mean + 2 * sd
stdev3 = _mean + 3 * sd
print "SD " + sd
print "STDEV1 " + stdev
print "STDEV2 " + stdev2
print "STDEV3 " + stdev3
Set Layer 1 Editable Off Layer 3 Editable On
select * from Buffer2 where Ha > stdev3 into Selection
browse * from Selection
Create Table "stdev3" (Block_no Char(15),Remark Char(10),Type_Palm Char(10),Ha Decimal(6,2)) file "D:\Buffer\stdev3.tab"
TYPE NATIVE Charset "WindowsLatin1"
Create Map For stdev3 CoordSys Earth Projection 1, 104
drop index stdev3 (Block_no)
Create Index On stdev3 (Block_no)
Add Map Layer stdev3
End Sub
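For reference, the standard deviation step in the code above uses the sum / sum-of-squares shortcut for the sample variance. The following is a minimal Python sketch of the same arithmetic; the function name and the sample Ha values are hypothetical and only serve to illustrate the formula, they are not taken from the Buffer2 table.

```python
import math
import statistics

def sd_thresholds(values):
    """Return (sd, [mean + k*sd for k in 1..3]) via the sum / sum-of-squares shortcut."""
    n = len(values)
    total = sum(values)                      # corresponds to Jlh_Ha
    sum_sq = sum(v * v for v in values)      # corresponds to Sum_Sq
    mean = total / n
    variance = (sum_sq - total ** 2 / n) / (n - 1)  # sample variance
    sd = math.sqrt(variance)
    return sd, [mean + k * sd for k in (1, 2, 3)]

areas = [12.5, 14.0, 13.2, 55.0, 12.9]  # hypothetical Ha values
sd, thresholds = sd_thresholds(areas)
print("SD", sd)
print("mean + 1/2/3 SD", thresholds)
```

The result agrees with `statistics.stdev(areas)`, which is a convenient cross-check for the hand-written formula.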

Related

Constrain axis limits in chordDiagram (circlize) when making gifs

I hope somebody can help me with this chordDiagram visualisation I am trying to create. I am well aware that this visualization type may not be suitable for this particular data, but it is how I pictured visualizing it, and now I am too curious about how to fix it to give up :) This is my first real post here, though I am an active user of Stack Overflow and I genuinely admire the audience here.
So I have this data on the change in the size of an area in km2 over time (d0), and I am trying to create a GIF out of it using the example here: https://guyabel.com/post/animated-directional-chord-diagrams/
The data "d0":
Time <- as.numeric(c(10,10,10,100,100,100,200,200,200,5,5,5,50,50,50,0,0,0))
Year <- as.character(c(2050,2100,2200,2050,2100,2200,2050,2100,2200,2050,2100,2200,2050,2100,2200,2050,2100,2200))
Area_km2 <- as.numeric(c(4.3075211,7.1672926,17.2780622,5.9099250,8.2909189,16.9748961,6.5400554,8.9036313,16.5627228,3.0765610,6.3929883,18.0708108,5.3520782,8.4503856,16.7938196,0.5565978,1.8415855,12.5089476))
(d0 <- as.data.frame(cbind(Time,Year,Area_km2)))
I also have the color codes stored in a separate dataframe (d1) following the above mentioned example.
The data "d1":
year <- as.numeric(c(2050,2100,2200))
order1 <- as.character(c(1,2,3))
col1 <- c("#40A4D8","#33BEB7","#0C5BCE")
(d1 <- as.data.frame(cbind(year,order1,col1)))
So the idea was to have self-linking flows within each sector increasing in size over time, which will look like growing segments in the final animated GIF (or like growing pie segments). However, regardless of how hard I try, I can't seem to constrain the axis of each segment to the limits of that particular year in every single frame. It seems that the axis keeps being added to over time, which is not what I want.
For example, in the first figure (figure0), the "starting frame", the size of the links matches the data frame well:
figure0
So it is:
orig_year  Area_km2  .frame
2050       0.557     0
2100       1.84      0
2200       12.5      0
But when one plots the next figure (figure1), the axis seems to have taken the values from the starting frame and added the current values on top (giving 4, 7.4 and 19 respectively) instead of showing 3.08, 6.39 and 18.1, which are the values according to the data frame:
figure1
orig_year  Area_km2  .frame
2050       3.08      1
2100       6.39      1
2200       18.1      1
And it keeps on doing so as one loops through the data and creates new plots for the next frames. I wonder whether it is possible to constrain the axis so that the links just gradually increase over time and the axis follows the increase in the data?
Any help is highly appreciated!
Thanks.
My code:
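# Load packages (dplyr and tidyr provide arrange, %>%, mutate and separate)
library(dplyr)
library(tidyr)
library(tweenr)
# Sort by Time (arrange() sorts in ascending order)
(d0 <- arrange(d0,Time))
# Copy columns
(d0$Dest_year <- d0$Year)
# Re-arrange data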
(d2 <- d0 %>%
mutate(corridor=paste(Year,Dest_year,sep="->")) %>%
dplyr::select(Time,corridor,Area_km2) %>%
mutate(ease="linear") %>%
tweenr::tween_elements('Time','corridor','ease',nframes=30) %>%
tibble::as_tibble())
(d2 <- d2 %>%
separate(col=.group,into=c("orig_year","dest_year"),sep="->") %>%
dplyr::select(orig_year,dest_year,Area_km2,everything()))
d2$Time <- NULL
# Create a directory to store the individual plots
dir.create("./plot-gif/")
# Fixing scales
scale_gap <- function(Area_km2_m,Area_km2_max,gap_at_max=1,gaps=NULL) {
p <- Area_km2_m/Area_km2_max
if(length(gap_at_max)==1 & !is.null(gaps)) {
gap_at_max <- rep(gap_at_max,gaps)
}
gap_degree <- (360-sum(gap_at_max))*(1-p)
gap_m <- (gap_degree + sum(gap_at_max))/gaps
return(gap_m)
}
# Derive the size of gaps in each frame for an animated GIF
(d3 <- d2 %>% group_by(orig_year) %>% mutate(gaps=scale_gap(Area_km2_m=Area_km2,Area_km2_max=max(.$Area_km2),gap_at_max=4,gaps=9)))
library(magrittr)
# Get the values for axis limits
(axmax <- d2 %>% group_by(orig_year,.frame) %>% mutate(max=mean(Area_km2)))
# Creating unique chordDiagrams for each frame
library(circlize)
for(f in unique(d2$.frame)){
png(file=paste0("./plot-gif/figure",f,".png"),height=7,width=7,units="in",res=500)
circos.clear()
par(mar=rep(0,4),cex=1)
circos.par(start.degree=90,track.margin=c(-0.1,0.1),
gap.degree=filter(d3,.frame==f)$gaps,
points.overflow.warning=FALSE)
chordDiagram(x=filter(d2,.frame==f),directional=2,order=d1$year,
grid.col=d1$col1,annotationTrack=c("grid","name","axis"),
transparency=0.25,annotationTrackHeight=c(0.05,0.1),
direction.type=c("diffHeight"),
diffHeight=-0.04,link.sort=TRUE,
xmax=axmax$max)
dev.off()
}
# Now make a GIF
library(magick)
img <- image_read(path="./plot-gif/figure0.png")
for(f in unique(d2$.frame)[-1]){
img0 <- image_read(path=paste0("./plot-gif/figure",f,".png"))
img <- c(img,img0)
message(f)
}
img1 <- image_scale(image=img,geometry="720x720")
ani0 <- image_animate(image=img1,fps=10)
image_write(image=ani0,path="./plot-gif/figure.gif")
I will start with your d0 object. I first construct d0 without converting everything to character, keeping the columns in their original numeric format. I also reorder d0 by Time and Year:
Time = c(10,10,10,100,100,100,200,200,200,5,5,5,50,50,50,0,0,0)
Year = c(2050,2100,2200,2050,2100,2200,2050,2100,2200,2050,2100,2200,2050,2100,2200,2050,2100,2200)
Area_km2 = c(4.3075211,7.1672926,17.2780622,5.9099250,8.2909189,16.9748961,6.5400554,8.9036313,16.5627228,3.0765610,6.3929883,18.0708108,5.3520782,8.4503856,16.7938196,0.5565978,1.8415855,12.5089476)
d0 = data.frame(Time = Time,
Year = Year,
Area_km2 = Area_km2,
Dest_year = Year)
d0 = d0[order(d0$Time, d0$Year), ]
The key thing is to calculate proper values for "gaps" between sectors so that the same unit from data corresponds to the same degree in different plots.
We first calculate the maximal total width of the circular plot:
width = tapply(d0$Area_km2, d0$Time, sum)
max_width = max(width)
We assume there are n sectors (n = 3 in d0). We set the first n-1 gaps to 2 degrees and dynamically adjust the last gap according to the total of the values in each plot. For the plot with the largest total value, the last gap is also 2 degrees.
n = 3
degree_per_unit = (360 - n*2)/max_width
Now degree_per_unit can be shared between multiple plots. For each plot we calculate the value of last_gap:
for(t in sort(unique(Time))) {
l = d0$Time == t
d0_current = d0[l, c("Year", "Dest_year", "Area_km2")]
last_gap = 360 - (n-1)*2 - sum(d0_current$Area_km2)*degree_per_unit
circos.par(gap.after = c(rep(2, n-1), last_gap))
chordDiagram(d0_current, grid.col = c("2050" = "red", "2100" = "blue", "2200" = "green"))
circos.clear()
title(paste0("Time = ", t, ", Sum = ", sum(d0_current$Area_km2)))
Sys.sleep(1)
}
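The gap arithmetic above can be checked independently of circlize. The following is a minimal Python sketch that only reproduces the numbers (per-frame totals, degree_per_unit and last_gap) from the d0 data; it draws nothing, and the variable names are chosen here for illustration.

```python
from collections import defaultdict

time = [10, 10, 10, 100, 100, 100, 200, 200, 200, 5, 5, 5, 50, 50, 50, 0, 0, 0]
area = [4.3075211, 7.1672926, 17.2780622, 5.9099250, 8.2909189, 16.9748961,
        6.5400554, 8.9036313, 16.5627228, 3.0765610, 6.3929883, 18.0708108,
        5.3520782, 8.4503856, 16.7938196, 0.5565978, 1.8415855, 12.5089476]

n = 3            # number of sectors
small_gap = 2.0  # degrees used for the first n-1 gaps

# Total width of the circle content for each Time value
width = defaultdict(float)
for t, a in zip(time, area):
    width[t] += a

max_width = max(width.values())
degree_per_unit = (360 - n * small_gap) / max_width

# The last gap absorbs whatever the sectors and small gaps do not use
last_gap = {t: 360 - (n - 1) * small_gap - w * degree_per_unit
            for t, w in width.items()}

for t in sorted(last_gap):
    print(f"Time = {t:>3}: sum = {width[t]:.3f}, last gap = {last_gap[t]:.1f} degrees")
```

The frame with the largest total (Time = 200 in this data) ends up with a last gap of 2 degrees, and every other frame gets a larger one, so one data unit always spans the same number of degrees.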

Is there a faster way to generate the required output than using a one-to-many join in Proc SQL?

I require an output that shows the total number of hours worked in a rolling 24 hour window. The data is currently stored such that each row is one hourly slot (for example 7-8am on Jan 2nd) per person and how much they worked in that hour stored as "Hour". What I need to create is another field that is the sum of the most recent 24 hourly slots (inclusive) for each row. So for the 7-8am example above I would want the sum of "Hour" across the 24 rows: Jan 1st 8-9am, Jan 1st 9-10am... Jan 2nd 6-7am, Jan 2nd 7-8am.
Rinse and repeat for each hourly slot.
There are 6000 people, and we have 6 months of data, which means the table has 6000 * 183 days * 24 hours = 26.3m rows.
I have currently done this using the code below, which works very easily on a sample of 50 people but, somewhat understandably, grinds to a halt when I try it on the full table.
Does anyone have any other ideas? All date/time variables are in datetime format.
proc sql;
create table want as
select x.*
, case when Hours_Wrkd_In_Window > 16 then 1 else 0 end as Correct
from (
select a.ID
, a.Start_DTTM
, a.End_DTTM
, sum(b.hours) as Hours_Wrkd_In_Window
from have a
left join have b
on a.ID = b.ID
and b.start_dttm > a.start_dttm - (24 * 60 * 60)
and b.start_dttm <= a.start_dttm
where datepart(a.Start_dttm) >= &report_start_date.
and datepart(a.Start_dttm) < &report_end_date.
group by ID
, a.Start_DTTM
, a.End_DTTM
) x
order by x.ID
, x.Start_DTTM
;quit;
The most performant DATA step solution most likely involves a ring-array to track the 1hr time slots and hours worked within. The ring will allow a rolling aggregate (sum and count) to be computed based on what goes into and out of the ring.
If you have a wide SAS license, look into the procedures in SAS/ETS (Econometrics and Time Series). Proc EXPAND might have some rolling aggregate capability.
This sample DATA Step code took <10s (WORK folder on SSD) to run on simulated data for 6k people with 6months of complete coverage of 1hr time slots.
data have(keep=id start_dt end_dt hours);
do id = 1 to 6000;
do start_dt
= intnx('dtmonth', datetime(), -12)
to intnx('dtmonth', datetime(), -6)
by dhms(0,1,0,0)
;
end_dt = start_dt + dhms(0,1,0,0);
hours = 0.25 * floor (5 * ranuni(123)); * 0, 1/4, 1/2, 3/4 or 1 hour;
output;
end;
end;
format hours 5.2;
run;
/* %let log= ; options obs=50 linesize=200; * submit this (instead of next) if you want to log the logic; */
%let log=*; options obs=max;
data want2(keep=id start_dt end_dt hours hours_rolling_sum hours_rolling_cnt hours_out_:);
array dt_ring(24) _temporary_;
array hr_ring(24) _temporary_;
call missing (of dt_ring(*));
call missing (of hr_ring(*));
if 0 then set have; * prep pdv column order;
hours_rolling_sum = 0;
hours_rolling_cnt = 0;
label hours_rolling_sum = 'Hours worked in prior 24 hours';
index = 0;
do until (last.id);
set have;
by id start_dt;
index + 1;
if index > 24 then index = 1;
hours_out_sum = 0;
hours_out_cnt = 0;
do clear = 1 by 1 until (clear=0);
if sum (dt_ring(index), 0) = 0 then do;
* index is first go through ring array, or hit a zeroed slot;
&log putlog 'NOTE: ' index= 'clear for empty ring item. ';
clear = 0;
end;
else
if start_dt - dt_ring(index) >= %sysfunc(dhms(0,24,0,0)) then do;
&log putlog / 'NOTE: ' index= 'reducting and zeroing.' /;
hours_out_sum + hr_ring(index);
hours_out_cnt + 1;
hours_rolling_sum = hours_rolling_sum - hr_ring(index);
hours_rolling_cnt = hours_rolling_cnt - 1;
dt_ring(index) = 0;
hr_ring(index) = 0;
* advance item to next item, that might also be more than 24 hours ago;
index = index + 1;
if index > 24 then index = 1;
end;
else do;
&log putlog / 'NOTE: ' index= 'back off !' /;
* index was advanced to an item within 24 hours, back off one;
index = index - 1;
if index < 1 then index = 24;
clear = 0;
end;
end; /* do clear */
dt_ring(index) = start_dt;
hr_ring(index) = hours;
hours_rolling_sum + hours;
hours_rolling_cnt + 1;
&log putlog 'NOTE: ' index= 'overlaying and aggregating.' / 'NOTE: ' start_dt= hours= hours_rolling_sum= hours_rolling_cnt=;
output;
end; /* do until */
format hours_rolling_sum 5.2 hours_rolling_cnt 2.;
format hours_out_sum 5.2 hours_out_cnt 2.;
run;
options obs=max;
When reviewing the results you should notice that the delta for hours_rolling_sum is +(hours in slot) - (hours_out_sum, i.e. the hours removed from the ring).
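To illustrate the idea outside SAS, here is a minimal Python sketch of the same rolling-window bookkeeping. It uses a deque in place of the fixed 24-slot ring array, which is a swapped-in structure and not a translation of the DATA step; the function name, the tuple layout and the sample rows are assumptions made for the example.

```python
from collections import deque

def rolling_24h(rows, window_seconds=24 * 60 * 60):
    """rows: iterable of (person_id, start_seconds, hours), sorted by id then start.
    Yields (person_id, start_seconds, hours, rolling_sum)."""
    ring = deque()          # (start, hours) of slots still inside the window
    current_id = None
    rolling_sum = 0.0
    for person_id, start, hours in rows:
        if person_id != current_id:
            # New person: reset the window
            ring.clear()
            rolling_sum = 0.0
            current_id = person_id
        # Drop slots that started 24 hours or more before this one
        while ring and start - ring[0][0] >= window_seconds:
            rolling_sum -= ring.popleft()[1]
        ring.append((start, hours))
        rolling_sum += hours
        yield person_id, start, hours, rolling_sum

# 30 consecutive hourly slots of 0.5 h each for one hypothetical person
rows = [(1, h * 3600, 0.5) for h in range(30)]
result = list(rolling_24h(rows))
print(result[0][3], result[23][3], result[29][3])
```

Each row is added once and removed at most once, so the work grows linearly with the number of rows instead of with the size of a self-join.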
If you must use SQL, I would suggest following @jspascal's advice and indexing the table, but rearranging the query to left join the original data to an inner-joined subselect (so that SQL will do an index-involved hash join on the ids). For the same small number of people it should be faster than the original query, but it will still be too slow for all 6K.
proc sql;
create index id on have (id);
create index id_slot on have (id, start_dt);
quit;
proc sql _method;
reset inobs=50; * limit data so you can see the _method;
create table want as
select
have.*
, case
when ROLLING.HOURS_WORKED_24_HOUR_PRIOR > 16
then 1
else 0
end as REVIEW_TIME_CLOCKING_FLAG
from
have
left join
(
select
EACH_SLOT.id
, EACH_SLOT.start_dt
, count(*) as SLOT_COUNT_24_HOUR_PRIOR
, sum(PRIOR_SLOT.hours) as HOURS_WORKED_24_HOUR_PRIOR
from
have as EACH_SLOT
join
have as PRIOR_SLOT
on
EACH_SLOT.ID = PRIOR_SLOT.ID
and EACH_SLOT.start_dt - PRIOR_SLOT.start_dt between 0 and %sysfunc(dhms(0,24,0,0))-0.1
group by
EACH_SLOT.id, EACH_SLOT.start_dt
) as ROLLING
on
have.ID = ROLLING.ID
and have.start_dt = ROLLING.start_dt
order by
id, start_dt
;
%put NOTE: SQLOOPS = &SQLOOPS;
quit;
The inner join is pyramid-like and still involves a lot of internal looping.
A compound index on the columns being accessed in the joined table - id + start_dttm + hours - would be useful if there isn't one already.
Using msglevel=i will print some diagnostics about how the query is executed. It may give some additional hints.

MVC 5 Get Exif data for Camera make

I am trying to get Exif data for camera make, ISO speed, etc. from an uploaded file. I can get some tags (see below) but need some guidance on extracting items from the Exif directories. Any suggestions, please?
IEnumerable<MetadataExtractor.Directory> directories = ImageMetadataReader.ReadMetadata(strFileName);
foreach (var directory in directories)
foreach (var tag in directory.Tags)
System.Diagnostics.Debug.WriteLine(string.Format("Directory " + $"{directory.Name} - {tag.Name} = {tag.Description}"));
var subIfdDirectory = directories.OfType<ExifSubIfdDirectory>().FirstOrDefault();
var dateTime = subIfdDirectory?.GetDescription(ExifDirectoryBase.TagDateTime);
System.Diagnostics.Debug.WriteLine(string.Format("dateTime " + dateTime));
//
Image img = Image.FromFile(strFileName);
ImageFormat format = img.RawFormat;
System.Diagnostics.Debug.WriteLine("Image Type : " + format.ToString());
System.Diagnostics.Debug.WriteLine("Image width : " + img.Width);
System.Diagnostics.Debug.WriteLine("Image height : " + img.Height);
System.Diagnostics.Debug.WriteLine("Image resolution : " + (img.VerticalResolution * img.HorizontalResolution));
System.Diagnostics.Debug.WriteLine("Image Pixel depth : " + Image.GetPixelFormatSize(img.PixelFormat));
PropertyItem[] propItems = img.PropertyItems;
int count = 0;
ArrayList arrayList = new ArrayList();
foreach (PropertyItem item in propItems)
{
arrayList.Add("Property Item " + count.ToString());
arrayList.Add("iD: 0x" + item.Id.ToString("x"));
System.Diagnostics.Debug.WriteLine("PropertyItem item in propItems: " + item.Id.ToString("Name"));
count++;
}
ASCIIEncoding encodings = new ASCIIEncoding();
try
{
string make = encodings.GetString(propItems[1].Value);
arrayList.Add("The equipment make is " + make.ToString() + ".");
}
catch
{
arrayList.Add("no Meta Data Found");
}
ViewBag.listFromArray = arrayList;
return View(await db.ReadExifs.ToListAsync());
}
Two loops, I know; messy, but it gives some output:
Directory JPEG - Compression Type = Baseline
Directory JPEG - Data Precision = 8 bits
Directory JPEG - Image Height = 376 pixels
Directory JPEG - Image Width = 596 pixels
Directory JPEG - Number of Components = 3
Directory JPEG - Component 1 = Y component: Quantization table 0, Sampling factors 2 horiz/2 vert
Directory JPEG - Component 2 = Cb component: Quantization table 1, Sampling factors 1 horiz/1 vert
Directory JPEG - Component 3 = Cr component: Quantization table 1, Sampling factors 1 horiz/1 vert
Directory JFIF - Version = 1.1
Directory JFIF - Resolution Units = inch
Directory JFIF - X Resolution = 120 dots
Directory JFIF - Y Resolution = 120 dots
Directory JFIF - Thumbnail Width Pixels = 0
Directory JFIF - Thumbnail Height Pixels = 0
Directory File - File Name = FakeFoto03_large.Jpg
Directory File - File Size = 66574 bytes
Directory File - File Modified Date = Tue Jan 03 00:02:00 +00:00 2017
Image Type : [ImageFormat: b96b3cae-0728-11d3-9d7b-0000f81ef32e]
Image width : 596
Image height : 376
Image resolution : 14400
Image Pixel depth : 24
Thanks. Y.
If the image you're processing has camera make, ISO and so forth, the metadata-extractor will print it out. The image you're providing must not have those details.
Solved. This block:
ArrayList arrayList = new ArrayList();
IEnumerable<MetadataExtractor.Directory> directories = ImageMetadataReader.ReadMetadata(strFileName);
foreach (var directory in directories)
foreach (var tag in directory.Tags)
// System.Diagnostics.Debug.WriteLine(string.Format("Directory " + $"{directory.Name} - {tag.Name} = {tag.Description}"));
arrayList.Add($"{tag.Name} = {tag.Description}");
ViewBag.listFromArray = arrayList;
return View(await db.ReadExifs.ToListAsync());
This produces (in the case of the Photo used as source) 120 exif tags. Sample:
White Balance Mode = Auto white balance
Digital Zoom Ratio = 1
Focal Length 35 = 28 mm
Scene Capture Type = Standard
Gain Control = Low gain up
Contrast = None
Thanks to Drew for the reply; it works fine now, up to a point. While the snippet prints to the screen fine (160 items), I cannot assign the item descriptions to a variable or array. Here is the code:
// start exif ###############################
var strFileName = Server.MapPath("~/uploads/" + fname + "_large" + extension);
System.Diagnostics.Debug.WriteLine(">>> ReadExifsController, fname: " + fname);
if (System.IO.File.Exists(strFileName))
{
System.Diagnostics.Debug.WriteLine(">>> ReadExifsController File exists.");
}
ArrayList arrayList = new ArrayList();
arrayList.Add("ArrayList start");
IEnumerable<MetadataExtractor.Directory> directories = ImageMetadataReader.ReadMetadata(strFileName);
foreach (var directory in directories)
foreach (var tag in directory.Tags)
System.Diagnostics.Debug.WriteLine(string.Format("Directory " + $"{directory.Name} - {tag.Name} = {tag.Description}"));
count++;
ViewBag.listFromArray = arrayList;
return View(await db.ReadExifs.ToListAsync());
}

How to export/convert line projection to excel table and order the Y coordinate

I wrote code that gets the line projection (intensity profile) of an image. I would like to export this line projection to an Excel table and then sort all the Y coordinates. For example, excluding the maximum and minimum of all the Y values, I would like to know the 5 largest values and the smallest value.
Is there any code that can do this? Thanks,
image line_projection
Realimage imgexmp
imgexmp := GetFrontImage()
number samples = 256, xscale, yscale, xsize, ysize
GetSize( imgexmp, xsize, ysize )
line_projection := CreateFloatImage( "line projection", Xsize, 1 )
line_projection = 0
line_projection[icol,0] += imgexmp
line_projection /= samples
ShowImage( line_projection )
Finding a 'sorted' list of values
If you need to sort through large lists of values (i.e. large images), the following might not be very efficient. However, if your aim is to get the "x highest" values for a relatively small x, then the following code is just fine:
number nFind = 10
image test := GetFrontImage().ImageClone()
Result( "\n\n" + nFind + " highest values:\n" )
number x,y,v
For( number i=0; i<nFind; i++ )
{
v = max(test,x,y)
Result( "\t" + v + " at " + x + "\n" )
test[x,y] = - Infinity()
}
The script works on a copy and subsequently "removes" the maximum value by changing that pixel value. The max command is fast, even for large images, but the for-loop iteration and setting of individual pixels is slow. Hence this script is too slow for a complete 'sorting' of the data if it is big, but it can quickly get you the n 'highest' values.
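The same "take the maximum, then mask it out" technique can be sketched in Python for a plain list of profile values. This is only an illustration of the approach, not DigitalMicrograph code; the function name and the sample profile are hypothetical.

```python
import heapq

def n_highest(values, n):
    """Repeatedly take the maximum from a copy, then mask it out with -infinity."""
    work = list(values)          # work on a copy so the input stays untouched
    found = []
    for _ in range(min(n, len(work))):
        idx = max(range(len(work)), key=work.__getitem__)
        found.append((work[idx], idx))   # (value, position)
        work[idx] = float("-inf")        # "remove" the value just found
    return found

profile = [3.0, 9.5, 1.2, 7.7, 9.1, 0.4]  # hypothetical line projection
top3 = n_highest(profile, 3)
print(top3)
```

For a full ordering, `sorted(profile)` is the simpler choice in Python, and `heapq.nlargest(3, profile)` returns the same three values as the loop above without their positions.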
This is a non-coding answer:
If you have a LinePlot display in DigitalMicrograph, you can simply copy-paste it into Excel to get the numbers.
I.e. with the LinePlot image frontmost, press CTRL + C to copy
(make sure there are no ROIs on it).
Switch to Excel and press CTRL + V. Done.

Getting an error with Openpyxl with Kivy

I'm trying to use some of my Python code, which I wrote using IPython, in Kivy, but I'm getting an error that says it cannot import name BUILTIN_FORMATS, which comes from styleable.py within openpyxl.
BTW I used:
import openpyxl as xl
It works perfectly fine when I run the code within IPython.
Does anyone know how I can fix this?
EDIT: I've already tried reinstalling openpyxl with pip.
EDIT2: I'm on windows 7, and here's my code:
#!/usr/bin/kivy
import kivy
import random
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import pylab as pl
import requests
import openpyxl as xl
from operator import itemgetter
from collections import Counter
from lxml import html
#function to load the table from the excel file corresponding to the passed sheet name
def loadTable(sheetName):
    lotteryData = pd.ExcelFile("Lottery databases.xlsx") #grabs and loads the file into memory
    df = lotteryData.parse(sheetName) #loads the data table from the corresponding sheetName into the df data frame
    return df
#function to display the table
def showTable(table):
    #get the number of rows the table has
    no_of_rows = len(table.index)
    #display the table
    return table.head(no_of_rows)
#function to display pie charts of a specific column within the database
#table is the database that the function will be working with
#and column is a numerical value of which column to get the data from
def printPieChart(table, column):
    if column == 6:
        columnList = table.iloc[:, -1:].values.T.ravel()
    else:
        columnList = table.iloc[:, (column - 7): (column - 6)].values.T.ravel()
    countedList = Counter(columnList)
    #set up the size of the pie chart
    fig = plt.figure(figsize=[10, 10])
    ax = fig.add_subplot(111)
    cmap = plt.cm.prism
    #input variables for pie function
    slices = [float(v) for v in countedList.values()]
    colors = cmap(np.linspace(0., 1., len(slices)))
    labels = [float(k) for k in countedList]
    columnHeaders = list(table.columns.values)
    #the pie chart
    pie_wedge_collection = ax.pie(slices, colors = colors, labels = labels, labeldistance = 1.05, autopct = '%1.1f%%')
    #get rid of the black outlines between the wedges and around the pie
    for pie_wedge in pie_wedge_collection[0]:
        pie_wedge.set_edgecolor('white')
    ax.set_title(columnHeaders[column + 1])
    #can't display a legend as there are too many entries for plt.legend() to handle
    #return pyplot.pie([float(v) for v in countedList.values()], labels = [float(k) for k in countedList])
def updateDatabase():
    wb = xl.load_workbook("Lottery databases.xlsx") #load the workbook into memory
    #list of the sheet names within the workbook
    sheetnames = ["SuperLotto", "MegaMillions", "Powerball"]
    days = ["Tue. ", "Wed. ", "Fri. ", "Sat. "] #days the draws are done on
    #list of the webpages to use to grab the new draws
    webPages = ['http://www.calottery.com/play/draw-games/superlotto-plus/winning-numbers', 'http://www.calottery.com/play/draw-games/mega-millions/winning-numbers', 'http://www.calottery.com/play/draw-games/powerball/winning-numbers']
    x = 3
    while x != 0:
        ws = wb.get_sheet_by_name(sheetnames[x-1]) # which sheet to update
        rowIndex = ws.get_highest_row() # gets the highest row index in the sheet
        lastCellValue = ws.cell(row = rowIndex - 1, column = 0).value #gets the last value in the first column, draw number
        page = requests.get(webPages[x-1]) #grabs the webpage needed
        tree = html.fromstring(page.text) #puts the webpage into a tree structure to make it easy to traverse
        #get the newest draw and date from the webpage for comparison purposes
        draw_and_date = tree.xpath('//*[@id="objBody_content_0_pagecontent_0_objPastWinningNumbers_rptPast_ctl01_lblDrawDateNumber"]/text()')
        #if the table is up to date, it will move on to the next table else it will update it
        y = int(draw_and_date[0][-4:]) - int(lastCellValue) # checks to see how many draws are missing from the table
        if y == 0:
            print("The table for " + sheetnames[x-1] + " is up to date.")
            x -= 1 #decrement x by 1 to move on to the next table
        else:
            #while loop to check if the table needs to be updated or not, if yes it will update it
            while y != 0:
                #grabs the draw and date of the missing draws from the table
                draw_and_date = tree.xpath('//*[@id="objBody_content_0_pagecontent_0_objPastWinningNumbers_rptPast_ctl0' + str(y) + '_lblDrawDateNumber"]/text()')
                numbers = tree.xpath(".//*[@id='content']/div[3]/table/tr[" + str(y) + "]/td[2]/span/text()") #numbers
                numbers = [int(n) for n in numbers] # converts the text to integers (n, not x, so the loop counter is never shadowed)
                numbers.sort() #sort the number from smallest to largest
                mega = tree.xpath(".//*[@id='content']/div[3]/table/tr[" + str(y) + "]/td[3]/text()") #mega number
                mega = int(mega[0]) # converts the text to integers
                #write to the file
                if sheetnames[x-1] == "MegaMillions":
                    d = 0
                else:
                    d = 1
                if int(draw_and_date[0][-4:]) % 2 == 0:
                    # if the draw date is even then the day is a Friday/Saturday
                    ws.append([int(draw_and_date[0][-4:]), (days[d+2] + draw_and_date[0][:12]), numbers[0], numbers[1], numbers[2], numbers[3], numbers[4], mega]) # print the draw date
                else:
                    # if the draw date is odd then the day is a Tuesday/Wednesday
                    ws.append([int(draw_and_date[0][-4:]), (days[d] + draw_and_date[0][:12]), numbers[0], numbers[1], numbers[2], numbers[3], numbers[4], mega])
                y -= 1 #decrement y by 1 to get the next missing draw
            print("Updated the " + sheetnames[x-1] + " table successfully!")
            x -= 1 #decrement x by 1 to move on to the next table
    wb.save("Lottery databases.xlsx") #save the workbook
    print("Saved the database Sucessfully!")
and so on...

Resources