Using pyquery to try to parse data

I am trying to use the following code to pull data from a website, but it prints nothing and I don't know what I am doing wrong.
from pyquery import PyQuery as pq
doc = pq(url='https://www.sec.gov/Archives/edgar/data/1800/0001047469-15-001377-index.htm')
for heading in doc(".tableFile:contains('Document Format Files')").parent('div'):
    rows = pq(heading).next("tbody tr")
    for row in rows:
        tds = pq(row).find("td")
        print(tds.eq(2).text())

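from pyquery import PyQuery as pq
doc = pq(url='https://www.sec.gov/Archives/edgar/data/1800/0001047469-15-001377-index.htm')
# take the first table with class "tableFile" and walk its rows
trs = doc('.tableFile').eq(0).find('tr')
for tr in trs:
    # third cell of the row (index 2)
    td = pq(tr)('td').eq(2)
    print(td.text())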
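
For readers without pyquery installed, the same idea (first table with class tableFile, third cell of each row) can be sketched with the standard library's html.parser. This is a stand-in for illustration, not the pyquery approach itself: the class name ThirdCellCollector and the sample HTML with its file names are made up, and header rows that contain only th cells are skipped rather than printed as empty strings.

```python
from html.parser import HTMLParser


class ThirdCellCollector(HTMLParser):
    """Collect the text of the third <td> in each row of the first table.tableFile."""

    def __init__(self):
        super().__init__()
        self.tables_seen = 0    # how many tableFile tables have been opened
        self.in_table = False   # True only while inside the first one
        self.cell_index = -1    # index of the current <td> within its row
        self.in_cell = False
        self.buffer = []
        self.results = []

    def handle_starttag(self, tag, attrs):
        classes = (dict(attrs).get("class") or "").split()
        if tag == "table" and "tableFile" in classes:
            self.tables_seen += 1
            self.in_table = self.tables_seen == 1
        elif self.in_table and tag == "tr":
            self.cell_index = -1
        elif self.in_table and tag == "td":
            self.cell_index += 1
            self.in_cell = True
            self.buffer = []

    def handle_endtag(self, tag):
        if tag == "table":
            self.in_table = False
        elif self.in_table and tag == "td":
            if self.cell_index == 2:
                self.results.append("".join(self.buffer).strip())
            self.in_cell = False

    def handle_data(self, data):
        if self.in_table and self.in_cell:
            self.buffer.append(data)


# Made-up sample markup that only mimics the shape of the filing index page.
SAMPLE_HTML = """
<table class="tableFile">
  <tr><th>Seq</th><th>Description</th><th>Document</th></tr>
  <tr><td>1</td><td>10-K</td><td><a href="#">main-doc.htm</a></td></tr>
  <tr><td>2</td><td>EX-12</td><td><a href="#">exhibit.htm</a></td></tr>
</table>
<table class="tableFile">
  <tr><td>1</td><td>XBRL</td><td>data.xml</td></tr>
</table>
"""

parser = ThirdCellCollector()
parser.feed(SAMPLE_HTML)
print(parser.results)
```

The second table is ignored on purpose, which mirrors the .eq(0) call in the answer.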

IfcOpenShell(Parse)_IFC PropertySet, printing issue

Hi, I am new to programming and I have a problem printing my property sets and values.
My IFC file contains several elements, and I want to parse the property sets and values of all of them.
My current result prints an ID for every element, but the attributes (property sets and values) are always taken from the first element.
My code:
import ifcopenshell
ifc_file = ifcopenshell.open('D:\PZI_9-1_1441_LIN_CES_1-17c-O_M-M3.ifc')
products = ifc_file.by_type('IFCPROPERTYSET')
for product in products:
    print(product.is_a())
    print(product) # Prints
    Category_Name_1 = ifc_file.by_type('IFCBUILDINGELEMENTPROXY')[0]
    for definition in Category_Name_1.IsDefinedBy:
        property_set = definition.RelatingPropertyDefinition
        headders_list = []
        data_list = []
        max_len = 0
        for property in property_set.HasProperties:
            if property.is_a('IfcPropertySingleValue'):
                headers = (property.Name)
                data= (property.NominalValue.wrappedValue)
                #print(headders)
                headders_list.append(headers)
                if len(headers) > max_len: max_len = len(headers)
                #print(data)
                data_list.append(data)
                if len(data) > max_len: max_len = len(data)
        headders_list = [headers.ljust(max_len) for headers in headders_list]
        data_list = [data.ljust(max_len) for data in data_list]
        print(" ".join(headders_list))
        print(" ".join(data_list))
Does anybody have a solution?
Thanks and kind regards,

On the line:
Category_Name_1 = ifc_file.by_type('IFCBUILDINGELEMENTPROXY')[0]
you always refer to the first IfcBuildingElementProxy object (because of the index 0). I guess the index should advance for each element; looping over the whole list returned by by_type would do that.
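
A minimal sketch of that suggestion, assuming the goal is one name/value mapping per element. The helper name single_value_properties is hypothetical; it only relies on the attributes already used in the question (IsDefinedBy, RelatingPropertyDefinition, HasProperties, NominalValue.wrappedValue) plus is_a checks, so that relations or definitions of another kind are skipped instead of raising an error.

```python
def single_value_properties(element):
    """Return {property name: value} for the IfcPropertySingleValue
    entries attached to ONE element (hypothetical helper)."""
    values = {}
    for definition in element.IsDefinedBy:
        # Skip relations that do not carry a property definition.
        if not definition.is_a('IfcRelDefinesByProperties'):
            continue
        property_set = definition.RelatingPropertyDefinition
        if not property_set.is_a('IfcPropertySet'):
            continue
        for prop in property_set.HasProperties:
            if prop.is_a('IfcPropertySingleValue') and prop.NominalValue is not None:
                values[prop.Name] = prop.NominalValue.wrappedValue
    return values


# Usage with a real model (requires ifcopenshell; 'model.ifc' is a placeholder):
#   import ifcopenshell
#   ifc_file = ifcopenshell.open('model.ifc')
#   for element in ifc_file.by_type('IfcBuildingElementProxy'):
#       print(element.GlobalId, single_value_properties(element))
```

Because the loop variable is the element itself, no index has to be tracked, and each element reports its own property sets rather than those of the first proxy.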

Biopython Genbank.Record : trying to understand source code

I am writing a CSV reader that generates GenBank files to capture annotations together with the sequence.
First I used Bio.SeqRecord and got correctly formatted output, but the SeqRecord class lacks fields that I need.
FEATURES             Location/Qualifiers
     HCDR1           27..35
     HCDR2           50..66
     HCDR3           99..109
I switched to Bio.GenBank.Record and have the needed fields except now the annotation formatting is wrong. It can't have the extra "type:" "location:" and "qualifiers:" text and the information should all be on one line.
FEATURES             Location/Qualifiers
type: HCDR1
location: [26:35]
qualifiers:
type: HCDR2
location: [49:66]
qualifiers:
type: HCDR3
location: [98:109]
qualifiers:
The code for pulling annotations is the same for both versions. Only the class changed.
import datetime
import getpass
from Bio.GenBank.Record import Record
from Bio.SeqFeature import SeqFeature, FeatureLocation

# Read csv entries and create a container with the data
container = Record()
container.locus = row['Sample']
container.size = len(row['Seq'])
container.residue_type="PROTEIN"
container.data_file_division="PRI"
container.date = (datetime.date.today().strftime("%d-%b-%Y")) # today's date
container.definition = row['FullCloneName']
container.accession = [row['Vgene'],row['HCDR3']]
container.version = getpass.getuser()
container.keywords = [row['ProjectName']]
container.source = "test"
container.organism = "Homo Sapiens"
container.sequence = row['Seq']
annotations = []
CDRS = ["HCDR1", "HCDR2", "HCDR3"]
for CDR in CDRS:
    # locate each CDR inside the full sequence (0-based start, exclusive end)
    start = row['Seq'].find(row[CDR])
    end = start + len(row[CDR])
    feature = SeqFeature(FeatureLocation(start=start, end=end), type=CDR)
    container.features.append(feature)
I have looked at the source code for Bio.GenBank.Record but can't figure out why a SeqFeature is formatted differently there than in Bio.SeqRecord output.
Is there an elegant fix, or do I have to write a separate tool to reformat the annotations in the GenBank file?

After reading the source code again, I discovered that Bio.GenBank.Record has its own Feature class that takes the key and location as strings. These are formatted correctly in the output GenBank file.
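from Bio.GenBank.Record import Feature

CDRS = ["HCDR1", "HCDR2", "HCDR3"]
for CDR in CDRS:
    start = row['Seq'].find(row[CDR])
    end = start + len(row[CDR])
    feature = Feature()
    feature.key = "{}".format(CDR)
    # str.find is 0-based; GenBank locations are 1-based and inclusive,
    # so add 1 to the start to match the SeqRecord output (e.g. 27..35)
    feature.location = "{}..{}".format(start + 1, end)
    container.features.append(feature)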
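
The two outputs in the question differ by one on the start coordinate ([26:35] versus 27..35) because Python offsets are zero-based with an exclusive end, while GenBank locations are one-based and inclusive. A minimal sketch of that conversion, using a hypothetical helper genbank_location and a made-up sequence:

```python
def genbank_location(sequence, motif):
    """Return a GenBank-style 'start..end' string for the first match of
    motif in sequence, or None when the motif is not found."""
    start = sequence.find(motif)      # 0-based, -1 when missing
    if start == -1:
        return None
    end = start + len(motif)          # exclusive end in Python terms
    # 1-based inclusive start; the exclusive 0-based end already equals
    # the inclusive 1-based end, so it is used unchanged.
    return "{}..{}".format(start + 1, end)


print(genbank_location("AAACDRXX", "CDR"))
```

Returning None for a missing motif avoids silently writing a location derived from the -1 that str.find returns.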

Aspose: Text after Ampersand(&) not seen while setting the page header

I have a problem setting page header text that contains an ampersand, such as ‘a&b’. The text after the ‘&’ disappears in the PDF, maybe because it is a reserved character in Aspose. My code looks like this:
PageSetup pageSetup = workbook.getWorksheets().get(worksheetName).getPageSetup();
//calling the function
setHeaderFooter(pageSetup, parameters, criteria)
//function for setting header and footer
def setHeaderFooter(PageSetup pageSetup, parameters, criteria = [:])
{
    def selectedLoa=getSelectedLoa(parameters)
    if(selectedLoa.length()>110){
        String firstLine = selectedLoa.substring(0,110);
        String secondLine = selectedLoa.substring(110);
        if(secondLine.length()>120){
            secondLine = secondLine.substring(0,122)+"...."
        }
        selectedLoa = firstLine+"\n"+secondLine.trim();
    }
    def periodInfo=getPeriodInfo(parameters, criteria)
    def reportingInfo=periodInfo[0]
    def comparisonInfo=periodInfo[1]
    def benchmarkName=getBenchmark(parameters)
    def isNonComparison = criteria.isNonComparison? criteria.isNonComparison:false
    def footerInfo="&BReporting Period:&B " + reportingInfo+"\n"
    if (comparisonInfo && !isNonComparison){
        footerInfo=footerInfo+"&BComparison Period:&B " +comparisonInfo+"\n"
    }
    if (benchmarkName){
        footerInfo+="&BBenchmark:&B "+benchmarkName
    }
    //where I encountered the issue, selectedLoa contains a string with an ampersand
    pageSetup.setHeader(0, pageSetup.getHeader(0) + "\n&\"Lucida Sans,Regular\"&8&K02-074&BPopulation:&B "+selectedLoa)
    //Insertion of footer
    pageSetup.setFooter(0,"&\"Lucida Sans,Regular\"&8&K02-074"+footerInfo)
    def downloadDate = new Date().format("MMMM dd, yyyy")
    pageSetup.setFooter(2,"&\"Lucida Sans,Regular\"&8&K02-074" + downloadDate)
    //Insertion of logo
    try{
        def bucketName = parameters.containsKey('printedRLBucketName')?parameters.get('printedRLBucketName'):null
        def filePath = parameters.containsKey('printedReportLogo')?parameters.get('printedReportLogo'): null
        // Declaring a byte array
        byte[] binaryData
        if(!filePath || filePath.contains("null") || filePath.endsWith("null")){
            filePath = root+"/images/defaultExportLogo.png"
            InputStream is = new FileInputStream(new File(filePath))
            binaryData = is.getBytes()
        }else {
            AmazonS3Client s3client = amazonClientService.getAmazonS3Client()
            S3Object object = s3client.getObject(bucketName, filePath)
            // Getting the bytes out of input stream of S3 object
            binaryData = object.getObjectContent().getBytes()
        }
        // Setting the logo/picture in the right section (2) of the page header
        pageSetup.setHeaderPicture(2, binaryData);
        // Setting the script for the logo/picture
        pageSetup.setHeader(2, "&G");
        // Scaling the picture to correct size
        Picture pic = pageSetup.getPicture(true, 2);
        pic.setLockAspectRatio(true)
        pic.setRelativeToOriginalPictureSize(true)
        pic.setHeight(35)
        pic.setWidth(Math.abs(pic.getWidth() * (pic.getHeightScale() / 100)).intValue());
    }catch (Exception e){
        e.printStackTrace()
    }
}
In this case, I get only ‘a’ in the PDF header; all of the text after the ampersand disappears. Please suggest a solution. I am using Aspose 18.2.

We added a header to a PDF page with the code snippet below and did not notice any problem when an ampersand is included in the header text.
// open document
Document document = new Document(dataDir + "input.pdf");
// create text stamp
TextStamp textStamp = new TextStamp("a&bcdefg");
// set properties of the stamp
textStamp.setTopMargin(10);
textStamp.setHorizontalAlignment(HorizontalAlignment.Center);
textStamp.setVerticalAlignment(VerticalAlignment.Top);
// set text properties
textStamp.getTextState().setFont(new FontRepository().findFont("Arial"));
textStamp.getTextState().setFontSize(14.0F);
textStamp.getTextState().setFontStyle(FontStyles.Bold);
textStamp.getTextState().setFontStyle(FontStyles.Italic);
textStamp.getTextState().setForegroundColor(Color.getGreen());
// iterate through all pages of PDF file
for (int Page_counter = 1; Page_counter <= document.getPages().size(); Page_counter++) {
    // add stamp to all pages of PDF file
    document.getPages().get_Item(Page_counter).addStamp(textStamp);
}
// save output document
document.save(dataDir + "TextStamp_18.8.pdf");
Please make sure you are using Aspose.PDF for Java 18.8 in your environment. For further information on adding a page header, see the "Add Text Stamp in the Header or Footer" section of the documentation.
If you face any problem while adding the header, please share your code snippet and the generated PDF document with us via Google Drive, Dropbox, etc. so that we can investigate.
PS: I work with Aspose as Developer Evangelist.

Well, yes, "&" is a reserved character when inserting headers/footers in an MS Excel spreadsheet via the Aspose.Cells APIs. To get a literal ampersand in the header, place a second ampersand next to it ("&&"). See the sample code for reference:
Workbook wb = new Workbook();
Worksheet ws = wb.getWorksheets().get(0);
ws.getCells().get("A1").putValue("testin..");
String headerText="a&&bcdefg";
PageSetup pageSetup = ws.getPageSetup();
pageSetup.setHeader(0, headerText);
wb.save("f:\\files\\out1.xlsx");
wb.save("f:\\files\\out2.pdf");
Hope this helps a bit.
I am working as Support developer/ Evangelist at Aspose.
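
In the question the ampersand arrives inside a variable (selectedLoa) rather than a literal, so the doubling would have to be applied to that value before it is concatenated with the formatting codes. A minimal sketch of the idea in Python, with a hypothetical helper name; the same one-line replace could be written in Groovy or Java:

```python
def escape_header_text(text):
    """Double every ampersand so it is read as a literal character
    instead of the start of a header/footer formatting code."""
    return text.replace("&", "&&")


# Formatting codes are written by hand and must stay unescaped;
# only the user-supplied value goes through the helper.
FONT_CODES = '&"Lucida Sans,Regular"&8'
header = FONT_CODES + "&BPopulation:&B " + escape_header_text("a&b")
print(header)
```

Escaping only the data, not the whole header string, keeps codes such as &B and &G working.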

Google Apps Script

I've created a simple script that reads through an xml file and posts the results to an SQL database. This works perfectly.
I've put a little if statement in the script to identify orders that have already been posted to SQL. Basically if the transactionID in the input array is higher than the highest transactionID on the SQL server it adds the row values to the output array.
It seems that I am missing a trick here because I am getting "TypeError: Cannot call method "getAttribute" of undefined. (line 18, file "Code")" when trying to compare the current xml row to the last transaction ID.
I've done some searching and whilst I can see people with similar problems the explanations don't make a whole lot of sense to me.
Anyway, here is the relevant part of the code. Note that this all works perfectly without the if() bit.
function getXML() {
  var id = lastTransactionID();
  var xmlSite = UrlFetchApp.fetch("https://api.eveonline.com/corp/WalletTransactions.xml.aspx?KeyID=1111&vCode=1111&accountKey=1001").getContentText();
  var xmlDoc = XmlService.parse(xmlSite);
  var root = xmlDoc.getRootElement();
  var row = new Array();
  row = root.getChild("result").getChild("rowset").getChildren("row");
  var output = new Array();
  var i = 0;
  for (j=0;i<row.length;j++){
    if(row[j].getAttribute("transactionID").getValue()>id){ //Produces: TypeError: Cannot call method "getAttribute" of undefined. (line 18, file "Code")
      output[i] = new Array();
      output[i][0] = row[j].getAttribute("transactionDateTime").getValue();
      output[i][1] = row[j].getAttribute("transactionID").getValue();
      output[i][2] = row[j].getAttribute("quantity").getValue();
      output[i][3] = row[j].getAttribute("typeName").getValue();
      output[i][4] = row[j].getAttribute("typeID").getValue();
      output[i][5] = row[j].getAttribute("price").getValue();
      output[i][6] = row[j].getAttribute("clientID").getValue();
      output[i][7] = row[j].getAttribute("clientName").getValue();
      output[i][8] = row[j].getAttribute("stationID").getValue();
      output[i][9] = row[j].getAttribute("stationName").getValue();
      output[i][10] = row[j].getAttribute("transactionType").getValue();
      output[i][11] = row[j].getAttribute("transactionFor").getValue();
      output[i][12] = row[j].getAttribute("journalTransactionID").getValue();
      output[i][13] = row[j].getAttribute("clientTypeID").getValue();
      i++;
    }
  }
  insert(output,output.length);
}

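I have seen my mistake and corrected it.
The mistake was in the for loop: its condition tested i (the output index) instead of j (the row index), so j could run past the end of row:
for (j=0;i<row.length;j++)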
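
The same filter can be sketched in Python with the standard library's xml.etree.ElementTree, iterating over the rows directly so that there is no separate index to get wrong. The sample XML, its values, and the function name rows_newer_than are made up for illustration, and converting the IDs to int is an assumption that they are always numeric.

```python
import xml.etree.ElementTree as ET

# Made-up data that only mimics the result/rowset/row nesting from the question.
SAMPLE_XML = """<eveapi><result><rowset name="transactions">
  <row transactionID="101" typeName="Tritanium" quantity="5"/>
  <row transactionID="205" typeName="Pyerite" quantity="2"/>
  <row transactionID="309" typeName="Mexallon" quantity="7"/>
</rowset></result></eveapi>"""


def rows_newer_than(xml_text, last_id):
    """Return [transactionID, typeName, quantity] for rows whose
    transactionID is greater than last_id."""
    root = ET.fromstring(xml_text)
    rows = root.find("result").find("rowset").findall("row")
    output = []
    for row in rows:  # iterate the rows themselves, no counter needed
        # Attributes are strings; compare as integers, not as text.
        if int(row.get("transactionID")) > last_id:
            output.append([row.get("transactionID"),
                           row.get("typeName"),
                           row.get("quantity")])
    return output


print(rows_newer_than(SAMPLE_XML, 200))
```

Appending to the output list replaces the manual i counter from the original loop.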

How can I protect an excel sheet, except for a particular cell using Open XML 2.0?

I am generating a report using OpenXML and exporting it to excel. I want to protect the excel sheet except for a particular cell.
If anyone has worked on this before, kindly help
Thanks,
Amolik

PageMargins pageM = worksheetPart.Worksheet.GetFirstChild<PageMargins>();
SheetProtection sheetProtection = new SheetProtection();
sheetProtection.Password = "CC";
sheetProtection.Sheet = true;
sheetProtection.Objects = true;
sheetProtection.Scenarios = true;
ProtectedRanges pRanges = new ProtectedRanges();
ProtectedRange pRange = new ProtectedRange();
ListValue<StringValue> lValue = new ListValue<StringValue>();
lValue.InnerText = "A1:E1"; // range of cells that should remain editable
pRange.SequenceOfReferences = lValue;
pRange.Name = "not allow editing";
pRanges.Append(pRange);
worksheetPart.Worksheet.InsertBefore(sheetProtection, pageM);
worksheetPart.Worksheet.InsertBefore(pRanges, pageM);
ref : http://social.msdn.microsoft.com/Forums/en-US/a6f7502d-3867-4d5b-83a9-b4e0e211068f/how-to-lock-specific-columns-in-xml-workbook-while-exporting-dataset-to-excel?forum=oxmlsdk

Have you tried using the OpenXML Productivity Toolkit?
From what I can see, you have to add a new CellFormat with the attribute ApplyProtection = true to CellFormats, and then append a new Protection with the attribute Locked = false to the CellFormat you created.
CellFormat is an element of CellFormats, which is an element of Stylesheet.
Then you add the following to the Worksheet:
new SheetProtection(){ Password = "CC1A", Sheet = true, Objects = true, Scenarios = true };
I haven't tried this, but it should be easy enough to find out what you need to do with the Productivity Toolkit. I hope this points you, and anyone else trying to do this, in the right direction.
