XPath expression not resulting in same result - parsing

I parse a website with an XPath parser in Swift. The site has multiple pages with the same layout.
The website.
The XPath looks like this:
//div[@class='views-row views-row-4 views-row-even']/div[@class='details']/div[@class='detailscontainer']//tr[7]/td[2]
It works for almost every element on every page, but for some elements the XPath doesn't return the value it should.
I've checked the XPath with a Chrome extension and it is correct, but the parser doesn't find the element.
For example, on this page the 'Typ' row (tr[7]/td[2]) of 'Gymnasium Laufen' doesn't return any value (nil crash), although it should, because the markup is the same as for the other records.
It's also suspicious that the earlier rows work, for example the first 4, while higher rows often don't.
I'm using Kanna as a parser. Here's my code:
for site in 15...49 {
    var url = "https://bildungssystem.educa.ch/de/schools_in_ch?page=2%2C0%2C0%2C0%2C0%2C0%2C0%2C0%2C0%2C0%2C" + site.description + "&title=&field_eduinst_canton_value=All&field_eduinst_type_value=All&field_eduinst_school_grade_value=All&ahah_page_storage[page_build_id]=page-72cd2d0cc7ac814bba6bbfb0b0bc0a3e"
    var urlRequest = NSURL(string: url)
    var error: NSError?
    let html = String(contentsOfURL: urlRequest!, encoding: NSUTF8StringEncoding, error: &error)
    if let doc = Kanna.HTML(html: html!, encoding: NSUTF8StringEncoding) {
        println(doc.title)
        for school in 1...25 {
            if school == 22 && site == 49 {
                break
            }
            for td in 1...7 {
                println(doc.xpath("//div[@class='view-content']/div[starts-with(@class, 'views-row views-row-" + school.description + "')]/div[@class='details']/div[@class='detailscontainer']//tr[" + td.description + "]/td[2]")[0].text)
            }
        }
        println("SITE \(site) DONE")
    }
}

The content you're trying to query may be loaded dynamically (via JavaScript) and thus isn't available in the HTML the server initially returns. The Chrome extension may work because Chrome executes the JavaScript required to build out the DOM. When you download the page yourself (with String(contentsOfURL:) or an NSURLSession, for example) and parse it with Kanna, there is no JavaScript engine evaluating the JS.
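One way to check this diagnosis, sketched under the assumption that the value you expect (such as the school name) appears as plain text in the markup: search the downloaded string for it before handing the string to the parser. `rawHTMLContains` is a hypothetical helper written for this illustration, not part of Kanna.

```swift
import Foundation

/// Reports whether `marker` already appears in the HTML exactly as downloaded,
/// that is, before any JavaScript has had a chance to run.
/// Hypothetical helper for illustration only.
func rawHTMLContains(_ marker: String, in html: String) -> Bool {
    return html.range(of: marker) != nil
}
```

If this returns false for a value that is visible in the browser, the value is most likely inserted by JavaScript, and no XPath will find it in the raw download.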

Related

How to convert escaping closure code to async-await code that uses URLSession?

I’m trying to convert escaping-closure code to async-await code in one file, and I’m stuck on the async-await part. Specifically, I don’t know whether to keep the do block (with catch and resume), or to drop it and assign the result of “try await URLSession.shared.dataTask(with: request)” to a variable and use that variable later, the way File2-GettingRestaurantBusinessDataUsingAsync-AwaitAndUsingCodable.swift (posted further below) does. That line is commented out in File1-GettingSelectedRestaurantBusinessDataUsingAsync-AwaitAndNotUsingCodable.swift below and is meant to be used in the solution attempts.
*Note: I already use async-await and Codable to get the restaurant business data for a city searched by the user; that is done in a different file (and function). The file I’m having trouble with gets the selected restaurant business’s detail info (name, address, business hours, etc.). I’m not using Codable there because I read some of the response data through NSDictionary instead.
How do I correctly implement this async-await concept in my file? The file I’m implementing this in is File1-GettingSelectedRestaurantBusinessDataUsingAsync-AwaitAndNotUsingCodable.swift which is posted further below in this post.
*Update: When I first wrote this question, I thought the problem lay at the line “URLSession.shared.dataTask(with: request) { (data, response, error) in” in File1-GettingSelectedRestaurantBusinessDataUsingAsync-AwaitAndNotUsingCodable.swift. Specifically, I thought I should use the form without the completionHandler (commented as V2 in that file), and I wasn’t sure whether to follow it with a do block, catch, and resume.
I’ve posted some attempted solutions so far, which are incomplete, since I’m having the problems mentioned in the first paragraph of this post. I know they don’t work, but this is my thought process so far. These solution attempts are below the code files that I’m working with which are posted further below.
I used the following article for learning more about async-await before attempting to make this current implementation: https://www.avanderlee.com/swift/async-await/.
My code:
Code files that I’m working with:
File that I’m attempting to implement this escaping closure to async-await concept in:
File1-GettingSelectedRestaurantBusinessDataUsingAsync-AwaitAndNotUsingCodable.swift:
import Foundation
import UIKit
import CoreLocation
extension UIViewController {
func retrieveSelectedRestaurantDetailViewInfo(
selected_restaurant_business_ID: String) async throws -> SelectedRestaurantDetailViewInfoNotUsingCodable? {
// MARK: Make API Call
let apiKey = "API key"
/// Create URL
let baseURL =
"https://api.yelp.com/v3/businesses/\(selected_restaurant_business_ID)"
let url = URL(string: baseURL)
/// Creating Request
var request = URLRequest(url: url!)
request.setValue("Bearer \(apiKey)", forHTTPHeaderField: "Authorization")
request.httpMethod = "GET"
///Initialize session and task
//V1: Code for using escaping closure version code.
URLSession.shared.dataTask(with: request) { (data, response, error) in
//This version is commented out right now, to show where I'm at with this problem for clarity. This version is included in the solution attempts; both SolutionAttempt1 and SolutionAttempt2.
if let error = error {
completionHandler(nil, error)
}
//V2: Code for what I think is correct for using async-await version code. Not using the completionHandler here.
// URLSession.shared.dataTask(with: request)
do {
/// Read data as JSON
let json = try JSONSerialization.jsonObject(with: data!, options: [])
/// Main dictionary
guard let responseDictionary = json as? NSDictionary else {return}
//Code for accessing restaurant detail view info, and assigning it to the selectedRestaurantDetailViewInfo view model that's a struct.
var selectedVenue = SelectedRestaurantDetailViewInfoNotUsingCodable()
//Rest of code for accessing restaurant detail view info, and assigning it to the selectedRestaurantDetailViewInfo view model that's a struct.
selectedVenue.name = responseDictionary.value(forKey: "name") as? String
//*Rest of code for getting business info including address, hours, etc..*
//Commented out for now, because am going with version 2 below.
//V1: Uses escaping closure code version.
// completionHandler(selectedVenue, nil)
//V2: Used for Async/Await code version.
return selectedVenue
} catch {
print("Caught error")
}
}.resume()
}
}
*Update: Below is the new file that shows the code that calls the function with the URLRequest in it in File1-GettingSelectedRestaurantBusinessDataUsingAsync-AwaitAndNotUsingCodable.swift, that uses the async-await concept, as mentioned in a response to a comment in this post:
File0.5-FileWithJustCodeThatCallsTheFunctionForMakingTheURLRequestInFile1.swift:
async let responseSelectedVenueDetailViewInfoNotUsingCodable = try await retrieveSelectedRestaurantDetailViewInfo(selected_restaurant_business_ID: venues.id)
The file below (File2-GettingRestaurantBusinessDataUsingAsync-AwaitAndUsingCodable.swift) uses async-await to get the initial restaurant business data after the user selects a city. I’m using it as a reference for converting the escaping-closure code in File1-GettingSelectedRestaurantBusinessDataUsingAsync-AwaitAndNotUsingCodable.swift above to async-await:
File2-GettingRestaurantBusinessDataUsingAsync-AwaitAndUsingCodable.swift:
import UIKit
import Foundation
import CoreLocation
class YelpApiSelectedRestaurantDetailViewInfo {
    private var apiKey: String
    init(apiKey: String) {
        self.apiKey = apiKey
    }
    func searchBusinessDetailViewInfo(selectedRestaurantBusinessID: String) async throws -> SelectedRestaurantDetailViewInfo {
        var resultsForTheSelectedRestaurantDetailViewInfo: SelectedRestaurantDetailViewInfo
        // MARK: Make URL Request.
        var urlComponents = URLComponents(string: "https://api.yelp.com/v3/businesses/\(selectedRestaurantBusinessID)")
        guard let url = urlComponents?.url else {
            throw URLError(.badURL)
        }
        var request = URLRequest(url: url)
        request.setValue("Bearer \(self.apiKey)", forHTTPHeaderField: "Authorization")
        let (data, _) = try await URLSession.shared.data(for: request)
        let businessDetailViewInfoResults = try JSONDecoder().decode(SelectedRestaurantDetailViewInfo.self, from: data)
        resultsForTheSelectedRestaurantDetailViewInfo = businessDetailViewInfoResults
        return resultsForTheSelectedRestaurantDetailViewInfo
    }
}
Solution Attempts:
*Note:
-Both of the below solution attempt code snippets start at the line of code: URLSession.shared.dataTask(with: request) { (data, response, error) in in File1-GettingSelectedRestaurantBusinessDataUsingAsync-AwaitAndNotUsingCodable.swift, and goes through the rest of that file, and ends at the same end point as that file (as file File1-GettingSelectedRestaurantBusinessDataUsingAsync-AwaitAndNotUsingCodable.swift).
-Also, the file that these solution attempts are to be implemented in is File1-GettingSelectedRestaurantBusinessDataUsingAsync-AwaitAndNotUsingCodable.swift.
SolutionAttempt1-DoesntUseDoBlock.swift:
///Initialize session and task
try await URLSession.shared.dataTask(with: request)
/// Read data as JSON
let json = try JSONSerialization.jsonObject(with: data!, options: [])
/// Main dictionary
guard let responseDictionary = json as? NSDictionary else {return}
//Code for accessing restaurant detail view info, and assigning it to the selectedRestaurantDetailViewInfo view model that's a struct.
var selectedVenue = SelectedRestaurantDetailViewInfoNotUsingCodable()
//Rest of code for accessing restaurant detail view info, and assigning it to the selectedRestaurantDetailViewInfo view model that's a struct.
selectedVenue.name = responseDictionary.value(forKey: "name") as? String
//*Rest of code for getting business info including address, business hours, etc..*
return selectedVenue
} catch {
print("Caught error")
}
}.resume()
}
}
*Note about the following solution attempt (SolutionAttempt2-DoesUseDoBlock.swift): The do block may not need to be indented under the “try await URLSession.shared.dataTask(with: request)” line. I indented it anyway because it seems the do block would need to be within that line’s scope: the original escaping-closure version of File1-GettingSelectedRestaurantBusinessDataUsingAsync-AwaitAndNotUsingCodable.swift (not the file being edited here) had the do block inside the scope of the “URLSession.shared.dataTask(with: request) { (data, response, error) in” line, which sits at the same position as the “try await URLSession.shared.dataTask(with: request)” line in the file below.
SolutionAttempt2-DoesUseDoBlock.swift:
///Initialize session and task
try await URLSession.shared.dataTask(with: request)
do {
/// Read data as JSON
let json = try JSONSerialization.jsonObject(with: data!, options: [])
/// Main dictionary
guard let responseDictionary = json as? NSDictionary else {return}
//Code for accessing restaurant detail view info, and assigning it to the selectedRestaurantDetailViewInfo view model that's a struct.
var selectedVenue = SelectedRestaurantDetailViewInfoNotUsingCodable()
//Rest of code for accessing restaurant detail view info, and assigning it to the selectedRestaurantDetailViewInfo view model that's a struct.
selectedVenue.name = responseDictionary.value(forKey: "name") as? String
//*Rest of code for getting business info including address, business hours, etc.*
return selectedVenue
} catch {
print("Caught error")
}
}.resume()
}
}
Thanks!
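For readers following along, here is a minimal sketch of one way the conversion could look; it is an illustration under stated assumptions, not a confirmed answer from this thread. The callback-based dataTask is bridged with withCheckedThrowingContinuation, and the JSONSerialization step is moved into a throwing function so no do/catch or resume() is needed at the call site. `VenueDetail`, `VenueError`, `parseVenueDetail`, `fetchData` and `retrieveVenueDetail` are hypothetical names standing in for the question's types and function.

```swift
import Foundation
#if canImport(FoundationNetworking)
import FoundationNetworking  // URLSession lives in this module on Linux
#endif

/// Hypothetical stand-in for SelectedRestaurantDetailViewInfoNotUsingCodable.
struct VenueDetail {
    var name: String?
}

/// Hypothetical error type for this sketch.
enum VenueError: Error {
    case invalidURL
    case notADictionary
    case emptyResponse
}

/// Parsing step: throws instead of silently returning from inside a closure.
func parseVenueDetail(from data: Data) throws -> VenueDetail {
    let json = try JSONSerialization.jsonObject(with: data, options: [])
    guard let dictionary = json as? [String: Any] else {
        throw VenueError.notADictionary
    }
    return VenueDetail(name: dictionary["name"] as? String)
}

/// Bridges the completion-handler API into async/await.
/// The continuation is resumed exactly once on every path.
func fetchData(for request: URLRequest) async throws -> Data {
    return try await withCheckedThrowingContinuation { continuation in
        URLSession.shared.dataTask(with: request) { data, _, error in
            if let error = error {
                continuation.resume(throwing: error)
            } else if let data = data {
                continuation.resume(returning: data)
            } else {
                continuation.resume(throwing: VenueError.emptyResponse)
            }
        }.resume()
    }
}

/// Async version of the request; errors propagate to the caller via `throws`.
func retrieveVenueDetail(businessID: String, apiKey: String) async throws -> VenueDetail {
    guard let url = URL(string: "https://api.yelp.com/v3/businesses/\(businessID)") else {
        throw VenueError.invalidURL
    }
    var request = URLRequest(url: url)
    request.httpMethod = "GET"
    request.setValue("Bearer \(apiKey)", forHTTPHeaderField: "Authorization")
    let data = try await fetchData(for: request)
    return try parseVenueDetail(from: data)
}
```

Where the async URLSession.shared.data(for:) call used in File2 is available, it could replace `fetchData` entirely; the continuation is shown here only because it maps one-to-one onto the escaping-closure version.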

HTTP DELETE Works From Browser But Not From Postman or IOS App

When attempting an HTTP DELETE request to my REST API, I continually get a 401 error when using the following code. I do not get this error making any other type of request. I have provided the function that makes the request below.
func deleteEvent(id: Int){
    eventUrl.append(String(id))
    let request = NSMutableURLRequest(url: NSURL(string: eventUrl)! as URL)
    request.httpMethod = "DELETE"
    print(eventUrl)
    eventUrl.removeLast()
    print(self.token!)
    request.allHTTPHeaderFields = ["Authorization": "Token \(self.token)"]
    let task = URLSession.shared.dataTask(with: request as URLRequest) { data, response, error in
        if error != nil {
            print("error=\(String(describing: error))")
            //put variable that triggers error try again view here
            return
        }
        print("response = \(String(describing: response))")
    }
    task.resume()
}
When sending the DELETE request with Postman, the REST API just returns the data I want to delete but does not delete it. For reference, I have posted the view and permissions classes associated with this request. Any help understanding why this may be resulting in an error is greatly appreciated!
Views.py
class UserProfileFeedViewSet(viewsets.ModelViewSet):
    """Handles creating, reading and updating profile feed items"""
    authentication_classes = (TokenAuthentication,)
    serializer_class = serializers.ProfileFeedItemSerializer
    queryset = models.ProfileFeedItem.objects.all()
    permission_classes = (permissions.UpdateOwnStatus, IsAuthenticated)

    def perform_create(self, serializer):
        """Sets the user profile to the logged in user"""
        #
        serializer.save(user_profile=self.request.user)
Permissions.py
class UpdateOwnStatus(permissions.BasePermission):
    """Allow users to update their own status"""

    def has_object_permission(self, request, view, obj):
        """Check the user is trying to update their own status"""
        if request.method in permissions.SAFE_METHODS:
            return True
        return obj.user_profile.id == request.user.id
HEADER SENT WITH DELETE REQUEST VIA POSTMAN
Preface: You leave out too much relevant information from the question for it to be properly answered. Your Swift code looks, and please don't be offended, a bit beginner-ish or as if it had been migrated from Objective-C without much experience.
I don't know why Postman fails, but I see some red flags in the Swift code you might want to look into to figure out why your iOS app fails.
I first noticed that eventUrl seems to be a String property of the type that contains the deleteEvent function. You mutate it by appending the event id, construct a URL from it (weirdly, see below), then mutate it back again. While this in itself is not necessarily wrong, it might open the door to race conditions, depending on how your app works overall.
More importantly: Does your eventUrl end in a "/"? I assume your DELETE endpoint is of the form https://somedomain.com/some/path/<id>, right? Now if eventUrl just contains https://somedomain.com/some/path, your code constructs https://somedomain.com/some/path<id>. The last slash is missing, which definitely throws your backend off (how, I cannot say, as that depends on how the path is resolved in your server app).
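The difference can be sketched as follows. `naiveURLString` and `deletionURL` are hypothetical helpers written for this illustration; the base URL is the placeholder from the paragraph above.

```swift
import Foundation

/// Mirrors the string concatenation in the question: no separator is added.
func naiveURLString(base: String, id: Int) -> String {
    return base + String(id)
}

/// Lets Foundation insert the path separator when it is missing.
func deletionURL(base: String, id: Int) -> URL? {
    return URL(string: base)?.appendingPathComponent(String(id))
}
```

With a base of https://somedomain.com/some/path, the first helper yields a path ending in "path7" for id 7, while the second yields one ending in "path/7" whether or not the base has a trailing slash.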
It's hard to say what else is going wrong in the iOS app, but other than this potential pitfall I'd really recommend using proper Swift types where possible. Here's a cleaned-up version of your method; hopefully it helps you a bit when debugging:
func deleteEvent(id: Int) {
    guard let baseUrl = URL(string: eventUrl), let token = token else {
        // add more error handling code here and/or put a breakpoint here to inspect
        print("Could not create proper eventUrl or token is nil!")
        return
    }
    let deletionUrl = baseUrl.appendingPathComponent("\(id)")
    print("Deletion URL with appended id: \(deletionUrl.absoluteString)")
    var request = URLRequest(url: deletionUrl)
    request.httpMethod = "DELETE"
    print(token) // ensure this is correct
    request.allHTTPHeaderFields = ["Authorization": "Token \(token)"]
    let task = URLSession.shared.dataTask(with: request) { data, response, error in
        if let error = error {
            print("Encountered network error: \(error)")
            return
        }
        if let httpResponse = response as? HTTPURLResponse {
            // this is basically also debugging code
            print("Endpoint responded with status: \(httpResponse.statusCode)")
            print(" with headers:\n\(httpResponse.allHeaderFields)")
        }
        // Debug output of the data:
        if let data = data {
            let payloadAsSimpleString = String(data: data, encoding: .utf8) ?? "(can't parse payload)"
            print("Response contains payload\n\(payloadAsSimpleString)")
        }
    }
    task.resume()
}
This is obviously still limited in terms of error handling, etc., but a little more swifty and contains more console output that will hopefully be helpful.
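One detail the cleaned-up version changes, and which may be worth checking against the 401: the original code interpolates self.token, which appears to be an Optional (it is force-unwrapped in the print call), directly into the header. The sketch below shows what that produces; both helper names are hypothetical, and the assumption that the token is a String? is mine, not confirmed by the question.

```swift
import Foundation

/// Builds the header value the way the question's code effectively does.
/// String(describing:) makes explicit what "\(token)" yields for an Optional.
func naiveAuthorizationHeader(token: String?) -> String {
    return "Token \(String(describing: token))"
}

/// Unwraps first, so the header carries only the token itself.
func authorizationHeader(token: String?) -> String? {
    guard let token = token else { return nil }
    return "Token \(token)"
}
```

For a token of "abc", the first helper returns `Token Optional("abc")`, which a token-authenticated backend would not recognise, while the second returns `Token abc`.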
The last important thing is that you have to ensure iOS does not simply block your request due to App Transport Security: make sure your Info.plist has the expected entries if needed (see also here for a quick intro).

Issue deep linking to MS Excel from iOS app

I am trying to open an Excel document that is located on a server. I wrote the following code but it always returns false for UIApplication.shared.canOpenURL(url as URL)
I think I am missing some requirement for deep linking to Excel. Why is iOS not able to understand ms-excel:ofe|u| format?
@objc static func openExcel() {
    let originalString = "http://s000.tinyupload.com/download.php?file_id=23290165129849240725&t=2329016512984924072514118"
    let encodedString = originalString.addingPercentEncoding(withAllowedCharacters: .urlHostAllowed)
    let encodedURLString = "ms-excel:ofe|u|" + encodedString! + "|n|TestDoc.xlsx|a|App"
    if let url = NSURL(string: encodedURLString),
        UIApplication.shared.canOpenURL(url as URL) {
        UIApplication.shared.openURL(url as URL)
    } else if let itunesUrl = NSURL(string: "https://itunes.apple.com/us/app/microsoft-excel/id586683407?mt=8&uo=4"), UIApplication.shared.canOpenURL(itunesUrl as URL) {
        UIApplication.shared.openURL(itunesUrl as URL)
    }
}
I have analyzed your code and found some mistakes. First, your URL redirects somewhere else; per the Microsoft documentation, Office can't handle redirecting URLs:
The URL has to be encoded and must be a direct link to the file (not a
redirect). If the URL is in a format that Office cannot handle, or the
download simply fails, Office will not return the user to the invoking
application.
Here is Microsoft Documentation Link
The second mistake is that you only encode the string containing the site URL. Everything after the ms-excel: scheme should be treated as part of the URL and encoded.
Because of the improper encoding, let url = URL(string: encodedURLString) results in nil, which is why it does not work as expected.
Here is working example code:
@objc static func openExcel() {
    // Replace the URL below with yours; this sample link may no longer work.
    let originalString = "ofe|u|https://pgcconline.blackboard.com/webapps/dur-browserCheck-bb_bb60/samples/sample.xlsx"
    let encodedString = originalString.addingPercentEncoding(withAllowedCharacters: .urlQueryAllowed)
    let encodedURLString = "ms-excel:" + encodedString!
    if let url = URL(string: encodedURLString),
        UIApplication.shared.canOpenURL(url) {
        UIApplication.shared.openURL(url)
    } else if let itunesUrl = NSURL(string: "https://itunes.apple.com/us/app/microsoft-excel/id586683407?mt=8&uo=4"), UIApplication.shared.canOpenURL(itunesUrl as URL) {
        UIApplication.shared.openURL(itunesUrl as URL)
    }
}
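To see what the encoding step changes, the URL construction can be pulled out into a small function that runs without UIKit. This is only a sketch: `excelDeepLink` is a hypothetical helper, and the example.com address in the note below is a placeholder.

```swift
import Foundation

/// Builds an ms-excel: deep link by percent-encoding everything after the scheme.
/// Hypothetical helper for illustration only.
func excelDeepLink(forFileAt fileURL: String) -> URL? {
    let payload = "ofe|u|" + fileURL
    // "|" is not in the query-allowed set, so it is percent-encoded as %7C.
    guard let encoded = payload.addingPercentEncoding(withAllowedCharacters: .urlQueryAllowed) else {
        return nil
    }
    return URL(string: "ms-excel:" + encoded)
}
```

Calling it with a placeholder such as https://example.com/sample.xlsx gives a non-nil URL whose scheme is ms-excel and whose pipe characters appear as %7C.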
Note: From iOS 9 you must whitelist any URL schemes your App wants to query in Info.plist under the LSApplicationQueriesSchemes key (an array of strings):
For example, in our case that array needs to contain the string ms-excel.
When I try to open the URL in the question above, I get redirected to this URL, so my guess is that your code is fine; the Excel file you're trying to open is probably really an HTML page, since tinyupload apparently blocks direct links to files.
Maybe try opening a direct Excel file download link such as https://pgcconline.blackboard.com/webapps/dur-browserCheck-bb_bb60/samples/sample.xlsx (it was the first Google result for 'xlsx file sample download').

How can I pull the grades on the website, and perform the login using a POST request?

I am trying to pull my school grades from the website which stores all my grades, but I am having trouble logging in using HTTP requests, and pulling the information of the next page. Any help is appreciated :)
override func viewDidLoad() {
    super.viewDidLoad()
    let myUrl = NSURL(string: "https://homeaccess.katyisd.org/HomeAccess/Account/LogOn?ReturnUrl=%2fhomeaccess%2f")
    let request = NSMutableURLRequest(URL: myUrl!)
    request.HTTPMethod = "POST"
    let postString = "User_Name=**hidden**&Password=**hidden**"
    request.HTTPBody = postString.dataUsingEncoding(NSUTF8StringEncoding)
    let task = NSURLSession.sharedSession().dataTaskWithRequest(request){
        data,response,error in
        if(error != nil){
            print("error=\(error)")
            return
        }
        print("response = \(response)")
        // Print out response body
        let responseString = NSString(data: data!, encoding: NSUTF8StringEncoding)
        print("responseString = \(responseString)")
        //Let’s convert response sent from a server side script to a NSDictionary object:
        do{
            var myJSON = try NSJSONSerialization.JSONObjectWithData(data!, options: .MutableLeaves) as? NSDictionary
            if let parseJSON = myJSON {
                // Now we can access value of First Name by its key
                var firstNameValue = parseJSON["User_Name"] as? String
                print("firstNameValue: \(firstNameValue)")
            }
        }catch{
            print(error)
        }
    }
}
First, you need task.resume() after defining the task in order to start the connection loading, otherwise the object will be created and nothing will actually happen.
According to this error you posted, there's an SSL verification error on the site you are trying to access. The most secure option is to fix the SSL on the site, but I presume that is beyond your control in this case. The easier fix is to bypass the SSL error by adding "App Transport Security Settings" > "Allow Arbitrary Loads" = YES in your Info.plist, as @André suggested. Or, if you are only using the one domain, add an exception for that particular domain under NSExceptionDomains. See this question for more info.
According to this error you posted, a JSON parsing error is occurring. It is currently being caught and printed by your catch block, so the data is not actually processed. In your case, this is occurring because the response from Home Access Center is HTML, not JSON, so the JSON parser is failing. You are probably looking for an HTML parser. Swift does not have one built-in; look at this question for some example open-source options.
I have actually created a program that interfaces with Home Access Center. Sadly, there is no public API available -- APIs typically return JSON, which is easier to work with. Instead, you will need to use an HTML parser to analyze the page that is meant for human users and cookies to fake that a human user is logging on.
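Since the server may answer with HTML rather than JSON, one defensive pattern is to try JSON first and fall back to treating the body as text for an HTML parser. The sketch below is illustrative only, in current Swift syntax; `ResponseBody` and `decodeBody` are hypothetical names, and nothing here is specific to Home Access Center.

```swift
import Foundation

/// Hypothetical result type: either a JSON dictionary or plain text (e.g. HTML).
enum ResponseBody {
    case json([String: Any])
    case text(String)
}

/// Tries JSON first; if that fails, falls back to decoding the bytes as UTF-8 text.
func decodeBody(_ data: Data) -> ResponseBody? {
    if let object = try? JSONSerialization.jsonObject(with: data, options: []),
       let dictionary = object as? [String: Any] {
        return .json(dictionary)
    }
    if let text = String(data: data, encoding: .utf8) {
        return .text(text)
    }
    return nil
}
```

A body such as `<html><body>Login</body></html>` comes back as `.text`, which can then be handed to whichever HTML parser you choose.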
Add task.resume() at the end of your code. Also add the following to your Info.plist file:

Asynchronous Request not properly updating variables

I'm attempting to grab source from a page online to parse it and put the information into an array. The request goes through fine, and html has the source as a string as I would like. The problem is, after this function, even though html and sbcc are global, the values I've added disappear. Even the "Test" in the parse function does not appear. I think it has something to do with the request being asynchronous? I've searched, but it's only brought me to ideas I didn't quite understand, or doesn't really cover my specific question. My code snippets are below, if anyone can help I'd greatly appreciate it.
let url = NSURL(string: "http://www.google.com");
var html = String()
var sbcc = courselisting();
func getSource(url: NSURL){
    let request = NSURLRequest(URL: url)
    NSURLConnection.sendAsynchronousRequest(request, queue: NSOperationQueue.mainQueue()) {
        (response, data, error) in
        if (error != nil) {
            println("whoops, something went wrong")
            let alert : UIAlertView = UIAlertView(title: "Oops!", message: "Something went wrong", delegate: nil, cancelButtonTitle: "Reload")
            alert.show()
        } else {
            //println(self.html)
            self.html = NSString(data: data, encoding: NSUTF8StringEncoding)!
            self.parse()
        }
    }
}
func parse() {
    sbcc.subjects.append("Test")
    sbcc.subjects.append(html.substringToIndex(advance(html.startIndex, 2)))
    println(self.html)
}
EDIT: Following zisoft's advice, I removed the passing of the html string into parse() in my code above; however, my global string html still does not have the appended values. For more info, here is the main portion of viewDidLoad:
getSource(url!);
println(self.html)
It prints blank in viewDidLoad, but the println inside parse prints the proper HTML.
Parameters are passed by value to functions. So you are appending to a copy of html.
Since you declared html in a global scope there is no need to pass it to the function:
{
    ...
    self.parse()
}
func parse() {
    sbcc.subjects.append("Test")
    sbcc.subjects.append(html.substringToIndex(advance(html.startIndex, 2)))
    println(self.html)
}
I've solved it, using Rob's answer here. Basically, I need to do all the work that depends on the response within the asynchronous callback and just have everything refresh when it completes.
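The ordering problem behind this can be sketched without any networking. `FakeLoader` below is a hypothetical stand-in for an asynchronous request: load() returns immediately, and the callback only runs later, when finish(with:) is called. It is written in current Swift syntax rather than the older syntax used in the question.

```swift
import Foundation

/// Hypothetical stand-in for an asynchronous request, for illustration only.
final class FakeLoader {
    private var pending: [(String) -> Void] = []

    /// Returns immediately; the completion is stored and called later.
    func load(completion: @escaping (String) -> Void) {
        pending.append(completion)
    }

    /// Simulates the response arriving.
    func finish(with html: String) {
        let callbacks = pending
        pending.removeAll()
        callbacks.forEach { $0(html) }
    }
}

/// Records the order in which the two pieces of code actually run.
func demonstrateOrdering() -> [String] {
    let loader = FakeLoader()
    var log: [String] = []
    loader.load { html in
        // Work that needs the response belongs here.
        log.append("callback saw \(html.count) characters")
    }
    log.append("line after load() ran")
    loader.finish(with: "<html>")
    return log
}
```

The line after load() is logged first, which is why println(self.html) in viewDidLoad prints blank while the println inside parse() shows the HTML.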
