Extracting domain name from a DNS Response packet using dpkt library - parsing

I'm trying to generate a list of all domain names and their corresponding IP addresses from a pcap file, using the dpkt library available here.
My code is mostly based on this:
import dpkt
import socket

filename = raw_input('Type filename of pcap file (without extension): ')
path = 'c:/temp/PcapParser/' + filename + '.pcap'
f = open(path, 'rb')
pcap = dpkt.pcap.Reader(f)

for ts, buf in pcap:
    # make sure we are dealing with IP traffic
    try:
        eth = dpkt.ethernet.Ethernet(buf)
    except:
        continue
    if eth.type != 2048:  # ETH_TYPE_IP
        continue
    # make sure we are dealing with the UDP protocol
    try:
        ip = eth.data
    except:
        continue
    if ip.p != 17:  # IP_PROTO_UDP
        continue
    # filter on UDP assigned ports for DNS
    try:
        udp = ip.data
    except:
        continue
    if udp.sport != 53 and udp.dport != 53:
        continue
    # make the DNS object out of the UDP data and
    # check for it being a RR (answer) and for opcode QUERY
    try:
        dns = dpkt.dns.DNS(udp.data)
    except:
        continue
    if dns.qr != dpkt.dns.DNS_R:
        continue
    if dns.opcode != dpkt.dns.DNS_QUERY:
        continue
    if dns.rcode != dpkt.dns.DNS_RCODE_NOERR:
        continue
    if len(dns.an) < 1:
        continue
    # process and print responses based on record type
    for answer in dns.an:
        if answer.type == 1:  # DNS_A
            print 'Domain Name: ', answer.name, '\tIP Address: ', socket.inet_ntoa(answer.rdata)
The problem is that answer.name is not good enough for me, because I need the original domain name requested, not its CNAME representation. For example, one of the original DNS requests was for www.paypal.com, but its CNAME representation is paypal.112.2o7.net.
I looked closely at the code and realized I'm actually extracting the information from the DNS response (and not the query). Then I looked at the response packet in Wireshark and saw that the original domain is there, both under 'queries' and under 'answers', so my question is: how can I extract it?
Thanks!

To acquire the name from the "Questions" section of the DNS response, via the dns.qd object provided by dpkt.dns, all I needed to do was this:
for qname in dns.qd:
    print qname.name
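For context, here is how that fits into the answer loop above (a minimal sketch, keeping the Python 2 print style of the question; dpkt.dns.DNS_A is dpkt's constant for A records):
import dpkt
import socket

# For each A-record answer, print the originally queried name
# from the Questions section instead of the (possibly CNAME'd) answer name.
for answer in dns.an:
    if answer.type == dpkt.dns.DNS_A:
        for q in dns.qd:
            print 'Query: ', q.name, '\tIP Address: ', socket.inet_ntoa(answer.rdata)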

Related

requests.exceptions.ChunkedEncodingError: ('Connection broken: IncompleteRead(0 bytes read, 512 more expected)', IncompleteRead(0 bytes read, 512 more expected))

I wanted to write a program to fetch tweets from Twitter and then do sentiment analysis. I wrote the following code and got the error even after importing all the necessary libraries. I'm relatively new to data science, so please help me.
I could not understand the reason for this error:
class TwitterClient(object):
    def __init__(self):
        # keys and tokens from the Twitter Dev Console
        consumer_key = 'XXXXXXXXX'
        consumer_secret = 'XXXXXXXXX'
        access_token = 'XXXXXXXXX'
        access_token_secret = 'XXXXXXXXX'
        api = Api(consumer_key, consumer_secret, access_token, access_token_secret)
        # attempt authentication
        try:
            # create OAuthHandler object
            self.auth = OAuthHandler(consumer_key, consumer_secret)
            # set access token and secret
            self.auth.set_access_token(access_token, access_token_secret)
            # create tweepy API object to fetch tweets
            self.api = tweepy.API(self.auth)
        except:
            print("Error: Authentication Failed")

def preprocess(tweet, ascii=True, ignore_rt_char=True, ignore_url=True,
               ignore_mention=True, ignore_hashtag=True, letter_only=True,
               remove_stopwords=True, min_tweet_len=3):
    sword = stopwords.words('english')
    if ascii:  # maybe remove lines with ANY non-ascii character
        for c in tweet:
            if not (0 < ord(c) < 127):
                return ''
    tokens = tweet.lower().split()  # to lower, split
    res = []
    for token in tokens:
        if remove_stopwords and token in sword:  # ignore stopword
            continue
        if ignore_rt_char and token == 'rt':  # ignore 'retweet' symbol
            continue
        if ignore_url and token.startswith('https:'):  # ignore url
            continue
        if ignore_mention and token.startswith('@'):  # ignore mentions
            continue
        if ignore_hashtag and token.startswith('#'):  # ignore hashtags
            continue
        if letter_only:  # ignore digits
            if not token.isalpha():
                continue
        elif token.isdigit():  # otherwise unify digits
            token = '<num>'
        res += token,  # append token
    if min_tweet_len and len(res) < min_tweet_len:  # ignore tweets with fewer than n tokens
        return ''
    else:
        return ' '.join(res)

for line in api.GetStreamSample():
    if 'text' in line and line['lang'] == u'en':  # step 1
        text = line['text'].encode('utf-8').replace('\n', ' ')  # step 2
        p_t = preprocess(text)
Assume all the necessary libraries are imported. The error occurs on the line with api.GetStreamSample():
for line in api.GetStreamSample():
    if 'text' in line and line['lang'] == u'en':  # step 1
        text = line['text'].encode('utf-8').replace('\n', ' ')  # step 2
        p_t = preprocess(text)
I searched the internet for the reason for this error but could not find any solution.
Error was:
requests.exceptions.ChunkedEncodingError: ('Connection broken: IncompleteRead(0 bytes read, 512 more expected)', IncompleteRead(0 bytes read, 512 more expected))
I'm using Python 2.7 and requests version 2.14, the latest one.
If you set stream to True when making a request, Requests cannot release the connection back to the pool unless you consume all the data or call Response.close. This can lead to inefficiency with connections. If you find yourself partially reading request bodies (or not reading them at all) while using stream=True, you should make the request within a with statement to ensure it’s always closed:
with requests.get('http://httpbin.org/get', stream=True) as r:
    # Do things with the response here.
    pass
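For illustration, a minimal sketch of consuming a streamed response in full inside the with block (the httpbin URL is kept from the snippet above; the chunk size is an arbitrary choice):
import requests

# Consume the streamed body completely so the connection can be
# released back to the pool when the with block exits.
with requests.get('http://httpbin.org/get', stream=True) as r:
    for chunk in r.iter_content(chunk_size=8192):
        if chunk:  # filter out keep-alive chunks
            print(len(chunk))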
I had the same problem, but without stream. As stone mini said, just apply the with clause to make sure your request is closed before making a new request:
with requests.request("POST", url_base, json=task, headers=headers) as report:
    print('report: ', report)
Actually, the problem is with your Django 2.7 (or earlier) based application: those Django versions by default allow a 2.5 MB upload memory size for the request body.
I was facing the same issue with a Django 2.7 based application. I just updated the settings.py file of my Django application where my urls (endpoints) were working:
DATA_UPLOAD_MAX_MEMORY_SIZE = None
I just added the above variable in my application's settings.py file.
You can also read about that here.
I'm pretty sure this will work for you.
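If you would rather not disable the check entirely, a hedged alternative is to raise the limit to a concrete value instead (the 10 MB figure below is an arbitrary example; 2.5 MB is Django's documented default):
# settings.py
# Django's default is 2621440 bytes (2.5 MB); None disables the check,
# while a concrete limit keeps some protection against huge request bodies.
DATA_UPLOAD_MAX_MEMORY_SIZE = 10 * 1024 * 1024  # 10 MB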

Yahoo Finance Current Quotes Error 999 No Definition Found [duplicate]

For months I've been using a url like this, from perl:
http://finance.yahoo.com/d/quotes.csv?s=$s&f=ynl1 #returns yield, name, price;
Today, 11/1/17, it suddenly returns a 999 error.
Is this a glitch, or has Yahoo terminated the service?
I get the error even if I enter the URL directly into a browser as, eg:
http://finance.yahoo.com/d/quotes.csv?s=INTC&f=ynl1
so it doesn't seem to be a 'crumb' problem.
Note: This is NOT a question which has been answered in the past!
It was working yesterday. That it happened on the first of the month is suspicious.
As noted in the other answers and elsewhere (e.g. https://stackoverflow.com/questions/47076404/currency-helper-of-yahoo-sorry-unable-to-process-request-at-this-time-erro/47096766#47096766), Yahoo has indeed ceased operation of the Yahoo Finance API. However, as a workaround, you can access a trove of financial information, in JSON format, for a given ticker symbol, by making an HTTPS GET request to https://finance.yahoo.com/quote/SYMBOL (e.g. https://finance.yahoo.com/quote/MSFT). If you make a GET request to the above URL, you'll see that the financial data is contained within the response in JSON format. The following python3 script shows how you can parse individual values that you may be interested in:
import requests
import json
symbol = 'MSFT'
url = 'https://finance.yahoo.com/quote/' + symbol
resp = requests.get(url)
# parse the section from the html document containing the raw json data that we need
# you can write jsonstr to a file, then open the file in a web browser to browse the structure of the json data
r = str(resp.content, 'utf-8')
i1 = 0
i1 = r.find('root.App.main', i1)
i1 = r.find('{', i1)
i2 = r.find("\n", i1)
i2 = r.rfind(';', i1, i2)
jsonstr = r[i1:i2]
# load the raw json data into a python data object
data = json.loads(jsonstr)
# pull the values that we are interested in
name = data['context']['dispatcher']['stores']['QuoteSummaryStore']['price']['shortName']
price = data['context']['dispatcher']['stores']['QuoteSummaryStore']['price']['regularMarketPrice']['raw']
change = data['context']['dispatcher']['stores']['QuoteSummaryStore']['price']['regularMarketChange']['raw']
shares_outstanding = data['context']['dispatcher']['stores']['QuoteSummaryStore']['defaultKeyStatistics']['sharesOutstanding']['raw']
market_cap = data['context']['dispatcher']['stores']['QuoteSummaryStore']['summaryDetail']['marketCap']['raw']
trailing_pe = data['context']['dispatcher']['stores']['QuoteSummaryStore']['summaryDetail']['trailingPE']['raw']
earnings_per_share = data['context']['dispatcher']['stores']['QuoteSummaryStore']['defaultKeyStatistics']['trailingEps']['raw']
forward_annual_dividend_rate = data['context']['dispatcher']['stores']['QuoteSummaryStore']['summaryDetail']['dividendRate']['raw']
forward_annual_dividend_yield = data['context']['dispatcher']['stores']['QuoteSummaryStore']['summaryDetail']['dividendYield']['raw']
# print the values
print('Symbol:', symbol)
print('Name:', name)
print('Price:', price)
print('Change:', change)
print('Shares Outstanding:', shares_outstanding)
print('Market Cap:', market_cap)
print('Trailing PE:', trailing_pe)
print('Earnings Per Share:', earnings_per_share)
print('Forward Annual Dividend Rate:', forward_annual_dividend_rate)
print('Forward_annual_dividend_yield:', forward_annual_dividend_yield)
Yahoo confirmed that they terminated the service:
It has come to our attention that this service is being used in violation of the Yahoo Terms of Service. As such, the service is being discontinued. For all future markets and equities data research, please refer to finance.yahoo.com.
There is still a way to get this data by querying some APIs used by the finance.yahoo.com page. Not sure if Yahoo will support it long term as it did the previous API (hopefully it will).
I adapted the method used by https://github.com/pstadler/ticker.sh into the following python hack that takes a list of symbols from the command line and outputs some of the variables as a csv:
#!/usr/bin/env python
import sys
import time
import requests

if len(sys.argv) < 2:
    print("missing parameters: <symbol> ...")
    exit()

apiEndpoint = "https://query1.finance.yahoo.com/v7/finance/quote"
fields = [
    'symbol',
    'regularMarketVolume',
    'regularMarketPrice',
    'regularMarketDayHigh',
    'regularMarketDayLow',
    'regularMarketTime',
    'regularMarketChangePercent']
fields = ','.join(fields)
symbols = sys.argv[1:]
symbols = ','.join(symbols)
payload = {
    'lang': 'en-US',
    'region': 'US',
    'corsDomain': 'finance.yahoo.com',
    'fields': fields,
    'symbols': symbols}
r = requests.get(apiEndpoint, params=payload)
for i in r.json()['quoteResponse']['result']:
    if 'regularMarketPrice' in i:
        a = []
        a.append(i['symbol'])
        a.append(i['regularMarketPrice'])
        a.append(time.strftime(
            '%Y-%m-%d %H:%M:%S', time.localtime(i['regularMarketTime'])))
        a.append(i['regularMarketChangePercent'])
        a.append(i['regularMarketVolume'])
        a.append("{0:.2f} - {1:.2f}".format(
            i['regularMarketDayLow'], i['regularMarketDayHigh']))
        print(",".join([str(e) for e in a]))
Sample Run:
$ ./getquotePy.py AAPL GOOGL
AAPL,174.5342,2017-11-07 17:21:28,0.1630961,19905458,173.60 - 173.60
GOOGL,1048.6753,2017-11-07 17:21:22,0.5749836,840447,1043.00 - 1043.00
var API = "https://query1.finance.yahoo.com/v7/finance/quote?symbols=AAPL";
$.getJSON(API, function (json) {...});
This call throws the following error: No 'Access-Control-Allow-Origin' header is present on the requested resource. Origin 'http://www.microplan.at/sar' is therefore not allowed access.
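Regarding that CORS error: the Access-Control-Allow-Origin check is enforced by browsers, so the endpoint can still be queried from server-side code and the data handed to your page from your own origin. A minimal sketch (same v7 endpoint as the script above):
import requests

# Fetch the quote server-side, where the browser's CORS policy does not apply.
resp = requests.get(
    "https://query1.finance.yahoo.com/v7/finance/quote",
    params={"symbols": "AAPL"},
)
print(resp.json()['quoteResponse']['result'][0]['regularMarketPrice'])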


ESP8266, NodeMCU, soft AP - UDP server-like soft AP, independent access point

I am using NodeMCU (with ESP8266-E) with an upgraded firmware. All basic commands work perfectly but there is one problem.
I wanted to create an independent access point that behaves like a UDP server, that is, without a direct connection to any other access point: a simple UDP-server-like soft AP.
I followed these steps:
I have uploaded a new firmware to NodeMCU.
I have downloaded ESPlorer for better work with NodeMCU.
I have uploaded the source code below.
I have connected to the NodeMCU access point on my desktop.
I have sent some strings to the NodeMCU using a Java UDP client program.
I have looked at the messages on ESPlorer.
NodeMCU has not received any such strings.
print("ESP8266 Server")
wifi.setmode(wifi.STATIONAP);
wifi.ap.config({ssid="test",pwd="12345678"});
print("Server IP Address:",wifi.ap.getip())

-- 30s timeout for an inactive client
srv = net.createServer(net.UDP, 30)

-- server listens on 5000, if data received, print data to console
srv:listen(5000, function(sk)
    sk:on("receive", function(sck, data)
        print("received: " .. data)
    end)
    sk:on("connection", function(s)
        print("connection established")
    end)
end)
When I tried to send a message using a Java application, there was no change in ESPlorer. Not even when I tried to send a message using the Hercules program (great program for TCP, UDP communication).
I guess that maybe I am using the wrong IP address: the IP address of the AP and not the IP address of the station.
In other words, I am using this address, wifi.ap.getip(), and not this address, wifi.sta.getip(), for connections to the UDP server. But wifi.sta.getip() returns a nil object. Really, I don't know.
I will be glad for any advice.
Thank you very much.
Ok, let's restart this since you updated the question. I should have switched on my brain before I gave you the first hints, sorry about this.
UDP is connectionless and, therefore, there's of course no s:on("connection"). As a consequence you can't register your callbacks on a socket but on the server itself. It is in the documentation but it's easy to miss.
This should get you going:
wifi.setmode(wifi.STATIONAP)
wifi.ap.config({ ssid = "test", pwd = "12345678" })
print("Server IP Address:", wifi.ap.getip())

srv = net.createServer(net.UDP)
srv:listen(5000)
srv:on("receive", function(s, data)
    print("received: " .. data)
    s:send("echo: " .. data)
end)
I ran this against a firmware from the dev branch and tested from the command line like so:
$ echo "foo" | nc -w1 -u 192.168.4.1 5000
echo: foo
ESPlorer then also correctly printed "received: foo".
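If netcat is not available, the same test can be done with a small Python UDP client (a sketch; 192.168.4.1 is the usual ESP8266 soft-AP address, and port 5000 matches the example above):
import socket

# Send one datagram to the NodeMCU soft AP and print the echoed reply.
sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
sock.settimeout(2.0)
sock.sendto(b"foo", ("192.168.4.1", 5000))
reply, _addr = sock.recvfrom(1024)
print(reply.decode())  # expect: echo: foo
sock.close()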
This line is invalid Lua code: connected is in the wrong place here. You can't just put a single word after a function call.
print(wifi.ap.getip()) connected
I guess you intended to do something like
print(wifi.ap.getip() .. " connected")
Although I think you should add some error handling here in case wifi.ap.getip() does not return an IP.
Here you do not finish the function definition. Neither did you complete the srv:on call:
srv:on("receive", function(srv, pl)
    print("Strings received")
srv:listen(port)
I assume you just did not copy/paste the complete code.

How would you parse a url in Ruby to get the main domain?

I want to be able to parse any URL with Ruby to get the main part of the domain without the www (just the example.com)
Please note there is no algorithmic method of finding the highest level at which a domain may be registered for a particular top-level domain (the policies differ with each registry), the only method is to create a list of all top-level domains and the level at which domains can be registered.
This is the reason why the Public Suffix List exists.
I'm the author of PublicSuffix, a Ruby library that decomposes a domain into its different parts.
Here's an example:
require 'uri/http'
require 'public_suffix'
uri = URI.parse("http://toolbar.google.com")
domain = PublicSuffix.parse(uri.host)
# => "toolbar.google.com"
domain.domain
# => "google.com"
uri = URI.parse("http://www.google.co.uk")
domain = PublicSuffix.parse(uri.host)
# => "www.google.co.uk"
domain.domain
# => "google.co.uk"
This should work with pretty much any URL:
# URL always gets parsed twice
def get_host_without_www(url)
  url = "http://#{url}" if URI.parse(url).scheme.nil?
  host = URI.parse(url).host.downcase
  host.start_with?('www.') ? host[4..-1] : host
end
Or:
# Only parses twice if url doesn't start with a scheme
def get_host_without_www(url)
  uri = URI.parse(url)
  uri = URI.parse("http://#{url}") if uri.scheme.nil?
  host = uri.host.downcase
  host.start_with?('www.') ? host[4..-1] : host
end
You may have to require 'uri'.
Just a short note: to avoid the second parsing of the url in Mischa's second example, you could do a string comparison instead of URI.parse.
# Only parses once
def get_host_without_www(url)
  url = "http://#{url}" unless url.start_with?('http')
  uri = URI.parse(url)
  host = uri.host.downcase
  host.start_with?('www.') ? host[4..-1] : host
end
The downside of this approach is that it limits the url to http(s)-based urls, which is widely the standard. But if you want to use it more generally (e.g. for ftp links) you have to adjust accordingly.
Addressable is probably the right answer in 2018, especially as it uses the PublicSuffix gem to parse domains.
However, I need to do this kind of parsing in multiple places, from various data sources, and found it a bit verbose to use repeatedly. So I created a wrapper around it, Adomain:
require 'adomain'
Adomain["https://toolbar.google.com"]
# => "toolbar.google.com"
Adomain["https://www.google.com"]
# => "google.com"
Adomain["stackoverflow.com"]
# => "stackoverflow.com"
I hope this helps others.
Here's one that works better with .co.uk and .com.fr style domains:
domain = uri.host[/[^.\s\/]+\.([a-z]{3,}|([a-z]{2}|com)\.[a-z]{2})$/]
If the URL is in the format http://www.google.com, then you could do something like:
a = 'http://www.google.com'
puts a.split(/\./)[1] + '.' + a.split(/\./)[2]
Or
a =~ /http:\/\/www\.(.*?)$/
puts $1
