I'm trying to ingest my AWS ALB logs into Splunk so that I can search them there, but the events are not parsing properly. Has anyone had a similar issue, or any suggestions?
Here is my props.conf:
[aws:alb:accesslogs]
SHOULD_LINEMERGE=false
FIELD_DELIMITER = whitespace
pulldown_type=true
FIELD_NAMES=type,timestamp,elb,client_ip,client_port,target,request_processing_time,target_processing_time,response_processing_time,elb_status_code,target_status_code,received_bytes,sent_bytes,request,user_agent,ssl_cipher,ssl_protocol,target_group_arn,trace_id
EXTRACT-elb = ^\s*(?P<type>[^\s]+)\s+(?P<timestamp>[^\s]+)\s+(?P<elb>[^\s]+)\s+(?P<client_ip>[0-9.]+):(?P<client_port>\d+)\s+(?P<target>[^\s]+)\s+(?P<request_processing_time>[^\s]+)\s+(?P<target_processing_time>[^\s]+)\s+(?P<response_processing_time>[^\s]+)\s+(?P<elb_status_code>[\d-]+)\s+(?P<target_status_code>[\d-]+)\s+(?P<received_bytes>\d+)\s+(?P<sent_bytes>\d+)\s+"(?P<request>.+)"\s+"(?P<user_agent>.+)"\s+(?P<ssl_cipher>[-\w]+)\s*(?P<ssl_protocol>[-\w\.]+)\s+(?P<target_group_arn>[^\s]+)\s+(?P<trace_id>[^\s]+)
EVAL-rtt = request_processing_time + target_processing_time + response_processing_time
Sample data:
https 2020-08-20T12:40:00.274478Z app/my-aws-alb/e7538073dd1a6fd8 162.158.26.188:21098 172.0.51.37:80 0.000 0.004 0.000 405 405 974 424 "POST https://my-aws-alb-domain:443/api/ps/fpx/callback HTTP/1.1" "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/32.0.2840.91 Safari/537.36" ECDHE-RSA-AES128-GCM-SHA256 TLSv1.2 arn:aws:elasticloadbalancing:ap-southeast-1:111111111111:targetgroup/my-aws-target-group/41dbd234b301e3d84 "Root=1-5f3e6f20-3fdasdsfffdsf" "api.mydomain.com" "arn:aws:acm:ap-southeast-1:11111111111:certificate/be4344424-a40f-416e-8434c-88a8a3b072f5" 0 2020-08-20T12:40:00.270000Z "forward" "-" "-" "172.0.51.37:80" "405" "-" "-"
Using transforms is pretty straightforward. Start with a stanza in transforms.conf.
[elb]
REGEX = ^\s*(?P<type>[^\s]+)\s+(?P<timestamp>[^\s]+)\s+(?P<elb>[^\s]+)\s+(?P<client_ip>[0-9.]+):(?P<client_port>\d+)\s+(?P<target>[^\s]+)\s+(?P<request_processing_time>[^\s]+)\s+(?P<target_processing_time>[^\s]+)\s+(?P<response_processing_time>[^\s]+)\s+(?P<elb_status_code>[\d-]+)\s+(?P<target_status_code>[\d-]+)\s+(?P<received_bytes>\d+)\s+(?P<sent_bytes>\d+)\s+"(?P<request>.+)"\s+"(?P<user_agent>.+)"\s+(?P<ssl_cipher>[-\w]+)\s*(?P<ssl_protocol>[-\w\.]+)\s+(?P<target_group_arn>[^\s]+)\s+(?P<trace_id>[^\s]+)
Then refer to the transform in props.conf
[aws:alb:accesslogs]
TIME_PREFIX = ^\S+\s
TIME_FORMAT = %Y-%m-%dT%H:%M:%S.%6N%Z
MAX_TIMESTAMP_LOOKAHEAD = 32
SHOULD_LINEMERGE=false
NO_BINARY_CHECK=true
REPORT-elb = elb
EVAL-rtt = request_processing_time + target_processing_time + response_processing_time
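Before shipping either config, it can help to confirm the field order against a real event. ALB access logs are whitespace-delimited with double-quoted compound fields, so Python's shlex tokenizes a line the way the extraction expects. This is a minimal sketch, not part of the Splunk config; the quoted user-agent field is shortened here:

import shlex

# Shortened copy of the sample event from the question. shlex.split breaks
# on whitespace while keeping double-quoted fields intact, mirroring the
# ALB access log layout.
line = ('https 2020-08-20T12:40:00.274478Z app/my-aws-alb/e7538073dd1a6fd8 '
        '162.158.26.188:21098 172.0.51.37:80 0.000 0.004 0.000 405 405 974 424 '
        '"POST https://my-aws-alb-domain:443/api/ps/fpx/callback HTTP/1.1" '
        '"Mozilla/5.0 (Windows NT 10.0; Win64; x64)" '
        'ECDHE-RSA-AES128-GCM-SHA256 TLSv1.2 '
        'arn:aws:elasticloadbalancing:ap-southeast-1:111111111111:targetgroup/my-aws-target-group/41dbd234b301e3d84 '
        '"Root=1-5f3e6f20-3fdasdsfffdsf"')

# Print each token with its position so it can be lined up against FIELD_NAMES.
for position, token in enumerate(shlex.split(line)):
    print(position, token)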
I am trying to scrape this website. Extracting the information manually is possible, but I cannot collect the text inside the <p>...</p> and <ul>...</ul> tags with one loop. The two kinds of tags sit inside the same division, yet the loop breaks whenever a <p> gives way to a <ul> or vice versa.
Is this possible with just one loop?
import requests
from bs4 import BeautifulSoup as bs
headers = {"User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/106.0.0.0 Safari/537.36",
"Accept-Encoding":"gzip, deflate, br",
"Accept":"text/html,application/xhtml+xml,application/xml;q=0.9,image/avif,image/webp,image/apng,*/*;q=0.8,application/signed-exchange;v=b3;q=0.9",
"DNT":"1",
"Connection":"close",
"Upgrade-Insecure-Requests":"1"}
source = requests.get('https://insights.blackcoffer.com/how-small-business-can-survive-the-coronavirus-crisis/',
headers=headers)
page = source.content
soup = bs(page, 'html.parser')
information = ''
for section in soup.find('div', class_='td-post-content').find_all('p'):
if information != '':
information = information + '\n' + section.text
else:
information = section.text
print(information)
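You can pass a list of tag names to find_all, so a single loop can visit both paragraph and list-item elements in the order they appear: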
import requests
from bs4 import BeautifulSoup as bs

headers = {"User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/106.0.0.0 Safari/537.36",
           "Accept-Encoding": "gzip, deflate, br",
           "Accept": "text/html,application/xhtml+xml,application/xml;q=0.9,image/avif,image/webp,image/apng,*/*;q=0.8,application/signed-exchange;v=b3;q=0.9",
           "DNT": "1",
           "Connection": "close",
           "Upgrade-Insecure-Requests": "1"}
source = requests.get('https://insights.blackcoffer.com/how-small-business-can-survive-the-coronavirus-crisis/',
                      headers=headers)
page = source.content
soup = bs(page, 'html.parser')

information = ''
# find_all() accepts a list of tag names, so <p> and <li> elements are
# returned together, in the order they appear in the document.
for section in soup.find('div', class_='td-post-content').find_all(['p', 'li']):
    information += '\n\n' + section.text
print(information.strip())
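As a side note, if you only need the flattened article text rather than per-tag handling, get_text can collapse the whole division in one call. A sketch reusing the soup object from the script above:

content = soup.find('div', class_='td-post-content')
# get_text() walks every descendant and joins the pieces with the given
# separator, so paragraph and list text comes out in document order.
print(content.get_text('\n\n', strip=True))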
I am trying to call some URLs with special characters in them, but it does not work.
This works:
GET .../rest/validation/checknameunique/?className=lomnido.Template&rename=true&name=Templaa%3Ea
This does not:
PUT ../rest/template/rename/526/Templaa%3Ea
There I get a 400 back from Grails.
In the NGINX Log there is this entry
213.162.73.171 - - [22/Apr/2022:13:16:32 +0000] "PUT /rest/template/rename/28484/Bla%3Eaa HTTP/1.1" 400 2307 "https://mytest.com/configuration/template/28484" "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/100.0.4896.127 Safari/537.36"
When I debug this, the request never reaches the Security Interceptor (all requests normally go through it).
What is wrong here?
Best regards,
Peter
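For what it's worth, the path segment really does carry a > character once decoded; a quick Python check makes the encoding visible. This is a sketch using only the standard library and the name from the failing request; the note about container-side validation is an assumption, not a confirmed diagnosis:

from urllib.parse import quote, unquote

name = "Templaa>a"
encoded = quote(name, safe="")  # -> 'Templaa%3Ea', as sent in the PUT path
print(encoded)
print(unquote(encoded))         # -> 'Templaa>a', what the server sees after decoding

# Assumption: some containers/firewalls reject characters such as '>' in a
# decoded path segment while tolerating them in a query string, which would
# explain why the GET with name=Templaa%3Ea works but the PUT path does not.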
import scrapy

class oneplus_spider(scrapy.Spider):
    name = 'one_plus'
    page_number = 0
    start_urls = [
        'https://www.amazon.com/s?k=samsung+mobile&page=3&qid=1600763713&ref=sr_pg_3'
    ]

    def parse(self, response):
        all_links = []
        total_links = []
        domain = 'https://www.amazon.com'
        href = []
        link_set = set()
        href = response.css('a.a-link-normal.a-text-normal').xpath('#href').extract()
        for x in href:
            link_set.add(domain + x)
        for x in link_set:
            next_page = x
            yield response.follow(next_page, callback=self.parse_page1)

    def parse_page1(self, response):
        title = response.css('span.a-size-large product-title-word-break::text').extract()
        print(title)
Error after running the code: "(failed 2 times): 503 Service Unavailable".
I tried many ways but failed. Please help me. Thanks in advance!
Check the URL with curl first, like:
curl -I "https://www.amazon.com/s?k=samsung+mobile&page=3&qid=1600763713&ref=sr_pg_3"
Then you will see the 503 response:
HTTP/2 503
In other words, your request is being rejected; you have to build a proper request, and Chrome DevTools will help you find one. It looks like a browser-style User-Agent header is needed:
curl 'https://www.amazon.com/s?k=samsung+mobile&page=3&qid=1600763713&ref=sr_pg_3' \
-H 'user-agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_6) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/85.0.4183.102 Safari/537.36' \
--compressed
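If you would rather probe from Python, the same check with the requests library looks like this (a sketch; the URL and header are copied from the curl command above):

import requests

url = 'https://www.amazon.com/s?k=samsung+mobile&page=3&qid=1600763713&ref=sr_pg_3'
headers = {'User-Agent': 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_6) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/85.0.4183.102 Safari/537.36'}
# 503 without the header, 200 once the request is accepted.
print(requests.get(url, headers=headers).status_code)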
So the following may work:
import scrapy

class oneplus_spider(scrapy.Spider):
    name = 'one_plus'
    page_number = 0
    # Scrapy's built-in UserAgentMiddleware picks up this attribute and
    # sends it with every request, which avoids the 503.
    user_agent = "Mozilla/5.0 (Windows NT 6.3; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/44.0.2403.157 Safari/537.36"
    start_urls = [
        'https://www.amazon.com/s?k=samsung+mobile&page=3&qid=1600763713&ref=sr_pg_3'
    ]

    def parse(self, response):
        domain = 'https://www.amazon.com'
        link_set = set()
        # '@href' (not '#href') selects the attribute value.
        href = response.css('a.a-link-normal.a-text-normal').xpath('@href').extract()
        for x in href:
            link_set.add(domain + x)
        for x in link_set:
            yield response.follow(x, callback=self.parse_page1)

    def parse_page1(self, response):
        # Both classes belong to the same <span>, so chain them with a dot.
        title = response.css('span.a-size-large.product-title-word-break::text').extract()
        print(title)
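Equivalently, the header can be scoped through custom_settings instead of the user_agent attribute (a minimal sketch of the same idea; the spider name here is hypothetical, and either approach makes Scrapy send the browser-like header):

import scrapy

class OnePlusUaSpider(scrapy.Spider):
    name = 'one_plus_ua'  # hypothetical name for this variant
    # Overrides the project-wide USER_AGENT setting for this spider only.
    custom_settings = {
        'USER_AGENT': 'Mozilla/5.0 (Windows NT 6.3; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/44.0.2403.157 Safari/537.36',
    }
    start_urls = [
        'https://www.amazon.com/s?k=samsung+mobile&page=3&qid=1600763713&ref=sr_pg_3'
    ]

    def parse(self, response):
        # The same link-following logic as the spider above would go here.
        pass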
Edge 75 will be (is?) the first Chromium-based Edge browser. How can I check whether the browser is Chromium-based Edge rather than Chrome?
(What I really want to know is whether the browser fully supports data URIs - https://caniuse.com/#feat=datauri - so feature detection would be even better. If you know a way to do that, I can change the question.)
You could use window.navigator.userAgent to check whether the browser is Microsoft Chromium Edge or Chrome.
Code as below:
<script>
    var browser = (function (agent) {
        switch (true) {
            case agent.indexOf("edge") > -1: return "edge"; // the older EdgeHtml-based Edge
            case agent.indexOf("edg/") > -1: return "chromium based edge (dev or canary)"; // match "edg/" to avoid matching the older Edge
            case agent.indexOf("opr") > -1 && !!window.opr: return "opera";
            case agent.indexOf("chrome") > -1 && !!window.chrome: return "chrome";
            case agent.indexOf("trident") > -1: return "ie";
            case agent.indexOf("firefox") > -1: return "firefox";
            case agent.indexOf("safari") > -1: return "safari";
            default: return "other";
        }
    })(window.navigator.userAgent.toLowerCase());
    document.body.innerHTML = window.navigator.userAgent.toLowerCase() + "<br>" + browser;
</script>
The Chrome browser userAgent:
mozilla/5.0 (windows nt 10.0; win64; x64) applewebkit/537.36 (khtml, like gecko) chrome/74.0.3729.169 safari/537.36
The Edge browser userAgent:
mozilla/5.0 (windows nt 10.0; win64; x64) applewebkit/537.36 (khtml, like gecko) chrome/64.0.3282.140 safari/537.36 edge/18.17763
The Microsoft Chromium Edge Dev userAgent:
mozilla/5.0 (windows nt 10.0; win64; x64) applewebkit/537.36 (khtml, like gecko) chrome/76.0.3800.0 safari/537.36 edg/76.0.167.1
The Microsoft Chromium Edge Canary userAgent:
mozilla/5.0 (windows nt 10.0; win64; x64) applewebkit/537.36 (khtml, like gecko) chrome/76.0.3800.0 safari/537.36 edg/76.0.167.1
As we can see, the Microsoft Chromium Edge userAgent contains the "edg/" token, so we can use it to detect whether the browser is Chromium Edge or Chrome.
Using CanIUse, the most universal feature that is unsupported in old Edge (which used the EdgeHTML engine) but supported in Chromium Edge and everywhere else (except IE) is the reversed attribute on an <ol> list. This attribute has the advantage of having been supported for ages in everything else.
(It is the only one I can find that covers all other browsers, including Opera Mini; if that's not a worry for you, there are plenty of others.)
So, you can use simple feature detection to see if you're on Old Edge (or IE) -
var isOldEdgeOrIE = !('reversed' in document.createElement('ol'));
Since I found this question from the other side, namely how to check whether a pre-Chromium Edge is being used, here is the solution I found, IE checks included:

function isPreChromiumEdgeOrIE() {
    // EdgeHtml-based Edge (up to Edge 18)
    if (window.navigator.userAgent.indexOf('Edge') !== -1) {
        return true;
    }
    // IE 11
    if (window.document.documentMode) {
        return true;
    }
    // IE 10
    if (navigator.appVersion.indexOf('MSIE 10') !== -1) {
        return true;
    }
    return false;
}
We are using the Nylas API to get access tokens for different types of email accounts, such as Gmail and Outlook, but we couldn't authenticate for Gmail.
let myURL = URL(string: getNylasAuthUrl())
let userAgent = getUserAgentParams()
webView.customUserAgent = userAgent
let myRequest = URLRequest(url: myURL!)
webView.load(myRequest)
I got the below error.
Finally found a way: by setting the User-Agent (as suggested in another post), we could do the Gmail authentication.
I tried the User-Agents below first, but they didn't help:
let userAgent = "Mozilla/5.0 (Apple \(Utils.getDeviceModel()) ) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/41.0.2228.0 Safari/537.36"
let userAgent = "Mozilla/5.0 (Windows NT 6.1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/41.0.2228.0 Safari/537.36"
let userAgent = "Mozilla/5.0 (Google) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/41.0.2228.0 Safari/537.36"
Finally, I found the working user-agent.
let userAgent = "Mozilla/5.0 (iPhone; CPU iPhone OS 10_3_2 like Mac OS X)
AppleWebKit/603.1.30 (KHTML, like Gecko) Mobile/14F89 Safari/602.1"
If you want to do Google auth via a WebView, use this User-Agent, especially for getting the access token. Google blocks OAuth sign-in from clients it recognizes as embedded web views, so a User-Agent that looks like plain mobile Safari appears to be what lets the request through.