NotFound: received 404 HTTP response - reddit

**Hi, I was trying to collect data from a subreddit. It collected posts successfully at first, but after about 3 hours I got the error [NotFound: received 404 HTTP response]. I ran the code many times and it failed with the same error at the same point each time.
Here is my code:**
### BLOCK 1 ###
start_year = 2022
end_year = 2022
for year in range(start_year,end_year+1):
    action = "[Year] " + str(year)
    log_action(action)
    dirpath = basecorpus + str(year)
    if not os.path.exists(dirpath):
        os.makedirs(dirpath)
    # timestamps that define window of posts
    ts_after = int(datetime.datetime(2022, 1, 1).timestamp())
    ts_before = int(datetime.datetime(2022, 4, 11).timestamp())
    ### BLOCK 2 ###
    for subreddit in subreddits:
        start_time = time.time()
        action = "\t[Subreddit] " + subreddit
        log_action(action)
        subredditdirpath = dirpath + '/' + subreddit
        if os.path.exists(subredditdirpath):
            continue
        else:
            os.makedirs(subredditdirpath)
        submissions_csv_path = str(year) + '-' + subreddit + '-submissions.csv'
        ### BLOCK 3 ###
        submissions_dict = {
            "id" : [],
            "url" : [],
            "title" : [],
            "score" : [],
            "num_comments": [],
            "created_utc" : [],
            "selftext" : [],
            "author" : [],
            "domain" : [],
            "num_crossposts" : [],
        }
        ### BLOCK 4 ###
        # use PSAW only to get id of submissions in time interval
        gen = api.search_submissions(
            after=ts_after,
            before=ts_before,
            filter=['id'],
            subreddit=subreddit,
            limit=None
        )
        ### BLOCK 5 ###
        # use PRAW to get actual info and traverse comment tree
        for submission_psaw in gen:
            # use psaw here
            submission_id = submission_psaw.d_['id']
            # use praw from now on
            submission_praw = reddit.submission(id=submission_id)
            submissions_dict["id"].append(submission_praw.id)
            submissions_dict["url"].append(submission_praw.url)
            submissions_dict["title"].append(submission_praw.title)
            submissions_dict["score"].append(submission_praw.score)
            submissions_dict["num_comments"].append(submission_praw.num_comments)
            submissions_dict["created_utc"].append(submission_praw.created_utc)
            submissions_dict["selftext"].append(submission_praw.selftext)
            submissions_dict["author"].append(submission_praw.author)
            submissions_dict["domain"].append(submission_praw.domain)
            submissions_dict["num_crossposts"].append(submission_praw.num_crossposts)
            ### BLOCK 6 ###
            submission_comments_csv_path = str(year) + '-' + subreddit + '-submission_' + submission_id + '-comments.csv'
            submission_comments_dict = {
                "comment_id" : [],
                "comment_parent_id" : [],
                "comment_body" : [],
                "comment_link_id" : [],
                "author" : [],
                "created_utc" : [],
                "is_submitter" : [],
            }
            ### BLOCK 7 ###
            # extend the comment tree all the way
            submission_praw.comments.replace_more(limit=None)
            # for each comment in flattened comment tree
            for comment in submission_praw.comments.list():
                submission_comments_dict["comment_id"].append(comment.id)
                submission_comments_dict["comment_parent_id"].append(comment.parent_id)
                submission_comments_dict["comment_body"].append(comment.body)
                submission_comments_dict["comment_link_id"].append(comment.link_id)
                submission_comments_dict["author"].append(comment.author)
                submission_comments_dict["created_utc"].append(comment.created_utc)
                submission_comments_dict["is_submitter"].append(comment.is_submitter)
            # for each submission save separate csv comment file
            pd.DataFrame(submission_comments_dict).to_csv(subredditdirpath + '/' + submission_comments_csv_path,
                                                          index=False)
        ### BLOCK 8 ###
        # single csv file with all submissions
        pd.DataFrame(submissions_dict).to_csv(subredditdirpath + '/' + submissions_csv_path,
                                              index=False)
        action = f"\t\t[Info] Found submissions: {pd.DataFrame(submissions_dict).shape[0]}"
        log_action(action)
        action = f"\t\t[Info] Elapsed time: {time.time() - start_time: .2f}s"
        log_action(action)
I would appreciate any advice.
Thank you.

Sorting list of objects by object attribute?

I have a list of objects that are pulled in from an API. Here is the snippet of output (as there's about 300 lines):
combo =>
ID: 6, Name:Thomas Partey, Club:1, Position: 3, Price: $4.7, Total Pts: 57
ID: 7, Name:Martin Ødegaard, Club:1, Position: 3, Price: $7.0, Total Pts: 128
ID: 8, Name:Kieran Tierney, Club:1, Position: 2, Price: $4.6, Total Pts: 23
ID: 12, Name:Emile Smith Rowe, Club:1, Position: 3, Price: $5.6, Total Pts: 5
I would like to change the order so that they are ranked by Total Points rather than ID
I have tried the following:
sorted = combo.sort_by(@totalpoints)
As well as: (but I assume I want to try and use @teampoints since I've defined that)
sorted = combo.sort_by(:totalpoints)
My Full code is:
class Player
  attr_accessor :id, :firstname, :secondname, :club, :position, :price, :totalpoints,
                :active
  def initialize(id, firstname, secondname, club, position, price, totalpoints, active)
    @id = id.to_i
    @firstname = firstname.to_s
    @secondname = secondname.to_s
    @club = club.to_s
    @position = position.to_i
    @price = price / 10.to_f
    @totalpoints = totalpoints.to_i
    @active = active.to_i
  end
  def to_s()
    " ID: " + @id.to_s + ", Name:" + @firstname.to_s + " " + @secondname.to_s + ", Club:" + @club.to_s + ", Position: " + @position.to_s + ", Price: $" + @price.to_s + ", Total Pts: " + @totalpoints.to_s + " "
  end
  def self.pull()
    require 'net/http'
    require 'json'
    url = 'https://fantasy.premierleague.com/api/bootstrap-static/'
    uri = URI(url)
    response = Net::HTTP.get(uri)
    object = JSON.parse(response)
    elements = object["elements"]
    elements.map! { |qb|
      if qb["chance_of_playing_next_round"].to_f > 0
        Player.new(
          qb["id"], # ID
          qb["first_name"], # First Name
          qb["second_name"], # Surname
          qb["team"], # Club
          qb["element_type"], # Position
          qb["now_cost"], # Current Price
          qb["total_points"], # Total Points
          qb["chance_of_playing_next_round"]) # Chance Of Playing
      end
    }
  end
  combo = Player.pull().map{|qb| qb}
  sorted = combo.sort_by(@totalpoints)
  puts sorted
end
Based on what you've got shown, this should do what you need:
sorted = combo.sort_by(&:totalpoints)
It's essentially a shortened version of this:
sorted = combo.sort_by { |_combo| _combo.totalpoints }
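As a minimal, self-contained sketch of the two forms above: the Player struct below is a simplified stand-in for the asker's class (an assumption, not the original code), and the sample values are taken from the output in the question. The highest-first variant assumes that "ranked" means the top scorer comes first.

```ruby
# Simplified stand-in for the asker's Player class (assumption: only these fields matter here).
Player = Struct.new(:id, :name, :totalpoints)

combo = [
  Player.new(6, "Thomas Partey", 57),
  Player.new(7, "Martin Ødegaard", 128),
  Player.new(8, "Kieran Tierney", 23),
  Player.new(12, "Emile Smith Rowe", 5)
]

# Symbol#to_proc shorthand: calls #totalpoints on each element.
ascending = combo.sort_by(&:totalpoints)

# Equivalent explicit block form.
ascending_block = combo.sort_by { |player| player.totalpoints }

# Negating the key sorts from highest to lowest.
descending = combo.sort_by { |player| -player.totalpoints }

puts ascending.map(&:name).inspect
puts descending.map(&:name).inspect
```

Note that sort_by returns a new array and leaves combo unchanged.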

How to pass multiple OR and AND conditions to Active Record in Rails as a string

I'm new to Ruby on Rails and I would like to fetch records based on a condition that I pass as a string. The query will contain multiple OR and AND conditions. Right now I'm stuck on how to pass the query as a string in Rails.
I have attached the screenshot
@data= CustomAttribute.includes(:custom_attribute_values).where(id: 18, company_id: current_user.company_id).first
The above line executed successfully and gave the output
<CustomAttribute id: 18, data_type: "string", label: "Marital status", code: "marital_status", entity_type: "member", company_id: 1, created_at: "2021-03-10 10:16:15", updated_at: "2021-03-10 10:16:27", is_active: true, is_default: false, rank: nil, is_identifier: false>
but when I executed the line below it gave me an error:
@data.custom_attribute_values.where("\""+"value_string"+"\""+"="+"\""+'Single'+"\"").size
ERROR: column "Single" does not exist
Single is the value that I would like to count.
Here is my code for the dynamic query creation
logical_operator = 'OR'
@custom_attribute = CustomAttribute.includes(:custom_attribute_values).where(id: custom_attribute_ids, company_id: current_user.company_id)
query=""
@custom_attribute.each_with_index do |attribute_object, index|
  filter_object= filter_params[:filters].find {|x| x['custom_attribute_id']==attribute_object['id']}
  if filter_object.present?
    query += "("+ '"' +'value_'+attribute_object.data_type + '"' + ' ' + filter_object['operator'] + ' ' + "'" + filter_object['value'].to_s + "'"+ ")"
  end
  if index != @custom_attribute.length-1
    query+=' '+logical_operator+' '
  end
  if index == @custom_attribute.length-1
    query="'" + " ( " + query + " ) " + "'"
  end
end
byebug
puts(@custom_attribute.first.custom_attribute_values.where(query).size)
Any time you're doing a lot of escaping and string addition in Ruby you're doing it wrong. If we clean up how you build your SQL:
"\""+"value_string"+"\""+"="+"\""+'Single'+"\""
things will be clearer. First, put space around your operators for readability:
"\"" + "value_string" + "\"" + "=" + "\"" + 'Single' + "\""
Next, don't use double quotes unless you need them for escape codes (such as \n) or interpolation:
'"' + 'value_string' + '"' + '=' + '"' + 'Single' + '"'
Now we see that we're adding several constant strings so there's no need to add them at all, a single string literal will do:
'"value_string" = "Single"'
Standard SQL uses double quotes for identifiers (such as table and column names) and single quotes for strings. So your query is asking for all rows where the value_string column equals the Single column and there's your error.
You want to use single quotes for the string (and %q(...) to quote the whole thing to avoid adding escapes back in):
@data.custom_attribute_values.where(
  %q("value_string" = 'Single')
)
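The quoting steps can be checked in plain Ruby without a database. This is only a sketch of the string handling; the variable names are illustrative and not from the question.

```ruby
# The original concatenation from the question.
built = "\"" + "value_string" + "\"" + "=" + "\"" + 'Single' + "\""

# The same text as a single literal (no spaces around "=", exactly as concatenated).
literal = '"value_string"="Single"'

# The corrected condition: double quotes for the column name,
# single quotes for the string value, %q(...) to avoid escaping either.
fixed = %q("value_string" = 'Single')

puts built
puts built == literal
puts fixed
```

The first line printed has "Single" in double quotes, which the database reads as a column name; the last has it in single quotes, which makes it a string value.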
Or better, let ActiveRecord build the query:
# With a positional placeholder:
@data.custom_attribute_values.where('value_string = ?', 'Single')
# Or a named placeholder:
@data.custom_attribute_values.where('value_string = :s', s: 'Single')
# Or most idiomatic:
@data.custom_attribute_values.where(value_string: 'Single')
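For the dynamic OR/AND case in the question, the positional-placeholder form can be extended by collecting SQL fragments and bind values separately. The following is a rough sketch in plain Ruby, not the asker's code: build_condition, the two whitelists, the filter hash layout, and the value_integer column are all hypothetical names introduced for illustration.

```ruby
# Hypothetical whitelist: the only operators allowed to be spliced into the SQL text.
ALLOWED_OPERATORS = %w[= != < > <= >=].freeze
# Hypothetical whitelist of data types, which determine the value_<type> column name.
ALLOWED_TYPES = %w[string integer].freeze

# Returns [sql_fragment, *bind_values]; user-supplied values never enter the SQL text.
def build_condition(filters, logical_operator)
  raise ArgumentError, "bad logical operator" unless %w[AND OR].include?(logical_operator)

  fragments = []
  values = []
  filters.each do |filter|
    raise ArgumentError, "bad operator" unless ALLOWED_OPERATORS.include?(filter[:operator])
    raise ArgumentError, "bad data type" unless ALLOWED_TYPES.include?(filter[:data_type])

    # Column name and operator come from the whitelists; the value becomes a "?" placeholder.
    fragments << "(value_#{filter[:data_type]} #{filter[:operator]} ?)"
    values << filter[:value]
  end
  [fragments.join(" #{logical_operator} "), *values]
end

filters = [
  { data_type: "string", operator: "=", value: "Single" },
  { data_type: "integer", operator: ">", value: 3 }
]
condition = build_condition(filters, "OR")
p condition
```

In the Rails app the result could then be passed along the lines of @data.custom_attribute_values.where(*condition), since where accepts a SQL string followed by its bind values.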

How can I have multiple hyperlinks in the same column using JS render function?

I have a shiny app that displays a datatable (using the DTedit R package) and I would like to:
1) Insert all the hyperlinks in the same cell, separated by a new line
2) Open the hyperlinks in a different tab.
For point 2, I have tried different versions of ' target="_blank" but it does not work. My guess is that I am doing something wrong with the quotes.
#E.g.:
<a href=' + data + ' target='_blank'>' + 'PanelApp' + '</a>' ;}"
I have tried with double quotes too (target="_blank") but they are not recognised (I have zero experience in JS).
This is an example of the app:
library(shiny)
library(DT)
#devtools::install_github('jbryer/DTedit')
library(DTedit)
ui = fluidPage(
  h3("How can I have all the links in one column separated by <br> ?"),
  mainPanel(
    shiny::uiOutput("mytable")
  )
)
server = function(input, output){
  #dataframe
  mydata <- data.frame(Gene = c("GBE1", "KMT2D"),
                       Metric = c(10, 20))
  ## Add hyperlinks
  mydata$Decipher <- paste0("https://decipher.sanger.ac.uk/gene/", mydata$Gene, "#overview/protein-info")
  mydata$PanelApp <- paste0("https://panelapp.genomicsengland.co.uk/panels/entities/", mydata$Gene)
  #render table
  DTedit::dtedit(input, output,
    name = 'mytable',
    thedata = mydata,
    datatable.options = list(
      columnDefs = list(
        list(targets= 2,
             render = JS("function(data){return '<a href=' + data + '>' + 'PanelApp' + '</a>' ;}")),
        list(targets= 3,
             render = JS("function(data){return '<a href=' + data + '>' + 'Decipher' + '</a>' ;}"))
      )
    )
  )
}
shinyApp(ui = ui, server = server, options = list(height = 1080))
Can anyone shed some light here?
THANK YOU!!

convert data formatting in a lua file

Hello, I need to convert 720 data sets from one-line records to the format below.
At the moment I have them in an OpenOffice file with each number in its own column, but I have no idea how to convert the formatting.
12 -8906.071289 560.890564 93.236107 0 test2
13 -846.814636 -526.218323 10.981694 0 southshore
to
[12] = {
    [1] = "test2",
    [2] = "-8906.071289",
    [3] = "560.890564",
    [4] = "93.236107",
    [5] = "0",
},
[13] = {
    [1] = "Southshore",
    [2] = "-846.814636",
    [3] = "-526.218323",
    [4] = "10.981694",
    [5] = "0",
},
One possibility in Lua. Run it with lua program.lua datafile
where program.lua is whatever name you give this file, and datafile is your external data file. To test, run just lua program.lua: with no argument the script reads the sample data from its own leading comment.
--[[
12 -8906.071289 560.890564 93.236107 0 test2
13 -846.814636 -526.218323 10.981694 0 southshore
--]]
local filename = arg[1] or arg[0] --data from 1st command line argument or this file
local index,head,tail
print '{'
for line in io.lines(filename) do
  if line:match '^%d+' then
    head, line, tail = line:match '^(%d+)%s+(.-)(%S+)$'
    print(' [' .. head .. '] = {\n [1] = "' .. tail .. '",')
    index = 1
    for line in line:gmatch '%S+' do
      index = index + 1
      print(' [' .. index .. '] = "' .. line .. '",')
    end
    print ' },'
  end
end
print '}'
This awk program does it:
{
    print "[" $1 "] = {"
    print "\t[" 1 "] = \"" $NF "\","
    for (i=2; i<NF; i++) {
        print "\t[" i "] = \"" $i "\","
    }
    print "},"
}

Using Regex and ruby regular expressions to find values

So I'm currently trying to sort values from a file. I'm stuck on finding the first attribute, and am not sure why. I'm new to regex and Ruby so I'm not sure how to go about the problem. I'm trying to find the values of a, b, c, d, e, which are all positive numbers.
Here's what the line will look like
length=<a> begin=(<b>,<c>) end=(<d>,<e>)
Here's what I'm using to find the values
current_line = file.gets
if current_line == nil then return end
while current_line = file.gets do
  if line =~ /length=<(\d+)> begin=((\d+),(\d+)) end=((\d+),(\d+))/
    length, begin_x, begin_y, end_x, end_y = $1, $2, $3, $4, $5
    puts("length:" + length.to_s + " begin:" + begin_x.to_s + "," + begin_y.to_s + " end:" + end_x.to_s + "," + end_y.to_s)
  end
end
for some reason it never prints anything out, so I'm assuming it never finds a match
Sample input
length=4 begin=(0,0) end=(3,0)
A line with 0-4 decimals, separated by commas, after 2 integers.
So it could be any of these:
2 4 1.3434324,3.543243,4.525324
1 2
18 3.3213,9.3233,1.12231,2.5435
7 9 2.2,1.899990
0 3 2.323
Here is your regex:
r = /length=<(\d+)> begin=((\d+),(\d+)) end=((\d+),(\d+))/
str.scan(r)
#=> []
First, we need to escape the parentheses:
r = /length=<(\d+)> begin=\((\d+),(\d+)\) end=\((\d+),(\d+)\)/
Next, add the missing < and > after "begin" and "end".
r = /length=<(\d+)> begin=\(<(\d+)>,<(\d+)>\) end=\(<(\d+)>,<(\d+)>\)/
Now let's try it:
str = "length=<4779> begin=(<21>,<47>) end=(<356>,<17>)"
str.scan(r)
#=> [["4779", "21", "47", "356", "17"]]
Success!
Lastly (though probably not necessary), we might replace the single spaces with \s+, which permits one or more whitespace characters:
r = /length=<(\d+)>\s+begin=\(<(\d+)>,<(\d+)>\)\s+end=\(<(\d+)>,<(\d+)>\)/
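Putting those steps together, here is a small sketch of how the final pattern might be used on one line at a time. LINE_RE, UNESCAPED_RE and parse_line are names made up for this example, and converting the captures to integers is an added assumption.

```ruby
# Final pattern from the steps above: escaped parentheses, angle brackets, \s+ between parts.
LINE_RE = /length=<(\d+)>\s+begin=\(<(\d+)>,<(\d+)>\)\s+end=\(<(\d+)>,<(\d+)>\)/

# The asker's original pattern, where ( and ) act as capture groups instead of literals.
UNESCAPED_RE = /length=<(\d+)> begin=((\d+),(\d+)) end=((\d+),(\d+))/

# Returns [length, begin_x, begin_y, end_x, end_y] as integers, or nil if the line does not match.
def parse_line(line)
  m = LINE_RE.match(line)
  return nil unless m
  m.captures.map(&:to_i)
end

str = "length=<4779> begin=(<21>,<47>)   end=(<356>,<17>)"

p parse_line(str)                                  # matches, even with extra spaces
p UNESCAPED_RE.match(str)                          # nil: the literal "(" is never consumed
p parse_line("length=4 begin=(0,0) end=(3,0)")     # nil: no angle brackets in this line
```

The last call shows that a line written without angle brackets, like the sample input in the question, is not matched by this pattern.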
Addendum
The OP has asked how this would be modified if some of the numeric values were floats. I do not understand precisely what has been requested, but the following could be modified as required. I've assumed all the numbers are non-negative. I've also illustrated one way to "build" a regex, using Regexp#new.
s1 = '<(\d+(?:\.\d+)?)>' # note single parens
#=> "<(\\d+(?:\\.\\d+)?)>"
s2 = "=\\(#{s1},#{s1}\\)"
#=> "=\\(<(\\d+(?:\\.\\d+)?)>,<(\\d+(?:\\.\\d+)?)>\\)"
r = Regexp.new("length=#{s1} begin#{s2} end#{s2}")
#=> /length=<(\d+(?:\.\d+)?)> begin=\(<(\d+(?:\.\d+)?)>,<(\d+(?:\.\d+)?)>\) end=\(<(\d+(?:\.\d+)?)>,<(\d+(?:\.\d+)?)>\)/
str = "length=<47.79> begin=(<21>,<4.7>) end=(<0.356>,<17.999>)"
str.scan(r)
#=> [["47.79", "21", "4.7", "0.356", "17.999"]]
Sample input:
length=4 begin=(0,0) end=(3,0)
data.txt:
length=3 begin=(0,0) end=(3,0)
length=4 begin=(0,1) end=(0,5)
length=2 begin=(1,3) end=(1,5)
Try this:
require 'pp'
Line = Struct.new(
  :length,
  :begin_x,
  :begin_y,
  :end_x,
  :end_y,
)
lines = []
IO.foreach('data.txt') do |line|
  numbers = []
  line.scan(/\d+/) do |match|
    numbers << match.to_i
  end
  lines << Line.new(*numbers)
end
pp lines
puts lines[-1].begin_x
--output:--
[#<struct Line length=3, begin_x=0, begin_y=0, end_x=3, end_y=0>,
#<struct Line length=4, begin_x=0, begin_y=1, end_x=0, end_y=5>,
#<struct Line length=2, begin_x=1, begin_y=3, end_x=1, end_y=5>]
1
With this data.txt:
2 4 1.3434324,3.543243,4.525324
1 2
18 3.3213,9.3233,1.12231,2.5435
7 9 2.2,1.899990
0 3 2.323
Try this:
require 'pp'
data = []
IO.foreach('data.txt') do |line|
  pieces = line.split
  csv_numbers = pieces[-1]
  next if not csv_numbers.index('.') #skip the case where there are no floats on a line
  floats = csv_numbers.split(',')
  data << floats.map(&:to_f)
end
pp data
--output:--
[[1.3434324, 3.543243, 4.525324],
[3.3213, 9.3233, 1.12231, 2.5435],
[2.2, 1.89999],
[2.323]]
