I'm a total beginner working through an entry-level university course on data handling. I've been tasked with building a simple database app that the user can add movies to and then mark them "watched". I have a "watch a movie" function (source below) that needs to prompt the user for the name of the movie and then mark that particular movie as "watched". The function receives a movie collection from a database as its single parameter, and the collection only contains the name and watched status for each movie.
Now what happens is the update method on the last line doesn't do anything, and I can't figure out why. I've tested it by including a print command right after the update command, but the document is unchanged. I even tried updating the name of the movie at the same time, but no change.
Any ideas?
watch(collection) async {
  print('Name of the movie?');
  var name = stdin.readLineSync();
  var terms = Query(filter: ValueFilter({'name': name}));
  var results = await collection.search(query: terms);
  var documents = results.snapshots;
  for (var i = 0; i < documents.length; i++) {
    var data = documents[i].data;
    var document = documents[i].document;
    document.update(data: {'name': data['name'], 'watched': true});
  }
}
OK, so I had help and managed to solve this. I don't know why, but the ValueFilter needed a MapFilter around it, i.e.
var terms = Query(filter: MapFilter({'name': ValueFilter(name)}));
It works now, but if anyone could explain why, I'd appreciate it. :D
I just want to fetch all my liked videos, ~25k items. As far as my research goes, this is not possible via the YouTube v3 API.
I have already found multiple issues (issue, issue) on the same problem. Some claim to have fixed it, but the fix only works for them because they have fewer than 5,000 items in their liked videos list.
The playlistItems list API endpoint with the playlist ID set to "Liked videos" (LL) has a limit of 5,000.
The videos list API endpoint has a limit of 1,000.
Unfortunately those endpoints don't provide parameters that I could use to paginate the requests myself (e.g. give me all the liked videos between date x and y), so I'm forced to take the provided order (and I can't get past 5k entries).
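For context, this is roughly the kind of paging loop I mean. It's only a sketch: fetchLikedVideos is a made-up name, and accessToken stands in for an OAuth token with YouTube read access (the LL playlist is private, so an API key alone won't do).

// Minimal sketch: page through the "Liked videos" (LL) playlist until
// nextPageToken runs out. In practice this stops around the 5k cap.
// accessToken is assumed to be an OAuth token with YouTube read scope.
async function fetchLikedVideos(accessToken) {
  const items = [];
  let pageToken = "";
  do {
    const url = "https://www.googleapis.com/youtube/v3/playlistItems"
      + "?part=snippet&playlistId=LL&maxResults=50"
      + (pageToken ? "&pageToken=" + pageToken : "");
    const res = await fetch(url, {
      headers: { Authorization: "Bearer " + accessToken },
    });
    const data = await res.json();
    items.push(...(data.items || []));
    pageToken = data.nextPageToken;
  } while (pageToken);
  return items;
}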
Is there any possibility I can fetch all my likes via the API?
Some more thoughts on the reply from @Yarin_007:
If there are deleted videos in the timeline, they appear as "Liked https://...url". The script doesn't like that format and fails, because the underlying elements don't have the same structure as existing videos.
This can easily be fixed with a try/catch:
function collector(all_cards) {
  var liked_videos = {};
  all_cards.forEach(card => {
    try {
      // ignore Dislikes
      if (card.innerText.split("\n")[1].startsWith("Liked")) {
        // ... same parsing as in the full script below ...
      }
    }
    catch {
      console.log("error, probably a deleted video")
    }
  })
  return liked_videos;
}
To scroll down to the bottom of the page I've used this simple script, no need to spin up anything big:
var millisecondsToWait = 1000;
setInterval(function() {
  window.scrollTo(0, document.body.scrollHeight);
  console.log("scrolling")
}, millisecondsToWait);
If more people want to retrieve this kind of data, one could think about building a proper script that is more convenient to use. If you check the network requests, you can find the desired data in the responses of requests called batchexecute. One could copy the authentication from one of them and provide it to a script that queries those endpoints and prepares the data like the other script I currently inject manually.
Hmm, perhaps Google Takeout?
I have verified that the YouTube data contains a CSV called "liked videos.csv". The header is Video Id,Time Added, and the rows look like
dQw4w9WgXcQ,2022-12-18 23:42:19 UTC
prvXCuEA1lw,2022-12-24 13:22:13 UTC
for example.
So you would need to retrieve video metadata per video ID. Not too bad though.
Note: the export could take a while, especially with 25k videos. (select only YouTube data)
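If it helps, here is a rough sketch of hydrating those IDs via the videos list endpoint, 50 IDs per request. It is untested; fetchMetadata is just a name I made up, videoIds is assumed to be the array of IDs parsed from the CSV, and apiKey is your YouTube Data API key.

// Rough sketch: look up metadata for the video IDs from "liked videos.csv".
// videoIds is an array of ID strings; apiKey is a YouTube Data API key.
async function fetchMetadata(videoIds, apiKey) {
  const results = [];
  for (let i = 0; i < videoIds.length; i += 50) {
    const batch = videoIds.slice(i, i + 50).join(",");
    const url = "https://www.googleapis.com/youtube/v3/videos"
      + "?part=snippet,contentDetails&id=" + batch
      + "&key=" + apiKey;
    const res = await fetch(url);
    const data = await res.json();
    results.push(...(data.items || []));
  }
  return results;
}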
I also had an idea that involves scraping the actual liked videos page (which would save you 25k HTTP requests), but I'm unsure whether it breaks with more than 5,000 songs. Also, emulating the POST requests on that page may prove quite difficult, albeit not impossible (they fetch /browse?key=..., and have some kind of obfuscated/encrypted base64 strings in the request body, among other parameters).
EDIT:
Look, there's probably a normal way to get a complete dump of all your Google data (I mean, other than Takeout. Email them? I don't know.)
Anyway, the following is the other idea...
Follow this deep link to your liked videos history.
Scroll to the bottom... maybe with Selenium, maybe with AutoIt, maybe put something on the End key of your keyboard until you reach your first liked video.
Hit F12 and run this in the developer console:
// https://www.youtube.com/watch?v=eZPXmCIQW5M
// https://myactivity.google.com/page?utm_source=my-activity&hl=en&page=youtube_likes
// go over all "cards" in the activity webpage. (after scrolling down to the absolute bottom of it)
// create a dictionary - the key is the Video ID, the value is a list of the video's properties
function collector(all_cards) {
  var liked_videos = {};
  all_cards.forEach(card => {
    // ignore Dislikes
    if (card.innerText.split("\n")[1].startsWith("Liked")) {
      // horrible parsing. your mileage may vary. I tried to avoid using any gibberish class names.
      let a_links = card.querySelectorAll("a")
      let details = a_links[0];
      let url = details.href.split("?v=")[1]
      let video_length = a_links[3].innerText;
      let time = a_links[2].parentElement.innerText.split(" • ")[0];
      let title = details.innerText;
      let date = card.closest("[data-date]").getAttribute("data-date")
      liked_videos[url] = [title, video_length, date, time];
      // console.log(title, video_length, date, time, url);
    }
  })
  return liked_videos;
}
// https://stackoverflow.com/questions/57709550/how-to-download-text-from-javascript-variable-on-all-browsers
function download(filename, text, type = "text/plain") {
  // Create an invisible A element
  const a = document.createElement("a");
  a.style.display = "none";
  document.body.appendChild(a);
  // Set the HREF to a Blob representation of the data to be downloaded
  a.href = window.URL.createObjectURL(
    new Blob([text], { type })
  );
  // Use download attribute to set desired file name
  a.setAttribute("download", filename);
  // Trigger the download by simulating click
  a.click();
  // Cleanup
  window.URL.revokeObjectURL(a.href);
  document.body.removeChild(a);
}
function main() {
  // gather relevant elements
  var all_cards = document.querySelectorAll("div[aria-label='Card showing an activity from YouTube']")
  var liked_videos = collector(all_cards)
  // download json
  download("liked_videos.json", JSON.stringify(liked_videos))
}
main()
Basically it gathers all the liked videos' details and builds an object that maps each video ID to [title, video_length, date, time].
It then automatically downloads the JSON as a file.
I am new to Google Sheets, and I have a Google Sheet that I have set up to dynamically place the present date in cell A1 and the time in cell A2. The sheet is "published to the web", and under Settings/Calculation, recalculation is set to "On change and every minute".
That all works fine, but I want to be able to read these values from the sheet using an API call. That also works perfectly, the FIRST TIME. Unfortunately, every time I try to call it again, I get the same answer as the first time, even a day later.
I'm using:
=int(hour(now()))&":"&int(minute(now()))&" "&int(SECOND(now()))
as the formula. I should also add that it's a JSON file that I'm reading and it is updating properly on the actual sheet.
I'm sure that I am missing something. Can someone please tell me what it is?
Thanks in advance.
Maybe you are not reading the JSON correctly. This gives me the correct result every time I run it:
function myFunction(){
  var url = "https://spreadsheets.google.com/feeds/cells/1TtXe1JXKsxHKUWb3bqniHkLQB0Po1fSUqsiib2yMv90/1/public/values?alt=json";
  try{
    var sh = SpreadsheetApp.getActive().getSheetByName("Sheet1");
    var response = UrlFetchApp.fetch(url);
    var str = response.getContentText();
    var data = JSON.parse(str);
    var entry = data.feed.entry;
    sh.getRange(1, 1).setValue(entry[0].content.$t);
    sh.getRange(1, 2).setValue(entry[1].content.$t);
  }catch(e){
    Logger.log(e);
  }
}
You need to check the size of "entry" before reading it; I just wanted to show that it works.
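For instance, a guard along these lines (an untested sketch; myFunctionSafe is just a made-up name, and it reuses the same URL and sheet name as the snippet above):

// Sketch: same fetch as above, but verify the entry array before reading it.
function myFunctionSafe() {
  var url = "https://spreadsheets.google.com/feeds/cells/1TtXe1JXKsxHKUWb3bqniHkLQB0Po1fSUqsiib2yMv90/1/public/values?alt=json";
  var sh = SpreadsheetApp.getActive().getSheetByName("Sheet1");
  var data = JSON.parse(UrlFetchApp.fetch(url).getContentText());
  var entry = data.feed && data.feed.entry;
  if (entry && entry.length >= 2) {
    sh.getRange(1, 1).setValue(entry[0].content.$t);
    sh.getRange(1, 2).setValue(entry[1].content.$t);
  } else {
    Logger.log("Feed returned fewer entries than expected.");
  }
}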
Thanks
I'm writing a function that will parse certain websites and fetch data from there, which will be used to create instances of a class. I'm able to successfully extract the data when it is retrieved using the getElementById() function, but for some reason, the getElementsByClassName() always returns a node list with 0 elements.
The site I'm currently parsing is here.
If you search for 'datas-nev', you will find exactly one match:
<p class="datas-nev"><b>Kutya neve: </b>Jhonny</p>
And here is the code use for parsing:
import 'package:html/parser.dart' show parse;
...
final response = await http.get(URL);
var document = parse(response.body);
var detailsContainer = document.getElementById('husky_details_container_right');
var dogName = new List<Node>();
dogName = document.getElementsByClassName('datas-nev');
The contents of the detailsContainer can be extracted successfully, for example this gives me back a string of relevant data I will use later:
var humanBehaviourValue;
try { humanBehaviourValue = detailsContainer.nodes[1].nodes[19].nodes[1].nodes[7].nodes[1].toString(); }
catch (e) { humanBehaviourValue = 'N/A'; }
But when I check the value of dogName in the debug window, I get the following:
dogName = {_growableList} size = 0
I already tried initializing the dogName 'properly' by List<Node> dogName = new List<Node>(); but it didn't help. I also tried other datas-* values, but it seems the parser can't find them. I even tried using just datas (because that is a div, while others are paragraphs), but that didn't help either.
Basically I could just hardwire the name and some data (breed, color, etc) as those never really change, but the location of the shelter can change, and keeping it up-to-date by scraping the data seems better than pushing updates out manually. That means I mostly need the value of datas-helyszin but that isn't parsed either.
As @Günter Zöchbauer pointed out, the code actually works. I was just looking for the value too soon, before it was actually fetched...
The ASP.NET_SessionState table grows all the time and is already at 18 GB, with no sign of expired sessions ever being deleted.
We have tried to execute DynamoDBSessionStateStore.DeleteExpiredSessions, but it seems to have no effect.
Our system is running fine, sessions are created, and end-users are not aware of the issue. However, it doesn't make sense that the table keeps growing all the time...
We have triple-checked permissions/security and everything seems to be in order. We use SDK version 3.1.0. What else remains to be checked?
Your table is over 18 GB, which is quite large in this context, so after looking at the code for the DeleteExpiredSessions method on GitHub, it does not surprise me that this isn't working.
Here is the code:
public static void DeleteExpiredSessions(IAmazonDynamoDB dbClient, string tableName)
{
    LogInfo("DeleteExpiredSessions");
    Table table = Table.LoadTable(dbClient, tableName, DynamoDBEntryConversion.V1);
    ScanFilter filter = new ScanFilter();
    filter.AddCondition(ATTRIBUTE_EXPIRES, ScanOperator.LessThan, DateTime.Now);

    ScanOperationConfig config = new ScanOperationConfig();
    config.AttributesToGet = new List<string> { ATTRIBUTE_SESSION_ID };
    config.Select = SelectValues.SpecificAttributes;
    config.Filter = filter;

    DocumentBatchWrite batchWrite = table.CreateBatchWrite();
    Search search = table.Scan(config);

    do
    {
        List<Document> page = search.GetNextSet();
        foreach (var document in page)
        {
            batchWrite.AddItemToDelete(document);
        }
    } while (!search.IsDone);

    batchWrite.Execute();
}
The above algorithm is executed in two parts. First it performs a Search (table scan) using a filter to identify all expired records. These are then added to a DocumentBatchWrite request that is executed as the second step.
Since your table is so large, the table scan step will take a very, very long time to complete before a single record is deleted. Basically, the above algorithm is useful for lazy garbage collection on small tables, but it does not scale well for large tables.
The best I can tell is that the execution never actually gets past the table scan, and you may be consuming all of the read throughput of your table.
A possible solution would be to run a slightly modified version of the above method yourself. You would want to execute the DocumentBatchWrite inside the do-while loop so that records start to be deleted before the table scan is concluded.
That would look like:
public static void DeleteExpiredSessions(IAmazonDynamoDB dbClient, string tableName)
{
    LogInfo("DeleteExpiredSessions");
    Table table = Table.LoadTable(dbClient, tableName, DynamoDBEntryConversion.V1);
    ScanFilter filter = new ScanFilter();
    filter.AddCondition(ATTRIBUTE_EXPIRES, ScanOperator.LessThan, DateTime.Now);

    ScanOperationConfig config = new ScanOperationConfig();
    config.AttributesToGet = new List<string> { ATTRIBUTE_SESSION_ID };
    config.Select = SelectValues.SpecificAttributes;
    config.Filter = filter;

    Search search = table.Scan(config);

    do
    {
        // Perform a batch delete for each page returned
        DocumentBatchWrite batchWrite = table.CreateBatchWrite();
        List<Document> page = search.GetNextSet();
        foreach (var document in page)
        {
            batchWrite.AddItemToDelete(document);
        }
        batchWrite.Execute();
    } while (!search.IsDone);
}
Note: I have not tested the above code. It is just a simple modification of the open-source code, so it should work correctly, but it would need to be tested to ensure the pagination works correctly on a table whose records are being deleted as it is being scanned.
I'm extremely new to this, and I've been trying to get the title of each unique forum page (or topic). Here is the code I have so far:
function GraalGet() {
  //parses forums for ALL posts one by one, extract <title> from HTML webpage
  var sheet = SpreadsheetApp.getActiveSheet();
  var i = 31
  var url = "http://www.graalians.com/forums/showthread.php?p="+i;
  //var params = {method : "post"}; can this be used at all?
  //The aim: loop this once you can get 1 result.
  var geturl = UrlFetchApp.fetch(url).getContentText(); //maybe .getContentText should be elsewhere?
  var parseurl = Xml.parse(geturl, true); //confirmed - this is true because it wont parse HTML if false
  var titleinfo = parseurl.getElement().getElement("html"); //.getElement('body');//.getElements("title");
  sheet.appendRow([titleinfo, i]);
}
In addition, the script would write down the topic number in the adjoining cell.
There are a lot of answered questions about extracting XML data, and this example is about parsing HTML, but I couldn't pull up any results. I'm honestly stumped, and any help with finding and extracting the title tag will be appreciated. (If you have the time, please feel free to explain as well, but I'll be thankful for any help really.)
For reference I have used these:
Google's Kevin Bacon Script
The author's comments on bugs with the script & some explanation
I'm sorry if I'm being pedantic; this is my first post and I don't want to anger anyone. Please do tell me if I've broken any rules, and I'll do my best to fix them. I've left the comments I made for myself in the code for your perusal too.
You can use Logger.log to print out debugging information. I did this with your function and figured out that the title tag is embedded within the head tag. So you should use something like this. Also, getElement returns an XmlElement object, which you should convert to a String using getText().
var titleinfo = parseurl.getElement().getElement('head').getElement('title');
sheet.appendRow([titleinfo.getText(), i]);
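Putting that together with your original function, it would look something like this (an untested sketch using the same calls as your snippet):

// Untested sketch: the original GraalGet with the title extraction from above.
function GraalGet() {
  // parses forum posts one by one, extracting <title> from the HTML page
  var sheet = SpreadsheetApp.getActiveSheet();
  var i = 31;
  var url = "http://www.graalians.com/forums/showthread.php?p=" + i;
  var geturl = UrlFetchApp.fetch(url).getContentText();
  var parseurl = Xml.parse(geturl, true); // true = lenient, so plain HTML is accepted
  var titleinfo = parseurl.getElement().getElement("head").getElement("title");
  sheet.appendRow([titleinfo.getText(), i]); // write the title and the topic number
}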