I'm using Playwright for web scraping and I currently need to find a certain description text. I know there doesn't have to be a description text on every page I scrape from that website, so I want it to be "optional".
I've solved it and it works, but I think it's ugly. Is there a better way?
let description: string | undefined;
try {
  const locator = page.locator(".svtmat_recipe__preamble");
  await locator.waitFor({ timeout: 3000 });
  description = (await locator.textContent()) ?? undefined;
} catch {}
Your code looks good, but I would change it a little bit:
// 'textContent' returns a Promise<string | null>, so if you type the variable
// as string | null you don't need the nullish coalescing operator (??).
let description: string | null = null;
try {
  const locator = page.locator(".svtmat_recipe__preamble");
  // You can also pass { timeout: 3000 } to 'textContent' so it does the waiting itself.
  description = await locator.textContent({ timeout: 3000 });
} catch {}
It depends on whether you need to wait "some time" for that locator to be present in the DOM. If you don't, you can simply check whether it is already there:
await page.isVisible("text='yourText'")
or
if (await page.locator(your_selector).count() > 0)
But if you do need to wait for the element to appear in the DOM, try/catch seems to be one of the appropriate solutions, since waitForSelector throws an exception if the timeout is reached.
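For example, here is a rough sketch of the count()-based variant, reusing the selector from the first snippet; note that count() does not wait, so this only helps if the element is already rendered:
let description: string | undefined;
const locator = page.locator(".svtmat_recipe__preamble");
if ((await locator.count()) > 0) {
  // Element exists on this page, so read its text; otherwise leave description undefined.
  description = (await locator.textContent()) ?? undefined;
}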
Remember that locator() will not wait for the selector to appear, so you might need to wait explicitly before reading the result.
const selector = '.svtmat_recipe__preamble'
await page.waitForSelector(selector)
const count = await page.locator(selector).count()
console.log(count)
Related
I would like to take a screenshot of an element that might or might not exist
const elem = await page.locator('.selector')
await elem.scrollIntoViewIfNeeded();
await elem.screenshot({path: 'screenshot.png'});
If the element isn't on the page this results in
elementHandle.screenshot: Timeout 30000ms exceeded.
Is there a way to just ignore the nonexistence and move on?
This behaves the same
const elem = await page.locator('.selector')
if (elem) {
  await elem.scrollIntoViewIfNeeded();
  await elem.screenshot({path: 'screenshot.png'});
}
Try wrapping your logic in try {} catch {}: the error thrown when the screenshot times out because the element doesn't exist is passed to catch, where it can be ignored instead of crashing the runtime.
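A minimal sketch of that, reusing the selector from the question; the 3000 ms timeout is an assumption so a missing element fails faster than the default 30 s:
const elem = page.locator('.selector');
try {
  await elem.scrollIntoViewIfNeeded({ timeout: 3000 });
  await elem.screenshot({ path: 'screenshot.png' });
} catch {
  // Element never appeared, so skip the screenshot and move on.
}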
Since node-fetch was replaced by undici in #5117, some of us encountered the error
Node streams are no longer supported — use a ReadableStream instead
like in this post
It is not easy to reproduce; for me the error occurred only in production.
This is a self-answered question in case you have the same problem.
The error comes from src/runtime/server/utils.js L46 and is thrown after a check of the _readableState property and the type of the response body of the request.
For me the problem was that my endpoint.ts was returning the fetch directly.
export async function post({ request }) {
  return fetch('...')
}
This used to work, but not anymore, since the fetch response is a complex object that carries the _readableState property. To fix this, you have to consume the response and return a simpler object, like:
export async function post({ request }) {
  try {
    const res = await fetch('...')
    const data = await res.json()
    return {
      status: 200,
      body: JSON.stringify({ ...data }),
    }
  } catch (error) {
    return { status: 500 }
  }
}
I am trying to add pagination to my Zapier trigger.
The API I am using for the trigger supports pagination, but not using a page number in the traditional sense (i.e. page 1, 2, 3, ...). Instead, the API response includes a key (e.g. "q1w2e3r4") which should be passed as a parameter in the next request to get the next page of results.
From looking at the docs, I can use {{bundle.meta.page}} (which defaults to 0 unless otherwise set).
I am trying to set {{bundle.meta.page}} in the code editor, with an example shown below:
const options = {
  url: 'company_xyz.com/api/widgets',
  method: 'GET',
  ...,
  params: {
    ...,
    'pagination_key': bundle.meta.page,
  }
}

return z.request(options)
  .then((response) => {
    response.throwForStatus();
    const json_response = response.json;
    widgets = json_response.widgets
    ...
    bundle.meta.page = json_response["next_pagination_key"]
    return widgets;
  });
The problem is that when Zapier tries to retrieve the next page, bundle.meta.page will be 1 instead of the value of "next_pagination_key" from the result of the previous request.
There are docs on cursor-based pagination in the CLI docs.
The relevant block is:
const performWithAsync = async (z, bundle) => {
  let cursor;
  if (bundle.meta.page) {
    cursor = await z.cursor.get(); // string | null
  }

  const response = await z.request(
    'https://5ae7ad3547436a00143e104d.mockapi.io/api/recipes',
    {
      // if cursor is null, it's sent as an empty query
      // param and should be ignored by the server
      params: { cursor: cursor }
    }
  );

  // we successfully got page 1, should store the cursor in case the user wants page 2
  await z.cursor.set(response.nextPage);

  return response.items;
};
This should work in the Zapier Visual Builder, but you might need to use the CLI instead. You can export your integration using the zapier convert CLI command (docs).
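As a rough, untested sketch, the trigger from the question could store next_pagination_key with z.cursor instead of assigning to bundle.meta.page; the URL, parameter names, and response fields below are copied from the question, not verified against the real API:
const perform = async (z, bundle) => {
  // bundle.meta.page is 0 on the first request, so only read a stored key for later pages.
  let paginationKey;
  if (bundle.meta.page) {
    paginationKey = await z.cursor.get(); // string | null
  }

  const response = await z.request({
    url: 'company_xyz.com/api/widgets',
    method: 'GET',
    params: { pagination_key: paginationKey },
  });
  response.throwForStatus();

  const json_response = response.json;
  // Store the key for the next page instead of mutating bundle.meta.page.
  await z.cursor.set(json_response['next_pagination_key']);

  return json_response.widgets;
};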
In a simple JavaScript Service Worker I want to intercept a request and read a value from IndexedDB before calling event.respondWith.
But the asynchronous nature of IndexedDB does not seem to allow this.
Since indexedDB.open is asynchronous, we have to await it, which is fine. However, the callback (onsuccess) happens later, so the function will exit immediately after the await on open.
The only way I have found to get it to work reliably is to add:
var wait = ms => new Promise((r, j) => setTimeout(r, ms));
await wait(50)
at the end of my readDB function to force a wait until the onsuccess has completed.
This is completely stupid!
And please don't even try to tell me about promises. They DO NOT WORK in this circumstance.
Does anyone know how we are supposed to use this properly?
Sample readDB is here (all error checking removed for clarity). Note: we cannot use await inside onsuccess, so the two inner IndexedDB calls are not awaited!
async function readDB(dbname, storeName, id) {
  var result;
  var request = await indexedDB.open(dbname, 1); // indexedDB.open is an asynchronous function

  request.onsuccess = function (event) {
    let db = event.target.result;
    var transaction = db.transaction([storeName], "readonly"); // This is also asynchronous and needs await
    var store = transaction.objectStore(storeName);
    var objectStoreRequest = store.get(id); // This is also asynchronous and needs await
    objectStoreRequest.onsuccess = function (event) {
      result = objectStoreRequest.result;
    };
  };

  // Without this wait, this function returns BEFORE the onsuccess has completed
  console.warn('ABOUT TO WAIT');
  var wait = ms => new Promise((r, j) => setTimeout(r, ms));
  await wait(50);
  console.warn('WAIT DONE');
  return result;
}
And please don't even try to tell me about promises. They DO NOT WORK in this circumstance.
I mean, they do, though. Assuming that you're okay putting the promise-based IndexedDB lookups inside of event.respondWith() rather than before event.respondWith(), at least. (If you're trying to do this before calling event.respondWith(), to figure out whether or not you want to respond at all, you're correct in that it's not possible, since the decision as to whether or not to call event.respondWith() needs to be made synchronously.)
It's not easy to wrap IndexedDB in a promise-based interface, but https://github.com/jakearchibald/idb has already done the hard work, and it works quite well inside of a service worker. Moreover, https://github.com/jakearchibald/idb-keyval makes it even easier to do this sort of thing if you just need a single key/value pair, rather than the full IndexedDB feature set.
Here's an example, assuming you're okay with idb-keyval:
importScripts('https://cdn.jsdelivr.net/npm/idb-keyval@3/dist/idb-keyval-iife.min.js');
// Call idbKeyval.set() to save data to your datastore in the `install` handler,
// in the context of your `window`, etc.

self.addEventListener('fetch', event => {
  // Optionally, add in some *synchronous* criteria here that examines event.request
  // and only calls event.respondWith() if this `fetch` handler can respond.
  event.respondWith(async function() {
    const id = someLogicToCalculateAnId();
    const value = await idbKeyval.get(id);

    // You now can use `value` however you want.
    const response = generateResponseFromValue(value);
    return response;
  }());
});
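For completeness, here is a rough sketch of the same lookup without any library, wrapping the IDBRequest callbacks in Promises by hand so readDB can simply be awaited inside event.respondWith(); error handling is kept minimal, and the openDatabase helper is just for illustration:
function openDatabase(dbname) {
  // Wrap the IDBOpenDBRequest callbacks in a Promise so the open can be awaited.
  return new Promise((resolve, reject) => {
    const request = indexedDB.open(dbname, 1);
    request.onsuccess = () => resolve(request.result);
    request.onerror = () => reject(request.error);
  });
}

async function readDB(dbname, storeName, id) {
  const db = await openDatabase(dbname);
  // Wrap the get() request the same way, so the caller just awaits the value.
  return new Promise((resolve, reject) => {
    const store = db.transaction([storeName], 'readonly').objectStore(storeName);
    const getRequest = store.get(id);
    getRequest.onsuccess = () => resolve(getRequest.result);
    getRequest.onerror = () => reject(getRequest.error);
  });
}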
I use a Postgres database query to determine my next action, and I need to wait for the results before I can execute the next line of code. conn.query returns a Future, but I can't get it to work asynchronously when I place my code in another function.
main() {
  // get the database connection string from the settings.ini in the project root folder
  db = getdb();
  geturl().then((String url) => print(url));
}

Future geturl() {
  connect(db).then((conn) {
    conn.query("select trim(url) from crawler.crawls where content IS NULL").toList()
      .then((result) { return result[0].toString(); })
      .catchError((err) => print('Query error: $err'))
      .whenComplete(() {
        conn.close();
      });
  });
}
I just want geturl() to wait for the returned value, but whatever I do, it fires immediately. Can anyone point me to the part of the docs that explains what I am missing here?
You're not actually returning a Future in geturl currently. You have to actually return the Futures that you use:
Future geturl() {
  return connect(db).then((conn) {
    return conn.query("select trim(url) from crawler.crawls where content IS NULL").toList()
      .then((result) { return result[0].toString(); })
      .catchError((err) => print('Query error: $err'))
      .whenComplete(() {
        conn.close();
      });
  });
}
To elaborate on John's comment, here's how you'd implement this using async/await. (The async/await feature was added in Dart 1.9)
main() async {
  try {
    var url = await getUrl();
    print(url);
  } on Exception catch (ex) {
    print('Query error: $ex');
  }
}

Future getUrl() async {
  // get the database connection string from the settings.ini in the project root folder
  db = getdb();

  var conn = await connect(db);
  try {
    var sql = "select trim(url) from crawler.crawls where content IS NULL";
    var result = await conn.query(sql).toList();
    return result[0].toString();
  } finally {
    conn.close();
  }
}
I prefer, in scenarios with multiple chained futures (hopefully soon a thing of the past once await comes out), to use a Completer. It works like this:
Future geturl() {
  final c = new Completer(); // declare a completer.
  connect(db).then((conn) {
    conn.query("select trim(url) from crawler.crawls where content IS NULL").toList()
      .then((result) {
        c.complete(result[0].toString()); // use the completer to deliver the result instead
      })
      .catchError((err) => print('Query error: $err'))
      .whenComplete(() {
        conn.close();
      });
  });
  return c.future; // return the completer's future instead
}
To answer your 'where are the docs' question: https://www.dartlang.org/docs/tutorials/futures/
You said that you were trying to get your geturl() function to 'wait for the returned value'. A function that returns a Future (as in the example in the previous answer) will execute and return immediately; it will not wait. In fact, that is precisely what Futures are for: to avoid code doing nothing or 'blocking' while waiting for data to arrive or an external process to finish.
The key thing to understand is that when the interpreter gets to a call to then() or catchError() on a Future, it does not execute the code inside; it puts it aside to be executed later, when the future 'completes', and then just keeps right on executing any following code.
In other words, when using Futures in Dart you are setting up chunks of code that will be executed non-linearly.