Using Python RegEx in Zapier Formatter Extract Pattern - zapier

I have a field in an RSS item that includes a URL such as:
https://www.facebook.com/9999249845065110
https://www.yelp.com/biz/bix-berkeley-2?hrid=TaFUhHhVrhEJdCPjaB6RUQ
https://www.google.com/search?q=hello%20Signs%20&%20Graphics&ludocid=1720220414695611454#lrd=0x0:0x17df735a614e9c3e,1
I'm trying to setup a Zap in Zapier using the Formatter tool to essentially extract the root domain without the .com. So:
facebook
yelp
google
I have no clue how to use the Formatter Extract Pattern tool though. Can't figure out the syntax.
Best case scenario, it can look at any URL and extract the name of the site (e.g. facebook/google/yelp). If that's too complicated, then I could provide a finite list of what terms to look for and have it return the first (and only) one found. So it would check if the URL contained facebook or google or yelp and if so return that name as a value.
Any help would be appreciated. Thanks.

David here, from the Zapier Platform team.
This is totally possible. The input is the text you want to search (the full url) and the pattern is your regular expression.
In your case, you want to find the word between www. and .com. Use the regular expression www\.(\w+)\.com.
That worked for me, and pulled out yelp.
You can see each part of the regex explained here: https://regex101.com/r/KmwMAV/1
​Let me know if you've got any other questions!

Related

What language is this Salesforce code that I need to wrap?

I'm working on a Salesforce coding issue. Let me preface this by saying I'm not a developer or Salesforce expert.
What language is this?
Data Type FormulaThis formula references multiple objects
IF (Fulfillment_Submission_Form_URL__c <> "" && CONTAINS(Fulfillment_Submission_Form_URL__c, "qualtrics"),
Fulfillment_Submission_Form_URL__c &
(IF (CONTAINS(Fulfillment_Submission_Form_URL__c,"?SID="), "&", "?")) &
(IF (CONTAINS(TEXT(Type__c), "Site Visit"),
"ContactId="&Statement_of_Work__r.Contractor_Contact__c&
"&CoachType="&SUBSTITUTE(Statement_of_Work__r.Work_Type__r.Name," ","%20")&
"&CoachName="&SUBSTITUTE(Statement_of_Work__r.Contractor_Name__c," ","%20")&
"&InitPartId="&Initiative_Participation__r.Id&
"&InstitutionName="&substitute(substitute(SUBSTITUTE(Institution_Name__c," ","%20"),")",""),"(","")&
"&AccountId="&Initiative_Participation__r.Participating_Institution__r.Id&
"&TodaysDate="&TEXT(TODAY())&
"&SOWLineItemId="&Id&
"&LeaderCollege="&Initiative_Participation__r.ATD_Leader_College_Status__c&
"&SVRCompleted="&TEXT(Count_of_Site_Visit_Fulfillments__c)&
"&SVRRequired="&TEXT(Number_of_Work_Units_Allocated__c),
IF (CONTAINS(TEXT(Type__c), "Feedback"),
"InitPartId="&Initiative_Participation__r.Id&
"&SOWLineItemId="&Id&
"&ReportYear="&Statement_of_Work__r.SOW_Year__c&
"&UserId="&Contractor_User_Id__c&
"&InstitutionName="&substitute(substitute(SUBSTITUTE(Institution_Name__c," ","%20"),")",""),"(",""),
"")
))
,"")
Essentially it's pulling a link from another product we've integrated it with. We then take the basic link and reformat it to add parameters.
The problem is when it pulls in some parameters (ex: CoachName) the Coach entered their name in strange formats like: John (Coach) Doe.
So when the script outputs a URL that includes parameters it breaks at the &CoachName=John%20(Coach)% portion of the URL. Any easy way to work around this by modifying the script? Unfortunately we DO need that (Coach) identifier because the system we push to grabs that as well.
It's formula syntax, I'd compare it to Excel-like formulas. There's self-paced training if you don't want to read documentation. And as it's not exactly code-related you may have more luck on dedicated site, https://salesforce.stackexchange.com/. More admins lurk there.
So you do want that "(Coach)" to go through but it breaks the link? Looks like ( is a special character. It's not technically wrong to have unescaped parentheses, if it breaks that other site you might want to contact them and get their act together. RFC doesn't force us to encode them but looks like you'll have to to solve it at least in the short term: https://webmasters.stackexchange.com/questions/78110/is-it-bad-to-use-parentheses-in-a-url
Instead of poor man's encoding (SUBSTITUTE(Statement_of_Work__r.Contractor_Name__c," ","%20") try using proper URLENCODE(Statement_of_Work__r.Contractor_Name__c).
Or there's bit more "pro" function called URLFOR but the documentation doesn't make it very clear how powerful the 3rd parameter is with the braces [key1 = value1, key2 = value2] syntax. Basically just pass the parameters and let SF worry about encoding special characters etc.
Read my answer https://salesforce.stackexchange.com/a/46445/799 and there are some examples on the net like https://support.docusign.com/s/articles/DFS-URL-buttons-for-Lightning-basic-setup-limitations?language=en_US&rsc_301

Regex to normalize topic links in Discourse forum

I am using Discourse forum software. As in its current state, Discourse presents links to topic in two ways, with and without a post number at the end.
Example:
forum.domain.com/t/some-topic/23
forum.domain.com/t/some-topic/23/5
The first one is what I want and the second one I want to not be displayed in the forum at all.
I've written a post about it on Discourse forum but didn't receive an answer what Regex to put in the permalink normalization input field in the admin section.
I was told that there is an option to do it using permalink normalization like so (It's an example shown in the admin under the Regex input text, I didn't write it):
permalink normalizations
Apply the following regex before matching permalinks,
for example: /(topic.)\?./\1 will strip query strings from topic routes.
Format is regex+string use \1 etc. to access captures
I don't know what Regex I should use in order to remove the numerical value of the post number from links. I need it only for topic links.
This is the routes.rb routing library and this is the permalink.rb library (I think that the permalink library should help get a better clue how to achieve this). I have no idea how to approach this, because it seems that I need some knowledge of the Discourse routing to make it work. For example, I don't understand why (topic.) is part of the regex, what does it mean, so their example doesn't help me to find a solution.
In the admin I have an input field in which I nee to put the normalization regex code.
I need help with the Regex. I need the regex to work with all topics.
Things I've tried that didn't work out:
/(\/\d+)\/\d+$/\1
/(t/[^/]+/\d+).*/\1
/(\/\d+)\/[0-9]+$/\1
/(\/\d+)\/[0-9]+/\1
/(\/\d+)\/\d+$/\1/
/(forum.domain.com(\/\w+)*\/\d+)\/\d+(?=\s|$)/\1
Note: The Permalink Normalization input field treats the character | as a separator to separate between several Regex expressions.
I think this may be the expression you are looking for to put inside de settings field:
/(t\/.*\/\d+)(\/\d+)/\1
You can see it working on Rubular.
However, the code that generates the url is not using the normalization code, so the expression is being ignored.
You could try normalizing the permalink there:
def last_post_url
url = "#{Discourse.base_uri}/t/#{slug}/#{id}/#{posts_count}"
url = Permalink.normalize_url url
url
end
I didn't truly understand your question, but if I got it right, you are saying that you want links with /some-number at the end but don't what links with /some-number/some-number at the end. If that is the case, the regex is:
forum\.domain\.com\/t\/[^0-9\/]+\/\d{1,9}$
You can replace 'forum' with your forum name and 'domain' with your domain name.
This will remove trailing "/<digits>" after another "/<digits>":
/(forum.domain.com(\/\w+)*\/\d+)\/\d+(?=\s|$)/\1

Terms for URL with Query Parameters

Say I have a website that accepts URLs of the form:
http://mywebsite.com/viewdata?content=aggregated&format=circular
Where the content is modified by a parameter (content) and the format is modified by a parameter (format).
What is the term for this sort of URL? How do you refer to the URLs of a "dynamic" website? Is there a difference in terms for parameters that change the content vs the presentation?
This is similar to this question, but I'm trying to figure out what to call the entire URL and the process of using the URL to dynamically change page content.
Sorry if the question is a bit unclear: it's hard to ask when you don't even know what words to use.
you got to learn this tutorial
http://w3schools.com/php/ or http://w3schools.com/asp programming or .... langs

Custom URL format for news in Expression Engine

Our site is migrating from MovableType to ExpressionEngine, and there is one small issue we are having. MT uses a date based URL structure, e.g. www.site.com/2012/03/post-title.html, while EE uses a category based structure, e.g. www.site.com/index.php/news/comments/post-title. The issue is that our MT page used Disqus for comments, and as such comments are tied to a specific URL, meaning that we'd lose all of our comments if we were to migrate. I am wondering if there's a way to change the URL structure in EE to match MT's, thus allowing us to keep the comments. Thanks in advance.
Correction: EE uses a Template Group/Template based structure for URLs, not categories - just to clarify.
You've got a couple of options here.
One is to create an .htaccess rule which internally redirects all requests matching YYYY/MM/ to your EE template which displays your posts (say, /news/entry/). I don't know exactly what those rewrite rules would look like off the top of my head, my mod_rewrite-fu is pretty shallow. But it could definitely work.
Another is to export all of your comments from Disqus via their XML export tool, then do a grep-based find and replace using something like BBEdit, replacing all /YYYY/MM/ strings in that file with /news/entry/; delete all of your existing comments on Disqus; then import your newly-modifed XML file.

dynamic seo title for news articles

I have a news section where the pages resolve to urls like
newsArticle.php?id=210
What I would like to do is use the title from the database to create seo friendly titles like
newsArticle/joe-goes-to-town
Any ideas how I can achieve this?
Thanks,
R.
I suggest you actually include the ID in the URL, before the title part, and ignore the title itself when routing. So your URL might become
/news/210/joe-goes-to-town
That's exactly what Stack Overflow does, and it works well. It means that the title can change without links breaking.
Obviously the exact details will depend on what platform you're using - you haven't specified - but the basic steps will be:
When generating a link, take the article title and convert it into something URL-friendly; you probably want to remove all punctuation, and you should consider accented characters etc. Bear in mind that the title won't need to be unique, because you've got the ID as well
When handling a request to anything starting with /news, take the next part of the path, parse it as an integer and load the appropriate article.
Assuming you are using PHP and can alter your source code (this is quite mandatory to get the article's title), I'd do the following:
First, you'll need to have a function (or maybe a method in an object-oriented architecture) to generate the URLs for you in your code. You'd supply the function with the article object or the article ID and it returns the friendly URL with the ID and the friendly title.
Basically function url(Article $article) => URL.
You will also need some URL rewriting rules to remove the PHP script from the URL. For Apache, refer to the mod_rewrite documentation for details (RewriteEngine, RewriteRule, RewriteCond).

Resources