I have a couple of questions about the URL returned for the thumbnail link in the Document List API:
1) Is the URL public? (It does not seem to be access controlled.)
2) Does the URL remain the same no matter when we get the link, or is it time limited in some way?
Regards,
LT
The webUrl property of a OneDrive item differs depending on whether the file can be opened online or not.
For example, when I call /v1.0/me/drive/root/children I get items with webUrl values like these:
https://domain-my.sharepoint.com/personal/user/_layouts/WopiFrame.aspx?source={{id}}&file={{filename}}
https://domain-my.sharepoint.com/personal/Documents/Folder/filename.txt
Why not use a consistent URL here, given that the Online apps also work with the second URL?
The second URL can also be used by the client apps to open the document, while the first URL can't.
It's also hard to construct the second form of the link from the other item properties, while the first form can be constructed easily.
The purpose of webUrl is to provide a URL that displays the resource in the browser. Where a specialized experience can be provided (such as coediting Office files in the web app) we return URLs specific to those scenarios; otherwise we return a generic URL and expect the browser to "do the right thing".
You can always get URLs of the second form by selecting the webDavUrl property when requesting the item.
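As a rough illustration of "selecting the webDavUrl property", here is a minimal Python sketch that only builds the request URL. The host `https://graph.microsoft.com` and the helper name `children_url` are assumptions for illustration; the thread itself only shows the `/v1.0/me/drive/root/children` path.

```python
from urllib.parse import urlencode

# Assumed host; the thread only gives the /v1.0/... path.
GRAPH_BASE = "https://graph.microsoft.com"


def children_url(fields):
    """Build a children-listing URL that asks only for the given properties."""
    # Keep "$" and "," unescaped so the query stays readable.
    query = urlencode({"$select": ",".join(fields)}, safe="$,")
    return f"{GRAPH_BASE}/v1.0/me/drive/root/children?{query}"


if __name__ == "__main__":
    print(children_url(["name", "webUrl", "webDavUrl"]))
```

The request still needs whatever authentication your app already uses; that part is left out here.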
See this documentation for descriptions of these fields.
We're successfully using the YouTube API to create the metadata-and-url XML feed that the GSA requires, and pushing it to our Google Search Appliance according to the documentation.
Our question: we know you need to put a start URL in the Content Sources > Web Crawl > Start and Block URLs page in the Admin Console. If we put in https://www.youtube.com as a start URL and a follow pattern of https://www.youtube.com/watch?v=* (which looks like the pattern all YouTube videos follow), will the GSA only index what's coming from the feed, or will it go out to youtube.com and index a bunch of content that isn't part of our channel? I don't see anywhere you can specify a channel for a video.
FYI, we are aware of the Fishbowl Solutions connector for YouTube, but we are trying to avoid spinning up another server with Tomcat just to index our YouTube videos.
You should not add the YouTube URL to your Start URLs, only to your Follow Patterns. That way, the crawler will not crawl YouTube from top to bottom, but the URLs you provide in the feed will be crawled. However, if the GSA finds URLs on the crawled pages that match the Follow Patterns, it will also crawl those.
One option is to tighten the Follow Patterns. And of course you can develop a YouTube connector on Google's Adaptor Framework, which is not that hard for Java developers!
Google CSE Search
YouTube User Panel
I haven't used GSA (I'm getting ramped up on it though, which is how I found your post), but the way I've accomplished this using Google's CSE is to index the channel, user or playlist specifically, rather than YouTube in general, i.e.:
youtube dot com/user/alltrapmusic
or: youtube dot com/channel/UC_ahy2GUec7EmbWF3LGxLhQ
or: youtube dot com/playlist?list=PLsHnWFR4n5jBFYdsclaKtdWQtf2Iu8bKZ
So, in CSE, I can configure to search only that user, channel and playlist and return only results found on those three (Google CSE Search link).
I can only assume GSA works the same (as I mentioned, I have no experience with GSA); if not, my apologies.
~chipleh
P.S. To find your YouTube channel, go to the user link (YouTube User Panel link); there you'll find home, videos, playlists, channels, etc. Hope that helps.
For anyone else looking to use the YouTube API and push their videos to the GSA, we found that a few changes to the feed are needed.
The feedtype needs to be full in the XML. This tells the GSA that everything it needs to know about the content is in the XML and that it doesn't need to go out and index a URL.
You need to have a <content> node in the XML. We used the description coming from the YouTube API as the value. This is what is displayed to the user in the search results.
The url attribute on the record needs to be a unique value that matches the Start and Block URLs and Follow Patterns in the GSA settings. These URLs don't actually need to exist, but the GSA uses this value in the XML to determine whether the record should be included in the index. We used a fake URL with the YouTube video ID appended to make it unique.
The displayurl attribute is the URL that will be displayed in the results, so it should hold the actual YouTube URL.
Start and Block URLs should contain the general url attribute value. For us, it was the fake directory http://www.yourdomain.com/video/youtube/
Follow Patterns should contain a pattern that also matches the Start URL. Since we only have videos in that directory, we were able to use the same value as the Start URL. If you are pointing to a real directory that has other content you don't want to index, you may need to add whatever pattern is common to your videos.
A sample record is below. Once we updated our feed and added the Start and Block URLs, our videos appeared in our search results.
<gsafeed>
  <header>
    <datasource>youtube</datasource>
    <feedtype>full</feedtype>
  </header>
  <group action="add">
    <record url="http://www.yourdomain.com/video/youtube/?VIDEOID" displayurl="https://www.youtube.com/watch?v=VIDEOID" mimetype="text/html">
      <content><![CDATA[DESCRIPTION]]></content>
      <metadata>
        <meta name="Title" content="TITLE OF VIDEO"></meta>
        <meta name="Published" content="2016-08-15T22:00:38.000Z"></meta>
        <meta name="PhotoURL" content="https://i.ytimg.com/.."></meta>
      </metadata>
    </record>
  </group>
</gsafeed>
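In case it helps, a feed shaped like the sample above could be generated with something like the following Python sketch. It is not the code we actually used. The `build_feed` helper and the keys of the video dicts (`id`, `title`, `description`, `published`) are made up for illustration, and the base URL is the same placeholder as in the sample. ElementTree escapes the description text instead of wrapping it in CDATA, which parses to the same content.

```python
import xml.etree.ElementTree as ET

# Placeholder "fake directory" from the sample record above.
FAKE_BASE = "http://www.yourdomain.com/video/youtube/"


def build_feed(videos):
    """Return a gsafeed XML string with one <record> per video dict."""
    feed = ET.Element("gsafeed")
    header = ET.SubElement(feed, "header")
    ET.SubElement(header, "datasource").text = "youtube"
    ET.SubElement(header, "feedtype").text = "full"
    group = ET.SubElement(feed, "group", action="add")
    for video in videos:
        record = ET.SubElement(group, "record", {
            # Unique fake URL that matches the Start URL / Follow Pattern.
            "url": f"{FAKE_BASE}?{video['id']}",
            # Real URL shown to the user in the results.
            "displayurl": f"https://www.youtube.com/watch?v={video['id']}",
            "mimetype": "text/html",
        })
        # Text is XML-escaped here rather than emitted as CDATA.
        ET.SubElement(record, "content").text = video["description"]
        metadata = ET.SubElement(record, "metadata")
        ET.SubElement(metadata, "meta", name="Title", content=video["title"])
        ET.SubElement(metadata, "meta", name="Published", content=video["published"])
    return ET.tostring(feed, encoding="unicode")


if __name__ == "__main__":
    print(build_feed([{
        "id": "VIDEOID",
        "title": "TITLE OF VIDEO",
        "description": "DESCRIPTION",
        "published": "2016-08-15T22:00:38.000Z",
    }]))
```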
I need to get the base URL (the original URL) from the short URL that Twitter returns in the form http://t.co...
Is there any way to get the original URL from this? Please suggest.
Any kind of help will be appreciated.
Look at this link, https://dev.twitter.com/docs/tweet-entities
Especially "The media entity" part
One of these fields from the API should solve your problem:
media_url: The URL of the media file (see the sizes attribute for available sizes)
media_url_https: The SSL URL of the media file (see the sizes attribute for available sizes)
url: The media URL that was extracted
display_url: Not a URL but a string to display instead of the media URL
expanded_url: The fully resolved media URL
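As a rough sketch, pulling these fields out of a tweet that has already been parsed from JSON could look like this in Python. The `expanded_urls` helper and the sample tweet are invented for illustration. The loop also looks at the `urls` entity, which carries the same `url` and `expanded_url` fields for ordinary links.

```python
def expanded_urls(tweet):
    """Map each shortened URL in a tweet dict to its expanded form."""
    entities = tweet.get("entities", {})
    mapping = {}
    for kind in ("urls", "media"):
        for entity in entities.get(kind, []):
            short = entity.get("url")
            full = entity.get("expanded_url")
            # Skip entities that are missing either field.
            if short and full:
                mapping[short] = full
    return mapping


if __name__ == "__main__":
    # Hypothetical tweet payload, trimmed to the fields used above.
    sample = {
        "entities": {
            "urls": [
                {"url": "http://t.co/abc", "expanded_url": "http://example.com/page"},
            ],
        },
    }
    print(expanded_urls(sample))
```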
I am working on an ASP.NET site. When a user enters a YouTube URL in his or her profile, my client wants my C# code, jQuery code, or some YouTube API call to check that the URL actually exists on youtube.com. I have found many solutions, but most of them are string matchers or URL pattern checkers. My requirement is to check that the URL exists on youtube.com and that the video is publicly viewable.
Can anyone help me out?
Match the URL against the known pattern of a YouTube video URL (see the Stack Overflow example).
Use an AJAX call to your server to screen scrape the entered URL and check for a 404 header, or for the typical text "The video you have requested is not available.", using the HTML Agility Pack for C#.
I can write this for you but it will cost you. :)
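For reference, the two steps above could be sketched roughly like this. It is Python rather than C#, and it swaps the screen scrape for a status check against YouTube's oEmbed endpoint, which is my substitution and not something from the answer. The helper names are made up, and the status fetcher is injectable so the logic can be exercised without a network call.

```python
import urllib.error
import urllib.request
from urllib.parse import parse_qs, quote, urlparse


def extract_video_id(url):
    """Return the video ID from a watch or youtu.be URL, or None."""
    parsed = urlparse(url)
    host = (parsed.hostname or "").lower()
    if host in ("www.youtube.com", "youtube.com", "m.youtube.com") and parsed.path == "/watch":
        ids = parse_qs(parsed.query).get("v", [])
        return ids[0] if ids else None
    if host == "youtu.be":
        return parsed.path.lstrip("/") or None
    return None


def oembed_status(video_id):
    """Return the HTTP status of the oEmbed lookup (makes a network call)."""
    watch = quote(f"https://www.youtube.com/watch?v={video_id}", safe="")
    endpoint = f"https://www.youtube.com/oembed?format=json&url={watch}"
    try:
        with urllib.request.urlopen(endpoint, timeout=10) as response:
            return response.status
    except urllib.error.HTTPError as error:
        return error.code


def is_public_video(url, fetch_status=oembed_status):
    """True if the URL looks like a YouTube video and the lookup returns 200."""
    video_id = extract_video_id(url)
    return video_id is not None and fetch_status(video_id) == 200
```

Treat any non-200 status as "not usable"; the exact codes returned for private or removed videos are something to verify against the live service.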
I am using the Twitter Search API to search for a URL. Here's an example:
http://search.twitter.com/search.json?q=url.com
The JSON response gives me the shortened URL of each search result. Is there a way for me to retrieve the full URL of each result?
From 11/2011, you can use the include_entities=true parameter to retrieve full tweet entities, which include the expanded URL (and a lot more)
https://dev.twitter.com/docs/using-search
You will have to programmatically request each URL yourself and see where it redirects to.
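A minimal Python sketch of that idea, assuming the shortener answers with ordinary HTTP redirects. The `expand_url` name is made up, and the `opener` parameter exists only so the function can be tried without network access.

```python
import urllib.request


def expand_url(short_url, opener=urllib.request.urlopen):
    """Request short_url and return the URL the response finally came from."""
    # urlopen follows HTTP redirects by default; geturl() reports the
    # URL of the resource that was actually retrieved.
    with opener(short_url, timeout=10) as response:
        return response.geturl()
```

Calling `expand_url("http://t.co/...")` with a real short link would return whatever the redirect chain ends at; shorteners that redirect via JavaScript or meta refresh will not be resolved this way.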
On Twitter Search, you can use the same URL endpoint that Twitter Search uses to expand shortened URLs: http://search.twitter.com/hugeurl. For example, if you wanted to expand the shortened URL http://bit.ly/jIhqhq:
$ curl "http://search.twitter.com/hugeurl?url=http://bit.ly/jIhqhq"
http://edition.cnn.com/2011/SPORT/football/05/03/may.03.cnn.top.10/index.html/
This will only work for the more popular shorteners (bit.ly, j.mp, etc.). Also, this AJAX endpoint is pretty aggressively rate-limited, so don't expect to be able to use it for a production application, but something like 10 requests an hour should be fine.
Not currently through Twitter. On Twitter.com, those shortened URLs are automatically expanded into something readable, however search.twitter.com doesn't seem to be expanding the t.co shortened URLs at this time.