How do URL shortening sites work? - url

How do URL shortening sites like bit.ly or goo.gl work? Does anyone know what technique or algorithm they use?

Save the URL, generate a unique key for it, and store both in the DB. When the key is requested, look it up and redirect to the stored URL.
Do you need a complex algorithm for this? :-)
If you want to make it more complex:
Check for malicious URLs and block them
Collect stats based on the number of clicks
Add registration so that users have their own short URLs
Develop browser plugins that generate short URLs
etc.
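
The basic store-and-look-up idea can be sketched roughly as follows. This is only an illustration, not how bit.ly or goo.gl are actually built: the base-62 encoding of a counter, the `Shortener` class, and the in-memory dicts standing in for the DB are all assumptions made for the example.

```python
import string
from typing import Optional

# 0-9, a-z, A-Z: 62 characters to build short keys from.
ALPHABET = string.digits + string.ascii_letters


def encode_base62(n: int) -> str:
    """Turn a non-negative integer ID into a short base-62 key."""
    if n == 0:
        return ALPHABET[0]
    chars = []
    while n > 0:
        n, rem = divmod(n, 62)
        chars.append(ALPHABET[rem])
    return "".join(reversed(chars))


class Shortener:
    """Hypothetical in-memory stand-in for the database table."""

    def __init__(self) -> None:
        self._urls = {}    # key -> long URL
        self._clicks = {}  # key -> number of lookups
        self._next_id = 1

    def shorten(self, long_url: str) -> str:
        """Store the URL under a fresh key and return the key."""
        key = encode_base62(self._next_id)
        self._next_id += 1
        self._urls[key] = long_url
        self._clicks[key] = 0
        return key

    def resolve(self, key: str) -> Optional[str]:
        """Return the long URL to redirect to, counting the click."""
        url = self._urls.get(key)
        if url is not None:
            self._clicks[key] += 1
        return url

    def clicks(self, key: str) -> int:
        return self._clicks.get(key, 0)
```

A web handler would call `resolve()` with the key from the request path and answer with an HTTP redirect to the returned URL, or a 404 when it returns `None`.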

Related

How to share URL with UTM parameters on WhatsApp

I have given my users an option to share my website on WhatsApp, and I want to know how many users land back on the website through the shared link. So the share button opens this link:
https://wa.me/919876543210?text=https://www.mywebsite.com?utm_source=whatsapp&utm_medium=share
But the trailing &utm_medium=share is treated as a parameter of the wa.me URL, so only https://www.mywebsite.com?utm_source=whatsapp is shared on WhatsApp. So instead I did this:
https://wa.me/919876543210?text=https://www.mywebsite.com?utm_source=whatsapp%26utm_medium=share
which shares the correct URL on WhatsApp: https://www.mywebsite.com?utm_source=whatsapp%26utm_medium=share, but when I open it, the UTM params are not captured by GA.
What is the way out of this loop?
There's a more elegant way of doing it than UTM params. Have something like: https://wa.me/919876543210?text=https://www.mywebsite.com?t=wa
See how it's now shorter and more elegant to a user? Now you have two good options.
Make a conditional redirect on your site from any URL that has a t=wa query param to whatever UTM params you want, with no restrictions.
And even more elegantly: use GTM to parse pageviews where a t query parameter is set, then make a neat lookup table where the input is the value of t and the output is whatever you want to name it. Then use that lookup table's value to set a session-level custom dimension on pageviews.
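
For the first option, the server-side redirect logic might look something like this sketch. The `TAG_TO_UTM` table and the `expand_tag` helper are made-up names for illustration; how you hook it into your site's request handling depends on your stack.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Hypothetical lookup table: short tag value -> campaign parameters.
TAG_TO_UTM = {
    "wa": {"utm_source": "whatsapp", "utm_medium": "share"},
}


def expand_tag(url):
    """Return the redirect target for a URL with a known t=... tag, else None."""
    parts = urlsplit(url)
    query = parse_qsl(parts.query, keep_blank_values=True)
    tag = dict(query).get("t")
    utm = TAG_TO_UTM.get(tag)
    if utm is None:
        return None  # no tag, or an unknown one: serve the page as usual
    # Drop t and append the campaign parameters, keeping any other params.
    kept = [(k, v) for k, v in query if k != "t"]
    new_query = urlencode(kept + list(utm.items()))
    return urlunsplit(parts._replace(query=new_query))
```

The same table shape (value of t in, label out) is what the GTM lookup table in the second option would hold.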
Why a custom dimension and not UTM? Because when using UTMs, you're affecting your attribution and your sessions. You can easily override organic or paid attribution with some meaningless WhatsApp attribution. That said, if you don't use attribution at all and you don't care about GA session breakpoints, then sure, UTMs are just easier.
Also, try escaping the &, but I don't have much hope for that.
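
If you do want to experiment with escaping, one variant is to percent-encode the whole shared URL rather than only the &. This is a sketch with a made-up helper name, and it is not verified here that WhatsApp or GA will then behave as wanted.

```python
from urllib.parse import quote


def whatsapp_share_link(phone, shared_url):
    """Build a wa.me link whose text parameter is fully percent-encoded."""
    # safe="" makes quote() also escape ":", "/", "?", "&" and "=",
    # so nothing in shared_url can be read as part of the wa.me query.
    return f"https://wa.me/{phone}?text={quote(shared_url, safe='')}"
```

For the URL from the question, this produces a link ending in `text=https%3A%2F%2Fwww.mywebsite.com%3Futm_source%3Dwhatsapp%26utm_medium%3Dshare`.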

Can malicious code, virus, etc be loaded onto a site that accepts embed links?

I operate a CMS site (a YouTube-like video server) that lets users embed links to videos elsewhere on the web, e.g. www.vimeo.com/videos/sjek3469df
Is there any way someone could enter a URL "link" of any kind that could infect my website?
Thanks in advance all!
It really depends on how your site is set up, but yes, there would be XSS concerns. At the very least, I'd suggest a whitelist for allowed video hosts (with particular URL patterns, not just acceptable domains). You should also consider parsing the URLs to obtain the video IDs, and using those to generate your own embedding code on a per-host basis. That would give you more customization power, not just more security.
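
As a rough sketch of that approach: accept only known hosts with a specific URL shape, pull out the video ID, and build the embed markup yourself. The URL patterns and the 11-character YouTube ID check below are assumptions about what those hosts' links commonly look like, and the function names are invented for the example; adjust them to the hosts you actually allow.

```python
import re
from html import escape
from urllib.parse import parse_qs, urlsplit

# Assumed URL shapes; check each host's current link formats before relying on them.
VIMEO_PATH = re.compile(r"/([0-9]+)/?")
YOUTUBE_ID = re.compile(r"[A-Za-z0-9_-]{11}")

EMBED_SRC = {
    "vimeo": "https://player.vimeo.com/video/{id}",
    "youtube": "https://www.youtube.com/embed/{id}",
}


def extract_video(url):
    """Return (host_name, video_id) for an allowed URL, else None."""
    parts = urlsplit(url)
    if parts.scheme not in ("http", "https"):
        return None  # rejects javascript:, data:, etc.
    host = (parts.hostname or "").lower()
    if host in ("vimeo.com", "www.vimeo.com"):
        m = VIMEO_PATH.fullmatch(parts.path)
        return ("vimeo", m.group(1)) if m else None
    if host in ("youtube.com", "www.youtube.com") and parts.path == "/watch":
        ids = parse_qs(parts.query).get("v", [])
        if len(ids) == 1 and YOUTUBE_ID.fullmatch(ids[0]):
            return ("youtube", ids[0])
    return None


def embed_html(url):
    """Build our own iframe from the extracted ID; never echo the user's URL."""
    found = extract_video(url)
    if found is None:
        return None
    host_name, video_id = found
    src = EMBED_SRC[host_name].format(id=video_id)
    return f'<iframe src="{escape(src, quote=True)}" allowfullscreen></iframe>'
```

Because only the validated ID reaches the page, whatever else the user typed into the link field is discarded.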

What is the advantage of putting the language indicator into the URL?

I'm building a site which supports multiple languages. At the moment, I'm using /en/… in the URL path and .htaccess to determine which language the user is on. It is very common for multilingual sites to use either http://en.example.com or http://example.com/en/.
My question is: Why is it so common to show in the URL which language the user is viewing? I can't see any technical advantages. Is it for optimizing user experience?
After all, you could easily just use sessions/cookies and hide it from the user, which is what I'm leaning towards at the moment.
Thanks in advance :)
For easy bookmarking probably.
Specifying the language in the URL is one way to indicate that you want to view the page in that particular language, regardless of your current locale.
Putting this information in the URL is better than using a cookie, for example, as some users delete all cookies after each browsing session.
And because of this pseudo-REST-like URL, /en/, the page is easily bookmarkable and search-engine friendly.
I think it's used as a substitute for owning the domain within each TLD (i.e. company.co.uk and company.com).
It's also useful because the URI itself can be localised: ikea.com/se/stolar could be the localised variant of ikea.com/en/chairs, which helps both the end user and SEO.
It is not a directory but mod_rewrite. A URL such as:
http://google.pl/en
gets rewritten server-side to:
http://google.pl?lang=en
and this is handier for every language.
Why? Because if a client saves a link to our page in their favorites and sends it to a friend, the link also carries the language of the page they were viewing. If the default language was, for example, Polish, and they changed it to English, the friend is saved the time of finding and clicking the language button.
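
The effect of that rewrite (language prefix in the path becomes a language parameter for the application) could be sketched like this. The set of supported languages, the default, and the function name are assumptions for the example; it is not mod_rewrite configuration.

```python
# Hypothetical settings for the example.
SUPPORTED_LANGS = {"en", "pl", "se"}
DEFAULT_LANG = "pl"


def split_language(path):
    """Return (lang, remaining_path) for a path such as /en/chairs."""
    head, _, rest = path.lstrip("/").partition("/")
    if head.lower() in SUPPORTED_LANGS:
        return head.lower(), "/" + rest
    # No recognised prefix: fall back to the default language.
    return DEFAULT_LANG, path if path.startswith("/") else "/" + path
```

Since the language travels in the path, a bookmarked or forwarded link such as /en/chairs resolves to the same language for whoever opens it.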
If you put it in the URL, search engines will index every page in every language. If you use cookies, they will only index one. So it's more of an SEO advantage, I think.

Enable Query Strings in CodeIgniter

I am trying to implement Twitter's OAuth in my CodeIgniter web application, where the callback URL is /auth/, so once you have authenticated with Twitter you are taken to /auth/?oauth_token=SOME-TOKEN.
I want to keep the nice clean URLs the framework provides, using the /controller/method/ style, but I want to enable query strings as well. There will only ever be one parameter name, oauth_token, so it's OK if it has to be hard-coded.
Any ideas?
I have tried tons of the things people are saying to do, but none work :(
PS: I'm using the .htaccess method of URL rewriting.
There are several ways to handle this.
Most people, and Elliot Haughin's Twitter lib, extend the CI_Input library with a MY_Input library that sets allow_query_strings to true.
You will also need to add ? to the allowed characters in config/config.php and set $config['uri_protocol'] to PATH_INFO.
See here: Enable GET in CodeIgniter
CodeIgniter Reactor lets you access $_GET directly or via $this->input->get(). You don't need to use MY_Input or even change your config.php. This method leaves the query string in the URL, however.
I used a hacked index.php to recognise users coming back from Twitter, check for valid and safe values, then redirect them to a CodeIgniter-friendly URL.
It may not be to everyone's taste, but I preferred it over allowing query strings throughout the entire application instead of in just one particular circumstance.

Track a short URL generated for a long URL

I'm writing a URL shortener similar to tinyurl and I'm wondering how to keep track of URLs that have already been shortened using my service. For example, tinyurl generates the same tiny URL for the same long URL regardless of who creates it. How can this be achieved in a scalable way? Bitly also does this, though they generate a new URL per person. However, they are able to track the aggregate (total number of) clicks for the long URL. How?
Thanks,
They store the URLs in their database, associated with the short URL(s). How else would it be done?
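
To make that concrete, here is a minimal in-memory sketch. It is not tinyurl's or Bitly's actual schema: the `LinkStore` class, the hex counter used as the short code, and the dicts standing in for indexed database tables are all assumptions for illustration.

```python
from collections import defaultdict
from itertools import count


class LinkStore:
    """Hypothetical stand-in for the database tables."""

    def __init__(self):
        self._ids = count(1)
        self.long_by_short = {}         # short code -> long URL
        self.short_by_owner = {}        # (long URL, owner) -> short code
        self.clicks = defaultdict(int)  # short code -> click count

    def shorten(self, long_url, owner=None):
        """Reuse the existing code for this (URL, owner) pair if there is one."""
        key = (long_url, owner)
        if key in self.short_by_owner:
            return self.short_by_owner[key]
        short = format(next(self._ids), "x")
        self.short_by_owner[key] = short
        self.long_by_short[short] = long_url
        return short

    def visit(self, short):
        """Count a click and return the long URL to redirect to."""
        long_url = self.long_by_short[short]
        self.clicks[short] += 1
        return long_url

    def total_clicks(self, long_url):
        """Aggregate clicks across every short code pointing at long_url."""
        return sum(
            n for short, n in self.clicks.items()
            if self.long_by_short[short] == long_url
        )
```

With `owner=None` every caller gets the same code for the same long URL (the tinyurl behaviour described in the question); passing an owner gives one code per person while `total_clicks()` still sums over the shared long URL. In a real database the lookup by long URL would be backed by an index, possibly on a hash of the URL.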
