PHP: Block all hyperlinks for guests

I have been trying for two hours to find a way to block hyperlinks for guests, but no luck.
The text content is loaded from the database and rendered as HTML. Any link a user adds, whether as a plain http:// URL or as an anchor tag, is automatically converted to a link.
I want to hide/block all hyperlinks from guest users. I need your help, I am really stuck.
I get the content from the database in the array element $data['content']; this is HTML content.

Without seeing your code or knowing more about the context, here is something to study and apply to your situation: put this code into a file called demo.php or something like that and run it on the command line:
<?php
// Two kinds of user input: a bare URL and an HTML anchor tag.
$text_with_links[] = 'http://example.com';
$text_with_links[] = '<a href="http://example.com">Link</a>';
echo "User text:\n\n";
foreach ( $text_with_links as $text_example ) {
    echo "\t{$text_example}\n";
}
echo "\n\n";
// htmlspecialchars() escapes the markup so it is displayed as text.
echo "User text filtered by htmlspecialchars:\n\n";
foreach ( $text_with_links as $text_example ) {
    echo "\t" . htmlspecialchars( $text_example ) . "\n";
}
echo "\n\n";
// strip_tags() removes the tags and keeps only the text between them.
echo "User text filtered by strip_tags:\n\n";
foreach ( $text_with_links as $text_example ) {
    echo "\t" . strip_tags( $text_example ) . "\n";
}
You should see something like this:
$ php demo.php
User text:
http://example.com
<a href="http://example.com">Link</a>
User text filtered by htmlspecialchars:
http://example.com
&lt;a href=&quot;http://example.com&quot;&gt;Link&lt;/a&gt;
User text filtered by strip_tags:
http://example.com
Link
This demonstrates the use of two core PHP functions, htmlspecialchars and strip_tags. Read the documentation for these here:
http://www.php.net/manual/en/function.htmlspecialchars.php
http://www.php.net/manual/en/function.strip-tags.php
Notice how neither of these functions makes changes to the user entry "http://example.com". When you say that "All links user add either plain http:// format ... it is automatically convert to the link.", I suspect that there is a process happening in the middle that you are leaving out. Simply printing the string "http://example.com" onto a page will not normally cause it to be rendered as an HTML anchor tag.
As others have commented, to get the best help you should follow stackoverflow.com guidelines and provide all relevant details and context to your problem, and also print the code that you have tried so far so others can correct it and also get a better idea of what you are dealing with.
Please mention what framework or CMS you are using (as I strongly suspect you are using one). Drupal, Joomla, Wordpress, etc.
For the time being, you can simply remove "http://" from all strings you retrieve from the database using str_replace() - http://www.php.net/manual/en/function.str-replace.php. That is not the best solution though. I think what is happening is that the software you are using is converting the "http:// format" text (to use your words) into HTML anchor tags for you.
For example, if you are using Drupal and have provided a webform that guests have access to, you should change the format from "Full HTML" to something more limited, otherwise you are allowing anonymous users to enter HTML into your database.
You may not need a PHP code solution at all. The best thing is to check the controls on the WYSIWYG editor that you provide your guest users and make sure the template is set up correctly.
Also, you should Google "prevent XSS attacks PHP", and "prevent SQL injection PHP". This is research that you must do if you wish to have a secure website.
Good luck. If you provide more details, I and others can give you more specific help.
EDIT: Printing what I wrote in comment below, for easier reading:
$content_tags_stripped = strip_tags( $data['content'] );
OR
$content_no_htmlspecialchars = htmlspecialchars( $data['content'] );
Then use those variables in your code instead of $data['content'].
But please study the rest of what I wrote when you have the time. I think you will be able to find a more satisfactory solution to your problem that way. Good luck.

Related

How do I pull text from google document to display on my site?

I have a user who wants to update the text on their website using a google document. Is it possible?
I haven't looked into whether it's possible with an API, but I do know that it's possible to load the page and parse the contents quite easily.
Here is a (rather pointless) snippet from when I parsed a Google document using C# -
contents = title = browser.ResponseData;
contents = contents.Substring(contents.IndexOf(">DOCS_mutations") + 20);
contents = contents.Substring(0, contents.IndexOf("\"}") + 1);
contents = contents.Substring(contents.IndexOf("\"s\":"));
contents = contents.Substring(contents.IndexOf(":") + 1);
contents = contents.Substring(1, contents.Length - 2);
title = title.Substring(title.IndexOf("/><script type=\"text/javascript\">"));
title = title.Substring(title.IndexOf("config, '") + 9);
title = title.Substring(0, title.IndexOf("'"));
The above snippet is grabbing the content of the document from the HTML by parsing the following region (where foo was the contents of my document) -
DOCS_mutations = [{"ty":"is","ibi":1,"s":"foo"},
This information is a mere example of how possible it would be to do this.
It is important to note that some characters are replaced with others so that they do not interfere with the HTML. For example, the content of
foo " / \ <
is replaced with
foo \" / \ \u003C
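As a rough illustration of undoing that escaping, here is a minimal JavaScript sketch. It assumes the captured value uses standard JSON string escaping (so a literal backslash would appear doubled in the page source), which may not hold for every document; decodeDocString is a hypothetical helper name, not part of any Google API.

```javascript
// Hypothetical helper: decode the escaped text captured from the "s" field.
// Assumption: the value follows JSON string-escaping rules.
function decodeDocString(raw) {
  // Wrapping the raw text in quotes turns it into a JSON string literal,
  // so JSON.parse resolves \" , \\ and \uXXXX escapes for us.
  return JSON.parse('"' + raw + '"');
}

// String.raw keeps the backslashes exactly as they would appear in the page source.
console.log(decodeDocString(String.raw`foo \" / \\ \u003C`)); // foo " / \ <
```

If the text turns out not to be valid JSON, JSON.parse throws a SyntaxError, which is a useful signal that the page format has changed.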
I would highly recommend that you consider thinking of an alternative that would satisfy your client, as such a method of parsing another page in order to update another feels very unnecessary.
Assuming you're using PHP, if you really need to go through with this, then I would recommend using cURL or file_get_contents() to download the document on your server.
I should note that you must consider the efficiency of your ideas, as you certainly don't want to be downloading the latest document from Google Docs every time someone wishes to view one of your pages. I would recommend invoking a re-download upon the request/instruction of a site administrator. Also note that Google Docs updates the server-side version regularly (whilst you're modifying the document). This matters because your server could download an unfinished document if you do not take it into consideration.
Finally, as an alternative to using Google Docs, I suggest creating an administration control panel for your client to use. Ask them what it is about Google Docs that makes them wish to use it in such a way, and then implement the features that would make your own editor an acceptable alternative to Google Docs.
Let me know if you need further clarification, or if you desire some example code.

Extraction from string - Ruby

I have a string containing HTML that serves as a teaser for the blog posts I am creating. The whole HTML teaser is stored in a field in the database.
My goal: when a user likes a certain blog post (Facebook "like" social button), the right data should be displayed on their news feed. To do that, I need to extract the image path inside src="i-m-a-g-e--p-a-t-h" from the first image in the teaser. This works when a user puts only one image in the teaser, but if they accidentally put two or more images the whole thing crashes.
Furthermore, for the description field I need to extract the text inside the first <p> tag. The problem is that a user can also put an image inside the first tag.
I would very much appreciate if an expert could help me resolve this what's been bugging me for days.
Text string with a regular expression for extracting src can be found here: http://rubular.com/r/gajzivoBSf
Thanks!
Don't try to parse HTML by yourself. Let the professionals do it.
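require 'nokogiri'
# Parse the teaser as an HTML fragment rather than a full document.
frag = Nokogiri::HTML.fragment( your_html_string )
# at_css returns the first match or nil, so &. guards against a teaser
# that has no image or no paragraph (requires Ruby 2.3+).
first_img_src = frag.at_css('img')&.[]('src')
first_p_text = frag.at_css('p')&.text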

How to dynamically generate url for image map in Oracle ApEx?

The scenario:
I have an ApEx page which pulls a record from a table. The record contains an id, the name of the chart (actually a filename) and the code for an image map as an NVARCHAR2 column called image_map.
When I render the page I have an embedded HTML region which pulls the image in using the #WORKSPACE_IMAGES#&P19_IMAGE. substitution as the src for the image.
Each chart has hot spots (defined in the image_map html markup) which point to other charts on the same ApEx page. I need to embed the:
Application ID (like &APP_ID.)
Session (like &APP_SESSION.)
My problem:
When I try to load the &APP_ID as part of the source into the database it pre-parses it and plugs in the value for the ApEx development app (e.g. 4500) instead of the actual target application (118).
Any help would be greatly appreciated.
Not a lot of feedback - guess I'm doing something atypical?
In case someone else is trying to do this, the workaround I ended up using was to have a javascript run and replace some custom replacement flags in the urls. The script is embedded in the template of the page and assigns the APEX magic fields to local variables, e.g.:
var my_app_id = '&APP_ID.';
Not pretty, but it works...
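For anyone wanting to see the shape of that workaround, here is a minimal sketch. The {{NAME}} flag syntax, the fillUrlFlags function name, and the session value are made up for illustration; the answer above does not say which flag format was actually used.

```javascript
// Hypothetical helper: replace {{NAME}} flags in a stored URL with runtime values.
// Unknown flags are left untouched so that mistakes stay visible.
function fillUrlFlags(url, values) {
  return url.replace(/\{\{([A-Z_]+)\}\}/g, function (match, name) {
    return Object.prototype.hasOwnProperty.call(values, name)
      ? encodeURIComponent(values[name])
      : match;
  });
}

// In the page template these values would come from the APEX substitutions
// (e.g. '&APP_ID.' and '&APP_SESSION.'); they are hard-coded here for the demo.
var runtimeValues = { APP_ID: '118', SESSION: '12345' };
console.log(fillUrlFlags('f?p={{APP_ID}}:19:{{SESSION}}', runtimeValues));
// f?p=118:19:12345
```

In the page itself you would run this over the href of each area element in the image map once the page has loaded.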
Ok - I think I've left this open long enough... In the event that anyone else is trying to (mis)use apex in a similar way, it seems like the "apex way" is to use dynamic actions (which seem stable from 4.1.x) and then you can do your dynamic replace from there rather than embedding js in the page(s) themselves.
This seems to be the most maintainable, so I'll mark this as the answer - but if someone else has a better idea, I'm open to education!
I found it difficult to set a dynamic URL on a link to another page - directly - attempting to include the full URL as an individual link target doesn't work, at least in my simplistic world, I'm not an expert (as AJ said: any wisdom appreciated).
Instead, I set individual components of the url via the link, and a 'Before Header' PL/SQL process on the targeted page to combine the elements into a full url and assign it to the full url page-item:
APEX_UTIL.set_session_state(
'PG_FULL_URL',
'http...'||
v('PG_URL_COMPONENT1')||
v('PG_URL_COMPONENT2')||
'..etc..'
);
...where PG_FULL_URL is an item of Type 'Display Image', 'Based On' 'Image URL stored in Page Item Value'.
This is Apex 5.1 btw, I don't know if some of these options are new in this release.

Making tagsoup markup cleansing optional

Tagsoup is interfering with input and formatting it incorrectly. For instance when we have the following markup
Text outside anchor
It is formatted as follows
Text outside anchor
This is a simple example but we have other issues as well. So we made tagsoup cleanup/formatting optional by adding an extra attribute to textarea control.
Here is the diff (https://github.com/binnyg/orbeon-forms/commit/044c29e32ce36e5b391abfc782ee44f0354bddd3).
Textarea would now look like this
<textarea skip-cleanmarkup="true" mediatype="text/html" />
Two questions
Is this the right approach?
If I provide a patch can it make it to orbeon codebase?
Thanks
BinnyG
Erik, Alex, et al
I think there are two questions here:
The first concern is TagSoup and the clean-up that happens OOTB: empty tags are converted to singleton tags, which, when sent to the client browser as markup, get "fixed" by browsers like Firefox; because of the loss of precision, the browsers do the wrong thing.
Turning off this clean-up helps in this case, but for this issue alone it is not really the right answer, because it takes away a security feature and a well-formed-markup feature... so there may need to be some adjustment to the handling of at least certain empty tags (other than turning them into invalid singleton tags).
All this brings us to the second concern which is do we always want those features in play? Our use-case says no. We want the user to be able to spit out whatever markup they want, invalid or not. We're not putting the form in an app that needs to protect the user from cross script coding, we're building a tool that lets users edit web pages -- hence we have turned off the clean-up.
But turning off cleanup wholesale? Well it's important that we can do it if that's what our usecase calls for but the implementation we have is all or nothing. It would be nice to be able to define strategies for cleanup. Make that function plug-able. For example:
* In the XML config of the system, define a "map" of config names to class names which implement a given strategy. In the XForm definition the author would specify the name from the map.
If TagSoup transforms:
Text outside anchor
Into:
Text outside anchor
Wouldn't that be a bug in TagSoup? If that were the case, I'd say it is better to fix the issue than to disable TagSoup. But it isn't a bug in TagSoup; here is what seems to be happening. Say the browser sends the following to the server:
<a shape="rect"></a>After<br clear="none">
This goes through TagSoup, the result goes through the XSLT clean-up code, and the following is sent to the browser:
<a shape="rect"/>After<br clear="none"/>
The issue is on the browser, which transforms this into:
<a shape="rect">After</a><br clear="none"/>
The problem is that we serialize this as XML with Dom4jUtils.domToString(cleanedDocument), while it would be more prudent to serialize it as HTML. Here we could use the Saxon serializer. It is also used from HTMLSerializer. Maybe you can try changing this code to use it instead of using Dom4jUtils.domToString(). Let us know what you find when you get a chance to do that.
Binesh and I agree: if there is a bug, it would be a good idea to address the issue closer to the root. But I think the specific issue here is only part of the matter.
We're thinking it would be best to have some kind of name-to-strategy mapping so that RTEs can call in the server-side processing that is right for them or the default if it's not specified.

How does a website highlight search terms you used in the search engine?

I've seen some websites highlight the search engine keywords you used, to reach the page. (such as the keywords you typed in the Google search listing)
How does it know what keywords you typed in the search engine? Does it examine the referrer HTTP header or something? Any available scripts that can do this? It might be server-side or JavaScript, I'm not sure.
This can be done either server-side or client-side. The search keywords are determined by looking at the HTTP Referer (sic) header. In JavaScript you can look at document.referrer.
Once you have the referrer, you check to see if it's a search engine results page you know about, and then parse out the search terms.
For example, Google's search results have URLs that look like this:
http://www.google.com/search?hl=en&q=programming+questions
The q query parameter is the search query, so you'd want to pull that out and un-URL-escape it, resulting in:
programming questions
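Here is a small sketch of that parsing step, using the standard URL API, which handles the un-escaping (including + for spaces). The searchTermsFromReferrer name and the short domain list are illustrative assumptions, and a referrer is not guaranteed to carry the query string at all.

```javascript
// Illustrative map: search-engine domain -> query parameter holding the search terms.
const SEARCH_PARAMS = { 'google.com': 'q', 'bing.com': 'q' };

// Returns the search terms found in a referrer URL, or [] if none can be determined.
function searchTermsFromReferrer(referrer) {
  let url;
  try {
    url = new URL(referrer);
  } catch (e) {
    return []; // empty or malformed referrer
  }
  const host = url.hostname.toLowerCase();
  // Match the domain itself or any subdomain (e.g. www.google.com).
  const domain = Object.keys(SEARCH_PARAMS).find(
    (d) => host === d || host.endsWith('.' + d)
  );
  if (!domain) return [];
  const query = url.searchParams.get(SEARCH_PARAMS[domain]); // already decoded
  return query ? query.split(/\s+/).filter(Boolean) : [];
}

console.log(searchTermsFromReferrer('http://www.google.com/search?hl=en&q=programming+questions'));
// -> ['programming', 'questions']
```

In a browser you would pass document.referrer; on the server, the value of the Referer request header.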
Then you can search for the terms on your page and highlight them as necessary. If you're doing this server-side, you'd modify the HTML before sending it to the client. If you're doing it client-side, you'd manipulate the DOM.
There are existing libraries that can do this for you, like this one.
Realizing this is probably too late to make any difference...
Please, I beg you -- find out how to accomplish this and then never do it. As a web user, I find it intensely annoying (and distracting) when I come across a site that does this automatically. Most of the time it just ends up highlighting every other word on the page. If I need assistance finding a certain word within a page, my browser has a much more appropriate "find" function built right in, which I can use or not use at will, rather than having to reload the whole page to get it to go away when I don't want it (which is the vast majority of the time).
Basically, you...
Examine document.referrer.
Keep a map from each search-engine domain to the GET param that contains the search terms.
var searchEnginesToGetParam = {
'google.com' : 'q',
'bing.com' : 'q'
}
Extract the appropriate GET param, and decodeURIComponent() it.
Parse the text nodes where you want to highlight the terms (see Replacing text with JavaScript).
You're done!
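The highlighting step can be sketched as a pure string function. This is only one possible approach: it assumes you are handed the plain text of a node (not markup), wraps matches in a mark element, and HTML-escapes everything else. The function names are made up for this example.

```javascript
// Escape text so it can be safely inserted into HTML.
function escapeHtml(text) {
  const map = { '&': '&amp;', '<': '&lt;', '>': '&gt;', '"': '&quot;', "'": '&#39;' };
  return text.replace(/[&<>"']/g, (ch) => map[ch]);
}

// Escape characters that have a special meaning in a regular expression.
function escapeRegExp(text) {
  return text.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
}

// Wrap every case-insensitive occurrence of any term in <mark>...</mark>.
function highlightTerms(text, terms) {
  const usable = terms.filter((t) => t.length > 0);
  if (usable.length === 0) return escapeHtml(text);
  const pattern = new RegExp('(' + usable.map(escapeRegExp).join('|') + ')', 'gi');
  // Splitting on a capturing group keeps the matches: they land at odd indexes.
  return text
    .split(pattern)
    .map((part, i) => (i % 2 === 1 ? '<mark>' + escapeHtml(part) + '</mark>' : escapeHtml(part)))
    .join('');
}

console.log(highlightTerms('Programming questions & answers', ['programming', 'questions']));
// <mark>Programming</mark> <mark>questions</mark> &amp; answers
```

In the browser you would apply this to each text node and replace the node with the resulting markup, rather than running it over innerHTML, so that tags and attributes are never matched.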

Resources