TIdMessage mandatory subject field - delphi

I am using TIdMessage, and when I assign an empty subject, e.g. IdMsg->Subject = "";, the outgoing message does not have a "Subject:" header.
If I set the subject to a single space, e.g. IdMsg->Subject = " ";, then the message does have a Subject: header, but the space is trimmed: the output is not "Subject:[sp][sp][cr][lf]" but "Subject:[cr][lf]". This is not consistent with the other headers, which all have a space after the colon and before the actual data, so the empty subject should be "Subject:[sp][sp][cr][lf]".
I understand that TIdMessage tries to optimize the message by removing or trimming headers, but it is being too smart here.
Is there a way to force a Subject header with 2 spaces after it (without editing the TIdMessage source code)?
For those wondering about the reason: I want to make sure that dumb email-reading programs/scripts correctly interpret the message as having an "empty subject" and not as something else. Removing the Subject: header is not much of an optimization anyway.

Your space character actually survives the encoding process when TIdMessageClient is generating the header data being sent, but then the space is getting trimmed by TIdHeaderList when it is parsing the final header data and folding long headers to fit within email line length restrictions. Each line generated for a given header by the folding process gets right-trimmed, and since your header data only consists of whitespace, it gets discarded.
The only way to disable that folding is to set the TIdMessage.LastGeneratedHeaders.FoldLines property to False, which is not advisable unless you know your headers will always be short enough to never need folding.
Another option is to set the TIdMessage.Subject to a blank string, and then use the TIdMessage.ExtraHeaders property instead. You will have to use ExtraHeaders.Add() instead of ExtraHeaders.Values so that the string is added as-is and avoids folding:
Msg.ExtraHeaders.Add('Subject: ');
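To illustrate why a whitespace-only value vanishes, here is a minimal Python sketch of the effect described above. It is a toy model and not Indy's code: the function name fold_and_trim and the 78-column limit are assumptions made for illustration, and the real TIdHeaderList logic is more involved.

```python
def fold_and_trim(name, value, width=78):
    """Toy model of header folding (NOT Indy's implementation).

    Folds on whitespace when the line exceeds `width` and right-trims
    every emitted line, which is the step that discards a value made
    up only of spaces.
    """
    line = f"{name}: {value}"
    out = []
    while len(line) > width:
        cut = line.rfind(" ", 1, width)  # last space before the limit
        if cut <= 0:
            break  # nowhere to fold; emit the long line as-is
        out.append(line[:cut].rstrip())
        line = " " + line[cut:].lstrip()  # continuation line starts with a space
    out.append(line.rstrip())
    return out


print(fold_and_trim("Subject", " "))      # ['Subject:']
print(fold_and_trim("Subject", "Hello"))  # ['Subject: Hello']
```

In this model the separator space and the data space are both removed by the right-trim, which matches the "Subject:[cr][lf]" output seen in the question.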

Related

Apostrophe (valid char) is percent-encoded - but only sometimes

Use Google to find the Wikipedia article about De Morgan's laws.
Click the link and look at the URL. At least in Chrome, it will be
https://en.wikipedia.org/wiki/De_Morgan%27s_laws
' is percent-encoded as %27, even though it is a valid URL character (and if you manually change %27 to ' in the address bar, it still works). Why?
While the apostrophe may be a valid character, the URL-encoded version is equally valid!
I'm not sure there is a hard reason, so this is kind of a "soft" answer: the apostrophe (and/or double quote) needs to be escaped somehow if the URL is ever put into, for example, JSON or XML. URL-encoding them as part of sanitizing URLs solves this one way, and protects against poor JSON/XML handling and programmer errors. It's just pragmatic.
Decoding these particular valid characters in HTTP response headers etc. (so the browser shows them "right") should be possible and maybe nice, but it is extra work and code. Note that there are also characters where decoding would not be OK, so this would have to be selective! So, at least in this case, I guess it just wasn't done. If a character gets URL-encoded at any step of the whole page-loading chain, it stays that way.
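As a quick illustration that both forms name the same resource, here is a small Python sketch using the standard urllib.parse module. It only shows the encode/decode round trip; it says nothing about which component in the Google-to-Wikipedia chain actually performs the encoding.

```python
from urllib.parse import quote, unquote

title = "De_Morgan's_laws"

# quote() percent-encodes the apostrophe, since it is not in its default safe set.
encoded = quote(title)
print(encoded)  # De_Morgan%27s_laws

# Decoding gives back the original text, so both spellings are equivalent.
print(unquote(encoded) == title)  # True
```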

Undo email wordwrap line breaks in Ruby

My Rails app processes incoming emails by splitting them into multiple lines. This is what I currently use on the plain text version of the body: lines = email.body.split("\n")
This works well unless the sentences are longer than ~74 characters as most email clients will automatically add a line break per RFC 2822.
Example email: https://gist.github.com/marckohlbrugge/39c17b928eb17d330d63
Looking at the plain text part there seems to be no way to discern between a line break added by the user versus the email client. You could ignore any line break happening at the 75th position, but I think there might be a chance of false positives. (I could be wrong.)
The HTML part has all the information we need, but I'm not sure about a universal way to process this. Is replacing every div and br with a newline and then stripping all other HTML elements enough? What about all the other block-element tags? What about inline elements styled as block-elements? What if an email doesn't have an HTML part?
I did find some interesting code examples in Convert HTML to plain text (with inclusion of <br>s), but replacing a list of html tags with newlines doesn't seem like a complete (exhaustive) solution.
Is it worth looking at something like this mail library as they've probably already thought about the edge cases? ;)
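For reference, the "replace div and br with a newline, strip everything else" idea from the question can be sketched as follows. This is a rough Python sketch rather than Ruby, the names LineExtractor, html_to_lines and BLOCK_TAGS are hypothetical, and the tag list is an assumption that is certainly not exhaustive. It ignores CSS entirely, so inline elements styled as blocks are not handled, and it drops blank lines.

```python
from html.parser import HTMLParser

# Assumed, incomplete set of block-level tags that end a line.
BLOCK_TAGS = {"div", "p", "li", "tr", "blockquote", "h1", "h2", "h3", "pre"}


class LineExtractor(HTMLParser):
    """Collects text, turning <br> and block-level end tags into newlines."""

    def __init__(self):
        super().__init__()
        self.parts = []

    def handle_starttag(self, tag, attrs):
        if tag == "br":
            self.parts.append("\n")

    def handle_endtag(self, tag):
        if tag in BLOCK_TAGS:
            self.parts.append("\n")

    def handle_data(self, data):
        self.parts.append(data)


def html_to_lines(html):
    parser = LineExtractor()
    parser.feed(html)
    parser.close()
    text = "".join(parser.parts)
    # Drop empty lines; keep this in mind if blank lines matter to you.
    return [line.strip() for line in text.split("\n") if line.strip()]


print(html_to_lines("<div>Hello world</div><div>Second<br>Third</div>"))
# ['Hello world', 'Second', 'Third']
```

Long sentences stay on one line here because the HTML part is not hard-wrapped the way the plain text part is, but the edge cases raised in the question still apply.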

Adding custom header to TIdHttp request, header value has commas

I'm using Delphi XE2 and Indy 10.5.8.0. I have an instance of TIdHttp and I need to add a custom header to the request. The header value has commas in it so it's getting parsed automatically into multiple headers. I don't want it to do that. I need the header value for my custom header to still be one string and not split based on a comma delimiter.
I have tried setting IdHttp1.Request.CustomHeaders.Delimiter := ';' with no success. Is there a way to make sure the header doesn't get split up?
procedure SendRequest;
const HeaderStr = 'URL-Encoded-API-Key VQ0_RV,ntmcOg/G3oA==,2012-06-13 16:25:19';
begin
IdHttp1.Request.CustomHeaders.AddValue('Authorization', HeaderStr);
IdHttp1.Get(URL);
end;
I am not able to reproduce this issue using the latest Indy 10.5.8 SVN snapshot. The string you have shown gets assigned as a single line for me.
With that said, by default the TIdHeaderList.FoldLines property is set to True, and lines get folded on whitespace and comma characters, so that would explain why your string is getting split. Near as I can tell, there have not been any logic changes made to the folding algorithm between your version of Indy and the latest version in SVN.

regular expression for emails NOT ending with replace script

I'm currently modifying my regex for this:
Extracting email addresses in an html block in ruby/rails
Basically, I'm making another obfuscator that uses ROT13 by parsing a block of text for all links that contain a mailto href (using hpricot). One use case this doesn't catch is when the user just typed in an email address (without turning it into a link via tinymce).
So here's the basic flow of my method:
1. parse a block of text for all tags with href="mailto:..."
2. replace each tag with a javascript function that changes this into ROT13 (using this script: http://unixmonkey.net/?p=20)
3. once all links are obfuscated, pass the resulting block of text into another function that scans for all emails (this one uses an email regex, reverses each email address, and wraps it in a span that reverses it back)
Step 3 is supposed to clean the block of text of remaining emails that AREN'T in href attributes (meaning they weren't parsed by hpricot). The problem is that the emails that were converted to ROT13 are still found by my regex. What I want to catch are just the emails that WEREN'T CONVERTED to ROT13.
How do I do this? Well, all emails that WERE CONVERTED have a trailing "'.replace" in them, meaning I need to get all emails WITHOUT that string. So far I have this regex:
/\b([A-Z0-9._%+-]+#[A-Z0-9.-]+.[A-Z]{2,4}('.replace))\b/i
But this gets all the emails with the trailing '.replace. I want the opposite, and I'm currently stumped. Any help from regex gurus out there?
MORE INFO:
Here's the regex + the block of text im parsing:
http://www.rubular.com/r/NqXIHrNqjI
as you can see, the first two 'email addresses' are already obfuscated using ROT13. I need a regex that gets the emails ohhellzyeah#ribute.com and kaboom#yahoo.com
On negative lookaheads
You can use a negative lookahead to assert that a pattern doesn't match.
For example, the following regex matches all strings that don't end with the string ".replace":
^(?!.*\.replace$).*$
As another example, this regex matches all a*b*, except aabb:
^(?!aabb$)a*b*$
Ideally,
See also
regular-expressions.info/Lookaheads and anchors
Flavor comparison - unfortunately, Ruby 1.8 doesn't support lookbehinds (Ruby 1.9 and later do)
Specific solution
The following regex works in this scenario: (see on rubular.com):
/\b([A-Z0-9._%+-]+#(?![A-Z0-9.-]*'\.replace\b)[A-Z0-9.-]+\.[A-Z]{2,4})\b/i
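The three patterns above can be tried out with a short Python sketch (the patterns themselves carry over unchanged from Ruby). The sample text is made up for illustration, and the # in the addresses simply follows the patterns as written in this thread.

```python
import re

# Matches any string that does NOT end with ".replace".
not_replace = re.compile(r"^(?!.*\.replace$).*$")
print(bool(not_replace.match("foo.replace")))  # False
print(bool(not_replace.match("foo.bar")))      # True

# Matches a*b*, except the exact string "aabb".
ab = re.compile(r"^(?!aabb$)a*b*$")
print(bool(ab.match("aabb")))  # False
print(bool(ab.match("aab")))   # True

# The specific solution: skip addresses followed by '.replace
email = re.compile(
    r"\b([A-Z0-9._%+-]+#(?![A-Z0-9.-]*'\.replace\b)[A-Z0-9.-]+\.[A-Z]{2,4})\b",
    re.IGNORECASE,
)
sample = "'buubk#tznvy.pbz'.replace(/x/) and kaboom#yahoo.com"  # hypothetical input
print(email.findall(sample))  # ['kaboom#yahoo.com']
```

The lookahead sits right after the # so that it can scan the whole domain part before the engine commits to matching it.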

Why is this query string invalid?

In my asp.net mvc page I create a link that renders as follows:
http://localhost:3035/Formula/OverView?colorId=349405&paintCode=744&name=BRILLANT%20SILVER&formulaId=570230
According to the W3C validator, this is not correct and it errors after the first ampersand. It complains about the & not being encoded and the entity &p not recognised etc.
AFAIK the & shouldn't be encoded because it is a separator for the key value pair.
For those who care: I send these parameters as a query string and not as "/" separated values because there is no decent way of passing optional parameters that I know of.
To put all the bits together:
an anchor (<a>) tag's href attribute needs an encoded value
& encodes to &amp;
to encode an '&' when it is part of your parameter's value, use %26
Wouldn't encoding the ampersand into &amp; make it part of my parameter's value?
I need it to separate the second variable from the first
Indeed, by encoding my href value, I do get rid of the errors. What I'm wondering now however is what to do if for example my colorId would be "123&456", where the ampersand is part of the value.
Since the separator has to be encoded, what to do with encoded ampersands. Do they need to be encoded twice so to speak?
So to get the url:
www.mySite.com/search?query=123&456&page=1
What should my href value be?
Also, I think I'm about the first person in the world to care about this.. go check the www and count the pages that get their query string validated in the W3C validator..
Entities which are part of the attributes should be encoded, generally. Thus you need &amp; instead of just &
It works even if it doesn't validate because most browsers are very, very, very lenient in what to accept.
In addition, if you are outputting XHTML you have to encode every entity everywhere, not just inside the attributes.
All HTML attributes need to use character entities. The only place you don't need to change & into &amp; is within script blocks.
Whatever
Anywhere in an HTML document that you want an & to display directly next to something other than whitespace, you need to use the character entity &amp;. If it is part of an attribute, the &amp; will work as though it was an &. If the document is XHTML, you need to use character entities everywhere, even if you don't have something immediately next to the &. You can also use other character entities as part of attributes to treat them as though they were the actual characters.
If you want to use an ampersand as part of a URL in a way other than as a separator for parameters, you should use %26.
As an example...
<a href="Hello?name=Bob&amp;text=you%20%26%20me%20&quot;forever&quot;">Hello</a>
Would send the user to http://localhost/Hello, with name=Bob and text=you & me "forever".
This is a slightly confusing concept to some people, I've found. When you put &amp; in an HTML page, such as in <a href="abc?def=5&amp;ghi=10">, the URL is actually abc?def=5&ghi=10. The HTML parser converts the entity to an ampersand.
Think of it in exactly the same way as escaping quotes in a string:
// though you define your string like this:
myString = "this is \"something\" you know?"
// the string is ACTUALLY: this is "something" you know?
// when you look at the HTML, you see:
<a href="foo?bar=1&amp;baz=2">
// but the url is ACTUALLY: foo?bar=1&baz=2
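The two layers of escaping (URL-encoding for the value, HTML entities for the attribute) can be seen in a small Python sketch using only the standard library. The path search and the parameter values are taken from the example in the question; everything else is illustrative.

```python
from html import escape, unescape
from urllib.parse import parse_qs, urlencode

# What the HTML parser does with an attribute value: entity -> character.
print(unescape("foo?bar=1&amp;baz=2"))  # foo?bar=1&baz=2

# Layer 1: URL-encode the values, so an & inside a value becomes %26.
query = urlencode({"query": "123&456", "page": "1"})
print(query)  # query=123%26456&page=1

# Layer 2: HTML-escape the whole URL before placing it in href="...".
href = escape("search?" + query)
print(href)  # search?query=123%26456&amp;page=1

# The receiving side sees the original value again.
print(parse_qs(query))  # {'query': ['123&456'], 'page': ['1']}
```

So the ampersand is never "encoded twice" in the same way: %26 protects it inside the value, and &amp; protects the separator inside the HTML attribute.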
