Arnout's Eclectica

But I digress…

URL-encoded slashes in System.Uri

30 April 2008 23:11 — .NET, Development

Two weeks ago, an ex-colleague asked me to take a look at a problem that he and his team had encountered. They tried using a System.Uri with URL-encoded slashes, but those slashes kept ending up unencoded in the resulting URI:

Uri uri = new Uri("http://somesite/media/http%3A%2F%2Fsomesite%2Fimage.gif");
Console.WriteLine(uri.AbsoluteUri);
// Output: http://somesite/media/http%3A//somesite/image.gif

That's a totally different URL, which the target server refuses to process.
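A quick way to check whether a particular URL is affected is to compare what went into the constructor with what comes out. The following is only a minimal sketch: the UriCheck class and its WasRewritten helper are hypothetical names made up for illustration, while OriginalString and AbsoluteUri are the regular System.Uri properties. Keep in mind that AbsoluteUri can differ from the input for other reasons as well (the host name is lowercased, for instance), so a mismatch only tells you that something was rewritten, not what.

```csharp
using System;

static class UriCheck
{
    // Hypothetical helper: true when the Uri's canonical form differs
    // from the exact string that was passed to the constructor.
    public static bool WasRewritten(Uri uri)
    {
        if (uri == null)
        {
            throw new ArgumentNullException("uri");
        }

        return !string.Equals(uri.OriginalString, uri.AbsoluteUri, StringComparison.Ordinal);
    }

    static void Main()
    {
        Uri uri = new Uri("http://somesite/media/http%3A%2F%2Fsomesite%2Fimage.gif");

        // OriginalString is the untouched constructor argument;
        // AbsoluteUri is whatever the Uri class made of it.
        Console.WriteLine("Original:  " + uri.OriginalString);
        Console.WriteLine("Absolute:  " + uri.AbsoluteUri);
        Console.WriteLine("Rewritten: " + WasRewritten(uri));
    }
}
```

If the last line reports a rewrite for a URL that should have been passed through as-is, the escaped slashes are the first thing to look at.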

I was sure that they must have overlooked something, and that there would be some way to tell the Uri constructor to leave all encoded characters as-is. But no, it does not seem possible; dots and slashes are always decoded. I find that quite surprising, so if anyone can point me to an official solution, I'd be much obliged.

In the meantime, here is a reflection-based hack, courtesy of Reflector and the .NET Reference Source:

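using System;
using System.Reflection;

static class UriHacks
{
    // System.UriSyntaxFlags is internal, so let's duplicate the flag privately
    private const int UnEscapeDotsAndSlashes = 0x2000000;

    // Call this right after constructing the Uri and before reading properties
    // such as AbsoluteUri, so that the flag is cleared before the path is parsed.
    // Caveat: the parser object appears to be shared by all Uri instances of the
    // same scheme, so this most likely affects every such Uri in the process.
    public static void LeaveDotsAndSlashesEscaped(this Uri uri)
    {
        if (uri == null)
        {
            throw new ArgumentNullException("uri");
        }

        // Fetch the private UriParser that this Uri uses for its scheme
        FieldInfo fieldInfo = uri.GetType().GetField("m_Syntax", BindingFlags.Instance | BindingFlags.NonPublic);
        if (fieldInfo == null)
        {
            throw new MissingFieldException("'m_Syntax' field not found");
        }
        object uriParser = fieldInfo.GetValue(uri);

        // Fetch the parser's private syntax flags
        fieldInfo = typeof(UriParser).GetField("m_Flags", BindingFlags.Instance | BindingFlags.NonPublic);
        if (fieldInfo == null)
        {
            throw new MissingFieldException("'m_Flags' field not found");
        }
        object uriSyntaxFlags = fieldInfo.GetValue(uriParser);

        // Clear the flag that we don't want
        uriSyntaxFlags = (int)uriSyntaxFlags & ~UnEscapeDotsAndSlashes;

        // Write the modified flags back into the parser
        fieldInfo.SetValue(uriParser, uriSyntaxFlags);
    }
}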

Zen error messages

24 April 2008 11:12 — Uncategorized

Triggered by Brent Strange's recent Defect of the day, I remembered a few similarly Zen-like ones from a product I worked on years ago:

'undefined' is undefined...

'True' is undefined...

□ntern□t□□n□l□z□t□□n?

18 April 2008 13:26 — Unicode

Looking for information about IRIs, I ended up at W3C's Internationalized Resource Identifiers page. Funnily enough (if you're a Unicode geek, that is...), this page deals with i18n topics but has an encoding issue:

(I've notified W3C's web-human about the fact that the page is served as UTF-8, but contains Latin-1 characters. Updated on 2008-04-29: The page has been fixed.)

Copyright © 2006-2009 Arnout Grootveld — Powered by WordPress — Hosted at pair Networks