Tuesday, November 9, 2010

Too Many Encodings

Ran into an issue today where we were running too many encodings (Url and Html) on a string before it reached its final destination. This caused some really weird behavior and was a little difficult to track down, but after some digging it was pretty clear what was happening…

Basically, say you have a string like "Your $$$$$" that you want to search against. (In this case we want to keep the quotes, as they are a search hint to find this exact phrase.) You would want to Url encode this string to send it across the wire to our REST-based Solr search server.

However, we were applying an errant Html Encoding to this string before handing it to SolrNet, which in turn performed the Url Encoding on the already Html Encoded value before sending it on to the Solr server. At a glance the result did not look very different from the correctly encoded value, which made the problem easy to miss.

Once I removed the extra Html Encoding things were back to working as expected and I learned a valuable lesson about encodings…

Here is an example:

String: "Your $$$$$"

Url Encoded: %22Your+%24%24%24%24%24%22

Html Encoded, then Url Encoded: %26quot%3BYour+%24%24%24%24%24%26quot%3B

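While these look very similar, they are not the same, and they managed to produce some very strange results for us: once the server decoded the second value, each quote had become the literal text &quot; and we ended up performing searches on phrases like “quot” and “;”.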
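The difference is easy to reproduce. The sketch below uses Python's standard library purely as an illustration. It is not the .NET or SolrNet code we were running, and the variable names are made up for this example. It encodes the phrase both ways and then decodes each result the way a server would on receipt.

```python
import html
from urllib.parse import quote_plus, unquote_plus

# Straight double quotes mark this as an exact-phrase search.
phrase = '"Your $$$$$"'

# Intended behavior: Url encode once, right before the value goes over the wire.
url_encoded = quote_plus(phrase)

# The bug: Html encode first, then Url encode the Html encoded value.
html_then_url = quote_plus(html.escape(phrase))

# What the receiving side sees after it Url decodes each value.
decoded_ok = unquote_plus(url_encoded)
decoded_bad = unquote_plus(html_then_url)

print(url_encoded)    # %22Your+%24%24%24%24%24%22
print(html_then_url)  # %26quot%3BYour+%24%24%24%24%24%26quot%3B
print(decoded_ok)     # "Your $$$$$"
print(decoded_bad)    # &quot;Your $$$$$&quot;
```

The last line is the important one: the server never gets the quote characters back, only the text &quot;, so the exact-phrase hint is lost and the entity's letters and semicolon end up in the query.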
