When working with memory-critical code in C, you may find you don’t want to malloc()
some additional space just to perform some simple decoding. In this case, we can choose to do it in-place.
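To do this, we need to make some assumptions about the nature of the decoding, chiefly that the decoded output is never longer than the encoded input, so each write lands on a byte that has already been read.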
I couldn’t find anything online, so I decided to write my own implementation.
The following is a URL decoder (URL encoding is otherwise called percent-encoding) for 7-bit ASCII. This code is used for decoding HTML form data:
    void inplace_decode(char* s){
      int z = 0;  /* extra bytes consumed by %XX escapes so far; we read at x + z */
      int x = -1; /* write index, pre-incremented by the loop condition */
      while(s[++x + z] != '\0'){
        /* Convert characters in place */
        if(s[x + z] == '+'){
          s[x] = ' ';
A + character represents a space.
        }else if(s[x + z] == '%'){
          /* Check if we meet end of string in next two characters */
          if(s[x + z + 1] == '\0' || s[x + z + 2] == '\0'){
            break;
          }
          /* Convert from hex to character */
          char h = s[x + z + 1];
          h -= h >= '0' && h <= '9' ? '0' : (h >= 'A' && h <= 'F' ? 'A' - 10 : h);
          char l = s[x + z + 2];
          l -= l >= '0' && l <= '9' ? '0' : (l >= 'A' && l <= 'F' ? 'A' - 10 : l);
          s[x] = (h << 4) | l;
          /* Setup shift value */
          z += 2;
Decode %-encoded characters. We don’t care exactly what gets decoded for now, as we will make sure it looks sane later.
        }else{
          s[x] = s[x + z];
        }
Allow un-encoded characters through.
        /* Replace certain characters */
        if(s[x] == '<') s[x] = '{';
        if(s[x] == '>') s[x] = '}';
Ensure we don’t allow HTML-specific characters to get through.
        /* Remove illegal characters */
        if((s[x] < ' ' && s[x] != '\n' && s[x] != '\r') || s[x] > '~'){
          s[x] = '#';
        }
      }
Replace illegal un-printable characters with a #.
      s[x] = '\0';
    }
Finally, ensure the data is NUL-terminated.
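The reason this works in-place is that decoding never produces more characters than it consumes: a + becomes one character and a %XX escape shrinks three characters to one, so the write position can never overtake the read position. As a minimal sketch of the same idea with explicit read and write indices (the name inplace_decode_rw, the indices r and w, and the returned length are my own additions for illustration, not part of the listing above):

```c
#include <stddef.h>

/* Hypothetical variant of the listing using explicit read (r) and write (w)
 * indices. Returns the length of the decoded string. */
size_t inplace_decode_rw(char *s)
{
    size_t r = 0; /* next byte to read */
    size_t w = 0; /* next byte to write; w <= r always holds */

    while (s[r] != '\0') {
        char c;

        if (s[r] == '+') {
            c = ' ';
            r += 1;
        } else if (s[r] == '%') {
            /* Stop at a truncated escape such as "%4" or "%" */
            if (s[r + 1] == '\0' || s[r + 2] == '\0') {
                break;
            }
            /* Same conversion as the listing: 0-9 and uppercase A-F only,
             * anything else counts as 0 */
            char h = s[r + 1];
            h -= h >= '0' && h <= '9' ? '0' : (h >= 'A' && h <= 'F' ? 'A' - 10 : h);
            char l = s[r + 2];
            l -= l >= '0' && l <= '9' ? '0' : (l >= 'A' && l <= 'F' ? 'A' - 10 : l);
            c = (char)((h << 4) | l);
            r += 3;
        } else {
            c = s[r];
            r += 1;
        }

        /* Replace HTML-specific characters */
        if (c == '<') c = '{';
        if (c == '>') c = '}';
        /* Replace illegal characters */
        if ((c < ' ' && c != '\n' && c != '\r') || c > '~') {
            c = '#';
        }

        s[w++] = c;
    }
    s[w] = '\0';
    return w;
}
```

Either version needs a writable buffer (a char array, not a string literal). For example, a buffer holding name=J%6Fhn+Doe ends up holding name=John Doe, and abc%4 is cut off at the truncated escape, leaving abc.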
Should you use this in production code? Most definitely not. Could you use this in a spicy homebrew hacked project? Hell yeah!
If anybody plans to use this for some serious projects, most definitely stress-test your code through a fuzzer or something. There are quite a few edge cases to consider!
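One such edge case: the hex conversion in the listing only recognises 0-9 and uppercase A-F, and quietly treats anything else (including lowercase a-f) as 0. If you wanted something stricter, a rough sketch could look like the following. The helper names hex_value and hex_pair are hypothetical and not part of the code above, and what to do with a rejected escape is left to the caller.

```c
/* Hypothetical helper: maps one hex digit to its value (0-15),
 * or returns -1 if the character is not a hex digit. */
int hex_value(char c)
{
    if (c >= '0' && c <= '9') return c - '0';
    if (c >= 'A' && c <= 'F') return c - 'A' + 10;
    if (c >= 'a' && c <= 'f') return c - 'a' + 10;
    return -1;
}

/* Hypothetical helper: combines the two digits of a %XX escape into a
 * byte value (0-255), or returns -1 if either digit is invalid. */
int hex_pair(char hi, char lo)
{
    int h = hex_value(hi);
    int l = hex_value(lo);
    if (h < 0 || l < 0) {
        return -1;
    }
    return (h << 4) | l;
}
```

With this, %6f and %6F both decode to the same byte, and something like %G1 can be detected instead of silently turning into a different character.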