New strlen(3) committed
Finally, I got my strlen(3) committed against -HEAD. It is a long story; to put it short, I had proposed assembly versions of some string operations back in 2005, but these were never committed due to a hard disk failure, and, as Bruce pointed out, having the hand-optimized assembly code use a different algorithm is not good in general.
Therefore, I took some time and reimplemented the idea in C, resulting in a portable version: it works on any 32-bit or 64-bit processor and can easily be extended to 128-bit.
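To give a rough idea of what a word-at-a-time strlen looks like in portable C, here is a minimal sketch. It is not the committed code: the name sketch_strlen and the ONES/HIGHS macros are made up for this illustration, and the zero-byte test is the well-known "subtract ones, mask high bits" trick, which I am assuming is close in spirit to what the real version does. Note that the word loop may read a few bytes past the terminating NUL; because those reads are word-aligned they never cross a page boundary, but strictly speaking ISO C does not bless them.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Byte-replicated masks sized to the machine word: 0x01...01 and 0x80...80. */
#define ONES	((uintptr_t)-1 / 0xff)
#define HIGHS	(ONES * 0x80)

/* Hypothetical name; a sketch of the technique, not the libc function. */
static size_t
sketch_strlen(const char *str)
{
	const char *p = str;
	uintptr_t w;

	assert(str != NULL);

	/* Head: step byte by byte until p is word-aligned. */
	while (((uintptr_t)p % sizeof(w)) != 0) {
		if (*p == '\0')
			return ((size_t)(p - str));
		p++;
	}

	/*
	 * Body: load one aligned word at a time.  The expression below is
	 * non-zero exactly when at least one byte of w is zero.
	 */
	for (;;) {
		memcpy(&w, p, sizeof(w));
		if (((w - ONES) & ~w & HIGHS) != 0)
			break;
		p += sizeof(w);
	}

	/* Tail: the NUL is somewhere in this word; find it byte by byte. */
	while (*p != '\0')
		p++;
	return ((size_t)(p - str));
}
```

Since the masks are derived from uintptr_t, the same source should work unchanged on 32-bit and 64-bit machines, which is the portability point made above.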
So, is it important for *YOU*?
Generally speaking, it should not be. Performance-sensitive programs should, by all means, avoid C-style string operations. Think a 5x faster strlen(3) would boost the performance of your application because it uses strlen(3) in a critical path? Think again!
However, I found it valuable. There is a difference in worldstone, where I saw some minor improvements. A micro-benchmark indicates that this version is at most 2x slower when the string is very short, but 5x faster for strings that are at least one machine word long.
Note: the version has been further revised to provide better comments and to match style(9).
As a next step, I intend to carefully check and decide whether we should make similar changes to certain other string functions that can (presumably) benefit from a similar algorithm.
So, how can a program that deals with strings gain better performance?
Generally speaking: avoid C string functions, especially strlen(), strdup() and friends; always store the length together with your string if you need to do operations that require its length; try to align the string to cache line boundaries to make it easier for the processor to handle; and so on.
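As a rough illustration of that advice, here is a small sketch of a string type that carries its length and keeps its buffer cache-line aligned. Everything in it is made up for this example: struct lstr, the lstr_* functions, and the 64-byte CACHE_LINE value are assumptions (the real cache line size depends on the CPU), and aligned_alloc() requires a C11 libc.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define CACHE_LINE	64	/* assumed cache line size; varies by CPU */

/* Hypothetical length-carrying string. */
struct lstr {
	size_t	 len;	/* length in bytes, not counting the NUL */
	char	*data;	/* cache-line-aligned, NUL-terminated buffer */
};

/* Round n up to a multiple of CACHE_LINE, as aligned_alloc() expects. */
static size_t
round_up(size_t n)
{
	return ((n + CACHE_LINE - 1) / CACHE_LINE * CACHE_LINE);
}

/* Build from a C string; this is the only place strlen() is paid. */
static int
lstr_from_cstr(struct lstr *s, const char *src)
{
	size_t len;

	assert(s != NULL && src != NULL);
	len = strlen(src);
	s->data = aligned_alloc(CACHE_LINE, round_up(len + 1));
	if (s->data == NULL)
		return (-1);
	memcpy(s->data, src, len + 1);
	s->len = len;
	return (0);
}

/* Concatenate a and b into out using the stored lengths, no rescanning. */
static int
lstr_concat(struct lstr *out, const struct lstr *a, const struct lstr *b)
{
	size_t len = a->len + b->len;
	char *buf;

	buf = aligned_alloc(CACHE_LINE, round_up(len + 1));
	if (buf == NULL)
		return (-1);
	memcpy(buf, a->data, a->len);
	memcpy(buf + a->len, b->data, b->len + 1);	/* copies the NUL too */
	out->data = buf;
	out->len = len;
	return (0);
}

static void
lstr_free(struct lstr *s)
{
	free(s->data);
	s->data = NULL;
	s->len = 0;
}
```

With the length at hand, concatenation becomes two memcpy() calls instead of a strlen() followed by strcat(), and the buffer stays NUL-terminated so it can still be passed to code that expects a plain C string.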