Strlen

FreeBSD 的 strlen(3)

之前只有一篇关于较早版本的 strlen(3) 实现的笔记,这里补上我在 2010 年做的新增改进。

与 Pascal 等语言不同,C 的字符串并不保存串的长度,而是在字符串末尾以 nul 字符(’\0’)来表示字符串结束。这个设计决策是上世纪 60 年代作出的,有都市传说是为了省几个字节的空间,不过我个人认为也可能是因为汇编里面到处都是判断是否碰到了 0 的操作。不管怎么说,这个设计令 strlen 变成了一个 O(n) 的操作。

阅读全文…( 本文约 1364 字,阅读大致需要 3 分钟 )

New strlen(3) committed

| Development | #FreeBSD | #strlen

Finally, I got my strlen(3) [1] committed against -HEAD. This is a long story, to put it short, I had proposed assembly version of some string operations at the point of 2005, but these was never committed due to a hard disk failure, and as Bruce pointed out, having the hand optimized assembly code use a different algorithm is not good in general.

Therefore, I have take some time on it and reimplemented the idea in C, resulting in a portable (say, you can use it on any 32-bit or 64-bit processors, and it can be easily extended to 128-bit) version.

So, is it important for *YOU*?

Generally speaking, it should not. Performance sensitive programs should, by all means, avoid C style string operations. Think a 5x better strlen(3) would boost performance of your application since it uses strlen(3) in critical path? Think again!

However, I found it valuable. There is difference in worldstone, where I saw some minor improvements. Micro-benchmark indicates that this version is at most 2x slower when the string is very short, but 5x faster for strings that is at least word-length long.

[1] Note: the version has been further revised to provide better comment and match style(9).

阅读全文…( 本文约 295 字,阅读大致需要 2 分钟 )