Use of char* rather than a string class

25th November 2002

Update - 25th November 2002

OK, OK, I folded. Deep in my heart, I still believe that char* is the most appropriate string representation for the ZOOM C++ binding, but at an implementors' meeting at Manchester University on November 11th, I was outnumbered and outgunned, and decided that the only honourable course of action was to fall on my sword. So as of version 1.3b, the C++ binding uses the STL's string class.

But that doesn't mean I have to like it, and here's why ...

Original Text - 13th November 2001

The number one question that everyone asks when they first see the C++ binding specification is why we're using Boring Old char* to represent strings - for example, in option keys and values, error messages, additional-info strings and hostnames.

The answer is that in all these cases, strings are used in the C++ binding as opaque identifiers: no string manipulation is required, so the additional functionality that might be provided by the use of a string class is not needed in ZOOM programming. This frees us up to use the least-common-denominator representation, equally familiar to C++ diehards and recent converts from C.

It also saves us from the problem of which particular string class to use: the front runner would be the string type provided by the STL (Standard Template Library), but people brought up in The Ways Of Microsoft would probably expect to see MFC's CString type in there[1], and be disgruntled at having to learn a new string class in order to make use of ZOOM.

A final advantage to using char* is that it makes ZOOM independent of any other library. Our understanding is that the STL is not yet ubiquitous: for example, Ashley writes ``As for the solaris STL -- well it was next to useless when I tried it''. We'd like ZOOM to be portable to platforms where the STL is not yet available, and to be comprehensible to programmers who have not learned the STL.

Of course, ZOOM applications are perfectly at liberty to use whichever string class they want for other purposes: STL strings, CStrings etc. may be fed into ZOOM methods where char*s are expected with very little ceremony. A CString converts implicitly through its LPCTSTR operator (in a non-Unicode build), and an STL string needs only a call to c_str() to bridge the impedance mismatch.
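As a minimal sketch of that hand-over, assuming a stand-in function rather than any real ZOOM method: opaque_key_length() and pass_std_string() below are hypothetical names invented for illustration. Note that the STL's string has no implicit conversion to const char*, so the sketch goes through c_str().

```cpp
#include <cstring>
#include <string>

// Hypothetical stand-in for a ZOOM-style method that accepts a char* key
// and treats it as an opaque identifier. Not part of any real binding.
static std::size_t opaque_key_length(const char *key) {
    return std::strlen(key);
}

// Application code keeps its strings in std::string and hands them over
// with c_str(). The returned pointer is only valid while the string is
// alive and unmodified, so it should not be stored beyond the call.
static std::size_t pass_std_string(const std::string &key) {
    return opaque_key_length(key.c_str());
}
```

The char* side never needs to know which string class the application prefers; the cost to the caller is one c_str() per call.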

In summary: using a string class would not buy us much, if anything; but would impose several drawbacks that we are keen to avoid.

 


Notes

[1]
Does the world contain anything more fatuous than Microsoft's decision to give all the MFC classes names that begin with ``C''? Presumably this is so that you can tell that they're classes. Clever.

But why stop there? Why not begin the names of all ``int''-typed variables with ``i''? We could even institutionalise this convention as a rule in a new, special programming language optimised for math-hacking and engineering. What a great idea! Or - no, wait - it's already been done. Shame. Looks like we're forty-seven years late.

Feedback to <mike@indexdata.com> is welcome!