Post by Alex Strickland
Post by Charley Bay
(1) Its interface is too minimal (insufficient)
(2) Its implementation is limited
(3) It does not support real-world-unicode use
That's my venting because for over a decade I never understood why
people thought std::string was an acceptable component. IMHO it's
absolutely useless. And dangerous. I don't care if I'm the only one on
the planet with my conclusion, but IMHO, std::string is
absolutely unusable insufficient crap.
I enjoyed this article: http://www.gotw.ca/gotw/084.htm
Herb Sutter's not too thrilled with it, but iirc the interface being
minimal isn't one of the reasons :)
That's a good read -- thanks!
My summary for those not wanting to RTFA (please feel free to correct me):
"Monolithic designs create complex classes that do less, not more;
sometimes member functions should be non-friend, non-member functions to
enable "logic-reuse" and "logic-extensibility"; Lookie here, "std::string"
is a good example of a monolithic class that is not-well-thought-out for
various reasons."
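As a rough illustration of the non-member, non-friend idea summarized above, here is a minimal sketch. The names `has_prefix` and `trimmed` are hypothetical (they are not taken from the article or from the standard library); the point is only that both can be written against the public interface of std::string, so neither needs to be a member or a friend.

```cpp
#include <cstddef>
#include <string>

// Hypothetical free function: true if `s` begins with `prefix`.
// Uses only public members (size, compare), so no friendship is needed.
inline bool has_prefix(const std::string& s, const std::string& prefix) {
    return s.size() >= prefix.size() &&
           s.compare(0, prefix.size(), prefix) == 0;
}

// Hypothetical free function: returns a copy of `s` with the characters in
// `chars` stripped from both ends.
inline std::string trimmed(const std::string& s,
                           const std::string& chars = " \t\r\n") {
    const std::size_t first = s.find_first_not_of(chars);
    if (first == std::string::npos) {
        return std::string();  // nothing but strippable characters (or empty)
    }
    const std::size_t last = s.find_last_not_of(chars);
    return s.substr(first, last - first + 1);
}
```

Because these live outside the class, anyone can add more of them without touching (or waiting on) the class itself.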
I admit "string" is a (somewhat) "hard problem". The concept of "string":
- is ubiquitous for all systems (i.e., "key-abstraction")
- is "kind-of" a container, but not really
- is legitimately a class (requiring things like memory management,
constructors/destructors, overloaded operators, basic
accessors/manipulators)
- is legitimately a library (TONS of manipulators, search-and-replace,
accessors, etc., are required), including both "universally-understood" and
"domain-specific" library needs (e.g., advanced
"regular-expression-matching" could be considered "domain-specific", as
there are different dialects)
- implies algorithms that SHOULD be "reusable" (e.g., among
UTF-8/UTF-16/UTF-32, fixed-width or MBCS, and application-specific types
that are "like-strings" but with application-specific types, etc.)
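To make the "reusable algorithms" point concrete, here is a small sketch of one algorithm written once for every std::basic_string specialisation. The helper name `count_occurrences` is made up for this example, and it works on code units only; it does not know anything about UTF-8 or UTF-16 multi-unit sequences.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: counts non-overlapping occurrences of `needle` in
// `haystack`. Works for std::string, std::wstring, std::u16string and
// std::u32string alike, because it is written against basic_string.
// Caveat: it compares code units, not user-perceived characters.
template <typename CharT, typename Traits, typename Alloc>
std::size_t count_occurrences(
        const std::basic_string<CharT, Traits, Alloc>& haystack,
        const std::basic_string<CharT, Traits, Alloc>& needle) {
    typedef std::basic_string<CharT, Traits, Alloc> string_type;
    if (needle.empty()) {
        return 0;  // avoid looping forever on an empty needle
    }
    std::size_t count = 0;
    std::size_t pos = 0;
    while ((pos = haystack.find(needle, pos)) != string_type::npos) {
        ++count;
        pos += needle.size();  // skip past this match (non-overlapping)
    }
    return count;
}
```

The same function body serves narrow and wide strings; an application-specific "like-string" type would need its own overload or an iterator-based version.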
My opinion is that "std::string" did both wrong: It's not minimal-enough
(e.g., they could have put the "heavy-lifting" into libraries), and it's
not complete-enough (it's not "useful-enough" on its own). It could have
gone to either end of the spectrum, but it chose to stand in the middle of
the road, where it will get hit by cars driving in both directions.
Our systems use both QString (it's "strong" by itself and offers great
Unicode support), and "MyCustomString" (extends functionality specific to
our needs, including *lots* of parsing). In part, we could **NEVER** use
"std::string" because there is no mechanism to extend functionality in a
reasonable way (that is not part of its current design, and we can't wait
years for a couple new API functions to show up in the "standard"). Any
APIs passing around "std::string&" are highly future-limited (e.g., not
"future-proof" enough for us).
Heck, even forcing me to decide if my API should pass a "std::string&" or
"std::wstring&" seems really stupid to me (I should not have to decide that
now, because no matter what I select, it will absolutely be wrong later.)
The proper answer is to "wrap" the decision in a "MyString" class that
handles such details so I can have a stable API. But, such is not the
belief of "The Committee".
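A minimal sketch of what such a wrapper might look like, under loud assumptions: only the name "MyString" comes from the text above; the members `from_utf8`, `to_utf8`, `empty` and the free function `is_blank` are invented for illustration, and a real class would need far more.

```cpp
#include <string>
#include <utility>

// Hypothetical wrapper: callers see MyString, never the storage type.
class MyString {
public:
    MyString() = default;
    explicit MyString(std::string utf8) : data_(std::move(utf8)) {}

    // Named conversions at the boundary; the encoding is stated, not implied.
    static MyString from_utf8(const std::string& s) { return MyString(s); }
    std::string to_utf8() const { return data_; }

    bool empty() const { return data_.empty(); }

private:
    // Implementation detail (assumption: UTF-8 in a std::string today).
    // It could be swapped for std::wstring or std::u16string later without
    // changing any function signature that takes a MyString.
    std::string data_;
};

// An API written against MyString does not commit to "std::string&" or
// "std::wstring&" in its signature.
inline bool is_blank(const MyString& s) { return s.empty(); }
```

The design choice is simply to keep the narrow-versus-wide decision in one private member instead of in every function signature.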
But, quite seriously, "TheString" is absolutely a key abstraction. In any
system. That means, "Upon This Rock I Shall Build My System." I
absolutely cannot use a component that will seg-fault/core-dump when its
public API is exercised in a reasonable way (such as when assigning it a
"char*" value, that later turned out to accidentally be NULL, which happens
all the time, across module boundaries and interfacing with legacy
libraries, which is pretty much every system on Earth that uses "char*" as
a data type.) This is an absolutely (trivially) avoidable problem with a
reasonable implementation THAT THEY CHOSE TO NOT CENTRALIZE. It's bat-sh*t
INSANE to assume every application-specific use, and every programmer at
ten-minutes-till-5pm-on-Friday will merely do-the-extra-stupid-work to wrap
all operations with an "if(...)".
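For reference, constructing or assigning a std::string from a null "const char*" is undefined behavior, which is the crash described above. A centralized check could be as small as the sketch below; the helper names `to_string_or_empty` and `assign_or_clear` are hypothetical, and treating NULL as the empty string is an assumption about what the application wants.

```cpp
#include <string>

// Hypothetical helper: builds a std::string from a C string, mapping a
// null pointer to the empty string instead of invoking undefined behavior.
inline std::string to_string_or_empty(const char* p) {
    return p ? std::string(p) : std::string();
}

// Hypothetical helper: assigns a C string to `dest`, clearing `dest` when
// the pointer is null. This is the "if(...)" written once, in one place.
inline void assign_or_clear(std::string& dest, const char* p) {
    if (p) {
        dest.assign(p);
    } else {
        dest.clear();
    }
}
```

With something like this at module boundaries, a stray NULL from a legacy library becomes an empty string rather than a core dump.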
--charley