Showing newest posts with label C++. Show older posts

Wednesday, July 30, 2008

Pair #1

Finally, I'm putting up what I have for Pair, my script language intended for homebrew game projects. Pair is designed to be a small, embedded, acceptably fast (probably not yet the case) script language that will ultimately allow multiple concurrent VM (and not OS-based) threadlets, generators and continuations.

I don't have a lot of time on my hands, so stuff like this takes quite a long while. Now, there's some messed up stuff going on here. You'll have to forgive me that or wait until it's refined more. For example, scoping isn't right at all. The (let ..) expression doesn't behave the way it should in a Lisp-type language: it defines a variable for the rest of the outer scope, which isn't right. I'll fix that. Continuations aren't currently supported, but the underlying machinery is mostly there, as is the mechanism for running multiple lightweight VM threads, although that isn't accessible from the command line program. That's probably good, since I'm not sure how well it works with the garbage collector at this point. Also, there's a lot of general messiness going on. There are a bunch of warnings at compile time and comments are sparse. These things will be fixed eventually. I've compiled these with Visual C++ .NET 2003. There are two parts here.

Version 0.01.00

The Pair Compiler
The Pair VM

See the Code Use Policy if you are interested in using any of this for your own purposes.

Code Use Policy

If you use code from this blog for non-commercial use please credit Paul D. Senzee and include a link to senzee5.com. If you make changes, please send them back. If you'd like to use code from this blog for commercial purposes, contact me so that we can work something out.

Tuesday, March 04, 2008

Simple Templater in C++

Like I mentioned before, I'm posting some of my personal code library that I've accumulated over the years. If you modify or improve this, send the changes back please.

Recursive template engine with escaping and lazy evaluation. Note that it is case SENSITIVE.

Use
1. Add items to Templater's map.
templater.map()["bob"] = "joe";

2. Evaluate a text item that references a map item.
s = templater.eval("well, hello [!bob]");
Returns "well, hello joe" in s;
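
For readers who just want the idea, here is a minimal sketch of "[!name]" substitution. It is not the Templater class from templater.h: eval_template, its parameters and the depth limit are all made up for illustration, and it skips the escaping and lazy evaluation that the real code provides.

```cpp
#include <cstddef>
#include <map>
#include <string>

// Hypothetical helper, NOT the Templater from templater.h.
// Replaces each [!key] with its value from vars. Values are expanded
// recursively, and lookups are case sensitive like the original.
std::string eval_template(const std::map<std::string, std::string> &vars,
                          const std::string &text, int depth = 0)
{
    std::string out;
    std::size_t pos = 0;
    while (pos < text.size())
    {
        std::size_t open = text.find("[!", pos);
        std::size_t close = (open == std::string::npos)
                                ? std::string::npos
                                : text.find(']', open + 2);
        if (close == std::string::npos)
        {
            out.append(text, pos, std::string::npos); // no more references
            break;
        }
        out.append(text, pos, open - pos);            // literal text before [!
        std::string key = text.substr(open + 2, close - open - 2);
        std::map<std::string, std::string>::const_iterator it = vars.find(key);
        if (it != vars.end() && depth < 16)           // arbitrary cap against cycles
            out += eval_template(vars, it->second, depth + 1);
        else
            out.append(text, open, close - open + 1); // unknown key: leave as-is
        pos = close + 1;
    }
    return out;
}
```

With vars["bob"] = "joe", calling eval_template(vars, "well, hello [!bob]") gives "well, hello joe".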

Code
templater.h
evaluator.h
evaluator.cpp

Thursday, February 28, 2008

Java-Style Properties Files in C++

Need to handle Java-style properties files in C++? I've decided to post some of my own personal library of code on this blog. This is one of those occasionally useful things.

propertyutil.h
propertyutil.cpp

If you fix any bugs, extend the code or anything else, send back your changes if you don't mind. ;)

Thursday, December 20, 2007

Why Do We Use C++?

One of the recurring questions (especially when confronted with concurrency) is 'why do we use C++ for game development?' This entry was extracted from Concurrency in Game Development since it's really a separate topic all its own and that one was overlong anyway.

Why Do We Use C++?

Now before I hear the platitudes about how [insert favorite non-C++ language here] is better anyhow and would fix everything in one shot, let me ask you how feasible it would be to develop multi-million line software on multiple, often brand-new, platforms simultaneously with extreme performance expectations, leveraging mega/gigabytes of shared legacy code and without having to port, develop and/or maintain one's own compilers and toolsets with this superior language. If the game development industry turns to non-C++ languages to solve this, it will be slowly and painfully. And, most likely, it will be an industry-wide move and not just one studio or another, although some must lead the way. Honestly, I would never expect C++ to go away completely (just as assembly has never really gone away), and some future approaches may just be layered on top of C++ (Bigloo, Intel Ct, OpenMP), or use it when performance is critical (Java's JNI).

C++ is flexible and fast. With sufficient (possibly enormous) effort, it can do almost everything any language can do. It can function both as a high-level multiparadigm language and as a low-level portable assembly language. What makes C++ (and C) so widespread is its unrestrictive nature. This is widely seen as a negative, but in the real world, being able to abuse your language to get what you want from your machine is of crucial importance - especially when performance is a primary concern. That said, you may only want to abuse your language say 2-5% of the time. The rest (95-98%) of the time, you'd like some nice, type-safe, bounds checking, memory-managed, interactive, memoizing language, giving you a >100% increase in productivity. That we have settled on C++ indicates that the 2-5% is so crucial that we're willing to sacrifice the rest for it. In the game industry, I think that's a fair statement.

Another, often forgotten strength of C++ and of many traditional modular and modular-turned-OO languages is linking, or more generally, we might say 'package management'. C++ innately offers build-time linking and runtime 'linking' is also usually available (i.e., DLLs). This allows graceful scaling. Tool support for this in C++ is strong and reasonably robust because C++ is so frequently used to build enormous projects. It's possible to build massive projects in pieces in ways that are awkward or impossible in many functional or logic languages. (See the comments for some strong points about other languages' package handling vs. C++'s, and Which Languages Handle Packages and Libraries Best?)

Still, writing solid C++ code even in the absence of multithreading requires a mastery of nearly the whole language, making it dangerous for inexperienced developers. Even merely adequate C++ coding is heavily reliant on learned idioms that are not a part of the language, and therefore unenforceable. For example, the facts that you are allowed to return a pointer or reference to a stack-allocated object from a function and that you are allowed to overflow a string buffer have cost the world untold man-hours and dollars. And yet for all its ills, C++ ranks near the top of the most useful (or at least used) languages ever explicitly designed (including Esperanto).

(4/11/08 - adapted from one of my comments in reddit.)

I should stress again that one of the more crucial issues surrounding language choice is the set of tools provided to us by the console vendors. C++ wasn't adopted until late in the console world because C++ standards weren't well supported by console vendors. Tools have always been poor on consoles compared to the PC (this has changed somewhat with the 360), and good quality C++ compilers were rare in previous console generations, but C compilers were available (although they, too, came late - with the original PlayStation or Saturn, I believe?).

The PC gaming world hasn't seen this sort of lag. Game developers are eager to adopt new technologies; on the consoles, other technologies have simply been unavailable.

This is an exploratory post (as they all are..) and is subject to change.

Tuesday, December 18, 2007

Concurrency in Game Development

Effectively developing concurrent software for modern machines with multiple cores is perhaps the greatest technical challenge we'll encounter in some time. Our current approaches just aren't up to the task of creating robust multithreaded code.

Why Do We Use C++?

Please see the linked post for some thoughts on this topic. Before we look directly at concurrency, let's look at what brought us to where we are.

Categories

It’s reasonably well established that categorization is a fundamental human strength. You might say that feature extraction and classification are hardware-accelerated in the brain. Even at levels far below the conscious (i.e., the visual cortex) information is categorized before it even becomes 'thought' to us. In fact, categorization is so fundamental to human thought that assuming categories themselves are real objects has been a universal illusion. In “How the Mind Works”, Pinker states that nearly all cultures initially adopt a ‘folk idealism’ as a result of this. Applying this to the boundary of man and machine communication, Object Orientation has evolved as a straightforward way to map categories (type, class, etc.) and instances of those categories onto machine architectures that deal primarily in a few primitive and largely undifferentiated numerical types.

Abstractions are complexes of categories and their interactions. Understanding complexity in terms of hierarchies of abstraction is something that people do really well. Modular Programming and, to a greater extent, Object Orientation attempt to give us tools to work at these levels of human competence - with, of course, some consequences in terms of final performance.

The predominant programming paradigms attempt to map the way people think onto the way machines operate. Let's turn our attention to concurrency and see if we can stretch this a bit further.

Concurrency as Time vs. Space

In software development, one of the things I’ve noticed is that people are much better at understanding space than time. This is why we map out time in timelines, MS Project files and a million other ways in spatial form. What this means to programming is that anywhere you can map out state in terms of space, the result is far easier to comprehend. For a clear example of this, see Google's MapReduce.

This is the core issue with concurrency. It’s very difficult to understand the possible states of concurrent systems because they happen in time. Many of the abstractions that can help with concurrency (such as Functional Programming, Message Passing, etc.) are useful because they essentially transform a state-heavy process into some equivalent but more understandable spatial map whose design appears much more static.

Open Questions

In some ways the game industry is at the vanguard of multicore development on consumer machines with the Xbox 360 and the PlayStation 3. The amount of time and effort that we spend finding and fixing multithreading bugs is terrifying. While the hardware companies move toward doubling the number of cores with each processor generation, software companies will be reeling.

Clearly, C++, as it is currently, is not well suited toward developing software on highly multicore machines. Ideally, we’d have a language, extension or paradigm that would allow us to map thread concurrency onto easily understood (that is, spatial) language constructs that discourage or prohibit the kind of multithreading errors that currently waste hours and hours of our time. Right now, we don’t have to worry how many registers there are on the processor when we write C++ code. Similarly, whatever language/paradigm we’d want to use would, at the compile stage, optimally generate code for the number of cores available to it on the target platform.

So would some sort of functional language be best? C++ with functional extensions? Erlang? Haskell?

What about programming models based more on hardware description languages such as VHDL and Verilog, which are inherently spatial? Would they map more effectively to multicore machinery since hardware languages describe processes which are inherently asynchronous?

In any event, we software developers have an interesting road ahead.

Friday, November 09, 2007

The Impossible Dream #4

(Updated 11/23/2007)

Where Have I Been?

It's been months since I've blogged. They've been productive months, though. See, in early October, another beautiful baby girl joined our family! It's been a rush having another child. It's incredible how you can still be completely awestruck as if it were the first time, each and every time. :)

Prior to that, vacation.

Because I had some time off, I slipped in some hours on my Impossible Dream. I hadn't really considered blogging about it yet until I stumbled across Project Steel & Glass at The Cluttered Desk that motivated me to write an update.

Before leaving on vacation this summer, I'd made some headway with Pair, my embedded script language. In particular, I'd made some strides compiling it to more efficient bytecode, etc. It's in a sort-of usable state. Still, some things like lexical closures are not 100%. I'll leave blogging about that until later.

Basically, much of the material pass (no illumination yet) of the game is rendering. It looks pretty raw and repetitive with respect to content right now. There's not yet much differentiation in city sections. No niceties such as shadows or even basic lighting yet. And, jeez, how am I gonna do trees? It's time, though, to switch gears for a bit and prototype the gameplay. As you might infer from the screenshots, the protagonist will be capable of flight. I've got my Xbox 360 controllers hooked up to my PC and I can fly all around town.

(Zaragosa, Mexico [fictional])


(Puebla, Mexico [real])


Please, I'm Just One Man!

So I must enlist the machine to generate a great deal of content automatically, albeit offline. This approach, of course, runs the risk of a high degree of repetition, but I think with some caution and tool refinement this will turn out great. I'm really happy with how auto-generation has gone so far. It's actually starting to vaguely resemble the residential city streets of central Mexico, which is, of course, the idea.



Anyhow, I'm pretty pumped about it. Given some time (lots) and little-by-little refinement, we may have something pretty cool here.

Sunday, July 29, 2007

Trampolines

When I've developed embedded languages for applications in the past, the most time-consuming part always seemed to be wrapping the multitude of foreign language (i.e., to/from C++) function calls of interest with script-compatible functions.

One of the goals of Pair is to make working with other languages - especially C/C++ - very easy. So here we'll take a different approach. This time we'll generate trampolines to call foreign functions (for example, from a DLL) and do the boxing/unboxing of values for Pair to use. Trampolining has a number of different meanings but many of them center around the idea of generating code at runtime to call a function with special requirements that aren't known until runtime. In this case, we're directly generating the machine code necessary to set up the stack frame and call a function using either the __cdecl (for the C runtime library) or __stdcall (for the Win32 SDK functions) calling conventions.

This allows us to do things like the following -

(let
  ((beep
     (import "kernel32.dll" stdcall uint Beep (uint uint)))
   (system
     (import "msvcrt.dll" cdecl uint system (string))))
  (system "dir c:\\root")
  (beep 500 1000))

C++ code -

nativecaller.h
nativecaller.cpp

Tuesday, June 26, 2007

EASTL

EA genius and colleague Paul Pedriana submitted a paper to the C++ Standards Committee detailing his EASTL, an EA version of the Standard Template Library that provides a number of efficiencies with game development in mind, but that are nonetheless applicable across other software domains.

I've always been a fan of the STL in general, and since I've been at Electronic Arts, of EASTL. It's nice to see some of these innovations get out into the world.

Wednesday, March 21, 2007

Bytecode to Native Compilation

At work, some of the software I develop uses a bytecode interpreter. And, as always, we need better performance from the whole system. So I'm looking into bytecode-to-native (in this case C++) compilation. I've done this before, with embedded Lisp-based languages and there are a number of compilers available that do this for Java (GCC has a back-end for this), C#, Lisp and its derivatives (Bigloo for Scheme, for instance). Compiling to C or C++ is great as it serves as a sort of portable assembly language and it's possible to leverage further the fine optimization skillz of modern C/C++ compilers. I'll report on this when I make some progress - if I don't get pulled off onto something else.

Wednesday, January 24, 2007

7

(Updated 7/10/07)

Inspired by the interest in my 5-card poker hand code that plugs into Cactus Kev's evaluator, I've decided to revisit my unholy 7-card evaluator and make a faster?, cleaner one that I can then post up here.

For the 5-card hash I used Bob Jenkins's Perfect Hashing code. Check out his excellent site for great perfect hashing code & ideas.

My current 7 card evaluator first determines if there is a flush. If not, it looks up the final value in a 13 * 13 * 13 * 13 * 13 * 13 * 13 (13^7, 63M entries) precalc'd table. Arghh! If it is a flush, though, it evaluates all 21 combinations (7 choose 5) in the normal (albeit optimized) way.

But this is not how I want my grandchildren to remember my code. Let's think about other options. Now { 52 choose 7 } yields about 133 million possibilities, right? The first crucial step in thinking about optimizing the seven-card hand evaluation is figuring out a way to efficiently map every unique set of 7 out of 52 cards to one unique number among the 133 million possibilities.

As it turns out, I've got some code to do that. Nevertheless, I need to do a little cleanup before I post that. So look for "7 Part II" sometime soon.. ;)

Part II: 52 Choose 7

As promised, code to map any 7 of 52 items (7 of 52 bits) to a unique index in the range of 0-133M (52 choose 7).

index52c7.h

This, of course, could be used for a super-fast 7 card hand evaluator with a precomputed table of size 266 MB.

Jing, commenting below, mentions that a 2+2 forum has some super-fast seven-card hand evaluators. Glancing briefly at the site I notice claims of 12.5 cycles per evaluation, which seems too good to be true. After all, a single out-of-cache table lookup can cost much more than that. But if it is true - sweet!

Some Clarifications

Andy Reagan emailed me and made some excellent points concerning the readability of index52c7.h.

"[It's hard] to understand what the code was doing without comments and with the generalized table and variable names.."

I apologize for that. Actually, I wrote another program to generate this file, which is one of the reasons why it's so obtuse. It would probably be a good idea to publish the generator program as well.

"What does the function index52c7 do?"

Here's the reasoning for index52c7:

We can completely represent a hand of 7 cards of 52 with a single 52-bit number with 7 bits set. We assign each possible card in a deck a number between 1 and 52, inclusive. For example, the Queen/Hearts might be 43 and 2/Spades might be 17. Then, we take a 64-bit number (large enough to contain the 52-bits) and set a bit for each of the 7 cards we have. If two of the seven cards we have are Queen/Hearts and 2/Spades we'd set bits 43 and 17 along with the bits that correspond to the other five cards.

Now, if we had unlimited memory, we could just use this number as an index into an enormous and very sparse array. Unfortunately, this array would have 2^52 (4.5 quadrillion) entries. Assuming two bytes per entry, that would require 9 petabytes of memory! So we need to somehow hash this number into a much smaller space. It turns out that the number of possible combinations of 7 items among 52 is about 133 million (52 choose 7), so ideally, we could somehow hash the 52 bit number into a number between 0 and 133 million that uniquely identifies a given hand.

That's what index52c7 does. It translates the 52 bit hand representation into a much smaller, but still unique number. At two bytes per entry, that gives us a table of 266 megabytes, which is large and in certain cases inconvenient, but certainly doable.

Using, say, Cactus Kev's code to evaluate each possible 7-card hand, we'd first generate the 266MB table and populate it by looking up the corresponding index with index52c7. Now that the table's fully populated, we can just pass index52c7 the 52-bit number and use the resulting index to pull the answer straight out of the array.
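
To make the idea concrete, here is a minimal sketch of one standard way to do this kind of mapping, the combinatorial number system. It is not the table-driven code in index52c7.h. The names rank_hand and choose are made up for illustration, and it assumes cards are numbered as bits 0-51 rather than 1-52. It is also far slower than a table-driven version, so treat it as a reference for what the index means.

```cpp
#include <cstdint>

// Hypothetical helper: binomial coefficient C(n, k), exact for the
// small values needed here (n <= 51, k <= 7).
static std::uint64_t choose(unsigned n, unsigned k)
{
    if (k > n) return 0;
    std::uint64_t c = 1;
    for (unsigned i = 1; i <= k; ++i)
        c = c * (n - k + i) / i; // stays an integer at every step
    return c;
}

// Hypothetical stand-in for index52c7 (NOT the author's implementation).
// hand must have exactly 7 of its low 52 bits set. If the set bits sit at
// positions c1 < c2 < ... < c7, the index is C(c1,1) + C(c2,2) + ... + C(c7,7),
// which is unique and lies in [0, C(52,7)).
std::uint32_t rank_hand(std::uint64_t hand)
{
    std::uint64_t index = 0;
    unsigned k = 0; // how many set bits have been seen so far
    for (unsigned bit = 0; bit < 52; ++bit)
    {
        if (hand & (std::uint64_t(1) << bit))
        {
            ++k;
            index += choose(bit, k);
        }
    }
    return static_cast<std::uint32_t>(index);
}
```

The lowest hand (bits 0-6) maps to 0 and the highest (bits 45-51) maps to 133,784,559, so every index in the table is used exactly once.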

Monday, December 04, 2006

Inline Assembly

On my current project I will soon delve into optimization tasks at the level of inline assembly for PowerPC. These days the use of inline assembly is almost never justified. It's about as unportable as code can be and it's nearly impossible to understand once it's written. Most of the time, unless you devote a great deal of energy or unless you are using processor features (SIMD, for example) inaccessible through C or C++, hand-written assembly will actually be slower than compiler generated code. Furthermore, most of what you learned a few years ago about optimizing assembly code simply does not hold any more. For example, what's the faster way to multiply an integer by 5 on the x86?

A. x = (x << 2) + x;

or

B. x *= 5;

Old school assembly says that A is faster. Not true anymore. The imul (integer multiply) instruction is as fast as a single shift on the x86 these days. Counting cycles? Hard to do these days with deep pipelines, instruction reordering, branch prediction and unpredictable memory latency. The most effective way to optimize assembly seems to be aggressive profiling and trial and error. Gone are the days when you can optimize code by counting cycles with the processor manual tables in hand. Even so, these guidelines are important:

1. Most importantly, make sure you have the most efficient algorithm possible for the job before moving to assembly! There are a million good reasons for this and nothing could be more embarrassing than having your finely tuned assembly bubble sort owned by a C (or Java!) mergesort written in 12 minutes.

2. Profile changes aggressively and with the finest resolution (usually the CPU cycle counters) possible.

3. Space out memory accesses. Because of memory latency (and asynchronous memory access), you can hide cycles between your memory reads and writes.

4. Know your memory access patterns and take advantage of them. Do you only write and never read back from certain areas of memory? It may be beneficial to write-through directly to memory and avoid caching. It can also be useful to prefetch memory in certain cases.

5. Keep your data structures small enough to fit completely in cache. This will yield enormous benefits if you can do it.

6. Use SIMD where appropriate. This can give great benefit and itself may justify moving to inline assembly. However, don't spend an excessive number of cycles trying to fit data into SIMD-ready structures. It'll probably cost more than you'll get from it. Use SIMD when it's a good fit.

7. Unroll loops - to a point. Unroll tiny loops until they no longer provide a performance benefit. Keep unrolling and profiling. When you've gone too far you'll see a significant performance drop as that piece of code outgrows the instruction cache. If you have enough information on the hardware, you can figure out where this threshold will be.

8. On PC use SIMD for 64-bit integer arithmetic instead of the atrocious code that's generated for this by Visual C++.

Just so you know, this entry is subject to revision. Have any other guidelines? Let me know about 'em!

Tuesday, August 22, 2006

Let Me Count The Ways (Part II)

In the Sieve of Eratosthenes post, I mentioned that at work (EA) we have an optional programming challenge every once in a while. It was started last year by Jim Hejl and it's called Hacker's Delight - named after the excellent book of programming tricks by Henry S. Warren.

I love this stuff. Why? Who knows? Anyhow, this time I wrote the challenge - Hacker's Delight #6. I chose the { 64 choose 4 } problem from Let Me Count The Ways. My fast solution was initially ~3.9ms. Since I had not yet aggressively optimized it, I did that and got it down to ~2.5ms. That solution was written completely in C++. I figured that was about as fast as it got because it turned out that ~2.5ms is (slightly) faster than memset(data, 0, 635376 * sizeof(unsigned __int64)) !!

In short order, solutions came in that were almost exactly as fast as mine with quite different algorithms. Hmm. I thought, the bottleneck is memory-processor bandwidth - ~2.5ms is simply how long it takes to write 4.8 MB of data. All the other processing is swamped by that.

Even so, Jim was experimenting with SIMD approaches and had an SSE version of 'memset' that executed in about half memset's time. I started playing with that but couldn't figure out how to use that to get better performance out of my algorithm.

Then Jim tells me that he received an entry from EA United Kingdom that executes in ~1.25ms - all SIMD (MMX)! The gauntlet is down. I asked him not to forward it to me or let me see it.

So today I finally got my own MMX version to ~1.25ms. It seems to execute slightly faster than Jim's SSE memset. While minor speed improvements may be possible (on the order of tenths of milliseconds), I'm convinced that there's no way to get significantly faster performance.

I could be wrong.

When it's all over, I will post a more detailed description of the optimization steps for those of you who are interested. For those who are not, z z z z z z z z z z...

[The related, and more detailed document Optimizing 64 Choose 4 (.pdf)]

Thursday, June 08, 2006

Some Perfect Hash

The computer science department of the University of Alberta in Edmonton researches Artificial Intelligence (AI) for Poker. From their site I came upon Cactus Kev's Poker Hand Evaluator which is a killer fast five card hand evaluator. Reading through his algorithm you'll notice that the last step for any yet unclassified hand is a binary search through a list of values. Most hands end up in that search which is the most time-consuming part of the algorithm.

An Optimization

Replacing the binary search with a precomputed perfect hash for the 4888 values in the list yields a significant improvement over the original. The original test code included with Kevin's source (using eval_5hand) runs in 172 milliseconds on my machine and with my eval_5hand_fast it runs in 63 milliseconds. Yay, an improvement of 2.7 times!

fast_eval.c

(Updated 7/10/07)

This post has garnered a bit of attention. Cactus Kev added a comment and a link on his site back to the post and I've had an email conversation with a programmer (anonymous unless he's cool if I mention him) who ported the code to C# and reports:

Kev: 159ms
Your mod: 66ms
My C# version of your mod: 88ms

Pointers for seven hand evaluation? Check out the post 7.

Sieve of Eratosthenes

At work we had a little challenge in December to see who could come up with a program to count the number of primes between any two numbers (0 to 2^32-1 inclusive) as fast as possible. To this end I wrote this optimized Sieve of Eratosthenes (algorithm) that counts all the primes from 0 to 4,294,967,295 in about 13.75 seconds on my (admittedly fast) development machine at work.

If you come up with any improvements to it, let me know!
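
For reference, here is a minimal sketch of the basic algorithm. It is not the optimized version linked above and won't come near its timings. The name count_primes and the odd-only bit layout are my choices for this sketch, and it counts primes from 0 up to a limit rather than between two arbitrary numbers.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical baseline: counts the primes in [0, limit] with an odd-only
// Sieve of Eratosthenes. Index i in the table stands for the odd number 2*i + 1.
std::size_t count_primes(std::uint32_t limit)
{
    if (limit < 2) return 0;
    std::vector<bool> composite(limit / 2 + 1, false);
    std::size_t count = 1; // the prime 2, which the odd-only table skips
    for (std::uint64_t i = 1; 2 * i + 1 <= limit; ++i)
    {
        if (composite[i]) continue;
        ++count;
        const std::uint64_t p = 2 * i + 1;
        // Cross off odd multiples of p, starting at p*p; smaller multiples
        // were already crossed off by smaller primes.
        for (std::uint64_t m = p * p; m <= limit; m += 2 * p)
            composite[m / 2] = true;
    }
    return count;
}
```

Sieving the whole 32-bit range this way needs a table of about 2^31 bits (256 MB), which is one reason a serious version sieves in cache-sized segments instead.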

(2/2/2007 - I'm going to go ahead and add the link for the multicore Sieve of Eratosthenes here. On a 3GHz dual core Xeon, 7.15s!)

Friday, June 02, 2006

Let Me Count The Ways

Lately, I've been tinkering with card games (poker) a bit and one of the little questions that came up was what is the fastest way to enumerate all possible combinations of 4 items out of a possible 64? Actually, it doesn't have to be 4 of 64, it could be 7 of 48, or 2 specific aces from 52 cards or 3 bits of 10. The field of combinatorics can tell us how many there are. This is expressed as { n choose r } and has the formula factorial(n) / (factorial(r) * factorial(n - r)). In the case of n = 64 and r = 4 it yields 635,376 different combinations.

So, the task is to enumerate each of these unique combinations exactly one time until all 635,376 have been generated. This is one of those problems where the right approach makes all the difference.
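
As a quick check of the arithmetic, here is a minimal sketch of { n choose r }. The free function choose is a made-up name for this sketch and is not the math::choose helper used later in this post. It uses the multiplicative form rather than three factorials, since factorial(64) is far too large for a 64-bit integer.

```cpp
#include <cstdint>

// Hypothetical helper: n choose r = n! / (r! * (n - r)!), computed
// multiplicatively so that no full factorial is ever formed.
// Very large results can still overflow 64 bits.
std::uint64_t choose(unsigned n, unsigned r)
{
    if (r > n) return 0;
    if (r > n - r) r = n - r; // symmetry: C(n, r) == C(n, n - r)
    std::uint64_t result = 1;
    for (unsigned i = 1; i <= r; ++i)
        result = result * (n - r + i) / i; // exact: equals C(n - r + i, i)
    return result;
}
```

choose(64, 4) works out to 61, 1891, 39711 and finally 635376 over its four steps.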

The Naive Approach

Let's look, for a moment, at the most basic brute force approach: iterate through every value that can be contained in 64 bits and check if it has exactly four bits set. How long would this take? On my machine it takes 1 second to check 250 million numbers. Pretty fast, right? At this rate, however, it will take 2,340 years to check all 2^64 numbers! Clearly not a usable solution.
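
Here is a sketch of that brute force approach, just to show its shape. enumerate_naive is a made-up name, and it is only practical for small n, such as the 3 bits of 10 case.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical brute-force enumerator: tests every n-bit value and keeps
// the ones with exactly r bits set. Requires n < 64, and the loop runs
// 2^n times, so it is only usable for small n.
std::vector<std::uint64_t> enumerate_naive(unsigned n, unsigned r)
{
    std::vector<std::uint64_t> out;
    const std::uint64_t limit = std::uint64_t(1) << n;
    for (std::uint64_t x = 0; x < limit; ++x)
    {
        unsigned bits = 0;
        for (std::uint64_t t = x; t != 0; t &= t - 1) // clears the lowest set bit
            ++bits;
        if (bits == r)
            out.push_back(x);
    }
    return out;
}
```

For n = 10 and r = 3 this checks 1,024 values to find 120 combinations. For n = 64 and r = 4 it would check 2^64 values to find 635,376.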

A Better Approach

Poking around the internet I found a function that takes a number and returns the next highest number with the same number of bits. I adapt it a bit and bam!


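int enumerate_combinations(int n, int r, unsigned __int64 *v)
{
    // x: previous combination, y: its lowest set bit, z: x after y ripples upward.
    // (The locals must not reuse the name of the parameter r.)
    unsigned __int64 x, y, z;
    unsigned count = (unsigned)math::choose(n, r); // n choose r; math::choose is not shown here
    v[0] = ((unsigned __int64)1 << r) - 1;         // smallest value with r bits set
    for (unsigned i = 1; i < count; i++)
    {
        x = v[i - 1];
        y = x & -(__int64)x;
        z = x + y;
        v[i] = z | (((x ^ z) >> 2) / y);           // next higher value with the same bit count
    }
    return count;
}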


This variation turns in 32 milliseconds for complete enumeration. Not a bad improvement over 2,340 years, eh?

The Best Approach

I actually developed the following function before the above. However, I figured (from looking at the code) that the above would blow this one away. How wrong I was. One problem with the above approach is that it has a nasty little division. Another is that the following approach takes advantage of certain special cases, such as r == 1 and r == n. It is based on my initial recursive approach, but I removed the recursion so that I could rewrite it as an iterator. Removing the recursion did not seem to have a significant impact on performance. Anyhow, the following runs to completion in just 3 milliseconds, over 10 times faster than the above version.


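template <typename T = unsigned __int64>
int enumerate_combinations(int n, int r, T *v)
{
    // Explicit stack replacing the recursion. Each entry means:
    // choose r of the low n bits and OR them onto the bits already fixed in h.
    struct { int n, r; T h; } s[sizeof(T) * 8] = { { n, r, 0 } }, q;
    int si = 1, i = 0;
    T one = 1;

    while (si)
    {
        q = s[--si];

    tail:

        if (q.r != 0)
        {
            one = 1;
            if (q.r == 1)
            {
                // Special case: one bit left to place, so emit each of the n positions.
                for (int j = 0; j < q.n; j++)
                {
                    v[i++] = q.h | one;
                    one <<= 1;
                }
            }
            else if (q.r == q.n)
            {
                // Special case: every remaining bit must be set.
                v[i++] = q.h | ((one << q.n) - 1);
            }
            else
            {
                // Defer the combinations that skip bit n-1, then carry on
                // with the ones that set it.
                --q.n; s[si++] = q; q.r--;
                q.h |= one << q.n;
                goto tail;
            }
        }
    }

    return i;
}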

Thursday, May 18, 2006

class vs. struct

As C++ interviewees know well, the only difference between class and struct in C++ is that class defaults to an access mode of private and struct defaults to public. This means that the difference between them is purely syntactic and has no semantic connotation whatsoever. Because of this, some C++ experts believe that the struct keyword should not be used at all and we should always use class { public: instead.
So why do developers continue to use both when there is no semantic difference? To people, struct and class communicate subtly different ideas. Developers often use the struct keyword (because of its C heritage) to indicate a lightweight, open record that is not encapsulated. For example, a small record intended to be written directly to a file is more likely to be a struct in these situations. The class keyword is then used for traditional C++ object orientation. The fascinating thing about this dichotomy is that even computer language keywords develop nuances of meaning apart from their original intent.
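
A minimal sketch of the default-access difference and of the convention described above. Record and Widget are made-up names for illustration.

```cpp
// Members of a struct are public unless stated otherwise.
struct Record
{
    int id;          // directly accessible: a lightweight, open record
    float weight;
};

// Members of a class are private unless stated otherwise.
class Widget
{
    int id_;         // private by default: callers must go through id()
public:
    Widget() : id_(0) {}
    int id() const { return id_; }
};
```

Writing `Record r; r.id = 7;` compiles, while `Widget w; w.id_ = 7;` does not. Swap the two keywords and add explicit `public:` and `private:` labels, and both types behave exactly as before.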

Formatting std::string

The following snippet has been a part of my personal code library for years. It is useful for formatting a std::string in a traditional printf() way. For all its ills, printf/sprintf() is incredibly convenient. This code is for Win32. Minor modification is required for Unix.


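#include <stdio.h>
#include <stdarg.h>
#include <string.h> // memset
#include <string>

std::string format_arg_list(const char *fmt, va_list args)
{
    if (!fmt) return "";
    int result = -1, length = 256;
    char *buffer = 0;
    // _vsnprintf returns -1 when the output doesn't fit, so double the buffer and retry.
    // Reusing args across calls works with Visual C++ on Win32, where va_list is a
    // plain pointer; it is not guaranteed elsewhere.
    while (result == -1)
    {
        if (buffer) delete [] buffer;
        buffer = new char [length + 1];
        memset(buffer, 0, length + 1); // the extra zeroed byte keeps the result terminated
        result = _vsnprintf(buffer, length, fmt, args);
        length *= 2;
    }
    std::string s(buffer);
    delete [] buffer;
    return s;
}

std::string format(const char *fmt, ...)
{
    va_list args;
    va_start(args, fmt);
    std::string s = format_arg_list(fmt, args);
    va_end(args);
    return s;
}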
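
For Unix, or any compiler with C99-style vsnprintf, one possible variation looks like the sketch below. It assumes C++11, and format_portable is a made-up name chosen so that it doesn't clash with format() above. Instead of growing a buffer in a loop, it asks vsnprintf for the required length first, and it uses va_copy because a va_list can't portably be walked twice.

```cpp
#include <cstdarg>
#include <cstdio>
#include <string>
#include <vector>

// Hypothetical portable variant (assumes C++11 and C99 vsnprintf semantics,
// where a null buffer of size 0 returns the length that would be written).
std::string format_portable(const char *fmt, ...)
{
    if (!fmt) return "";

    va_list args;
    va_start(args, fmt);

    // First pass: measure. Work on a copy so args is still fresh afterwards.
    va_list probe;
    va_copy(probe, args);
    const int needed = std::vsnprintf(nullptr, 0, fmt, probe);
    va_end(probe);

    std::string s;
    if (needed > 0)
    {
        // Second pass: format into a buffer of exactly the right size.
        std::vector<char> buffer(static_cast<std::size_t>(needed) + 1);
        std::vsnprintf(buffer.data(), buffer.size(), fmt, args);
        s.assign(buffer.data(), static_cast<std::size_t>(needed));
    }
    va_end(args);
    return s; // empty on a formatting error or an empty result
}
```

Usage is the same as format(): format_portable("%d-%s", 42, "ok") returns "42-ok".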