Data-driven design

Well, yes and no. There's a lot of nonsense and elitism around "data-driven" approaches that makes it sound like you need a PhD and the latest drivers installed to understand it.

I'm not entirely sure about the terminology - there are some nice articles on Wikipedia:

Does everything require hellish optimization down to the byte level? Not really.

Does it need to become unreadable, unscalable development hell? Of course not. Quite the opposite.

cpp
class Enemy : public UnitBase
{
	vec2 Position;
	int  Hp;
	// ... of course bloated OOP class

public:
	virtual bool WantsToMove() const override;
	virtual vec2 GetPosition() const override;
	virtual void KeepMovingTo(vec2 InTargetPos) override; // changes Position, so not const
	virtual void ApplyDamage(int InDamage) override;      // changes Hp, so not const
};

class ManagerClass
{
	vec2 EnemyDangerZone;
	std::vector<Enemy*> AllEnemies;

public:
	void TickEnemies();
};

cpp
/** Useless and slow function */
void ManagerClass::TickEnemies()
{
	// We may very well have half that are not moving
	for (auto* Obj : AllEnemies)
	{
		// Cache miss #1: vtable + member access
		if (Obj->WantsToMove())
		{
			// Second miss + member access
			Obj->KeepMovingTo(EnemyDangerZone);

			// Cache miss #3: another virtual call + member access  
			vec2 Pos = Obj->GetPosition();

			if (Pos.X > EnemyDangerZone.X) // Member access #4
			{
				// Please stop (.-.)
				Obj->ApplyDamage(3);

				// Let's not keep going with Enemy->GetHp()...
			}
		}
		
	}
}

That may well look like a foreign language to most people, and like everyday code to others.

cpp
class Enemy
{
public:
	void SetPosition(vec2 InNewPos);
	void SetHidden();
};

struct EnemyData
{
	Enemy* Ptr;
	vec2   Pos;
	float  Speed;
	int    Hp;
	int    Sta;
};

class ManagerClass
{
	vec2 EnemyDangerZone;
	std::vector<EnemyData> Active;
	std::vector<EnemyData> Inactive;
	std::vector<Enemy*> Pool;

public:
	void TickEnemies();
};

cpp
/** Still useless, but quite a bit faster */
void ManagerClass::TickEnemies()
{
	const vec2 TargetOnStack = EnemyDangerZone;

	// Only active ones, contiguous memory.
	// Assumption: MoveToPool / MoveToInactive only queue the move and apply it
	// after the loop. Erasing from Active here would invalidate the range-for.
	for (auto& Data : Active)
	{
		// Everything in cache
		Data.Pos = Lerp(Data.Pos, TargetOnStack, Data.Speed);

		// Skip virtual calls whenever possible
		Data.Ptr->SetPosition(Data.Pos);

		if (Data.Pos.X > TargetOnStack.X)
		{
			Data.Hp -= 3;

			if (Data.Hp <= 0)
			{
				// Pool whenever possible, don't reallocate.
				Data.Ptr->SetHidden();
				MoveToPool(Data.Ptr);
				continue; // done with this enemy, but keep ticking the others
			}
		}
		if (--Data.Sta <= 0)
		{
			MoveToInactive(Data);
		}
	}
}

The first version is like searching for your keys in different rooms. The second version? They're all right there on the table.
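
One detail the listing glosses over is how an entry actually leaves the active array without leaving a hole. A common trick is swap-and-pop: overwrite the slot with the last element and shrink the array by one. Here's a rough sketch of that idea, not the real manager. The struct is trimmed down, and `EnemyStore`, `DeactivateAt` and `TickStamina` are names invented for this example.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Trimmed-down stand-in for EnemyData: only what this sketch needs.
struct EnemyData
{
	int Hp;
	int Sta;
};

// Hypothetical container, not the actual ManagerClass.
struct EnemyStore
{
	std::vector<EnemyData> Active;
	std::vector<EnemyData> Inactive;

	// Swap-and-pop: O(1) removal that keeps Active packed.
	// It does not preserve the order of the remaining entries.
	void DeactivateAt(std::size_t Index)
	{
		assert(Index < Active.size());
		Inactive.push_back(Active[Index]);
		Active[Index] = Active.back();
		Active.pop_back();
	}

	// Drain one stamina point per tick and retire exhausted entries.
	void TickStamina()
	{
		for (std::size_t i = 0; i < Active.size(); )
		{
			if (--Active[i].Sta <= 0)
			{
				// Slot i now holds a different entry, so don't advance.
				DeactivateAt(i);
			}
			else
			{
				++i;
			}
		}
	}
};
```

The index loop is the important part: because the slot gets refilled with another entry, the loop only moves forward when nothing was removed. The price is that the array order changes, which is fine as long as nothing depends on it.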

But yes, that's long to read. I have yet to find a quick 5-minute video that explains software development. In C++ to boot, haha.

Even without mastering data locality, predictable access patterns, virtual call cost, etc., simple changes can make a huge difference: less pointer chasing, more actual work getting done.

Why This Actually Matters

The first approach scatters memory access all over RAM. Each GetPosition() call might jump to a completely different memory location, causing cache misses that stall the CPU. That adds up quickly as the number of objects grows.

The second approach keeps related data packed together. The CPU can prefetch the next EnemyData while processing the current one, keeping the pipeline full and happy.
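
If you want to poke at this yourself, here's a small sketch that computes the same sum over both layouts and times a call with `std::chrono`. It's a toy under my own assumptions, not a proper benchmark: `Unit`, `SumScattered`, `SumPacked` and `TimeMicros` are made-up names, and the `vec2` here is a minimal stand-in.

```cpp
#include <cassert>
#include <chrono>
#include <memory>
#include <vector>

struct vec2 { float X; float Y; };

// Layout 1: every unit is its own heap allocation,
// reached through a pointer plus a virtual call.
struct UnitBase
{
	virtual ~UnitBase() = default;
	virtual vec2 GetPosition() const = 0;
};

struct Unit : UnitBase
{
	vec2 Position;
	explicit Unit(vec2 InPos) : Position(InPos) {}
	vec2 GetPosition() const override { return Position; }
};

float SumScattered(const std::vector<std::unique_ptr<UnitBase>>& Units)
{
	float Sum = 0.0f;
	for (const auto& U : Units)
	{
		assert(U != nullptr);
		Sum += U->GetPosition().X;
	}
	return Sum;
}

// Layout 2: positions stored back to back in one contiguous array.
float SumPacked(const std::vector<vec2>& Positions)
{
	float Sum = 0.0f;
	for (const vec2& P : Positions)
	{
		Sum += P.X;
	}
	return Sum;
}

// Wall-clock time of a single call to Work, in microseconds.
template <typename Fn>
long long TimeMicros(Fn&& Work)
{
	const auto Start = std::chrono::steady_clock::now();
	Work();
	const auto End = std::chrono::steady_clock::now();
	return std::chrono::duration_cast<std::chrono::microseconds>(End - Start).count();
}
```

Fill both containers with the same values, wrap each sum in `TimeMicros`, and compare. The numbers depend heavily on compiler flags, element count and hardware, so I'm not quoting any. One caveat: objects allocated one after another in a tight loop often end up next to each other on the heap, so a toy like this can understate the gap you'd see in a long-running game.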

In real-world terms? The profiler is your friend. We don't write the instructions, the compiler does, so measure instead of guessing.

It's not about premature optimization - it's about writing code that respects how computers actually work. Game engines are complex, and optimization is part of the process. Your game code doesn't need to be complex.

PS: If you're optimizing more than you're writing code, you're probably doing it wrong. Acceptable architecture and design always win.

A Word About Balance

While fundamentally we're trying to get the computer to do what it needs to do, no more, no less, it's worth mentioning that over-engineering in the name of performance can be just as harmful as ignoring it entirely, or worse.

There's a certain elitism that celebrates turning readable, maintainable code into an unreadable mess for marginal gains. If you're making your code 10% faster 10% of the time at the cost of making it 100% harder to debug and extend, you're probably missing the point.

The goal isn't to squeeze out every last cycle - it's to write code that's both fast enough and maintainable enough. Good architecture should serve your development process, not the other way around.

And this is pretty much how ColonyBreak is being written. Simple good practices go a long way.

Self-taught, by the way. So maybe the entire thing is wrong - but well, you'll see for yourself how fast the game runs :)
