Simulate the Environment: Use Real Dependencies in Tests
This approach entered my workflow a few weeks ago after a colleague, Remedan Ridwan, pointed me to Testcontainers. What stood out wasn’t the tooling itself, but how little code was required to run real dependencies in tests. With Docker already standard in most development and CI environments, it has become practical to exercise integration paths against actual services rather than substitutes.
Most test strategies break down not because individual components are hard to test, but because the environment they run in is assumed rather than exercised.
Applications don’t run against a database in isolation. They run inside an environment: databases, message queues, caches, object stores, authentication services, clocks, filesystems, and network boundaries. Bugs tend to emerge at the seams between these pieces, especially where configuration, protocol behavior, or timing is involved.
Mocks simplify this environment aggressively. That simplification is often intentional, but it also removes entire classes of behavior that only exist when real systems are present.
What “Mocking the Environment” Usually Means
In practice, “mocking the environment” tends to mean one of the following:
- Replacing external services with in-memory implementations
- Mocking interfaces that represent networked systems
- Using lightweight substitutes that approximate behavior (e.g., SQLite for Postgres, localstack-style APIs, fake SMTP servers)
These approaches are fast and convenient. They also tend to encode assumptions about how the environment behaves rather than verifying those assumptions.
For example:
- A mocked queue acknowledges messages immediately, bypassing retry or backoff paths. In a real system like RabbitMQ, a consumer restart triggers redelivery, revealing missing idempotency in message handlers (sketched after this list).
- In-memory databases permit concurrent writes without contention. A real PostgreSQL instance under load introduces blocking transactions, timeouts, or deadlocks that surface only with actual concurrency.
- Fake caches return values instantly, hiding latency from misses. A real Redis instance introduces spikes that affect downstream logic.
- Mocked services respond synchronously, missing rate limits like 429 errors from real endpoints.
- Test doubles accept any token, ignoring expiry or clock skew enforced by real auth providers.
- Local mocks enable atomic writes. Real object stores like S3 show partial or eventual consistency.
- Mocks never fail DNS, bypassing retry logic in real networks.
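To make the first point concrete, here is a minimal sketch of an in-memory queue double. The Queue interface and fakeQueue type are illustrative rather than taken from any particular library; the point is that the fake invokes each handler exactly once and ignores its error, so the redelivery paths a real broker exercises never run.
// Illustrative interface and in-memory double; not from a specific library.
type Queue interface {
	Publish(msg []byte) error
	Consume(handler func(msg []byte) error) error
}
// fakeQueue delivers every message exactly once and ignores handler errors,
// so retry, backoff, and redelivery behavior is never exercised.
type fakeQueue struct {
	msgs [][]byte
}
func (q *fakeQueue) Publish(msg []byte) error {
	q.msgs = append(q.msgs, msg)
	return nil
}
func (q *fakeQueue) Consume(handler func(msg []byte) error) error {
	for _, m := range q.msgs {
		_ = handler(m) // a real broker would redeliver on error or consumer restart
	}
	return nil
}
A handler that is not idempotent passes against this double and fails the first time a real consumer restarts mid-batch.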
Using Real Dependencies as Disposable Infrastructure
An alternative approach is to run real services with production-grade binaries inside tests, and treat them as disposable infrastructure.
This is the model Testcontainers supports.
Rather than mocking a database client or a queue interface, the test starts an actual instance of the dependency in a container:
- The same database engine
- The same message broker
- The same object storage service
- The same auth provider or proxy
The test code interacts with these services over the same protocols used in production. Configuration, startup behavior, and failure modes are preserved.
Databases are a common example, but they’re not the point. The point is environment simulation.
Example: Database as One Dependency Among Many
A database example is illustrative because it’s familiar.
// Container wraps testcontainers with database and cache connections
type Container struct {
	PostgresContainer *postgres.PostgresContainer
	RedisContainer    *redisContainer.RedisContainer
	DB                *gorm.DB
	Redis             *redis.Client
}
// Terminate gracefully shuts down all containers
func (c *Container) Terminate() error {
	ctx := context.Background()
	if c.PostgresContainer != nil {
		if err := c.PostgresContainer.Terminate(ctx); err != nil {
			return err
		}
	}
	if c.RedisContainer != nil {
		if err := c.RedisContainer.Terminate(ctx); err != nil {
			return err
		}
	}
	return nil
}
// CleanupDatabase cleans all test data from database tables
func (c *Container) CleanupDatabase(t *testing.T) {
	t.Helper()
	if c.DB == nil {
		return
	}
	// Dynamically get all user tables from information_schema
	var tables []string
	err := c.DB.Raw(`
		SELECT table_name
		FROM information_schema.tables
		WHERE table_schema = 'public'
		AND table_type = 'BASE TABLE'
		AND table_name NOT LIKE 'pg_%'
		AND table_name NOT LIKE 'sql_%'
	`).Scan(&tables).Error
	if err != nil {
		t.Logf("failed to get table list: %v", err)
		return
	}
	// Truncate all found tables
	for _, table := range tables {
		err = c.DB.Exec("TRUNCATE TABLE " + table + " CASCADE").Error
		if err != nil {
			t.Logf("failed to truncate table %s: %v", table, err)
		}
	}
}
// CleanupCache clears all Redis cache data
func (c *Container) CleanupCache(t *testing.T) {
	t.Helper()
	if c.Redis == nil {
		return
	}
	err := c.Redis.FlushAll(context.Background()).Err()
	if err != nil {
		t.Logf("failed to flush Redis cache: %v", err)
	}
}
// CleanupAll cleans both database and cache
func (c *Container) CleanupAll(t *testing.T) {
	c.CleanupDatabase(t)
	c.CleanupCache(t)
}
// SetupTestMain sets up shared containers for a test package.
// Should be called from TestMain to initialize containers once per package.
func SetupTestMain() (*Container, func() int) {
	ctx := context.Background()

	// Setup PostgreSQL
	postgresContainer, err := postgres.Run(ctx,
		"postgres:15-alpine",
		postgres.WithDatabase("testdb"),
		postgres.WithUsername("testuser"),
		postgres.WithPassword("testpass"),
		testcontainers.WithWaitStrategy(
			wait.ForLog("database system is ready to accept connections").
				WithOccurrence(2).
				WithStartupTimeout(5*time.Minute)),
	)
	if err != nil {
		panic("failed to start postgres container: " + err.Error())
	}

	host, err := postgresContainer.Host(ctx)
	if err != nil {
		panic("failed to get postgres host: " + err.Error())
	}
	port, err := postgresContainer.MappedPort(ctx, "5432")
	if err != nil {
		panic("failed to get postgres port: " + err.Error())
	}

	cfg := &config.Config{
		Database: config.Database{
			Host:        host,
			Port:        port.Int(),
			User:        "testuser",
			Password:    "testpass",
			Name:        "testdb",
			MaxIdleConn: 10,
			MaxOpenConn: 100,
		},
	}
	dbConn, err := db.NewPostgres(cfg)
	if err != nil {
		panic("failed to connect to test database: " + err.Error())
	}

	// Setup Redis
	redisContainer, err := redisContainer.Run(ctx,
		"redis:7-alpine",
		testcontainers.WithWaitStrategy(wait.ForLog("Ready to accept connections")),
	)
	if err != nil {
		panic("failed to start redis container: " + err.Error())
	}

	redisHost, err := redisContainer.Host(ctx)
	if err != nil {
		panic("failed to get redis host: " + err.Error())
	}
	redisPort, err := redisContainer.MappedPort(ctx, "6379")
	if err != nil {
		panic("failed to get redis port: " + err.Error())
	}

	redisClient := redis.NewClient(&redis.Options{
		Addr: redisHost + ":" + redisPort.Port(),
	})
	// Test Redis connection
	err = redisClient.Ping(ctx).Err()
	if err != nil {
		panic("failed to connect to test redis: " + err.Error())
	}

	container := &Container{
		PostgresContainer: postgresContainer,
		RedisContainer:    redisContainer,
		DB:                dbConn,
		Redis:             redisClient,
	}

	// Return cleanup function for TestMain
	cleanup := func() int {
		if err := container.Terminate(); err != nil {
			println("failed to terminate containers:", err.Error())
			return 1
		}
		return 0
	}
	return container, cleanup
}
// RunStandardMigrations runs migrations for common models used across tests
func (c *Container) RunStandardMigrations(t *testing.T) {
	t.Helper()
	err := c.DB.AutoMigrate(&user.User{}, &user.Preference{})
	require.NoError(t, err)
}
This verifies more than CRUD correctness:
- Connection negotiation
- Authentication
- Schema compatibility
- Transaction semantics
- Driver behavior
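Transaction semantics, for instance, can be asserted directly against the real engine. This sketch assumes the user.User model declares a unique email column; the test name and fields are illustrative:
func TestDuplicateEmailRollsBack(t *testing.T) {
	testEnv.RunStandardMigrations(t)
	testEnv.CleanupAll(t)

	err := testEnv.DB.Transaction(func(tx *gorm.DB) error {
		if err := tx.Create(&user.User{Email: "a@example.com"}).Error; err != nil {
			return err
		}
		// The second insert violates the unique constraint; the real engine
		// rejects it and the whole transaction rolls back.
		return tx.Create(&user.User{Email: "a@example.com"}).Error
	})
	require.Error(t, err)

	var count int64
	require.NoError(t, testEnv.DB.Model(&user.User{}).Count(&count).Error)
	require.Equal(t, int64(0), count, "rollback should leave no partial writes")
}
An in-memory substitute that allows both inserts would pass the same test and hide the bug.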
But the same pattern applies elsewhere.
The same test suite can bring up:
- A Redis instance with real eviction behavior
- A Kafka broker with actual partitioning and offsets
- An S3-compatible object store that enforces request signing
- An SMTP server that accepts and rejects messages based on protocol rules
Each container adds realism that mocks typically erase.
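As one example, an S3-compatible store can be started with the generic container API. The MinIO image, credentials, and health endpoint below are assumptions to adapt, not prescriptions:
// Sketch: start a MinIO container as an S3-compatible object store.
func startObjectStore(ctx context.Context) (testcontainers.Container, string, error) {
	req := testcontainers.ContainerRequest{
		Image:        "minio/minio:latest",
		ExposedPorts: []string{"9000/tcp"},
		Env: map[string]string{
			"MINIO_ROOT_USER":     "testkey",
			"MINIO_ROOT_PASSWORD": "testsecret",
		},
		Cmd:        []string{"server", "/data"},
		WaitingFor: wait.ForHTTP("/minio/health/ready").WithPort("9000/tcp"),
	}
	c, err := testcontainers.GenericContainer(ctx, testcontainers.GenericContainerRequest{
		ContainerRequest: req,
		Started:          true,
	})
	if err != nil {
		return nil, "", err
	}
	endpoint, err := c.Endpoint(ctx, "")
	if err != nil {
		return nil, "", err
	}
	return c, endpoint, nil
}
The endpoint then configures a real S3 client, so request signing and bucket semantics are exercised rather than assumed.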
Why This Is Not “Just Another Mock”
It’s tempting to think of containers as “better mocks.” That framing is misleading.
Mocks simulate behavior. Containers instantiate systems.
A containerized dependency:
- Has startup time
- Has configuration errors
- Has resource limits
- Can fail in partial or unexpected ways
Those properties are inconvenient, but they’re also where integration bugs come from.
When tests exercise real services, failures tend to be less surprising. Not because the code is better, but because the assumptions are weaker.
Cost and Constraints
This approach is not free.
- Tests are slower
- Docker becomes a test dependency
- Parallelism must be controlled
- CI resource limits matter
It also shifts test failures from “logical mismatch” to “environmental mismatch,” which can be harder to debug if logs and observability aren’t captured.
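On the parallelism point, one blunt but workable option is to serialize tests that mutate the shared containers. The dbLock mutex and withCleanState helper here are illustrative, not part of Testcontainers:
// Serialize tests that share the package-level containers so one test's
// TRUNCATE or FlushAll cannot race another test's assertions.
var dbLock sync.Mutex

func withCleanState(t *testing.T, c *Container, fn func(t *testing.T)) {
	t.Helper()
	dbLock.Lock()
	defer dbLock.Unlock()
	c.CleanupAll(t) // known-empty tables and cache before each run
	fn(t)
}
Tests that go through this helper should not also call t.Parallel; an alternative is to give each test its own schema or database inside the same container.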
Adopting this style of testing usually forces a rethink of the CI process. Cloud CI environments are optimized for fast, isolated test runs, not for standing up short-lived infrastructure. If these bugs are meant to be caught in CI, pipelines need to account for container startup time, resource allocation, and artifact collection.
That said, many of these issues are often easier to surface locally than in CI. A developer running tests with real dependencies on their machine can reproduce configuration errors, protocol mismatches, and startup failures quickly, without waiting on a constrained CI environment.
In my small monolithic application, using Testcontainers did not meaningfully affect CI time. Container startup was fast, Docker overhead was negligible, and tests did not introduce noticeable delays. The additional realism came without a proportional cost.
Where This Tends to Pay Off
Running real dependencies is most useful where code behavior depends on:
- Protocol details
- Configuration correctness
- Ordering and timing
- Cross-service interactions
- Version-specific behavior
In other words, places where mocks tend to encode guesses.
Using real services doesn’t eliminate bugs. It mostly eliminates a specific category of bug: the one where the system behaved correctly according to your test, and incorrectly according to reality.
Scope
This is not an argument against mocking. It’s an argument against pretending the environment is simpler than it is.
Testcontainers provides a way to reconstruct small, controlled slices of reality inside tests. Databases are just one visible example.
Whether that trade-off is worthwhile depends on where failures are expensive and surprises are unacceptable.
Did I make a mistake? Please let me know by sending an email.