Technology

Using External S3 for Testing Stopped Making Sense, So I Built SeaweedFS Storage Myself

Anonymous
13 min read

Introduction

When you keep testing uploads, presigned URLs, public buckets, CDN caching, and thumbnails locally, wiring an external S3 service into every test-only loop starts to feel more awkward than expected. It was not just about cost. What kept bothering me was having to route every experiment through external accounts and resources.

So I changed direction this time. I decided to bring the storage I use repeatedly in tests on-prem, while keeping the interface the application sees as close to plain S3 as possible. I felt this was a much better shape: use an on-prem S3-compatible store in local and dev, then swap the same interface to AWS S3 or Cloudflare R2 in production when needed.

As a result, I built a single-node S3-compatible storage stack with SeaweedFS + Nginx CDN + TLS + TTL. This project was also a case where AI automated almost the entire flow, from research drafts and infrastructure config drafts to manifest writing and deployment procedure write-ups. In this post, all sensitive information such as real keys, Secrets, internal paths, and internal IPs is masked.

Why build it myself at all?

AWS S3 itself was not the problem. It just kept feeling less natural for test workloads.

  • I needed to repeatedly verify file upload flows in local and development environments.
  • I needed to split public and private buckets and check CDN behavior, presigned URLs, and thumbnail layers together.
  • I had many workflows that created and deleted temporary files on short cycles.
  • Running all of those experiments against external storage every time felt too heavy for what was supposed to be simple testing.

What I wanted was not a massive storage platform, but S3 compatibility sufficient for testing and verification, plus simple on-prem operability. More precisely, I needed the storage contract visible to developers to stay fixed as an S3-compatible interface, so only the backend implementation could be swapped by environment.

Why SeaweedFS?

I looked at the most familiar option first, but as of late 2025 it was hard to keep MinIO as the default choice. Because of the license change and the shift in community direction, I felt it had become an awkward fit for something I wanted to carry lightly over the long term.

On the other hand, Ceph RGW was a far more complete option, but it was too heavy for this goal. I needed single-node test storage and simple media handling, not a large-scale storage cluster.

So the final choice was SeaweedFS.

  • Its Apache 2.0 license keeps commercial constraints light.
  • It provides an S3 Gateway, so I did not need to heavily change the existing SDK flow.
  • For the main use cases such as uploads, downloads, bucket-level separation, and presigned URLs, it keeps the S3 interface almost intact.
  • It is easy to deploy directly to Kubernetes with Helm.
  • It is a good fit for small media files such as images, audio, and short videos.

Rather than finding a perfect S3 replacement, this was the most concise choice for my scope: a store that moves under almost the same contract as S3.

How I actually put it together

I kept the setup straightforward. I built an S3-compatible layer around SeaweedFS, put an Nginx cache layer in front of it, and added TLS, a thumbnail layer for public buckets, and a TTL policy for temporary files on top.

At a high level, the runtime pieces looked like this.

  • S3 endpoint: the primary storage endpoint the application talks to directly
  • CDN endpoint: cached delivery for public resources
  • thumbnail endpoint: an image transformation layer only for public buckets
  • bucket split: the static and public buckets are public, while the images, videos, bgm, and files buckets are private
  • TTL: the _tmp/ prefix expires automatically after two weeks
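
Worth noting: S3 Lifecycle rules are one of SeaweedFS's gaps (the research appendix below calls this out), so the `_tmp/` expiry above is handled with SeaweedFS's own TTL mechanism rather than through the S3 API. A hedged sketch of what that looks like in `weed shell` — the bucket path and prefix here are illustrative, not my actual layout, and flag names should be checked against your SeaweedFS version:

```shell
# Inside `weed shell`: attach a two-week TTL to everything under the
# _tmp/ prefix of a bucket. Path and bucket name are illustrative.
fs.configure -locationPrefix=/buckets/files/_tmp/ -ttl=2w -apply
```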

That let me validate the main test flows in one place: uploads, public access, private presigned URLs, CDN caching, thumbnail generation, and temporary file cleanup, all on top of the same storage.
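
Of these, presigned URLs are the easiest to misunderstand: presigning is pure client-side signing, with no storage round-trip involved, which is exactly why swapping the endpoint between on-prem and AWS is cheap. The sketch below builds a SigV4 query-presigned GET URL with nothing but `openssl`, assuming placeholder credentials, a hypothetical endpoint, and path-style addressing; in a real project the SDK's presigner does this for you.

```shell
#!/bin/sh
# Minimal SigV4 query presign, computed offline. All values are placeholders.
ACCESS_KEY="AKIA-PLACEHOLDER"
SECRET_KEY="secret-placeholder"
ENDPOINT="seaweedfs-s3.local:8333"      # hypothetical on-prem S3 gateway
REGION="us-east-1"
BUCKET="images"
OBJECT="photo.jpg"

AMZ_DATE=$(date -u +%Y%m%dT%H%M%SZ)
DATE=${AMZ_DATE%T*}                     # yyyymmdd part of the timestamp
SCOPE="$DATE/$REGION/s3/aws4_request"

# HMAC-SHA256 helper: $1 is an openssl -macopt (key:... or hexkey:...), $2 the data.
hmac_hex() { printf '%s' "$2" | openssl dgst -sha256 -mac HMAC -macopt "$1" | sed 's/^.* //'; }

# Alphabetically sorted query string; '/' in the credential must be percent-encoded.
CRED=$(printf '%s' "$ACCESS_KEY/$SCOPE" | sed 's|/|%2F|g')
QS="X-Amz-Algorithm=AWS4-HMAC-SHA256&X-Amz-Credential=$CRED&X-Amz-Date=$AMZ_DATE&X-Amz-Expires=600&X-Amz-SignedHeaders=host"

# Canonical request -> string to sign (path-style URI: /bucket/key).
CREQ="GET
/$BUCKET/$OBJECT
$QS
host:$ENDPOINT

host
UNSIGNED-PAYLOAD"
CREQ_HASH=$(printf '%s' "$CREQ" | openssl dgst -sha256 | sed 's/^.* //')
STS="AWS4-HMAC-SHA256
$AMZ_DATE
$SCOPE
$CREQ_HASH"

# Derive the signing key through the HMAC chain, then sign.
K_DATE=$(hmac_hex "key:AWS4$SECRET_KEY" "$DATE")
K_REGION=$(hmac_hex "hexkey:$K_DATE" "$REGION")
K_SERVICE=$(hmac_hex "hexkey:$K_REGION" "s3")
K_SIGNING=$(hmac_hex "hexkey:$K_SERVICE" "aws4_request")
SIG=$(hmac_hex "hexkey:$K_SIGNING" "$STS")

echo "http://$ENDPOINT/$BUCKET/$OBJECT?$QS&X-Amz-Signature=$SIG"
```

The printed URL only grants access if the gateway is configured with the same key pair; the point is that between environments, only the endpoint and credentials change.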

On the Spring Boot side, I also did not need to change the existing S3 integration very much. In local and test profiles, I only switched the endpoint to the on-prem S3-compatible address and aligned path-style access plus the public/private bucket rules. In other words, even after building new storage, the application code could keep its overall shape.
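
As a sketch, that profile switch can be as small as a few properties. This assumes Spring Cloud AWS 3.x-style property names and a made-up endpoint; the real project's keys and hosts are masked, so treat every value here as illustrative:

```yaml
# application-local.yml -- hypothetical, for the local/test profile only
spring:
  cloud:
    aws:
      region:
        static: us-east-1                 # S3-compatible stores still expect a region value
      credentials:
        access-key: ${S3_ACCESS_KEY}      # injected from the environment, never committed
        secret-key: ${S3_SECRET_KEY}
      s3:
        endpoint: http://seaweedfs-s3.local:8333   # on-prem S3 gateway
        path-style-access-enabled: true            # most S3-compatible stores need path-style
```

In the production profile the endpoint override simply disappears (or points at R2), and the rest of the integration stays untouched.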

That part was especially good. In local and dev, I can use an on-prem store such as SeaweedFS, and in production I can switch to AWS S3 or R2 while keeping almost the same application-facing interface. Changing the storage product no longer means blowing up the application's storage abstraction.

Put differently, the core value of this build was not merely standing up one storage server. It was fixing the code-facing interface to S3 compatibility even though the actual backend implementation can vary by environment. That kind of interface is good for testing, and it also makes later production storage changes much more flexible.

How far AI went this time

The interesting part of this work was not SeaweedFS itself, but how far AI could push the process.

In practice, AI fully automated the tasks below, and the human side only handled sensitive inputs and the final review.

  • researching S3-compatible storage candidates and drafting the comparison
  • narrowing the options after comparing MinIO, SeaweedFS, Ceph, and RustFS
  • drafting Helm values, Nginx cache manifests, and TLS configuration
  • organizing bucket structure, TTL policies, test procedures, and an operations checklist
  • writing deployment guides and troubleshooting documents

What this reinforced for me is that infrastructure work can also be automated surprisingly deeply once the requirements and security boundaries are defined first. Especially in a repeated flow like drafting -> generating config files -> documenting the deployment procedure -> applying it for real, the practical efficiency gain was obvious.

Closing

This was not some grand story about adopting a storage platform. It was closer to a record of pulling the S3 I kept using for tests a little more into my own hands.

Even so, it was meaningful enough. I can now validate real operational flows such as public and private resources, CDN delivery, presigned URLs, thumbnails, and TTL more often without external dependencies. More importantly, I now have a structure where local and dev can stay on-prem while production can choose S3 or R2 behind the same S3-compatible interface.

And one more thing became clearer. AI can automate not only code writing, but also infrastructure drafts, configuration, and deployment documents at a fairly high level. There is still work left for humans, but at least now the boundary of what can be delegated is much clearer.


Appendix

Below is the full text of documents/workspace/infrastructure/S3_COMPATIBLE_STORAGE_RESEARCH.md.

Research on S3-Compatible Object Storage Solutions

Date: 2025-12-30
Purpose: Selecting a self-hosted infrastructure solution to replace AWS S3

Background

MinIO status (December 2025)

MinIO is no longer recommended. The main reasons are:

  1. License issue: changed from Apache 2.0 to AGPL-3.0

    • If offered as a network service, source code disclosure is required
    • For commercial use, annual licensing starts at $96,000
  2. Moved into maintenance mode (December 2025)

    • No more new features, improvements, or PR acceptance
    • Security patches are applied only after case-by-case evaluation
    • Community edition binary distribution has ended (source code only)
  3. Management UI removed

    • Admin console functionality was removed from the community edition
    • Full features are available only in the paid version

Reference: InfoQ - MinIO in Maintenance Mode


Requirements

Item | Requirement
S3 compatibility | Required: must work with the AWS SDK as-is
License | Prefer a license without commercial restrictions (Apache 2.0, MIT, etc.)
Stored assets | Images, BGM, short-form video clips
CDN expansion | Must support a caching CDN through an Nginx reverse proxy
Deployment environment | Kubernetes + Helm charts
Cost | Reduce AWS costs with self-hosted infrastructure

Solution comparison

1. SeaweedFS

Item | Details
License | Apache 2.0 (free for commercial use)
Language | Go
Architecture | Master + Volume + Filer structure
Characteristics | O(1) disk seek, based on the Facebook Haystack architecture
S3 compatibility | Provides an S3 Gateway (fully supports core S3 operations)
Helm | Official Helm charts + Kubernetes Operator
Maturity | Production proven (deployments over 1.5PB)
Enterprise | Free up to 25TB, then $1/TB/month

Pros

  • Can handle tens of billions of files, which is ideal for media storage
  • Fast access to small files with O(1) seek
  • Better storage efficiency through Erasure Coding
  • Cloud tiering for automatic cold-data offloading
  • Supports FUSE mounts, WebDAV, and Hadoop integration

Cons

  • Some advanced S3 features are not supported (for example lifecycle policies)
  • Metadata backup is mandatory (if Filer metadata is lost, files become orphaned)

Helm installation

Bash
helm repo add seaweedfs https://seaweedfs.github.io/seaweedfs/helm
helm upgrade --install seaweedfs seaweedfs/seaweedfs -n storage --create-namespace
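
In practice the install usually also wants a small values override to expose the S3 gateway. A hedged sketch follows; these key names match recent chart versions as far as I know, but verify them against the chart's own values.yaml:

```yaml
# values-s3.yaml -- illustrative override for the SeaweedFS Helm chart
filer:
  s3:
    enabled: true      # serve the S3 API from the filer
    port: 8333
    enableAuth: true   # require an access/secret key pair
```

Then pass it with `helm upgrade --install seaweedfs seaweedfs/seaweedfs -n storage -f values-s3.yaml`.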

Reference: SeaweedFS GitHub


2. RustFS

Item | Details
License | Apache 2.0
Language | Rust
Performance | 2.3x faster than MinIO for 4KB objects
S3 compatibility | Full S3 API support
Helm | Official Helm charts
Maturity | Beta as of December 2025 (0.0.77)

Pros

  • Supports MinIO migration and coexistence
  • Strong performance for small objects
  • Apache 2.0 license

Cons

  • Still in beta, so production use needs caution
  • Limited documentation and community support

Helm installation

Bash
helm repo add rustfs https://charts.rustfs.com
helm install rustfs rustfs/rustfs -n rustfs --create-namespace \
  --set ingress.className="nginx"

Reference: RustFS GitHub


3. Ceph RGW (via Rook)

Item | Details
License | LGPL 2.1
Language | C++ (data path), Go (Rook operator)
Architecture | Unified RADOS-based storage (Block + File + Object)
S3 compatibility | Best in class (passes 576 s3-tests)
Helm | Rook Operator available
Maturity | Enterprise-grade (proven for years)

Pros

  • Top-tier S3 compatibility with the broadest API coverage
  • Unified block, file, and object storage
  • Multi-tenancy and namespace isolation
  • Advanced Erasure Coding configuration
  • Exabyte-scale expansion

Cons

  • Complex to install and operate
  • High resource requirements (at least 3 nodes and a fast network)
  • Steep learning curve

Helm installation (Rook)

Bash
helm repo add rook-release https://charts.rook.io/release
helm install rook-ceph rook-release/rook-ceph -n rook-ceph --create-namespace
# A CephCluster CRD must still be applied afterwards

Reference: Rook Documentation


4. Garage

Item | Details
License | AGPL-3.0 (same issue as MinIO)
Language | Rust
Characteristics | Lightweight and specialized for geographically distributed deployment
S3 compatibility | Supports core S3 operations (advanced features are limited)
Helm | Community chart
Maturity | Suitable for small-scale self-hosting

Pros

  • Lightweight and resource-efficient
  • Built-in multi-zone and multi-site replication
  • Rust-based memory safety

Cons

  • AGPL-3.0 license with commercial-use constraints
  • No Erasure Coding support (3x replication only)
  • Limited advanced S3 features

Reference: Garage


S3 compatibility comparison

Solution | s3-tests passed | Evaluation
Ceph RGW | 576 | Best
Zenko CloudServer | 382 | Strong
MinIO | 321 | Good
SeaweedFS | 56 | Basic

SeaweedFS fully supports core S3 operations (PUT, GET, DELETE, LIST, etc.), but advanced features such as Object Lock and Lifecycle are limited.


Final recommendation: SeaweedFS

Why recommend it:

  1. Apache 2.0 license - fully open for commercial use
  2. Optimized for media storage - excellent for images, audio, and video
  3. Production proven - backed by years of large-scale deployment cases
  4. Kubernetes-friendly - official Helm charts and Operator
  5. Reasonable pricing - free up to 25TB, then $1/TB per month

Runner-up options

Situation | Recommendation
Need near-perfect S3 compatibility | Ceph RGW (if you can absorb the complexity)
Prefer newer tech and can tolerate experimentation | RustFS (if you can absorb the beta risk)
Small scale and AGPL is not a concern | Garage

Nginx CDN setup guide

Example CDN setup with SeaweedFS + Nginx:

Nginx cache configuration

Nginx
# /etc/nginx/nginx.conf

http {
    # cache storage settings
    proxy_cache_path /var/cache/nginx/s3
        levels=1:2
        keys_zone=s3_cache:100m
        max_size=50g
        inactive=7d
        use_temp_path=off;

    upstream seaweedfs_s3 {
        server seaweedfs-s3.storage.svc.cluster.local:8333;
        keepalive 64;
    }

    server {
        listen 80;
        server_name cdn.example.com;

        # image caching (7 days)
        location ~* \.(jpg|jpeg|png|gif|webp|ico|svg)$ {
            proxy_pass http://seaweedfs_s3;
            proxy_cache s3_cache;
            proxy_cache_valid 200 7d;
            proxy_cache_valid 404 1m;
            proxy_cache_use_stale error timeout updating;
            proxy_cache_lock on;

            add_header X-Cache-Status $upstream_cache_status;
            add_header Cache-Control "public, max-age=604800";

            proxy_set_header Host $host;
            proxy_set_header X-Real-IP $remote_addr;
        }

        # audio/video caching (30 days)
        location ~* \.(mp3|wav|ogg|mp4|webm|m4a)$ {
            proxy_pass http://seaweedfs_s3;
            proxy_cache s3_cache;
            proxy_cache_valid 200 30d;
            proxy_cache_valid 404 1m;
            proxy_cache_use_stale error timeout updating;
            proxy_cache_lock on;

            add_header X-Cache-Status $upstream_cache_status;
            add_header Cache-Control "public, max-age=2592000";

            # support Range requests (video streaming)
            proxy_set_header Range $http_range;
            proxy_set_header If-Range $http_if_range;

            proxy_set_header Host $host;
            proxy_set_header X-Real-IP $remote_addr;
        }

        # other files
        location / {
            proxy_pass http://seaweedfs_s3;
            proxy_cache s3_cache;
            proxy_cache_valid 200 1d;

            add_header X-Cache-Status $upstream_cache_status;

            proxy_set_header Host $host;
            proxy_set_header X-Real-IP $remote_addr;
        }
    }
}
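
One detail of the config above worth internalizing is `levels=1:2`: nginx names each cache file after the MD5 of the cache key (by default `$scheme$proxy_host$request_uri`, concatenated with no separators) and shards directories using the trailing characters of that hash. A quick sketch of where a given response would land on disk, reusing the cache path from the config:

```shell
#!/bin/sh
# Reproduce nginx's cache file layout for levels=1:2.
# Default cache key is $scheme$proxy_host$request_uri, joined without separators.
KEY='httpseaweedfs_s3/images/photo.jpg'
MD5=$(printf '%s' "$KEY" | openssl dgst -md5 | sed 's/^.* //')
LVL1=$(printf '%s' "$MD5" | tail -c 1)              # last hex char
LVL2=$(printf '%s' "$MD5" | tail -c 3 | head -c 2)  # the two chars before it
CACHE_FILE="/var/cache/nginx/s3/$LVL1/$LVL2/$MD5"
echo "$CACHE_FILE"
```

This is handy when you need to evict a single object by hand: delete that one file instead of flushing the whole cache zone.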

Cache status headers

  • HIT: served from cache
  • MISS: fetched from origin
  • STALE: an expired cache entry was served because the origin failed
  • UPDATING: background refresh in progress

Reference: NGINX Caching Guide


Kubernetes deployment architecture

┌─────────────────────────────────────────────────────────────────┐
│                        Kubernetes Cluster                        │
├─────────────────────────────────────────────────────────────────┤
│                                                                  │
│  ┌──────────────────────────────────────────────────────────┐   │
│  │                    Ingress (Nginx)                        │   │
│  │                   cdn.example.com                         │   │
│  └──────────────────────┬───────────────────────────────────┘   │
│                         │                                        │
│  ┌──────────────────────▼───────────────────────────────────┐   │
│  │              Nginx Cache Layer (CDN)                      │   │
│  │          /var/cache/nginx (PVC: 50Gi+)                   │   │
│  └──────────────────────┬───────────────────────────────────┘   │
│                         │                                        │
│  ┌──────────────────────▼───────────────────────────────────┐   │
│  │                   SeaweedFS                               │   │
│  │  ┌─────────┐  ┌─────────┐  ┌─────────┐  ┌─────────┐      │   │
│  │  │ Master  │  │ Volume  │  │ Volume  │  │ Volume  │      │   │
│  │  │  (x3)   │  │   #1    │  │   #2    │  │   #3    │      │   │
│  │  └────┬────┘  └────┬────┘  └────┬────┘  └────┬────┘      │   │
│  │       │            │            │            │            │   │
│  │  ┌────▼────────────▼────────────▼────────────▼────┐      │   │
│  │  │              Filer (S3 Gateway)                 │      │   │
│  │  │           seaweedfs-s3:8333                     │      │   │
│  │  └─────────────────────────────────────────────────┘      │   │
│  └──────────────────────────────────────────────────────────┘   │
│                                                                  │
│  ┌──────────────────────────────────────────────────────────┐   │
│  │                  Persistent Volumes                       │   │
│  │     Volume #1      Volume #2      Volume #3               │   │
│  │     (100Gi+)       (100Gi+)       (100Gi+)                │   │
│  └──────────────────────────────────────────────────────────┘   │
│                                                                  │
└─────────────────────────────────────────────────────────────────┘

Next steps

  1. Build a test environment with the SeaweedFS Helm chart
  2. Run S3 SDK integration tests (confirm compatibility with existing code)
  3. Set up the Nginx cache layer
  4. Run performance benchmarks (images, audio, video)
  5. Create a production deployment plan
