Introduction: From a Single Host to a Distributed Container Architecture
Traditional web application deployments constantly run into inconsistent environments and slow releases. I once maintained a game project that had to be deployed by hand across five servers, and every release felt like walking a tightrope. This article walks through building a complete distributed Docker architecture on CentOS, wiring up a three-node GitLab + Jenkins + production CI/CD pipeline, and ultimately deploying a web game project with it.
Part 1: Architecture Design and Environment Planning
1.1 Distributed Node Planning
Three-node architecture:
GitLab node: 192.168.1.101 (4 cores, 8 GB RAM, 200 GB storage)
Jenkins node: 192.168.1.102 (4 cores, 8 GB RAM)
Production node: 192.168.1.103 (8 cores, 16 GB RAM, NVIDIA T4 GPU)
# Base environment preparation on every node (CentOS 7.9)
sudo yum install -y yum-utils device-mapper-persistent-data lvm2
sudo yum-config-manager --add-repo https://download.docker.com/linux/centos/docker-ce.repo
sudo yum install -y docker-ce docker-ce-cli containerd.io
sudo systemctl enable --now docker
1.2 Network Topology Design
Figure: three-node Docker Swarm network topology
Key configuration:
Use an overlay network for cross-host container communication
Give each service network its own subnet (a network-creation example follows the cluster commands below)
Use Nginx for service discovery and load balancing
# Initialize the Docker Swarm cluster (on the production node)
docker swarm init --advertise-addr 192.168.1.103
# Join the cluster from the other two nodes using the token printed by "swarm init"
docker swarm join --token SWMTKN-1-xxx 192.168.1.103:2377
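To pin down a dedicated subnet, the overlay network can also be pre-created explicitly. A minimal sketch (the subnet value is illustrative; a pre-created network must then be declared external: true in the stack file):
# Pre-create an attachable overlay network with an explicit subnet (illustrative values)
docker network create \
  --driver overlay \
  --subnet 10.10.1.0/24 \
  --attachable \
  game-network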
Part 2: Core Component Deployment
2.1 Containerized GitLab Deployment (192.168.1.101)
# Create the data volume directories
mkdir -p /gitlab/{config,logs,data}
# Start the GitLab container
docker run -d \
  --hostname gitlab.example.com \
  --publish 8443:443 --publish 8080:80 --publish 8022:22 \
  --name gitlab \
  --restart always \
  --volume /gitlab/config:/etc/gitlab \
  --volume /gitlab/logs:/var/log/gitlab \
  --volume /gitlab/data:/var/opt/gitlab \
  --shm-size 256m \
  gitlab/gitlab-ce:15.11.8-ce.0
Performance tuning:
Edit /gitlab/config/gitlab.rb (mounted into the container as /etc/gitlab/gitlab.rb):
puma['worker_processes'] = 4      # GitLab 14+ uses Puma instead of Unicorn
postgresql['shared_buffers'] = "256MB"
sidekiq['concurrency'] = 10
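After editing gitlab.rb, the changes still have to be applied inside the container; a minimal sketch, assuming the container is named gitlab as in the docker run command above:
# Re-run the GitLab configuration to pick up the new settings
docker exec gitlab gitlab-ctl reconfigure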
2.2 Containerized Jenkins Deployment (192.168.1.102)
# Custom Jenkins Dockerfile
FROM jenkins/jenkins:2.414.3-lts-jdk11
USER root
RUN apt-get update && \
    apt-get install -y docker.io python3-pip && \
    pip3 install docker-compose
COPY plugins.txt /usr/share/jenkins/ref/plugins.txt
RUN jenkins-plugin-cli -f /usr/share/jenkins/ref/plugins.txt
USER jenkins
# Start the Jenkins container
docker run -d \
  --name jenkins \
  -p 8081:8080 -p 50000:50000 \
  -v /jenkins_home:/var/jenkins_home \
  -v /var/run/docker.sock:/var/run/docker.sock \
  --restart unless-stopped \
  my-jenkins-image
Key plugins:
Docker Pipeline
Blue Ocean
GitLab Plugin
SSH Pipeline Steps
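The Dockerfile above copies a plugins.txt for jenkins-plugin-cli. A sketch matching the list above, using the plugins' short IDs (in practice you would pin versions with id:version):
# plugins.txt
docker-workflow
blueocean
gitlab-plugin
ssh-steps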
Part 3: Containerizing the Web Game Project
3.1 Game Architecture Analysis
The project uses a decoupled front-end/back-end architecture:
Front end: Unity WebGL build
Back end: Node.js game server
Database: MongoDB replica set (rs0, three members)
Real-time communication: WebSocket
3.2 Multi-Service Docker Compose Orchestration
version: '3.8'
services:
  game-frontend:
    image: registry.example.com/game-webgl:${TAG}
    deploy:
      replicas: 3
      update_config:
        parallelism: 1
        delay: 10s
      restart_policy:
        condition: on-failure
    networks:
      - game-network
  game-server:
    image: registry.example.com/game-server:${TAG}
    environment:
      - NODE_ENV=production
      - MONGO_URI=mongodb://mongo1:27017,mongo2:27017,mongo3:27017/game?replicaSet=rs0
    deploy:
      replicas: 2
    networks:
      - game-network
    depends_on:
      - mongo1
      - mongo2
      - mongo3
  mongo1:
    image: mongo:5.0
    command: mongod --replSet rs0 --bind_ip_all
    volumes:
      - mongo1-data:/data/db
    networks:
      - game-network
  # mongo2 and mongo3 are configured the same way ...
  nginx:
    image: nginx:1.23
    ports:
      - "80:80"
      - "443:443"
    volumes:
      - ./nginx.conf:/etc/nginx/nginx.conf
    depends_on:
      - game-frontend
      - game-server
    networks:
      - game-network
networks:
  game-network:
    driver: overlay
volumes:
  mongo1-data:
  mongo2-data:
  mongo3-data:
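The replica set still has to be initiated once after the stack comes up. A minimal sketch, run from the node hosting the mongo1 task (mongosh ships with the mongo:5.0 image; fall back to the legacy mongo shell if it is unavailable):
# One-time replica set initiation for rs0
docker exec -it $(docker ps -q -f name=mongo1) mongosh --eval '
  rs.initiate({
    _id: "rs0",
    members: [
      { _id: 0, host: "mongo1:27017" },
      { _id: 1, host: "mongo2:27017" },
      { _id: 2, host: "mongo3:27017" }
    ]
  })'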
3.3 Key Nginx Configuration
# nginx.conf
# Mounted as the full /etc/nginx/nginx.conf, so the events and http blocks are required
events {}

http {
    include /etc/nginx/mime.types;

    upstream game_servers {
        server game-server:3000;
    }

    server {
        listen 80;
        server_name game.example.com;

        location / {
            root /usr/share/nginx/html;
            try_files $uri /index.html;
        }

        location /api {
            proxy_pass http://game_servers;
            proxy_http_version 1.1;
            proxy_set_header Upgrade $http_upgrade;
            proxy_set_header Connection "upgrade";
        }
    }
}
Part 4: CI/CD Pipeline Implementation
4.1 GitLab Runner Configuration
# Register a GitLab Runner on the Jenkins node
docker run -d --name gitlab-runner \
  -v /var/run/docker.sock:/var/run/docker.sock \
  -v /gitlab-runner/config:/etc/gitlab-runner \
  gitlab/gitlab-runner:v15.11.0

docker exec -it gitlab-runner gitlab-runner register
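The register command above is interactive; a non-interactive variant looks roughly like this (the registration token is a placeholder taken from the GitLab project or group CI/CD settings):
docker exec -it gitlab-runner gitlab-runner register \
  --non-interactive \
  --url "http://192.168.1.101:8080" \
  --registration-token "<REGISTRATION_TOKEN>" \
  --description "game-runner" \
  --executor docker \
  --docker-image "alpine:3.16" \
  --docker-privileged \
  --docker-volumes "/var/run/docker.sock:/var/run/docker.sock"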
4.2 The Complete Jenkinsfile
pipeline {
    agent {
        docker {
            image 'node:18'
            args '-v $HOME/.npm:/root/.npm'
        }
    }
    environment {
        DOCKER_REGISTRY = 'registry.example.com'
        PROJECT = 'web-game'
        DEPLOY_NODE = '192.168.1.103'
        SSH_CREDS = credentials('prod-ssh-key')
    }
    stages {
        stage('Checkout') {
            steps {
                git branch: 'main',
                    url: 'http://192.168.1.101:8080/game/web-game.git',
                    credentialsId: 'gitlab-cred'
            }
        }
        stage('Build Frontend') {
            steps {
                dir('webgl-build') {
                    sh 'npm install'
                    sh 'npm run build'
                    sh 'docker build -t $DOCKER_REGISTRY/$PROJECT-webgl:$BUILD_NUMBER .'
                }
            }
        }
        stage('Build Server') {
            steps {
                dir('server') {
                    sh 'npm install --production'
                    sh 'docker build -t $DOCKER_REGISTRY/$PROJECT-server:$BUILD_NUMBER .'
                }
            }
        }
        stage('Push Images') {
            steps {
                withCredentials([usernamePassword(credentialsId: 'docker-registry',
                                                  usernameVariable: 'DOCKER_USER',
                                                  passwordVariable: 'DOCKER_PASS')]) {
                    sh 'echo $DOCKER_PASS | docker login -u $DOCKER_USER --password-stdin $DOCKER_REGISTRY'
                    sh 'docker push $DOCKER_REGISTRY/$PROJECT-webgl:$BUILD_NUMBER'
                    sh 'docker push $DOCKER_REGISTRY/$PROJECT-server:$BUILD_NUMBER'
                }
            }
        }
        stage('Deploy to Production') {
            steps {
                sshagent(['prod-ssh-key']) {
                    sh """
                        ssh -o StrictHostKeyChecking=no ubuntu@$DEPLOY_NODE \
                            "export TAG=$BUILD_NUMBER && \
                             docker stack deploy -c docker-compose.prod.yml game"
                    """
                }
            }
        }
    }
    post {
        failure {
            slackSend channel: '#game-alerts',
                      message: "Build failed: ${env.JOB_NAME} #${env.BUILD_NUMBER}"
        }
        success {
            slackSend channel: '#game-deploy',
                      message: "New version deployed: ${env.BUILD_NUMBER}"
        }
    }
}
4.3 Key Optimizations
Build caching: reuse the node_modules directory and npm cache between builds to speed up the build stages
Credential security: manage SSH keys through Jenkins Credentials
Rollback mechanism: keep the five most recent working image versions (a rollback sketch follows this list)
Notifications: Slack integration for real-time build status
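Since the stack is deployed with docker stack deploy and image tags follow $BUILD_NUMBER, a rollback can be as simple as redeploying an earlier tag or using Swarm's built-in service rollback; a sketch (the tag 41 is a hypothetical earlier build number):
# On the production node: roll a single service back to its previous definition
docker service rollback game_game-server

# Or redeploy the whole stack with a known-good earlier tag
export TAG=41 && docker stack deploy -c docker-compose.prod.yml game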
Part 5: Monitoring and Operations
5.1 Distributed Monitoring Stack
# docker-compose.monitor.yml
version: '3.8'
services:
  prometheus:
    image: prom/prometheus
    ports:
      - "9090:9090"
    volumes:
      - ./prometheus.yml:/etc/prometheus/prometheus.yml
    deploy:
      placement:
        constraints: [node.role == manager]
  grafana:
    image: grafana/grafana
    ports:
      - "3000:3000"
    volumes:
      - grafana-data:/var/lib/grafana
    depends_on:
      - prometheus
  node-exporter:
    image: prom/node-exporter
    deploy:
      mode: global
    volumes:
      - /proc:/host/proc:ro
      - /sys:/host/sys:ro
      - /:/rootfs:ro
volumes:
  grafana-data:
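The prometheus.yml mounted above is not shown in the original; a minimal sketch that scrapes the global node-exporter tasks through Swarm's tasks.<service> DNS entries (the job name and scrape interval are assumptions):
# prometheus.yml (minimal sketch)
global:
  scrape_interval: 15s

scrape_configs:
  - job_name: 'node-exporter'
    dns_sd_configs:
      - names: ['tasks.node-exporter']  # Swarm DNS returns one A record per task
        type: A
        port: 9100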
Part 6: Lessons Learned and Further Thoughts
6.1 Common Problems and Solutions
Problem 1: cross-host container networking is broken
Symptom: containers in the Swarm cluster cannot reach each other by service name
Solution: check the firewall rules:
sudo firewall-cmd --permanent --add-port=2377/tcp
sudo firewall-cmd --permanent --add-port=7946/tcp
sudo firewall-cmd --permanent --add-port=7946/udp
sudo firewall-cmd --permanent --add-port=4789/udp
sudo firewall-cmd --reload
Verify the state of the overlay network:
docker network inspect game-network
Optimization: tune the GitLab Runner configuration in /gitlab-runner/config/config.toml:
[[runners]]
  name = "game-runner"
  url = "http://192.168.1.101:8080"
  executor = "docker"
  [runners.docker]
    tls_verify = false
    image = "alpine:3.16"
    privileged = true
    disable_cache = false
    volumes = ["/cache", "/var/run/docker.sock:/var/run/docker.sock"]
    shm_size = 536870912  # 512 MB; this field takes a byte count, not a "512m" string
Increase the Runner's job concurrency.
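Concurrency is controlled by the top-level concurrent setting in the same config.toml; a sketch:
# Top of /gitlab-runner/config/config.toml
concurrent = 4   # number of jobs the Runner may run in parallel
check_interval = 0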
6.2 Performance Gains
| Metric | Before | After |
| --- | --- | --- |
| Build time | 23 min | 8 min |
| Deployment time | 15 min | 45 s |
| Image size | 1.8 GB | 420 MB |
| Startup time | 30 s | 3 s |
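The article does not spell out how the image shrank from 1.8 GB to 420 MB; one common way to get that kind of reduction is a multi-stage build, sketched here for the Node.js game server (base images, paths, and the entry file are assumptions, not the author's actual Dockerfile):
# Hypothetical multi-stage Dockerfile for the game server
FROM node:18 AS build
WORKDIR /app
COPY package*.json ./
RUN npm ci --omit=dev          # install only production dependencies
COPY . .

FROM node:18-alpine            # small runtime image
WORKDIR /app
COPY --from=build /app /app
EXPOSE 3000
CMD ["node", "server.js"]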
Conclusion: From Practice to Production
This CentOS-based distributed Docker architecture has now run stably for six months, serving roughly 500,000 page views per day for the game. Key takeaways:
Infrastructure as code: every environment configuration is version-controlled
Immutable infrastructure: applications change by shipping new images, never by modifying running environments
Automate everything: the full path from code commit to production deployment is automated
Future plans:
Migrate to Kubernetes for more advanced orchestration capabilities
Introduce a service mesh to manage communication between microservices
Implement Prometheus-driven autoscaling
I hope this hands-on walkthrough serves as a useful reference for your own journey into distributed containers. Feel free to share the challenges and solutions from your own CI/CD practice in the comments!