Operating the RocketMQ Message Queue

In the previous two sections we discussed RabbitMQ. However, real production experience shows that once a system's instantaneous traffic reaches a certain scale, message queues such as RabbitMQ are no longer the best choice for the messaging layer.

RabbitMQ is perfectly adequate for enterprise applications, but its ability to absorb message backlogs is poor: once a traffic spike causes messages to pile up, the whole RabbitMQ cluster can hang and refuse new messages, and in extreme cases the cluster may even crash and lose messages.

RocketMQ is a high-performance distributed message queue. It borrows Kafka's design ideas and inherits its high throughput, while putting extra effort into reducing latency, which makes it better suited to real-time message processing. According to the official benchmarks, RocketMQ can handle about 70,000 messages per second on mainstream server hardware, and this throughput stays largely flat as the number of topics (queues) grows, making it more stable than Kafka in this respect^1^ .

In this section we discuss how to operate RocketMQ. There are generally two deployment options:

  • Deploy RocketMQ on multiple physical machines. The advantage is reliable performance; see the official cluster deployment documentation.
  • Deploy RocketMQ on a container cluster. The advantage is easier operations; this is the approach covered in this section.

A RocketMQ server plays one of two roles:

  • NameServer: manages metadata and serves as the connection entry point for Brokers and clients
  • Broker: the server that actually handles the message queues

In this section we will build a 4-node RocketMQ cluster: 2 NameServers and 2 Brokers.

First, create 4 PersistentVolumes, one for each of the 4 nodes:

apiVersion: v1
kind: PersistentVolume
metadata:
  name: pv011 
spec:
  storageClassName: standard
  accessModes:
    - ReadWriteOnce
  capacity:
    storage: 20Gi
  hostPath:
    path: /data/pv011/
---
apiVersion: v1
kind: PersistentVolume
metadata:
  name: pv012 
spec:
  storageClassName: standard
  accessModes:
    - ReadWriteOnce
  capacity:
    storage: 20Gi
  hostPath:
    path: /data/pv012/
---
apiVersion: v1
kind: PersistentVolume
metadata:
  name: pv021
spec:
  storageClassName: standard
  accessModes:
    - ReadWriteOnce
  capacity:
    storage: 20Gi
  hostPath:
    path: /data/pv021/
---
apiVersion: v1
kind: PersistentVolume
metadata:
  name: pv022
spec:
  storageClassName: standard
  accessModes:
    - ReadWriteOnce
  capacity:
    storage: 20Gi
  hostPath:
    path: /data/pv022/

Apply the 4 persistent volumes:

kubectl apply -f ./pvs.yaml
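
Before moving on, it is worth confirming that the volumes were registered; all four should be listed with STATUS Available until a claim binds them:

kubectl get pv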

Next, let's look at the NameServer deployment:

apiVersion: v1
kind: Service
metadata:
  name: rn 
spec:
  ports:
  - port: 9876
  selector:
    app: rocketmq-nameserver
  clusterIP: None
---
apiVersion: apps/v1
kind: StatefulSet 
metadata:
  name: rocketmq-nameserver 
spec:
  selector:
    matchLabels:
      app: rocketmq-nameserver
  serviceName: "rn"
  replicas: 2
  template:
    metadata:
      labels:
        app: rocketmq-nameserver 
    spec:
      restartPolicy: Always
      hostname: rocketmq-nameserver 
      containers:
      - name: rocketmq-nameserver-ct
        imagePullPolicy: Never
        image: coder4/rocketmq:4.2.0 
        ports:
        - containerPort: 9876 
        volumeMounts:
        - mountPath: /opt/rocketmq_home
          name: rocketmq-nameserver-pvc
        env:
        - name: NAME_SERVER
          value: "true"
  volumeClaimTemplates:
  - metadata:
      name: rocketmq-nameserver-pvc
    spec:
      storageClassName: standard
      accessModes:
        - ReadWriteOnce
      resources:
        requests:
          storage: 20Gi

As shown above:

  • We use the custom image coder4/rocketmq, which bundles both the NameServer and the Broker and supports cluster deployment (a sketch of what its entrypoint presumably does follows this list).
  • A StatefulSet deploys two independent NameServers, whose host names are rocketmq-nameserver-0 and rocketmq-nameserver-1.
  • The volume is mounted at /opt/rocketmq_home, which will contain the data and log subdirectories.
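
The image itself is not listed in this book. Judging from the environment variables used in the manifests (NAME_SERVER here, plus BROKER_SERVER, NAME_SERVER_LIST and BROKER_CLUSTER_CONF below), its entrypoint presumably does something along these lines; the paths and the per-pod choice of the broker properties file are assumptions, not the image's actual content:

#!/bin/bash
# Hypothetical entrypoint of coder4/rocketmq -- illustration only
cd /opt/rocketmq

if [ "$NAME_SERVER" = "true" ]; then
  # NameServer mode: listens on 9876 by default
  exec sh bin/mqnamesrv
elif [ "$BROKER_SERVER" = "true" ]; then
  # Broker mode: the real image has to pick broker-a or broker-b
  # per pod; hard-coded here to keep the sketch short
  exec sh bin/mqbroker -n "$NAME_SERVER_LIST" \
    -c "conf/$BROKER_CLUSTER_CONF/broker-a.properties"
fi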

Apply the manifest, and after a short while the pods are running:

kubectl apply -f ./nameserver-service.yaml
kubectl get pods
NAME                                                   READY     STATUS    RESTARTS   AGE
rocketmq-nameserver-0                                  1/1       Running   0          2m
rocketmq-nameserver-1                                  1/1       Running   0          2m
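
Because rn is a headless Service (clusterIP: None) in front of a StatefulSet, each NameServer gets a stable DNS name of the form <pod-name>.<service-name>, i.e. rocketmq-nameserver-0.rn and rocketmq-nameserver-1.rn within the namespace. A quick way to verify the resolution with a throwaway pod (the name dns-test is arbitrary):

kubectl run dns-test -it --rm --restart=Never --image=busybox:1.28 -- \
  nslookup rocketmq-nameserver-0.rn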

Next, let's look at the Broker. The official documentation offers several high-availability modes for Brokers; here we use the "two masters, no slave" (2m-noslave) mode. See the official documentation for the other modes.

apiVersion: v1
kind: Service
metadata:
  name: rb 
spec:
  ports:
  - name: p9
    port: 10909 
  - name: p11
    port: 10911
  selector:
    app: rocketmq-brokerserver
  clusterIP: None
---
apiVersion: apps/v1
kind: StatefulSet 
metadata:
  name: rocketmq-brokerserver 
spec:
  selector:
    matchLabels:
      app: rocketmq-brokerserver
  serviceName: "rb"
  replicas: 2
  template:
    metadata:
      labels:
        app: rocketmq-brokerserver 
    spec:
      restartPolicy: Always
      hostname: rocketmq-brokerserver 
      containers:
      - name: rocketmq-brokerserver-ct
        imagePullPolicy: Never
        image: coder4/rocketmq:4.2.0
        ports:
        - containerPort: 10909 
        - containerPort: 10911
        volumeMounts:
        - mountPath: /opt/rocketmq_home
          name: rocketmq-brokerserver-pvc
        env:
        - name: NAME_SERVER_LIST
          value: "rocketmq-nameserver-0.rn:9876;rocketmq-nameserver-1.rn:9876" 
        - name: BROKER_SERVER
          value: "true"
        - name: BROKER_CLUSTER_CONF
          value: "2m-noslave"
  volumeClaimTemplates:
  - metadata:
      name: rocketmq-brokerserver-pvc
    spec:
      storageClassName: standard
      accessModes:
        - ReadWriteOnce
      resources:
        requests:
          storage: 20Gi

The Broker configuration above differs from the NameServer configuration as follows:

  • The environment variable NAME_SERVER_LIST tells the Broker where the NameServer cluster is, i.e. the two instances we started earlier.
  • BROKER_SERVER switches the image into Broker mode.
  • BROKER_CLUSTER_CONF selects the cluster configuration template, i.e. the "two masters, no slave" mode mentioned above (see the sketch after this list).
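
For reference, the 2m-noslave template shipped with the stock RocketMQ 4.x distribution consists of two files, conf/2m-noslave/broker-a.properties and broker-b.properties, one per master. broker-a.properties looks roughly like this (the custom image may tweak the values, so treat this as a sketch):

brokerClusterName=DefaultCluster
brokerName=broker-a
brokerId=0
deleteWhen=04
fileReservedTime=48
brokerRole=ASYNC_MASTER
flushDiskType=ASYNC_FLUSH

broker-b.properties is identical except for brokerName=broker-b, which is why the clusterList output below shows broker-a and broker-b under DefaultCluster.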

Apply the Broker manifest as well; after a short wait the pods come up:

kubectl apply -f ./broker-service.yaml
kubectl get pods
NAME                                                   READY     STATUS    RESTARTS   AGE
rocketmq-brokerserver-0                                1/1       Running   0          59s
rocketmq-brokerserver-1                                0/1       Running   0          49s

We can use RocketMQ's bundled mqadmin tool to check the Broker cluster status. Both brokers are listed, which confirms that the cluster was deployed successfully:

./mqadmin clusterList -n "rocketmq-nameserver-0.rn:9876;rocketmq-nameserver-1.rn:9876"    
#Cluster Name     #Broker Name            #BID  #Addr                  #Version                #InTPS(LOAD)       #OutTPS(LOAD) #PCWait(ms) #Hour #SPACE
DefaultCluster    broker-a                0     172.17.0.15:10911      V4_2_0_SNAPSHOT          0.00(0,0ms)         0.00(0,0ms)          0 425363.14 -1.0000
DefaultCluster    broker-b                0     172.17.0.16:10911      V4_2_0_SNAPSHOT          0.00(0,0ms)         0.00(0,0ms)          0 425363.14 -1.0000
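
As a final smoke test, the stock RocketMQ distribution also ships example producer and consumer classes that can be run from inside the cluster (assuming automatic topic creation is enabled, which is the Broker default):

export NAMESRV_ADDR="rocketmq-nameserver-0.rn:9876;rocketmq-nameserver-1.rn:9876"
# send a batch of test messages to TopicTest
sh bin/tools.sh org.apache.rocketmq.example.quickstart.Producer
# consume them back
sh bin/tools.sh org.apache.rocketmq.example.quickstart.Consumer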

With that, the RocketMQ cluster deployment is complete.

In the next section, we will discuss integrating RocketMQ with Spring Boot.