一、K8S架构与组件
1、K8S架构
k8s 总体架构采用了经典的 maste/slave 架构模式,分 master 节点和 worker 节点,节点可以是虚拟机也可以是物理机。
K8S组件
master 节点组件Kube-apiserver
用于暴露 Kubernetes API,任何资源请求或调用操作都是通过 kube-apiserver 提供的接口进行。以 HTTPRestful API 提供接口服务,所有对象资源的增删改查和监听操作都交给 API Server 处理后再提交给 Etcd 存储。
可以理解成 API Server 是 K8S 的请求入口服务。API Server 负责接收 K8S 所有请求(来自 UI 界面或者 CLI命令行工具), 然后根据用户的具体请求,去通知其他组件干活。可以说 API Server 是 K8S 集群架构的大脑。
Kube-controller-manager
运行管理控制器,是 K8S 集群中处理常规任务的后台线程,是 K8S 集群里所有资源对象的自动化控制中心。在 K8S 集群中,一个资源对应一个控制器,而 Controller manager 就是负责管理这些控制器的
由一系列控制器组成,通过 API Server 监控整个集群的状态,并确保集群处于预期的工作状态,比如当某个 Node意外宕机时,Controller Manager 会及时发现并执行自动化修复流程,确保集群始终处于预期的工作状态。
Kube-scheduler
是负责资源调度的进程,根据调度算法为新创建的 Pod 选择一个合适的 Node 节点。可以理解成 K8S 所有 Node 节点的调度器。当用户要部署服务时,Scheduler 会根据调度算法选择最合适的 Node 节点来部署 Pod。预选策略(predicate)优选策略(priorities
etcd
K8S 的存储服务。etcd 是分布式键值存储系统,存储了 K8S 的关键配置和用户配置,K8S 中仅 API Server 才具备读写权限,其他组件必须通过 API Server 的接口才能读写数据。
node节点组件
在 Kubernetes 集群中,在每个 Node(又称 Worker Node)上都会启动一个 kubelet 服务进程。该进程用于处理 Master 下发到本节点的任务,管理 Pod 及 Pod 中的容器。每个 kubelet 进程都会在 API Server 上注册节点自身的信息,定期向 Master 汇报节点资源的使用情况,并通过 cAdvisor 监控容器和节点资源。
Kubelet
Node 节点的监视器,以及与 Master 节点的通讯器。Kubelet 是 Master 节点安插在 Node 节点上的“眼线”,它会定时向 API Server 汇报自己 Node 节点上运行的服务的状态,并接受来自 Master 节点的指示采取调整措施。从 Master 节点获取自己节点上 Pod 的期望状态(比如运行什么容器、运行的副本数量、网络或者存储如何配置等), 直接跟容器引擎交互实现容器的生命周期管理,如果自己节点上 Pod 的状态与期望状态不一致,则调用对应的容器平台接口(即 docker 的接口)达到这个状态。管理镜像和容器的清理工作,保证节点上镜像不会占满磁盘空间,退出的容器不会占用太多资源。
Kube-Proxy
在每个 Node 节点上实现 Pod 网络代理,是 Kubernetes Service 资源的载体,负责维护网络规则和四层负载均衡工作。 负责写入规则至iptables、ipvs实现服务映射访问的。Kube-Proxy 本身不是直接给 Pod 提供网络,Pod 的网络是由 Kubelet 提供的,Kube-Proxy 实际上维护的是虚拟的 Pod 集群网络。Kube-apiserver 通过监控 Kube-Proxy 进行对 Kubernetes Service 的更新和端点的维护。在 K8S 集群中微服务的负载均衡是由 Kube-proxy 实现的。Kube-proxy 是 K8S 集群内部的负载均衡器。它是一个分布式代理服务器,在 K8S 的每个节点上都会运行一个 Kube-proxy 组件。
Controller Runtime
容器引擎,如:docker、containerd,运行容器,负责本机的容器创建和管理工作。当 kubernetes 把 pod 调度到节点上,节点上的 kubelet会指示 docker 启动特定的容器。接着,kubelet 会通过 docker 持续地收集容器的信息, 然后提交到主节点上。docker 会如往常一样拉取容器镜像、启动或停止容器。不同点仅仅在于这是由自动化系统控制而非管理员在每个节点上手动操作的。
Pod
k8s 中特有的一个概念,可以理解为对容器的包装,是 k8s 的基本调度单位,一个 Pod 代表集群上正在运行的一个进程,实际的容器是运行在 Pod 中的, 可以把 Pod 理解成豌豆荚,而同一 Pod 内的每个容器是一颗颗豌豆。一个节点可以启动一个或多个 Pod。生产环境中一般都是单个容器或者具有强关联互补的多个容器组成一个 Pod。
二、POD创建过程
第一步: 客户端提交创建Pod的请求,可以通过调用API Server的Rest API接口,也可以通过kubectl命令行工具。如kubectl apply -f filename.yaml(资源清单文件)
第二步: apiserver接收到pod创建请求后,会将yaml中的属性信息(metadata)写入etcd。
第三步: apiserver触发watch机制准备创建pod,信息转发给调度器scheduler,调度器用一组规则过滤掉不符合要求的主机,比如Pod指定了所需要的资源量,那么可用资源比Pod需要的资源量少的主机会被过滤掉。调度器使用调度算法选择node,调度器将node信息给apiserver,apiserver将绑定的node信息写入etcd。
第四步: apiserver又通过watch机制,调用kubelet,指定pod信息,调用Docker API创建并启动pod内的容器。
第五步: worker创建完成之后反馈给自身的kubelet, 再反馈给控制器的kubelet,然后将pod的状态信息给apiserver,apiserver又将pod的状态信息写入etcd。
三、Pod资源清单详解
1、Pod资源清单介绍
#查看对象中包含哪些字段
[root@k8s-master ~]# kubectl explain pod.spec.containers
KIND: Pod
VERSION: v1FIELD: containers <[]Container>DESCRIPTION:List of containers belonging to the pod. Containers cannot currently beadded or removed. There must be at least one container in a Pod. Cannot beupdated.A single application container that you want to run within a pod.FIELDS:args <[]string>Arguments to the entrypoint. The container image's CMD is used if this isnot provided. Variable references $(VAR_NAME) are expanded using thecontainer's environment. If a variable cannot be resolved, the reference inthe input string will be unchanged. Double $$ are reduced to a single $,which allows for escaping the $(VAR_NAME) syntax: i.e. "$$(VAR_NAME)" willproduce the string literal "$(VAR_NAME)". Escaped references will never beexpanded, regardless of whether the variable exists or not. Cannot beupdated. More info:https://kubernetes.io/docs/tasks/inject-data-application/define-command-argument-container/#running-a-command-in-a-shellcommand <[]string>Entrypoint array. Not executed within a shell. The container image'sENTRYPOINT is used if this is not provided. Variable references $(VAR_NAME)are expanded using the container's environment. If a variable cannot beresolved, the reference in the input string will be unchanged. Double $$ arereduced to a single $, which allows for escaping the $(VAR_NAME) syntax:i.e. "$$(VAR_NAME)" will produce the string literal "$(VAR_NAME)". Escapedreferences will never be expanded, regardless of whether the variable existsor not. Cannot be updated. More info:https://kubernetes.io/docs/tasks/inject-data-application/define-command-argument-container/#running-a-command-in-a-shellenv <[]EnvVar>List of environment variables to set in the container. Cannot be updated.envFrom <[]EnvFromSource>List of sources to populate environment variables in the container. The keysdefined within a source must be a C_IDENTIFIER. All invalid keys will bereported as an event when the container is starting. When a key exists inmultiple sources, the value associated with the last source will takeprecedence. Values defined by an Env with a duplicate key will takeprecedence. Cannot be updated.image <string>Container image name. More info:https://kubernetes.io/docs/concepts/containers/images This field is optionalto allow higher level config management to default or override containerimages in workload controllers like Deployments and StatefulSets.imagePullPolicy <string>Image pull policy. One of Always, Never, IfNotPresent. Defaults to Always if:latest tag is specified, or IfNotPresent otherwise. Cannot be updated. Moreinfo: https://kubernetes.io/docs/concepts/containers/images#updating-imagesPossible enum values:- `"Always"` means that kubelet always attempts to pull the latest image.Container will fail If the pull fails.- `"IfNotPresent"` means that kubelet pulls if the image isn't present ondisk. Container will fail if the image isn't present and the pull fails.- `"Never"` means that kubelet never pulls an image, but only uses a localimage. Container will fail if the image isn't presentlifecycle <Lifecycle>Actions that the management system should take in response to containerlifecycle events. Cannot be updated.livenessProbe <Probe>Periodic probe of container liveness. Container will be restarted if theprobe fails. Cannot be updated. More info:https://kubernetes.io/docs/concepts/workloads/pods/pod-lifecycle#container-probesname <string> -required-Name of the container specified as a DNS_LABEL. Each container in a pod musthave a unique name (DNS_LABEL). Cannot be updated.ports <[]ContainerPort>List of ports to expose from the container. Not specifying a port here DOESNOT prevent that port from being exposed. Any port which is listening on thedefault "0.0.0.0" address inside a container will be accessible from thenetwork. Modifying this array with strategic merge patch may corrupt thedata. For more information Seehttps://github.com/kubernetes/kubernetes/issues/108255. Cannot be updated.readinessProbe <Probe>Periodic probe of container service readiness. Container will be removedfrom service endpoints if the probe fails. Cannot be updated. More info:https://kubernetes.io/docs/concepts/workloads/pods/pod-lifecycle#container-probesresizePolicy <[]ContainerResizePolicy>Resources resize policy for the container.resources <ResourceRequirements>Compute Resources required by this container. Cannot be updated. More info:https://kubernetes.io/docs/concepts/configuration/manage-resources-containers/restartPolicy <string>RestartPolicy defines the restart behavior of individual containers in apod. This field may only be set for init containers, and the only allowedvalue is "Always". For non-init containers or when this field is notspecified, the restart behavior is defined by the Pod's restart policy andthe container type. Setting the RestartPolicy as "Always" for the initcontainer will have the following effect: this init container will becontinually restarted on exit until all regular containers have terminated.Once all regular containers have completed, all init containers withrestartPolicy "Always" will be shut down. This lifecycle differs from normalinit containers and is often referred to as a "sidecar" container. Althoughthis init container still starts in the init container sequence, it does notwait for the container to complete before proceeding to the next initcontainer. Instead, the next init container starts immediately after thisinit container is started, or after any startupProbe has successfullycompleted.securityContext <SecurityContext>SecurityContext defines the security options the container should be runwith. If set, the fields of SecurityContext override the equivalent fieldsof PodSecurityContext. More info:https://kubernetes.io/docs/tasks/configure-pod-container/security-context/startupProbe <Probe>StartupProbe indicates that the Pod has successfully initialized. Ifspecified, no other probes are executed until this completes successfully.If this probe fails, the Pod will be restarted, just as if the livenessProbefailed. This can be used to provide different probe parameters at thebeginning of a Pod's lifecycle, when it might take a long time to load dataor warm a cache, than during steady-state operation. This cannot be updated.More info:https://kubernetes.io/docs/concepts/workloads/pods/pod-lifecycle#container-probesstdin <boolean>Whether this container should allocate a buffer for stdin in the containerruntime. If this is not set, reads from stdin in the container will alwaysresult in EOF. Default is false.stdinOnce <boolean>Whether the container runtime should close the stdin channel after it hasbeen opened by a single attach. When stdin is true the stdin stream willremain open across multiple attach sessions. If stdinOnce is set to true,stdin is opened on container start, is empty until the first client attachesto stdin, and then remains open and accepts data until the clientdisconnects, at which time stdin is closed and remains closed until thecontainer is restarted. If this flag is false, a container processes thatreads from stdin will never receive an EOF. Default is falseterminationMessagePath <string>Optional: Path at which the file to which the container's terminationmessage will be written is mounted into the container's filesystem. Messagewritten is intended to be brief final status, such as an assertion failuremessage. Will be truncated by the node if greater than 4096 bytes. The totalmessage length across all containers will be limited to 12kb. Defaults to/dev/termination-log. Cannot be updated.terminationMessagePolicy <string>Indicate how the termination message should be populated. File will use thecontents of terminationMessagePath to populate the container status messageon both success and failure. FallbackToLogsOnError will use the last chunkof container log output if the termination message file is empty and thecontainer exited with an error. The log output is limited to 2048 bytes or80 lines, whichever is smaller. Defaults to File. Cannot be updated.Possible enum values:- `"FallbackToLogsOnError"` will read the most recent contents of thecontainer logs for the container status message when the container exitswith an error and the terminationMessagePath has no contents.- `"File"` is the default behavior and will set the container statusmessage to the contents of the container's terminationMessagePath when thecontainer exits.tty <boolean>Whether this container should allocate a TTY for itself, also requires'stdin' to be true. Default is false.volumeDevices <[]VolumeDevice>volumeDevices is the list of block devices to be used by the container.volumeMounts <[]VolumeMount>Pod volumes to mount into the container's filesystem. Cannot be updated.workingDir <string>Container's working directory. If not specified, the container runtime'sdefault will be used, which might be configured in the container image.Cannot be updated.
2、POD YAML文件示例
# yaml格式的pod定义文件完整内容:
apiVersion: v1 #必选,版本号,例如v1
kind: Pod #必选,Pod
metadata: #必选,元数据name: string #必选,Pod名称namespace: string #必选,Pod所属的命名空间labels: #自定义标签name: string #自定义标签名字annotations: #自定义注释列表- name: string
spec: #必选,Pod中容器的详细定义containers: #必选,Pod中容器列表- name: string #必选,容器名称image: string #必选,容器的镜像名称imagePullPolicy: [Always | Never | IfNotPresent] #获取镜像的策略 Alawys表示下载镜像 IfnotPresent表示优先使用本地镜像,否则下载镜像,Nerver表示仅使用本地镜像command: [string] #容器的启动命令列表,如不指定,使用打包时使用的启动命令args: [string] #容器的启动命令参数列表workingDir: string #容器的工作目录volumeMounts: #挂载到容器内部的存储卷配置- name: string #引用pod定义的共享存储卷的名称,需用volumes[]部分定义的的卷名mountPath: string #存储卷在容器内mount的绝对路径,应少于512字符readOnly: boolean #是否为只读模式ports: #需要暴露的端口库号列表- name: string #端口号名称containerPort: int #容器需要监听的端口号hostPort: int #容器所在主机需要监听的端口号,默认与Container相同protocol: string #端口协议,支持TCP和UDP,默认TCPenv: #容器运行前需设置的环境变量列表- name: string #环境变量名称value: string #环境变量的值resources: #资源限制和请求的设置limits: #资源限制的设置cpu: string #Cpu的限制,单位为core数,将用于docker run --cpu-shares参数memory: string #内存限制,单位可以为Mib/Gib,将用于docker run --memory参数requests: #资源请求的设置cpu: string #Cpu请求,容器启动的初始可用数量memory: string #内存清楚,容器启动的初始可用数量livenessProbe: #对Pod内个容器健康检查的设置,当探测无响应几次后将自动重启该容器,检查方法有exec、httpGet和tcpSocket,对一个容器只需设置其中一种方法即可exec: #对Pod容器内检查方式设置为exec方式command: [string] #exec方式需要制定的命令或脚本httpGet: #对Pod内个容器健康检查方法设置为HttpGet,需要制定Path、portpath: stringport: numberhost: stringscheme: stringHttpHeaders:- name: stringvalue: stringtcpSocket: #对Pod内个容器健康检查方式设置为tcpSocket方式port: numberinitialDelaySeconds: 0 #容器启动完成后首次探测的时间,单位为秒timeoutSeconds: 0 #对容器健康检查探测等待响应的超时时间,单位秒,默认1秒periodSeconds: 0 #对容器监控检查的定期探测时间设置,单位秒,默认10秒一次successThreshold: 0failureThreshold: 0securityContext:privileged: falserestartPolicy: [Always | Never | OnFailure]#Pod的重启策略,Always表示一旦不管以何种方式终止运行,kubelet都将重启,OnFailure表示只有Pod以非0退出码退出才重启,Nerver表示不再重启该PodnodeSelector: obeject #设置NodeSelector表示将该Pod调度到包含这个label的node上,以key:value的格式指定imagePullSecrets: #Pull镜像时使用的secret名称,以key:secretkey格式指定- name: stringhostNetwork: false #是否使用主机网络模式,默认为false,如果设置为true,表示使用宿主机网络volumes: #在该pod上定义共享存储卷列表- name: string #共享存储卷名称 (volumes类型有很多种)emptyDir: {} #类型为emtyDir的存储卷,与Pod同生命周期的一个临时目录。为空值hostPath: string #类型为hostPath的存储卷,表示挂载Pod所在宿主机的目录path: string #Pod所在宿主机的目录,将被用于同期中mount的目录secret: #类型为secret的存储卷,挂载集群与定义的secre对象到容器内部scretname: string items: - key: stringpath: stringconfigMap: #类型为configMap的存储卷,挂载预定义的configMap对象到容器内部name: stringitems:- key: stringpath: string
四、标签
1 、什么是标签?
标签其实就一对 key/value ,被关联到对象上,比如Pod,标签的使用我们倾向于能够表示对象的特殊特点,就是一眼就看出了这个Pod是干什么的,标签可以用来划分特定的对象(比如版本,服务类型等),标签可以在创建一个对象的时候直接定义,也可以在后期随时修改,每一个对象可以拥有多个标签,但是,key值必须是唯一的。创建标签之后也可以方便我们对资源进行分组管理。如果对pod打标签之后就可以使用标签来查看、删除指定的pod。(慎重!!!)在k8s中,大部分资源都可以打标签。
2、给pod资源打标签
通过yaml文件修改
通过命令修改