Kubernetes 博客

Thursday, January 17, 2019

Update on Volume Snapshot Alpha for Kubernetes

Authors: Jing Xu (Google), Xing Yang (Huawei), Saad Ali (Google)

Volume snapshotting support was introduced in Kubernetes v1.12 as an alpha feature. In Kubernetes v1.13, it remains an alpha feature, but a few enhancements were added and some breaking changes were made. This post summarizes the changes.

Breaking Changes

CSI spec v1.0 introduced a few breaking changes to the volume snapshot feature. CSI driver maintainers should be aware of these changes as they upgrade their drivers to support v1.0.

SnapshotStatus replaced with Boolean ReadyToUse

CSI v0.3.0, defined a SnapshotStatus enum in CreateSnapshotResponse which indicates whether the snapshot is READY, UPLOADING, or ERROR_UPLOADING. In CSI v1.0, SnapshotStatus has been removed from CreateSnapshotResponse and replaced with a boolean ReadyToUse. A ReadyToUse value of true indicates that post snapshot processing (such as uploading) is complete and the snapshot is ready to be used as a source to create a volume.

Storage systems that need to do post snapshot processing (such as uploading after the snapshot is cut) should return a successful CreateSnapshotResponse with the ReadyToUse field set to false as soon as the snapshot has been taken. This indicates that the Container Orchestration System (CO) can resume any workload that was quiesced for the snapshot to be taken. The CO can then repeatedly call CreateSnapshot until the ReadyToUse field is set to true or the call returns an error indicating a problem in processing. The CSI ListSnapshot call could be used along with snapshot_id filtering to determine if the snapshot is ready to use, but is not recommended because it provides no way to detect errors during processing (the ReadyToUse field simply remains false indefinitely).

The v1.x.x releases of the CSI external-snapshotter sidecar container already handle this change by calling CreateSnapshot instead of ListSnapshots to check if a snapshot is ready to use. When upgrading their drivers to CSI 1.0, driver maintainers should use the appropriate 1.0 compatible sidecar container.

To be consistent with the change in the CSI spec, the Ready field in the VolumeSnapshot API object has been renamed to ReadyToUse. This change is visible to the user when running kubectl describe volumesnapshot to view the details of a snapshot.

Timestamp Data Type

The creation time of a snapshot is available to Kubernetes admins as part of the VolumeSnapshotContent API object. This field is populated using the creation_time field in the CSI CreateSnapshotResponse. In CSI v1.0, this creation_time field type was changed to .google.protobuf.Timestamp instead of int64. When upgrading drivers to CSI 1.0, driver maintainers must make changes accordingly. The v1.x.x releases of the CSI external-snapshotter sidecar container has been updated to handle this change.

Deprecations

The following VolumeSnapshotClass parameters are deprecated and will be removed in a future release. They will be replaced with parameters listed in the Replacement section below.

Deprecated Replacement csiSnapshotterSecretName csi.storage.k8s.io/snapshotter-secret-name csiSnapshotterSecretNameSpace csi.storage.k8s.io/snapshotter-secret-namespace

New Features

SnapshotContent Deletion/Retain Policy

As described in the initial blog post announcing the snapshot alpha, the Kubernetes snapshot APIs are similar to the PV/PVC APIs: just like a volume is represented by a bound PVC and PV pair, a snapshot is represented by a bound VolumeSnapshot and VolumeSnapshotContent pair.

With PV/PVC pairs, when a user is done with a volume, they can delete the PVC. And the reclaim policy on the PV determines what happens to the PV (whether it is also deleted or retained).

In the initial alpha release, snapshots did not support the ability to specify a reclaim policy. Instead when a snapshot object was deleted it always resulted in the snapshot being deleted. In Kubernetes v1.13, a snapshot content DeletionPolicy was added. It enables an admin to configure what what happens to a VolumeSnapshotContent after the VolumeSnapshot object it is bound to is deleted. The DeletionPolicy of a volume snapshot can either be Retain or Delete. If the value is not specified, the default depends on whether the SnapshotContent object was created via static binding or dynamic provisioning.

Retain

The Retain policy allows for manual reclamation of the resource. If a VolumeSnapshotContent is statically created and bound, the default DeletionPolicy is Retain. When the VolumeSnapshot is deleted, the VolumeSnapshotContent continues to exist and the VolumeSnapshotContent is considered “released”. But it is not available for binding to other VolumeSnapshot objects because it contains data. It is up to an administrator to decide how to handle the remaining API object and resource cleanup.

Delete

A Delete policy enables automatic deletion of the bound VolumeSnapshotContent object from Kubernetes and the associated storage asset in the external infrastructure (such as an AWS EBS snapshot or GCE PD snapshot, etc.). Snapshots that are dynamically provisioned inherit the deletion policy of their VolumeSnapshotClass, which defaults to Delete. The administrator should configure the VolumeSnapshotClass with the desired retention policy. The policy may be changed for individual VolumeSnapshotContent after it is created by patching the object.

The following example demonstrates how to check the deletion policy of a dynamically provisioned VolumeSnapshotContent.

$ kubectl create -f ./examples/kubernetes/demo-defaultsnapshotclass.yaml
$ kubectl create -f ./examples/kubernetes/demo-snapshot.yaml
$ kubectl get volumesnapshots demo-snapshot-podpvc -o yaml
apiVersion: snapshot.storage.k8s.io/v1alpha1
kind: VolumeSnapshot
metadata:
  creationTimestamp: "2018-11-27T23:57:09Z"
...
spec:
  snapshotClassName: default-snapshot-class
  snapshotContentName: snapcontent-26cd0db3-f2a0-11e8-8be6-42010a800002
  source:
    apiGroup: null
    kind: PersistentVolumeClaim
    name: podpvc
status:
…
$ kubectl get volumesnapshotcontent snapcontent-26cd0db3-f2a0-11e8-8be6-42010a800002 -o yaml
apiVersion: snapshot.storage.k8s.io/v1alpha1
kind: VolumeSnapshotContent
…
spec:
  csiVolumeSnapshotSource:
    creationTime: 1546469777852000000
    driver: pd.csi.storage.gke.io
    restoreSize: 6442450944
    snapshotHandle: projects/jing-k8s-dev/global/snapshots/snapshot-26cd0db3-f2a0-11e8-8be6-42010a800002
  deletionPolicy: Delete
  persistentVolumeRef:
    apiVersion: v1
    kind: PersistentVolume
    name: pvc-853622a4-f28b-11e8-8be6-42010a800002
    resourceVersion: "21117"
    uid: ae400e9f-f28b-11e8-8be6-42010a800002
  snapshotClassName: default-snapshot-class
  volumeSnapshotRef:
    apiVersion: snapshot.storage.k8s.io/v1alpha1
    kind: VolumeSnapshot
    name: demo-snapshot-podpvc
    namespace: default
    resourceVersion: "6948065"
    uid: 26cd0db3-f2a0-11e8-8be6-42010a800002

User can change the deletion policy by using patch:

$ kubectl patch volumesnapshotcontent snapcontent-26cd0db3-f2a0-11e8-8be6-42010a800002 -p '{"spec":{"deletionPolicy":"Retain"}}' --type=merge

$ kubectl get volumesnapshotcontent snapcontent-26cd0db3-f2a0-11e8-8be6-42010a800002 -o yaml
apiVersion: snapshot.storage.k8s.io/v1alpha1
kind: VolumeSnapshotContent
...
spec:
  csiVolumeSnapshotSource:
...
  deletionPolicy: Retain
  persistentVolumeRef:
    apiVersion: v1
    kind: PersistentVolume
    name: pvc-853622a4-f28b-11e8-8be6-42010a800002
...

Snapshot Object in Use Protection

The purpose of the Snapshot Object in Use Protection feature is to ensure that in-use snapshot API objects are not removed from the system (as this may result in data loss). There are two cases that require “in-use” protection:

  1. If a volume snapshot is in active use by a persistent volume claim as a source to create a volume.
  2. If a VolumeSnapshotContent API object is bound to a VolumeSnapshot API object, the content object is considered in use.

If a user deletes a VolumeSnapshot API object in active use by a PVC, the VolumeSnapshot object is not removed immediately. Instead, removal of the VolumeSnapshot object is postponed until the VolumeSnapshot is no longer actively used by any PVCs. Similarly, if an admin deletes a VolumeSnapshotContent that is bound to a VolumeSnapshot, the VolumeSnapshotContent is not removed immediately. Instead, the VolumeSnapshotContent removal is postponed until the VolumeSnapshotContent is not bound to the VolumeSnapshot object.

Which volume plugins support Kubernetes Snapshots?

Snapshots are only supported for CSI drivers (not for in-tree or Flexvolume). To use the Kubernetes snapshots feature, ensure that a CSI Driver that implements snapshots is deployed on your cluster.

As of the publishing of this blog post, the following CSI drivers support snapshots:

Snapshot support for other drivers is pending, and should be available soon. Read the “Container Storage Interface (CSI) for Kubernetes GA” blog post to learn more about CSI and how to deploy CSI drivers.

What’s next?

Depending on feedback and adoption, the Kubernetes team plans to push the CSI Snapshot implementation to beta in either 1.15 or 1.16. Some of the features we are interested in supporting include consistency groups, application consistent snapshots, workload quiescing, in-place restores, and more.

How can I learn more?

The code repository for snapshot APIs and controller is here: https://github.com/kubernetes-csi/external-snapshotter

Check out additional documentation on the snapshot feature here: http://k8s.io/docs/concepts/storage/volume-snapshots and https://kubernetes-csi.github.io/docs/

How do I get involved?

This project, like all of Kubernetes, is the result of hard work by many contributors from diverse backgrounds working together.

Special thanks to all the contributors that helped add CSI v1.0 support and improve the snapshot feature in this release, including Saad Ali (saadali), Michelle Au (msau42), Deep Debroy (ddebroy), James DeFelice (jdef), John Griffith (j-griffith), Julian Hjortshoj (julian-hj), Tim Hockin (thockin), Patrick Ohly (pohly), Luis Pabon (lpabon), Cheng Xing (verult), Jing Xu (jingxu97), Shiwei Xu (wackxu), Xing Yang (xing-yang), Jie Yu (jieyu), David Zhu (davidz627).

Those interested in getting involved with the design and development of CSI or any part of the Kubernetes Storage system, join the Kubernetes Storage Special Interest Group (SIG). We’re rapidly growing and always welcome new contributors.

We also hold regular SIG-Storage Snapshot Working Group meetings. New attendees are welcome to join for design and development discussions.

2018.12.05

新贡献者工作坊上海站

作者: Josh Berkus (红帽), Yang Li (The Plant), Puja Abbassi (Giant Swarm), XiangPeng Zhao (中兴通讯)

KubeCon 上海站新贡献者峰会与会者,摄影:Jerry Zhang

KubeCon 上海站新贡献者峰会与会者,摄影:Jerry Zhang

最近,在中国的首次 KubeCon 上,我们完成了在中国的首次新贡献者峰会。看到所有中国和亚洲的开发者(以及来自世界各地的一些人)有兴趣成为贡献者,这令人非常兴奋。在长达一天的课程中,他们了解了如何、为什么以及在何处为 Kubernetes 作出贡献,创建了 PR,参加了贡献者圆桌讨论,并签署了他们的 CLA。

这是我们的第二届新贡献者工作坊(NCW),它由前一次贡献者体验 SIG 成员创建和领导的哥本哈根研讨会延伸而来。根据受众情况,本次活动采用了中英文两种语言,充分利用了 CNCF 赞助的一流的同声传译服务。同样,NCW 团队由社区成员组成,既有说英语的,也有说汉语的:Yang Li、XiangPeng Zhao、Puja Abbassi、Noah Abrahams、Tim Pepper、Zach Corleissen、Sen Lu 和 Josh Berkus。除了演讲和帮助学员外,团队的双语成员还将所有幻灯片翻译成了中文。共有五十一名学员参加。

Noah Abrahams 讲解 Kubernetes 沟通渠道。摄影:Jerry Zhang

Noah Abrahams 讲解 Kubernetes 沟通渠道。摄影:Jerry Zhang

NCW 让参与者完成了为 Kubernetes 作出贡献的各个阶段,从决定在哪里作出贡献开始,接着介绍了 SIG 系统和我们的代码仓库结构。我们还有来自文档和测试基础设施领域的「客座讲者」,他们负责讲解有关的贡献。最后,我们在创建 issue、提交并批准 PR 的实践练习后,结束了工作坊。

这些实践练习使用一个名为贡献者游乐场的代码仓库,由贡献者体验 SIG 创建,让新贡献者尝试在一个 Kubernetes 仓库中执行各种操作。它修改了 Prow 和 Tide 自动化,使用与真实代码仓库类似的 Owners 文件。这可以让学员了解为我们的仓库做出贡献的有关机制,同时又不妨碍正常的开发流程。

Yang Li 讲到如何让你的 PR 通过评审。摄影:Josh Berkus

Yang Li 讲到如何让你的 PR 通过评审。摄影:Josh Berkus

「防火长城」和语言障碍都使得在中国为 Kubernetes 作出贡献变得困难。而且,中国的开源商业模式并不成熟,员工在开源项目上工作的时间有限。

中国工程师渴望参与 Kubernetes 的研发,但他们中的许多人不知道从何处开始,因为 Kubernetes 是一个如此庞大的项目。通过本次工作坊,我们希望帮助那些想要参与贡献的人,不论他们希望修复他们遇到的一些错误、改进或本地化文档,或者他们需要在工作中用到 Kubernetes。我们很高兴看到越来越多的中国贡献者在过去几年里加入社区,我们也希望将来可以看到更多。

「我已经参与了 Kubernetes 社区大约三年」,XiangPeng Zhao 说,「在社区,我注意到越来越多的中国开发者表现出对 Kubernetes 贡献的兴趣。但是,开始为这样一个项目做贡献并不容易。我尽力帮助那些我在社区遇到的人,但是,我认为可能仍有一些新的贡献者离开社区,因为他们在遇到麻烦时不知道从哪里获得帮助。幸运的是,社区在 KubeCon 哥本哈根站发起了 NCW,并在 KubeCon 上海站举办了第二届。我很高兴受到 Josh Berkus 的邀请,帮助组织这个工作坊。在工作坊期间,我当面见到了社区里的朋友,在练习中指导了与会者,等等。所有这些对我来说都是难忘的经历。作为有着多年贡献者经验的我,也学习到了很多。我希望几年前我开始为 Kubernetes 做贡献时参加过这样的工作坊」。

贡献者圆桌讨论。摄影:Jerry Zhang

贡献者圆桌讨论。摄影:Jerry Zhang

工作坊以现有贡献者圆桌讨论结束,嘉宾包括 Lucas Käldström、Janet Kuo、Da Ma、Pengfei Ni、Zefeng Wang 和 Chao Xu。这场圆桌讨论旨在让新的和现有的贡献者了解一些最活跃的贡献者和维护者的幕后日常工作,不论他们来自中国还是世界各地。嘉宾们讨论了从哪里开始贡献者的旅程,以及如何与评审者和维护者进行互动。他们进一步探讨了在中国参与贡献的主要问题,并向与会者预告了在 Kubernetes 的未来版本中可以期待的令人兴奋的功能。

工作坊结束后,XiangPeng Zhao 和一些与会者就他们的经历在微信和 Twitter 上进行了交谈。他们很高兴参加了 NCW,并就改进工作坊提出了一些建议。一位名叫 Mohammad 的与会者说:「我在工作坊上玩得很开心,学习了参与 k8s 贡献的整个过程。」另一位与会者 Jie Jia 说:「工作坊非常精彩。它系统地解释了如何为 Kubernetes 做出贡献。即使参与者之前对此一无所知,他(她)也可以理解这个过程。对于那些已经是贡献者的人,他们也可以学习到新东西。此外,我还可以在工作坊上结识来自国内外的新朋友。真是棒极了!」

贡献者体验 SIG 将继续在未来的 KubeCon 上举办新贡献者工作坊,包括西雅图站、巴塞罗那站,然后在 2019 年六月回到上海。如果你今年未能参加,请在未来的 KubeCon 上注册。并且,如果你遇到工作坊的与会者,请务必欢迎他们加入社区。

链接:

  • Jan 1
  • Jan 1