现场活动
9月26日至28日
了解更多并注册参加

Sched应用程式允许您创建日程,但不能替代您的活动注册。您必须先注册KubeCon + CloudNativeCon + Open Source Summit China 2023 才能参加会议。如果您还未注册但希望加入我们,请前往活动注册页面购买注册。

请注意:此日程以中国标准时间(UTC +8)自动显示。若要查看您首选时区的日程,请从右侧顶部的"Timezone"下拉菜单选择首选时区。日程可能会有变动,并且会议席位按照先到先得的原则提供。

In-person
September 26-28
Learn More and Register to Attend

The Sched app allows you to build your schedule but is not a substitute for your event registration. You must be registered for KubeCon + CloudNativeCon + Open Source Summit China to participate in the sessions. If you have not registered but would like to join us, please go to the event registration page to purchase a registration.

Please note: This schedule is automatically displayed in China Standard Time (UTC +8). To see the schedule in your preferred timezone, please select from the drop-down menu to the right, above "Filter by Date." The schedule is subject to change and session seating is available on a first-come, first-served basis. 
Thursday, September 28
 

8:00am CST

9:00am CST

主论坛演讲:开场致辞 | Keynote: Opening Remarks
Speakers

Kevin Wang

CNCF Ambassador, TOC contributor, Kubernetes emeritus Maintainer, Founder and Maintainer of multiple CNCF projects, Lead of Cloud Native Open Source Team at Huawei, Huawei
Kevin Wang has been an outstanding contributor in the CNCF community since its beginning and is the leader of the cloud native open source team at Huawei. Kevin has contributed critical enhancements to Kubernetes, led the incubation of the KubeEdge, Volcano, Karmada projects in CNCF...

Fog Dong

Senior Engineer, BentoML
Fog Dong is a senior engineer at BentoML, a maintainer of KubeVela, and a CNCF ambassador. With a strong focus on cloud native DevOps, Fog actively contributes to the open source community. Currently, she is dedicated to developing the fastest approach to building AI applications...


Thursday September 28, 2023 9:00am - 9:05am CST
3层 301明珠厅| 3F The Pearl Hall 301

9:05am CST

主论坛演讲:中国规模的云原生 | Keynote: Cloud Native at China Scale - Chris Aniszczyk, CTO, Cloud Native Computing Foundation
中国在云原生生态系统方面的贡献正在改变世界和天空,从帮助欧洲核子研究中心揭示宇宙的构成和运行方式,到为外太空卫星提供动力。请加入云原生计算基金会(CNCF)的首席技术官克里斯·安尼兹奇克(Chris Aniszczyk),探索中国如何通过一支充满承诺的维护者和贡献者社区,推动云原生技术的界限。

China’s contributions to the cloud native ecosystem are changing the world and the skies beyond, from helping CERN to uncover what the universe is made of and how it works, to powering satellites in outer space. Join CNCF’s CTO Chris Aniszczyk to explore the ways China is pushing the boundaries of cloud native technology, driven by a committed community of maintainers and contributors.

Speakers

Chris Aniszczyk

CTO, Linux Foundation (CNCF)
Chris Aniszczyk is an open source executive and engineer with a passion for building a better world through open collaboration. He's currently a CTO at the Linux Foundation focused on developer relations and running the Open Container Initiative (OCI) / Cloud Native Computing Foundation...


Thursday September 28, 2023 9:05am - 9:15am CST
3层 301明珠厅| 3F The Pearl Hall 301

9:15am CST

主论坛演讲:选择你的冒险之旅:通往生产环境的险象环生之路 | Keynote: Choose Your Own Adventure: The Perilous Passage to Production - Whitney Lee, Staff Technical Advocate, VMware & Viktor Farcic, Developer Advocate, Upbound
我们的英雄,一款在Kubernetes开发环境中运行的应用程序,知道他们注定要做更伟大的事情!他们渴望生活在生产环境中,为最终用户提供服务!然而,从开发到生产的旅程充满了关于集群提供、GitOps和应用程序配置的系统设计选择。而且谁知道在暗中潜伏着什么看不见的力量!一步错,可能会带来灾难性的后果。

由你们社区来引导我们的英雄,帮助他们从一个存储库中的容器镜像成长为最终形态:一个在生产环境中运行的应用程序。在他们第二次的KubeCon '选择你自己的冒险'风格的演讲中,Whitney和Viktor将呈现一个拟人化应用程序必须在寻找通往生产环境的路上做出的选择。在演示过程中,社区将决定我们的英雄应用程序的路线!在会话时间耗尽之前,我们能够导航CNCF项目并避免死胡同,将我们的应用程序推向生产环境吗?

Our hero, a running application in a Kubernetes development environment, knows that they are destined for greater things! They long to be living in production, serving end users! However, the journey from dev to prod is hard, filled with system design choices concerning cluster provisioning, GitOps, and app configuration. And who knows what unseen forces lurk in the shadows! One wrong step could be catastrophic.

It is up to you, the community, to guide our hero and help them grow from a container image in a registry to their final form: an app running in production. In their second KubeCon 'Choose Your Own Adventure'-style talk, Whitney and Viktor will present choices that an anthropomorphized app must make as they try to find their way to production. Throughout the presentation, the community will decide our hero app's path! Can we navigate CNCF projects and avoid dead-ends to get our app to production before the session time elapses?

Speakers

Viktor Farcic

Upbound
Viktor Farcic is a Developer Advocate at Upbound, a member of the Google Developer Experts and CD Foundation groups, and a published author. His big passions are DevOps, Containers, Kubernetes, Microservices, Continuous Integration, Delivery and Deployment (CI/CD) and Test-Driven...

Whitney Lee

Staff Technical Advocate, VMware
Whitney is a lovable goofball who enjoys understanding and using tools in the cloud native landscape. Creative and driven, Whitney recently pivoted from an art-related career to one in tech. Last fall at KubeCon, Whitney co-presented a silly-yet-informative keynote about platform...


Thursday September 28, 2023 9:15am - 9:30am CST
3层 301明珠厅| 3F The Pearl Hall 301

9:30am CST

赞助主论坛演讲: 开放创新,加速共建智能世界云底座 | Sponsored Keynote: Open Innovation, Accelerating the Co-construction of the Intelligent World Cloud Foundation - Alfred Huang, General Manager of Cloud Native Services, Huawei Cloud
随着行业智能化进程加速,企业云原生平台面临许多新的挑战。本次分享将探讨华为云在新时代下云原生领域的实践探索和技术创新进展,并探讨如何通过开源社区持续与伙伴、用户推动创新,加速共建智能世界。

As industries accelerate their adoption of intelligent technologies, enterprise cloud-native platforms face many new challenges. This talk will explore Huawei Cloud's practices and technical innovations in the cloud-native field in this new era, and discuss how to keep driving innovation with partners and users through the open-source community to accelerate building the intelligent world together.

Speakers

Alfred Huang

General Manager of Cloud Native Services, Huawei
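As the General Manager of Cloud Native Services at Huawei Cloud, Alfred is responsible for the research and development, competitiveness building, and business success of multiple cloud native services, including Cloud Container Engine, serverless containers, service mesh, and distributed cloud native.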


Thursday September 28, 2023 9:30am - 9:35am CST
3层 301明珠厅| 3F The Pearl Hall 301

9:35am CST

赞助主论坛演讲: 与亚马逊云科技和Kubernetes社区一起加速全球创新 | Sponsored Keynote: Scaling Global Innovation with AWS and the Kubernetes Community - Nathan Taber, Head of Product, Kubernetes, AWS
Nathan Taber,AWS Kubernetes产品负责人,加入我们来了解亚马逊云科技和Kubernetes如何推动创新以及新技术,如机器学习和人工智能,以帮助客户在中国和全球更快地开展业务。

Nathan Taber, AWS Head of Product for Kubernetes, joins us to highlight how AWS and Kubernetes are driving innovation and new technologies like Machine Learning and Artificial Intelligence to help customers move faster in China, and around the world.

Speakers

Nathan Taber

Sr. Product Manager, Amazon
Nathan is a Sr. Product Manager on the AWS Kubernetes team. Nathan has been part of the launch teams for several AWS container services and currently helps to set the vision and direction for Amazon Elastic Kubernetes Service, AWS’ managed Kubernetes service. He works closely with...


Thursday September 28, 2023 9:35am - 9:40am CST
3层 301明珠厅| 3F The Pearl Hall 301

9:40am CST

主论坛演讲: 华为终端云服务大规模云原生平台工程实践 | Keynote: Cloud Native Platform Engineering for Huawei Mobile Services at Scale - Kevin Wang, Lead of Cloud Native Open Source, Huawei
华为终端云服务(Huawei Mobile Services, HMS)是华为智能终端设备的"大脑",为终端用户提供云服务支持。 为了支撑海量数据的快速增长和多样化场景的需求,终端云服务基于华为云进行全面的云原生改造,将业务全部搬迁到华为云上,并构建面向移动开发的平台工程。 本次议题中,Kevin将分享: 1. 华为终端云服务向云原生演进的关键挑战, 2. 终端云服务构建大规模云原生平台的实践, 3. 云原生平台未来展望。

Huawei Mobile Services (HMS) is the "brain" of Huawei smart devices, providing cloud service support for consumers and developers. To support rapidly growing data volumes and diverse scenario requirements, Huawei Mobile Services has undergone a comprehensive cloud native transformation on Huawei Cloud, moving all of its services to Huawei Cloud and building a platform engineering practice for mobile development. In this talk, Kevin will share: 1. The key challenges in Huawei Mobile Services' evolution to cloud native, 2. The practice of building a large-scale cloud native platform for Huawei Mobile Services, 3. The future prospects of the cloud native platform.

Speakers

Kevin Wang

CNCF Ambassador, TOC contributor, Kubernetes emeritus Maintainer, Founder and Maintainer of multiple CNCF projects, Lead of Cloud Native Open Source Team at Huawei, Huawei


Thursday September 28, 2023 9:40am - 9:55am CST
3层 301明珠厅| 3F The Pearl Hall 301

9:55am CST

主论坛演讲:代码生成模型的预训练和微调 | Keynote: Pre-training and Fine-tuning of Code Generation Models - Loubna Ben-Allal, Machine Learning Engineer, Hugging Face
基于代码训练的大型语言模型在代码补全和从自然语言描述中合成代码方面展现出了非凡的能力。在这次演讲中,我们将探讨构建和训练类似StarCoder的大型代码模型的幕后过程,这是一个跨足80多种编程语言的强大的150亿参数的代码生成模型,并且还融入了负责任的人工智能实践。此外,我们还将讨论如何使用开源库,包括transformers、datasets和PEFT,来利用这些模型。

Large Language Models trained on code have showcased remarkable abilities in code completion and synthesis from natural language descriptions. In this talk, we'll explore the behind-the-scenes process of building and training large code models like StarCoder, a robust 15B Code Generation model trained across 80+ programming languages, while also incorporating responsible AI practices. Additionally, we'll discuss how to leverage these models using open-source libraries like transformers and peft, and how to efficiently deploy them.

Speakers

Loubna Ben-Allal

Machine Learning Engineer, Hugging Face
Loubna Ben Allal is a Machine Learning Engineer at Hugging Face in the Science team. She specializes in the training and evaluation of Large Language Models, in particular for Code. She's also co-author of The Stack and StarCoder models for code generation.



Thursday September 28, 2023 9:55am - 10:10am CST
3层 301明珠厅| 3F The Pearl Hall 301

10:10am CST

主论坛演讲: 闭幕词 | Keynote: Closing Remarks
Speakers

Kevin Wang

CNCF Ambassador, TOC contributor, Kubernetes emeritus Maintainer, Founder and Maintainer of multiple CNCF projects, Lead of Cloud Native Open Source Team at Huawei, Huawei

Fog Dong

Senior Engineer, BentoML


Thursday September 28, 2023 10:10am - 10:30am CST
3层 301明珠厅| 3F The Pearl Hall 301

10:30am CST

茶歇 | Coffee Break ☕
Thursday September 28, 2023 10:30am - 11:00am CST
1层展厅|1F Exhibition Hall

10:30am CST

女性赋能交流会 | EmpowerUs
我们很高兴邀请在KubeCon + CloudNativeCon + Open Source Summit的参会者中认同为女性、非二元性别个体或盟友的人士参加这个早晨的交流活动,与其他参会者一起开放讨论我们快速发展的生态系统中的挑战、领导创新和赋权。

Attendees who identify as women, non-binary individuals, or allies at KubeCon + CloudNativeCon + Open Source Summit are invited to join this morning networking break to have open discussions with fellow attendees about challenges, leadership, innovation, and empowerment in our fast-growing ecosystem.

Thursday September 28, 2023 10:30am - 11:30am CST
3夹层 3M2会议室 | 3M Room 3M2

10:30am CST

项目展馆 | Project Pavilion
Attending in-person? Swing by the Project Pavilion located inside the Solutions Showcase in 1F Exhibition Hall to connect with project maintainers and learn more about each project, ask questions or exchange ideas. 

现场参加活动?请来到位于1层展厅的解决方案展示区的项目展馆,与项目维护者互动,了解更多关于项目的信息,提问或者交流想法。

Visit the Project Engagement website for more information.

Thursday September 28, 2023 10:30am - 12:30pm CST
1层展厅|1F Exhibition Hall

10:30am CST

解决方案展示 | Solutions Showcase
请访问我们在解决方案展示区的赞助商,尝试最新的演示,观看现场演示,与专家交谈,了解工作机会,并获得一些赠品。
Visit our sponsors in the Solutions Showcase to try the latest demos, watch live presentations, talk to experts, check out job opportunities, and score some swag.

为了促进活动中的网络和业务关系,您可以选择访问第三方的展位或者获取赞助内容。我们不会强制要求您参观第三方展位或获取赞助内容。当您访问展位或参与赞助活动时,第三方将收到您的一些注册数据。这些数据包括您的名字、姓氏、职位、公司、地址、电子邮件、标准人口统计问题(例如工作职能、行业)以及您与赞助内容或资源互动的详细信息。如果您选择与展位互动或获取赞助内容,您明确同意第三方接收和使用此类数据,这将受到他们自己的隐私政策的约束。

In order to facilitate networking and business relationships at the event, you may choose to visit a third party’s booth or to access sponsored content. You are never required to visit third-party booths or to access sponsored content. When visiting a booth or participating in sponsored activities, the third party will receive some of your registration data. This data includes your first name, last name, title, company, address, email, standard demographics questions (e.g. job function, industry), and details about the sponsored content or resources you interacted with. If you choose to interact with a booth or access sponsored content, you are explicitly consenting to receipt and use of such data by the third-party recipients, which will be subject to their own privacy policies.

Thursday September 28, 2023 10:30am - 2:00pm CST
1层展厅|1F Exhibition Hall

11:00am CST

Chaos Mesh:概述、实践与未来 | Chaos Mesh: Overview, Practice and Future - Zhou Zhiqiang, Individual; Cwen Yin, PingCAP; Xianglin Gao, Tencent
加入我们,深入探讨Chaos Mesh,这是一款终极的开源混沌工程工具。在本次演讲中,我们将提供对Chaos Mesh的深入概述,包括其实际应用和充满前景的未来。了解Chaos Mesh如何赋予工程师在生产环境中创建可控混乱实验、发现漏洞并增强系统可靠性的能力。全面了解Chaos Mesh的架构、与Kubernetes的集成以及其独特的特性,使其与众不同。我们还将展示真实世界的示例和最佳实践,展示使用Chaos Mesh的实际方面。准备好探索Chaos Mesh为构建弹性和可靠系统提供的无限可能性。不要错过这个理解可控混乱的力量以及其对系统弹性的影响的机会,与Chaos Mesh一起。

Join us as we delve into Chaos Mesh, the ultimate open-source tool for chaos engineering. In this presentation, we will provide an insightful overview of Chaos Mesh, its practical applications, and its promising future. Discover how Chaos Mesh empowers engineers to create controlled chaos experiments in production environments, uncover vulnerabilities, and enhance system reliability. Gain a solid understanding of Chaos Mesh's architecture, integration with Kubernetes, and its unique features that set it apart. We will also demonstrate real-world examples and best practices, showcasing the practical aspects of using Chaos Mesh. Prepare to explore the limitless possibilities that Chaos Mesh offers for building resilient and dependable systems. Don't miss this opportunity to understand the power of controlled chaos and its impact on system resilience with Chaos Mesh.

Speakers

Cwen Yin

Tech Lead, PingCAP
Cwen Yin is a software developer and the tech leader of the Chaos Engineering Team at PingCAP. Having previously led the development of the stability test framework for TiDB, a distributed database, he now focuses on the exploration and implementation of chaos engineering. He loves open source...

Xianglin Gao

Expert Engineer, Tencent
Xianglin is an Expert Engineer currently working at Tencent. He brings a wealth of experience in the fields of Chaos Engineering and Kubernetes Cluster Management. In addition, he is a committer to the Chaos Mesh project and a member of Kubernetes, demonstrating his deep involvement...

Zhou Zhiqiang

Software Engineer, Individual
Zhou Zhiqiang is a Chaos Mesh maintainer.


Thursday September 28, 2023 11:00am - 11:35am CST
3夹层 3M5A会议室 | 3M Room 3M5A

11:00am CST

Longhorn:介绍、深入探讨和问答 | Longhorn: Intro, Deep Dive and Q+A - David Ko & Shuo Wu, SUSE
Longhorn是一个基于Kubernetes构建的云原生分布式块存储解决方案,用于在Kubernetes上运行。Longhorn旨在提供一种支持各种存储接口(包括块、文件系统和即将推出的对象网关)的主观解决方案。它设计有数据和操作服务,易于使用,并可在任何地方运行。在本次演讲中,我们将介绍Longhorn,讨论当前状态,介绍最新发布的突出项目,如新的v2数据引擎,分享技术设计和架构,未来路线图,并与观众进行深入讨论。Longhorn于2021年11月被云原生计算基金会接受为孵化项目。

Longhorn is a cloud-native distributed block storage solution built on and for Kubernetes. Longhorn aims to provide an opinionated solution that supports various storage interfaces, including block, file system, and an upcoming object gateway. It is designed with data and operation services, is easy to use, and runs anywhere. In this talk, we will introduce Longhorn, talk about its current status, present the highlights of the recent release, such as the new v2 data engine, share the technical design and architecture and the future roadmap, and have a deep-dive discussion with the audience. Longhorn was accepted as an incubating project by the Cloud Native Computing Foundation in November 2021.

Speakers

David Ko

Senior Engineering Manager, SUSE
David Ko, a senior engineering manager at SUSE, is currently leading the Longhorn project (CNCF incubating) and is primarily dedicated to open-source development. David is not just a project/product/team/people manager, but also a hands-on developer and architect with 10+ years of...

Shuo Wu

Senior Software Engineer, SUSE


Thursday September 28, 2023 11:00am - 11:35am CST
3夹层 3M5B会议室 | 3M Room 3M5B

11:00am CST

导航Kubernetes云提供商 | Navigating Kubernetes Cloud Provider - Pengfei Ni, Microsoft
让我们讨论一下Kubernetes云提供商是如何随着时间的推移发展的,以及您现在为了实现目标需要了解的内容。对云提供商内部的详细了解始于对四个控制器(节点、节点生命周期、路由和服务)的深入探索。 从内部云提供商迁移的关键概念将涵盖何时以及如何运行云控制器管理器,包括Kubelet镜像凭据提供程序和CSI驱动程序的外部迁移路径。了解如何构建自己的CCM将为您实现最佳结果提供一个框架。 外部迁移的最佳实践将包括最新更新的所有建议。您将了解Kubernetes如何与云提供商进行交互,以及如何在没有停机时间的情况下迁移到外部云控制器管理器。最后,我们将介绍如何与Kubernetes云提供商社区进行连接,您将获得满足云提供商需求的见解。

Let's discuss how Kubernetes Cloud Provider has evolved over time and what you need to know right now for your goals. A detailed look at cloud provider internals starts with an in-depth exploration of the four controllers (Node, Node Lifecycle, Route, and Service). Key concepts for migration from the in-tree cloud provider will cover when and how to run the cloud controller manager, including out-of-tree migration paths for the Kubelet image credential provider and the CSI drivers. Looking at how to build your own CCM will give you a framework for achieving optimal results. Best practices for out-of-tree migration will include all the latest recommendations in light of recent updates. You'll see how Kubernetes interacts with cloud providers and how to migrate to an external cloud controller manager without downtime. Wrapping up with a look at how you can connect with the Kubernetes Cloud Provider community, you will leave with insights into how to meet your cloud provider needs.

Speakers

Pengfei Ni

Principal Software Engineer, Microsoft
Pengfei Ni is a Principal Software Engineer at Microsoft Azure and a maintainer of the Kubernetes project. With extensive experience in Cloud Computing, Kubernetes, and Software Defined Networking (SDN), he has delivered presentations at various conferences, including KubeCon, ArchSummit...



Thursday September 28, 2023 11:00am - 11:35am CST
3夹层 3M3会议室 | 3M Room 3M3

11:00am CST

云原生技术与文化背景:跨境最大化业务价值 | Cloud Native Technology and Cultural Context: Maximizing Business Value Across Borders - Katerina Arzhayev, SUSE
在当今全球经济中,了解不同文化背景对新技术的采用和认知的影响至关重要。云原生技术为企业提供了创新、可扩展性和成本降低等重要优势。然而,这些技术的商业价值在不同文化和地区可能存在差异。 文化对风险和创新的态度影响了云原生技术的采用。有些文化可能更注重稳定性和可预测性,而不是创新和敏捷性。监管环境、市场条件和文化对技术和创新的态度等因素都会影响长期战略方向。 在本次演讲中,我们将讨论如何有效地向具有不同文化背景的利益相关者、高管和投资者传达云原生技术的优势。

In today's global economy, it's crucial to understand how different cultural contexts impact the adoption and perception of new technologies. Cloud native technology offers significant benefits for businesses, such as innovation, scalability, and cost reduction. However, the business value of these technologies may vary across different cultures and regions. Cultural attitudes towards risk and innovation impact the adoption of cloud native technologies. Some cultures may prioritize stability and predictability over innovation and agility. Factors such as regulatory environments, market conditions, and cultural attitudes towards technology and innovation all affect long-term strategic direction. In this talk, we will discuss how to effectively communicate the benefits of cloud native technologies to stakeholders, executives, and investors with different cultural backgrounds.

Speakers

Katerina Arzhayev

Principal Product Manager, SUSE
Katerina Arzhayev is experienced in cross-cultural collaboration and technology strategy. She has a proven track record of driving business results through effective communication and strategic planning. Katerina's expertise lies in making highly complicated topics accessible to non-technical...



Thursday September 28, 2023 11:00am - 11:35am CST
2层 会议室 3 | 2F Room 3
  云原生体验 | Cloud Native Experience

11:00am CST

Carvel:云计算问题的清洁工具 | Carvel: Clean Tools for Cloudy Problems - Leigh Capili, VMware
UNIX计算机使用起来很美观。 对于你可能遇到的每一个小问题,都有一个命令行工具可以用来解决特定的子问题。 相比之下,云原生系统可能会很混乱、复杂且难以理解。 Carvel项目将UNIX哲学引入云原生领域。 它是一个开源的套件,由一系列小型、单一用途的工具组成,每个工具都能很好地完成一件事情。 需要处理镜像摘要吗?Carvel有一个专门的工具。 想要将不同仓库的依赖关系粘合在一起吗?"vendir"会为你解决! 有太多的YAML文件吗? Carvel的ytt非常擅长解决YAML问题,无论是来自Kubernetes、GitHub Actions、Docker Swarm还是其他任何地方。 Carvel的各个工具并不强制你使用一个大工作流程,它们理解你需要以自己的方式解决问题。 快来学习如何使用Carvel工具来解决你云工作流程中的痛点吧!

UNIX computers feel beautiful to use. For every little issue you might run into, there's a command-line tool that you can use to solve that specific sub-problem. In contrast, cloud-native systems can be messy, complex, and difficult to comprehend. The Carvel project brings the UNIX philosophy to the cloud-native landscape. It's an open-source suite of small, single-purpose tools that do one thing well. Working with image digests? Carvel has a tool for that. Do you want to glue some dependencies from different repos together? "vendir" will take care of you! Do you have too much YAML? Carvel's ytt is really good at solving YAML problems -- regardless of whether they are from Kubernetes, GitHub Actions, Docker Swarm, or anything else. Rather than forcing you into one big workflow, Carvel's individual tools understand that you need to solve problems your way. Come learn how you can use the Carvel tools to solve the pain points in your cloud workflow!

Speakers

Leigh Capili

Staff Developer Advocate, VMware
Leigh is an empathetic speaker and developer with niches in cloud-native systems and security. He authored kubeadm’s etcd mTLS implementation and Flux 2’s multi-tenant security model. Leigh works with the VMware Tanzu Advocacy team and previously built Developer Experience and...


Thursday September 28, 2023 11:00am - 11:35am CST
3层 305A会议室| 3F Room 305A
  云原生新手 | Cloud Native Novice

11:00am CST

平台工程的普及趋势将如何重塑开发者体验? | How will the trend towards platform engineering adoption reshape the developer experience? - Chris Yang, vivo

随着IT行业及大环境的快速多变,平台工程在多种因素促进下持续火热,从开发人员、运维人员到技术Leader,从业务到平台,从大型科技企业到中小组织,甚至到产品经理,都在关注和讨论。

经过50多年的发展,软件工程行业来到了一个跃迁边缘,平台工程即将成为这种临界态的推动因素之一。回归到软件开发本质论的角度思考,软件的本质目的是解决问题和提供价值,而软件开发活动则是由人组织各项资源来实施。那么,在我们希望提升开发效率和质量的时候,人就成为需要考虑的第一要素,而平台工程给开发者体验带来了一次绝佳的重塑机会,以至于有观点表示 “这是开发者最好的时代!”
  1. 平台工程现状概览 :关键事件与时间轴
  2. 平台工程的 WHAT 和 WHY
  3. 平台工程与开发者体验
  4. 重塑开发者体验:Developer First | Developer Control Plane | Developer Journey
  5. 当前挑战与未来趋势


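As the IT industry and its wider environment change rapidly, platform engineering continues to gain momentum, drawing attention and discussion from developers, operators, and technical leaders, from business teams to platform teams, from large technology companies to small and mid-sized organizations, and even from product managers.

After more than 50 years of development, the software engineering industry has reached a critical turning point, and platform engineering is set to become one of the driving factors. Returning to the essence of software development, the purpose of software is to solve problems and provide value, and software development activities are carried out by people organizing various resources. Therefore, when we want to improve development efficiency and quality, people become the first element to consider, and platform engineering provides an excellent opportunity to reshape the developer experience, so much so that some say "this is the best of times for developers!"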

1. Platform engineering status overview: key events and timeline in China
2. The WHAT and WHY of platform engineering
3. Platform engineering and the developer experience
4. Reshaping the developer experience: Developer First | Developer Control Plane | Developer Journey
5. Current challenges and future trends

Speakers

Chris Yang

R&D Director, vivo
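Chris Yang (杨振涛) is R&D Director at vivo Internet. He currently focuses on R&D management, technical leadership, software process and DevOps, open source governance, and building technical communities and engineering culture. He is an observer, practitioner, and advocate of developer experience and platform engineering. Across multiple fields, he has accumulated 15...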



Thursday September 28, 2023 11:00am - 11:35am CST
2层 会议室 2 | 2F Room 2
  平台工程 | Platform Engineering

11:00am CST

基于SPDK的Ublk和Vduse的用户空间块服务 | Userspace Block Services Based on Ublk and Vduse in SPDK - Liu Xiaodong & Changpeng Liu, Intel
Ublk和Vduse是新兴的Linux内核框架,允许用户空间驱动程序高效地向内核公开块设备。SPDK可以利用这些框架提供具有最小开销和高性能的存储服务,以便本地提供容器工作负载。 本次会议将介绍如何启用SPDK以基于vduse和ublk实现用户空间块服务。将解释ublk和vduse概念上的相似之处和差异。还将演示这些方法与传统方法相比的优势和性能。最后,我们将分享使用SPDK工具和库开发和评估这些块服务的经验和挑战。

Ublk and Vduse are emerging Linux kernel frameworks that allow userspace drivers to expose block devices to the kernel efficiently. SPDK can use these frameworks to provide storage services with minimal overhead and high performance to serve container workloads locally. This session will go over how to enable SPDK to implement userspace block services based on vduse and ublk. The conceptual similarities and differences between ublk and vduse will be explained. It will also demonstrate the benefits and performance of these approaches compared to traditional methods. Finally, we will share our experience and challenges of developing and evaluating these block services using SPDK tools and libraries.

Speakers

Changpeng Liu

Cloud Software Engineer, Intel
Changpeng is a Cloud Software Engineer at Intel. He has been working on the Storage Performance Development Kit (SPDK) since 2014. Currently, Changpeng is a core maintainer of SPDK. His areas of expertise include NVMe, I/O Virtualization, and storage offload on IPU.

Xiaodong Liu

Senior Cloud Engineer, Intel
Xiaodong is a senior cloud software engineer at Intel. He works in the areas of cloud native storage, storage acceleration, storage protocols and storage virtualization, mainly contributing to the Storage Performance Development Kit (SPDK) and Intel Intelligent Storage Acceleration Library (ISA-L...



Thursday September 28, 2023 11:00am - 11:35am CST
3层 302会议室| 3F Room 302
  操作系统 | Operating Systems

11:00am CST

基于WebAssembly的FaaS框架,具备分布式机器学习能力 | WebAssembly-Based FaaS Framework with Distributed Machine Learning Capabilities - Wilson Wang, ByteDance & Michael Yuan, Second State
我们的目标是创建一个利用WebAssembly的FaaS平台,这是一种安全且轻量级的技术,用于机器学习任务,特别是推理作业。为了实现这一目标,我们正在将WebAssembly与Ray集成,Ray是一个广泛使用的用于扩展人工智能和Python应用程序的框架,以创建一个具有分布式机器学习能力的强大FaaS平台。 Ray的高级功能,如分布式调度、对象存储和任务间通信,使其成为一个优秀的ML-enabled FaaS平台选择,它统一了资源抽象,并消除了不同FaaS函数之间的障碍。通过将WebAssembly与Ray集成,我们可以使任务更加轻量级,并扩展Ray对编程语言(如Rust、Go和JavaScript)的支持,以简化现有应用程序的移植过程。

Our goal is to create a FaaS platform that utilizes WebAssembly, a secure and lightweight technology, for Machine Learning tasks, specifically inference jobs. To achieve this, we are integrating WebAssembly with Ray, a widely-used framework for scaling AI and Python applications, to create a powerful FaaS platform with distributed Machine Learning capabilities. Ray's advanced features, such as distributed scheduling, object storage, and inter-task communication, make it an excellent choice for an ML-enabled FaaS platform that unifies resource abstraction and eliminates barriers between different FaaS functions. By integrating WebAssembly with Ray, we can make tasks even more lightweight, and expand Ray's support for programming languages, such as Rust, Go, and JavaScript, to simplify the process of porting existing applications.

Speakers

Michael Yuan

Co-founder of Second State and maintainer of WasmEdge, Second State
Dr. Michael Yuan is a maintainer of WasmEdge Runtime (a project under CNCF) and a co-founder of Second State. He is the author of 5 books on software engineering published by Addison-Wesley, Prentice-Hall, and O'Reilly. Michael is a long-time open-source developer and contributor...

Wilson Wang

Research Engineer, ByteDance
Wilson Wang is a Research Engineer at ByteDance. His research areas include Virtual Machines, Network Virtualization, WebAssembly, and Operating Systems.



Thursday September 28, 2023 11:00am - 11:35am CST
3层 305B会议室| 3F Room 305B
  新兴和先进技术 | Emerging + Advanced

11:00am CST

开源开发的未来导航:机遇、风险、最佳实践与GAI | Navigating the Future of Open Source Development: Opportunities, Risks, Best Practices with GAI - Anni Lai, Futurewei
随着人工智能的发展,软件开发的格局也在不断变化。这一转变为编程带来了新的前沿:生成式人工智能(GAI)。随着我们进入这个新时代,探索和理解伴随这一革命性工具而来的机遇和风险对于开源社区来说至关重要。本次会议旨在揭示将GAI整合到开源开发中的复杂性,深入探讨在开源开发社区中负责任且有效地融入GAI的潜在优势、风险和可能的最佳实践。让我们一起开启一场关于AGI时代下开源软件开发未来的重要对话。在探索如何利用人工智能的力量为开源社区服务的同时,共同应对潜在的陷阱,推动一个公平、创新和包容的未来。

As AI evolves, so does the landscape of software development. This transformation has opened the door to a new frontier in programming: Generative AI (GAI). As we move into this new era, it's important to explore and understand both the opportunities and risks that accompany this revolutionary tool, specifically in the context of the open-source community. This session aims to unravel the complexities of integrating GAI into open-source development, providing an in-depth exploration of the potential advantages, risks, and possible best practices for incorporating GAI responsibly and effectively within the open-source development community. Join us as we start a critical conversation on the future of open-source software development in the age of AGI. Together, let's explore how we can harness the power of AI for the open-source community while navigating potential pitfalls to promote a fair, innovative and inclusive future.

Speakers

Anni Lai

Head of Open Source Operations & Marketing, www.futurewei.com
Anni drives Futurewei’s open source (O.S.) governance, process, compliance, training, project alignment, and ecosystem building. Anni has a long history of serving on various O.S. boards such as OpenStack Foundation, LF CNCF, LF OCI, LF Edge, and is on the LF OMF board and LF Europe...


Thursday September 28, 2023 11:00am - 11:35am CST
3层 307会议室| 3F Room 307

11:00am CST

揭秘KubeEdge的非侵入式服务访问流量闭环 | Unveiling Non-Invasive Service Access Traffic Closure with KubeEdge - Xu Shiwei, Huawei
在边缘计算场景中,边缘节点通常分布在不同的地理区域。来自不同区域的网络没有互联。KubeEdge是基于Kubernetes构建的,将本地容器化应用编排和设备管理扩展到边缘主机。在Kubernetes中,服务和端点事件被分派到所有节点,消耗大量带宽。这对于带宽较低的边缘场景不适用。KubeEdge提供了边缘节点分组的能力,允许服务根据集群的节点拓扑路由流量。它还引入了面向边缘的工作负载类型,解决了跨区域应用部署中的操作复杂性。 本次分享会将涵盖以下主题: 1. KubeEdge中节点分组的实现 2. KubeEdge中面向边缘的工作负载 3. KubeEdge中流量闭环的实现

In edge computing scenarios, edge nodes are typically distributed across different geographical regions, and the networks in different regions are not interconnected. KubeEdge is built upon Kubernetes and extends native containerized application orchestration and device management to hosts at the edge. In Kubernetes, service and endpoint events are dispatched to all nodes, consuming a significant amount of bandwidth. This is not suitable for edge scenarios with low bandwidth. KubeEdge provides edge node grouping, allowing services to route traffic based on the node topology of the cluster. It also introduces edge-oriented workload types, addressing the operational complexity of cross-regional application deployment. This session will cover the following topics: 1. Implementation of Node Grouping in KubeEdge 2. Edge-Oriented Workloads in KubeEdge 3. Implementation of Traffic Closed-Loop in KubeEdge

Speakers

Xu Shiwei

Senior Engineer, Huawei Cloud Computing Technologies Co., Ltd.
Shiwei Xu is a Senior Engineer at Huawei Cloud. He is mainly responsible for the design and development of Huawei Cloud's native intelligent edge platform. With rich experience in open-source communities and commercial implementations in areas such as cloud-native and edge computin... Read More →



Thursday September 28, 2023 11:00am - 11:35am CST
2层 会议室 4 | 2F Room 4
  网络+边缘+电信 | Networking + Edge + Telco

11:00am CST

通过containerd和Kata Containers接口演进提高节点稳定性 | Enhancing Node Stability Through Containerd and Kata Containers Interface Evolution - Chao Wu, Alibaba Cloud & Peng Tao, Ant Group
containerd和Kata Containers,两者都是被命名为容器运行时项目,但实际上它们位于软件堆栈的不同层次。虽然这两个项目已经共同发展了五年多,但容器运行时这个术语常常让很多人感到困惑。 本次演讲将探讨这两个项目的细节,它们的区别以及它们如何共同工作。特别是,我们将详细介绍它们之间的接口演变,从shim-v1到shim-v2,再到正在进行中的sandbox API。我们将解释每个接口变化,并展示为什么引入它,以及它如何导致两个项目的架构变化,从而实现更快的容器运行时,并不时提高节点的稳定性。 最后,我们将展示containerd sandbox API和Kata Containers内置的sandbox功能的最新改进,进一步展示这两个项目如何共同创新,并为容器生态系统带来价值。

containerd and Kata Containers are both called container runtime projects, but they actually sit at different layers of the software stack. While the two projects have been evolving together for more than five years, the term container runtime is often confusing to many. This talk will explore the details of the two projects, how they differ, and how they work together. In particular, we'll take a look at the evolution of the interface between them, from shim-v1 to shim-v2, and on to the ongoing sandbox APIs. We'll explain each of the interface changes and show why it was introduced and how it resulted in architectural changes in both projects, which in turn resulted in faster container runtimes and improved node stability over time. In the end, we'll demonstrate the latest improvements with the containerd sandbox API and Kata Containers' built-in sandbox feature, which further shows how the two projects are innovating together and bringing value to the container ecosystem.

Speakers

Peng Tao

Staff Engineer, Ant Group
Kata Containers architecture committee member, Nydus maintainer, and Linux kernel developer.

Chao Wu

Senior Software Engineer, Alibaba Cloud
Chao Wu is a senior software engineer at Alibaba Cloud. He loves contributing to the open source community and is a maintainer of the Kata Containers project. He loves to learn everything about the container and Kubernetes ecosystem.



Thursday September 28, 2023 11:00am - 11:35am CST
2层 会议室 1 | 2F Room 1
  运维+性能 | Operations + Performance

11:50am CST

Crossplane介绍和深入剖析 - 云原生控制平面框架 | Crossplane Intro and Deep Dive - The Cloud Native Control Plane Framework - Ying Mo, IBM
Crossplane的维护者,一个CNCF孵化项目,将主持这个会议,向新的参与者介绍该项目,并深入探讨Crossplane的功能和路线图的细节。我们将解释Crossplane如何使您能够将云基础设施和服务组合成自定义平台API,并介绍如何最好地开始构建自己的平台。 我们将介绍最新版本中包含的关键功能,以及它们解决的问题和使用案例,以及如何将它们应用到您的控制平面中。最后,将有一个互动机会与维护者交流,提问,并影响项目未来的方向。

The maintainers of Crossplane, a CNCF Incubating project, will lead this session that will introduce the project to new attendees, as well as dive into the finer details of Crossplane’s functionality and roadmap. We will explain how Crossplane enables you to compose cloud infrastructure and services into your custom platform APIs, and how best to get started building a platform of your own. We will take a tour through the key features included in the latest releases, what problems and use cases they are solving, and how you can adopt them into your control planes. Finally, there will be an interactive opportunity to engage with the maintainers, ask questions, and influence the future direction of the project.

Speakers

Ying Mo

Senior Software Engineer, IBM
Ying Mo is a Senior Software Engineer at IBM, working on IBM Cloud Pak for AIOps, focusing on multi-cloud management and monitoring using Kubernetes and container technology. He is always enthusiastic about bringing innovative ideas into products by leveraging open source technology... Read More →



Thursday September 28, 2023 11:50am - 12:25pm CST
3夹层 3M5A会议室 | 3M Room 3M5A
  Maintainer Track, Crossplane

11:50am CST

Kubernetes SIG节点介绍和深入探讨 | Kubernetes SIG Node Intro and Deep Dive - Paco Xu, DaoCloud & Xiongxiong Yuan, Gitlab China
这对于 Kubernetes SIG Node 来说是令人兴奋的时刻。来参加我们的维护者跟踪会议,了解刚发布的 Kubernetes 1.28 版本,其中包含了令人兴奋的改进,并对 SIG Node 的路线图有所了解。 SIG Node 拥有控制 Pod 与主机资源之间交互的组件,包括 Kubelet、容器运行时接口(CRI)和节点 API。SIG Node 负责 Pod 的生命周期,从分配到拆除,包括活力检查和共享资源管理。我们与各种容器运行时、内核、网络、存储等进行合作;任何 Pod 接触到的东西都是 SIG Node 的责任! 我们将讨论 kubelet 如何处理 Pod 的生命周期,包括探针和钩子,以及节点和 Pod 的优雅关闭,以及许多其他改进。 加入这个会议,了解更多关于我们的 SIG 的信息,以及如何参与进来,使 Node 变得更好!

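This session focuses on a Kubelet deep dive and will cover:
1. Pod lifecycle: PLEG
2. Eviction and node shutdown: graceful shutdown, recovery after power loss
3. Resource management: cgroup v2, swap, VPA, DRA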

These are exciting times for Kubernetes SIG Node. Come to our maintainers' track session to learn about the just-released version 1.28 of Kubernetes, full of exciting improvements, and get a glance into the SIG Node roadmap. SIG Node owns components that control interactions between pods and host resources, including the Kubelet, Container Runtime Interface (CRI), and Node API. SIG Node is responsible for the Pod’s lifecycle from allocation to teardown, including liveness checks and shared resource management. We work with the various container runtimes, kernels, networking, storage, and more; anything a pod touches is SIG Node’s responsibility! We will discuss how the kubelet handles the pod lifecycle, including probes and hooks, how nodes and pods shut down gracefully, as well as many other improvements. Join this session to learn more about our SIG, and how you might get involved to make Node even better!

Speakers

徐俊杰 Paco

Lead of Open Source Team, DaoCloud
Paco is a kubeadm maintainer and an active Kubernetes contributor, and he mainly works on SIG-Node & SIG-Cli/SIG-Testing. Paco is currently the leader of the open-source team in DaoCloud, KCD Chengdu 2022 organizer, and a speaker in KCD Shanghai, KubeCon EU 2023, and KubeCon China... Read More →

Xiongxiong Yuan

Beijing, JiHu(GitLab)
I am an open-source enthusiast who has contributed extensively to the cloud-native field, including projects like Kubernetes (k8s), Helm, and Istio. Currently, I am a member of the Kubernetes community and a maintainer of Helm. I am also the maintainer of Helmfile.



Thursday September 28, 2023 11:50am - 12:25pm CST
3夹层 3M3会议室 | 3M Room 3M3
  Maintainer Track, SIG Node

11:50am CST

Kubernetes 文档和本地化 | Kubernetes Documentation and Localization - Michael Yao, DaoCloud & Xin Li, Qihoo 360
本次会议将概述创建、构建、本地化和维护Kubernetes网站的过程。它为新贡献者提供了有关入门和熟悉文档样式指南的有益指导。与会者还将了解GitOps工作流程以及如何与其他对云原生技术感兴趣的人建立联系。通过遵循这些提示,您可以无缝地融入Kubernetes社区并为这个重要的开源项目做出贡献。 本次会议涵盖以下主题: - k8s网站的构建方式 - k8s网站包含的内容和可用的语言 - 如何为k8s文档和本地化做出贡献 - k8s网站的贡献者及其贡献分析

This session offers an overview of the process for creating, building, localizing, and maintaining the Kubernetes website. It provides helpful guidance for new contributors on getting started and becoming familiar with the documentation style guide. Attendees will also learn about the GitOps workflow and how to connect with others who share an interest in cloud-native technologies. By following these tips, you can seamlessly integrate into the Kubernetes community and make contributions to this important open-source project. The session covers the following topics: - How the k8s website is built - What content is included in the k8s website and which languages are available - How to contribute to k8s docs and localization - Contributors to k8s website and analysis of their contributions

Speakers

Michael Yao

TW Lead, DaoCloud
Michael Yao is a seasoned technical writer and a docs maintainer of k8s and Istio. He is the TW Lead at DaoCloud. He contributed to guide new contributors on k8s and Istio style guides, GitOps workflows, community code of conduct, and other essential information to ensure successful... Read More →

李鑫 Xin Li

Senior Engineer of Server Development, Qihoo360
Li Xin is an experienced senior backend R&D engineer working at Qihoo360, where his team provides high-speed and reliable infrastructure for Qihoo360's brain. He focuses on the convergence of cloud native and HPC, and is currently an approver of the Volcano project. He is also passionate... Read More →



Thursday September 28, 2023 11:50am - 12:25pm CST
2层 会议室 3 | 2F Room 3
  云原生体验 | Cloud Native Experience

11:50am CST

从新手到贡献者:在Kubernetes和CNCF开源项目中留下自己的印记 | From Novice to Contributor: Making Your Mark in Kubernetes and CNCF Open Source Projects - Yuan Chen, Apple Inc.
Kubernetes开源贡献者,来自苹果公司的软件工程师陈源,将介绍如何在Kubernetes和CNCF开源项目中做贡献。陈源将分享他的个人经验,通过实际例子给开源新手提供一份全面的开源贡献路线图,包括基本流程,提交issues和PRs,以及有效的沟通和讨论等。演讲将讨论开源新手可能面临的挑战和解决问题的策略,还将建议如何在本职工作和开源贡献之间取得更好的平衡。这个演讲旨在给开源新手提供信息,知识和信心,使他们能够对CNCF开源项目做出贡献。

Yuan Chen, an active Kubernetes open source contributor from Apple, will guide open source novices on a journey to make their initial contributions in the world of Kubernetes and CNCF projects. Drawing from his personal experience, Yuan will provide a comprehensive roadmap, offering a step-by-step walkthrough on filing issues, submitting pull requests, engaging in fruitful discussions, and navigating the review process. Yuan will address the potential challenges that may arise along the open source path and share effective strategies for conflict resolution. Additionally, he will provide invaluable insights into time management, empowering individuals to strike a harmonious balance between personal/work commitments and their open source endeavors. This talk is designed to empower open source novices, equipping them with the knowledge and confidence to make their initial and impactful contributions that truly count within the CNCF community.

Speakers

Yuan Chen

Software Engineer, Apple Inc.
Yuan Chen is a Software Engineer at Apple Cloud Services, contributing to the development of Apple's Kubernetes infrastructure since 2019. With extensive experience, he has made significant contributions to the Kubernetes community and delivered 9 talks at KubeCon. Yuan's background... Read More →



Thursday September 28, 2023 11:50am - 12:25pm CST
3层 305A会议室| 3F Room 305A
  云原生新手 | Cloud Native Novice

11:50am CST

使用可插拔和可定制的智能运行时提升工作负载的QoS | Enhance Workload QoS with Pluggable and Customizable Smarter Runtimes - Rougang Han, Alibaba & Kang Zhang, Intel
随着云基础业务类型和硬件资源的日益丰富,数据中心的资源利用率得到了显著提高,但也带来了资源争用的风险。在提高节点资源利用率的同时,确保应用程序的QoS(服务质量)而不妥协,并避免嘈杂的邻居问题是一个关键挑战。本主题介绍并演示了如何通过在Kubernetes上集成最新的CRI运行时(例如containerd、cri-o)中的NRI(节点资源接口),在Koordinator中确保工作负载的QoS。与传统方法(如独立和运行时代理拦截CRI请求)相比,NRI的插件化设计避免了对Kubelet的侵入性修改,极大增强了Koordinator部署的灵活性,实时处理Pod生命周期事件,并为云原生系统提供了一种优雅且标准化的资源管理解决方案。

As cloud-based business types and hardware resources become increasingly abundant, the resource utilization of data centers has improved dramatically, but this also brings the risk of resource contention. While enhancing the utilization of node resources, one of the critical challenges is to ensure application QoS (Quality of Service) without compromise and avoid the noisy neighbor problem. This talk introduces and demonstrates how to ensure workload QoS in Koordinator, an open-source scheduling system on Kubernetes, by incorporating the latest NRI (Node Resource Interface) in CRI runtimes (e.g. containerd, cri-o). Compared to traditional approaches, such as a standalone mode or a runtime proxy that intercepts CRI requests, the plugin-based design of NRI avoids invasive modifications to the Kubelet, greatly enhances the flexibility of Koordinator deployment, processes pod lifecycle events in real time, and provides an elegant and standardized resource management solution for cloud-native systems.

Speakers

Kang Zhang

Cloud Software Engineer, Intel
Kang Zhang holds a master's degree from Tongji University. He is a Cloud Software Engineer at Intel with 5 years of experience in the cloud storage field, and previously worked at Dell EMC and Morgan Stanley.

Rougang Han

Software Engineer, Alibaba Cloud
Rougang Han is an engineer at Alibaba Cloud and a @koordinator-sh member, focusing on scheduling and resource management in large-scale clusters.



Thursday September 28, 2023 11:50am - 12:25pm CST
3层 301明珠厅| 3F The Pearl Hall 301
  平台工程 | Platform Engineering

11:50am CST

QuarkContainers:用于无服务器的高性能安全容器 | QuarkContainers: High Performance Secure Container for Serverless - Shaobao Feng, Huawei Cloud
当前的Runc容器运行时无法满足无服务器计算的要求: 1. 容器应该具有强大的隔离性。 2. 开销应该足够小,以支持在单个主机上运行数千个实例。 3. 启动时间应小于100毫秒。 4. 性能降低,特别是IO和网络,应该足够小以被忽略。 在本次会议中,我们将介绍如何对Quark容器上的安全容器进行性能增强: 1. 如何通过rust在KVM中实现应用内核以及它带来的好处。 2. 如何使用io_uring和RDMA修复IO和网络性能降低的缺陷。 3. 如何通过“休眠”加速容器的启动,该方法通过停止vCPU并交换页面来实现。 4. 如何通过将Quark集成到Kuasar中来移除shim进程。

The current container runtime, runc, cannot meet the requirements of serverless computing: 1. Containers should be strongly isolated. 2. The overhead should be small enough to support running thousands of instances on a single host. 3. The startup time should be less than 100ms. 4. The performance degradation, especially for IO and network, should be small enough to be negligible. In this session we will introduce the performance enhancements we made for secure containers in Quark Container, which is an open source secure container: 1. How to implement an application kernel on KVM in Rust, and the benefits it brings. 2. How io_uring and RDMA fix the IO and network performance degradation. 3. How to accelerate container startup by "hibernating", which is implemented by stopping the vCPU and swapping out the pages. 4. How to remove the shim processes by integrating Quark into Kuasar.

Speakers

Shaobao Feng

Principal Engineer, Huawei Cloud
Shaobao is a Principal Engineer at Huawei Cloud, with his work focusing on serverless platforms. He has been a leader in building the secure container runtime of the first Serverless Kubernetes on public cloud. He is the main code contributor and maintainer of the open source... Read More →


Thursday September 28, 2023 11:50am - 12:25pm CST
3层 302会议室| 3F Room 302
  操作系统 | Operating Systems

11:50am CST

与KubeEdge一起航行:通过统一数据接口标准驱动多机器人调度 | Sailing with KubeEdge: Driving Multi-Robot Scheduling with Unified Data Interface Standard - Heng Zhang, Huawei
中国移动机器人产业联盟是由350多个组织组成的,于今年5月发布了AGV/AMR的统一数据接口标准。基于之前在机器人定向监控和多机器人协作方面的工作,我们正在努力在KubeEdge社区内实施这一标准。因此,我们开发了第一个采用这一标准的云原生开源多机器人调度系统。我们相信这项研究值得作为主论坛演讲展示,展示了社区在工业机器人领域推进云原生方法的承诺。 在本次会议中,与会者将会: 1. 了解这一标准以及社区如何推动其在开源生态系统中的采用。 2. 探索一个使用案例,展示我们如何利用云原生技术提高工业场景中的生产效率。 3. 了解行业标准制定者与开源社区之间的合作。

The Chinese Mobile Robot Industry Alliance, consisting of over 350 organizations, released a unified data interface standard for AGV/AMR this May. Building on our previous work in robot-oriented monitoring and multi-robot cooperation, we are working on implementing this standard within the KubeEdge community. As a result, we have developed the first cloud-native open-source multi-robot scheduling system that adopts this standard. We believe this research deserves to be presented as a keynote, showcasing the community's commitment to advancing cloud-native approaches in the industrial robotics domain. In this session, attendees will: 1. Learn about the standard and how the community is driving its adoption within the open-source ecosystem. 2. Explore a use case that shows how we leverage cloud-native technologies to improve production efficiency in industrial scenarios. 3. Understand the collaboration between the standard setters in the industry and the open-source community.

Speakers

Heng Zhang

Senior Engineer, Huawei Cloud Computing Technology Co., Ltd.
Heng Zhang received the B.E. degree and Ph.D. degree in mechanical engineering from Southeast University in 2016, and Shanghai Jiao Tong University, Shanghai in 2022, respectively. He joined Huawei as a senior engineer in 2022. He published over 10 scientific papers in top robotic... Read More →


Thursday September 28, 2023 11:50am - 12:25pm CST
3层 305B会议室| 3F Room 305B

11:50am CST

开放钱包:为什么世界需要一个开放的钱包,以及它应该如何实现? | OpenWallet: Why Does the World Need an Open Wallet & How Should It Be? - Wenjing Chu, Futurewei & Daniel Goldscheider, OpenWallet Foundation
OpenWallet基金会(OWF)是由全球领导者于2023年组建的最热门的新开源社区,通过基于标准的开源组件的协作,为数字钱包技术设定最佳实践。数字钱包是我们个人和集体都至关重要的新一代应用的关键所在。它需要一个开放和全球化的社区来推动这样一种技术基础设施的发展。OWF执行主任Daniel Goldscheider和OWF董事会成员兼技术咨询委员会成员Wenjing Chu将与中国的开源社区进行对话,向大家介绍OWF的全球使命、独特的治理结构、技术发展,并最重要的是倾听大家的反馈意见。加入Daniel和Wenjing,了解OpenWallet以及您在塑造我们的数字未来中可以发挥的作用。

The OpenWallet Foundation (OWF) is the hottest new open source community formed in 2023 by leaders around the world to set best practices for digital wallet technology through collaboration on standards-based open source components. The digital wallet holds the key to a new generation of applications that are critical to us, both individually and collectively. It requires an open and global community to drive the development of such a technology infrastructure. The OWF Executive Director Daniel Goldscheider and OWF Board Member and TAC Member Wenjing Chu will lead a conversation with the open source community in China, give the latest update on OWF's global mission, its unique governance structure, and tech developments, and, most importantly, listen to your feedback. Join Daniel and Wenjing to learn about OpenWallet and the role you can play in shaping our digital future.

Speakers

Wenjing Chu

Senior Director of Technology Strategy, Futurewei Technologies, Inc.
Wenjing is a senior director of technology strategy at Futurewei leading initiatives on trust in the future of computing. He is a Steering Committee member of the Trust over IP (ToIP) Foundation and co-Chairs the TSP and AI & Metaverse task forces. He is a Board Member of the OpenWallet... Read More →

Daniel Goldscheider

Founder, OpenWallet Foundation
Open source software for interoperable wallets 


Thursday September 28, 2023 11:50am - 12:25pm CST
3层 307会议室| 3F Room 307

11:50am CST

Kubernetes 云边协作:优化大规模智能公园管理 | Kubernetes Cloud-Edge Collaboration: Streamlining Large-Scale Smart Park Management - Zhen Zhao, Sangfor & Linbo He, Alibaba Cloud
在这个演示中,我们将深入探讨物联网和5G驱动的云边协作场景,重点关注Kubernetes在大规模智能公园管理中的关键作用。 面对管理300多个智能公园中的10K+边缘设备的挑战,我们将讨论使用OpenYurt来巧妙解决不同环境中的个性化工作负载配置问题,解决由于物理网络隔离导致的流量闭环管理问题,克服阻碍边缘应用升级的弱连接问题,并减轻由众多边缘设备引起的带宽成本增加。这将实现各个公园的高效资源和边缘应用管理。 Sangfor和阿里巴巴云容器服务专家将分享他们的实践经验,共同揭示在云边协作架构中使用Kubernetes进行大规模智能公园管理的优化技术。

In this presentation, we'll delve into IoT and 5G-driven cloud-edge collaboration scenarios, focusing on the key role of Kubernetes in large-scale smart park management. Facing the challenge of managing 10K+ edge devices across 300+ smart parks, we'll discuss using OpenYurt to address personalized workload configurations in diverse environments, tackle traffic closed-loop management issues due to physical network isolation, overcome weak connections blocking edge application upgrades, and mitigate bandwidth cost increases caused by numerous edge devices. This leads to efficient resource and edge application management in various parks. Sangfor and Alibaba Cloud container service experts will share their hands-on experiences, jointly unveiling optimization techniques for large-scale smart park management using Kubernetes in a cloud-edge collaboration architecture.

Speakers

Linbo He

Software Engineer, Alibaba Cloud
I am a member of the Alibaba Cloud Container Service team and one of the founding contributors to the OpenYurt project. Since 2015, I have been actively engaged in the design, development, and open-source initiatives related to Kubernetes. I have taken on responsibilities in a variety... Read More →

Zhen Zhao

Senior Cloud Native Engineer, Sangfor
Zhen Zhao is a maintainer of the OpenYurt open source project with rich experience in cloud-native and edge computing, and is currently working as a Senior Cloud Native Engineer at Sangfor. Focuses on exploring and studying the latest developments and trends in cloud-native technologies, IoT, 5G, and... Read More →



Thursday September 28, 2023 11:50am - 12:25pm CST
2层 会议室 4 | 2F Room 4
  网络+边缘+电信 | Networking + Edge + Telco

11:50am CST

蚂蚁集团跨多个集群交付资源的自动化和无风险解决方案 | An Automated and Riskless Solution to Deliver Resources Across Multiple Clusters in Ant Group - Yikun Wang & Jun Zhang, Ant Group
部署的滚动更新是一个巧妙的设计,在Kubernetes生态系统中的一些项目也提供了交付能力。然而,在生产和金融场景中这些还不够。在蚂蚁集团,这些需求包括:使用git作为唯一的真相源;多维度和自适应策略;跨多个集群的编排;支持大规模集群;以及自动自愈的能力。 基于以上原因,蚂蚁集团提供了一套Rollout套件。开发人员在git中维护多维度策略。在提交提交时,Rollout和SelfHeal Controller将观察工作负载并触发渐进式交付,同时对失败的任务进行自愈。引入Controller Mesh的能力来支持大规模集群的水平扩展操作员。本演示将描述相关原因和架构,并为您提供所有核心能力的详细信息。

The Deployment's rolling update is a clever design, and some projects in the Kubernetes ecosystem also provide delivery capabilities. However, these are not enough in production and financial scenarios. At Ant Group, the requirements include: using git as the single source of truth; multi-dimensional and adaptive strategies; orchestration across multiple clusters; support for large-scale clusters; and the ability to automatically self-heal. For these reasons, Ant Group has built a Rollout suite. Developers maintain multi-dimensional strategies in git. When a commit is submitted, the Rollout and SelfHeal Controller will observe workloads and trigger a progressive delivery while also performing self-healing for failed tasks. Controller Mesh is introduced to scale operators horizontally in support of large-scale clusters. This presentation will describe the motivation and architecture, and provide you with detailed information on all core capabilities.

Speakers

Yikun Wang

Cloud Engineer, Ant Group
Yikun Wang is a Cloud Engineer at Ant Group. He mainly focuses on workload development and Cloud Native Risk Defense, and has experience managing cloud native applications in large-scale K8s clusters. He has recently focused on open source and is a member of KusionStack and OpenKruise.

Jun Zhang

Cloud Native Technical Expert, Ant Group
Jun Zhang is a Cloud Native Engineer at Ant Group who mainly focuses on risk-free workload delivery. He has rich experience in the stability and scalability of large-scale k8s clusters. He now focuses on open source and is a member of KusionStack and KubeWharf, and a core maintainer of... Read More →



Thursday September 28, 2023 11:50am - 12:25pm CST
2层 会议室 2 | 2F Room 2
  软件开发生命周期 | SDLC

11:50am CST

在Kubernetes上构建一个精细化和智能化的资源管理系统 | Building a Fine-Grained and Intelligent Resource Management System on Kubernetes - He Cao & Wei Shao, ByteDance
原生 Kubernetes 的资源管理能力有所局限:1. 静态的资源模型会导致节点的资源利用率较低,因为在线业务具有潮汐现象。2. 只支持申请整数个 GPU,在 AI 推理场景下会浪费大量昂贵的 GPU 资源。3. 原生的拓扑亲和策略只考虑了 NUMA 拓扑,难以满足搜索、推荐和 AI 大模型训练等业务对性能的要求。

在本次演讲中,曹贺和邵伟将介绍资源管理系统 Katalyst 及其在字节跳动的应用:1. 通过在离线混部提升资源利用率,并保障业务的 SLO 不受影响。2. 实现了 GPU 共享调度,支持 1% 算力粒度和 1 MiB 显存粒度的容器调度,从而提升了 AI 推理场景下的 GPU 利用率。3. 实现了拓扑感知调度,并扩展了 GPU 和 RDMA 在 PCIe Switch 级别的亲和策略,从而在分布式模型训练场景下可以使用 GPU DirectRDMA 技术来提升训练速度。4. 通过在线超分、规格推荐、潮汐混部等低使用门槛的措施提升资源效能。


The resource management capabilities of vanilla Kubernetes are limited:
  1. The static resource model leads to low resource utilization due to the tidal nature of online services. 
  2. Only full GPU requests are allowed, which causes huge GPU waste in AI inference scenarios. 
  3. The native micro-topology allocation strategy cannot meet the performance requirements of workloads such as search, recommendation, and large model training.

In this talk, He and Wei will introduce a resource management system, Katalyst, and its application in ByteDance:
  1. Colocate online services and offline jobs to improve resource utilization and ensure their SLOs. 
  2. Implement GPU-sharing scheduling, which allows requests of 1% granularity computing power and 1 MiB granularity GPU memory, to improve GPU utilization in AI inference scenarios. 
  3. Implement topology-aware scheduling and customize a strategy for GPU-RDMA affinity at the PCIe switch level, so GPUDirect RDMA can be used to accelerate distributed model training.
  4. Enhance resource efficiency through easily implementable methods such as node over-commitment, specification recommendation, and tidal colocation.

Speakers

Wei Shao

Senior Software Engineer, ByteDance
Wei Shao is a tech lead on the Orchestration & Scheduling team at ByteDance, and a maintainer of Katalyst. Wei has 5+ years of experience in the cloud native area, focusing on resource management in K8s. Wei led the development of Katalyst and the large-scale application of colocation... Read More →

He Cao

Senior Software Engineer, ByteDance
He Cao is a software engineer on the Cloud Native team at ByteDance, a maintainer of Katalyst and KubeZoo, and a member of Istio. He has 5+ years of experience in cloud native technologies, focusing on resource management and scheduling in Kubernetes. Since joining ByteDance, he has... Read More →



Thursday September 28, 2023 11:50am - 12:25pm CST
3夹层 3M5B会议室 | 3M Room 3M5B
  运维+性能 | Operations + Performance

11:50am CST

如何使用集群自动缩放器将批处理作业的节点扩展到2k个节点 | How We Scale up to 2k Nodes for Batch Jobs Using Cluster Autoscaler - Lei Qian, ByteDance
批处理作业具有批量创建和删除的特点,而云提供了强大的弹性。因此,批处理作业和云是完美的匹配。在云原生世界中,我们可以使用Kubernetes和集群自动缩放器来降低成本。但与微服务不同,批处理作业对集群的弹性要求更高,给集群自动缩放器带来了更多挑战。 在我们的场景中,用户将在短时间内创建多达16,000个Pod。当这批任务完成时,集群需要快速缩小。在本次演讲中,我们将分享在批量创建和删除场景中使用集群自动缩放器遇到的一些问题和解决方案。例如,为什么集群无法成功扩展,为什么Pod创建时间如此长,为什么空闲节点没有及时删除等等。通过解决这些问题,我们能够将集群扩展到2,000个节点。

Batch jobs are characterized by bulk creation and deletion, and the cloud provides strong elasticity. Therefore, batch jobs and the cloud make a perfect match. In the cloud-native world, we can use Kubernetes and cluster autoscaler to reduce costs. But unlike microservices, batch jobs have higher requirements for the elasticity of the cluster, posing more challenges to cluster autoscaler. In our scenario, users will create up to 16,000 pods within a short period. When this batch of tasks is completed, the cluster needs to be quickly scaled down. In this talk, we will share some issues and solutions encountered using cluster autoscaler in batch creation and deletion scenarios. For example, why the cluster is not successfully scaled up, why pod creation takes so much time, why idle nodes are not promptly deleted, and so on. By solving these issues, we are able to scale the cluster to 2,000 nodes in production.

Speakers

钱磊

Software Engineer, Volcano Engine
A Kubernetes developer at Volcano Engine, focused on building a stable Kubernetes engine on public cloud.



Thursday September 28, 2023 11:50am - 12:25pm CST
2层 会议室 1 | 2F Room 1
  运维+性能 | Operations + Performance

12:25pm CST

Lunch
Thursday September 28, 2023 12:25pm - 1:55pm CST
1层展厅|1F Exhibition Hall

1:55pm CST

为您自己的工作负载和流量定制渐进式交付 | Customize Progressive Delivery for Your Own Workload and Traffic - Zhang Zhen & Mingshan Zhao, Alibaba Cloud
OpenKruise Rollout是OpenKruise的一个新的子项目,旨在简化渐进式交付的过程。独特的非侵入性和Git-ops友好的设计极大地降低了采用成本。本次演讲探讨了OpenKruise Rollout的内部框架,以支持多个工作负载和流量管理系统(是的,您也可以逐步推出Daemonset和Statefulset!)。对于平台运营商,您将学习如何使用新的基于Lua的脚本系统轻松定制自己的流量实现。本次演讲适合对安全高效的应用程序发布感兴趣的任何人,或者因其CD流水线的独特性和限制而无法使用渐进式交付工具的任何人。

OpenKruise Rollout is a new sub-project of OpenKruise that aims to ease the process of progressive delivery. Its unique non-invasive and GitOps-friendly design greatly reduces the adoption cost. This talk explores the internal framework that lets OpenKruise Rollout support multiple workloads and traffic management systems (yes, you can roll out progressively for a DaemonSet and StatefulSet too!). For platform operators, you'll learn how to easily customize rollout for your own traffic implementation using a new Lua-based scripting system. This talk is for anyone who is interested in safe and efficient application rollout, or anyone who feels they can’t use progressive delivery tools because of the uniqueness and limitations of their CD pipelines.

Speakers

Zhen Zhang

Staff Engineer, Alibaba Cloud
Zhen Zhang has been working on the cluster management of software applications. He is driving new cloud native innovation at Alibaba and focuses mainly on the application management domain. He is one of the main maintainers of the OpenKruise project.

Mingshan Zhao

Senior R&D Engineer, Alibaba Cloud
Senior R&D Engineer of AliCloud, Maintainer of OpenKruise community, has long been engaged in the research and development of cloud native, containers, scheduling and other fields; core R&D member of Alibaba's one million container scheduling system, and many years of experience in... Read More →



Thursday September 28, 2023 1:55pm - 2:30pm CST
3夹层 3M5A会议室 | 3M Room 3M5A
  Maintainer Track, OpenKruise

1:55pm CST

云原生存储产品CubeFS,助力人工智能加速发展 | Cloud Native Storage CubeFS, Empowering AI Acceleration - Hu Yao, OPPO
CubeFS是一款新一代的云原生存储产品,由Cloud Native Computing Foundation(CNCF)托管,目前处于“孵化”阶段。它是中国开发者开发的第一个具备完整文件和对象存储能力的开源产品。在本次演讲中,我们将首先讨论存储的机遇和挑战,并以AIGC作为背景进行探讨。接下来,我们将介绍CubeFS的架构和开发,展示它作为智能数据湖基础在人工智能场景下的特定实践,以加速AI处理。最后,我们将一起探讨和讨论CubeFS的未来计划。

CubeFS is a new-generation cloud-native storage product, hosted by the Cloud Native Computing Foundation (CNCF) and currently in the "incubation" phase. It is the first open-source product with complete file and object storage capabilities developed by Chinese developers. In this talk, we will first discuss the opportunities and challenges of storage, using the hot topic of AIGC as a background. Next, we will introduce the architecture and development of CubeFS, showcasing its specific practices as an intelligent data lake base in AI scenarios to accelerate AI processing. Finally, we will explore and discuss the future plans of CubeFS together.

Speakers

Hu Yao

Storage Architect, OPPO
10 years of experience in storage development, responsible for launching and operating EB and storage products. Currently employed at OPPO, responsible for the design, development, and operation of CubeFS.



Thursday September 28, 2023 1:55pm - 2:30pm CST
3夹层 3M5B会议室 | 3M Room 3M5B

1:55pm CST

多云多集群HPA帮助携程集团应对业务低迷和快速恢复 | Multi-Cloud Multi-Cluster HPA Helps Trip.com Group Deal with Business Downturn and Rapid Recovery - Honghui Yue & Jingxue Li, Trip.com
随着携程集团旅行业务的快速恢复和发展,k8s集群的规模进一步扩大。为了提高效率和可用性,携程集团建立了新一代的多云多集群弹性平台。如何应对业务不确定性和快速增长?如何提高弹性平台的可用性和稳定性?如何降低成本?如何提高效率并减轻员工负担?这些都是携程集团面临的挑战。 在会议上,我们将介绍携程集团弹性平台的实际经验,特别是关于多集群水平Pod自动缩放。您可以了解到关于自动缩放、调度、效率和可用性的知识和启示。

With the rapid recovery and development of Trip.com Group's travel business, the scale of its k8s clusters has further expanded. To improve efficiency and availability, Trip.com Group has built a new-generation multi-cloud, multi-cluster elastic platform. How to deal with business uncertainty and rapid growth? How to improve the availability and stability of the elastic platform? How to reduce cost? How to raise efficiency and lighten the staff's burden? These are Trip.com Group's challenges. In this session, we will introduce Trip.com Group's practical experience with the elastic platform, especially multi-cluster horizontal pod autoscaling. You can gain knowledge and inspiration about autoscaling, scheduling, efficiency and availability.

Speakers

Jingxue Li

Expert Software Engineer, Trip.com Group
Expert software engineer on the Trip.com Group Container&Hybrid Cloud Team, focusing on multi-cluster and elastic scheduling.

Honghui Yue

Cloud Native R&D Director, Trip.com Group
Honghui is the leader of the Trip.com Group Container & Hybrid Cloud team, focusing on Container Infrastructure, Scheduling, and Hybrid Cloud. He used to work on the design and improvement of the Live&Ondemand Video CDN System at Youku Tudou for years. He has rich experiences in streaming... Read More →



Thursday September 28, 2023 1:55pm - 2:30pm CST
2层 会议室 3 | 2F Room 3
  云原生体验 | Cloud Native Experience

1:55pm CST

生产环境下的CNI实时迁移 | Live CNI Migration in Production Environments - Yike Wang and Yanzhao Li, VMware
CNI(容器网络接口)负责设置集群Pod网络,一些更全面的CNI解决方案还可以设置高级网络功能,如网络策略和kube-proxy替代。CNI Pod网络是L4服务和L7入口网络的基础。我们的客户希望将现有集群切换到另一个CNI,并且我们经常收到这样的客户请求。
在这个会话中,我们将介绍:
  1. 集群网络基础知识和CNI基础知识。
  2. 为什么在生产环境中总是需要进行CNI迁移,以及Tanzu项目如何管理多个CNI解决方案。
  3. 在现有集群上进行基本的CNI迁移方法以及在过程中对集群网络,包括Pod网络、L4服务和L7入口的停机时间的分析。
  4. 两种改进的方法,以克服流量停机时间,实现实时迁移。
  5. 我们在帮助客户在大规模集群中迁移CNI时所学到的经验教训。


CNI (Container Network Interface) is responsible for setting up cluster pod networking, and some more comprehensive CNI solutions can also set up advanced networking features such as network policy and kube-proxy replacement. CNI pod networking is the basis for L4 service and L7 ingress networking. Our customers often ask to switch their existing clusters to another CNI.
In this session, we’ll introduce:
  1. Cluster networking basics and CNI basics.
  2. Why CNI migration is always needed in production environments, and how the Tanzu project manages multiple CNI solutions.
  3. The basic in-place CNI migration method on an existing cluster, with an analysis of the downtime it causes for cluster networking, including pod networking, L4 service, and L7 ingress.
  4. Two improved methods that overcome the traffic downtime and achieve live migration.
  5. The lessons we learned when helping customers migrate CNI in large-scale clusters.

Speakers

Yike Wang

Staff Engineer, VMware
Yike Wang is a staff engineer in VMware. She is experienced at networking infrastructure like NSX-T and also Kubernetes networking. She’s been actively contributing to open source projects like cluster-api-provider-aws, and she has given talks in various conferences like Kubeon... Read More →

Yanzhao Li

Engineering Manager, VMware
Yanzhao Li is an engineering manager in VMware, leading the Tanzu Kubernetes Grid Integrated project. He has rich experience managing the lifecycle of Kubernetes. He’s interested in CNI, eBPF, cluster-api and has been actively involved in cloud native for more than 5 years.


Thursday September 28, 2023 1:55pm - 2:30pm CST
3层 307会议室| 3F Room 307

1:55pm CST

在特定平台上,开普勒准确吗? | Is Kepler Accurate on Specific Platforms? - Jie Ren & Ken Lu, Intel
Kepler(Kubernetes高效功率级别导出器)是一个CNCF沙盒项目,它使用eBPF来探测与云原生容器相关的能源统计信息,并将其导出为Prometheus指标。它支持功率比建模和功率估计建模。 目前,Kepler的测试用例,无论是单元测试还是集成测试,都是白盒导向的,重点是度量收集和导出代码逻辑检查。 在本次演讲中,我们将介绍一个平台验证框架,为Kepler的测试框架提供有益的补充,使开发人员和最终用户能够在特定硬件平台上验证项目功能。 供应商特定的验证案例可以集成到框架中。我们可以检查它们的测试报告,以查看供应商的平台在Kepler中是否得到良好支持,评估和识别平台差距和限制,特别是功率建模的准确性。 该框架是完全自动化的,具有平台无关性。

Kepler (Kubernetes Efficient Power Level Exporter) is a CNCF Sandbox project that uses eBPF to probe energy-related system statistics for cloud native containers and exports them as Prometheus metrics. It supports Power Ratio Modeling and Power Estimation Modeling. Currently, Kepler’s test cases, in both unit and integration tests, are white-box oriented and focus on checking the code logic of metrics collection and export. In this talk, we will introduce a platform validation framework that supplements Kepler’s test framework and enables developers and end users to validate the project's features on specific hardware platforms. Vendor-specific validation cases can be integrated into the framework. Their test reports can then be checked to see whether a vendor’s platform is well supported in Kepler, and to evaluate and identify platform gaps and limitations, especially in the accuracy of the power modeling. The framework is fully automated and runner-system agnostic.

Speakers

Ken Lu

Cloud Software Architect, Intel
System Architect for confidential computing, cloud native solutions and sustainable computing; Technology Enthusiast for many industry standards like UEFI, Intel TDX; Creator for many solutions like object analytics in ROS and cloud native inference pipeline for confidential and sustainable... Read More →

Jie Ren

Senior Cloud Software Engineer, Intel
Jie Ren graduated from Zhejiang University, Master of Computer Science. He has worked in telecommunication and semiconductor industry for nearly 20 years, career path covered ZTE/Cisco/Mavenir/Intel. Now he works in Intel as Senior Cloud Software Engineer, focusing on cloud native... Read More →



Thursday September 28, 2023 1:55pm - 2:30pm CST
3层 305A会议室| 3F Room 305A
  云原生新手 | Cloud Native Novice

1:55pm CST

Kubernetes 仙境:平台构建的冒险 | Kubernetes Wonderland: Adventures in Platform Building - Alexa Griffith, Bloomberg LP & Mauricio "Salaboy" Salatino, Diagrid
在一个Kubernetes集群上安装数百个工具只有在用户能够从中受益时才有意义。但是,推动工程和数据科学团队学习大量平台工具是否会提高生产力呢? 在这个演讲中,Alexa和Mauricio讨论了构建在Kubernetes之上的平台应该向用户展示的不同特性和能力。平台应该提供在Kubernetes之上的抽象,使这些能力更容易被使用,但是抽象Kubernetes的概念可能会很复杂。我们探讨了开发和面向机器学习(针对数据科学家)的平台之间的相似性和差异,以及它们各自的工具和工作流程。通过观察使用不同工具的不同团队,我们能学到什么? 这个演讲包括有趣的绘画和实时演示,展示如何使用开源CNCF工具(如Istio、KServe、Dapr、Knative和Crossplane等)构建一个平台。

Installing hundreds of tools on a Kubernetes cluster only makes sense if users can benefit from them. But does pushing your engineering and data science teams to learn a multitude of platform tools result in higher productivity? In this presentation, Alexa and Mauricio discuss different traits and capabilities that platforms built on top of Kubernetes should expose to their users. Platforms should expose abstractions on top of Kubernetes that make these capabilities easier to consume, but abstracting Kubernetes concepts can be complex. We explore the similarities and differences between development-focused and ML-focused (for data scientists) platforms, as well as their respective tools and workflows. What can we learn by looking at different teams who are using different tools? This talk includes fun drawings and a live demo to show how a platform can be constructed using open source CNCF tools (e.g., Istio, KServe, Dapr, Knative, and Crossplane, among others).

Speakers

Mauricio Salatino

OSS Software Engineer, Diagrid
Mauricio works as Open Source Software Engineer at @Diagrid, contributing to and driving initiatives for the Dapr OSS project. Mauricio also serves as a Steering Committee member for the Knative Project, and he is also Co-Leading the Knative Functions initiative. He is writing a book... Read More →

Alexa Nicole Griffith

Senior Software Engineer, Bloomberg LP
Alexa Griffith is a Senior Software Engineer on Bloomberg’s Cloud Native Compute Services organization. She works on building an inference platform for ML workflows and the open source project KServe. She enjoys solving engineering challenges at scale and writing code in Go. She... Read More →


Thursday September 28, 2023 1:55pm - 2:30pm CST
3层 301明珠厅| 3F The Pearl Hall 301
  平台工程 | Platform Engineering

1:55pm CST

在Kubernetes生产环境中的容器实时迁移 | Container Live Migration in Kubernetes Production Environment - Yenan Lang and Hua Liu, Tencent
这个演讲描述了一种容器实时迁移技术,它不需要对Kubernetes进行修改,并与不同版本的Kubernetes兼容。它已经在腾讯的大规模应用中成功实施。 容器实时迁移是一项技术,允许将正在运行的容器透明地迁移到其他节点。关于这项技术的讨论始于Kubernetes 1.0发布之前。由于这个主题的复杂性,讨论尚未产生明确的结论。 近年来,随着CRIU、runc和containerd等技术的发展,容器实时迁移逐渐变得可能。然而,将实时迁移整合到庞大的Kubernetes生态系统中仍然是一个挑战,特别是涉及容器网络和实时迁移性能方面。 在这个演讲中,我们将介绍在腾讯内部实施实时迁移技术的经验。

This presentation describes a container live migration technology that does not require modifications to Kubernetes and is compatible with different versions of Kubernetes. It has been successfully implemented at scale within Tencent.
Container live migration is a technology that allows running containers to be transparently migrated to other nodes. Discussions about this technology in the community started before the release of Kubernetes 1.0. Due to the complexity of the topic, the discussions have not produced definitive conclusions yet.
In recent years, with the development of technologies such as CRIU, runc, and containerd, container live migration has gradually become possible. However, integrating live migration into the vast Kubernetes ecosystem remains a challenge, particularly with regard to container networking and live migration performance.
In this presentation, we will introduce the experience of implementing live migration technology within Tencent.

Speakers

Yenan Lang

Senior Software Engineer, Tencent
Yenan Lang, a senior engineer at Tencent, has been actively engaged in Kubernetes since 2016, specializing in K8S cluster management, container networking, and container runtimes. Presently, Yenan is dedicated to exploring the integration of Kubernetes with big data and AI techno... Read More →

Hua Liu

Expert Engineer, Tencent
Liu Hua is an expert engineer at Tencent, focusing on next-generation technology research for container infrastructure, including Kubernetes, container runtimes, the kernel, and other related technologies.



Thursday September 28, 2023 1:55pm - 2:30pm CST
3层 302会议室| 3F Room 302

1:55pm CST

忘记kubectl,与您的集群交流:使用LLMs简化Kubernetes集群管理 | Forget Kubectl and Talk to Your Clusters: Using LLMs to Simplify Kubernetes Cluster Management - Qian Ding, Ant Group
本提案的主题是探索利用大语言模型(LLMs)进行Kubernetes集群管理的可能性。我们的目标是希望将使集群用户能够使用自然语言与集群进行交互,提高操作效率,并允许SRE使用AI来识别和解决集群问题。本次主题分享将讲述我们真实的探索经历,比如利用大语言模型帮助用户实现常规的集群查询“kubectl get”。当然,我们也将讨论到大模型目前的能力缺陷和瓶颈,基于我们的工作职能,在一个强确定性的环境中,如何能够去消除模型本身的不确定性。希望通过这次分享,能够让更多的人辩证的看待大模型的应用场景,也期待能给参会者全新的启发。

This proposal outlines our efforts to operate Kubernetes clusters using large language models (LLMs). This will enable cluster users to interact with clusters using natural language, improve operation efficiency, and allow SREs to use AI to identify and resolve cluster issues.
Our design principles:
  - Start with replacing "kubectl get"
  - Use local LLMs to avoid data leaks
  - Iterate quickly to gather user feedback and empower the LLMs
We implemented the proposal by:
  - Designing training data for supervised fine-tuning, allowing the LLMs to learn to call our APIs to query cluster data.
  - Using a checklist before deploying LLM bots to multiple internal channels for production use.
  - Combining LLMs with traditional AIOps techniques, which enabled the LLMs to detect cluster issues and helped cluster admins resolve them.
Finally, we share our progress report on using LLMs with Kubernetes and propose a few open questions for future discussion.

Speakers

Qian Ding

Staff Engineer, Ant Group
Qian works at Ant Group as a staff engineer focusing on site reliability engineering. He is the SRE tech lead of adopting Kubernetes in the Ant Group production environment. He is passionate about adopting and promoting SRE's philosophy for managing large-scale production systems... Read More →



Thursday September 28, 2023 1:55pm - 2:30pm CST
3层 305B会议室| 3F Room 305B
  新兴和先进技术 | Emerging + Advanced

1:55pm CST

云原生时代物流服务的创新:顺丰速运的实践与启示 | Innovation in Logistics Services in the Cloud-Native Era: Practice and Inspiration from SF Express - Wang Jiezhang, Huawei & Panggang Cheng, SF Express
作为中国领先的物流公司,顺丰速运将分享其物流业务的数字化和边缘云转型的完成情况。顺丰速运采用了分层组织结构和云原生技术(如Kubernetes、CNI插件和KubeEdge+EdgeMesh解决方案)来处理大量的用户和货物。通过将数据处理和计算逻辑推入物流节点,并使用KubeEdge集成边缘设备,构建了智能物流系统,实现了全流程自动化,提高了效率和准确性。EdgeMesh的集成连接了云和边缘,并实现了应用数据的实时采集、处理、传输和分析,创建了高效、安全和可靠的边缘网络。边缘解决方案在减少边缘设备的资源使用和复杂性的同时,帮助解决了Kubernetes在边缘场景中的限制和挑战。

As a leading logistics company in China, SF Express will share how it completed the digital and edge cloud transformation of its logistics business. SF Express used a hierarchical organization structure and cloud-native technologies (such as Kubernetes, CNI plugins, and KubeEdge+EdgeMesh solutions) to handle large volumes of users and goods. By pushing data processing and computing logic into the logistics nodes and integrating edge devices using KubeEdge, it built an intelligent logistics system that achieves full-process automation and improves efficiency and accuracy. The integration of EdgeMesh connects cloud and edge and enables the real-time collection, processing, transmission, and analysis of application data, creating an efficient, secure, and reliable edge network. While reducing the resource usage and complexity of edge devices, the edge solution helps address the limitations and challenges of Kubernetes in edge scenarios.

Speakers

杰章 王

Senior Engineer, Huawei Cloud
JieZhang Wang, a senior engineer at Huawei Cloud Computing, is an avid enthusiast of cloud native and edge computing technology. He actively participates in technical discussions and coding as a member of KubeEdge and head of the KubeEdge SIG networking group responsible for EdgeMesh... Read More →

庞钢 程

Project Leader(Senior Engineer), SF Technology
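I am a senior engineer and project leader at SF Technology, and also serve as an innovation review committee member and as CT for the backend professional career track. I lead SF Technology's distinctive cloud native projects, including AIoT, image building, and image storage, and have accumulated rich practical experience in the cloud native field. For 10 years I have held to the principle that open source is the foundation, introducing Kubernetes, KubeEdge, Harbor, CoreDNS... Read More →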



Thursday September 28, 2023 1:55pm - 2:30pm CST
2层 会议室 4 | 2F Room 4
  网络+边缘+电信 | Networking + Edge + Telco

1:55pm CST

数禾使用Knative加速AI模型服务部署 | Shuhe Accelerates AI Model Service Deployment with Knative - Peng Li, Alibaba Cloud & Wenzhe Wei, Shanghai Shuhe Information Technology
在数禾(上海数禾信息技术有限公司)的金融业务场景中,AI模型经常进行迭代,并且会同时在线部署多个模型版本以评估模型。这样做会带来高资源成本。如何在确保服务质量的基础上提高AI服务运维效率并降低资源成本是一个具有挑战性的问题。 Knative是一个基于Kubernetes的开源无服务器应用架构。目前,数禾通过Knative部署了500多个AI模型服务,节省了60%的资源成本,并且平均部署周期从1天缩短到了0.5天。 在本次演讲中,我们将向您展示如何基于Knative部署AI工作负载,包括: ● 扩展Serving的弹性能力,以支持基于并发数的精确弹性、弹性预测。 ● 如何在Knative中部署Stable Diffusion。 ● 数禾在Knative中的AI模型服务最佳实践。

In the financial business scenario of Shuhe (Shanghai Shuhe Information Technology Co., Ltd.), AI models are iterated frequently, and multiple versions of a model are deployed online at the same time to evaluate it. This results in high resource costs. Improving the efficiency of AI service operation and maintenance and reducing resource costs while ensuring service quality is a challenge. Knative is an open source serverless application architecture based on Kubernetes. At present, Shuhe deploys 500+ AI model services through Knative, saving 60% of resource costs and shortening the average deployment cycle from 1 day to 0.5 days. In this talk, we will show you how to deploy AI workloads based on Knative, including:
● Extending the Serving elasticity capability to support precise concurrency-based scaling and predictive scaling
● How to deploy Stable Diffusion in Knative
● Shuhe's best practices for AI model services in Knative

Speakers

Peng Li

Technical Expert, Alibaba Cloud
Peng Li is a technical expert at Alibaba Cloud. He works on Alibaba Cloud Container Service for Kubernetes, focusing on Knative and Kubernetes.

Wenzhe Wei

Senior Software Engineer, Shuhe Group
Senior Development Engineer at Shuhe Technology's Infrastructure Team, primarily responsible for the construction of Shuhe's DevOps platform and monitoring and alerting platform. Designed and developed the serverless containerization of Shuhe's AI model services in collaboration with... Read More →



Thursday September 28, 2023 1:55pm - 2:30pm CST
2层 会议室 2 | 2F Room 2
  软件开发生命周期 | SDLC

1:55pm CST

Kubernetes上的干扰检测和资源隔离增强的最佳实践 | Best Practice for Interference Detection and Resource Isolation Enhancement on Kubernetes - Haogang Wang, Kuaishou
基于Kubernetes的容器云平台部署了延迟敏感的工作负载和批处理作业的混合组合。随着Pod的部署密度增加,干扰问题已成为确保平台稳定性的主要挑战。这阻碍了资源利用率的提高,需要平台增加额外成本来添加更多服务器以支持工作负载部署。 在快手,通过建立一个干扰观测和诊断系统,我们实现了对干扰问题的快速识别和故障排除。此外,我们实现了对每个服务的资源进行细粒度控制,包括CPU和内存,有效减轻了批处理作业对延迟敏感工作负载的影响。这使得平台能够部署更多的批处理作业,同时确保延迟敏感工作负载的稳定性,从而提高了整体资源利用率。

The Kubernetes-based container cloud platform deploys a mix of latency-sensitive workloads and batch jobs. As the deployment density of pods increases, interference has become a major challenge in ensuring the stability of the platform. It holds back improvements in resource utilization and forces the platform to incur additional costs for more servers to support workload deployment. At Kuaishou, by establishing a system for observing and diagnosing interference, we have achieved rapid identification and troubleshooting of interference issues. Additionally, we have achieved fine-grained control over resources on a per-service basis, including CPU and memory, effectively mitigating the impact of batch jobs on latency-sensitive workloads. This enables the platform to deploy more batch jobs while ensuring the stability of latency-sensitive workloads, thereby improving overall resource utilization.

Speakers

Haogang Wang

Senior Engineer, Kuaishou
I graduated with a master's degree in Computer Science and Technology from Nanjing University in 2021. After graduation, I joined Kuaishou as a System Development Engineer in the Cloud Native domain, primarily focusing on container cloud stability and quality assurance of single-node... Read More →



Thursday September 28, 2023 1:55pm - 2:30pm CST
2层 会议室 1 | 2F Room 1
  运维+性能 | Operations + Performance

1:55pm CST

我们如何构建生产级HPA:从有效算法到无风险的自动扩展 | How We Build Production-Grade HPA: From Effective Algorithm to Risk-Free Autoscaling - Ziqiu Zhu & Yiru Guo, Ant Group
你是否曾经发现,Kubernetes HPA(水平自动扩展)的内置算法对于具有复杂资源使用特性的真实工作负载来说不太有效?或者,你是否曾经担心激进的自动扩展过程会突然破坏你“脆弱”的生产系统?

在这个会话中,来自蚂蚁集团的工程师将披露他们构建生产级HPA的方法,该方法每年可以稳定地节省约100,000个CPU核心。这包括一种创新的“基于流量驱动”的副本预测算法,非常适用于受多个复杂流量模式影响的工作负载,以及独特的多阶段灰度扩展策略,可以极大地减轻自动扩展过程中的风险。

此外,他们还将利用Kapacity,一款新开源的云原生容量解决方案,演示如何将上述方法实际应用于你的Kubernetes环境,而无需依赖特定的供应商功能,做到轻松自如。


Have you ever found that Kubernetes HPA's built-in algorithm is less effective for real-world workloads with complicated resource usage characteristics? Or, have you ever worried that a radical autoscaling process would suddenly break your "fragile" production system?

In this session, engineers from Ant Group will disclose their methodology for building production-grade HPA, which has been saving ~100k CPU cores yearly with high stability. This includes an innovative "traffic-driven" replica prediction algorithm, which is well adapted to workloads whose resource usage is affected by multiple complex traffic patterns, and a unique multi-stage gray scaling strategy that greatly mitigates risks during autoscaling.

Furthermore, they will utilize Kapacity, a newly open-sourced cloud native capacity solution, to demonstrate how to practically apply the above methodology to any of your Kubernetes environments with ease, without relying on specific vendor features.

Speakers

Ziqiu Zhu

Senior Software Engineer | Kubernetes Member, Ant Group
Ziqiu Zhu is a senior software engineer at Ant Group. He has worked in the department of cloud native technology for about 4 years. He is a cloud native open source enthusiast who has been making contributions to various CNCF projects for years, and is experienced in utilizing cloud... Read More →

Yiru Guo

Senior Software Engineer, Ant Group
Yiru Guo is a senior software engineer at Ant Group. He has worked in the department of infrastructure reliability for nearly 10 years, and has worked in the intelligent capacity group for over 4 years. He has been deeply involved in the construction of various production-grade capacity... Read More →



Thursday September 28, 2023 1:55pm - 2:30pm CST
3夹层 3M3会议室 | 3M Room 3M3
  运维+性能 | Operations + Performance

2:45pm CST

云原生边缘计算与KubeEdge:更新与未来 | Cloud Native Edge Computing with KubeEdge: Updates and Future - Fei Xu, Huawei Cloud & Hongbing Zhang, Shanghai DaoCloud Network Technology
KubeEdge是一个开源的边缘计算框架,将Kubernetes的能力从云端扩展到边缘。在本次会议中,我们将分享以下内容:1. KubeEdge的最新开发更新;2. 来自sig-robotics、sig-node、sig-scalability和sig-networking、sig-security等的新闻;3. 最新的用户采用情况,包括:云原生区块链、云原生物流、云原生SDV(软件定义车辆)、云原生海上油田等。最后,将进行开放式问答环节,与参会者互动并回答问题。

KubeEdge is an open source edge computing framework that extends the power of Kubernetes from the cloud to the edge. In this session, we will share: 1. The latest development updates from KubeEdge; 2. News from sig-robotics, sig-node, sig-scalability, sig-networking, sig-security, etc.; 3. The latest user adoptions, including cloud native blockchain, cloud native logistics, cloud native SDV (software-defined vehicle), cloud native offshore oil fields, etc. At the end, there will be an open Q&A for attendees to ask questions and give feedback.

Speakers

Fei Xu

Senior software engineer, Huawei Cloud
KubeEdge TSC member and Senior Software Engineer at Huawei Cloud, focusing on cloud native, Kubernetes, service mesh, IoT, and other fields. Currently maintains the KubeEdge project, a CNCF incubating project, and also participates in Huawei Cloud container products. And has... Read More →

Hongbing Zhang

Chief Operating Officer, Shanghai DaoCloud Network Technology Co., Ltd
Hongbing Zhang is Chief Operating Officer of DaoCloud. A veteran of open source, he founded the IBM China Linux team in 2011 and organized the team to make significant contributions to the Linux kernel, OpenStack, and Hadoop projects. Now he is focusing on the cloud native domain and leading... Read More →


Thursday September 28, 2023 2:45pm - 3:20pm CST
3夹层 3M5A会议室 | 3M Room 3M5A

2:45pm CST

深入研究:KWOK | Deep Dive: KWOK - Shiming Zhang, DaoCloud & Hao Liang, Tencent
KWOK使控制器测试变得如此简单和容易,正如您所希望的那样。让我们回顾一些功能,并讨论路线图上的下一步。

KWOK makes controller testing as easy and simple as you could hope for. Let's look back at some of the features and discuss what's next on the roadmap.

Speakers

Hao Liang

SRE, Tencent
Hao Liang is a Site Reliability Engineer on Tencent's infrastructure team, maintaining large-scale computing Kubernetes clusters. He focuses on cloud native and Kubernetes.

Shiming Zhang

Software Engineer, DaoCloud
Shiming Zhang is a Kubernetes contributor focused mainly on scalability, performance, reliability, and testing. He has contributed to many Kubernetes features and most of its components.



Thursday September 28, 2023 2:45pm - 3:20pm CST
3夹层 3M3会议室 | 3M Room 3M3
  Maintainer Track, SIG Scheduling

2:45pm CST

只是噪音还是真正的字节?云原生中的eBPF | Just Buzz or Real Byte? eBPF in Cloud Native - Bill Mulligan, Isovalent
在云原生领域,eBPF的热度正在迅速增长,但是要知道从哪里开始或如何跟上进展可能会让人感到害怕。在这次演讲中,比尔将追溯他是如何接触到eBPF的,探索当今云原生领域中一些可用的eBPF应用,并教授其他人如何在eBPF的活动中深入探索而不被咬伤。 刚开始接触eBPF的人将了解到在云原生世界中,eBPF如何实现高效的网络、无需仪器的可观测性、轻松的追踪和实时安全等功能。已经熟悉eBPF的人将获得对eBPF领域的概述,并了解到许多新的和不断扩展的eBPF应用,使他们能够利用其功能而无需深入研究字节码。听众将对云原生中eBPF的热度有所了解,并了解到可能解决他们在网络、可观测性和安全性方面问题的新工具。

The buzz around eBPF in cloud native is growing quickly but it can be scary to know where to start or how to keep up. In this talk, Bill will trace how he got into eBPF, explore some of the cloud native eBPF applications available today, and teach others how to dive into the hive of activity around eBPF without getting bytten. People just beginning with eBPF will learn how eBPF makes it possible to have efficient networking, observability without instrumentation, effortless tracing, and real-time security (among other things) in a cloud native world. Those already familiar with eBPF will get an overview of the eBPF landscape and learn about many new and expanding eBPF applications that allow them to harness the power without needing to dive into the bytecode. The audience will walk away with an understanding of the buzz around eBPF in cloud native and knowledge of new tools that may solve some of their problems in networking, observability, and security.

Speakers

Bill Mulligan

Community Pollinator, Isovalent
Bill Mulligan is a cloud native pollinator and community builder. He has given talks and written articles about building the business case for cloud native. While at CNCF he restarted the Kubernetes Community Day program and worked to grow the student community. He is currently at... Read More →



Thursday September 28, 2023 2:45pm - 3:20pm CST
3层 305A会议室| 3F Room 305A
  云原生新手 | Cloud Native Novice

2:45pm CST

Kubernetes 命名空间揭秘:释放基础设施的全部潜力 | Kubernetes Namespaces Unleashed: Unlocking the Full Potential of Your Infrastructure - Victor Varza & Adrian Aneci, Adobe Inc
每个现代开发平台都被设计或渴望成为多租户。满足这一基本要求对于任何平台来说都是至关重要的,以便有效扩展、为众多客户提供服务,并提供一种具有成本效益的解决方案。 在Adobe,Victor和Adrian正在构建和运行一个跨云、多租户的Kubernetes平台,帮助产品团队以高速度和成本效益的生态系统构建和运行其服务。使用或开发了一套开源工具和流程,如Cilium、Envoy、Kata Containers、Open Policy Agent、K8s-Shredder等。 他们将展示他们五年的旅程,运行一个跨云、多租户平台,旨在托管各种应用程序,从简单的应用到高度先进的人工智能应用。讨论将涵盖平台的指导理念、他们遇到的挑战以及他们在成功实现速度和可靠性之间的平衡时所学到的教训。

Every modern developer platform is designed or aspires to be multi-tenant. It is essential for any platform to meet this fundamental requirement in order to scale effectively, serve numerous customers, and provide a cost-effective solution. At Adobe, Victor and Adrian are building and running a cross-cloud, multi-tenant Kubernetes platform that helps product teams build and run their services with high velocity in a cost-effective ecosystem. A set of open-source tools and processes, such as Cilium, Envoy, Kata Containers, Open Policy Agent, K8s-Shredder, and friends, were used or developed. They will present their five-year journey of running a cross-cloud, multi-tenant platform designed to host a wide range of applications, from simple ones to highly advanced AI ones. The discussion will cover the platform's guiding philosophies, the challenges they encountered, and the lessons they learned while successfully achieving a balance between velocity and reliability.

Speakers

Victor

TechLead, Adobe Inc
Victor is a TechLead at Adobe Inc, where he is engaged in managing a Kubernetes-powered enterprise cross-cloud multi-tenant microservices platform. His expertise lies in designing and implementing excellent software systems, with over 15 years of experience in the development of large-scale... Read More →

Adrian Aneci

Lead Cloud Software Engineer, Adobe Inc
Adrian Aneci is a Lead Cloud Software Engineer at Adobe, where his day-to-day is filled helping to develop Ethos, Adobe's de facto developer platform, built on top of Kubernetes. He is also involved in open source community, contributing to projects under the Kubernetes ecosystem... Read More →



Thursday September 28, 2023 2:45pm - 3:20pm CST
3层 301明珠厅| 3F The Pearl Hall 301
  平台工程 | Platform Engineering

2:45pm CST

中国招商银行的大规模金融级平台工程实践 | Large-Scale Financial-Grade Platform Engineering Practice in China Merchants Bank - Jiahang Xu, China Merchants Bank & Qingguo Zeng, Alibaba
作为中国最大的银行之一,招商银行一直是云原生金融生产实践的贡献者和先驱,以实现其快速数字化转型。但是,我们面临着在满足银行对安全合规性、交易一致性和全链条业务风险管理的更高需求与采用云原生应用之间的平衡挑战。同时,我们还需要应对由银行IT组织结构引起的云原生演进问题。 平台工程,利用CNCF项目如KubeVela、OpenFeature、Envoy、Opentelemetry、OPA和Team Topologies模式,是解决与技术和组织结构相关的这些挑战的务实解决方案。 在本次演讲中,我们将分享我们的金融级平台工程实践,以提供一致的开发者体验,简化DevOps流程,通过端到端的可观察性提高效率并减少错误,并实施安全风险策略以确保合规性和风险控制。

As one of the biggest banks in China, China Merchants Bank has been a contributor and pioneer in cloud-native financial production practices for its rapid digital transformation. But we face challenges in balancing cloud-native application adoption with banking's higher demand for security compliance, transaction consistency, and full-chain business risk management. Meanwhile, we need to address the cloud-native evolution issues caused by the banking IT org structure. Platform engineering, leveraging CNCF projects such as KubeVela, OpenFeature, Envoy, OpenTelemetry, and OPA, along with Team Topologies patterns, is a pragmatic solution to these technology and org-structure challenges. In this talk, we'll share our financial-grade platform engineering practices to provide a consistent developer experience, streamline the DevOps process, improve efficiency and reduce errors with end-to-end observability, and implement security risk policies to ensure compliance and risk control.

Speakers

Jiahang Xu

System Architect, China Merchants Bank(招商银行)
Jiahang Xu is a System Architect at China Merchants Bank. He has over 14 years of unique cross-domain experience working in telecom, automotive, financial industry, startup as a co-founder, and KubeVela maintainer. He's mainly focused on cloud-native application technology practice... Read More →

Qingguo Zeng

Senior Engineer, Alibaba Cloud(阿里云智能)
Qingguo Zeng (nickname: Yueda) is a Senior Engineer at Alibaba Cloud, currently responsible for OAM/KubeVela products and open source community evangelism, has been engaged in cloud-native, application delivery, observability, open-source field research and practice for many years... Read More →



Thursday September 28, 2023 2:45pm - 3:20pm CST
2层 会议室 4 | 2F Room 4
  平台工程 | Platform Engineering

2:45pm CST

全球范围内的EROFS:一种适用于各种用例的基于镜像的内核方法 | EROFS Everywhere: An Image-Based Kernel Approach for Various Use Cases - Xiang Gao, Alibaba Cloud
自2019年将EROFS引入Linux内核5.4以来,它在Android世界中得到了广泛应用。自Android 13(2022年)以来,EROFS实际上已成为Android推荐的系统分区文件系统之一。在本主题中,将介绍最近内核版本(截至v6.4)中新增的功能,如全局压缩数据去重等。 在过去两年(2021-2022年),EROFS在容器领域也带来了几项技术创新,例如为runC容器提供的EROFS over fscache,为安全容器提供的EROFS DAX支持,为机密容器提供的EROFS TarFS使用案例以及OSTree/Composefs使用案例(erofs+overlayfs)。本主题还将详细介绍这些技术的内部技术细节。 最后,将像往常一样展示开发路线图,以供对EROFS文件系统未来感兴趣的所有人参考。

Since EROFS was introduced in Linux kernel 5.4 in 2019, it has been widely used in the Android world. EROFS has in fact been an Android-recommended file system for system partitions since Android 13 (2022). In this session, new features shipped in recent kernels (up to v6.4), such as global compressed data deduplication, will be introduced. Over the past two years (2021-2022), EROFS has also been rapidly adopted in the container world, with several technical innovations such as EROFS over fscache for runC containers, EROFS DAX support for secure containers, EROFS TarFS use cases for confidential containers, and OSTree/Composefs use cases (erofs+overlayfs). The internals of these technologies will also be covered in detail. Finally, a development roadmap will be shown, as usual, for everyone interested in the future of the EROFS filesystem.

Speakers

Xiang Gao

Staff Software Engineer, Alibaba Cloud
Xiang Gao is a Linux kernel developer at Alibaba Cloud who previously worked for Huawei and Red Hat, focusing on local filesystem development in the kernel, including EROFS, XFS, and F2FS. He is one of the EROFS filesystem authors and maintainers.



Thursday September 28, 2023 2:45pm - 3:20pm CST
3层 302会议室| 3F Room 302
  操作系统 | Operating Systems

2:45pm CST

填补Kubernetes的空白:IO资源调度和隔离 | Fill a Gap of Kubernetes: IO Resource Scheduling and Isolation - Theresa Shan & Cathy Zhang, Intel
当超额订阅发生时,需要对关键工作负载资源进行隔离,以解决嘈杂邻居问题。当前的K8S提供了CPU、内存和存储资源感知调度和隔离,但不支持磁盘IO。本次演讲将介绍一种填补这一空白的方法,通过磁盘IO感知调度和动态资源边界执行的组合,提供了保证的磁盘IO。用户只需在pod规范中为每个容器指定磁盘IO请求。一个新的调度器插件将确保将pod调度到能够保证其磁盘IO性能的节点上。该方法收集每个工作负载的实时磁盘IO利用率以及总磁盘饱和度,并将它们反馈到k8S控制平面以优化资源利用。它还在需要时动态调整尽力而为的工作负载的磁盘IO资源边界,以将磁盘IO资源让给保证的工作负载。演讲中将包含一个演示。

When oversubscription happens, isolation between critical workloads’ resources is needed to address the noisy-neighbor problem. Kubernetes currently provides CPU, memory, and storage resource-aware scheduling and isolation, but it does not support disk IO. The talk will introduce an approach to fill this gap, which provides guaranteed disk IO through a combination of disk IO-aware scheduling and dynamic resource boundary enforcement. Users only need to specify the disk IO requests for each container in the pod spec. A new scheduler plugin will ensure the pod is scheduled to a node that can guarantee its disk IO performance. The approach collects each workload’s real-time disk IO utilization as well as total disk saturation and feeds them back into the Kubernetes control plane to optimize resource utilization. It also dynamically adjusts best-effort workloads’ disk IO resource boundaries to relinquish disk IO resources to guaranteed workloads when needed. A demo will be included in the talk.

Speakers

Cathy Zhang

Senior Principal Engineer/Architect, Intel
As a member of the CNCF TOC, Cathy has been sponsoring and guiding projects' applications for graduation/incubation, and reviewing/approving new sandbox projects. She has been a committee member for several KubeCons. Cathy is currently a Senior Principal Engineer at Intel, leading... Read More →

Theresa Shan

Senior Cloud Software Engineer, Intel
Xumei (Theresa) Shan has more than ten years' experience in cloud infrastructure and platforms. She has a Master's degree in Computer Science. She works as a senior cloud engineer and technical lead in the cloud native field at Intel, with rich experience in container runtime, Kubernetes... Read More →



Thursday September 28, 2023 2:45pm - 3:20pm CST
3层 305B会议室| 3F Room 305B
  新兴和先进技术 | Emerging + Advanced

2:45pm CST

云原生很好,但如何在电信网络中应用它呢? | Cloud Native Is Good, but How to Apply It in Telecom Networks? - Hanyu Ding, China Mobile & Qihui Zhao, China Mobile
当将云原生引入电信网络云时,并不仅仅是引入容器、微服务或DevOps。电信运营商与IT行业有着不同的关注点,前者更加注重绝对稳定性而非敏捷性。在这个前提下,电信运营商正在探索平稳演进的方法,以实现充满电信特性的云原生。在本次会议中,我们将介绍中国移动在电信网络云的云原生演进方面的经验,包括但不限于演进节奏、需要云原生升级的关键技术领域、差距分析、基于开源软件的可能解决方案。其中,将回答一些重要的技术问题,如裸金属上容器的安全问题、MANO与K8S在虚拟机/裸金属上的集群管理协作、微服务的无状态设计、网络功能的FOA流程更新以及用于CNF(云原生网络功能)的特殊负载均衡器,如AMF、UPF的电信协议处理。

Bringing cloud native to the telecom network cloud is not simply a matter of importing containers, microservices, or DevOps. Telco players have different concerns from the IT industry: they place more focus on absolute stability than on agility. Under this premise, telco players are exploring smooth evolution paths to achieve cloud native with telecom characteristics. In this session, we will introduce China Mobile's experience with the cloud native evolution of the telecom network cloud, including but not limited to the evolution rhythm, key technical areas that need a cloud native upgrade, gap analysis, and possible solutions based on open-source software. The session will also address major technical problems such as container security on bare metal, MANO collaboration with K8S on cluster management on VM/BM, stateless design of microservices, updates to the FOA process for network functions, and special load balancers for telecom protocol processing in CNFs (Cloud native Network Functions) such as AMF and UPF.

Speakers

hanyu

Project Manager, China Mobile
Hanyu Ding, an open-source expert from China Mobile Research Institute, is the PTL of the LF Edge Akraino CFN Ubiquitous Computing Force Scheduling Blueprint. He has been engaged in technical research in the fields of MEC, cloud computing, and computing resource management and scheduling... Read More →

Qihui Zhao

Project Manager, China Mobile
Qihui is a project manager at China Mobile Research Institute. She was a member of the NovoNet project, which drives NFV/SDN strategy for China Mobile, and is now working on the cloud native evolution of CMCC's network cloud. She is a TSC member of the XGVela project, and has been an active speaker... Read More →


Thursday September 28, 2023 2:45pm - 3:20pm CST
3层 307会议室| 3F Room 307

2:45pm CST

扩展Kubernetes与CRDs时需要了解的设计约定 | Design Conventions You Need to Know When Extending Kubernetes with CRDs - Hongcai Ren, Huawei
这个演讲想要谈论一个重要的话题,即在使用CRD扩展Kubernetes时遇到的一些挑战。正如您所知,自定义资源定义(CRDs)是扩展Kubernetes以管理自定义资源的强大方式。然而,当人们第一次开始使用它时,由于对API设计约定的理解不足,他们经常遇到不同的问题。在这个演讲中,我将与您分享一些在使用CRD扩展Kubernetes时需要了解的API设计约定。我们将讨论在设计API时需要牢记的事项,最佳实践以及一些设计标准。通过这个演讲的结束,您将对API设计约定有更深入的了解,并具备成功使用CRD扩展Kubernetes所需的知识。

This presentation addresses an important topic: the challenges that arise when extending Kubernetes with CRDs. As you may know, Custom Resource Definitions (CRDs) are a powerful way to extend Kubernetes to manage your custom resources. However, when people start using them for the first time, they often encounter various issues due to a lack of understanding of API design conventions. In this talk, I'll share some of the API design conventions that you need to know when using CRDs to extend Kubernetes. We will discuss what you need to keep in mind while designing your API, best practices, and some design standards. By the end of this talk, you'll have gained a deeper understanding of API design conventions and be equipped with the knowledge you need to successfully extend Kubernetes with CRDs.

Speakers

Hongcai Ren

Senior Software Engineer, Huawei
Hongcai Ren (@RainbowMango) is a CNCF Ambassador who has been working on Kubernetes and other CNCF projects since 2019, and is a maintainer of the Kubernetes and Karmada projects.


Thursday September 28, 2023 2:45pm - 3:20pm CST
2层 会议室 2 | 2F Room 2
  软件开发生命周期 | SDLC

2:45pm CST

如何在大型集群中加速 Pod 的启动? | How Can Pod Start-up Be Accelerated on Nodes in Large Clusters? - Paco Xu, DaoCloud & Byron Wang, Birentech
这个想法来自我最近写的博客 Kubernetes 1.27:加速Pod启动的更新。 这是一个集群管理员可能面临的常见问题。 本次会话将向您展示Pod启动的过程以及如何加速Pod的启动。

  1. API:
    控制器管理器创建Pod的时间,
    KCM:PV和PVC绑定以及Webhooks。 
  2. 调度:
    GPU拓扑感知,节点负载感知。
  3. 来自kubelet方面的节点级别:
    镜像拉取,Sidecar,API QPS和Burst,事件驱动PLEG,
    限流,磁盘和卷驱动程序,静态CPU策略,容器运行时。
  4. GPU管理:
    拓扑不仅仅是NUMA,共享和监控。
  5. 数据负载:
    数据预加载和本地存储或分布式存储。
  6. 可观察性:
    如何检查为什么Pod启动缓慢?

The idea came from a blog post I recently wrote, "Kubernetes 1.27: updates on speeding up Pod startup". Slow Pod startup is a common issue that cluster administrators may face. This session will show you the Pod startup process and how to speed it up.

  1. API:
    the creation time of pods by controller-manager,
    KCM: PV & PVC binding and webhooks. 
  2. Scheduling:
    GPU Topology-aware, node load aware 
  3. Node level from the kubelet side:
    Image Pulling, Sidecar, API QPS & Burst, Event-Driven PLEG, Throttling, Disk and Volume driver, Static CPU Policy, Container Runtime 
  4. GPU Management:
    Topology not only NUMA, Sharing, and Monitoring 
  5. Data load:
    data preload & local storage or distributed storage 
  6. Observability:
    How to check why the pod starts up slowly?
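
For the observability question in item 6, one simple starting point is to compare a Pod's creation time with the transition times of its standard status conditions (`PodScheduled`, `Initialized`, `ContainersReady`, `Ready`). The sketch below is an illustration, not material from the talk: it assumes the Pod object has already been fetched as JSON (for example with `kubectl get pod <name> -o json`), and the `startup_breakdown` helper and the sample data are hypothetical.

```python
from datetime import datetime

# Kubernetes serializes these timestamps as RFC 3339 in UTC, e.g. 2023-09-28T06:45:00Z.
TIME_FORMAT = "%Y-%m-%dT%H:%M:%SZ"


def startup_breakdown(pod: dict) -> dict:
    """Map each condition that is True to seconds elapsed since Pod creation."""
    created = datetime.strptime(pod["metadata"]["creationTimestamp"], TIME_FORMAT)
    elapsed = {}
    for cond in pod["status"]["conditions"]:
        if cond["status"] == "True":
            reached = datetime.strptime(cond["lastTransitionTime"], TIME_FORMAT)
            elapsed[cond["type"]] = (reached - created).total_seconds()
    return elapsed


# Hand-written sample data for illustration only.
sample_pod = {
    "metadata": {"creationTimestamp": "2023-09-28T06:45:00Z"},
    "status": {
        "conditions": [
            {"type": "PodScheduled", "status": "True", "lastTransitionTime": "2023-09-28T06:45:02Z"},
            {"type": "Initialized", "status": "True", "lastTransitionTime": "2023-09-28T06:45:10Z"},
            {"type": "ContainersReady", "status": "True", "lastTransitionTime": "2023-09-28T06:45:40Z"},
            {"type": "Ready", "status": "True", "lastTransitionTime": "2023-09-28T06:45:40Z"},
        ]
    },
}
print(startup_breakdown(sample_pod))
```

A large gap between `PodScheduled` and `Initialized` points at init containers or volume setup, while a gap between `Initialized` and `ContainersReady` often points at image pulling or readiness probes. Note that `lastTransitionTime` records the most recent transition, so for Pods whose containers have restarted the numbers no longer describe the initial startup.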

Speakers

徐俊杰 Paco

Lead of Open Source Team, DaoCloud
Paco is a kubeadm maintainer and an active Kubernetes contributor who mainly works on SIG-Node and SIG-CLI/SIG-Testing. Paco is currently the leader of the open-source team at DaoCloud, the KCD Chengdu 2022 organizer, and a speaker at KCD Shanghai, KubeCon EU 2023, and KubeCon China... Read More →

Byron Wang

Staff Engineer, Birentech
Free my soul



Thursday September 28, 2023 2:45pm - 3:20pm CST
2层 会议室 1 | 2F Room 1
  运维+性能 | Operations + Performance
 
