Hello, my name is Jack Huang. I am currently a Cloud Native Container Service Solutions Architect at Alibaba Cloud, focusing on helping customers transform their business with cloud native technology. Today I will give you a briefing on cloud native application observability on Alibaba Cloud. In a large-scale distributed system, all kinds of stability or performance problems may occur in the infrastructure, such as networks, computing nodes, and operating systems, or in the applications themselves. Observability helps us understand the state of distributed systems, makes it easier for users to make decisions, and serves as the basis for elastic scaling and automated operations.
In general, observability includes several important aspects. First of all, logging. We provide a complete log solution based on Alibaba Cloud Log Service, SLS, which can not only collect and process application logs but also provides operations, auditing, the Kubernetes Event Center, and other capabilities. Then metrics. Cloud Monitor provides comprehensive monitoring of infrastructure services such as ECS, storage, and the network. For performance indicators of business applications, such as the heap memory [inaudible] of Java applications, ARMS can provide full-scale performance monitoring for Java and PHP applications without modifying the business code. For Kubernetes applications and components, the hosted Prometheus service provided by ARMS offers a variety of preset monitoring out of the box. It also provides an interface to facilitate third-party integration. Another very important aspect is tracing. Tracing Analysis provides full-link tracing analysis. It gives developers complete distributed application call-link statistics, topology analysis, and other tools. It can help developers quickly find and diagnose performance [inaudible] in distributed applications and improve the performance and stability of microservice applications.
With the basic knowledge from the previous slide, let's take a look at the overall Kubernetes monitoring architecture. For a cloud native application platform such as Kubernetes, we need to ensure everything is working properly, including the Kubernetes platform, the applications, and also the infrastructure. For infrastructure monitoring, we can use Cloud Monitor to monitor the status of all kinds of infrastructure such as ECS, SLB, et cetera. If we need a better understanding of our ecosystem and cloud resources, AHAS can provide architecture awareness. AHAS is a service that is already available in China, and it is planned to be available overseas later. We can use Log Service, SLS, to capture all kinds of logs, for example the Kubernetes API server's audit logs, the ingress logs of NGINX controllers, and also the business logs from the containerized applications. SLS not only captures the logs but also provides a lot of well-designed dashboards that users can use directly. If we need to monitor application performance or custom metrics, we can use ARMS and the managed Prometheus provided by ARMS. With the ability of NPD, the node problem detector, we can send event alarms to Kafka or DingDing.
Let's take a look at the observability services separately. First, let's take a look at Cloud Monitor. This is a service most Alibaba Cloud users use from the first day. It provides nearly all kinds of infrastructure monitoring capability. It has a lot of dashboards that users can use directly. It also provides monitoring alarms. It's easy to integrate with open source tools as well. It also provides some Kubernetes monitoring capabilities, such as basic pod and deployment metrics.
Kubernetes has many components, so we also need the capability to monitor more metrics. A lot of users are already using an open source tool, Prometheus. It is a CNCF cloud native monitoring tool.
Prometheus also has a lot of dashboards for you to monitor with. Of course, you can customize your dashboard with the metrics you care about most. If you have tried open source Prometheus, you know that it takes a lot of work to maintain the Prometheus platform and keep it working for you, and you may still have problems with the stability of the monitoring tool itself. So Alibaba Cloud launched a service in ARMS. We call it ARMS-Prometheus. It's a managed Prometheus service. It provides open, highly available, and scalable monitoring capability. It helps reduce the cost and effort of maintaining such a monitoring platform. We already provide several dashboards, and you can also easily customize your own dashboard or set up alarms on the metrics you care about most.
When you are using Kubernetes, you want well-designed dashboards that help you check the overall status of your cluster. With the capability of SLS, the Log Service, we can aggregate all kinds of logs into this service. SLS has already defined several Kubernetes dashboards for you to use, such as the API server audit dashboard. You can see how many resources were deleted, who is accessing your containers, who has browsed your secrets, etc. You can also capture all your Ingress logs in SLS. The Ingress dashboard helps you analyze the status of your Ingress, such as PV, UV, latency, where your users are located, etc. Of course, operators can define alerts to monitor the cluster.
Many Kubernetes users complain that there are so many components in Kubernetes, and so many logs and events to check when a problem occurs. To simplify troubleshooting, we designed the Kubernetes Event Center in SLS. As you can see, we aggregate all kinds of Kubernetes events here. We also capture some critical node-level events here. We achieve this via the node problem detector.
With this Event Center dashboard, you can get an overview of the health status of your cluster. These metrics can even help users decide whether they need to expand the resources in the cluster. One very important reason users choose cloud Kubernetes is that they can auto scale their business applications via HPA, the Horizontal Pod Autoscaler. Also, HPA is sometimes prevented by limited resources. We need the ability to monitor which applications are scaling out or scaling in, how many nodes are scaled, and how long they use these resources. With the help of this autoscaler audit dashboard, we can easily get an overview of this kind of information.
Users always need to analyze their business logs and extract valuable information to guide their business decisions. They need to know: Where are our customers located? What kind of devices are they using to access our business? What are our most used services? After you aggregate your business logs into SLS, you will have this ability. Here is an example of the Ingress dashboard. If your service entry point is Ingress, this dashboard will be very helpful for you. It already has the metrics most popular with users, so try this dashboard if you are already using Ingress.
It's a very common scenario that you are facing a performance problem. There are so many components in the whole application architecture. You may need an application performance monitoring tool to show you the whole architecture topology. You may need to analyze call chain traces, analyze method stacks, or monitor the JVM, and ARMS is the right tool for you. It is very easy to inject into your applications, especially Java and PHP applications. You don't need to change your application code; just enable the product via the Kubernetes YAML file. A simple annotation is enough.
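As a rough illustration of the annotation-based setup just described, here is a minimal Python sketch that builds a Deployment manifest with ARMS annotations on the pod template and prints it as JSON, which kubectl also accepts. The talk does not name the annotation, so the keys `armsPilotAutoEnable` and `armsPilotCreateAppName` are assumptions based on my recollection of the ARMS documentation and should be verified there. The application name, image, and the helper `build_arms_deployment` are hypothetical.

```python
import json
from typing import Any, Dict


def build_arms_deployment(app_name: str, image: str) -> Dict[str, Any]:
    """Return a Deployment manifest whose pod template carries ARMS annotations."""
    # Assumption: these annotation keys are recalled from the ARMS docs,
    # not taken from the talk; confirm them before relying on this.
    arms_annotations = {
        "armsPilotAutoEnable": "on",
        "armsPilotCreateAppName": app_name,
    }
    labels = {"app": app_name}
    return {
        "apiVersion": "apps/v1",
        "kind": "Deployment",
        "metadata": {"name": app_name, "labels": labels},
        "spec": {
            "replicas": 1,
            "selector": {"matchLabels": labels},
            "template": {
                # The annotations sit on the pod template, so the
                # application code and image stay unchanged.
                "metadata": {"labels": labels, "annotations": arms_annotations},
                "spec": {"containers": [{"name": app_name, "image": image}]},
            },
        },
    }


if __name__ == "__main__":
    # Hypothetical application name and image, for illustration only.
    manifest = build_arms_deployment("demo-java-app", "example.com/demo-java-app:1.0")
    print(json.dumps(manifest, indent=2))
```

Saving the printed JSON to a file and passing it to `kubectl apply -f` would create the Deployment; whether the agent is then injected depends on the ARMS component being installed in the cluster.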
With all the services we just talked about, we have an easy way to do full-chain problem detection and to identify and analyze your problems. We have helped operators and developers a lot in solving their daily problems. It is much easier to manage your cloud native applications. Please feel free to try them if you also believe they can help you.
Now, we will do some live demos of application observability. First of all, I will show you how to collect logs from containerized applications. Here I have a YAML file, as you can see. It's a simple YAML file. I just use the Tomcat image, and I am going to collect all the standard output from the container. I just need to configure this [inaudible]. I name it Tomcat stdout 111, and the value is standard output. By the way, if you are going to collect log files from the container, you can also choose a name and set the value to the path or the filename. Now I'm going to apply this file, and it will create the deployment. As you can see, the container is being created now. Let me check. The container is running now, so we're going to log in to the Log Service console, and from here I'm going to refresh. As you can see, the Tomcat stdout 111 Logstore has already been created by the YAML file. I click this Logstore to see whether the logs are collected. I set auto refresh to 15 seconds to refresh the Logstore and see if the logs are collected. As you can see, the standard output logs are already gathered in the SLS Logstore. You can do some analysis here.
As we discussed in the previous slides, there are a lot of predefined dashboards in SLS, the Log Service. Now let's take a look at what dashboards we have and what we can do with them. I have already logged in to the SLS console. I can just click the dashboard here, and as you can see, all the predefined dashboards are listed here.
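The log collection step from the demo above can be sketched in Python as follows. This is only an illustration under stated assumptions: the demo's exact configuration key was inaudible, so the sketch assumes the environment-variable convention `aliyun_logs_<logstore>` with the value `stdout` or a file path, as I understand the Alibaba Cloud container log agent to work. The Logstore names, the log path, and the helper names are hypothetical.

```python
import json
from typing import Any, Dict, List

# Assumption: the log agent watches for environment variables with this prefix.
LOG_ENV_PREFIX = "aliyun_logs_"


def log_env(logstore: str, source: str = "stdout") -> Dict[str, str]:
    """Build one env entry: the name selects the Logstore, the value selects the source."""
    return {"name": LOG_ENV_PREFIX + logstore, "value": source}


def build_tomcat_deployment(env: List[Dict[str, str]]) -> Dict[str, Any]:
    """Return a minimal Tomcat Deployment manifest carrying the log settings."""
    labels = {"app": "tomcat-demo"}
    return {
        "apiVersion": "apps/v1",
        "kind": "Deployment",
        "metadata": {"name": "tomcat-demo", "labels": labels},
        "spec": {
            "replicas": 1,
            "selector": {"matchLabels": labels},
            "template": {
                "metadata": {"labels": labels},
                "spec": {
                    "containers": [
                        {"name": "tomcat", "image": "tomcat", "env": env}
                    ]
                },
            },
        },
    }


if __name__ == "__main__":
    env = [
        # Collect the container's standard output (hypothetical Logstore name).
        log_env("tomcat-stdout-111"),
        # Collect log files by path instead (hypothetical name and path).
        log_env("tomcat-access", "/usr/local/tomcat/logs/*.log"),
    ]
    print(json.dumps(build_tomcat_deployment(env), indent=2))
```

The design point is that log collection is declared next to the container definition, so no agent configuration has to be edited separately for each application.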
There are a lot of predefined dashboards, and I'm not going to go through each of them. I will pick one or two very interesting dashboards for you to take a look at. The first dashboard I will show you is the Kubernetes Event Center. I'm sorry for the Chinese characters here; we have already developed the English version, which will be released very soon. In this Event Center, as you can see, we have aggregated all of the events from the cluster and also from the nodes. This dashboard gives you some statistics that help you get a better understanding of the health status of your cluster. As you can see, you have got several docker, or [inaudible] or [inaudible] or pod pending or [inaudible] and the node, disk space, alert, etc. In this dashboard we can get the events from your Kubernetes components or pods. We can also get the events from your nodes. This is the ability provided by the node problem detector. We can see the top 10 most important events and their trends here. If you find more and more failed scheduling events, for example, maybe you need to expand the resources in your cluster. You can also see the pod OOM events, the pod eviction events, and which events are most important. From this unified Event Center, you can get everything you need to know about your cluster.
Let's take a look at another very useful dashboard, which is called the Kubernetes Audit Center. We just opened the Kubernetes Audit Center dashboard here. From this dashboard, you can get an overview of the Total Events, the Public Visits, the number of created events, and the number of unauthorized visits, and you can also see the Public Access Distribution Area. Another interesting thing in this dashboard is that, as you can see, we have the Command Execution List here, and we also have the Secret Access Events here. In Kubernetes, as you know, we don't normally need to exec into a container, and browsing secrets is not a common thing to do.
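To make the command execution and secret access lists more concrete, here is a minimal Python sketch of how such actions could be picked out of Kubernetes audit events. It relies on the standard audit event fields `verb`, `user.username`, and `objectRef`; it is not how the SLS dashboard is implemented, and the function names, sample events, and message format are hypothetical.

```python
from typing import Any, Dict, List, Optional

# Verbs that read secret contents or listings.
SECRET_READ_VERBS = {"get", "list", "watch"}


def classify_audit_event(event: Dict[str, Any]) -> Optional[str]:
    """Return "exec", "secret-access", or None for one audit event."""
    ref = event.get("objectRef") or {}
    resource = ref.get("resource", "")
    # Exec requests target the "exec" subresource of a pod.
    if resource == "pods" and ref.get("subresource") == "exec":
        return "exec"
    if resource == "secrets" and event.get("verb") in SECRET_READ_VERBS:
        return "secret-access"
    return None


def find_sensitive_actions(events: List[Dict[str, Any]]) -> List[str]:
    """Build one alert message per sensitive event, naming the user and target."""
    alerts = []
    for event in events:
        kind = classify_audit_event(event)
        if kind is None:
            continue
        ref = event.get("objectRef") or {}
        user = (event.get("user") or {}).get("username", "unknown")
        target = f'{ref.get("namespace", "")}/{ref.get("name", "")}'
        alerts.append(f"{kind} by {user} on {target}")
    return alerts


if __name__ == "__main__":
    # Hypothetical sample events shaped like Kubernetes audit log entries.
    sample = [
        {"verb": "create", "user": {"username": "alice"},
         "objectRef": {"resource": "pods", "subresource": "exec",
                       "namespace": "default", "name": "centos"}},
        {"verb": "get", "user": {"username": "bob"},
         "objectRef": {"resource": "secrets",
                       "namespace": "default", "name": "db-credentials"}},
        {"verb": "list", "user": {"username": "carol"},
         "objectRef": {"resource": "pods", "namespace": "default"}},
    ]
    for message in find_sensitive_actions(sample):
        print(message)
```

In a real setup the messages would be routed to an alert channel rather than printed; the sketch only shows which fields identify who did what.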
We always consider exec and secret access to be unsafe actions. The operators of your cluster need to know who is doing this, who is browsing your secrets or executing commands in your containers, and they should get an alert when someone performs these kinds of actions.
Let's do a live demo. I'm going to exec into a pod, and I'm also going to browse a secret. Here, I'm still using the same cluster. I will exec into a CentOS pod. Let's see what happens. I have already created an alert here, so when someone execs into the pod, I will get the alert. I have configured the alert to go to my DingDing group. As you can see, we have got the alert here. It is the exec action alert. Let's do another one, browsing a secret. I get a secret. I have browsed the secret. Maybe we have some credential information here. I don't want anyone to read this information, so I need to know who is browsing my secret. Let's see if we get the alert. Let's take a look at the Log Service. Actually, the logs for that cluster are gathered in this log project, so I need to take a look here. Someone has browsed my secret. I will refresh my SLS console. As you can see, yeah, it's already here. My DingDing has got the alert. As you can see, it's much safer for operators to protect their resources and their credential information this way. Please try it if you like it.
Besides the dashboards we talked about, the Event Center and the Audit Center, there are a lot of other useful dashboards listed here. I'm not going to talk about each one of them. If you are interested, you can take a look at them yourselves, especially some of the NGINX dashboards. They are very helpful for analyzing your business metrics. Please feel free to use SLS; if you already have Kubernetes and have the Log Service enabled in Alibaba Cloud, this is very helpful for you. Now we are going to take a look at the last thing: the capabilities of ARMS Prometheus.
As you know, if you are going to manage open source Prometheus, you need to install everything yourself. You not only need to maintain your applications, you also need to maintain the Prometheus platform yourself. The stability of the monitoring platform is very important to you. At Alibaba Cloud, we provide a managed Prometheus monitoring tool in the ARMS service. ARMS is the Application Real-Time Monitoring Service. From here you can see there is a Prometheus monitoring entry. If you are going to install the Prometheus components into one of your clusters, you can choose the right region and see whether your clusters have the Prometheus plugins installed or not. For example, we have a cluster named VJK, FLWK. If I click install, I don't need to specify a lot of things; it will install everything for me. I'm not going to install on this cluster because I have several clusters that already have the Prometheus plugins installed. I'm going to show you one of my clusters in China (Hangzhou). In this cluster, I have installed the Prometheus monitoring plugins. You can see how many plugins are enabled in this cluster. If we take a look at, for example, the Kubernetes overview, it will open Grafana for you. As you can see, you have a dashboard that is very familiar to Prometheus users. You also have a lot of other dashboards. Of course, you can also import a dashboard if you already have such configurations.
Today, because of limited time, we are not going to talk too much about the other tools or capabilities. From the slides and the demos, we hope you will like the ecosystem we built for Kubernetes users. I hope you will enjoy your life with cloud native. Thanks for listening. Goodbye.