Now that we have a good understanding of the process of streaming data, let's dive into Cloud Pub/Sub to see how it works. As we begin this topic, keep an open mind to new ways of doing things; Cloud Pub/Sub handles streaming differently from what you may have used in the past.

Cloud Pub/Sub provides a fully managed data ingestion and distribution system, and it can be used for many purposes. In the previous module, we mentioned using Cloud Pub/Sub to ingest and distribute incoming streaming events. Cloud Pub/Sub provides an asynchronous messaging bus, which can hold these events until they are consumed by the respective services for further processing. You can use Cloud Pub/Sub to connect applications within Google Cloud, as well as applications on-premises or in other clouds, to create hybrid data engineering solutions. With Cloud Pub/Sub, the applications do not need to be online and available all the time. The parts don't need to know how to communicate with each other, just with Cloud Pub/Sub, which can help simplify system design. A good example is email, where the sender and receiver do not need to be available at the same time, but the message still goes through, and the receiver ultimately consumes it when it is ready.

Cloud Pub/Sub is a ready-to-use service with nothing to install, and Cloud Pub/Sub client libraries are available in C#, Go, Java, Node.js, Python, and Ruby to help you write your code. These wrap REST API calls, which can be made in any language.

Cloud Pub/Sub has qualities that make it desirable for streaming solutions. First, Cloud Pub/Sub is highly available. Cloud Pub/Sub servers run in most GCP regions around the world, which allows the service to offer fast global data access. Messages are stored in multiple locations for availability and durability. But Cloud Pub/Sub does more to ensure message durability: by default, it will retain your messages for up to seven days. If your systems are down and unable to process messages at the time, you can retrieve them later and continue with your processing.

Finally, Cloud Pub/Sub is highly scalable. One of the early use cases for Pub/Sub at Google was distributing the search engine index around the world, processing internally about 100 million messages per second across the entire infrastructure. Currently, Google indexes the web anywhere from every two weeks at the slowest, to more than once per hour for very popular news sites. So on average, Google is indexing the web up to three times a day. Because Pub/Sub is a serverless service, it provides the necessary infrastructure resources automatically to deliver this kind of scalability.

Cloud Pub/Sub is a HIPAA-compliant service, offering fine-grained access controls and end-to-end encryption. Messages are encrypted in transit and at rest.

These features make Cloud Pub/Sub an ideal solution for ingesting incoming streaming data, but we also need resilience. What happens if your systems get overloaded with large volumes of transactions, like on Black Friday? What you really need is some sort of buffer or backlog, so that you can feed messages to your systems only as fast as they are able to process them. Pub/Sub has this as a built-in capability. This relieves you of having to size the application to handle the highest traffic spike, plus some additional capacity as a safety buffer.
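To make the publisher side concrete, here is a minimal sketch using the google-cloud-pubsub Python client library; the project ID, topic name, and message content are hypothetical placeholders, not values from this course.

```python
from google.cloud import pubsub_v1

# Hypothetical identifiers for illustration only.
project_id = "my-project"
topic_id = "new-hire-events"

# The publisher client wraps the underlying Pub/Sub API calls for you.
publisher = pubsub_v1.PublisherClient()
topic_path = publisher.topic_path(project_id, topic_id)

# Message payloads are bytes; attributes are optional key/value strings.
data = "New employee: Jane Doe".encode("utf-8")
future = publisher.publish(topic_path, data, department="HR")

# publish() is asynchronous; result() blocks until the service acknowledges
# the message and returns its Pub/Sub message ID.
print(f"Published message with ID: {future.result()}")
```

Once published, the message is retained by the service (by default for up to seven days) until a subscription's consumers acknowledge it, which is what provides the buffering described above.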
This is not only wasteful of resources, which must be retained at peak capacity even when they are not being used, but it is also a recipe for a distributed denial-of-service attack, because it creates an upper limit at which the application will cease to behave normally and will exhibit non-deterministic behavior. Instead, you can use Cloud Pub/Sub as an intermediary, receiving and holding data until the application has resources to handle it, either by working through the backlog or by auto-scaling to meet the demand. As we dive deeper into Pub/Sub in the following slides, this will become more obvious.

Now that we've given you a high-level overview, let's see how Pub/Sub works. The model is simple. The story of Cloud Pub/Sub is the story of two data structures: the topic and the subscription. Both the topic and the subscription are abstractions that exist in the Pub/Sub framework, independently of any workers, subscribers, or anything else. The Cloud Pub/Sub client that creates the topic is called the publisher, and the Cloud Pub/Sub client that creates the subscription is called the subscriber. Publishers publish events or messages to a topic to be distributed and used for further processing. To receive messages published to a topic, you must create a subscription to that topic. In this example, the subscription is subscribed to the topic. Only messages published to the topic after the subscription is created are available to subscriber applications. The subscription connects the topic to a subscriber application that receives and processes messages published to the topic. A topic can have multiple subscriptions, but a given subscription belongs to a single topic.

It helps to think of it as an enterprise messaging bus. Let's use the following example. Here, you see there is an HR topic that relates to new-hire events. For example, a new person joins your company, and this notification should allow other applications that need to know about a new user joining to subscribe and get that message. What applications would want to know that a new person joined? One example is the company directory. This is a client of the subscription, also called a subscriber. However, Cloud Pub/Sub is not limited to one subscriber or one subscription. Here, there are multiple subscriptions and multiple subscribers. Maybe the facilities system needs to know about the new employee for badging, and the accounting and provisioning systems need to know for payroll. Each subscription guarantees delivery of the message to its service. These subscriber clients are decoupled from one another and isolated from the publisher. In fact, we will see later that the HR system could go offline after it has sent its message to the HR topic, and the message will still be delivered to the subscribers.

These examples show one subscription and one subscriber, but you can actually have multiple subscribers for a single subscription. In this example, the badge activation system requires a human being to activate the badge. There are multiple workers, but not all of them are available all the time. Cloud Pub/Sub makes the message available to all of them, but only one worker needs to fetch the message and handle it. This is called a pull subscription. There is also what we call a push subscription, and we'll see more about this later.

Now, say a new contractor arrives. Instead of entering through the HR system, they go through the vendor office. The same kinds of actions need to occur for this worker.
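As a hedged sketch of the subscriber side, again assuming the google-cloud-pubsub Python client library and hypothetical project, topic, and subscription names, the following creates a subscription on an existing topic and then pulls and acknowledges a batch of messages, the pull pattern described above.

```python
from google.cloud import pubsub_v1

# Hypothetical identifiers for illustration only.
project_id = "my-project"
topic_id = "new-hire-events"
subscription_id = "badge-activation"

subscriber = pubsub_v1.SubscriberClient()
topic_path = subscriber.topic_path(project_id, topic_id)
subscription_path = subscriber.subscription_path(project_id, subscription_id)

with subscriber:
    # Create the subscription; only messages published after this point
    # are delivered to it.
    subscriber.create_subscription(
        request={"name": subscription_path, "topic": topic_path}
    )

    # Pull a small batch of messages. Multiple workers can pull from the
    # same subscription, but each message is handled by only one of them.
    response = subscriber.pull(
        request={"subscription": subscription_path, "max_messages": 5}
    )

    ack_ids = []
    for received in response.received_messages:
        print(f"Received: {received.message.data.decode('utf-8')}")
        ack_ids.append(received.ack_id)

    # Acknowledge the messages so Pub/Sub does not redeliver them.
    if ack_ids:
        subscriber.acknowledge(
            request={"subscription": subscription_path, "ack_ids": ack_ids}
        )
```

Each additional system in the example (company directory, facilities, accounting) would create its own subscription on the same topic and receive its own copy of every message, while workers sharing one subscription split the messages between them.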
They need to be listed in the company directory. The facilities team needs to assign them a desk. Account provisioning needs to set up their corporate identity and accounts. The badge activation system needs to print and activate their contractor badge. A message can be published by the vendor office to the HR topic. The vendor office and the HR system are entirely decoupled from one another, but can make use of the same company services. The way that Pub/Sub works makes this decoupling possible. In the next section, you will learn more about the different patterns, or flows, of Pub/Sub messaging.