Okay. To get started with this lab, make sure your browser is open to the Google Cloud Platform dashboard. Begin by clicking Activate Google Cloud Shell. It is critical that you have your Cloud Shell environment prepared with the source code and packages needed to execute it. If you recently completed the previous lab, you should already have the code and the packages installed. However, if you find that you're missing the training-data-analyst directory in your Cloud Shell environment, you should stop here and complete the previous lab before moving forward.

If your Cloud Shell environment is set up, you can use the Cloud Shell code editor to open the source code for the Apache Beam pipeline used in this lab. You can find it under the training-data-analyst/courses/data_analysis/lab2/python directory, in the is_popular.py file. There's more code in this file now compared to the previous lab, so next you will look at the code in more detail.

If you scroll down to the body of the main method, notice the input argument for the code. As input, the pipeline takes the Java source code files in the javahelp directory. Also notice that the output of the pipeline is going to be stored in the /tmp directory, with files having output as the prefix by default, but of course it's possible to override that setting.

After the data is read from Google Cloud Storage, the next step in this pipeline is to check for the lines that start with the key term. As you remember from the previous lab, the key term for this pipeline is the import keyword. Next, the pipeline processes the names of the imported packages. Notice that this depends on the packageUse method, which in turn looks at the names of the packages in the import statement and extracts the package name itself, removing the import keyword and the closing semicolon character. Finally, once the package name is found, the splitPackageName function returns the multiple prefixes for each package name. For example, for a package com.example.appname, the function will return the prefixes com, com.example, and com.example.appname. For each one of those prefixes, the method returns a pair with the package prefix and the integer 1 for each occurrence. The occurrences are added together using the CombinePerKey operation with the sum function as the argument. The Top 5 combiner then identifies the five most frequently imported packages. A rough sketch of these helpers and pipeline steps appears at the end of this section.

Next, you can go ahead and execute the is_popular.py file. Once the pipeline is done executing, you can take a look at the output directory, and if you list the output file contents, you can see the most popular packages, specifically org, org.apache, org.apache.beam, and org.apache.beam.sdk.

Notice that in this implementation of the pipeline, it is possible to modify the destination of the output. So, for example, you can override the defaults to ask the pipeline to write the results out to the /tmp directory with myoutput as the prefix. You can run the pipeline again and you'll find new instances of the output. The new output file instances will have the myoutput prefix. All right. That's it for this lab.
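For reference, here is a minimal sketch of the helper logic described above. The names splitPackageName and packageUse mirror the helpers mentioned in the walkthrough, but the exact signatures and details in the lab's is_popular.py may differ, so treat this as an illustration rather than a copy of the file.

def splitPackageName(packageName):
    # Return every dotted prefix of a package name.
    # For example, 'com.example.appname' produces:
    #   ['com', 'com.example', 'com.example.appname']
    result = []
    end = packageName.find('.')
    while end > 0:
        result.append(packageName[0:end])
        end = packageName.find('.', end + 1)
    result.append(packageName)
    return result

def packageUse(line, keyword='import'):
    # Strip the import keyword and the closing semicolon, then emit a
    # (prefix, 1) pair for each prefix of the imported package name.
    start = line.find(keyword) + len(keyword)
    end = line.find(';', start)
    if start < end:
        packageName = line[start:end].strip()
        for prefix in splitPackageName(packageName):
            yield (prefix, 1)

if __name__ == '__main__':
    # Quick check of the example from the walkthrough.
    print(splitPackageName('com.example.appname'))
    # ['com', 'com.example', 'com.example.appname']
    print(list(packageUse('import com.example.appname;')))
    # [('com', 1), ('com.example', 1), ('com.example.appname', 1)]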
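And here is a similarly hedged sketch of the pipeline body itself. The transform labels, the default paths, and the exact Top combiner call are assumptions on my part, and packageUse is repeated from the sketch above so the snippet stands on its own. The overall shape, though, matches what the walkthrough describes: filter to import lines, expand each line into (prefix, 1) pairs, sum the counts per key with CombinePerKey, keep the top five, and write the result to the output prefix.

import apache_beam as beam

def packageUse(line, keyword='import'):
    # Same helper as sketched above: emit (prefix, 1) for each dotted
    # prefix of the imported package name.
    start = line.find(keyword) + len(keyword)
    end = line.find(';', start)
    if start < end:
        parts = line[start:end].strip().split('.')
        for i in range(1, len(parts) + 1):
            yield ('.'.join(parts[:i]), 1)

def run(input_pattern='/path/to/javahelp/*.java', output_prefix='/tmp/output'):
    # Both defaults here are placeholders; the lab's is_popular.py defines
    # its own input and output_prefix arguments.
    with beam.Pipeline() as p:  # DirectRunner by default
        (p
         | 'GetJava' >> beam.io.ReadFromText(input_pattern)
         | 'GetImports' >> beam.Filter(lambda line: line.startswith('import '))
         | 'PackageUse' >> beam.FlatMap(packageUse)
         | 'TotalUse' >> beam.CombinePerKey(sum)
         | 'Top5' >> beam.combiners.Top.Of(5, key=lambda kv: kv[1])
         | 'Write' >> beam.io.WriteToText(output_prefix))

if __name__ == '__main__':
    run()

Running the sketch with output_prefix set to, say, /tmp/myoutput is the same idea as overriding the lab pipeline's output destination: the result files simply pick up the new prefix.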