You've annotated your genome in PATRIC, and now it's time to look at it. I've shown you in other videos how you can go into the jobs monitor and find your jobs. You could also go into the Workspaces, go into your Home directory, and find where you stored your job, or you could go directly to My Jobs. Let's click on that. This takes you directly to your Jobs page. I've submitted a lot of jobs in PATRIC, and if I want to look only at the annotation jobs, I can click here on the "Down arrow" next to All Services and click on "Annotation". This shows me all the annotation jobs I submitted in PATRIC. If I click on the row for any job, you'll notice the vertical green bar is suddenly populated with possible downstream functions. For any job you have an issue with in PATRIC, whether it failed or you don't understand something about it, you can click on the "Report Issue" icon and put in some information here about what your question is. This will give us the information about this job, including the job ID. You can submit that, and then the developers will be able to look at it and figure out what went wrong.
This job didn't fail, so I'm going to click the "View" icon. This takes me to the landing page for this job. Notice that across the top you have a breadcrumb that tells you where the job is placed in your Home directory. You also have some hyperlink icons here that we'll discuss in another video. Up here is a table that shows you the results of your annotation job. It tells you it was a genome submitted under these parameters, that it had this job ID, that it started and ended at these times, and that it took this long to run. Annotation jobs generally run pretty quickly. You can click on "Parameters" and see which parameters you selected when you submitted the annotation job. Down here are a variety of folders and files that come as part of any annotation job in PATRIC. We've already discussed the genome report.
Let's talk a little bit more about the other files that come when you submit an annotation job. When I select a row that contains a specific file, you'll notice it tells me what the size is, who owns it, whether I've shared this genome, and the time I created it. If I want to look at this particular file, I click the "Download" icon, and that will download it onto my computer, and I can view it by clicking here. This is what a contig file looks like, a FASTA file. It lists each of the contigs. Each contig starts with a header line marked by this greater-than sign (>), followed by the sequence. That's an important file for many different downstream functions.
You may like your data in different ways. You may have different tools that you want to examine this data in, and we try to wrap it up for you in any kind of format that you might need. The EMBL file contains the sequence data of the annotated genome. It's actually an EMBL dump of the annotated genome. The feature_dna.fasta file contains all of the feature sequences, or genes, of the genome in DNA FASTA format. If you want to see what that looks like, the first row includes the gene ID. This is the unique identifier for the gene, and it starts at the top of the genome with the first gene annotated. Then comes the functional description, which can be really long. This is what this gene is named in PATRIC. After that come the genome it came from and the genome ID, followed by the sequence. Each gene follows in the same format, starting with 1, 2, 3. The protein FASTA file looks just the same, except instead of DNA sequence, what you're seeing is the amino acid sequence for each of those genes.
The features.txt file is a tab-delimited table listing all the features of the genome. All of our folders have some good information, but this is the one place where you can get a certain type of information. For each feature, it contains the PATRIC ID, the location string (where it's located on that contig, with the start and stop locations), and the feature type.
This one is a coding sequence. Then comes the functional assignment, which is what the gene is named. If there were any aliases or alternate IDs for it, they would be listed next. For protein-coding genes, it has the protein MD5 checksum. This is the only place you can find this, so this was a good one for us to open.
The GenBank format is pretty similar to the EMBL format, but let's look at that real quick. For anybody who's downloaded anything from GenBank, this is what you're pretty used to seeing. Next is the genome file. I'm not going to open this, because the genome file is a JSON-formatted file. JSON-formatted files have a lot of information, but they're designed for computers to read and not necessarily for the human eye. It contains the genome typed object in JSON format, encapsulating all the data from the annotated genome. Then we have the General Feature Format, GFF. This lists all the features of the genome in the General Feature Format. If you want to see what that looks like, you can download it. Some of these are just interesting to look at, but it's just another file of data.
The merged GenBank format has all the contigs combined into a single GenBank record. In this format, the file can be opened in Artemis. For those of you who like that particular tool, you would need to get the merged GenBank format. The tar.gz file is a collection of all the files wrapped up into one single file for easy storage. It's got everything on this page in it. The text format contains all the information for all the genes, including sequences, in text format, so let's quickly look at what that looks like, in case you want to look at something like that. It's got all the information here, including the nucleotide and amino acid sequences that you can see here. We also provide that in an Excel format that you can see here.
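If you'd rather open the contig FASTA file and the features.txt table with a script instead of a text editor or Excel, here is a minimal Python sketch of how that might look. It is only an illustration and not a PATRIC tool: the function names `read_fasta` and `read_feature_table`, the sample contigs, and the column headers in the sample table are all made up, so check them against the files you actually downloaded.

```python
import csv
import io


def read_fasta(handle):
    """Return a dict mapping each header line (without '>') to its sequence."""
    records = {}
    header = None
    chunks = []
    for line in handle:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            # A new header closes the previous record, if there was one.
            if header is not None:
                records[header] = "".join(chunks)
            header = line[1:]
            chunks = []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        records[header] = "".join(chunks)
    return records


def read_feature_table(handle):
    """Read a tab-delimited table into a list of dicts keyed by its header row."""
    return list(csv.DictReader(handle, delimiter="\t"))


# Made-up sample data standing in for downloaded files (not real PATRIC output).
sample_fasta = io.StringIO(">contig_1\nATGC\nGGTA\n>contig_2\nTTAA\n")
contigs = read_fasta(sample_fasta)

# Hypothetical column headers; the real features.txt headers may differ.
sample_table = io.StringIO(
    "feature_id\tlocation\ttype\tfunction\n"
    "gene_1\tcontig_1:1-8\tCDS\thypothetical protein\n"
)
features = read_feature_table(sample_table)

for name, sequence in contigs.items():
    print(name, len(sequence))
print(features[0]["type"])
```

To use it on a real download, replace the `io.StringIO` samples with `open("your_file_name")`, using whatever name the file has on your computer.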
The genome quality details file has the raw output of the genome evaluation tool, which we've discussed in the genome report, where we pointed you to the publication that describes it. Then you'll notice there's one additional folder that says Load Files, and if you double-click on that, you'll notice that all of the files within this folder end in JSON. These are, once again, those types of files. They store simple data structures and objects in JavaScript Object Notation, which is a standard data interchange format. It's primarily used for transmitting data between a web application and the server. It's not recommended for viewing, because it's a computer-readable file, but the file is downloadable and readable. I'm not going to open all of this, but this is all here for you. These are the types of files that come with any genome that you have annotated in PATRIC. They can be used by a variety of different tools, or you can see the data in PATRIC itself, which we'll talk about next time.
Now for the third assignment, which comes right about in the middle of the whole annotation series. You've completed your annotation job, and there are a couple of things I want you to do this time. I want you to download files and take a look at them. The downloadable files are available within the annotation job in PATRIC. I also want you to go into the Genome record for each of the annotation jobs that you submitted and enter the data on that. What do I mean by the downloadable files? The ones that I've starred here are the ones that I think you should download and take a look at. The ones that end in TXT you can view within the PATRIC page. The other ones you'll need to download and open with either a text editor or Excel. I particularly like these two files, the .txt and the .xls files with all of the information on all the gene annotations. They could very well be my favorite of these files that you download. I'm not asking you to download the tar.gz file, unless you really want to.
What it is is everything that's in here. There's also the alignments file, which just shows you the alignment of one gene. If you want to download that, knock yourself out. Go ahead. I find it a little bit boring. I'm not asking you to do anything with the JSON files, because those are computer-readable files and I find them quite boring. But if you're into that kind of thing, look at them and have fun with them.
But what's even more interesting to me, if anything could be more interesting, is this spreadsheet that I created for the annotation jobs from each of those different assemblies. The headers across the top come from that Genome table. I just copied that, pasted it into Excel, and then transposed it. It'll be so easy for you to do that. You can fill in all the rows, and then you'll have a good sense of what looks good, what doesn't, and how that affected the downstream annotation. I think this can be a really good exercise for you to start thinking about what's happening during the assembly and the annotation process, and when things are different in an assembly, what is that difference? Are they genes that matter or not? We'll be exploring that in later exercises. Good luck. Bye.
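For anyone who would rather do the copy, paste, and transpose step from the spreadsheet exercise in a script instead of Excel, here is a minimal Python sketch. It assumes you have pasted the copied table into a tab-delimited string; the function name `transpose_table`, the column headers, and the values in the sample are made up for illustration and are not taken from a real Genome table.

```python
import csv
import io


def transpose_table(text, delimiter="\t"):
    """Turn the rows of a delimited table into columns, and columns into rows."""
    rows = list(csv.reader(io.StringIO(text), delimiter=delimiter))
    if not rows:
        return []
    # Pad short rows so every row has the same number of cells.
    width = max(len(row) for row in rows)
    padded = [row + [""] * (width - len(row)) for row in rows]
    return [list(column) for column in zip(*padded)]


# Made-up sample: a header row followed by one row per annotation job.
pasted = (
    "Genome Name\tContigs\tCDS\n"
    "assembly_A\t52\t4310\n"
    "assembly_B\t87\t4402\n"
)
transposed = transpose_table(pasted)

# Each printed line is one former column header followed by its values.
for row in transposed:
    print("\t".join(row))
```

After transposing, each header becomes the first cell of its own row, which matches the layout of the spreadsheet described above, with one column per assembly.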