In the honors lessons in the previous weeks, you learned about Beeline and Impala Shell. You learned how to use these two command line tools interactively and non-interactively. In the videos in this honors lesson, you'll learn about some additional capabilities of these command line tools and you'll see how you can call them from shell scripts. You'll also learn about some different options for integrating Hive and Impala with scripts and applications. In this video, I'll show you how to use a feature in Beeline and Impala shell called variable substitution. This feature enables you to parameterize queries. In other words, it enables you to take names or values that are hard-coded into your SQL statements and replace them with variables. This feature is implemented differently in Hive and Impala and the syntax differs slightly between them. First, I'll describe variables substitution in Hive, I'll use a couple different examples to demonstrate different applications of it. One situation where variables substitution is useful is when you have two or more statements in a SQL script and there's a particular literal value that's used in multiple places in the statements. In the example shown here, there are two select statements in the SQL script file game_prices.sql and the literal string, Monopoly, is used in both of these statements. The trouble with this is that if you want to change this literal string to something else, say from Monopoly to Clue, then you need to change it in more than one place. With only two places it's not so bad, but imagine there were dozens of places where you needed to change the value that would be cumbersome. Variable substitution provides a solution to this. At the top of the SQL scripts, you can add a set statement to assign a value to a variable. In this example, the variable named game is assigned the value Monopoly. The syntax of the set statement is unlike other SQL syntax, it begins with the keyword SET followed by a space, then the word hivevar and a colon. After the colon is the name of the variable you want to assign, in this example, it's game. Then there's an equal sign then the value you want to assign to the variable. You should not use quotes around the value in a set statement even if it's a character string. Elsewhere in SQL, you do need to use quotes around literal strings but the SET statement is special. You can use spaces on either side of the equal sign if you want, Hive will trim any white-space from the beginning and end of the value after the equal sign. White-space inside the value like spaces between words is retained. Finally, you terminate the set statement with a semicolon. On the lines below the SET statement you can use this variable in your SQL statements, to do this you use the syntax shown here, ${hivevar:game}. Then when you execute the statements in the SQL file using the Beeline command with the dash f option. Hive replaces each instance of this dollar sign curly brace hivevar placeholder with the value that's assigned to the variable. In other words, it substitutes the assigned value in each of these places. This example assigns just one variable then uses it in two places, but you can use more than one set statement to assign multiple variables with different names and you can use those variables in as many places as you need following the set statements. A different situation where variable substitution is useful is when you have a SQL statement in a file and you want to run it many times but with a different literal value substituted in each time. In the example shown here, there is just one SELECT statement that returns the hexadecimal color code for the crayon with the specified name, it's red in this case. If you wanted to run the same query for many different colors it would be cumbersome to keep editing the SQL script file to change the name of the color in the where clause. Variable substitution provides a solution to this. First, you replace the hard-coded value in the SQL script with a variable using the same dollar sign curly brace hivevar syntax I described earlier. The variable is named color in this example. Then instead of using a SET statement to assign a value to the variable, you use a command line argument to assign it. At the command line, you use the beeline command with the dash f option to execute the statement in the SQL file. In this example, it's hex_color.sql. Right before the dash f option, you use dash dash hivevar followed by the name and the value of the variable. The syntax is dash dash hivevar a space, then the name of the variable an equals sign and the value of the variable enclosed in quotes. Unlike with the SET statement you should use quotes around the value in this case, so the operating system shell passes it correctly to Beeline. When you run the beeline command, Hive replaces the dollar sign curly brace hivevar placeholder with the value specified on the command line. In this example, there was just one statement in the SQL file but you can have more than one. You can also use multiple different variables. To do that you need to specify dash dash hivevar on the command line once for each name-value pair as shown here. This example returns the name of the crayon that has the specified red green and blue values. So that's how you use variable substitution with Beeline. For Impala shell it's very similar, there's actually only one difference, with Impala shell you use var instead of hivevar. Aside from that, everything works the same as I described with Beeline. When choosing a name for a variable, stick to the same rules for valid identifiers that I introduced earlier in the course and you'll be safe. You should use all lowercase letters to avoid any questions about case sensitivity. Also, whenever you are assigning literal string values that include apostrophes or quotation marks you should always escape them with a backslash. You should do this regardless of whether it's Beeline or Impala shell and regardless of whether you are assigning the variable with a SET statement or on the command line.