Bash User Input Validation
If you are writing a Bash shell script that depends on careful user input, you are probably looking for a way to validate or sanitize that input before using it in commands or subroutines. Here’s an example shell script that reads user input into a variable, then echoes it through a filter and stores the sanitized result in a new variable. The new variable is then used for whatever the script needs to do, in this case simply displaying the new value.
#!/bin/bash
# -r keeps backslashes in the input literal instead of treating them as escapes
read -r -p "Enter variable: " VAR_INPUT
# Sanitize input and assign to new variable
export VAR_CLEAN="$(echo "${VAR_INPUT}" | tr -cd '[:alnum:] [:space:]')"
echo "New Variable: ${VAR_CLEAN}"
Notice that we use tr with the -c (complement) and -d (delete) options to delete everything except alphanumeric and whitespace characters. Keep in mind that [:space:] matches tabs and newlines as well as plain spaces. You can also perform further manipulation with any other command that comes to mind. For example, if you would also like to limit the result to 10 characters, use the cut command.
export VAR_CLEAN="$(echo "${VAR_INPUT}" | tr -cd '[:alnum:] [:space:]' | cut -c -10)"
I like using tr in this fashion because, instead of trying to exclude specific characters, you enforce a deny-all policy and then allow only what you want.
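If the same filter is needed in several places, the tr and cut pipeline can be wrapped in a small function. This is only a sketch under a few assumptions: the function name sanitize_input and its optional length argument are hypothetical and not part of the original script, and the input is assumed to be plain ASCII, since tr and cut may not handle multibyte characters on every system.

```shell
#!/bin/bash

# sanitize_input is a hypothetical helper name, not from the original script.
#   $1: the raw string to clean
#   $2: maximum length of the result (optional, defaults to 10)
sanitize_input() {
    local raw="$1"
    local max_len="${2:-10}"
    # tr -cd deletes every character that is NOT alphanumeric or whitespace;
    # cut then keeps only the first max_len characters of each line.
    printf '%s' "$raw" | tr -cd '[:alnum:] [:space:]' | cut -c "1-${max_len}"
}

# Example with a fixed string standing in for user input
VAR_INPUT='rm -rf /; echo $HOME'
VAR_CLEAN="$(sanitize_input "$VAR_INPUT" 10)"
echo "New Variable: ${VAR_CLEAN}"
```

Using printf '%s' rather than echo avoids surprises when the input starts with a dash or contains backslashes.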
As one of our readers mentioned, there is an even simpler method using only Bash search and replace! This eliminates the need for the execution of tr. In the following example, we sanitize the input allowing for only alphanumeric characters and spaces. I also show how to trim the string length to a maximum character limit of 10.
#!/bin/bash
# -r keeps backslashes in the input literal instead of treating them as escapes
read -r -p "Enter variable: " VAR_INPUT
# Sanitize input and assign to new variable
export VAR_CLEAN_1="${VAR_INPUT//[^a-zA-Z0-9 ]/}"
echo "New Variable 1: ${VAR_CLEAN_1}"
# Sanitize input, assign to new variable but limit it to 10 characters
export VAR_CLEAN_2="$(echo "${VAR_INPUT//[^a-zA-Z0-9 ]/}" | cut -c -10)"
echo "New Variable 2: ${VAR_CLEAN_2}"
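The cut call in the second assignment can itself be replaced with Bash substring expansion, so the whole step runs without any external command. The following is a minimal sketch: the function name sanitize_bash and its optional length argument are hypothetical, and it assumes ASCII input, because in some locales the ranges a-z and A-Z may match more than the plain ASCII letters.

```shell
#!/bin/bash

# sanitize_bash is a hypothetical helper name, not from the original script.
#   $1: the raw string to clean
#   $2: maximum length of the result (optional, defaults to 10)
sanitize_bash() {
    local raw="$1"
    local max_len="${2:-10}"
    # ${var//pattern/} removes every match of the pattern; here the pattern
    # is any single character that is not a letter, a digit, or a space.
    local clean="${raw//[^a-zA-Z0-9 ]/}"
    # ${var:offset:length} keeps at most max_len characters from the start.
    printf '%s\n' "${clean:0:$max_len}"
}

# Example with a fixed string standing in for user input
VAR_INPUT='Hello, World! $(reboot)'
VAR_CLEAN_2="$(sanitize_bash "$VAR_INPUT")"
echo "New Variable 2: ${VAR_CLEAN_2}"
```

Because both steps are parameter expansions, no subshell pipeline is needed for the cleaning itself; the command substitution in the example is only there to capture the function's output.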
For more information, be sure to check out the man pages for tr and take a look at the Advanced Bash-Scripting Guide. Additional comments and ideas welcome!
Comments
Amarendra said,
Interesting, and neat, especially the white-listing part.
al said,
I had something similar to complete not too long ago. I found it very hard to work backwards through the last characters in order to edit out spaces. Example: a 10-character password al--------, where “-” stands for a space. To check for that and truncate it to just “al” instead of “al” followed by trailing spaces, what approach would you use?
gmendoza said,
Hi there. The example in the post explains that alphanumeric characters and spaces are allowed. Simply omit the [:space:] class, and you’ll be left strictly with alphanumeric characters. For example:
tr -cd '[:alnum:]'
al said,
I actually did that. But please consider the following: someone enters the username “al_the_on___the_sea”. I wouldn’t want to shrink it to “altheon” after cutting to 10 characters, but would rather have “al_the_on”. This would make it 9 characters, but if the last space were left in, as in “al_the_on_”, such a name would be difficult for a user to work with. I tried different ways, but nothing easy came about. If you have some sort of solution, I would be all ears. Also, from a programming perspective, one would have to check from the back of the string, moving forward until the first alphanumeric character was found.
gmendoza said,
Easy… use sed to strip beginning and trailing spaces:
tr -cd '[:alnum:] [:space:]' | sed -e 's/^[ ]*//' -e 's/[ ]*$//'
al said,
Thanks a bunch – I’ll remember this one.
buddyh said,
As a learning admin this is great info. I thought there was a way to limit the input to a specific set of characters. I just need to have the user input a Y or N in either upper or lower case and reject any other entry. Thinking of using a while loop till a correct char is entered as an alternative.
Tx in advance
Avery said,
any clues for using tr to also allow utf-8 characters – the alnum doesn’t seem to okay things like… ñáéíóúçäüö etc…
gmendoza said,
Apparently tr does not support multibyte characters (yet). Found this post with comments that suggest support for UTF-8 and UTF-16 is coming. See http://tinyurl.com/mqsb69 for more details.
Pawan Jaitly said,
A bash way of doing this:
DIRTY="a dirty string"
CLEAN=${DIRTY//[^a-zA-Z0-9]/}
In this example the dirty characters are spaces. Modify to taste.
gmendoza said,
@Pawan Jaitly: Thanks for your input. I’ll add the Bash method as it certainly is the simplest method!