Validating Data Types

Perl is not a strongly typed language: a scalar can hold a string, a number, or a reference. Sometimes, though, you need to know what it contains, for example when you are communicating with a strongly typed system such as a relational database, or when you just want to make sure the user entered a number for their age rather than "old enough".

The easiest way to do this is with regular expressions. There are two main approaches:

  1. Positive assertion. Use a regular expression that tests if the data matches the desired pattern. For example:

    Is it a non-negative integer?
    /^\d+$/
    Is it a non-negative decimal number?
    /^\d+(\.\d+)?$/
    Is it a phone number?
    /^\D*\d\d\d\D+\d\d\d\D+\d\d\d\d\D*$/

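    Note that each one starts with ^ and ends with $. Without those anchors, you wouldn’t be asking "is it …?"; you would be asking "does it contain …?", which is a different question entirely. For example, the unanchored pattern /\d+/ will match "over 30" for a person’s age. (In Perl, $ also matches just before a trailing newline; use \z instead if the value must end exactly there.)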

  2. Negative assertion. Test for the thing you do not want. For example:

    Is it not an integer?
    /\D/
    Is it not blank?
    /\S/

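    Here, a single character decides the answer: one non-digit disqualifies the value as an integer, and one non-whitespace character shows that it is not blank. This kind of test can be faster, because the regex engine stops looking as soon as it finds a character that is not a digit or not whitespace (in these examples, respectively). Note that an empty string contains no non-digit at all, so pair /\D/ with a blank test if empty input is not acceptable.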
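
The two approaches in the list above can be tried side by side. The following is a minimal sketch, not a definitive implementation: the subroutine names (is_integer, is_number, has_non_digit, is_not_blank) are made up for illustration, and only the patterns themselves come from the text.

```perl
use strict;
use warnings;

# Positive assertions: the whole value must match the pattern.
sub is_integer {
    my ($value) = @_;
    return $value =~ /^\d+$/ ? 1 : 0;
}

sub is_number {
    my ($value) = @_;
    return $value =~ /^\d+(\.\d+)?$/ ? 1 : 0;
}

# Negative assertions: look for a single deciding character.
sub has_non_digit {
    my ($value) = @_;
    return $value =~ /\D/ ? 1 : 0;
}

sub is_not_blank {
    my ($value) = @_;
    return $value =~ /\S/ ? 1 : 0;
}

# Try a valid age, a "contains a number" age, and an empty value.
for my $age ('30', 'over 30', '') {
    printf "%-9s integer=%d non_digit=%d not_blank=%d\n",
        "'$age'", is_integer($age), has_non_digit($age), is_not_blank($age);
}
```

Here "over 30" fails is_integer because of the anchors, while the empty string has no non-digit and so is only caught by is_integer or is_not_blank.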

Both kinds of assertion are useful, and which one to use depends largely on your specific needs. A negative assertion for a phone number would be tricky, if not downright impossible, while a test for "anything but X" is most naturally written as a negative assertion.
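
As a rough illustration of that trade-off, the sketch below uses the phone-number pattern from the text as a positive assertion and a made-up "anything but X" rule as a negative one. The subroutine names and the username rule (no whitespace, no colon, not empty) are assumptions for the example, not something the text prescribes.

```perl
use strict;
use warnings;

# Positive assertion: the value must have the shape of a phone number.
# The pattern is the one given in the text; it requires at least one
# non-digit separator between the groups of digits.
sub looks_like_phone {
    my ($value) = @_;
    return $value =~ /^\D*\d\d\d\D+\d\d\d\D+\d\d\d\d\D*$/ ? 1 : 0;
}

# Negative assertion: hypothetical rule that a username may be anything
# except empty, and may not contain whitespace or a colon.
sub is_valid_username {
    my ($value) = @_;
    return 0 if $value eq '';
    return $value =~ /[\s:]/ ? 0 : 1;
}

for my $phone ('(555) 123-4567', '5551234567') {
    printf "%-16s phone=%d\n", $phone, looks_like_phone($phone);
}
for my $name ('alice', 'bad name', 'root:x') {
    printf "%-16s username=%d\n", $name, is_valid_username($name);
}
```

Listing every acceptable username as a positive pattern would be awkward, whereas naming the few forbidden characters is short. The opposite holds for the phone number, where the required shape is easier to state than everything that would disqualify it.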
