    PHP Forms and Regular Expressions

    Author: 2009-03-10 09:52:44 From:

    A regular expression describes a pattern that text is compared against. PHP offers two sets of functions for regular expressions: POSIX style and Perl style. The two have slightly different syntax, and this post gives a basic overview of the POSIX one.

    A regular expression (regex for short) is nothing more than a sequence of characters (called a pattern) that is compared against the text we search in. Patterns contain a combination of metacharacters and literals. Metacharacters (also called operators) define how literals (also called constants) are treated when the pattern is evaluated against the text. For example, the POSIX pattern [a-z0-9], which matches one lowercase letter or one digit from 0 to 9, has two metacharacters (the opening and the closing square bracket) and two literal ranges (a-z and 0-9, also called classes). In other words, a literal stands for the character itself, whilst a metacharacter is a control character.

    Why is it so important to distinguish between metacharacters and literals? The reason is that if you need to use a metacharacter in a pattern as a literal, you must precede it with \ (backslash); this is usually called escaping. For example, if you need a dot in the pattern and don’t want it to act as a control character meaning “any character”, you have to escape it with a backslash (see the table below for an example).
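
    The difference between an unescaped and an escaped dot can be sketched as follows. One assumption to flag: ereg() is not available on PHP 7.0 and later, so this sketch uses preg_match() as a stand-in, with the pattern wrapped in / delimiters. The dot behaves the same way in both styles; the variable names are only illustrative.

```php
<?php
// preg_match() returns 1 when the pattern matches and 0 when it does not.

// Unescaped dot: any single character may stand between "a" and "c".
$any_abc = preg_match('/^a.c$/', 'abc');   // 1
$any_dot = preg_match('/^a.c$/', 'a.c');   // 1

// Escaped dot: only a literal "." may stand between "a" and "c".
$lit_abc = preg_match('/^a\.c$/', 'abc');  // 0
$lit_dot = preg_match('/^a\.c$/', 'a.c');  // 1

// A dot inside a list class is also taken literally.
$cls_abc = preg_match('/^a[.]c$/', 'abc'); // 0

printf("%d %d %d %d %d\n", $any_abc, $any_dot, $lit_abc, $lit_dot, $cls_abc);
// prints "1 1 0 1 0"
```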

    The following table lists the POSIX metacharacters:

    Metacharacter  Description                                                                Example
    ^              matches the starting position within the string                            ^(([A-Za-z0-9_-]+)…
    .              matches any one character                                                  a.c matches “abc”
    *              matches the preceding element zero or more times                           ab*c matches “ac”, “abc”, “abbbc”
                                                                                              [xyz]* matches “”, “x”, “y”, “z”, “zx”, “zyx”, “xyzzy”
    +              matches the preceding element one or more times                            ba+ matches “ba”, “baa”, “baaa”
    ?              matches the preceding element zero or one time                             ba? matches “b” or “ba”
    {m,n}          matches the preceding element at least m and not more than n times         a{3,5} matches only “aaa”, “aaaa”, and “aaaaa”
    ()             defines a marked subexpression                                             ^(([A-Za-z0-9_-]+)[.]([A-Za-z0-9_-]+))+$
    []             defines a class of characters                                              [0-9] matches any one digit (range class)
                                                                                              [a.c] matches only “a” or “.” or “c” (list class)
    [^]            matches a single character that is not contained within the brackets       [^abc] matches any char other than “a”, “b”, or “c”
                                                                                              [^a-z] matches any single char that is not a lowercase letter from “a” to “z”
    $              matches the ending position of the string or the position                  …[.]([A-Za-z0-9_-]+))+$
                   just before a string-ending newline
    |              matches either the expression before or the expression after the operator  abc|def matches “abc” or “def”
    \              changes metacharacter to literal                                           (.+) matches any expression containing at least one arbitrary character
                                                                                              (\.+) matches any expression containing at least one dot character
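
    A few of the metacharacters can be tried out with a small table-driven check. This is a minimal sketch under two assumptions: preg_match() stands in for ereg(), which is missing from PHP 7.0 and later, and every pattern is anchored with ^ and $ so that the whole string has to match (an unanchored pattern would also match anywhere inside a longer string).

```php
<?php
// Each row: POSIX-style pattern, subject string, expected outcome.
$checks = array(
    array('^ab*c$',      'ac',     true),
    array('^ab*c$',      'abbbc',  true),
    array('^ab*c$',      'adc',    false),
    array('^ba+$',       'baaa',   true),
    array('^ba+$',       'b',      false),
    array('^ba?$',       'b',      true),
    array('^ba?$',       'baa',    false),
    array('^a{3,5}$',    'aaaa',   true),
    array('^a{3,5}$',    'aa',     false),
    array('^(abc|def)$', 'def',    true),
    array('^(abc|def)$', 'abd',    false),
    array('^[^abc]$',    'd',      true),
    array('^[^abc]$',    'a',      false),
);

$failures = 0;
foreach ($checks as $check) {
    list($pattern, $subject, $expected) = $check;
    // None of the patterns contains "/", so it is safe as the delimiter.
    $matched = preg_match('/' . $pattern . '/', $subject) === 1;
    if ($matched !== $expected) {
        $failures++;
    }
    printf("%-12s %-6s %s\n", $pattern, $subject, $matched ? 'match' : 'no match');
}
printf("%d unexpected result(s)\n", $failures);
```

    Adding rows to $checks is an easy way to test your own reading of a metacharacter before using it in a form check.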

    The following table lists the POSIX character classes, which make patterns more comfortable to write:

    Class       Description                                                  Alternative
    [:alpha:]   uppercase and lowercase letters                              [A-Za-z]
    [:alnum:]   uppercase and lowercase letters and numbers                  [A-Za-z0-9]
    [:cntrl:]   control characters like TAB, ESC or Backspace                -
    [:digit:]   numbers from zero to nine                                    [0-9]
    [:graph:]   ASCII (33-126) printable characters                          -
    [:lower:]   lowercase letters                                            [a-z]
    [:punct:]   punctuation characters: ~`!@#$%^&*()-_+={}[]:;'<>,.?/        -
    [:upper:]   uppercase letters                                            [A-Z]
    [:space:]   whitespace characters like space, newline, carriage return   -
    [:xdigit:]  hexadecimal digits                                           [a-fA-F0-9]

    A class name is only valid inside a bracket expression, so the pattern equivalent of [A-Za-z] is written [[:alpha:]].
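
    The named classes go inside a bracket expression, which gives the doubled brackets shown in this sketch. As an assumption for running it on current PHP, preg_match() is used in place of ereg(); the Perl-style engine accepts the same POSIX class names. The variable names are only illustrative.

```php
<?php
// A class name sits inside a bracket expression: [[:digit:]], not [:digit:].
$is_digits = preg_match('/^[[:digit:]]+$/', '2009');     // 1
$is_hex    = preg_match('/^[[:xdigit:]]+$/', 'ff00AA');  // 1
$not_hex   = preg_match('/^[[:xdigit:]]+$/', 'ff00GG');  // 0
$has_space = preg_match('/[[:space:]]/', 'two words');   // 1

// Classes can be combined with other members in the same brackets:
// a letter or underscore first, then letters, digits or underscores.
$identifier = preg_match('/^[[:alpha:]_][[:alnum:]_]*$/', 'user_name1'); // 1
$bad_start  = preg_match('/^[[:alpha:]_][[:alnum:]_]*$/', '1user');      // 0

printf("%d %d %d %d %d %d\n",
    $is_digits, $is_hex, $not_hex, $has_space, $identifier, $bad_start);
// prints "1 1 0 1 1 0"
```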

    This table lists the PHP POSIX regex functions. Note that all of them were deprecated in PHP 5.3.0 and removed in PHP 7.0.0, so they are only available on older PHP versions.

    Prototype
        Description

    int ereg (string $pattern, string $string [, array &$regs])
        Searches a string for matches to the regular expression given in pattern in a case-sensitive way.

    int eregi (string $pattern, string $string [, array &$regs])
        Identical to ereg() except that it ignores case distinction when matching alphabetic characters.

    string ereg_replace (string $pattern, string $replacement, string $string)
        Scans string for matches to pattern, then replaces the matched text with replacement.

    string eregi_replace (string $pattern, string $replacement, string $string)
        Identical to ereg_replace() except that it ignores case distinction when matching alphabetic characters.

    array split (string $pattern, string $string [, int $limit])
        Splits a string into an array by regular expression.

    array spliti (string $pattern, string $string [, int $limit])
        Identical to split() except that it ignores case distinction when matching alphabetic characters.

    string sql_regcase (string $string)
        Creates a regular expression for a case-insensitive match.
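
    For readers on PHP 7.0 or later, where the POSIX functions no longer exist, the following sketch shows the Perl-style calls that cover the same jobs. It is an illustration of the mapping, not a drop-in conversion of any particular script; the sample strings and variable names are made up for the example.

```php
<?php
// ereg($pattern, $string, $regs) -> preg_match() with a $matches array.
// $matches[0] holds the whole match, $matches[1..3] the marked subexpressions.
preg_match('/^([0-9]{4})-([0-9]{2})-([0-9]{2})$/', '2009-03-10', $matches);
$year = $matches[1];                                      // "2009"

// eregi() -> the "i" modifier makes the match case-insensitive.
$case_insensitive = preg_match('/^php$/i', 'PHP');        // 1

// ereg_replace() -> preg_replace().
$masked = preg_replace('/[0-9]/', '#', 'tel 555 1234');   // "tel ### ####"

// split() -> preg_split().
$parts = preg_split('/[,;] */', 'a, b;c');                // array("a", "b", "c")

echo $year, "\n", $masked, "\n", implode('|', $parts), "\n";
```

    The main syntactic difference is that Perl-style patterns are wrapped in delimiters (here /), with modifiers such as i placed after the closing delimiter.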

    Regular expressions are very useful when we need to check user input. If you have a contact form on your site with a mandatory e-mail address field, how would you check whether the string the user entered has a valid e-mail format? Use a regular expression match! Here are some examples for better understanding:

    • ^(([A-Za-z0-9_-]+)[.]([A-Za-z0-9_-]+))+$ : matches a hostname expression (hostname.example.com)
    • ^([0-9]{1,3})\.([0-9]{1,3})[.]([0-9]{1,3})\.([0-9]{1,3})$ : matches an IP address (192.168.10.122)
    • ^([A-Za-z0-9._-]+)@([A-Za-z0-9._-]+)[.]([a-z]{2,4})$ : matches an e-mail address (mailbox@example.com)
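
    The three patterns above can be exercised against a handful of inputs to see what they accept and reject. This is a sketch with stated assumptions: preg_match() replaces ereg() so that it runs on current PHP, and check_pattern() is a hypothetical helper written for this example.

```php
<?php
$patterns = array(
    'host'  => '^(([A-Za-z0-9_-]+)[.]([A-Za-z0-9_-]+))+$',
    'ip'    => '^([0-9]{1,3})\.([0-9]{1,3})[.]([0-9]{1,3})\.([0-9]{1,3})$',
    'email' => '^([A-Za-z0-9._-]+)@([A-Za-z0-9._-]+)[.]([a-z]{2,4})$',
);

// Hypothetical helper: true when $subject matches the named pattern.
function check_pattern($patterns, $type, $subject)
{
    // None of the patterns contains "/", so it is safe as the delimiter.
    return preg_match('/' . $patterns[$type] . '/', $subject) === 1;
}

$results = array(
    'host ok'     => check_pattern($patterns, 'host', 'hostname.example.com'), // true
    'host no dot' => check_pattern($patterns, 'host', 'localhost'),            // false
    // false: a middle label of one character cannot be split between two repetitions
    'host a.b.c'  => check_pattern($patterns, 'host', 'a.b.c'),
    'ip ok'       => check_pattern($patterns, 'ip', '192.168.10.122'),         // true
    // true: the pattern checks the number of digits, not the 0-255 range
    'ip range'    => check_pattern($patterns, 'ip', '999.999.999.999'),
    'ip short'    => check_pattern($patterns, 'ip', '192.168.10'),             // false
    'email ok'    => check_pattern($patterns, 'email', 'mailbox@example.com'), // true
    'email no @'  => check_pattern($patterns, 'email', 'mailbox.example.com'), // false
);

foreach ($results as $label => $matched) {
    printf("%-12s %s\n", $label, $matched ? 'match' : 'no match');
}
```

    The results suggest treating these patterns as format checks only: the IP pattern accepts out-of-range numbers, and the hostname pattern rejects some names with single-character labels.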

    You may have noticed that there is sometimes a choice in how to write a regular expression pattern. In the first and third examples above the dot character is written as a member of a list class, [.], whilst in the second example (the IP address regexp) the dot is written as an escaped metacharacter, \., in some places (this was done for demonstration purposes).

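    Another very important detail concerns characters inside a range class or list class. Most metacharacters lose their special meaning there, which is why [.] matches a literal dot. The hyphen is the one to watch: between two characters it defines a range, so to use it as a literal it must be placed at the end of the class content, right before the closing square bracket [... _-].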
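
    The effect of the hyphen's position can be checked with a short sketch. As before, preg_match() is assumed as a stand-in for ereg() on current PHP; bracket expressions treat the hyphen the same way in both styles. The variable names are only illustrative.

```php
<?php
// Hyphen last: a literal member of the class, so "-" is accepted.
$hyphen_last = preg_match('/^[A-Za-z_-]+$/', 'my-host_name');  // 1

// Hyphen between two characters: read as a range. [+-9] covers "+" through
// "9" (ASCII 43-57), which includes "," "-" "." "/" and the digits.
$range_comma  = preg_match('/^[+-9]+$/', '1,5');  // 1, "," lies inside the range
$range_letter = preg_match('/^[+-9]+$/', 'a');    // 0, "a" lies outside it

// A dot needs no escaping inside the brackets.
$dot_in_class = preg_match('/^[._-]+$/', '.-_');  // 1

printf("%d %d %d %d\n", $hyphen_last, $range_comma, $range_letter, $dot_in_class);
// prints "1 1 0 1"
```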

    You can play with the examples above by pasting the following code into a regexp.php file and running it in a browser:

    <html>
    <head>
    <title>POSIX Regexp Tester</title>
    </head>
    <body>
    <form action="" method="post">
    <b>Enter String:</b><br>
    <input type="text" name="string"><br>
    <b>Select Pattern:</b><br>
    <input type="radio" name="type" value="host" checked="checked">Hostname<br>
    <input type="radio" name="type" value="ip">IP Address<br>
    <input type="radio" name="type" value="email">Email Address<br>
    <input type="submit" name="submit" value="Check Match">
    </form>
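    <?php
    // ereg() was deprecated in PHP 5.3.0 and removed in PHP 7.0.0,
    // so this script needs an older PHP version to run.
    $patterns = array(
        'host'  => '^(([A-Za-z0-9_-]+)[.]([A-Za-z0-9_-]+))+$',
        'ip'    => '^([0-9]{1,3})\.([0-9]{1,3})[.]([0-9]{1,3})\.([0-9]{1,3})$',
        'email' => '^([A-Za-z0-9._-]+)@([A-Za-z0-9._-]+)[.]([a-z]{2,4})$',
    );

    // Defaults shown before the form is submitted
    $type   = 'host';
    $string = 'string';

    if (isset($_POST['submit']))
    {
        if (isset($_POST['string']))
            $string = $_POST['string'];
        // Accept only known pattern names instead of trusting the posted value
        if (isset($_POST['type']) && isset($patterns[$_POST['type']]))
            $type = $_POST['type'];
    }
    $pattern = $patterns[$type];

    // Escape the output so that user input cannot inject HTML
    echo 'Pattern: <samp>' . htmlspecialchars($pattern) . '</samp><br>';
    echo 'String: <samp>' . htmlspecialchars($string) . '</samp><br><br>';

    echo 'Match: ';
    if (ereg($pattern, $string))
        echo '<b style="color:#00ff00">OK</b>';
    else
        echo '<b style="color:#ff0000">WRONG</b>';
    ?>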
    </body>
    </html>

    I hope this post gave you at least a basic overview of POSIX regular expressions and their use in PHP. In a future article we will take a look at Perl-style regular expressions.
