Networking And Scripting : November 2016

This article explains regular expressions and gives basic examples in Tcl.

For more examples please refer to Regular Expression Examples In TCL.

What are Regular Expressions?

A regular expression, or RE, describes strings of characters (words or phrases or any arbitrary text). It's a pattern that matches certain strings and doesn't match others. For example, you could write an RE to tell you if a string contains a URL (World Wide Web Uniform Resource Locator, such as http://somehost/somefile.html). Regular expressions can be either broad and general or focused and precise.

A regular expression uses metacharacters (characters that assume special meaning for matching other characters) such as *, [], $ and the period (.). For example, the RE [Hh]ello!* would match Hello and hello and Hello! (and hello!!!!!). The RE [Hh](ello|i)!* would match Hello and Hi and Hi! (and so on). A backslash (\) disables the special meaning of the following character, so you could match the string [Hello] with the RE \[Hello\].
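The patterns above can be tried directly in tclsh. This is a minimal sketch: the helper name `matches` and the test strings are illustrative assumptions, not part of the original article. The strings are written in braces so that Tcl does not treat [Hello] as a command substitution.

```tcl
# Hypothetical helper: returns 1 if the pattern matches anywhere in the text.
proc matches {pattern text} {
    return [regexp -- $pattern $text]
}

puts [matches {[Hh]ello!*} {hello!!!!!}]     ;# 1
puts [matches {[Hh](ello|i)!*} {Hi!}]        ;# 1
puts [matches {\[Hello\]} {say [Hello]}]     ;# 1, the brackets are literal
puts [matches {\[Hello\]} {Hello}]           ;# 0, no literal brackets present
```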

Regular Expressions:

Regular expressions can be expressed in just a few rules.

`.`	Match any single character (e.g., `m.d` matches mad, mod, m3d, etc.)
`[]`	Bracket expression: Match any one of the enclosed characters (e.g., `[a-z0-9_]` matches a lowercase ASCII letter, a digit, or an underscore)
`^`	Start-of-string anchor: Match only at the start of a string (e.g., `^hi` matches hi and his but not this)
`$`	End-of-string anchor: Match only at the end of a string (e.g., `hi$` matches hi and chi but not this)
`*`	Zero-or-more quantifier: makes the previous part of the RE match zero or more times (e.g., `M.*D` matches MD, MAD, MooD, M.D, etc.)
`?`	Zero-or-one quantifier: makes the previous part of the RE match zero or one time (e.g., `hi!?` matches hi or hi!)
`+`	One-or-more quantifier: makes the previous part of the RE match one or more times (e.g., `hi!+` matches hi! or hi!! or hi!!! or ...)
`|`	Alternation (vertical bar): Match just one alternative (e.g., `this|that` matches this or that)
`()`	Sub pattern: Group part of the RE. Many uses, such as: Makes a quantifier apply to a group of text (e.g., `([0-9A-F][0-9A-F])+` matches groups of two hexadecimal digits: A9 or AB03 or 8A6E00, but not A or A2C). Sets limits for alternation (e.g., `"Eat (this|that)!"` matches "Eat this!" or "Eat that!"). Used for subpattern matching in the regexp and regsub commands.
`\`	Escape: Disables meaning of the following metacharacter (e.g., `a\.*` matches a or a. or a.. or etc.). Note that `\` also has special meaning to the Tcl interpreter (and to applications, such as C compilers).

Eg:

    set TestingDuts 1/2
    regexp {\/} $TestingDuts

This checks whether the string 1/2 contains a /. The / character is not a regular expression metacharacter in Tcl, so the backslash in front of it is harmless but optional; {/} works as well. The backslash is required only in front of real metacharacters such as . or [. To match a literal backslash followed by n, write \\n inside braces.

NOTE:

    regexp {([^\/]+)/(.*)} $port -- devNum port1

The first () stores its match in devNum and the second () stores its match in port1. The -- in this position is used when we don't want to keep the entire match: because it comes after the string, Tcl treats it as an ordinary variable name that receives the full match and is then ignored. It is not the end-of-switches marker, which must appear before the pattern. Many Tcl scripts use -> as the throwaway variable name for the same purpose.

In regular expression parsing, the * symbol matches zero or more occurrences of the character immediately preceding the *. For example, a* would match a, aaaaa, or a blank string. If the character directly before the * is a set of characters within square brackets, then the * will match any quantity of all of these characters. For example, [a-c]* would match aa, abc, aabcabc, or again, an empty string.

The + symbol behaves roughly the same as the *, except that it requires at least one character to match. For example, [a-c]+ would match a, abc, or aabcabc, but not an empty string.

Regular expression parsing also includes a method of selecting any character not in a set. If the first character after the [ is a caret (^), then the regular expression parser will match any character not in the set of characters between the square brackets. A caret can be included in the set of characters to match (or not) by placing it in any position other than the first.

The regexp command is similar to the string match command in that it matches an exp against a string. It is different in that it can match a portion of a string, instead of the entire string, and will place the characters matched into the matchVar variable.
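The quantifier, negated-set and partial-match behaviour described above can be sketched as follows. The sample string and the variable names (`run`, `empty`, `other`) are assumptions chosen for illustration.

```tcl
# Illustrative sample string, not taken from the article.
set text "xxabcabczz"

# + needs at least one character from the set; the match is stored in run.
regexp {[a-c]+} $text run
puts $run                         ;# abcabc

# * also accepts zero characters, so it matches the empty string at index 0.
regexp {[a-c]*} $text empty
puts "<$empty>"                   ;# <>

# A leading caret negates the set: characters that are not a, b or c.
regexp {[^a-c]+} $text other
puts $other                       ;# xx

# string match must cover the whole string; regexp may match a portion of it.
puts [string match {abc} $text]   ;# 0
puts [regexp {abc} $text]         ;# 1
```

Note that the * pattern succeeds with an empty match because the first character (x) is not in the set, which is why + is usually the safer choice when at least one character is expected.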
If a match is found to the portion of a regular expression enclosed within parentheses, regexp will copy the subset of matching characters to the subSpec argument. This can be used to parse simple strings.

Regsub will copy the contents of the string to a new variable, substituting the characters that match exp with the characters in subSpec. If subSpec contains a & or \0, then those characters will be replaced by the characters that matched exp. If the number following a backslash is 1-9, then that backslash sequence will be replaced by the appropriate portion of exp that is enclosed within parentheses.

Note that the exp argument to regexp or regsub is processed by the Tcl substitution pass. Therefore quite often the expression is enclosed in braces to prevent any special processing by Tcl.

Simple Examples:

All Examples tested on TCL 8.4

EXAMPLE 1

    set sample "Where there is a will, There is a way."
    set result [regexp {[a-z]+} $sample match]
    puts $match     ;# prints: here
    puts $result    ;# prints: 1

In the above regular expression, here is matched and stored in match. The match starts at the second character because the uppercase W is not in the set [a-z].

To match here there is a will, add a space to the set:

    set result [regexp {[a-z ]+} $sample match]

To match here there is a will, (including the comma), add a comma to the set as well. Because the set also contains a space, the match includes the space that follows the comma:

    set result [regexp {[a-z ,]+} $sample match]

To match the whole string Where there is a will, There is a way. the regular expression will be as below:

    set result [regexp {[A-Za-z ,\.]+} $sample match]

To match "Where there" and store "Where" and "there" as separate substrings:

    set result [regexp {([A-Za-z]+) +([a-z]+)} $sample match sub1 sub2]
    puts $match     ;# Where there
    puts $sub1      ;# Where
    puts $sub2      ;# there

In the above regular expression, match holds the complete match, i.e. Where there. The first () matches Where and stores it in sub1, and the second () matches there and stores it in sub2.

NOTE: If we don't want to keep the complete match, we can put -- in place of the variable match. Tcl then stores the complete match in a throwaway variable named --, and only sub1 and sub2 are used afterwards. The regular expression below does the same as the one above:

    set result [regexp {([A-Za-z]+) +([a-z]+)} $sample -- sub1 sub2]
    puts $sub1      ;# Where
    puts $sub2      ;# there

To match "here there is a will, There is a way" and store "here there is a will" and "There is a way" in sub1 and sub2 respectively:

    set result [regexp {([a-z ]+), +([A-Za-z ]+)} $sample match sub1 sub2]
    puts $match     ;# here there is a will, There is a way
    puts $sub1      ;# here there is a will
    puts $sub2      ;# There is a way

EXAMPLE 2

    set out "Tcl Tutorial"
    regexp {([A-Z,a-z]*).([A-Z,a-z]*)} $out a b c
    puts "Full Match: $a"
    puts "Sub Match1: $b"
    puts "Sub Match2: $c"

Output:

    Full Match: Tcl Tutorial
    Sub Match1: Tcl
    Sub Match2: Tutorial

Note that the comma inside [A-Z,a-z] is a literal member of the set, so the set also matches a comma; [A-Za-z] matches letters only.

With the second group nested inside the first:

    set out "Tcl Tutorial"
    regexp {([A-Z,a-z]*.([A-Z,a-z]*))} $out a b c
    puts "Full Match: $a"
    puts "Sub Match1: $b"
    puts "Sub Match2: $c"

Output:

    Full Match: Tcl Tutorial
    Sub Match1: Tcl Tutorial
    Sub Match2: Tutorial

Switches for Regex Command

The switches used in this article are:

-nocase: Ignores case when matching.
-indices: Stores the location (first and last index) of each matched sub pattern instead of the matched characters.
-line: Newline-sensitive matching. The . metacharacter and negated sets stop at a newline, and ^ and $ also match at line boundaries.
-start index: Sets the character offset in the string at which the search starts.
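The -indices and -start switches listed above can be sketched as follows. The variable names (`all`, `first`, `second`, `word`) are assumptions chosen for illustration; the sample string is the one used in Example 2.

```tcl
set out "Tcl Tutorial"

# -indices stores {first last} character positions instead of the matched text.
regexp -indices {([A-Za-z]+) ([A-Za-z]+)} $out all first second
puts $all       ;# 0 11
puts $first     ;# 0 2
puts $second    ;# 4 11

# -start 4 begins the search at character index 4 (the T of Tutorial).
regexp -start 4 {[A-Za-z]+} $out word
puts $word      ;# Tutorial
```

The index pairs can be passed to string range to extract the same text that regexp would otherwise have stored.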
Example 2 deliberately used [A-Z,a-z] to cover all letters. You can use -nocase instead, as shown below:

    set out "Tcl Tutorial"
    regexp -nocase {([A-Z]*.([A-Z]*))} $out a b c
    puts "Full Match: $a"
    puts "Sub Match1: $b"
    puts "Sub Match2: $c"

Output:

    Full Match: Tcl Tutorial
    Sub Match1: Tcl Tutorial
    Sub Match2: Tutorial

The next example combines -nocase with -line and -start. Here -- marks the end of the switches:

    regexp -nocase -line -- {([A-Z]*.([A-Z]*))} "Tcl \nTutorial" a b
    puts "Full Match: $a"
    puts "Sub Match1: $b"
    regexp -nocase -start 4 -line -- {([A-Z]*.([A-Z]*))} "Tcl \nTutorial" a b
    puts "Full Match: $a"
    puts "Sub Match1: $b"

Output:

    Full Match: Tcl
    Sub Match1: Tcl
    Full Match: Tutorial
    Sub Match1: Tutorial

With -line, the . cannot match the newline, so the first match stops at the end of the first line (it includes the trailing space after Tcl). With -start 4, the search begins at the newline, so the match is found on the second line.

REGSUB:

Syntax: regsub exp string subSpec var

Searches string for a substring that matches the regular expression exp and replaces it with subSpec. The resulting string is copied into var.

Eg: 1

    set sample "Where there is a will, There is a way."
    regsub "way" $sample "lawsuit" sample2
    puts $sample2   ;# Where there is a will, There is a lawsuit.

The above command replaces the string "way" with "lawsuit" and stores the result in sample2. The original string in sample is unchanged.

Eg: 2

    set sample "eer dfgdfgf trt dfsdf sfdsf ree"
    regsub -all { +} $sample " " var
    puts $var       ;# eer dfgdfgf trt dfsdf sfdsf ree

With -all, every run of one or more spaces is replaced by a single space, and the result is stored in var. To collapse tabs as well, include \t in the pattern, for example {[ \t]+}.

?: Command Usage

Usage: ?: is used in sub patterns in a regexp. Whenever you don't want a particular group to be stored as a sub-match, put ?: at the start of the group, right after the opening parenthesis.

Example:

    set string "Names: Manish Ajay Aman"
    regexp {Names: (Manish|Ajay) (?:Aman|Raj|Ajay) (Aman|Raj)} $string match sub1 sub2 sub3
    puts "$match\n$sub1\n$sub2\n$sub3\n"

For the above example, the output will be:

    Names: Manish Ajay Aman
    Manish
    Aman

The group that starts with ?: still takes part in the match but is not captured. So match holds the full match, sub1 is Manish and sub2 is Aman. The second group (?:Aman|Raj|Ajay) is skipped when the sub-matches are assigned, so sub3 is left empty and acts as a dummy variable.
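The regsub substitution markers described earlier (& and \1 to \9) can be sketched as follows. The result variable names (`marked`, `swapped`, `upper`, `count`) are assumptions chosen for illustration; the sample string is the one used in the regsub examples.

```tcl
set sample "Where there is a will, There is a way."

# & in the substitution stands for the text that matched the whole pattern.
regsub {way} $sample {<&>} marked
puts $marked    ;# Where there is a will, There is a <way>.

# \1 and \2 stand for the first and second parenthesised sub-matches.
regsub {(will), (There)} $sample {\2, \1} swapped
puts $swapped   ;# Where there is a There, will is a way.

# -all replaces every match; regsub returns the number of replacements made.
set count [regsub -all {is} $sample {IS} upper]
puts "${count}: $upper"   ;# 2: Where there IS a will, There IS a way.
```

The substitution strings are written in braces for the same reason as the patterns: it stops Tcl from processing the backslashes before regsub sees them.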


Thursday, 17 November 2016

TCL-Regular Expression Explanation
