5.16  Unit: regex

[procedure] (grep REGEX LIST)
Returns all items of LIST that match the regular expression REGEX. This procedure could be defined as follows:

(define (grep regex lst)
  (filter (lambda (x) (string-match regex x)) lst) )
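
As a small usage sketch, the following filters a list of strings down to those made up of digits. The names mixed and digits-only are hypothetical, and the require-extension line assumes the regex unit is loaded that way in your setup.

```scheme
;; Assumption: the regex unit is made available this way; some setups
;; may need (declare (uses regex)) instead.
(require-extension regex)

;; Hypothetical sample data.
(define mixed '("123" "abc" "42"))

;; Keep only the items that match the regular expression.
(define digits-only (grep "[0-9]+" mixed))
;; digits-only is ("123" "42")
```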

[procedure] (pattern->regexp PATTERN)
Converts the file-pattern PATTERN into a regular expression.

(pattern->regexp "foo.*") ==> "foo\..*"

[procedure] (regexp STRING)
Returns a precompiled regular expression object for STRING.

[procedure] (regexp? X)
Returns #t if X is a precompiled regular expression, or #f otherwise.
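
A minimal sketch of precompiling a pattern once and reusing it. The names rx, rx-is-regexp, string-is-regexp and found are hypothetical; the pattern and input are made up for illustration.

```scheme
;; Assumption: the regex unit is made available this way.
(require-extension regex)

;; Compile the pattern once so it can be reused across calls.
(define rx (regexp "[0-9]+"))

(define rx-is-regexp (regexp? rx))            ; #t
(define string-is-regexp (regexp? "[0-9]+"))  ; #f, a plain string is not precompiled

;; A precompiled object can be passed wherever REGEXP is expected.
(define found (string-search rx "order 66"))  ; ("66")
```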

[procedure] (string-match REGEXP STRING [START])
[procedure] (string-match-positions REGEXP STRING [START])
Matches the regular expression REGEXP (a string or a precompiled regular expression) against STRING. Returns #f if the match fails; otherwise returns a list of matching groups, where the first element is the complete match. If the optional argument START is supplied, it specifies the position in STRING at which matching starts. For each group, the result list contains one of the following: #f for an optional group that did not match; a list of the start and end positions of the match in STRING (for string-match-positions); or the matching substring (for string-match).
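
The result shapes described above can be illustrated with a short sketch. The names m, p, opt and none are hypothetical, and the patterns were chosen so that they cover the whole input string.

```scheme
;; Assumption: the regex unit is made available this way.
(require-extension regex)

;; Two groups; the first list element is the complete match.
(define m (string-match "([a-z]+)-([0-9]+)" "abc-123"))
;; m is ("abc-123" "abc" "123")

;; Same match, reported as start and end positions.
(define p (string-match-positions "([a-z]+)-([0-9]+)" "abc-123"))
;; p is ((0 7) (0 3) (4 7))

;; An optional group that takes no part in the match yields #f.
(define opt (string-match "(a)?(b)" "b"))
;; opt is ("b" #f "b")

;; A failed match returns #f.
(define none (string-match "[0-9]+" "abc"))
```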

[procedure] (string-search REGEXP STRING [START [RANGE]])
[procedure] (string-search-positions REGEXP STRING [START [RANGE]])
Searches for the first match of the regular expression REGEXP in STRING. If RANGE is supplied, the search is limited to RANGE characters. Otherwise the procedures behave like string-match and string-match-positions.
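
A brief sketch of searching within a longer string, including the optional START argument. The names text, found, where and later are hypothetical.

```scheme
;; Assumption: the regex unit is made available this way.
(require-extension regex)

;; Hypothetical input.
(define text "abc 123 def 456")

;; First match anywhere in the string.
(define found (string-search "[0-9]+" text))
;; found is ("123")

;; Same search, reported as start and end positions.
(define where (string-search-positions "[0-9]+" text))
;; where is ((4 7))

;; Begin searching at position 7, just past the first number.
(define later (string-search "[0-9]+" text 7))
;; later is ("456")
```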

[procedure] (string-split-fields REGEXP STRING [MODE [START]])
Splits STRING into a list of fields according to MODE, where MODE can be the keyword #:infix (REGEXP matches field separator), the keyword #:suffix (REGEXP matches field terminator) or #t (REGEXP matches field), which is the default.
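
The three modes can be compared on a single input, as in the sketch below. The names s, by-field, by-infix and by-suffix are hypothetical.

```scheme
;; Assumption: the regex unit is made available this way.
(require-extension regex)

(define s "this is a string 1, 2, 3,")

;; Default mode (#t): REGEXP describes the fields themselves.
(define by-field (string-split-fields "[^ ]+" s))
;; by-field is ("this" "is" "a" "string" "1," "2," "3,")

;; #:infix: REGEXP describes the separator between fields.
(define by-infix (string-split-fields " " s #:infix))
;; by-infix is ("this" "is" "a" "string" "1," "2," "3,")

;; #:suffix: REGEXP describes the terminator that ends each field.
(define by-suffix (string-split-fields "," s #:suffix))
;; by-suffix is ("this is a string 1" " 2" " 3")
```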

[procedure] (string-substitute REGEXP SUBST STRING [INDEX])
Searches for substrings in STRING that match REGEXP and substitutes them with the string SUBST. The substitution can contain references to subexpressions in REGEXP with the \NUM notation, where NUM refers to the NUMth parenthesized expression. The optional argument INDEX defaults to 1 and specifies the number of the match to be substituted. Any non-numeric index specifies that all matches are to be substituted.

(string-substitute "([0-9]+) (eggs|chicks)" "\\2 (\\1)" "99 eggs or 99 chicks" 2)
 ==> "99 eggs or chicks (99)"
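
To contrast the default INDEX with a non-numeric one, here is a short sketch. The names first-only and all-matches are hypothetical, and #t is used only as one possible non-numeric value.

```scheme
;; Assumption: the regex unit is made available this way.
(require-extension regex)

;; INDEX defaults to 1, so only the first match is replaced.
(define first-only (string-substitute "[0-9]+" "N" "1 and 22 and 333"))
;; first-only is "N and 22 and 333"

;; A non-numeric INDEX replaces every match.
(define all-matches (string-substitute "[0-9]+" "N" "1 and 22 and 333" #t))
;; all-matches is "N and N and N"
```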

[procedure] (string-substitute* STRING SMAP)
Substitutes elements of STRING according to SMAP. SMAP should be an association list in which each element is a pair of the form (MATCH . REPLACEMENT). Every occurrence of the regular expression MATCH in STRING is replaced by the string REPLACEMENT.

(string-substitute* "<h1>Hello, world!</h1>" '(("<[/A-Za-z0-9]+>" . "")))

==>  "Hello, world!"