Regular Expressions (C)
Regular expressions in C use POSIX syntax through the <regex.h> API and are a little weird: you compile a pattern, match it against a string, and free it yourself.
How It Works
Compile Pattern
To compile the regex pattern, use int regcomp(regex_t *preg, const char *regex, int cflags).
- preg - Destination for your compiled regex pattern.
- regex - The string representing your regex pattern.
- cflags - Which flags you want to enable[3], combined with bitwise-or (or 0 for none):
  - REG_EXTENDED - Treat the pattern as an extended regular expression, rather than as a basic regular expression.
  - REG_ICASE - Ignore case when matching letters.
  - REG_NOSUB - Don’t bother storing the contents of the pmatch array (called matchptr in the glibc manual); regexec then only reports whether the string matched.
  - REG_NEWLINE - Treat a newline in string as dividing string into multiple lines, so that $ can match before the newline and ^ can match after. Also, don’t permit . to match a newline, and don’t permit [^…] to match a newline. Otherwise, newline acts like any other ordinary character.
Match It To String
To match, use int regexec(const regex_t *preg, const char *string, size_t nmatch, regmatch_t pmatch[], int eflags).
- preg - Compiled regex pattern.
- string - String to match against.
- nmatch - Maximum number of matches to save, i.e. the number of elements in pmatch.
- pmatch - Array that receives the match offsets. pmatch[0] covers the whole match, and pmatch[i] holds the offsets of the subexpression starting at the ith open parenthesis. Each regmatch_t within pmatch has an rm_so and an rm_eo (regex match start/end offset from the start of the string); rm_eo is the offset of the first character after the match.
- eflags - Either 0 or the bitwise-or of one or both of:
  - REG_NOTBOL - The match-beginning-of-line operator always fails to match. This flag may be used when different portions of a string are passed to regexec and the beginning of the string should not be interpreted as the beginning of the line.
  - REG_NOTEOL - The match-end-of-line operator always fails to match.
Catch Any Errors
Use size_t regerror(int errcode, const regex_t *preg, char *errbuf, size_t errbuf_size)[4] to turn an error code returned by regcomp or regexec into a readable message stored in errbuf.
Free Up Memory When Finished
Use void regfree(regex_t *preg) to release the memory that regcomp allocated for preg.
Example
Check out the thorough example from Ben Bullock[1] using the link below, or if the page doesn't work, you can find it locally here. A more basic example can be found below, from Per-Olof Pettersson[5]:
#include <sys/types.h>
#include <regex.h>
#include <stdio.h>
#include <stdlib.h> /* exit */

int main (int argc, char *argv[]) {
    regex_t regex;
    int reti;
    char msgbuf[100];

    /* Compile regular expression */
    reti = regcomp(&regex, "^a[[:alnum:]]", 0);
    if (reti) {
        fprintf(stderr, "Could not compile regex\n");
        exit(1);
    }

    /* Execute regular expression */
    reti = regexec(&regex, "abc", 0, NULL, 0);
    if (!reti) {
        puts("Match");
    } else if (reti == REG_NOMATCH) {
        puts("No match");
    } else {
        /* Any other return value is an error; turn it into a message */
        regerror(reti, &regex, msgbuf, sizeof(msgbuf));
        fprintf(stderr, "Regex match failed: %s\n", msgbuf);
        exit(1);
    }

    /* Free compiled regular expression if you want to use the regex_t again */
    regfree(&regex);
    return 0;
}
References
[1] https://www.lemoda.net/c/unix-regex/
[2] http://www.gnu.org/savannah-checkouts/gnu/libc/manual/html_node/Regular-Expressions.html
[3] http://www.gnu.org/savannah-checkouts/gnu/libc/manual/html_node/Flags-for-POSIX-Regexps.html
[4] https://linux.die.net/man/3/regexec
[5] http://web.archive.org/web/20160308115653/http://peope.net/old/regex.html
Last modified: 202401040446