Delphi XE includes regular expression support, something that has been requested many times over the years. In this blog post I'll show some basic usage of regular expressions in Delphi. I'm assuming you already understand regular expressions and the associated terminology; if not, take a look here for some tutorials.
The regular expression engine in Delphi XE is PCRE (Perl Compatible Regular Expressions). It's a fast engine that complies with generally accepted regex syntax and has been around for many years. Users of earlier versions of Delphi can use it through TPerlRegEx, a Delphi class wrapper around it.
The XE interface to PCRE is a layer of units based on contributions from various people: the PCRE API header translations in RegularExpressionsAPI.pas (Florent Ouchet and co), the wrapper class TPerlRegEx (Jan Goyvaerts) in RegularExpressionsCore.pas, and the record wrappers in RegularExpressions.pas (myself). The RegularExpressions unit is based on code we currently use in FinalBuilder 6 & 7; it's well tested and has proven to be very reliable in our products.
RegularExpressions.pas is what you will use in your code. It's loosely based on the .NET regex interfaces.
The main type in RegularExpressions.pas is TRegEx. TRegEx is a record with a number of instance methods and static class methods for matching with regular expressions. The static versions are provided for convenience and should only be used for one-off matches. If you are matching in a loop, or repeating the same search often, create an 'instance' of the TRegEx record and use the non-static methods.
So let's look at how we might use TRegEx to find some text in a string.
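To make the static versus instance distinction concrete, here is a minimal sketch. The program name and the helper CountNumericLines are made up for illustration, and the uses clause assumes the Delphi XE unit names (later releases prefix them, e.g. System.RegularExpressions).

```pascal
program StaticVsInstance;
{$APPTYPE CONSOLE}
uses
  SysUtils, RegularExpressions; // assumed XE unit names

// Hypothetical helper: counts the entries that consist only of digits.
// The TRegEx record is created once and reused for every entry.
function CountNumericLines(const lines: array of string): Integer;
var
  regexpr: TRegEx;
  i: Integer;
begin
  Result := 0;
  regexpr := TRegEx.Create('^\d+$');
  for i := Low(lines) to High(lines) do
    if regexpr.IsMatch(lines[i]) then
      Inc(Result);
end;

begin
  // One-off check: the static class method is convenient here.
  if TRegEx.IsMatch('12345', '^\d+$') then
    WriteLn('The input is numeric');

  // Repeated checks: go through the instance created inside the helper.
  WriteLn('Numeric lines : ' + IntToStr(CountNumericLines(['1', 'abc', '42'])));
end.
```

The static call takes the pattern on every invocation, which is fine for a single test; the instance carries the pattern with it, which is the form to prefer inside loops.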
procedure FindSomething(const searchMe : string);
var
  regexpr : TRegEx;
  match : TMatch;
  i : integer;
begin
  // create our regex instance; we want a case insensitive search in multiline mode
  regexpr := TRegEx.Create('^.*\b(\w+)\b\sworld',[roIgnoreCase,roMultiline]);
  match := regexpr.Match(searchMe);
  if not match.Success then
  begin
    WriteLn('No Match Found');
    exit;
  end;
  while match.Success do
  begin
    WriteLn('Match : [' + match.Value + ']');
    // group 0 is the entire match, so Count will always be at least 1 for a match
    if match.Groups.Count > 1 then
    begin
      for i := 1 to match.Groups.Count - 1 do
        WriteLn(' Group[' + IntToStr(i) + '] : [' + match.Groups.Item[i].Value + ']');
    end;
    // NextMatch also returns a TMatch; Success is False once there are no more matches
    match := match.NextMatch;
  end;
end;
In the above example, we are trying to extract the word before "world", capturing it in a group. The Match method always returns a TMatch, even when no match is found, so you should check the Success property of the returned match to determine whether a match was found. The same applies to TMatch.NextMatch, which makes it easy to iterate over the matches. You could also call TRegEx.Matches, which returns a TMatchCollection that supports enumeration (using the for..in construct), e.g.:
matches := regexpr.Matches(searchMe);
for match in matches do
begin
  if match.Success then
  begin
    //do stuff with match here
  end;
end;
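The enumeration snippet above leaves out its declarations. A self-contained sketch might look like the following; the program name, the CountWords helper and the word pattern are made up for illustration, and the uses clause assumes the Delphi XE unit names.

```pascal
program EnumerateMatches;
{$APPTYPE CONSOLE}
uses
  SysUtils, RegularExpressions; // assumed XE unit names

// Hypothetical helper: prints every word found in source and returns how many there were.
function CountWords(const source: string): Integer;
var
  regexpr: TRegEx;
  matches: TMatchCollection;
  match: TMatch;
begin
  Result := 0;
  regexpr := TRegEx.Create('\b\w+\b');
  matches := regexpr.Matches(source);
  // TMatchCollection supports for..in enumeration
  for match in matches do
  begin
    WriteLn('Word : [' + match.Value + ']');
    Inc(Result);
  end;
end;

begin
  WriteLn('Total : ' + IntToStr(CountWords('hello big world')));
end.
```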
Something to remember when working with groups is that a match's Groups collection always returns the entire match as group 0, so the groups from your expression start at 1. You will notice I don't free any of the TRegEx, TMatch or TGroup values. That's because they are records with methods rather than classes, which keeps memory management simple and helps avoid memory leaks. My original code used interfaces and reference counting, but Embarcadero preferred to use records (as they have done with other new features introduced in recent releases).
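As a small illustration of the group numbering, the sketch below uses the static TRegEx.Match for a one-off search and reads group 1. The program name, the WordBeforeWorld helper and the simplified pattern are assumptions made for this example, as are the XE unit names in the uses clause.

```pascal
program GroupIndexing;
{$APPTYPE CONSOLE}
uses
  SysUtils, RegularExpressions; // assumed XE unit names

// Hypothetical helper: returns the word directly before "world",
// or an empty string when there is no match.
function WordBeforeWorld(const source: string): string;
var
  match: TMatch;
begin
  Result := '';
  match := TRegEx.Match(source, '\b(\w+)\sworld', [roIgnoreCase]);
  if match.Success then
    // Item[0] would be the whole match; Item[1] is the first capturing group
    Result := match.Groups.Item[1].Value;
end;

begin
  WriteLn('[' + WordBeforeWorld('Hello World') + ']');
end.
```

Nothing is freed here either, since TMatch and TGroup are records.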
I have created a simple example application to help with testing regular expressions.
The source to this app can be downloaded from here.
We'll expand on this app in the next post.