VSoft Technologies Blogs


VSoft Technologies Blogs - posts about our products and software development.

Delphi XE includes regular expression support, something that has been requested many times over the years. In this blog post I'll show some basic usage of regular expressions in Delphi. I'm assuming you already understand regular expressions and the associated terminology; if not, take a look here for some tutorials.

The regular expression engine in Delphi XE is PCRE (Perl Compatible Regular Expressions). It's a fast engine that complies with generally accepted regex syntax, and it has been around for many years. Users of earlier versions of Delphi can use it through TPerlRegEx, a Delphi class wrapper around it.

The XE interface to PCRE is a layer of units based on contributions from various people: the PCRE API header translations in RegularExpressionsAPI.pas (Florent Ouchet and co), the wrapper class TPerlRegEx (Jan Goyvaerts) in RegularExpressionsCore.pas, and the record wrappers in RegularExpressions.pas (myself). RegularExpressions.pas is based on code we currently use in FinalBuilder 6 & 7; it's well tested and has proven to be very reliable in our products.

RegularExpressions.pas is the unit you will use in your code. It's loosely based on the .NET regex interfaces.

The main type in RegularExpressions.pas is TRegEx. TRegEx is a record with a number of instance methods and static class methods for matching with regular expressions. The static versions of the methods are provided for convenience and should only be used for one-off matches. If you are matching in a loop, or repeating the same search often, you should create an 'instance' of the TRegEx record and use the non-static methods.
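To make the difference between the two styles concrete, here is a minimal sketch as a console program. The sample strings and patterns are made up for illustration, and the unit names assume Delphi XE (in XE2 and later the same units are System.SysUtils and System.RegularExpressions).

```pascal
program StaticVsInstance;

{$APPTYPE CONSOLE}

uses
  SysUtils, RegularExpressions; // assumption: XE unit names, see note above

const
  // hypothetical sample data, for illustration only
  Lines : array[0..2] of string = ('hello world', 'goodbye', 'Hello World again');
var
  regexpr : TRegEx;
  line    : string;
begin
  // one off check : the static overload takes the input and the pattern together
  if TRegEx.IsMatch('hello world', 'w\w+d') then
    WriteLn('static match');

  // repeated use : create the record once, then reuse it for every input
  regexpr := TRegEx.Create('\bworld\b', [roIgnoreCase]);
  for line in Lines do
  begin
    if regexpr.IsMatch(line) then
      WriteLn('matched : [' + line + ']');
  end;
end.
```

The pattern is passed only once in the instance version, which is the form to prefer inside loops.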

So let's look at how we might use TRegEx to find some text in a string.

procedure FindSomething(const searchMe : string);
var
  regexpr : TRegEx;
  match   : TMatch;
  i       : integer;
begin
  // create our regex instance; we want a case insensitive search, in multiline mode
  regexpr := TRegEx.Create('^.*\b(\w+)\b\sworld',[roIgnoreCase,roMultiline]);
  match := regexpr.Match(searchMe);
  if not match.Success then
  begin
    WriteLn('No Match Found');
    exit;
  end;

  while match.Success do
  begin
    WriteLn('Match : [' + match.Value + ']');
    //group 0 is the entire match, so count will always be at least 1 for a match
    if match.Groups.Count > 1 then
    begin
      for i := 1 to match.Groups.Count -1 do
        WriteLn('     Group[' + IntToStr(i) + '] : [' + match.Groups.Item[i].Value + ']');
    end;
    //NextMatch also returns a TMatch, with Success set to false when there are no more matches
    match := match.NextMatch;
  end;
end;

In the above example, we are trying to extract the word before "world", capturing it in a group. The Match method will always return a TMatch, even when no match is found, so you should check the Success property of the TMatch to determine whether a match was found. The same applies to TMatch.NextMatch, which makes it easy to iterate over the matches. You could also call TRegEx.Matches, which returns a TMatchCollection that supports enumeration (using the for in construct), e.g.:

  //matches is declared as a TMatchCollection, match as a TMatch
  matches := regexpr.Matches(searchMe);
  for match in matches do
  begin
    if match.Success then
    begin
      //do stuff with match here
    end;
  end;

Something to remember when working with groups is that a match's Groups collection always returns the entire match as group 0, so the groups from your expression start at 1. You will notice I don't free any of the TRegEx, TMatch or TGroup values; that's because they are records with methods rather than classes. This keeps memory management simple and helps avoid memory leaks. My original code used interfaces and reference counting, but Embarcadero preferred to use records (as they have done with other new features introduced in recent releases).
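As a quick illustration of the group numbering, the following sketch uses the static Match overload on a made-up input string. The input, the pattern and the program name are assumptions for the example only, and the unit names again assume Delphi XE.

```pascal
program GroupZeroDemo;

{$APPTYPE CONSOLE}

uses
  SysUtils, RegularExpressions; // System.* unit names in XE2 and later

var
  match : TMatch;
begin
  // one off match, so the static overload (input first, then pattern) is fine here
  match := TRegEx.Match('hello world', '(\w+)\sworld');
  if match.Success then
  begin
    // group 0 is always the entire match
    WriteLn('Group[0] : [' + match.Groups[0].Value + ']');
    // group 1 is the first capture group from the expression
    WriteLn('Group[1] : [' + match.Groups[1].Value + ']');
  end;
  // nothing to free here, TRegEx, TMatch and TGroup are all records
end.
```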

I have created a simple example application which will help in testing regular expressions.

The source to this app can be downloaded from here.

We'll expand on this app in the next post.

Showing 2 Comments

Vincent Parrett 14 years ago

I'm aware of it (and have been for a while), the bug was not in the original code I gave to Embarcadero.


Olaf Hess 14 years ago

Thanks for the examples.

Unfortunately, there's a bug in the RegularExpressions unit:
http://www.regexguru.com/2010/09/bug-in-delphi-xe-regularexpressions-unit/

Update 1 for Delphi XE does not fix this.

Regards,
Olaf


