RegExp Studio
TRegExpr v.0.952 - Delphi Regular Expressions

MrDecorator
Here we discuss how to "decorate" URLs.
Suppose you want to show some plain text on an HTML page. The most common example is a web forum (BBS board). A user enters a message, for example "the answer You can find at www.RegExpStudio.com", and it must be shown on the web page as text with an HTML link, i.e. converted to "the answer You can find at <a href="http://www.RegExpStudio.com">www.RegExpStudio.com</a>".

There are two ways.

The traditional one: write a full-featured text parser. This is an awful amount of tedious work! For example, try to implement the rules for finding URLs ;)

The second: look at the text from a bird's-eye view with the help of a regular expression engine. Your application can be implemented much faster and will be robust and easy to maintain!

Unfortunately, the Delphi component palette contains no TRegularExpression component. But there are some third-party implementations (I think you already know at least one 8-)).
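
Before the full function, here is a rough taste of the idea. This is only a minimal sketch: it relies on the ReplaceRegExpr helper from the RegExpr unit with substitution enabled, and the SimpleURL pattern is a deliberately simplified assumption made up for illustration (it is not the URLTemplate used further down and it ignores links that start with 'www.').

```pascal
program QuickDecorate;
{$APPTYPE CONSOLE}

uses
  RegExpr; // TRegExpr unit, provides ReplaceRegExpr

const
  // Hypothetical, simplified pattern: a protocol followed by everything
  // up to the next whitespace, quote or angle bracket.
  SimpleURL = '(?i)((ftp|http)://[^\s<>"]+)';

begin
  // The last argument (True) turns on substitution, so $1 in the
  // replacement string stands for the text matched by group 1.
  WriteLn (ReplaceRegExpr (SimpleURL,
    'the answer is at http://anso.da.ru/index.htm',
    '<a href="$1">$1</a>', True));
end.
```

Such a one-liner gives no control over which part of the link stays visible, which is what the longer function on this page adds.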

The complete, ready-to-run source code of the function below is available in the TRegExpr Demos (HyperLinksDecorator unit).

uses
  SysUtils, // needed for AnsiCompareText
  RegExpr;  // Do not forget this line. This is the whole TRegExpr 'installation':
            // the only thing you must do is include RegExpr in the uses clause.


type
  TDecorateURLsFlags = (
   // describes which parts of a hyper-link must be included
   // in the VISIBLE part of the link:
    durlProto, // Protocol (like 'ftp://' or 'http://')
    durlAddr,  // TCP address or domain name (like 'anso.da.ru')
    durlPort,  // Port number if specified (like ':8080')
    durlPath,  // Path to document (like 'index.htm')
    durlBMark, // Bookmark (like '#mark')
    durlParam  // URL params (like '?ID=2&User=13')
   );

  TDecorateURLsFlagSet = set of TDecorateURLsFlags;

function DecorateURLs (
 // can find hyper-links like 'http://...' or 'ftp://...'
 // as well as links without a protocol that start with 'www.'

  const AText : string;
 // Input text in which to find hyper-links

  AFlags : TDecorateURLsFlagSet = [durlAddr, durlPath]
 // Which parts of the hyper-links found must be included in the visible
 // part of the URL; for example, if [durlAddr], then the hyper-link
 // 'http://anso.da.ru/index.htm' will be decorated as
 // '<a href="http://anso.da.ru/index.htm">anso.da.ru</a>'

  ) : string;
 // Returns the input text with decorated hyper-links


const
  URLTemplate =
     '(?i)'
   + '('
   + '(FTP|HTTP)://'             // Protocol
   + '|www\.)'                   // trick to catch links without a protocol:
                                 // detect a leading 'www.'
   + '([\w\d\-]+(\.[\w\d\-]+)+)' // TCP addr or domain name
   + '(:\d\d?\d?\d?\d?)?'        // port number
   + '(((/[%+\w\d\-\\\.]*)+)*)'  // unix path
   + '(\?[^\s=&]+=[^\s=&]+(&[^\s=&]+=[^\s=&]+)*)?'
                                 // request (GET) params
   + '(#[\w\d\-%+]+)?';          // bookmark
var
  PrevPos : integer;
  s, Proto, Addr, HRef : string;
begin
  Result := '';
  PrevPos := 1;
  with TRegExpr.Create do try
     Expression := URLTemplate;
     if Exec (AText) then
      REPEAT
        s := '';
        if AnsiCompareText (Match [1], 'www.') = 0 then begin
           // Link without a protocol: assume http
           Proto := 'http://';
           Addr := Match [1] + Match [3];
           HRef := Proto + Match [0];
          end
         else begin
           Proto := Match [1];
           Addr := Match [3];
           HRef := Match [0];
          end;
        // Build the visible part of the link from the requested pieces
        if durlProto in AFlags
         then s := s + Proto;
        if durlAddr in AFlags
         then s := s + Addr;
        if durlPort in AFlags
         then s := s + Match [5];
        if durlPath in AFlags
         then s := s + Match [6];
        if durlParam in AFlags
         then s := s + Match [9];
        if durlBMark in AFlags
         then s := s + Match [11];
        // Copy the text before the match, then append the decorated link
        Result := Result + System.Copy (AText, PrevPos,
         MatchPos [0] - PrevPos) + '<a href="' + HRef + '">' + s + '</a>';
        PrevPos := MatchPos [0] + MatchLen [0];
      UNTIL not ExecNext;
     Result := Result + System.Copy (AText, PrevPos, MaxInt); // Tail
    finally Free;
   end;
end; { of function DecorateURLs
--------------------------------------------------------------}
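
A short usage sketch may help here. It assumes that the HyperLinksDecorator unit from the demos is on the search path and exports DecorateURLs together with its flag types; the program name and the sample text are made up for illustration.

```pascal
program DecorateDemo;
{$APPTYPE CONSOLE}

uses
  HyperLinksDecorator; // assumed to export DecorateURLs and TDecorateURLsFlags

const
  Sample = 'See http://anso.da.ru/index.htm for details'; // illustrative input

begin
  // Default flags [durlAddr, durlPath]: address and path stay visible.
  WriteLn (DecorateURLs (Sample));

  // Only the address stays visible. According to the comment in the
  // declaration, the link text should then be just 'anso.da.ru'.
  WriteLn (DecorateURLs (Sample, [durlAddr]));

  // Every part of the URL stays visible.
  WriteLn (DecorateURLs (Sample,
    [durlProto, durlAddr, durlPort, durlPath, durlParam, durlBMark]));
end.
```

In all three calls the href attribute carries the full URL; the flags only change the text shown between the opening and closing tags.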


Note that you can easily extract any part of the URL (see the AFlags parameter).
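
To see which part of a URL lands in which subexpression, a small sketch like the following can print the groups that DecorateURLs reads (1, 3, 5, 6, 9 and 11). The pattern is copied from URLTemplate; the program name, the Groups array and the sample URL are assumptions made for illustration.

```pascal
program ShowURLParts;
{$APPTYPE CONSOLE}

uses
  RegExpr;

const
  // Same pattern as URLTemplate in DecorateURLs
  URLTemplate =
     '(?i)((FTP|HTTP)://|www\.)'
   + '([\w\d\-]+(\.[\w\d\-]+)+)'
   + '(:\d\d?\d?\d?\d?)?'
   + '(((/[%+\w\d\-\\\.]*)+)*)'
   + '(\?[^\s=&]+=[^\s=&]+(&[^\s=&]+=[^\s=&]+)*)?'
   + '(#[\w\d\-%+]+)?';
  // Subexpression numbers used by DecorateURLs:
  // prefix, address, port, path, params, bookmark
  Groups : array [0 .. 5] of integer = (1, 3, 5, 6, 9, 11);

var
  i : integer;

begin
  with TRegExpr.Create do try
     Expression := URLTemplate;
     if Exec ('http://anso.da.ru:8080/index.htm?ID=2&User=13') then
       for i := Low (Groups) to High (Groups) do
         // Match [n] is an empty string when group n did not take part
         WriteLn (Groups [i], ': ', Match [Groups [i]]);
    finally Free;
   end;
end.
```

Running it on your own sample URLs is a quick way to check the pattern before changing the flag handling.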


Conclusion

"Free your mind" ((c) The Matrix ;)) and you'll find many other tasks where regular expressions can save you an incredible amount of tedious coding work!

P.S. Sorry for my terrible English. My native language is Pascal ;)



© 2004 Andrey V. Sorokin, Saint Petersburg, Russia
anso@mail.ru
RegExpStudio.com
