2 <TITLE>File: regexpr.pas </TITLE>
\r
3 <META NAME="GENERATOR" CONTENT="PasToHTML(Bystricky Vladimir)">
\r
5 <BODY BGCOLOR="#FFFFFF">
\r
6 <A NAME=regexpr.pas><CENTER><H3>regexpr.pas</H3></A><I> from <A HREF=/proyecto/doc/src-html/AlgoWars.html> Project: AlgoWars.dpr</A></I></CENTER>
\r
9 <I><FONT COLOR="Navy">{** Implementación de expresiones regulares para Delphi}</FONT></I>
\r
10 <I><FONT COLOR="Navy">{$B-}</FONT></I>
\r
11 <B>unit</B> RegExpr;
\r
13 <I><FONT COLOR="Navy">(*
\r
16 Regular Expressions for Delphi
\r
23 anso@mail.ru, anso@usa.net
\r
25 http://anso.virtualave.net
\r
27 This library is derived from Henry Spencer sources.
\r
28 I translated the C sources into Object Pascal,
\r
29 implemented object wrapper and some new features.
\r
30 Many features suggested or partially implemented
\r
31 by TRegExpr's users (see Gratitude below).
\r
34 ---------------------------------------------------------------
\r
36 ---------------------------------------------------------------
\r
37 Copyright (c) 1999-00 by Andrey V. Sorokin <anso@mail.ru>
\r
39 This software is provided as it is, without any kind of warranty
\r
40 given. Use it at your own risk.
\r
42 You may use this software in any kind of development, including
\r
43 comercial, redistribute, and modify it freely, under the
\r
44 following restrictions :
\r
45 1. The origin of this software may not be mispresented, you must
\r
46 not claim that you wrote the original software. If you use
\r
47 this software in any kind of product, it would be appreciated
\r
48 that there in a information box, or in the documentation would
\r
49 be an acknowledgmnent like this
\r
50 Partial Copyright (c) 2000 by Andrey V. Sorokin
\r
51 2. You may not have any income from distributing this source
\r
52 to other developers. When you use this product in a comercial
\r
53 package, the source may not be charged seperatly.
\r
56 ---------------------------------------------------------------
\r
57 Legal issues for the original C sources:
\r
58 ---------------------------------------------------------------
\r
59 * Copyright (c) 1986 by University of Toronto.
\r
60 * Written by Henry Spencer. Not derived from licensed software.
\r
62 * Permission is granted to anyone to use this software for any
\r
63 * purpose on any computer system, and to redistribute it freely,
\r
64 * subject to the following restrictions:
\r
65 * 1. The author is not responsible for the consequences of use of
\r
66 * this software, no matter how awful, even if they arise
\r
67 * from defects in it.
\r
68 * 2. The origin of this software must not be misrepresented, either
\r
69 * by explicit claim or by omission.
\r
70 * 3. Altered versions must be plainly marked as such, and must not
\r
71 * be misrepresented as being the original software.
\r
74 ---------------------------------------------------------------
\r
76 ---------------------------------------------------------------
\r
78 found and fixed ugly bug in big string processing
\r
80 testing in CPPB and suggesting/implementing many features
\r
82 implemented Offset parameter
\r
86 Implemented UniCode support, found and fixed some bugs
\r
88 Implemented some features, many optimization suggestions
\r
90 And many others - for big work of bug hunting !
\r
92 I am still looking for person who can help me to translate
\r
93 this documentation into other languages (especially German)
\r
96 ---------------------------------------------------------------
\r
98 ---------------------------------------------------------------
\r
100 -=- VCL-version of TRegExpr - for dummies ;) and TRegExprEdit
\r
101 (replacement for TMaskEdit).
\r
102 Actually, I am writing non-VCL aplications (with web-based
\r
103 interfaces), so I don't need VCL's TRegExpr for myself.
\r
104 Will it be really usefull ?
\r
106 -=- full functiona<A HREF="#llity">llity</A> of braces {}
\r
108 -=- working with pascal-style string.
\r
109 Now pascal-strings converted into PChar, so
\r
110 you can't find r.e. in strings with #0 -chars.
\r
111 (suggested by Pavel O).
\r
113 -=- non-greedy style (suggested by Martin Baur)
\r
115 -=- put precalculated lengths into EXACTLY[CI] !
\r
117 -=- fInputString as string (suggested by Ralf Junker)
\r
119 -=- Add regstart optimization for case-insensitive mode ?
\r
120 Or complitely remove because FirstCharSet is faster ?
\r
122 -=- "Russian Ranges" --> National ranges (use property WordChars ?
\r
123 for ordering letters in ranges by its order in WirdsChars if modifier /r is On)
\r
125 -=- FirstCharSet as array [#0 .. #255] of REChar ?
\r
126 (2x faster then set of REChar)
\r
128 -=- p-code optimization (remove BRANCH-to-EEND, COMMENT, BACK(?)
\r
129 merge EXACTLY etc).
\r
131 I need your suggestions !
\r
132 What are more importent in this list ?
\r
133 Did I forget anything ?
\r
136 ---------------------------------------------------------------
\r
138 ---------------------------------------------------------------
\r
142 (^) upgraded implementation
\r
144 v. 0.938 2000.07.23
\r
145 -=- (^) Exeptions now jump to appropriate source line, not
\r
146 to Error procedure (I am not quite sure this is safe for
\r
147 all compiler versions. You can turn it off - remove
\r
148 reRealExceptionAddr definition below).
\r
149 -=- (^) Forgotten BSUBEXP[CI] in FillFirstCharSet caused
\r
150 exeption 'memory corruption' in case if back reference can
\r
151 be first op, like this: (a)*/1 (first subexpression can be
\r
152 skipped and we'll start matching with back reference..).
\r
154 v. 0.937 2000.06.12
\r
155 -=- (-) Bug in optimization engine (since v.0.934). In some cases
\r
156 TRegExpr didn't catch right strings.
\r
157 Thanks to Matthias Fichtner
\r
159 v. 0.936 2000.04.22
\r
160 -=- (+) Back references, like <font size=(['"]?)(/d+)/1>, see
\r
161 manual for details
\r
162 -=- (+) Wide hex char support, like '/x{263a}'
\r
164 v. 0.935 2000.04.19 (by Yury Finkel)
\r
165 -=- (-) fInvertCase now isn't readonly ;)
\r
166 -=- (-) UniCode mode compiling errors
\r
168 v. 0.934 2000.04.17
\r
169 -=- (^) New ranges implementation (range matching now is very fast
\r
170 - uses one(!) CPU instruction)
\r
171 -=- (^) Internal p-code structure converted into 32-bits - works
\r
172 faster and now there is no 64K limit for compiled r.e.
\r
173 -=- (^) '{m,n}' now use 32-bits arguments (up to 2147483646) - specially
\r
174 for Dmitry Veprintsev ;)
\r
175 -=- (^) Ranges now support metachars: [/n-/x0D] -> #10,#11,#12,#13;
\r
176 Changed '-' processing, now it's like in Perl:
\r
177 [/d-t] -> '0'..'9','-','t'; []-a] -> ']'..'a'
\r
178 -=- (-) Bug with /t and etc macro (they worked only in ranges)
\r
179 Thanks to Yury Finkel
\r
180 -=- (^) Added new preprocessing optimization (see FirstCharSet).
\r
181 Incredible fast (!). But be carefull it isn's properly tested.
\r
182 You can switch it Off - remove UseFirstCharSet definition.
\r
183 -=- (^) Many other speed optimizations
\r
184 -=- (-) Case-insensitive mode now support system-defined national
\r
185 charset (due to bug in v.0.90 .. 0.926 supported only english one)
\r
186 -=- (^) Case-insensitive mode implemented with InvertCase (param &
\r
187 result of REChar type) - works 10 .. 100 times faster.
\r
188 -=- (^) Match and ExecNext interfaces optimized, added IsProgrammOk
\r
190 -=- (^) Increased NSUBEXP (now 15) and fixed code for this, now you
\r
191 can simply increase NSUBEXP constant by yourself.
\r
192 Suggested by Alexander V. Akimov.
\r
193 -=- (^+) Substitute adapted for NSUBEXP > 10 and significant (!)
\r
194 optimized, improved error checking.
\r
195 ATTENTION! Read new Substitute description - syntax was changed !
\r
196 -=- (+) SpaceChars & WordChars property - now you may change chars
\r
197 treated as /s & /w. By defauled assigned RegExprSpaceChars/WordChars
\r
198 -=- (+) Now /s and /w supported in ranges
\r
199 -=- (-) Infinite loop if end of range=#$FF
\r
200 Thanks to Andrey Kolegov
\r
201 -=- (+) Function QuoteRegExprMetaChars (see description)
\r
202 -=- (+) UniCode support - sorry, works VERY slow (remove '.' from
\r
203 {.$DEFINE UniCode} after this comment for unicode version).
\r
204 Implemented by Yury Finkel
\r
206 v. 0.926 2000.02.26
\r
207 -=- (-) Old bug derived from H.Spencer sources - SPSTART was
\r
208 set for '?' and '*' instead of '*', '{m,n}' and '+'.
\r
209 -=- (-^) Now {m,n} works like Perl's one - error occures only
\r
210 if m > n or n > BracesMax (BracesMax = 255 in this version).
\r
211 In other cases (no m or nondigit symbols in m or n values,
\r
212 or no '}') symbol '{' will be compiled as literal.
\r
213 Note: so, you must include m value (use {0,n} instead of {,n}).
\r
214 Note: {m,} will be compiled as {m,BracesMax}.
\r
215 -=- (-^) CaseInsensitive mode now support ranges
\r
216 '(?i)[a]' == '[aA]'
\r
217 -=- (^) Roman-number template in TestRExp ;)
\r
218 -=- (+^) Beta version of complex-braces - like ((abc){1,2}|d){3}
\r
219 By default its turned off. If you want take part in beta-testing,
\r
220 please, remove '.' from {.$DEFINE ComplexBraces} below this comments.
\r
221 -=- (-^) Removed /b metachar (in Perl it isn't BS as in my implementation,
\r
223 -=- (+) Add /s modifier. Bu I am not sure that it's ok for Windows.
\r
224 I implemented it as [^/n] for '.' metachar in non-/s mode.
\r
225 But lines separated by /n/r in windows. I need you suggestions !
\r
226 -=- (^) Sorry, but I had to rename Modifiers to ModifierStr
\r
227 (ModifierS uses for /s now)
\r
229 v. 0.91 2000.02.02
\r
230 -=- (^) some changes in documentation and demo-project.
\r
232 v. 0.90 2000.01.31
\r
233 -=- (+) implemented braces repetitions {min,max}.
\r
234 Sorry - only simple cases now - like '/d{2,3}'
\r
235 or '[a-z1-9]{,7}', but not (abc){2,3} ..
\r
236 I still too short in time.
\r
237 Wait for future versions of TRegExpr or
\r
238 implement it by youself and share with me ;)
\r
239 -=- (+) implemented case-insensitive modifier and way
\r
240 to work with other modifiers - see properties
\r
241 Modifiers, Modifier, ModifierI
\r
242 and (?ismx-ismx) Perl extension.
\r
243 You may use global variables RegExpr* for assigning
\r
244 default modifier values.
\r
245 -=- (+) property ExtSyntaxEnabled changed to 'r'-modifier
\r
246 (russian extensions - see documentation)
\r
247 -=- (+) implemented (?#comment) Perl extension - very hard
\r
248 and usefull work ;)
\r
249 -=- (^) property MatchCount renamed to SubExprMatchCount.
\r
250 Sorry for any inconvenients, but it's because new
\r
251 version works slightly different and if you used
\r
252 MatchCount in your programms you have to rethink
\r
253 it ! (see comments to this property)
\r
254 -=- (+) add InputString property - stores input string
\r
255 from last Exec call. You may directly assign values
\r
256 to this property for using in ExecPos method.
\r
257 -=- (+) add ExecPos method - for working with assigned
\r
258 to InputString property. You may use it like this
\r
259 InputString := AString;
\r
262 InputString := AString;
\r
263 ExecPos (AOffset);
\r
264 Note: ExecPos without parameter works only in
\r
265 Delphi 4 or higher.
\r
266 -=- (+) add ExecNext method - simple and fast (!) way to finding
\r
267 multiple occurences of r.e. in big input string.
\r
268 -=- (^) Offset parameter removed from Exec method, if you
\r
269 used it in your programs, please replace all
\r
270 Exec (AString, AOffset)
\r
272 InputString := AString; ExecPos (AOffset)
\r
273 Sorry for any inconvenients, but old design
\r
274 (see v.0.81) was too ugly :(
\r
275 In addition, multiple Exec calls with same input
\r
276 string produce fool overhead because each Exec
\r
277 reallocate input string buffer.
\r
278 -=- (^) optimized implementation of Substitution,
\r
279 Replace and Split methods
\r
280 -=- (-) fixed minor bug - if r.e. compilation raise error
\r
281 during second pass (!!! I think it's impossible
\r
282 in really practice), TRegExpr stayed in 'compiled'
\r
284 -=- (-) fixed bug - Dump method didn't check program existance
\r
285 and raised 'access violation' if previouse Exec
\r
286 was finished with error.
\r
287 -=- (+) changed error handling (see functions Error, ErrorMsg,
\r
288 LastError, property CompilerErrorPos, type ERegExpr).
\r
289 -=- (-^) TRegExpr.Replace, Split and ExecNext made a infinite
\r
290 loop in case of r.e. match empty-string.
\r
291 Now ExecNext moves by MatchLen if MatchLen <> 0
\r
292 and by +1 if MatchLen = 0
\r
293 Thanks to Jon Smith and George Tasker for bugreports.
\r
294 -=- (-) While playing with null-matchs I discovered, that
\r
295 null-match at tail of input string is never found.
\r
296 Well, I fixed this, but I am not sure this is safe
\r
297 (MatchPos[0]=length(AInputString)+1, MatchLen = 0).
\r
298 Any suggetions are very appreciated.
\r
299 -=- (^) Demo project and documentation was upgraded
\r
300 -=- (^) Documentation and this version was published on my home page
\r
304 v. 0.81 1999.12.25 // Merry Christmas ! :)
\r
305 -=- added /s (AnySpace) and /S (NotSpace) meta-symbols
\r
306 - implemented by Stephan Klimek with minor fixes by AVS
\r
307 -=- added /f, /a and /b chars (translates into FF, BEL, BS)
\r
308 -=- removed meta-symbols 'ö' & 'Ö' - sorry for any inconvenients
\r
309 -=- added Match property (== copy (InputStr, MatchPos [Idx], MatchLen [Idx]))
\r
310 -=- added extra parameter Offset to Exec method
\r
311 (thanks to Steve Mudford)
\r
314 -=- fixed bug - in some cases the r.e. [^...]
\r
315 incorrectly processed (as any symbol)
\r
316 (thanks to Jan Korycan)
\r
317 -=- Some changes and improvements in TestRExp.dpr
\r
319 v. 0.6 1999.08.13 (Friday 13 !)
\r
320 -=- changed header of TRegExpr.Substitute
\r
321 -=- added Split, Replace & appropriate
\r
322 global wrappers (thanks to Stephan Klimek for suggetions)
\r
325 -=- TRegExpr.Substitute routine added
\r
326 -=- Some changes and improvements in TestRExp.dpr
\r
327 -=- Fixed bug in english version of documentation
\r
328 (Thanks to Jon Buckheit)
\r
331 -=- Fixed bug with parsing of strings longer then 255 bytes
\r
332 (thanks to Guido Muehlwitz)
\r
333 -=- Fixed bug in RegMatch - mathes only first occurence of r.e.
\r
334 (thanks to Stephan Klimek)
\r
337 -=- ExecRegExpr function
\r
340 -=- packed into object-pascal class
\r
341 -=- code slightly rewriten for pascal
\r
342 -=- now macro correct proceeded in ranges
\r
343 -=- r.e.ranges syntax extended for russian letters ranges:
\r
344 à-ÿ - replaced with all small russian letters (Win1251)
\r
345 À-ß - replaced with all capital russian letters (Win1251)
\r
346 à-ß - replaced with all russian letters (Win1251)
\r
347 -=- added macro '/d' (opcode ANYDIGIT) - match any digit
\r
348 -=- added macro '/D' (opcode NOTDIGIT) - match not digit
\r
349 -=- added macro '/w' (opcode ANYLETTER) - match any english letter or '_'
\r
350 -=- added macro '/W' (opcode NOTLETTER) - match not english letter or '_'
\r
351 (all r.e.syntax extensions may be turned off by flag ExtSyntax)
\r
354 first version, with bugs, without help => must die :(
\r
358 <I><FONT COLOR="Navy">{$DEFINE DebugRegExpr}</FONT></I> <I><FONT COLOR="Navy">// define for dump/trace enabling </FONT></I>
\r
360 <I><FONT COLOR="Navy">{$DEFINE reRealExceptionAddr}</FONT></I> <I><FONT COLOR="Navy">// if defined then exceptions will </FONT></I>
\r
361 <I><FONT COLOR="Navy">// jump to appropriate source line, not to Error procedure </FONT></I>
\r
363 <I><FONT COLOR="Navy">{.$DEFINE ComplexBraces}</FONT></I> <I><FONT COLOR="Navy">// define for beta-version of braces </FONT></I>
\r
364 <I><FONT COLOR="Navy">// (in stable version it works only for simple cases) </FONT></I>
\r
366 <I><FONT COLOR="Navy">{.$DEFINE UniCode}</FONT></I> <I><FONT COLOR="Navy">// define for Unicode support </FONT></I>
\r
368 <I><FONT COLOR="Navy">{$IFNDEF UniCode}</FONT></I> <I><FONT COLOR="Navy">// optionts applicable only for non-UniCode </FONT></I>
\r
369 <I><FONT COLOR="Navy">{$DEFINE UseSetOfChar}</FONT></I> <I><FONT COLOR="Navy">// Significant optimization by using set of char </FONT></I>
\r
370 <I><FONT COLOR="Navy">{$ENDIF}</FONT></I>
\r
372 <I><FONT COLOR="Navy">{$IFDEF UseSetOfChar}</FONT></I>
\r
373 <I><FONT COLOR="Navy">{$DEFINE UseFirstCharSet}</FONT></I> <I><FONT COLOR="Navy">// Significant optimization inm some cases </FONT></I>
\r
374 <I><FONT COLOR="Navy">{$ENDIF}</FONT></I>
\r
378 <I><FONT COLOR="Navy">// Determine version (for using 'params by default') </FONT></I>
\r
379 <I><FONT COLOR="Navy">{$IFNDEF VER80}</FONT></I> <I><FONT COLOR="Navy">{ Delphi 1.0}</FONT></I>
\r
380 <I><FONT COLOR="Navy">{$IFNDEF VER90}</FONT></I> <I><FONT COLOR="Navy">{ Delphi 2.0}</FONT></I>
\r
381 <I><FONT COLOR="Navy">{$IFNDEF VER93}</FONT></I> <I><FONT COLOR="Navy">{ C++Builder 1.0}</FONT></I>
\r
382 <I><FONT COLOR="Navy">{$IFNDEF VER100}</FONT></I> <I><FONT COLOR="Navy">{ Borland Delphi 3.0}</FONT></I>
\r
383 <I><FONT COLOR="Navy">{$DEFINE D4_}</FONT></I> <I><FONT COLOR="Navy">{ Delphi 4.0 or higher}</FONT></I>
\r
384 <I><FONT COLOR="Navy">{$ENDIF}</FONT></I>
\r
385 <I><FONT COLOR="Navy">{$ENDIF}</FONT></I>
\r
386 <I><FONT COLOR="Navy">{$ENDIF}</FONT></I>
\r
387 <I><FONT COLOR="Navy">{$ENDIF}</FONT></I>
\r
388 <I><FONT COLOR="Navy">{.$IFNDEF VER110}</FONT></I> <I><FONT COLOR="Navy">{ Borland C++Builder 3.0}</FONT></I>
\r
389 <I><FONT COLOR="Navy">{.$IFNDEF VER120}</FONT></I> <I><FONT COLOR="Navy">{Borland Delphi 4.0}</FONT></I>
\r
393 Classes, <I><FONT COLOR="Navy">// TStrings in Split method </FONT></I>
\r
394 SysUtils; <I><FONT COLOR="Navy">// Exception </FONT></I>
\r
398 <I><FONT COLOR="Navy">{$IFDEF UniCode}</FONT></I>
\r
399 PRegExprChar = PWideChar;
\r
400 RegExprString = WideString;
\r
401 REChar = WideChar;
\r
402 <I><FONT COLOR="Navy">{$ELSE}</FONT></I>
\r
403 PRegExprChar = PChar;
\r
404 RegExprString = <B>string</B>;
\r
406 <I><FONT COLOR="Navy">{$ENDIF}</FONT></I>
\r
407 TREOp = REChar; <I><FONT COLOR="Navy">// internal p-code type //###0.933 </FONT></I>
\r
409 TRENextOff = integer; <I><FONT COLOR="Navy">// internal Next "pointer" (offset to current p-code) //###0.933 </FONT></I>
\r
410 PRENextOff = ^TRENextOff; <I><FONT COLOR="Navy">// used for extracting Next "pointers" from compiled r.e. //###0.933 </FONT></I>
\r
411 TREBracesArg = integer; <I><FONT COLOR="Navy">// type of {m,n} arguments </FONT></I>
\r
412 PREBracesArg = ^TREBracesArg;
\r
415 REOpSz = SizeOf (TREOp) <B>div</B> SizeOf (REChar); <I><FONT COLOR="Navy">// size of p-code in RegExprString units </FONT></I>
\r
416 RENextOffSz = SizeOf (TRENextOff) <B>div</B> SizeOf (REChar); <I><FONT COLOR="Navy">// size of Next 'pointer' -"- </FONT></I>
\r
417 REBracesArgSz = SizeOf (TREBracesArg) <B>div</B> SizeOf (REChar); <I><FONT COLOR="Navy">// size of BRACES arguments -"- </FONT></I>
\r
420 TRegExprInvertCaseFunction = <B>function</B> (<B>const</B> Ch : REChar) : REChar
\r
421 <B>of</B> <B>object</B>;
\r
424 RegExprModifierI : boolean = False;
\r
425 <I><FONT COLOR="Navy">// default value for ModifierI </FONT></I>
\r
427 RegExprModifierR : boolean = True;
\r
428 <I><FONT COLOR="Navy">// default value for ModifierR </FONT></I>
\r
430 RegExprModifierS : boolean = True;
\r
431 <I><FONT COLOR="Navy">// default value for ModifierS </FONT></I>
\r
433 RegExprSpaceChars : RegExprString = <I><FONT COLOR="Navy">// chars for /s & /S </FONT></I>
\r
434 ' '#$9#$A#$D#$C; <I><FONT COLOR="Navy">// default for SpaceChars property </FONT></I>
\r
436 RegExprWordChars : RegExprString = <I><FONT COLOR="Navy">// chars for /w & /W </FONT></I>
\r
437 <I><FONT COLOR="Navy">// Ampliado por Leandro Lucarella (11/00) </FONT></I>
\r
438 'abcdefghijklmnopqrstuvwxyzáéíóúýäëïöüÿàèìòùâêîôûñç'
\r
439 + 'ABCDEFGHIJKLMNOPQRSTUVWXYZÁÉÍÓÚÝÄËÏÖÜ
\9fÀÈÌÒÙÂÊÎÔÛÑÇ_'; <I><FONT COLOR="Navy">// default for WordChars property </FONT></I>
\r
443 NSUBEXP = 15; <I><FONT COLOR="Navy">// max number of subexpression //###0.929 </FONT></I>
\r
444 <I><FONT COLOR="Navy">// Be carefull - don't use values which overflow CLOSE opcode </FONT></I>
\r
445 <I><FONT COLOR="Navy">// (in this case you'll get compiler erorr). </FONT></I>
\r
446 <I><FONT COLOR="Navy">// Big NSUBEXP will cause more slow work and more stack required </FONT></I>
\r
447 MaxBracesArg = $7FFFFFFF - 1; <I><FONT COLOR="Navy">// max value for {n,m} arguments //###0.933 </FONT></I>
\r
448 <I><FONT COLOR="Navy">{$IFDEF ComplexBraces}</FONT></I>
\r
449 LoopStackMax = 10; <I><FONT COLOR="Navy">// max depth of loops stack //###0.925 </FONT></I>
\r
450 <I><FONT COLOR="Navy">{$ENDIF}</FONT></I>
\r
452 <I><FONT COLOR="Navy">// if range includes more then TinySetLen chars, //###0.934 </FONT></I>
\r
453 <I><FONT COLOR="Navy">// then use full (32 bytes) ANYOFFULL instead of ANYOF[BUT]TINYSET </FONT></I>
\r
454 <I><FONT COLOR="Navy">// !!! Attension ! If you change TinySetLen, you must </FONT></I>
\r
455 <I><FONT COLOR="Navy">// change code marked as "//!!!TinySet" </FONT></I>
\r
460 <I><FONT COLOR="Navy">{$IFDEF UseSetOfChar}</FONT></I>
\r
461 PSetOfREChar = ^TSetOfREChar;
\r
462 TSetOfREChar = <B>set</B> <B>of</B> REChar;
\r
463 <I><FONT COLOR="Navy">{$ENDIF}</FONT></I>
\r
465 <I><FONT COLOR="Navy">{** Clase auxiliar que implementa expresiones Regulares. Se ultiliza para validar los campos de texto}</FONT></I>
\r
466 TRegExpr = <B>class</B>
\r
468 startp : <B>array</B> [0 .. NSUBEXP - 1] <B>of</B> PRegExprChar; <I><FONT COLOR="Navy">// founded expr starting points </FONT></I>
\r
469 endp : <B>array</B> [0 .. NSUBEXP - 1] <B>of</B> PRegExprChar; <I><FONT COLOR="Navy">// founded expr end points </FONT></I>
\r
471 <I><FONT COLOR="Navy">{$IFDEF ComplexBraces}</FONT></I>
\r
472 LoopStack : <B>array</B> [1 .. LoopStackMax] <B>of</B> integer; <I><FONT COLOR="Navy">// state before entering loop </FONT></I>
\r
473 LoopStackIdx : integer; <I><FONT COLOR="Navy">// 0 - out of all loops </FONT></I>
\r
474 <I><FONT COLOR="Navy">{$ENDIF}</FONT></I>
\r
476 <I><FONT COLOR="Navy">// The "internal use only" fields to pass info from compile </FONT></I>
\r
477 <I><FONT COLOR="Navy">// to execute that permits the execute phase to run lots faster on </FONT></I>
\r
478 <I><FONT COLOR="Navy">// simple cases. </FONT></I>
\r
479 regstart : REChar; <I><FONT COLOR="Navy">// char that must begin a match; '/0' if none obvious </FONT></I>
\r
480 reganch : REChar; <I><FONT COLOR="Navy">// is the match anchored (at beginning-of-line only)? </FONT></I>
\r
481 regmust : PRegExprChar; <I><FONT COLOR="Navy">// string (pointer into program) that match must include, or nil </FONT></I>
\r
482 regmlen : integer; <I><FONT COLOR="Navy">// length of regmust string </FONT></I>
\r
483 <I><FONT COLOR="Navy">// Regstart and reganch permit very fast decisions on suitable starting points </FONT></I>
\r
484 <I><FONT COLOR="Navy">// for a match, cutting down the work a lot. Regmust permits fast rejection </FONT></I>
\r
485 <I><FONT COLOR="Navy">// of lines that cannot possibly match. The regmust tests are costly enough </FONT></I>
\r
486 <I><FONT COLOR="Navy">// that regcomp() supplies a regmust only if the r.e. contains something </FONT></I>
\r
487 <I><FONT COLOR="Navy">// potentially expensive (at present, the only such thing detected is * or + </FONT></I>
\r
488 <I><FONT COLOR="Navy">// at the start of the r.e., which can involve a lot of backup). Regmlen is </FONT></I>
\r
489 <I><FONT COLOR="Navy">// supplied because the test in regexec() needs it and regcomp() is computing </FONT></I>
\r
490 <I><FONT COLOR="Navy">// it anyway. </FONT></I>
\r
491 <I><FONT COLOR="Navy">{$IFDEF UseFirstCharSet}</FONT></I> <I><FONT COLOR="Navy">//###0.929 </FONT></I>
\r
492 FirstCharSet : TSetOfREChar;
\r
493 <I><FONT COLOR="Navy">{$ENDIF}</FONT></I>
\r
495 <I><FONT COLOR="Navy">// work variables for Exec's routins - save stack in recursion} </FONT></I>
\r
496 reginput : PRegExprChar; <I><FONT COLOR="Navy">// String-input pointer. </FONT></I>
\r
497 fInputStart : PRegExprChar; <I><FONT COLOR="Navy">// Pointer to first char of input string. </FONT></I>
\r
498 fInputEnd : PRegExprChar; <I><FONT COLOR="Navy">// Pointer to char AFTER last char of input string </FONT></I>
\r
500 <I><FONT COLOR="Navy">// work variables for compiler's routines </FONT></I>
\r
501 regparse : PRegExprChar; <I><FONT COLOR="Navy">// Input-scan pointer. </FONT></I>
\r
502 regnpar : integer; <I><FONT COLOR="Navy">// count. </FONT></I>
\r
504 regcode : PRegExprChar; <I><FONT COLOR="Navy">// Code-emit pointer; @regdummy = don't. </FONT></I>
\r
505 regsize : integer; <I><FONT COLOR="Navy">// Code size. </FONT></I>
\r
507 regexpbeg : PRegExprChar; <I><FONT COLOR="Navy">// only for error handling. Contains </FONT></I>
\r
508 <I><FONT COLOR="Navy">// pointer to beginning of r.e. while compiling </FONT></I>
\r
509 fExprIsCompiled : boolean; <I><FONT COLOR="Navy">// true if r.e. successfully compiled </FONT></I>
\r
511 <I><FONT COLOR="Navy">// programm is essentially a linear encoding </FONT></I>
\r
512 <I><FONT COLOR="Navy">// of a nondeterministic finite-state machine (aka syntax charts or </FONT></I>
\r
513 <I><FONT COLOR="Navy">// "railroad normal form" in parsing technology). Each node is an opcode </FONT></I>
\r
514 <I><FONT COLOR="Navy">// plus a "next" pointer, possibly plus an operand. "Next" pointers of </FONT></I>
\r
515 <I><FONT COLOR="Navy">// all nodes except BRANCH implement concatenation; a "next" pointer with </FONT></I>
\r
516 <I><FONT COLOR="Navy">// a BRANCH on both ends of it is connecting two alternatives. (Here we </FONT></I>
\r
517 <I><FONT COLOR="Navy">// have one of the subtle syntax dependencies: an individual BRANCH (as </FONT></I>
\r
518 <I><FONT COLOR="Navy">// opposed to a collection of them) is never concatenated with anything </FONT></I>
\r
519 <I><FONT COLOR="Navy">// because of operator precedence.) The operand of some types of node is </FONT></I>
\r
520 <I><FONT COLOR="Navy">// a literal string; for others, it is a node leading into a sub-FSM. In </FONT></I>
\r
521 <I><FONT COLOR="Navy">// particular, the operand of a BRANCH node is the first node of the branch. </FONT></I>
\r
522 <I><FONT COLOR="Navy">// (NB this is *not* a tree structure: the tail of the branch connects </FONT></I>
\r
523 <I><FONT COLOR="Navy">// to the thing following the set of BRANCHes.) The opcodes are: </FONT></I>
\r
524 programm : PRegExprChar; <I><FONT COLOR="Navy">// Unwarranted chumminess with compiler. </FONT></I>
\r
526 fExpression : PRegExprChar; <I><FONT COLOR="Navy">// source of compiled r.e. </FONT></I>
\r
527 fInputString : PRegExprChar; <I><FONT COLOR="Navy">// input string </FONT></I>
\r
529 fLastError : integer; <I><FONT COLOR="Navy">// see Error, LastError </FONT></I>
\r
531 fModifiers : integer; <I><FONT COLOR="Navy">// modifiers </FONT></I>
\r
532 fCompModifiers : integer; <I><FONT COLOR="Navy">// compiler's copy of modifiers </FONT></I>
\r
533 fProgModifiers : integer; <I><FONT COLOR="Navy">// values modifiers from last programm compilation </FONT></I>
\r
535 fSpaceChars : RegExprString; <I><FONT COLOR="Navy">//###0.927 </FONT></I>
\r
536 fWordChars : RegExprString; <I><FONT COLOR="Navy">//###0.929 </FONT></I>
\r
537 fInvertCase : TRegExprInvertCaseFunction; <I><FONT COLOR="Navy">//###0.927 </FONT></I>
\r
539 <B>function</B> IsProgrammOk : boolean; <I><FONT COLOR="Navy">//###0.929 </FONT></I>
\r
541 <B>procedure</B> CheckCompModifiers;
\r
542 <I><FONT COLOR="Navy">// if modifiers was changed after programm compilation - recompile it ! </FONT></I>
\r
544 <B>function</B> GetExpression : RegExprString;
\r
545 <B>procedure</B> SetExpression (<B>const</B> s : RegExprString);
\r
547 <B>function</B> GetModifierStr : RegExprString;
\r
548 <B>function</B> SetModifiersInt (<B>const</B> AModifiers : RegExprString; <B>var</B> AModifiersInt : integer) : boolean;
\r
549 <B>procedure</B> SetModifierStr (<B>const</B> AModifiers : RegExprString);
\r
551 <B>function</B> GetModifier (AIndex : integer) : boolean;
\r
552 <B>procedure</B> SetModifier (AIndex : integer; ASet : boolean);
\r
554 <B>procedure</B> Error (AErrorID : integer); <B>virtual</B>; <I><FONT COLOR="Navy">// error handler. </FONT></I>
\r
555 <I><FONT COLOR="Navy">// Default handler raise exception ERegExpr with </FONT></I>
\r
556 <I><FONT COLOR="Navy">// Message = ErrorMsg (AErrorID), ErrorCode = AErrorID </FONT></I>
\r
557 <I><FONT COLOR="Navy">// and CompilerErrorPos = value of property CompilerErrorPos. </FONT></I>
\r
560 <I><FONT COLOR="Navy">{==================== Compiler section ===================}</FONT></I>
\r
561 <B>function</B> CompileRegExpr (exp : PRegExprChar) : boolean;
\r
562 <I><FONT COLOR="Navy">// compile a regular expression into internal code </FONT></I>
\r
564 <B>procedure</B> Tail (p : PRegExprChar; val : PRegExprChar);
\r
565 <I><FONT COLOR="Navy">// set the next-pointer at the end of a node chain </FONT></I>
\r
567 <B>procedure</B> OpTail (p : PRegExprChar; val : PRegExprChar);
\r
568 <I><FONT COLOR="Navy">// regoptail - regtail on operand of first argument; nop if operandless </FONT></I>
\r
570 <B>function</B> EmitNode (op : TREOp) : PRegExprChar;
\r
571 <I><FONT COLOR="Navy">// regnode - emit a node, return location </FONT></I>
\r
573 <B>procedure</B> EmitC (b : REChar);
\r
574 <I><FONT COLOR="Navy">// emit (if appropriate) a byte of code </FONT></I>
\r
576 <B>procedure</B> InsertOperator (op : TREOp; opnd : PRegExprChar; sz : integer); <I><FONT COLOR="Navy">//###0.90 </FONT></I>
\r
577 <I><FONT COLOR="Navy">// insert an operator in front of already-emitted operand </FONT></I>
\r
578 <I><FONT COLOR="Navy">// Means relocating the operand. </FONT></I>
\r
580 <B>function</B> ParseReg (paren : integer; <B>var</B> flagp : integer) : PRegExprChar;
\r
581 <I><FONT COLOR="Navy">// regular expression, i.e. main body or parenthesized thing </FONT></I>
\r
583 <B>function</B> ParseBranch (<B>var</B> flagp : integer) : PRegExprChar;
\r
584 <I><FONT COLOR="Navy">// one alternative of an | operator </FONT></I>
\r
586 <B>function</B> ParsePiece (<B>var</B> flagp : integer) : PRegExprChar;
\r
587 <I><FONT COLOR="Navy">// something followed by possible [*+?] </FONT></I>
\r
589 <B>function</B> ParseAtom (<B>var</B> flagp : integer) : PRegExprChar;
\r
590 <I><FONT COLOR="Navy">// the lowest level </FONT></I>
\r
592 <B>function</B> GetCompilerErrorPos : integer;
\r
593 <I><FONT COLOR="Navy">// current pos in r.e. - for error hanling </FONT></I>
\r
595 <I><FONT COLOR="Navy">{$IFDEF UseFirstCharSet}</FONT></I> <I><FONT COLOR="Navy">//###0.929 </FONT></I>
\r
596 <B>procedure</B> FillFirstCharSet (prog : PRegExprChar);
\r
597 <I><FONT COLOR="Navy">{$ENDIF}</FONT></I>
\r
599 <I><FONT COLOR="Navy">{===================== Mathing section ===================}</FONT></I>
\r
600 <B>function</B> regrepeat (p : PRegExprChar; AMax : integer) : integer;
\r
601 <I><FONT COLOR="Navy">// repeatedly match something simple, report how many </FONT></I>
\r
603 <B>function</B> regnext (p : PRegExprChar) : PRegExprChar;
\r
604 <I><FONT COLOR="Navy">// dig the "next" pointer out of a node </FONT></I>
\r
606 <B>function</B> MatchPrim (prog : PRegExprChar) : boolean;
\r
607 <I><FONT COLOR="Navy">// recursively matching routine </FONT></I>
\r
609 <B>function</B> RegMatch (str : PRegExprChar) : boolean;
\r
610 <I><FONT COLOR="Navy">// try match at specific point, uses MatchPrim for real work </FONT></I>
\r
612 <B>function</B> ExecPrim (AOffset: integer) : boolean;
\r
613 <I><FONT COLOR="Navy">// Exec for stored InputString </FONT></I>
\r
615 <I><FONT COLOR="Navy">{$IFDEF DebugRegExpr}</FONT></I>
\r
616 <B>function</B> DumpOp (op : REChar) : RegExprString;
\r
617 <I><FONT COLOR="Navy">{$ENDIF}</FONT></I>
\r
619 <B>function</B> GetSubExprMatchCount : integer;
\r
620 <B>function</B> GetMatchPos (Idx : integer) : integer;
\r
621 <B>function</B> GetMatchLen (Idx : integer) : integer;
\r
622 <B>function</B> GetMatch (Idx : integer) : RegExprString;
\r
624 <B>function</B> GetInputString : RegExprString;
\r
625 <B>procedure</B> SetInputString (<B>const</B> AInputString : RegExprString);
\r
627 <I><FONT COLOR="Navy">{$IFNDEF UseSetOfChar}</FONT></I>
\r
628 <B>function</B> StrScanCI (s : PRegExprChar; ch : REChar) : PRegExprChar; <I><FONT COLOR="Navy">//###0.928 </FONT></I>
\r
629 <I><FONT COLOR="Navy">{$ENDIF}</FONT></I>
\r
632 <B>constructor</B> Create;
\r
633 <B>destructor</B> Destroy; <B>override</B>;
\r
635 <B>property</B> Expression : RegExprString <B>read</B> GetExpression <B>write</B> SetExpression;
\r
636 <I><FONT COLOR="Navy">// regular expression </FONT></I>
\r
637 <I><FONT COLOR="Navy">// When you assign r.e. to this property, TRegExpr will automatically </FONT></I>
\r
638 <I><FONT COLOR="Navy">// compile it and store in internal structures. </FONT></I>
\r
639 <I><FONT COLOR="Navy">// In case of compilation error, Error method will be called </FONT></I>
\r
640 <I><FONT COLOR="Navy">// (by default Error method raises exception ERegExpr - see below) </FONT></I>
\r
642 <B>property</B> ModifierStr : RegExprString <B>read</B> GetModifierStr <B>write</B> SetModifierStr;
\r
643 <I><FONT COLOR="Navy">// Set/get default values of r.e.syntax modifiers. Modifiers in </FONT></I>
\r
644 <I><FONT COLOR="Navy">// r.e. (?ismx-ismx) will replace this default values. </FONT></I>
\r
645 <I><FONT COLOR="Navy">// If you try to set unsupported modifier, Error will be called </FONT></I>
\r
646 <I><FONT COLOR="Navy">// (by defaul Error raises exception ERegExpr). </FONT></I>
\r
648 <B>property</B> ModifierI : boolean <B>index</B> 1 <B>read</B> GetModifier <B>write</B> SetModifier;
\r
649 <I><FONT COLOR="Navy">// Modifier /i - caseinsensitive, false by default </FONT></I>
\r
651 <B>property</B> ModifierR : boolean <B>index</B> 2 <B>read</B> GetModifier <B>write</B> SetModifier;
\r
652 <I><FONT COLOR="Navy">// Modifier /r - use r.e.syntax extended for russian, true by default </FONT></I>
\r
653 <I><FONT COLOR="Navy">// (was property ExtSyntaxEnabled in previous versions) </FONT></I>
\r
654 <I><FONT COLOR="Navy">// If true, then à-ÿ additional include russian letter '¸', </FONT></I>
\r
655 <I><FONT COLOR="Navy">// À-ß additional include '¨', and à-ß include all russian symbols. </FONT></I>
\r
656 <I><FONT COLOR="Navy">// You have to turn it off if it may interfere with you national alphabet. </FONT></I>
\r
658 <B>property</B> ModifierS : boolean <B>index</B> 3 <B>read</B> GetModifier <B>write</B> SetModifier;
\r
659 <I><FONT COLOR="Navy">// Modifier /s - '.' works as any char (else as [^/n]), </FONT></I>
\r
660 <I><FONT COLOR="Navy">// true by default </FONT></I>
\r
662 <B>function</B> Exec (<B>const</B> AInputString : RegExprString) : boolean;
\r
663 <I><FONT COLOR="Navy">// match a programm against a string AInputString </FONT></I>
\r
664 <I><FONT COLOR="Navy">// !!! Exec store AInputString into InputString property </FONT></I>
\r
666 <B>function</B> ExecNext : boolean;
\r
667 <I><FONT COLOR="Navy">// find next match: </FONT></I>
\r
668 <I><FONT COLOR="Navy">// Exec (AString); ExecNext; </FONT></I>
\r
669 <I><FONT COLOR="Navy">// works same as </FONT></I>
\r
670 <I><FONT COLOR="Navy">// Exec (AString); </FONT></I>
\r
671 <I><FONT COLOR="Navy">// if MatchLen [0] = 0 then ExecPos (MatchPos [0] + 1) </FONT></I>
\r
672 <I><FONT COLOR="Navy">// else ExecPos (MatchPos [0] + MatchLen [0]); </FONT></I>
\r
673 <I><FONT COLOR="Navy">// but it's more simpler ! </FONT></I>
\r
675 <B>function</B> ExecPos (AOffset: integer <I><FONT COLOR="Navy">{$IFDEF D4_}</FONT></I>= 1<I><FONT COLOR="Navy">{$ENDIF}</FONT></I>) : boolean;
\r
676 <I><FONT COLOR="Navy">// find match for InputString starting from AOffset position </FONT></I>
\r
677 <I><FONT COLOR="Navy">// (AOffset=1 - first char of InputString) </FONT></I>
\r
679 <B>property</B> InputString : RegExprString <B>read</B> GetInputString <B>write</B> SetInputString;
\r
680 <I><FONT COLOR="Navy">// returns current input string (from last Exec call or last assign </FONT></I>
\r
681 <I><FONT COLOR="Navy">// to this property). </FONT></I>
\r
682 <I><FONT COLOR="Navy">// Any assignment to this property clear Match* properties ! </FONT></I>
\r
684 <B>function</B> Substitute (<B>const</B> ATemplate : RegExprString) : RegExprString;
\r
685 <I><FONT COLOR="Navy">// Returns ATemplate with '$&' or '$0' replaced by whole r.e. </FONT></I>
\r
686 <I><FONT COLOR="Navy">// occurence and '$n' replaced by occurence of subexpression #n. </FONT></I>
\r
687 <I><FONT COLOR="Navy">// Since v.0.929 '$' used instead of '/' (for future extensions </FONT></I>
\r
688 <I><FONT COLOR="Navy">// and for more Perl-compatibility) and accept more then one digit. </FONT></I>
\r
689 <I><FONT COLOR="Navy">// If you want place into template raw '$' or '/', use prefix '/' </FONT></I>
\r
690 <I><FONT COLOR="Navy">// Example: '1/$ is $2//rub//' -> '1$ is <Match[2]>/rub/' </FONT></I>
\r
691 <I><FONT COLOR="Navy">// If you want to place raw digit after '$n' you must delimit </FONT></I>
\r
692 <I><FONT COLOR="Navy">// n with curly braces '{}'. </FONT></I>
\r
693 <I><FONT COLOR="Navy">// Example: 'a$12bc' -> 'a<Match[12]>bc' </FONT></I>
\r
694 <I><FONT COLOR="Navy">// 'a${1}2bc' -> 'a<Match[1]>2bc'. </FONT></I>
\r
696 <B>procedure</B> Split (AInputStr : RegExprString; APieces : TStrings);
\r
697 <I><FONT COLOR="Navy">// Split AInputStr into APieces by r.e. occurencies </FONT></I>
\r
699 <B>function</B> Replace (AInputStr : RegExprString; <B>const</B> AReplaceStr : RegExprString) : RegExprString;
\r
700 <I><FONT COLOR="Navy">// Returns AInputStr with r.e. occurencies replaced by AReplaceStr </FONT></I>
\r
702 <B>property</B> SubExprMatchCount : integer <B>read</B> GetSubExprMatchCount;
\r
703 <I><FONT COLOR="Navy">// Number of subexpressions has been found in last Exec* call. </FONT></I>
\r
704 <I><FONT COLOR="Navy">// If there are no subexpr. but whole expr was found (Exec* returned True), </FONT></I>
\r
705 <I><FONT COLOR="Navy">// then SubExprMatchCount=0, if no subexpressions nor whole </FONT></I>
\r
706 <I><FONT COLOR="Navy">// r.e. found (Exec* returned false) then SubExprMatchCount=-1. </FONT></I>
\r
707 <I><FONT COLOR="Navy">// Note, that some subexpr. may be not found and for such </FONT></I>
\r
708 <I><FONT COLOR="Navy">// subexpr. MathPos=MatchLen=-1 and Match=''. </FONT></I>
\r
709 <I><FONT COLOR="Navy">// For example: Expression := '(1)?2(3)?'; </FONT></I>
\r
710 <I><FONT COLOR="Navy">// Exec ('123'): SubExprMatchCount=2, Match[0]='123', [1]='1', [2]='3' </FONT></I>
\r
711 <I><FONT COLOR="Navy">// Exec ('12'): SubExprMatchCount=1, Match[0]='23', [1]='1' </FONT></I>
\r
712 <I><FONT COLOR="Navy">// Exec ('23'): SubExprMatchCount=2, Match[0]='23', [1]='', [2]='3' </FONT></I>
\r
713 <I><FONT COLOR="Navy">// Exec ('2'): SubExprMatchCount=0, Match[0]='2' </FONT></I>
\r
714 <I><FONT COLOR="Navy">// Exec ('7') - return False: SubExprMatchCount=-1 </FONT></I>
\r
716 <B>property</B> MatchPos [Idx : integer] : integer <B>read</B> GetMatchPos;
\r
717 <I><FONT COLOR="Navy">// pos of entrance subexpr. #Idx into tested in last Exec* </FONT></I>
\r
718 <I><FONT COLOR="Navy">// string. First subexpr. have Idx=1, last - MatchCount, </FONT></I>
\r
719 <I><FONT COLOR="Navy">// whole r.e. have Idx=0. </FONT></I>
\r
720 <I><FONT COLOR="Navy">// Returns -1 if in r.e. no such subexpr. or this subexpr. </FONT></I>
\r
721 <I><FONT COLOR="Navy">// not found in input string. </FONT></I>
\r
723 <B>property</B> MatchLen [Idx : integer] : integer <B>read</B> GetMatchLen;
\r
724 <I><FONT COLOR="Navy">// len of entrance subexpr. #Idx r.e. into tested in last Exec* </FONT></I>
\r
725 <I><FONT COLOR="Navy">// string. First subexpr. have Idx=1, last - MatchCount, </FONT></I>
\r
726 <I><FONT COLOR="Navy">// whole r.e. have Idx=0. </FONT></I>
\r
727 <I><FONT COLOR="Navy">// Returns -1 if in r.e. no such subexpr. or this subexpr. </FONT></I>
\r
728 <I><FONT COLOR="Navy">// not found in input string. </FONT></I>
\r
729 <I><FONT COLOR="Navy">// Remember - MatchLen may be 0 (if r.e. match empty string) ! </FONT></I>
\r
731 <B>property</B> Match [Idx : integer] : RegExprString <B>read</B> GetMatch;
\r
732 <I><FONT COLOR="Navy">// == copy (InputString, MatchPos [Idx], MatchLen [Idx]) </FONT></I>
\r
733 <I><FONT COLOR="Navy">// Returns '' if in r.e. no such subexpr. or this subexpr. </FONT></I>
\r
734 <I><FONT COLOR="Navy">// not found in input string. </FONT></I>
\r
736 <B>function</B> LastError : integer;
\r
737 <I><FONT COLOR="Navy">// Returns ID of last error, 0 if no errors (unusable if </FONT></I>
\r
738 <I><FONT COLOR="Navy">// Error method raises exception) and clear internal status </FONT></I>
\r
739 <I><FONT COLOR="Navy">// into 0 (no errors). </FONT></I>
\r
741 <B>function</B> ErrorMsg (AErrorID : integer) : RegExprString; <B>virtual</B>;
\r
742 <I><FONT COLOR="Navy">// Returns Error message for error with ID = AErrorID. </FONT></I>
\r
744 <B>property</B> CompilerErrorPos : integer <B>read</B> GetCompilerErrorPos;
\r
745 <I><FONT COLOR="Navy">// Returns pos in r.e. there compiler stopped. </FONT></I>
\r
746 <I><FONT COLOR="Navy">// Usefull for error diagnostics </FONT></I>
\r
748 <B>property</B> SpaceChars : RegExprString <B>read</B> fSpaceChars <B>write</B> fSpaceChars; <I><FONT COLOR="Navy">//###0.927 </FONT></I>
\r
749 <I><FONT COLOR="Navy">// Contains chars, treated as /s (initially filled with RegExprSpaceChars </FONT></I>
\r
750 <I><FONT COLOR="Navy">// global constant) </FONT></I>
\r
752 <B>property</B> WordChars : RegExprString <B>read</B> fWordChars <B>write</B> fWordChars; <I><FONT COLOR="Navy">//###0.929 </FONT></I>
\r
753 <I><FONT COLOR="Navy">// Contains chars, treated as /w (initially filled with RegExprWordChars </FONT></I>
\r
754 <I><FONT COLOR="Navy">// global constant) </FONT></I>
\r
756 <B>class</B> <B>function</B> InvertCaseFunction (<B>const</B> Ch : REChar) : REChar;
\r
757 <I><FONT COLOR="Navy">// Converts Ch into upper case if it in lower case or in lower </FONT></I>
\r
758 <I><FONT COLOR="Navy">// if it in upper (uses current system local setings) </FONT></I>
\r
760 <B>property</B> InvertCase : TRegExprInvertCaseFunction <B>read</B> fInvertCase <B>write</B> fInvertCase; <I><FONT COLOR="Navy">//##0.935 </FONT></I>
\r
761 <I><FONT COLOR="Navy">// Set this property if you want to override case-insensitive functionality. </FONT></I>
\r
762 <I><FONT COLOR="Navy">// Create set it to RegExprInvertCaseFunction (InvertCaseFunction by default) </FONT></I>
\r
764 <I><FONT COLOR="Navy">{$IFDEF DebugRegExpr}</FONT></I>
\r
765 <B>function</B> Dump : RegExprString;
\r
766 <I><FONT COLOR="Navy">// dump a compiled regexp in vaguely comprehensible form </FONT></I>
\r
767 <I><FONT COLOR="Navy">{$ENDIF}</FONT></I>
\r
770 <I><FONT COLOR="Navy">{** Excepción de expresiones regulares}</FONT></I>
\r
771 ERegExpr = <B>class</B> (Exception)
\r
773 ErrorCode : integer;
\r
774 CompilerErrorPos : integer;
\r
778 RegExprInvertCaseFunction : TRegExprInvertCaseFunction = TRegExpr.InvertCaseFunction;
\r
779 <I><FONT COLOR="Navy">// defaul for InvertCase property </FONT></I>
\r
781 <I><FONT COLOR="Navy">{** true if string AInputString match regular expression ARegExpr
\r
782 ! will raise exeption if syntax errors in ARegExpr}</FONT></I>
\r
783 <B>function</B> ExecRegExpr (<B>const</B> ARegExpr, AInputStr : RegExprString) : boolean;
\r
785 <I><FONT COLOR="Navy">{** Split AInputStr into APieces by r.e. ARegExpr occurencies}</FONT></I>
\r
786 <B>procedure</B> SplitRegExpr (<B>const</B> ARegExpr, AInputStr : RegExprString; APieces : TStrings);
\r
788 <I><FONT COLOR="Navy">{** Returns AInputStr with r.e. occurencies replaced by AReplaceStr}</FONT></I>
\r
789 <B>function</B> ReplaceRegExpr (<B>const</B> ARegExpr, AInputStr, AReplaceStr : RegExprString) : RegExprString;
\r
791 <I><FONT COLOR="Navy">{** Replace all metachars with its safe representation,
\r
792 for example 'abc$cd.(' converts into 'abc/$cd/./('
\r
793 This function usefull for r.e. autogeneration from
\r
794 user input}</FONT></I>
\r
795 <B>function</B> QuoteRegExprMetaChars (<B>const</B> AStr : RegExprString) : RegExprString;
\r
797 <B>implementation</B>
\r
800 Windows; <I><FONT COLOR="Navy">// CharUpper/Lower </FONT></I>
\r
803 MaskModI = 1; <I><FONT COLOR="Navy">// modifier /i bit in fModifiers </FONT></I>
\r
804 MaskModR = 2; <I><FONT COLOR="Navy">// -"- /r </FONT></I>
\r
805 MaskModS = 4; <I><FONT COLOR="Navy">// -"- /s </FONT></I>
\r
807 <I><FONT COLOR="Navy">{=============================================================}</FONT></I>
\r
808 <I><FONT COLOR="Navy">{=================== WideString functions ====================}</FONT></I>
\r
809 <I><FONT COLOR="Navy">{=============================================================}</FONT></I>
\r
811 <I><FONT COLOR="Navy">{$IFDEF UniCode}</FONT></I>
\r
813 <B>function</B> StrPCopy (Dest: PRegExprChar; <B>const</B> Source: RegExprString): PRegExprChar;
\r
817 Len := length (Source); <I><FONT COLOR="Navy">//###0.932 </FONT></I>
\r
818 <B>for</B> i := 1 <B>to</B> Len <B>do</B>
\r
819 Dest [i - 1] := Source [i];
\r
822 <B>end</B>; <I><FONT COLOR="Navy">{ of function StrPCopy
\r
823 --------------------------------------------------------------}</FONT></I>
\r
825 <B>function</B> StrLCopy (Dest, Source: PRegExprChar; MaxLen: Cardinal): PRegExprChar;
\r
826 <B>var</B> i: Integer;
\r
828 <B>for</B> i := 0 <B>to</B> MaxLen - 1 <B>do</B>
\r
829 Dest [i] := Source [i];
\r
831 <B>end</B>; <I><FONT COLOR="Navy">{ of function StrLCopy
\r
832 --------------------------------------------------------------}</FONT></I>
\r
834 <B>function</B> StrLen (Str: PRegExprChar): Cardinal;
\r
837 <B>while</B> Str [result] <> #0
\r
838 <B>do</B> Inc (Result);
\r
839 <B>end</B>; <I><FONT COLOR="Navy">{ of function StrLen
\r
840 --------------------------------------------------------------}</FONT></I>
\r
842 <B>function</B> StrPos (Str1, Str2: PRegExprChar): PRegExprChar;
\r
843 <B>var</B> n: Integer;
\r
845 Result := <B>nil</B>;
\r
846 n := Pos (RegExprString (Str2), RegExprString (Str1));
\r
849 Result := Str1 + n - 1;
\r
850 <B>end</B>; <I><FONT COLOR="Navy">{ of function StrPos
\r
851 --------------------------------------------------------------}</FONT></I>
\r
853 <B>function</B> StrLComp (Str1, Str2: PRegExprChar; MaxLen: Cardinal): Integer;
\r
854 <B>var</B> S1, S2: RegExprString;
\r
858 <B>if</B> Copy (S1, 1, MaxLen) > Copy (S2, 1, MaxLen)
\r
859 <B>then</B> Result := 1
\r
861 <B>if</B> Copy (S1, 1, MaxLen) < Copy (S2, 1, MaxLen)
\r
862 <B>then</B> Result := -1
\r
863 <B>else</B> Result := 0;
\r
864 <B>end</B>; <I><FONT COLOR="Navy">{ function StrLComp
\r
865 --------------------------------------------------------------}</FONT></I>
\r
867 <B>function</B> StrScan (Str: PRegExprChar; Chr: WideChar): PRegExprChar;
\r
869 Result := <B>nil</B>;
\r
870 <B>while</B> (Str^ <> #0) <B>and</B> (Str^ <> Chr)
\r
871 <B>do</B> Inc (Str);
\r
872 <B>if</B> (Str^ <> #0)
\r
873 <B>then</B> Result := Str;
\r
874 <B>end</B>; <I><FONT COLOR="Navy">{ of function StrScan
\r
875 --------------------------------------------------------------}</FONT></I>
\r
877 <I><FONT COLOR="Navy">{$ENDIF}</FONT></I>
\r
879 <I><FONT COLOR="Navy">{=============================================================}</FONT></I>
\r
880 <I><FONT COLOR="Navy">{===================== Global functions ======================}</FONT></I>
\r
881 <I><FONT COLOR="Navy">{=============================================================}</FONT></I>
\r
883 <B>function</B> ExecRegExpr (<B>const</B> ARegExpr, AInputStr : RegExprString) : boolean;
\r
884 <B>var</B> r : TRegExpr;
\r
886 r := TRegExpr.Create;
\r
888 r.Expression := ARegExpr;
\r
889 Result := r.Exec (AInputStr);
\r
890 <B>finally</B> r.Free;
\r
892 <B>end</B>; <I><FONT COLOR="Navy">{ of function ExecRegExpr
\r
893 --------------------------------------------------------------}</FONT></I>
\r
895 <B>procedure</B> SplitRegExpr (<B>const</B> ARegExpr, AInputStr : RegExprString; APieces : TStrings);
\r
896 <B>var</B> r : TRegExpr;
\r
899 r := TRegExpr.Create;
\r
901 r.Expression := ARegExpr;
\r
902 r.Split (AInputStr, APieces);
\r
903 <B>finally</B> r.Free;
\r
905 <B>end</B>; <I><FONT COLOR="Navy">{ of procedure SplitRegExpr
\r
906 --------------------------------------------------------------}</FONT></I>
\r
908 <B>function</B> ReplaceRegExpr (<B>const</B> ARegExpr, AInputStr, AReplaceStr : RegExprString) : RegExprString;
\r
909 <B>var</B> r : TRegExpr;
\r
911 r := TRegExpr.Create;
\r
913 r.Expression := ARegExpr;
\r
914 Result := r.Replace (AInputStr, AReplaceStr);
\r
915 <B>finally</B> r.Free;
\r
917 <B>end</B>; <I><FONT COLOR="Navy">{ of function ReplaceRegExpr
\r
918 --------------------------------------------------------------}</FONT></I>
\r
920 <B>function</B> QuoteRegExprMetaChars (<B>const</B> AStr : RegExprString) : RegExprString;
\r
922 RegExprMetaSet : RegExprString = '^$.[()|?+*/{'
\r
923 + ']}'; <I><FONT COLOR="Navy">// - this last are additional to META. </FONT></I>
\r
924 <I><FONT COLOR="Navy">// Very similar to META array, but slighly changed. </FONT></I>
\r
925 <I><FONT COLOR="Navy">// !Any changes in META array must be synchronized with this set. </FONT></I>
\r
927 i, i0, Len : integer;
\r
930 Len := length (AStr);
\r
933 <B>while</B> i <= Len <B>do</B> <B>begin</B>
\r
934 <B>if</B> Pos (AStr [i], RegExprMetaSet) > 0 <B>then</B> <B>begin</B>
\r
935 Result := Result + System.Copy (AStr, i0, i - i0)
\r
941 Result := Result + System.Copy (AStr, i0, MaxInt); <I><FONT COLOR="Navy">// Tail </FONT></I>
\r
942 <B>end</B>; <I><FONT COLOR="Navy">{ of function QuoteRegExprMetaChars
\r
943 --------------------------------------------------------------}</FONT></I>
\r
948 MAGIC = TREOp (216);<I><FONT COLOR="Navy">// programm signature </FONT></I>
\r
950 <I><FONT COLOR="Navy">// name opcode opnd? meaning </FONT></I>
\r
951 EEND = TREOp (0); <I><FONT COLOR="Navy">// - End of program </FONT></I>
\r
952 BOL = TREOp (1); <I><FONT COLOR="Navy">// - Match "" at beginning of line </FONT></I>
\r
953 EOL = TREOp (2); <I><FONT COLOR="Navy">// - Match "" at end of line </FONT></I>
\r
954 ANY = TREOp (3); <I><FONT COLOR="Navy">// - Match any one character </FONT></I>
\r
955 ANYOF = TREOp (4); <I><FONT COLOR="Navy">// Str Match any character in string Str </FONT></I>
\r
956 ANYBUT = TREOp (5); <I><FONT COLOR="Navy">// Str Match any char. not in string Str </FONT></I>
\r
957 BRANCH = TREOp (6); <I><FONT COLOR="Navy">// Node Match this alternative, or the next </FONT></I>
\r
958 BACK = TREOp (7); <I><FONT COLOR="Navy">// - Jump backward (Next < 0) </FONT></I>
\r
959 EXACTLY = TREOp (8); <I><FONT COLOR="Navy">// Str Match string Str </FONT></I>
\r
960 NOTHING = TREOp (9); <I><FONT COLOR="Navy">// - Match empty string </FONT></I>
\r
961 STAR = TREOp (10); <I><FONT COLOR="Navy">// Node Match this (simple) thing 0 or more times </FONT></I>
\r
962 PLUS = TREOp (11); <I><FONT COLOR="Navy">// Node Match this (simple) thing 1 or more times </FONT></I>
\r
963 ANYDIGIT = TREOp (12); <I><FONT COLOR="Navy">// - Match any digit (equiv [0-9]) </FONT></I>
\r
964 NOTDIGIT = TREOp (13); <I><FONT COLOR="Navy">// - Match not digit (equiv [0-9]) </FONT></I>
\r
965 ANYLETTER = TREOp (14); <I><FONT COLOR="Navy">// - Match any letter from property WordChars </FONT></I>
\r
966 NOTLETTER = TREOp (15); <I><FONT COLOR="Navy">// - Match not letter from property WordChars </FONT></I>
\r
967 ANYSPACE = TREOp (16); <I><FONT COLOR="Navy">// - Match any space char (see property SpaceChars) </FONT></I>
\r
968 NOTSPACE = TREOp (17); <I><FONT COLOR="Navy">// - Match not space char (see property SpaceChars) </FONT></I>
\r
969 BRACES = TREOp (18); <I><FONT COLOR="Navy">// Node,Min,Max Match this (simple) thing from Min to Max times. </FONT></I>
\r
970 <I><FONT COLOR="Navy">// Min and Max are TREBracesArg </FONT></I>
\r
971 COMMENT = TREOp (19); <I><FONT COLOR="Navy">// - Comment ;) </FONT></I>
\r
972 EXACTLYCI = TREOp (20); <I><FONT COLOR="Navy">// Str Match string Str case insensitive </FONT></I>
\r
973 ANYOFCI = TREOp (21); <I><FONT COLOR="Navy">// Str Match any character in string Str, case insensitive </FONT></I>
\r
974 ANYBUTCI = TREOp (22); <I><FONT COLOR="Navy">// Str Match any char. not in string Str, case insensitive </FONT></I>
\r
975 LOOPENTRY = TREOp (23); <I><FONT COLOR="Navy">// Node Start of loop (Node - LOOP for this loop) </FONT></I>
\r
976 LOOP = TREOp (24); <I><FONT COLOR="Navy">// Node,Min,Max,LoopEntryJmp - back jump for LOOPENTRY. </FONT></I>
\r
977 <I><FONT COLOR="Navy">// Min and Max are TREBracesArg </FONT></I>
\r
978 <I><FONT COLOR="Navy">// Node - next node in sequence, </FONT></I>
\r
979 <I><FONT COLOR="Navy">// LoopEntryJmp - associated LOOPENTRY node addr </FONT></I>
\r
980 ANYOFTINYSET= TREOp (25); <I><FONT COLOR="Navy">// Chrs Match any one char from Chrs (exactly TinySetLen chars) </FONT></I>
\r
981 ANYBUTTINYSET=TREOp (26); <I><FONT COLOR="Navy">// Chrs Match any one char not in Chrs (exactly TinySetLen chars) </FONT></I>
\r
982 ANYOFFULLSET= TREOp (27); <I><FONT COLOR="Navy">// Set Match any one char from set of char </FONT></I>
\r
983 <I><FONT COLOR="Navy">// - very fast (one CPU instruction !) but takes 32 bytes of p-code </FONT></I>
\r
984 BSUBEXP = TREOp (28); <I><FONT COLOR="Navy">// Idx Match previously matched subexpression #Idx (stored as REChar) //###0.936 </FONT></I>
\r
985 BSUBEXPCI = TREOp (29); <I><FONT COLOR="Navy">// Idx -"- in case-insensitive mode </FONT></I>
\r
988 <I><FONT COLOR="Navy">// !!! Change OPEN value if you add new opcodes !!! </FONT></I>
\r
990 OPEN = TREOp (30); <I><FONT COLOR="Navy">// - Mark this point in input as start of /n </FONT></I>
\r
991 <I><FONT COLOR="Navy">// OPEN + 1 is /1, etc. </FONT></I>
\r
992 CLOSE = TREOp (ord (OPEN) + NSUBEXP);
\r
993 <I><FONT COLOR="Navy">// - Analogous to OPEN. </FONT></I>
\r
995 <I><FONT COLOR="Navy">// !!! Don't add new OpCodes after CLOSE !!! </FONT></I>
\r
997 <I><FONT COLOR="Navy">// We work with p-code thru pointers, compatible with PRegExprChar. </FONT></I>
\r
998 <I><FONT COLOR="Navy">// Note: all code components (TRENextOff, TREOp, TREBracesArg, etc) </FONT></I>
\r
999 <I><FONT COLOR="Navy">// must have lengths that can be divided by SizeOf (REChar) ! </FONT></I>
\r
1000 <I><FONT COLOR="Navy">// A node is TREOp of opcode followed Next "pointer" of TRENextOff type. </FONT></I>
\r
1001 <I><FONT COLOR="Navy">// The Next is a offset from the opcode of the node containing it. </FONT></I>
\r
1002 <I><FONT COLOR="Navy">// An operand, if any, simply follows the node. (Note that much of </FONT></I>
\r
1003 <I><FONT COLOR="Navy">// the code generation knows about this implicit relationship!) </FONT></I>
\r
1004 <I><FONT COLOR="Navy">// Using TRENextOff=integer speed up p-code processing. </FONT></I>
\r
1006 <I><FONT COLOR="Navy">// Opcodes description: </FONT></I>
\r
1007 <I><FONT COLOR="Navy">// </FONT></I>
\r
1008 <I><FONT COLOR="Navy">// BRANCH The set of branches constituting a single choice are hooked </FONT></I>
\r
1009 <I><FONT COLOR="Navy">// together with their "next" pointers, since precedence prevents </FONT></I>
\r
1010 <I><FONT COLOR="Navy">// anything being concatenated to any individual branch. The </FONT></I>
\r
1011 <I><FONT COLOR="Navy">// "next" pointer of the last BRANCH in a choice points to the </FONT></I>
\r
1012 <I><FONT COLOR="Navy">// thing following the whole choice. This is also where the </FONT></I>
\r
1013 <I><FONT COLOR="Navy">// final "next" pointer of each individual branch points; each </FONT></I>
\r
1014 <I><FONT COLOR="Navy">// branch starts with the operand node of a BRANCH node. </FONT></I>
\r
1015 <I><FONT COLOR="Navy">// BACK Normal "next" pointers all implicitly point forward; BACK </FONT></I>
\r
1016 <I><FONT COLOR="Navy">// exists to make loop structures possible. </FONT></I>
\r
1017 <I><FONT COLOR="Navy">// STAR,PLUS,BRACES '?', and complex '*' and '+', are implemented as </FONT></I>
\r
1018 <I><FONT COLOR="Navy">// circular BRANCH structures using BACK. Complex '{min,max}' </FONT></I>
\r
1019 <I><FONT COLOR="Navy">// - as pair LOOPENTRY-LOOP (see below). Simple cases (one </FONT></I>
\r
1020 <I><FONT COLOR="Navy">// character per match) are implemented with STAR, PLUS and </FONT></I>
\r
1021 <I><FONT COLOR="Navy">// BRACES for speed and to minimize recursive plunges. </FONT></I>
\r
1022 <I><FONT COLOR="Navy">// LOOPENTRY,LOOP {min,max} are implemented as special pair </FONT></I>
\r
1023 <I><FONT COLOR="Navy">// LOOPENTRY-LOOP. Each LOOPENTRY initialize loopstack for </FONT></I>
\r
1024 <I><FONT COLOR="Navy">// current level. </FONT></I>
\r
1025 <I><FONT COLOR="Navy">// OPEN,CLOSE are numbered at compile time. </FONT></I>
\r
1028 <I><FONT COLOR="Navy">{=============================================================}</FONT></I>
\r
1029 <I><FONT COLOR="Navy">{================== Error handling section ===================}</FONT></I>
\r
1030 <I><FONT COLOR="Navy">{=============================================================}</FONT></I>
\r
1034 reeCompNullArgument = 100;
\r
1035 reeCompRegexpTooBig = 101;
\r
1036 reeCompParseRegTooManyBrackets = 102;
\r
1037 reeCompParseRegUnmatchedBrackets = 103;
\r
1038 reeCompParseRegUnmatchedBrackets2 = 104;
\r
1039 reeCompParseRegJunkOnEnd = 105;
\r
1040 reePlusStarOperandCouldBeEmpty = 106;
\r
1041 reeNestedSQP = 107;
\r
1042 reeBadHexDigit = 108;
\r
1043 reeInvalidRange = 109;
\r
1044 reeParseAtomTrailingBackSlash = 110;
\r
1045 reeNoHexCodeAfterBSlashX = 111;
\r
1046 reeHexCodeAfterBSlashXTooBig = 112;
\r
1047 reeUnmatchedSqBrackets = 113;
\r
1048 reeInternalUrp = 114;
\r
1049 reeQPSBFollowsNothing = 115;
\r
1050 reeTrailingBackSlash = 116;
\r
1051 reeRarseAtomInternalDisaster = 119;
\r
1052 reeBRACESArgTooBig = 122;
\r
1053 reeBracesMinParamGreaterMax = 124;
\r
1054 reeUnclosedComment = 125;
\r
1055 reeComplexBracesNotImplemented = 126;
\r
1056 reeUrecognizedModifier = 127;
\r
1057 reeRegRepeatCalledInappropriately = 1000;
\r
1058 reeMatchPrimMemoryCorruption = 1001;
\r
1059 reeMatchPrimCorruptedPointers = 1002;
\r
1060 reeNoExpression = 1003;
\r
1061 reeCorruptedProgram = 1004;
\r
1062 reeNoInpitStringSpecified = 1005;
\r
1063 reeOffsetMustBeGreaterThen0 = 1006;
\r
1064 reeExecNextWithoutExec = 1007;
\r
1065 reeGetInputStringWithoutInputString = 1008;
\r
1066 reeDumpCorruptedOpcode = 1011;
\r
1067 reeExecAfterCompErr = 1012;
\r
1068 reeModifierUnsupported = 1013;
\r
1069 reeLoopStackExceeded = 1014;
\r
1070 reeLoopWithoutEntry = 1015;
\r
1072 <B>function</B> TRegExpr.ErrorMsg (AErrorID : integer) : RegExprString;
\r
1074 <B>case</B> AErrorID <B>of</B>
\r
1075 reeOk: Result := 'No errors';
\r
1076 reeCompNullArgument: Result := 'TRegExpr(comp): Null Argument';
\r
1077 reeCompRegexpTooBig: Result := 'TRegExpr(comp): Regexp Too Big';
\r
1078 reeCompParseRegTooManyBrackets: Result := 'TRegExpr(comp): ParseReg Too Many ()';
\r
1079 reeCompParseRegUnmatchedBrackets: Result := 'TRegExpr(comp): ParseReg Unmatched ()';
\r
1080 reeCompParseRegUnmatchedBrackets2: Result := 'TRegExpr(comp): ParseReg Unmatched ()';
\r
1081 reeCompParseRegJunkOnEnd: Result := 'TRegExpr(comp): ParseReg Junk On End';
\r
1082 reePlusStarOperandCouldBeEmpty: Result := 'TRegExpr(comp): *+ Operand Could Be Empty';
\r
1083 reeNestedSQP: Result := 'TRegExpr(comp): Nested *?+';
\r
1084 reeBadHexDigit: Result := 'TRegExpr(comp): Bad Hex Digit';
\r
1085 reeInvalidRange: Result := 'TRegExpr(comp): Invalid [] Range';
\r
1086 reeParseAtomTrailingBackSlash: Result := 'TRegExpr(comp): Parse Atom Trailing /';
\r
1087 reeNoHexCodeAfterBSlashX: Result := 'TRegExpr(comp): No Hex Code After /x';
\r
1088 reeHexCodeAfterBSlashXTooBig: Result := 'TRegExpr(comp): Hex Code After /x Is Too Big';
\r
1089 reeUnmatchedSqBrackets: Result := 'TRegExpr(comp): Unmatched []';
\r
1090 reeInternalUrp: Result := 'TRegExpr(comp): Internal Urp';
\r
1091 reeQPSBFollowsNothing: Result := 'TRegExpr(comp): ?+*{ Follows Nothing';
\r
1092 reeTrailingBackSlash: Result := 'TRegExpr(comp): Trailing /';
\r
1093 reeRarseAtomInternalDisaster: Result := 'TRegExpr(comp): RarseAtom Internal Disaster';
\r
1094 reeBRACESArgTooBig: Result := 'TRegExpr(comp): BRACES Argument Too Big';
\r
1095 reeBracesMinParamGreaterMax: Result := 'TRegExpr(comp): BRACE Min Param Greater then Max';
\r
1096 reeUnclosedComment: Result := 'TRegExpr(comp): Unclosed (?#Comment)';
\r
1097 reeComplexBracesNotImplemented: Result := 'TRegExpr(comp): If you want take part in beta-testing BRACES ''{min,max}'' for complex cases - remove ''.'' from {.$DEFINE ComplexBraces}';
\r
1098 reeUrecognizedModifier: Result := 'TRegExpr(comp): Urecognized Modifier';
\r
1100 reeRegRepeatCalledInappropriately: Result := 'TRegExpr(exec): RegRepeat Called Inappropriately';
\r
1101 reeMatchPrimMemoryCorruption: Result := 'TRegExpr(exec): MatchPrim Memory Corruption';
\r
1102 reeMatchPrimCorruptedPointers: Result := 'TRegExpr(exec): MatchPrim Corrupted Pointers';
\r
1103 reeNoExpression: Result := 'TRegExpr(exec): Not Assigned Expression Property';
\r
1104 reeCorruptedProgram: Result := 'TRegExpr(exec): Corrupted Program';
\r
1105 reeNoInpitStringSpecified: Result := 'TRegExpr(exec): No Inpit String Specified';
\r
1106 reeOffsetMustBeGreaterThen0: Result := 'TRegExpr(exec): Offset Must Be Greater Then 0';
\r
1107 reeExecNextWithoutExec: Result := 'TRegExpr(exec): ExecNext Without Exec[Pos]';
\r
1108 reeGetInputStringWithoutInputString: Result := 'TRegExpr(exec): GetInputString Without InputString';
\r
1109 reeDumpCorruptedOpcode: Result := 'TRegExpr(dump): Corrupted Opcode';
\r
1110 reeExecAfterCompErr: Result := 'TRegExpr(exec): Exec After Compilation Error';
\r
1111 reeLoopStackExceeded: Result := 'TRegExpr(exec): Loop Stack Exceeded';
\r
1112 reeLoopWithoutEntry: Result := 'TRegExpr(exec): Loop Without LoopEntry !';
\r
1113 <B>else</B> Result := 'Unknown error';
\r
1115 <B>end</B>; <I><FONT COLOR="Navy">{ of procedure TRegExpr.Error
\r
1116 --------------------------------------------------------------}</FONT></I>
\r
1118 <B>function</B> TRegExpr.LastError : integer;
\r
1120 Result := fLastError;
\r
1121 fLastError := reeOk;
\r
1122 <B>end</B>; <I><FONT COLOR="Navy">{ of function TRegExpr.LastError
\r
1123 --------------------------------------------------------------}</FONT></I>
\r
1126 <I><FONT COLOR="Navy">{=============================================================}</FONT></I>
\r
1127 <I><FONT COLOR="Navy">{===================== Common section ========================}</FONT></I>
\r
1128 <I><FONT COLOR="Navy">{=============================================================}</FONT></I>
\r
1130 <B>constructor</B> TRegExpr.Create;
\r
1132 <B>inherited</B>;
\r
1133 programm := <B>nil</B>;
\r
1134 fExpression := <B>nil</B>;
\r
1135 fInputString := <B>nil</B>;
\r
1137 regexpbeg := <B>nil</B>;
\r
1138 fExprIsCompiled := false;
\r
1140 ModifierI := RegExprModifierI;
\r
1141 ModifierR := RegExprModifierR;
\r
1142 ModifierS := RegExprModifierS;
\r
1144 SpaceChars := RegExprSpaceChars; <I><FONT COLOR="Navy">//###0.927 </FONT></I>
\r
1145 WordChars := RegExprWordChars; <I><FONT COLOR="Navy">//###0.929 </FONT></I>
\r
1146 fInvertCase := RegExprInvertCaseFunction; <I><FONT COLOR="Navy">//###0.927 </FONT></I>
\r
1147 <B>end</B>; <I><FONT COLOR="Navy">{ of constructor TRegExpr.Create
\r
1148 --------------------------------------------------------------}</FONT></I>
\r
1150 <B>destructor</B> TRegExpr.Destroy;
\r
1152 <B>if</B> programm <> <B>nil</B>
\r
1153 <B>then</B> FreeMem (programm);
\r
1154 <B>if</B> fExpression <> <B>nil</B>
\r
1155 <B>then</B> FreeMem (fExpression);
\r
1156 <B>if</B> fInputString <> <B>nil</B>
\r
1157 <B>then</B> FreeMem (fInputString);
\r
1158 <B>end</B>; <I><FONT COLOR="Navy">{ of destructor TRegExpr.Destroy
\r
1159 --------------------------------------------------------------}</FONT></I>
\r
1161 <B>class</B> <B>function</B> TRegExpr.InvertCaseFunction (<B>const</B> Ch : REChar) : REChar;
\r
1163 <I><FONT COLOR="Navy">{$IFDEF UniCode}</FONT></I>
\r
1164 <B>if</B> Ch >= #128
\r
1165 <B>then</B> Result := Ch
\r
1167 <I><FONT COLOR="Navy">{$ENDIF}</FONT></I>
\r
1169 Result := REChar (CharUpper (pointer (Ch)));
\r
1170 <B>if</B> Result = Ch
\r
1171 <B>then</B> Result := REChar (CharLower (pointer (Ch)));
\r
1173 <B>end</B>; <I><FONT COLOR="Navy">{ of function TRegExpr.InvertCaseFunction
\r
1174 --------------------------------------------------------------}</FONT></I>
\r
1176 <B>function</B> TRegExpr.GetExpression : RegExprString;
\r
1178 <B>if</B> fExpression <> <B>nil</B>
\r
1179 <B>then</B> Result := fExpression
\r
1180 <B>else</B> Result := '';
\r
1181 <B>end</B>; <I><FONT COLOR="Navy">{ of function TRegExpr.GetExpression
\r
1182 --------------------------------------------------------------}</FONT></I>
\r
1184 <B>procedure</B> TRegExpr.SetExpression (<B>const</B> s : RegExprString);
\r
1186 <B>if</B> (s <> fExpression) <B>or</B> <B>not</B> fExprIsCompiled <B>then</B> <B>begin</B>
\r
1187 fExprIsCompiled := false;
\r
1188 <B>if</B> fExpression <> <B>nil</B> <B>then</B> <B>begin</B>
\r
1189 FreeMem (fExpression);
\r
1190 fExpression := <B>nil</B>;
\r
1192 <B>if</B> s <> '' <B>then</B> <B>begin</B>
\r
1193 GetMem (fExpression, (length (s) + 1) * SizeOf (REChar));
\r
1194 CompileRegExpr (StrPCopy (fExpression, s));
\r
1197 <B>end</B>; <I><FONT COLOR="Navy">{ of procedure TRegExpr.SetExpression
\r
1198 --------------------------------------------------------------}</FONT></I>
\r
1200 <B>function</B> TRegExpr.GetSubExprMatchCount : integer;
\r
1202 <B>if</B> Assigned (fInputString) <B>then</B> <B>begin</B>
\r
1203 Result := NSUBEXP - 1;
\r
1204 <B>while</B> (Result > 0) <B>and</B> ((startp [Result] = <B>nil</B>)
\r
1205 <B>or</B> (endp [Result] = <B>nil</B>))
\r
1206 <B>do</B> dec (Result);
\r
1208 <B>else</B> Result := -1;
\r
1209 <B>end</B>; <I><FONT COLOR="Navy">{ of function TRegExpr.GetSubExprMatchCount
\r
1210 --------------------------------------------------------------}</FONT></I>
\r
1212 <B>function</B> TRegExpr.GetMatchPos (Idx : integer) : integer;
\r
1214 <B>if</B> (Idx >= 0) <B>and</B> (Idx < NSUBEXP) <B>and</B> Assigned (fInputString)
\r
1215 <B>and</B> Assigned (startp [Idx]) <B>and</B> Assigned (endp [Idx]) <B>then</B> <B>begin</B>
\r
1216 Result := (startp [Idx] - fInputString) + 1;
\r
1218 <B>else</B> Result := -1;
\r
1219 <B>end</B>; <I><FONT COLOR="Navy">{ of function TRegExpr.GetMatchPos
\r
1220 --------------------------------------------------------------}</FONT></I>
\r
1222 <B>function</B> TRegExpr.GetMatchLen (Idx : integer) : integer;
\r
1224 <B>if</B> (Idx >= 0) <B>and</B> (Idx < NSUBEXP) <B>and</B> Assigned (fInputString)
\r
1225 <B>and</B> Assigned (startp [Idx]) <B>and</B> Assigned (endp [Idx]) <B>then</B> <B>begin</B>
\r
1226 Result := endp [Idx] - startp [Idx];
\r
1228 <B>else</B> Result := -1;
\r
1229 <B>end</B>; <I><FONT COLOR="Navy">{ of function TRegExpr.GetMatchLen
\r
1230 --------------------------------------------------------------}</FONT></I>
\r
1232 <B>function</B> TRegExpr.GetMatch (Idx : integer) : RegExprString;
\r
1234 <B>if</B> (Idx >= 0) <B>and</B> (Idx < NSUBEXP) <B>and</B> Assigned (fInputString)
\r
1235 <B>and</B> Assigned (startp [Idx]) <B>and</B> Assigned (endp [Idx])
\r
1236 <I><FONT COLOR="Navy">//then Result := copy (fInputString, MatchPos [Idx], MatchLen [Idx]) //###0.929 </FONT></I>
\r
1237 <B>then</B> SetString (Result, startp [idx], endp [idx] - startp [idx])
\r
1238 <B>else</B> Result := '';
\r
1239 <B>end</B>; <I><FONT COLOR="Navy">{ of function TRegExpr.GetMatch
\r
1240 --------------------------------------------------------------}</FONT></I>
\r
1242 <B>function</B> TRegExpr.IsProgrammOk : boolean;
\r
1245 <B>if</B> programm = <B>nil</B> <I><FONT COLOR="Navy">// No compiled r.e. present </FONT></I>
\r
1246 <B>then</B> Error (reeNoExpression)
\r
1247 <B>else</B> <B>if</B> programm [0] <> MAGIC <I><FONT COLOR="Navy">// Program corrupted. </FONT></I>
\r
1248 <B>then</B> Error (reeCorruptedProgram)
\r
1249 <B>else</B> <B>if</B> <B>not</B> fExprIsCompiled <I><FONT COLOR="Navy">// Previous compilation was finished with error </FONT></I>
\r
1250 <B>then</B> Error (reeExecAfterCompErr)
\r
1251 <B>else</B> Result := true;
\r
1252 <B>end</B>; <I><FONT COLOR="Navy">{ of function TRegExpr.IsProgrammOk
\r
1253 --------------------------------------------------------------}</FONT></I>
\r
1255 <B>procedure</B> TRegExpr.CheckCompModifiers;
\r
1257 <B>if</B> (programm <> <B>nil</B>) <B>and</B> (fExpression <> <B>nil</B>)
\r
1258 <B>and</B> (fModifiers <> fProgModifiers)
\r
1259 <B>then</B> CompileRegExpr (fExpression);
\r
1260 <B>end</B>; <I><FONT COLOR="Navy">{ of TRegExpr.CheckCompModifiers
\r
1261 --------------------------------------------------------------}</FONT></I>
\r
1263 <B>function</B> TRegExpr.GetModifierStr : RegExprString;
\r
1267 <B>if</B> ModifierI
\r
1268 <B>then</B> Result := 'i' + Result
\r
1269 <B>else</B> Result := Result + 'i';
\r
1270 <B>if</B> ModifierR
\r
1271 <B>then</B> Result := 'r' + Result
\r
1272 <B>else</B> Result := Result + 'r';
\r
1273 <B>if</B> ModifierS
\r
1274 <B>then</B> Result := 's' + Result
\r
1275 <B>else</B> Result := Result + 's';
\r
1277 <B>if</B> Result [length (Result)] = '-' <I><FONT COLOR="Navy">// remove '-' if all modifiers are 'On' </FONT></I>
\r
1278 <B>then</B> System.Delete (Result, length (Result), 1);
\r
1279 <B>end</B>; <I><FONT COLOR="Navy">{ of function TRegExpr.GetModifierStr
\r
1280 --------------------------------------------------------------}</FONT></I>
\r
1282 <B>function</B> TRegExpr.SetModifiersInt (<B>const</B> AModifiers : RegExprString; <B>var</B> AModifiersInt : integer) : boolean;
\r
1290 Mask := 0; <I><FONT COLOR="Navy">// strange compiler varning </FONT></I>
\r
1291 <B>for</B> i := 1 <B>to</B> length (AModifiers) <B>do</B>
\r
1292 <B>if</B> AModifiers [i] = '-'
\r
1293 <B>then</B> IsOn := false
\r
1294 <B>else</B> <B>begin</B>
\r
1295 <B>if</B> Pos (AModifiers [i], 'iI') > 0
\r
1296 <B>then</B> Mask := MaskModI
\r
1297 <B>else</B> <B>if</B> Pos (AModifiers [i], 'rR') > 0
\r
1298 <B>then</B> Mask := MaskModR
\r
1299 <B>else</B> <B>if</B> Pos (AModifiers [i], 'sS') > 0
\r
1300 <B>then</B> Mask := MaskModS
\r
1301 <B>else</B> <B>begin</B>
\r
1306 <B>then</B> AModifiersInt := AModifiersInt <B>or</B> Mask
\r
1307 <B>else</B> AModifiersInt := AModifiersInt <B>and</B> <B>not</B> Mask;
\r
1309 <B>end</B>; <I><FONT COLOR="Navy">{ of function TRegExpr.SetModifiersInt
\r
1310 --------------------------------------------------------------}</FONT></I>
\r
1312 <B>procedure</B> TRegExpr.SetModifierStr (<B>const</B> AModifiers : RegExprString);
\r
1314 <B>if</B> <B>not</B> SetModifiersInt (AModifiers, fModifiers)
\r
1315 <B>then</B> Error (reeModifierUnsupported);
\r
1316 CheckCompModifiers;
\r
1317 <B>end</B>; <I><FONT COLOR="Navy">{ of procedure TRegExpr.SetModifierStr
\r
1318 --------------------------------------------------------------}</FONT></I>
\r
1320 <B>function</B> TRegExpr.GetModifier (AIndex : integer) : boolean;
\r
1325 <B>case</B> AIndex <B>of</B>
\r
1326 1: Mask := MaskModI;
\r
1327 2: Mask := MaskModR;
\r
1328 3: Mask := MaskModS;
\r
1329 <B>else</B> <B>begin</B>
\r
1330 Error (reeModifierUnsupported);
\r
1334 Result := (fModifiers <B>and</B> Mask) = Mask;
\r
1335 <B>end</B>; <I><FONT COLOR="Navy">{ of function TRegExpr.GetModifier
\r
1336 --------------------------------------------------------------}</FONT></I>
\r
1338 <B>procedure</B> TRegExpr.SetModifier (AIndex : integer; ASet : boolean);
\r
1342 <B>case</B> AIndex <B>of</B>
\r
1343 1: Mask := MaskModI;
\r
1344 2: Mask := MaskModR;
\r
1345 3: Mask := MaskModS;
\r
1346 <B>else</B> <B>begin</B>
\r
1347 Error (reeModifierUnsupported);
\r
1352 <B>then</B> fModifiers := fModifiers <B>or</B> Mask
\r
1353 <B>else</B> fModifiers := fModifiers <B>and</B> <B>not</B> Mask;
\r
1354 CheckCompModifiers;
\r
1355 <B>end</B>; <I><FONT COLOR="Navy">{ of procedure TRegExpr.SetModifier
\r
1356 --------------------------------------------------------------}</FONT></I>
\r
1359 <I><FONT COLOR="Navy">{=============================================================}</FONT></I>
\r
1360 <I><FONT COLOR="Navy">{==================== Compiler section =======================}</FONT></I>
\r
1361 <I><FONT COLOR="Navy">{=============================================================}</FONT></I>
\r
1363 <B>procedure</B> TRegExpr.Tail (p : PRegExprChar; val : PRegExprChar);
\r
1364 <I><FONT COLOR="Navy">// set the next-pointer at the end of a node chain </FONT></I>
\r
1366 scan : PRegExprChar;
\r
1367 temp : PRegExprChar;
\r
1369 <B>if</B> p = @regdummy
\r
1370 <B>then</B> EXIT;
\r
1371 <I><FONT COLOR="Navy">// Find last node. </FONT></I>
\r
1374 temp := regnext (scan);
\r
1375 <B>if</B> temp = <B>nil</B>
\r
1376 <B>then</B> BREAK;
\r
1378 <B>UNTIL</B> false;
\r
1379 <I><FONT COLOR="Navy">// Set Next 'pointer' </FONT></I>
\r
1380 PRENextOff (scan + REOpSz)^ := val - scan; <I><FONT COLOR="Navy">//###0.933 </FONT></I>
\r
1381 <B>end</B>; <I><FONT COLOR="Navy">{ of procedure TRegExpr.Tail
\r
1382 --------------------------------------------------------------}</FONT></I>
\r
1384 <B>procedure</B> TRegExpr.OpTail (p : PRegExprChar; val : PRegExprChar);
\r
1385 <I><FONT COLOR="Navy">// regtail on operand of first argument; nop if operandless </FONT></I>
\r
1387 <I><FONT COLOR="Navy">// "Operandless" and "op != BRANCH" are synonymous in practice. </FONT></I>
\r
1388 <B>if</B> (p = <B>nil</B>) <B>or</B> (p = @regdummy) <B>or</B> (PREOp (p)^ <> BRANCH)
\r
1389 <B>then</B> EXIT;
\r
1390 Tail (p + REOpSz + RENextOffSz, val); <I><FONT COLOR="Navy">//###0.933 </FONT></I>
\r
1391 <B>end</B>; <I><FONT COLOR="Navy">{ of procedure TRegExpr.OpTail
\r
1392 --------------------------------------------------------------}</FONT></I>
\r
1394 <B>function</B> TRegExpr.EmitNode (op : TREOp) : PRegExprChar; <I><FONT COLOR="Navy">//###0.933 </FONT></I>
\r
1395 <I><FONT COLOR="Navy">// emit a node, return location </FONT></I>
\r
1397 Result := regcode;
\r
1398 <B>if</B> Result <> @regdummy <B>then</B> <B>begin</B>
\r
1399 PREOp (regcode)^ := op;
\r
1400 inc (regcode, REOpSz);
\r
1401 PRENextOff (regcode)^ := 0; <I><FONT COLOR="Navy">// Next "pointer" := nil </FONT></I>
\r
1402 inc (regcode, RENextOffSz);
\r
1404 <B>else</B> inc (regsize, REOpSz + RENextOffSz); <I><FONT COLOR="Navy">// compute code size without code generation </FONT></I>
\r
1405 <B>end</B>; <I><FONT COLOR="Navy">{ of function TRegExpr.EmitNode
\r
1406 --------------------------------------------------------------}</FONT></I>
\r
1408 <B>procedure</B> TRegExpr.EmitC (b : REChar);
\r
1409 <I><FONT COLOR="Navy">// emit a byte to code </FONT></I>
\r
1411 <B>if</B> regcode <> @regdummy <B>then</B> <B>begin</B>
\r
1415 <B>else</B> inc (regsize); <I><FONT COLOR="Navy">// Type of p-code pointer always is ^REChar </FONT></I>
\r
1416 <B>end</B>; <I><FONT COLOR="Navy">{ of procedure TRegExpr.EmitC
\r
1417 --------------------------------------------------------------}</FONT></I>
\r
1419 <B>procedure</B> TRegExpr.InsertOperator (op : TREOp; opnd : PRegExprChar; sz : integer);
\r
1420 <I><FONT COLOR="Navy">// insert an operator in front of already-emitted operand </FONT></I>
\r
1421 <I><FONT COLOR="Navy">// Means relocating the operand. </FONT></I>
\r
1423 src, dst, place : PRegExprChar;
\r
1426 <B>if</B> regcode = @regdummy <B>then</B> <B>begin</B>
\r
1427 inc (regsize, sz);
\r
1431 inc (regcode, sz);
\r
1433 <B>while</B> src > opnd <B>do</B> <B>begin</B>
\r
1438 place := opnd; <I><FONT COLOR="Navy">// Op node, where operand used to be. </FONT></I>
\r
1439 PREOp (place)^ := op;
\r
1440 inc (place, REOpSz);
\r
1441 <B>for</B> i := 1 + REOpSz <B>to</B> sz <B>do</B> <B>begin</B>
\r
1445 <B>end</B>; <I><FONT COLOR="Navy">{ of procedure TRegExpr.InsertOperator
\r
1446 --------------------------------------------------------------}</FONT></I>
\r
1448 <B>function</B> strcspn (s1 : PRegExprChar; s2 : PRegExprChar) : integer;
\r
1449 <I><FONT COLOR="Navy">// find length of initial segment of s1 consisting </FONT></I>
\r
1450 <I><FONT COLOR="Navy">// entirely of characters not from s2 </FONT></I>
\r
1451 <B>var</B> scan1, scan2 : PRegExprChar;
\r
1455 <B>while</B> scan1^ <> #0 <B>do</B> <B>begin</B>
\r
1457 <B>while</B> scan2^ <> #0 <B>do</B>
\r
1458 <B>if</B> scan1^ = scan2^
\r
1460 <B>else</B> inc (scan2);
\r
1464 <B>end</B>; <I><FONT COLOR="Navy">{ of function strcspn
\r
1465 --------------------------------------------------------------}</FONT></I>
\r
1468 <I><FONT COLOR="Navy">// Flags to be passed up and down. </FONT></I>
\r
1469 HASWIDTH = 01; <I><FONT COLOR="Navy">// Known never to match nil string. </FONT></I>
\r
1470 SIMPLE = 02; <I><FONT COLOR="Navy">// Simple enough to be STAR/PLUS/BRACES operand. </FONT></I>
\r
1471 SPSTART = 04; <I><FONT COLOR="Navy">// Starts with * or +. </FONT></I>
\r
1472 WORST = 0; <I><FONT COLOR="Navy">// Worst case. </FONT></I>
\r
1473 META : <B>array</B> [0 .. 12] <B>of</B> REChar = (
\r
1474 '^', '$', '.', '[', '(', ')', '|', '?', '+', '*', '/', '{', #0);
\r
1475 <I><FONT COLOR="Navy">// Any modification must be synchronized with QuoteRegExprMetaChars !!! </FONT></I>
\r
1477 <I><FONT COLOR="Navy">{$IFDEF UniCode}</FONT></I>
\r
1478 RusRangeLo : <B>array</B> [0 .. 33] <B>of</B> REChar =
\r
1479 (#$430,#$431,#$432,#$433,#$434,#$435,#$451,#$436,#$437,
\r
1480 #$438,#$439,#$43A,#$43B,#$43C,#$43D,#$43E,#$43F,
\r
1481 #$440,#$441,#$442,#$443,#$444,#$445,#$446,#$447,
\r
1482 #$448,#$449,#$44A,#$44B,#$44C,#$44D,#$44E,#$44F,#0);
\r
1483 RusRangeHi : <B>array</B> [0 .. 33] <B>of</B> REChar =
\r
1484 (#$410,#$411,#$412,#$413,#$414,#$415,#$401,#$416,#$417,
\r
1485 #$418,#$419,#$41A,#$41B,#$41C,#$41D,#$41E,#$41F,
\r
1486 #$420,#$421,#$422,#$423,#$424,#$425,#$426,#$427,
\r
1487 #$428,#$429,#$42A,#$42B,#$42C,#$42D,#$42E,#$42F,#0);
\r
1488 RusRangeLoLow = #$430<I><FONT COLOR="Navy">{'à'}</FONT></I>;
\r
1489 RusRangeLoHigh = #$44F<I><FONT COLOR="Navy">{'ÿ'}</FONT></I>;
\r
1490 RusRangeHiLow = #$410<I><FONT COLOR="Navy">{'À'}</FONT></I>;
\r
1491 RusRangeHiHigh = #$42F<I><FONT COLOR="Navy">{'ß'}</FONT></I>;
\r
1492 <I><FONT COLOR="Navy">{$ELSE}</FONT></I>
\r
1493 RusRangeLo = 'àáâãäå¸æçèéêëìíîïðñòóôõö÷øùúûüýþÿ';
\r
1494 RusRangeHi = 'ÀÁÂÃÄŨÆÇÈÉÊËÌÍÎÏÐÑÒÓÔÕÖ×ØÙÚÛÜÝÞß';
\r
1495 RusRangeLoLow = 'à';
\r
1496 RusRangeLoHigh = 'ÿ';
\r
1497 RusRangeHiLow = 'À';
\r
1498 RusRangeHiHigh = 'ß';
\r
1499 <I><FONT COLOR="Navy">{$ENDIF}</FONT></I>
\r
1501 <B>function</B> TRegExpr.CompileRegExpr (exp : PRegExprChar) : boolean;
\r
1502 <I><FONT COLOR="Navy">// compile a regular expression into internal code </FONT></I>
\r
1503 <I><FONT COLOR="Navy">// We can't allocate space until we know how big the compiled form will be, </FONT></I>
\r
1504 <I><FONT COLOR="Navy">// but we can't compile it (and thus know how big it is) until we've got a </FONT></I>
\r
1505 <I><FONT COLOR="Navy">// place to put the code. So we cheat: we compile it twice, once with code </FONT></I>
\r
1506 <I><FONT COLOR="Navy">// generation turned off and size counting turned on, and once "for real". </FONT></I>
\r
1507 <I><FONT COLOR="Navy">// This also means that we don't allocate space until we are sure that the </FONT></I>
\r
1508 <I><FONT COLOR="Navy">// thing really will compile successfully, and we never have to move the </FONT></I>
\r
1509 <I><FONT COLOR="Navy">// code and thus invalidate pointers into it. (Note that it has to be in </FONT></I>
\r
1510 <I><FONT COLOR="Navy">// one piece because free() must be able to free it all.) </FONT></I>
\r
1511 <I><FONT COLOR="Navy">// Beware that the optimization-preparation code in here knows about some </FONT></I>
\r
1512 <I><FONT COLOR="Navy">// of the structure of the compiled regexp. </FONT></I>
\r
1514 scan, longest : PRegExprChar;
\r
1518 Result := false; <I><FONT COLOR="Navy">// life too dark </FONT></I>
\r
1519 fExprIsCompiled := false;
\r
1521 regparse := <B>nil</B>; <I><FONT COLOR="Navy">// for correct error handling </FONT></I>
\r
1522 regexpbeg := exp;
\r
1523 <B>try</B> <I><FONT COLOR="Navy">// must clear regexpbeg after compilation </FONT></I>
\r
1525 <B>if</B> programm <> <B>nil</B> <B>then</B> <B>begin</B>
\r
1526 FreeMem (programm);
\r
1527 programm := <B>nil</B>;
\r
1530 <B>if</B> exp = <B>nil</B> <B>then</B> <B>begin</B>
\r
1531 Error (reeCompNullArgument);
\r
1535 fProgModifiers := fModifiers;
\r
1536 <I><FONT COLOR="Navy">// well, may it's paranoia. I'll check it later... !!!!!!!! </FONT></I>
\r
1538 <I><FONT COLOR="Navy">// First pass: determine size, legality. </FONT></I>
\r
1539 fCompModifiers := fModifiers;
\r
1543 regcode := @regdummy;
\r
1545 <B>if</B> ParseReg (0, flags) = <B>nil</B>
\r
1546 <B>then</B> EXIT;
\r
1548 <I><FONT COLOR="Navy">// Small enough for 2-bytes programm pointers ? </FONT></I>
\r
1549 <I><FONT COLOR="Navy">// ###0.933 no real p-code length limits now :))) </FONT></I>
\r
1550 <I><FONT COLOR="Navy">// if regsize >= 64 * 1024 then begin </FONT></I>
\r
1551 <I><FONT COLOR="Navy">// Error (reeCompRegexpTooBig); </FONT></I>
\r
1552 <I><FONT COLOR="Navy">// EXIT; </FONT></I>
\r
1553 <I><FONT COLOR="Navy">// end; </FONT></I>
\r
1555 <I><FONT COLOR="Navy">// Allocate space. </FONT></I>
\r
1556 GetMem (programm, regsize * SizeOf (REChar));
\r
1558 <I><FONT COLOR="Navy">// Second pass: emit code. </FONT></I>
\r
1559 fCompModifiers := fModifiers;
\r
1562 regcode := programm;
\r
1564 <B>if</B> ParseReg (0, flags) = <B>nil</B>
\r
1565 <B>then</B> EXIT;
\r
1567 <I><FONT COLOR="Navy">// Dig out information for optimizations. </FONT></I>
\r
1568 <I><FONT COLOR="Navy">{$IFDEF UseFirstCharSet}</FONT></I> <I><FONT COLOR="Navy">//###0.929 </FONT></I>
\r
1569 FirstCharSet := [];
\r
1570 FillFirstCharSet (programm + REOpSz);
\r
1571 <I><FONT COLOR="Navy">{$ENDIF}</FONT></I>
\r
1572 regstart := #0; <I><FONT COLOR="Navy">// Worst-case defaults. </FONT></I>
\r
1574 regmust := <B>nil</B>;
\r
1576 scan := programm + REOpSz; <I><FONT COLOR="Navy">// First BRANCH. </FONT></I>
\r
1577 <B>if</B> PREOp (regnext (scan))^ = EEND <B>then</B> <B>begin</B> <I><FONT COLOR="Navy">// Only one top-level choice. </FONT></I>
\r
1578 scan := scan + REOpSz + RENextOffSz;
\r
1580 <I><FONT COLOR="Navy">// Starting-point info. </FONT></I>
\r
1581 <B>if</B> PREOp (scan)^ = EXACTLY
\r
1582 <B>then</B> regstart := (scan + REOpSz + RENextOffSz)^
\r
1583 <B>else</B> <B>if</B> PREOp (scan)^ = BOL
\r
1584 <B>then</B> inc (reganch);
\r
1586 <I><FONT COLOR="Navy">// If there's something expensive in the r.e., find the longest </FONT></I>
\r
1587 <I><FONT COLOR="Navy">// literal string that must appear and make it the regmust. Resolve </FONT></I>
\r
1588 <I><FONT COLOR="Navy">// ties in favor of later strings, since the regstart check works </FONT></I>
\r
1589 <I><FONT COLOR="Navy">// with the beginning of the r.e. and avoiding duplication </FONT></I>
\r
1590 <I><FONT COLOR="Navy">// strengthens checking. Not a strong reason, but sufficient in the </FONT></I>
\r
1591 <I><FONT COLOR="Navy">// absence of others. </FONT></I>
\r
1592 <B>if</B> (flags <B>and</B> SPSTART) <> 0 <B>then</B> <B>begin</B>
\r
1593 longest := <B>nil</B>;
\r
1595 <B>while</B> scan <> <B>nil</B> <B>do</B> <B>begin</B>
\r
1596 <B>if</B> (PREOp (scan)^ = EXACTLY)
\r
1597 <B>and</B> (strlen (scan + REOpSz + RENextOffSz) >= len) <B>then</B> <B>begin</B>
\r
1598 longest := scan + REOpSz + RENextOffSz;
\r
1599 len := strlen (longest);
\r
1601 scan := regnext (scan);
\r
1603 regmust := longest;
\r
1608 <B>finally</B> regexpbeg := <B>nil</B>;
\r
1611 fExprIsCompiled := true;
\r
1613 <B>end</B>; <I><FONT COLOR="Navy">{ of function TRegExpr.CompileRegExpr
\r
1614 --------------------------------------------------------------}</FONT></I>
\r
1616 <B>function</B> TRegExpr.ParseReg (paren : integer; <B>var</B> flagp : integer) : PRegExprChar;
\r
1617 <I><FONT COLOR="Navy">// regular expression, i.e. main body or parenthesized thing </FONT></I>
\r
1618 <I><FONT COLOR="Navy">// Caller must absorb opening parenthesis. </FONT></I>
\r
1619 <I><FONT COLOR="Navy">// Combining parenthesis handling with the base level of regular expression </FONT></I>
\r
1620 <I><FONT COLOR="Navy">// is a trifle forced, but the need to tie the tails of the branches to what </FONT></I>
\r
1621 <I><FONT COLOR="Navy">// follows makes it hard to avoid. </FONT></I>
\r
1623 ret, br, ender : PRegExprChar;
\r
1626 SavedModifiers : integer;
\r
1628 Result := <B>nil</B>;
\r
1629 flagp := HASWIDTH; <I><FONT COLOR="Navy">// Tentatively. </FONT></I>
\r
1630 parno := 0; <I><FONT COLOR="Navy">// eliminate compiler stupid warning </FONT></I>
\r
1631 SavedModifiers := fCompModifiers;
\r
1633 <I><FONT COLOR="Navy">// Make an OPEN node, if parenthesized. </FONT></I>
\r
1634 <B>if</B> paren <> 0 <B>then</B> <B>begin</B>
\r
1635 <B>if</B> regnpar >= NSUBEXP <B>then</B> <B>begin</B>
\r
1636 Error (reeCompParseRegTooManyBrackets);
\r
1639 parno := regnpar;
\r
1641 ret := EmitNode (TREOp (ord (OPEN) + parno));
\r
1643 <B>else</B> ret := <B>nil</B>;
\r
1645 <I><FONT COLOR="Navy">// Pick up the branches, linking them together. </FONT></I>
\r
1646 br := ParseBranch (flags);
\r
1647 <B>if</B> br = <B>nil</B> <B>then</B> <B>begin</B>
\r
1648 Result := <B>nil</B>;
\r
1651 <B>if</B> ret <> <B>nil</B>
\r
1652 <B>then</B> Tail (ret, br) <I><FONT COLOR="Navy">// OPEN -> first. </FONT></I>
\r
1653 <B>else</B> ret := br;
\r
1654 <B>if</B> (flags <B>and</B> HASWIDTH) = 0
\r
1655 <B>then</B> flagp := flagp <B>and</B> <B>not</B> HASWIDTH;
\r
1656 flagp := flagp <B>or</B> flags <B>and</B> SPSTART;
\r
1657 <B>while</B> (regparse^ = '|') <B>do</B> <B>begin</B>
\r
1659 br := ParseBranch (flags);
\r
1660 <B>if</B> br = <B>nil</B> <B>then</B> <B>begin</B>
\r
1661 Result := <B>nil</B>;
\r
1664 Tail (ret, br); <I><FONT COLOR="Navy">// BRANCH -> BRANCH. </FONT></I>
\r
1665 <B>if</B> (flags <B>and</B> HASWIDTH) = 0
\r
1666 <B>then</B> flagp := flagp <B>and</B> <B>not</B> HASWIDTH;
\r
1667 flagp := flagp <B>or</B> flags <B>and</B> SPSTART;
\r
1670 <I><FONT COLOR="Navy">// Make a closing node, and hook it on the end. </FONT></I>
\r
1671 <B>if</B> paren <> 0
\r
1672 <B>then</B> ender := EmitNode (TREOp (ord (CLOSE) + parno))
\r
1673 <B>else</B> ender := EmitNode (EEND);
\r
1674 Tail (ret, ender);
\r
1676 <I><FONT COLOR="Navy">// Hook the tails of the branches to the closing node. </FONT></I>
\r
1678 <B>while</B> br <> <B>nil</B> <B>do</B> <B>begin</B>
\r
1679 OpTail (br, ender);
\r
1680 br := regnext (br);
\r
1683 <I><FONT COLOR="Navy">// Check for proper termination. </FONT></I>
\r
1684 <B>if</B> paren <> 0 <B>then</B>
\r
1685 <B>if</B> regparse^ <> ')' <B>then</B> <B>begin</B>
\r
1686 Error (reeCompParseRegUnmatchedBrackets);
\r
1689 <B>else</B> inc (regparse); <I><FONT COLOR="Navy">// skip trailing ')' </FONT></I>
\r
1690 <B>if</B> (paren = 0) <B>and</B> (regparse^ <> #0) <B>then</B> <B>begin</B>
\r
1691 <B>if</B> regparse^ = ')'
\r
1692 <B>then</B> Error (reeCompParseRegUnmatchedBrackets2)
\r
1693 <B>else</B> Error (reeCompParseRegJunkOnEnd);
\r
1696 fCompModifiers := SavedModifiers; <I><FONT COLOR="Navy">// restore modifiers of parent </FONT></I>
\r
1698 <B>end</B>; <I><FONT COLOR="Navy">{ of function TRegExpr.ParseReg
\r
1699 --------------------------------------------------------------}</FONT></I>
\r
1701 <B>function</B> TRegExpr.ParseBranch (<B>var</B> flagp : integer) : PRegExprChar;
\r
1702 <I><FONT COLOR="Navy">// one alternative of an | operator </FONT></I>
\r
1703 <I><FONT COLOR="Navy">// Implements the concatenation operator. </FONT></I>
\r
1705 ret, chain, latest : PRegExprChar;
\r
1708 flagp := WORST; <I><FONT COLOR="Navy">// Tentatively. </FONT></I>
\r
1710 ret := EmitNode (BRANCH);
\r
1711 chain := <B>nil</B>;
\r
1712 <B>while</B> (regparse^ <> #0) <B>and</B> (regparse^ <> '|')
\r
1713 <B>and</B> (regparse^ <> ')') <B>do</B> <B>begin</B>
\r
1714 latest := ParsePiece (flags);
\r
1715 <B>if</B> latest = <B>nil</B> <B>then</B> <B>begin</B>
\r
1716 Result := <B>nil</B>;
\r
1719 flagp := flagp <B>or</B> flags <B>and</B> HASWIDTH;
\r
1720 <B>if</B> chain = <B>nil</B> <I><FONT COLOR="Navy">// First piece. </FONT></I>
\r
1721 <B>then</B> flagp := flagp <B>or</B> flags <B>and</B> SPSTART
\r
1722 <B>else</B> Tail (chain, latest);
\r
1725 <B>if</B> chain = <B>nil</B> <I><FONT COLOR="Navy">// Loop ran zero times. </FONT></I>
\r
1726 <B>then</B> EmitNode (NOTHING);
\r
1728 <B>end</B>; <I><FONT COLOR="Navy">{ of function TRegExpr.ParseBranch
\r
1729 --------------------------------------------------------------}</FONT></I>
\r
1731 <B>function</B> TRegExpr.ParsePiece (<B>var</B> flagp : integer) : PRegExprChar;
\r
1732 <I><FONT COLOR="Navy">// something followed by possible [*+?{] </FONT></I>
\r
1733 <I><FONT COLOR="Navy">// Note that the branching code sequences used for ? and the general cases </FONT></I>
\r
1734 <I><FONT COLOR="Navy">// of * and + and { are somewhat optimized: they use the same NOTHING node as </FONT></I>
\r
1735 <I><FONT COLOR="Navy">// both the endmarker for their branch list and the body of the last branch. </FONT></I>
\r
1736 <I><FONT COLOR="Navy">// It might seem that this node could be dispensed with entirely, but the </FONT></I>
\r
1737 <I><FONT COLOR="Navy">// endmarker role is not redundant. </FONT></I>
\r
1738 <B>function</B> parsenum (AStart, AEnd : PRegExprChar) : TREBracesArg;
\r
1741 <B>if</B> AEnd - AStart + 1 > 8 <B>then</B> <B>begin</B> <I><FONT COLOR="Navy">// prevent stupid scanning </FONT></I>
\r
1742 Error (reeBRACESArgTooBig);
\r
1745 <B>while</B> AStart <= AEnd <B>do</B> <B>begin</B>
\r
1746 Result := Result * 10 + (ord (AStart^) - ord ('0'));
\r
1749 <B>if</B> (Result > MaxBracesArg) <B>or</B> (Result < 0) <B>then</B> <B>begin</B>
\r
1750 Error (reeBRACESArgTooBig);
\r
1756 NextNode : PRegExprChar;
\r
1758 BracesMin, Bracesmax : TREBracesArg;
\r
1759 p, savedparse : PRegExprChar;
\r
1760 <I><FONT COLOR="Navy">{$IFDEF ComplexBraces}</FONT></I>
\r
1762 <I><FONT COLOR="Navy">{$ENDIF}</FONT></I>
\r
1764 Result := ParseAtom (flags);
\r
1765 <B>if</B> Result = <B>nil</B>
\r
1766 <B>then</B> EXIT;
\r
1769 <B>if</B> <B>not</B> ((op = '*') <B>or</B> (op = '+') <B>or</B> (op = '?') <B>or</B> (op = '{')) <B>then</B> <B>begin</B>
\r
1773 <B>if</B> ((flags <B>and</B> HASWIDTH) = 0) <B>and</B> (op <> '?') <B>then</B> <B>begin</B>
\r
1774 Error (reePlusStarOperandCouldBeEmpty);
\r
1778 <B>case</B> op <B>of</B>
\r
1779 '*': <B>begin</B>
\r
1780 flagp := WORST <B>or</B> SPSTART;
\r
1781 <B>if</B> (flags <B>and</B> SIMPLE) = 0 <B>then</B> <B>begin</B>
\r
1782 <I><FONT COLOR="Navy">// Emit x* as (x&|), where & means "self". </FONT></I>
\r
1783 InsertOperator (BRANCH, Result, REOpSz + RENextOffSz); <I><FONT COLOR="Navy">// Either x </FONT></I>
\r
1784 OpTail (Result, EmitNode (BACK)); <I><FONT COLOR="Navy">// and loop </FONT></I>
\r
1785 OpTail (Result, Result); <I><FONT COLOR="Navy">// back </FONT></I>
\r
1786 Tail (Result, EmitNode (BRANCH)); <I><FONT COLOR="Navy">// or </FONT></I>
\r
1787 Tail (Result, EmitNode (NOTHING)); <I><FONT COLOR="Navy">// nil. </FONT></I>
\r
1789 <B>else</B> InsertOperator (STAR, Result, REOpSz + RENextOffSz);
\r
1790 <B>end</B>; <I><FONT COLOR="Navy">{ of case '*'}</FONT></I>
\r
1791 '+': <B>begin</B>
\r
1792 flagp := WORST <B>or</B> SPSTART <B>or</B> HASWIDTH;
\r
1793 <B>if</B> (flags <B>and</B> SIMPLE) = 0 <B>then</B> <B>begin</B>
\r
1794 <I><FONT COLOR="Navy">// Emit x+ as x(&|), where & means "self". </FONT></I>
\r
1795 NextNode := EmitNode (BRANCH); <I><FONT COLOR="Navy">// Either </FONT></I>
\r
1796 Tail (Result, NextNode);
\r
1797 Tail (EmitNode (BACK), Result); <I><FONT COLOR="Navy">// loop back </FONT></I>
\r
1798 Tail (NextNode, EmitNode (BRANCH)); <I><FONT COLOR="Navy">// or </FONT></I>
\r
1799 Tail (Result, EmitNode (NOTHING)); <I><FONT COLOR="Navy">// nil. </FONT></I>
\r
1801 <B>else</B> InsertOperator (PLUS, Result, REOpSz + RENextOffSz);
\r
1802 <B>end</B>; <I><FONT COLOR="Navy">{ of case '+'}</FONT></I>
\r
1803 '?': <B>begin</B>
\r
1805 <I><FONT COLOR="Navy">// Emit x? as (x|) </FONT></I>
\r
1806 InsertOperator (BRANCH, Result, REOpSz + RENextOffSz); <I><FONT COLOR="Navy">// Either x </FONT></I>
\r
1807 Tail (Result, EmitNode (BRANCH)); <I><FONT COLOR="Navy">// or </FONT></I>
\r
1808 NextNode := EmitNode (NOTHING); <I><FONT COLOR="Navy">// nil. </FONT></I>
\r
1809 Tail (Result, NextNode);
\r
1810 OpTail (Result, NextNode);
\r
1811 <B>end</B>; <I><FONT COLOR="Navy">{ of case '?'}</FONT></I>
\r
1812 '{': <B>begin</B>
\r
1813 savedparse := regparse;
\r
1816 <B>while</B> Pos (regparse^, '0123456789') > 0 <I><FONT COLOR="Navy">// <min> MUST appear </FONT></I>
\r
1817 <B>do</B> inc (regparse);
\r
1818 <B>if</B> (regparse^ <> '}') <B>and</B> (regparse^ <> ',') <B>or</B> (p = regparse) <B>then</B> <B>begin</B>
\r
1819 regparse := savedparse;
\r
1823 BracesMin := parsenum (p, regparse - 1);
\r
1824 <B>if</B> regparse^ = ',' <B>then</B> <B>begin</B>
\r
1827 <B>while</B> Pos (regparse^, '0123456789') > 0
\r
1828 <B>do</B> inc (regparse);
\r
1829 <B>if</B> regparse^ <> '}' <B>then</B> <B>begin</B>
\r
1830 regparse := savedparse;
\r
1833 <B>if</B> p = regparse
\r
1834 <B>then</B> BracesMax := MaxBracesArg
\r
1835 <B>else</B> BracesMax := parsenum (p, regparse - 1);
\r
1837 <B>else</B> BracesMax := BracesMin; <I><FONT COLOR="Navy">// {n} == {n,n} </FONT></I>
\r
1838 <B>if</B> BracesMin > BracesMax <B>then</B> <B>begin</B>
\r
1839 Error (reeBracesMinParamGreaterMax);
\r
1842 <B>if</B> BracesMin > 0
\r
1843 <B>then</B> flagp := WORST;
\r
1844 <B>if</B> BracesMax > 0
\r
1845 <B>then</B> flagp := flagp <B>or</B> HASWIDTH <B>or</B> SPSTART;
\r
1846 <B>if</B> (flags <B>and</B> SIMPLE) <> 0 <B>then</B> <B>begin</B>
\r
1847 InsertOperator (BRACES, Result, REOpSz + RENextOffSz + REBracesArgSz * 2);
\r
1848 <B>if</B> regcode <> @regdummy <B>then</B> <B>begin</B>
\r
1849 PREBracesArg (Result + REOpSz + RENextOffSz)^ := BracesMin;
\r
1850 PREBracesArg (Result + REOpSz + RENextOffSz + REBracesArgSz)^ := BracesMax;
\r
1853 <B>else</B> <B>begin</B> <I><FONT COLOR="Navy">// Emit complex x{min,max} </FONT></I>
\r
1854 <I><FONT COLOR="Navy">{$IFNDEF ComplexBraces}</FONT></I>
\r
1855 Error (reeComplexBracesNotImplemented);
\r
1857 <I><FONT COLOR="Navy">{$ELSE}</FONT></I>
\r
1858 InsertOperator (LOOPENTRY, Result, REOpSz + RENextOffSz);
\r
1859 NextNode := EmitNode (LOOP);
\r
1860 <B>if</B> regcode <> @regdummy <B>then</B> <B>begin</B>
\r
1861 off := (Result + REOpSz + RENextOffSz)
\r
1862 - (regcode - REOpSz - RENextOffSz); <I><FONT COLOR="Navy">// back to Atom after LOOPENTRY </FONT></I>
\r
1863 PREBracesArg (regcode)^ := BracesMin;
\r
1864 inc (regcode, REBracesArgSz);
\r
1865 PREBracesArg (regcode)^ := BracesMax;
\r
1866 inc (regcode, REBracesArgSz);
\r
1867 PRENextOff (regcode)^ := off;
\r
1868 inc (regcode, RENextOffSz);
\r
1870 <B>else</B> inc (regsize, REBracesArgSz * 2 + RENextOffSz);
\r
1871 Tail (Result, NextNode); <I><FONT COLOR="Navy">// LOOPENTRY -> LOOP </FONT></I>
\r
1872 <B>if</B> regcode <> @regdummy <B>then</B>
\r
1873 Tail (Result + REOpSz + RENextOffSz, NextNode); <I><FONT COLOR="Navy">// Atom -> LOOP </FONT></I>
\r
1874 <I><FONT COLOR="Navy">{$ENDIF}</FONT></I>
\r
1876 <B>end</B>; <I><FONT COLOR="Navy">{ of case '{'}</FONT></I>
\r
1877 <I><FONT COLOR="Navy">// else // here we can't be </FONT></I>
\r
1878 <B>end</B>; <I><FONT COLOR="Navy">{ of case op}</FONT></I>
\r
1881 <B>if</B> (regparse^ = '*') <B>or</B> (regparse^ = '+') <B>or</B> (regparse^ = '?') <B>or</B> (regparse^ = '{') <B>then</B> <B>begin</B>
\r
1882 Error (reeNestedSQP);
\r
1885 <B>end</B>; <I><FONT COLOR="Navy">{ of function TRegExpr.ParsePiece
\r
1886 --------------------------------------------------------------}</FONT></I>
\r
1888 <B>function</B> TRegExpr.ParseAtom (<B>var</B> flagp : integer) : PRegExprChar;
\r
1889 <I><FONT COLOR="Navy">// the lowest level </FONT></I>
\r
1890 <I><FONT COLOR="Navy">// Optimization: gobbles an entire sequence of ordinary characters so that </FONT></I>
\r
1891 <I><FONT COLOR="Navy">// it can turn them into a single node, which is smaller to store and </FONT></I>
\r
1892 <I><FONT COLOR="Navy">// faster to run. Backslashed characters are exceptions, each becoming a </FONT></I>
\r
1893 <I><FONT COLOR="Navy">// separate node; the code is simpler that way and it's not worth fixing. </FONT></I>
\r
1895 ret : PRegExprChar;
\r
1897 RangeBeg, RangeEnd : REChar;
\r
1898 CanBeRange : boolean;
\r
1901 begmodfs : PRegExprChar;
\r
1903 <I><FONT COLOR="Navy">{$IFDEF UseSetOfChar}</FONT></I> <I><FONT COLOR="Navy">//###0.930 </FONT></I>
\r
1904 RangePCodeBeg : PRegExprChar;
\r
1905 RangePCodeIdx : integer;
\r
1906 RangeIsCI : boolean;
\r
1907 RangeSet : TSetOfREChar;
\r
1908 RangeLen : integer;
\r
1909 RangeChMin, RangeChMax : REChar;
\r
1910 <I><FONT COLOR="Navy">{$ENDIF}</FONT></I>
\r
1912 <B>procedure</B> EmitExactly (ch : REChar);
\r
1914 <B>if</B> (fCompModifiers <B>and</B> MaskModI) = MaskModI
\r
1915 <B>then</B> ret := EmitNode (EXACTLYCI)
\r
1916 <B>else</B> ret := EmitNode (EXACTLY);
\r
1919 flagp := flagp <B>or</B> HASWIDTH <B>or</B> SIMPLE;
\r
1922 <B>procedure</B> EmitStr (<B>const</B> s : RegExprString);
\r
1923 <B>var</B> i : integer;
\r
1925 <B>for</B> i := 1 <B>to</B> length (s)
\r
1926 <B>do</B> EmitC (s [i]);
\r
1929 <B>function</B> HexDig (ch : REChar) : integer;
\r
1932 <B>if</B> (ch >= 'a') <B>and</B> (ch <= 'f')
\r
1933 <B>then</B> ch := REChar (ord (ch) - (ord ('a') - ord ('A')));
\r
1934 <B>if</B> (ch < '0') <B>or</B> (ch > 'F') <B>or</B> ((ch > '9') <B>and</B> (ch < 'A')) <B>then</B> <B>begin</B>
\r
1935 Error (reeBadHexDigit);
\r
1938 Result := ord (ch) - ord ('0');
\r
1939 <B>if</B> ch >= 'A'
\r
1940 <B>then</B> Result := Result - (ord ('A') - ord ('9') - 1);
\r
1943 <B>function</B> EmitRange (AOpCode : REChar) : PRegExprChar;
\r
1945 <I><FONT COLOR="Navy">{$IFDEF UseSetOfChar}</FONT></I>
\r
1946 <B>case</B> AOpCode <B>of</B>
\r
1947 ANYBUTCI, ANYBUT:
\r
1948 Result := EmitNode (ANYBUTTINYSET);
\r
1949 <B>else</B> <I><FONT COLOR="Navy">// ANYOFCI, ANYOF </FONT></I>
\r
1950 Result := EmitNode (ANYOFTINYSET);
\r
1952 <B>case</B> AOpCode <B>of</B>
\r
1953 ANYBUTCI, ANYOFCI:
\r
1954 RangeIsCI := True;
\r
1955 <B>else</B> <I><FONT COLOR="Navy">// ANYBUT, ANYOF </FONT></I>
\r
1956 RangeIsCI := False;
\r
1958 RangePCodeBeg := regcode;
\r
1959 RangePCodeIdx := regsize;
\r
1962 RangeChMin := #255;
\r
1963 RangeChMax := #0;
\r
1964 <I><FONT COLOR="Navy">{$ELSE}</FONT></I>
\r
1965 Result := EmitNode (AOpCode);
\r
1966 <I><FONT COLOR="Navy">// ToDo: </FONT></I>
\r
1967 <I><FONT COLOR="Navy">// !!!!!!!!!!!!! Implement ANYOF[BUT]TINYSET generation for UniCode !!!!!!!!!! </FONT></I>
\r
1968 <I><FONT COLOR="Navy">{$ENDIF}</FONT></I>
\r
1971 <I><FONT COLOR="Navy">{$IFDEF UseSetOfChar}</FONT></I>
\r
1972 <B>procedure</B> EmitRangeCPrim (b : REChar); <I><FONT COLOR="Navy">//###0.930 </FONT></I>
\r
1974 <B>if</B> b <B>in</B> RangeSet
\r
1975 <B>then</B> EXIT;
\r
1977 <B>if</B> b < RangeChMin
\r
1978 <B>then</B> RangeChMin := b;
\r
1979 <B>if</B> b > RangeChMax
\r
1980 <B>then</B> RangeChMax := b;
\r
1981 Include (RangeSet, b);
\r
1983 <I><FONT COLOR="Navy">{$ENDIF}</FONT></I>
\r
1985 <B>procedure</B> EmitRangeC (b : REChar);
\r
1986 <I><FONT COLOR="Navy">{$IFDEF UseSetOfChar}</FONT></I>
\r
1989 <I><FONT COLOR="Navy">{$ENDIF}</FONT></I>
\r
1991 CanBeRange := false;
\r
1992 <I><FONT COLOR="Navy">{$IFDEF UseSetOfChar}</FONT></I>
\r
1993 <B>if</B> b <> #0 <B>then</B> <B>begin</B>
\r
1994 EmitRangeCPrim (b); <I><FONT COLOR="Navy">//###0.930 </FONT></I>
\r
1995 <B>if</B> RangeIsCI
\r
1996 <B>then</B> EmitRangeCPrim (InvertCase (b)); <I><FONT COLOR="Navy">//###0.930 </FONT></I>
\r
1998 <B>else</B> <B>begin</B>
\r
1999 Assert (RangeLen > 0, 'TRegExpr.ParseAtom(subroutine EmitRangeC): empty range'); <I><FONT COLOR="Navy">// impossible, but who knows.. </FONT></I>
\r
2000 Assert (RangeChMin <= RangeChMax, 'TRegExpr.ParseAtom(subroutine EmitRangeC): RangeChMin > RangeChMax'); <I><FONT COLOR="Navy">// impossible, but who knows.. </FONT></I>
\r
2001 <B>if</B> RangeLen <= TinySetLen <B>then</B> <B>begin</B> <I><FONT COLOR="Navy">// emit "tiny set" </FONT></I>
\r
2002 <B>if</B> regcode = @regdummy <B>then</B> <B>begin</B>
\r
2003 regsize := RangePCodeIdx + TinySetLen; <I><FONT COLOR="Navy">// RangeChMin/Max !!! </FONT></I>
\r
2006 regcode := RangePCodeBeg;
\r
2007 <B>for</B> Ch := RangeChMin <B>to</B> RangeChMax <B>do</B> <I><FONT COLOR="Navy">//###0.930 </FONT></I>
\r
2008 <B>if</B> Ch <B>in</B> RangeSet <B>then</B> <B>begin</B>
\r
2012 <I><FONT COLOR="Navy">// fill rest: </FONT></I>
\r
2013 <B>while</B> regcode < RangePCodeBeg + TinySetLen <B>do</B> <B>begin</B>
\r
2014 regcode^ := RangeChMax;
\r
2018 <B>else</B> <B>begin</B>
\r
2019 <B>if</B> regcode = @regdummy <B>then</B> <B>begin</B>
\r
2020 regsize := RangePCodeIdx + SizeOf (TSetOfREChar);
\r
2023 <B>if</B> (RangePCodeBeg - REOpSz - RENextOffSz)^ = ANYBUTTINYSET
\r
2024 <B>then</B> RangeSet := [#0 .. #255] - RangeSet;
\r
2025 PREOp (RangePCodeBeg - REOpSz - RENextOffSz)^ := ANYOFFULLSET;
\r
2026 regcode := RangePCodeBeg;
\r
2027 Move (RangeSet, regcode^, SizeOf (TSetOfREChar));
\r
2028 inc (regcode, SizeOf (TSetOfREChar));
\r
2031 <I><FONT COLOR="Navy">{$ELSE}</FONT></I>
\r
2033 <I><FONT COLOR="Navy">{$ENDIF}</FONT></I>
\r
2036 <B>procedure</B> EmitSimpleRangeC (b : REChar);
\r
2040 CanBeRange := true;
\r
2043 <B>procedure</B> EmitRangeStr (<B>const</B> s : RegExprString);
\r
2044 <B>var</B> i : integer;
\r
2046 <B>for</B> i := 1 <B>to</B> length (s)
\r
2047 <B>do</B> EmitRangeC (s [i]);
\r
2050 <B>function</B> UnQuoteChar (<B>var</B> APtr : PRegExprChar) : REChar; <I><FONT COLOR="Navy">//###0.934 </FONT></I>
\r
2052 <B>case</B> APtr^ <B>of</B>
\r
2053 't': Result := #$9; <I><FONT COLOR="Navy">// tab (HT/TAB) </FONT></I>
\r
2054 'n': Result := #$a; <I><FONT COLOR="Navy">// newline (NL) </FONT></I>
\r
2055 'r': Result := #$d; <I><FONT COLOR="Navy">// car.return (CR) </FONT></I>
\r
2056 'f': Result := #$c; <I><FONT COLOR="Navy">// form feed (FF) </FONT></I>
\r
2057 'a': Result := #$7; <I><FONT COLOR="Navy">// alarm (bell) (BEL) </FONT></I>
\r
2058 'e': Result := #$1b; <I><FONT COLOR="Navy">// escape (ESC) </FONT></I>
\r
2059 'x': <B>begin</B> <I><FONT COLOR="Navy">// hex char </FONT></I>
\r
2062 <B>if</B> APtr^ = #0 <B>then</B> <B>begin</B>
\r
2063 Error (reeNoHexCodeAfterBSlashX);
\r
2066 <B>if</B> APtr^ = '{' <B>then</B> <B>begin</B> <I><FONT COLOR="Navy">// /x{nnnn} //###0.936 </FONT></I>
\r
2069 <B>if</B> APtr^ = #0 <B>then</B> <B>begin</B>
\r
2070 Error (reeNoHexCodeAfterBSlashX);
\r
2073 <B>if</B> APtr^ <> '}' <B>then</B> <B>begin</B>
\r
2074 <B>if</B> (Ord (Result)
\r
2075 <B>ShR</B> (SizeOf (REChar) * 8 - 4)) <B>and</B> $F <> 0 <B>then</B> <B>begin</B>
\r
2076 Error (reeHexCodeAfterBSlashXTooBig);
\r
2079 Result := REChar ((Ord (Result) <B>ShL</B> 4) <B>or</B> HexDig (APtr^));
\r
2080 <I><FONT COLOR="Navy">// HexDig will cause Error if bad hex digit found </FONT></I>
\r
2082 <B>else</B> BREAK;
\r
2083 <B>UNTIL</B> False;
\r
2085 <B>else</B> <B>begin</B>
\r
2086 Result := REChar (HexDig (APtr^));
\r
2087 <I><FONT COLOR="Navy">// HexDig will cause Error if bad hex digit found </FONT></I>
\r
2089 <B>if</B> APtr^ = #0 <B>then</B> <B>begin</B>
\r
2090 Error (reeNoHexCodeAfterBSlashX);
\r
2093 Result := REChar ((Ord (Result) <B>ShL</B> 4) <B>or</B> HexDig (APtr^));
\r
2094 <I><FONT COLOR="Navy">// HexDig will cause Error if bad hex digit found </FONT></I>
\r
2097 <B>else</B> Result := APtr^;
\r
2102 Result := <B>nil</B>;
\r
2103 flagp := WORST; <I><FONT COLOR="Navy">// Tentatively. </FONT></I>
\r
2106 <B>case</B> (regparse - 1)^ <B>of</B>
\r
2107 '^': ret := EmitNode (BOL);
\r
2108 '$': ret := EmitNode (EOL);
\r
2110 <B>if</B> (fCompModifiers <B>and</B> MaskModS) = MaskModS <B>then</B> <B>begin</B>
\r
2111 ret := EmitNode (ANY);
\r
2112 flagp := flagp <B>or</B> HASWIDTH <B>or</B> SIMPLE;
\r
2114 <B>else</B> <B>begin</B> <I><FONT COLOR="Navy">// not /s, so emit [^/n] </FONT></I>
\r
2115 ret := EmitRange (ANYBUT);
\r
2116 EmitRangeStr (#$a);
\r
2118 flagp := flagp <B>or</B> HASWIDTH <B>or</B> SIMPLE;
\r
2120 '[': <B>begin</B>
\r
2121 <B>if</B> regparse^ = '^' <B>then</B> <B>begin</B> <I><FONT COLOR="Navy">// Complement of range. </FONT></I>
\r
2122 <B>if</B> (fCompModifiers <B>and</B> MaskModI) = MaskModI
\r
2123 <B>then</B> ret := EmitRange (ANYBUTCI)
\r
2124 <B>else</B> ret := EmitRange (ANYBUT);
\r
2128 <B>if</B> (fCompModifiers <B>and</B> MaskModI) = MaskModI
\r
2129 <B>then</B> ret := EmitRange (ANYOFCI)
\r
2130 <B>else</B> ret := EmitRange (ANYOF);
\r
2132 CanBeRange := false;
\r
2134 <B>if</B> (regparse^ = ']') <B>then</B> <B>begin</B>
\r
2135 EmitSimpleRangeC (regparse^); <I><FONT COLOR="Navy">// []-a] -> ']' .. 'a' </FONT></I>
\r
2139 <B>while</B> (regparse^ <> #0) <B>and</B> (regparse^ <> ']') <B>do</B> <B>begin</B>
\r
2140 <B>if</B> (regparse^ = '-')
\r
2141 <B>and</B> ((regparse + 1)^ <> #0) <B>and</B> ((regparse + 1)^ <> ']')
\r
2142 <B>and</B> CanBeRange <B>then</B> <B>begin</B>
\r
2144 RangeEnd := regparse^;
\r
2145 <B>if</B> RangeEnd = '/' <B>then</B> <B>begin</B>
\r
2146 <I><FONT COLOR="Navy">{$IFDEF UniCode}</FONT></I> <I><FONT COLOR="Navy">//###0.935 </FONT></I>
\r
2147 <B>if</B> (ord ((regparse + 1)^) < 256)
\r
2148 <B>and</B> (char ((regparse + 1)^)
\r
2149 <B>in</B> ['d', 'D', 's', 'S', 'w', 'W']) <B>then</B> <B>begin</B>
\r
2150 <I><FONT COLOR="Navy">{$ELSE}</FONT></I>
\r
2151 <B>if</B> (regparse + 1)^ <B>in</B> ['d', 'D', 's', 'S', 'w', 'W'] <B>then</B> <B>begin</B>
\r
2152 <I><FONT COLOR="Navy">{$ENDIF}</FONT></I>
\r
2153 EmitRangeC ('-'); <I><FONT COLOR="Navy">// or treat as error ?!! </FONT></I>
\r
2157 RangeEnd := UnQuoteChar (regparse);
\r
2160 <I><FONT COLOR="Navy">// r.e.ranges extension for russian </FONT></I>
\r
2161 <B>if</B> ((fCompModifiers <B>and</B> MaskModR) = MaskModR)
\r
2162 <B>and</B> (RangeBeg = RusRangeLoLow) <B>and</B> (RangeEnd = RusRangeLoHigh) <B>then</B> <B>begin</B>
\r
2163 EmitRangeStr (RusRangeLo);
\r
2165 <B>else</B> <B>if</B> ((fCompModifiers <B>and</B> MaskModR) = MaskModR)
\r
2166 <B>and</B> (RangeBeg = RusRangeHiLow) <B>and</B> (RangeEnd = RusRangeHiHigh) <B>then</B> <B>begin</B>
\r
2167 EmitRangeStr (RusRangeHi);
\r
2169 <B>else</B> <B>if</B> ((fCompModifiers <B>and</B> MaskModR) = MaskModR)
\r
2170 <B>and</B> (RangeBeg = RusRangeLoLow) <B>and</B> (RangeEnd = RusRangeHiHigh) <B>then</B> <B>begin</B>
\r
2171 EmitRangeStr (RusRangeLo);
\r
2172 EmitRangeStr (RusRangeHi);
\r
2174 <B>else</B> <B>begin</B> <I><FONT COLOR="Navy">// standard r.e. handling </FONT></I>
\r
2175 <B>if</B> RangeBeg > RangeEnd <B>then</B> <B>begin</B>
\r
2176 Error (reeInvalidRange);
\r
2180 EmitRangeC (RangeEnd); <I><FONT COLOR="Navy">// prevent infinite loop if RangeEnd=$ff </FONT></I>
\r
2181 <B>while</B> RangeBeg < RangeEnd <B>do</B> <B>begin</B> <I><FONT COLOR="Navy">//###0.929 </FONT></I>
\r
2182 EmitRangeC (RangeBeg);
\r
2188 <B>else</B> <B>begin</B>
\r
2189 <B>if</B> regparse^ = '/' <B>then</B> <B>begin</B>
\r
2191 <B>if</B> regparse^ = #0 <B>then</B> <B>begin</B>
\r
2192 Error (reeParseAtomTrailingBackSlash);
\r
2195 <B>case</B> regparse^ <B>of</B> <I><FONT COLOR="Navy">// r.e.extensions </FONT></I>
\r
2196 'd': EmitRangeStr ('0123456789');
\r
2197 'w': EmitRangeStr (WordChars);
\r
2198 's': EmitRangeStr (SpaceChars);
\r
2199 <B>else</B> EmitSimpleRangeC (UnQuoteChar (regparse));
\r
2200 <B>end</B>; <I><FONT COLOR="Navy">{ of case}</FONT></I>
\r
2202 <B>else</B> EmitSimpleRangeC (regparse^);
\r
2205 <B>end</B>; <I><FONT COLOR="Navy">{ of while}</FONT></I>
\r
2207 <B>if</B> regparse^ <> ']' <B>then</B> <B>begin</B>
\r
2208 Error (reeUnmatchedSqBrackets);
\r
2212 flagp := flagp <B>or</B> HASWIDTH <B>or</B> SIMPLE;
\r
2214 '(': <B>begin</B>
\r
2215 <B>if</B> regparse^ = '?' <B>then</B> <B>begin</B>
\r
2216 <I><FONT COLOR="Navy">// check for extended Perl syntax : (?..) </FONT></I>
\r
2217 <B>if</B> (regparse + 1)^ = '#' <B>then</B> <B>begin</B> <I><FONT COLOR="Navy">// (?#comment) </FONT></I>
\r
2218 inc (regparse, 2); <I><FONT COLOR="Navy">// find closing ')' </FONT></I>
\r
2219 <B>while</B> (regparse^ <> #0) <B>and</B> (regparse^ <> ')')
\r
2220 <B>do</B> inc (regparse);
\r
2221 <B>if</B> regparse^ <> ')' <B>then</B> <B>begin</B>
\r
2222 Error (reeUnclosedComment);
\r
2225 inc (regparse); <I><FONT COLOR="Navy">// skip ')' </FONT></I>
\r
2226 ret := EmitNode (COMMENT); <I><FONT COLOR="Navy">// comment </FONT></I>
\r
2228 <B>else</B> <B>begin</B> <I><FONT COLOR="Navy">// modifiers ? </FONT></I>
\r
2229 inc (regparse); <I><FONT COLOR="Navy">// skip '?' </FONT></I>
\r
2230 begmodfs := regparse;
\r
2231 <B>while</B> (regparse^ <> #0) <B>and</B> (regparse^ <> ')')
\r
2232 <B>do</B> inc (regparse);
\r
2233 <B>if</B> (regparse^ <> ')')
\r
2234 <B>or</B> <B>not</B> SetModifiersInt (copy (begmodfs, 1, (regparse - begmodfs)), fCompModifiers) <B>then</B> <B>begin</B>
\r
2235 Error (reeUrecognizedModifier);
\r
2238 inc (regparse); <I><FONT COLOR="Navy">// skip ')' </FONT></I>
\r
2239 ret := EmitNode (COMMENT); <I><FONT COLOR="Navy">// comment </FONT></I>
\r
2240 <I><FONT COLOR="Navy">// Error (reeQPSBFollowsNothing); </FONT></I>
\r
2241 <I><FONT COLOR="Navy">// EXIT; </FONT></I>
\r
2244 <B>else</B> <B>begin</B>
\r
2245 ret := ParseReg (1, flags);
\r
2246 <B>if</B> ret = <B>nil</B> <B>then</B> <B>begin</B>
\r
2247 Result := <B>nil</B>;
\r
2250 flagp := flagp <B>or</B> flags <B>and</B> (HASWIDTH <B>or</B> SPSTART);
\r
2253 #0, '|', ')': <B>begin</B> <I><FONT COLOR="Navy">// Supposed to be caught earlier. </FONT></I>
\r
2254 Error (reeInternalUrp);
\r
2257 '?', '+', '*': <B>begin</B>
\r
2258 Error (reeQPSBFollowsNothing);
\r
2261 '/': <B>begin</B>
\r
2262 <B>if</B> regparse^ = #0 <B>then</B> <B>begin</B>
\r
2263 Error (reeTrailingBackSlash);
\r
2266 <B>case</B> regparse^ <B>of</B> <I><FONT COLOR="Navy">// r.e.extensions </FONT></I>
\r
2267 'd': <B>begin</B> <I><FONT COLOR="Navy">// r.e.extension - any digit ('0' .. '9') </FONT></I>
\r
2268 ret := EmitNode (ANYDIGIT);
\r
2269 flagp := flagp <B>or</B> HASWIDTH <B>or</B> SIMPLE;
\r
2271 'D': <B>begin</B> <I><FONT COLOR="Navy">// r.e.extension - not digit ('0' .. '9') </FONT></I>
\r
2272 ret := EmitNode (NOTDIGIT);
\r
2273 flagp := flagp <B>or</B> HASWIDTH <B>or</B> SIMPLE;
\r
2275 's': <B>begin</B> <I><FONT COLOR="Navy">// r.e.extension - any space char </FONT></I>
\r
2276 <I><FONT COLOR="Navy">{$IFDEF UseSetOfChar}</FONT></I>
\r
2277 ret := EmitRange (ANYOF);
\r
2278 EmitRangeStr (SpaceChars);
\r
2280 <I><FONT COLOR="Navy">{$ELSE}</FONT></I>
\r
2281 ret := EmitNode (ANYSPACE);
\r
2282 <I><FONT COLOR="Navy">{$ENDIF}</FONT></I>
\r
2283 flagp := flagp <B>or</B> HASWIDTH <B>or</B> SIMPLE;
\r
2285 'S': <B>begin</B> <I><FONT COLOR="Navy">// r.e.extension - not space char </FONT></I>
\r
2286 <I><FONT COLOR="Navy">{$IFDEF UseSetOfChar}</FONT></I>
\r
2287 ret := EmitRange (ANYBUT);
\r
2288 EmitRangeStr (SpaceChars);
\r
2290 <I><FONT COLOR="Navy">{$ELSE}</FONT></I>
\r
2291 ret := EmitNode (NOTSPACE);
\r
2292 <I><FONT COLOR="Navy">{$ENDIF}</FONT></I>
\r
2293 flagp := flagp <B>or</B> HASWIDTH <B>or</B> SIMPLE;
\r
2295 'w': <B>begin</B> <I><FONT COLOR="Navy">// r.e.extension - any english char or '_' </FONT></I>
\r
2296 <I><FONT COLOR="Navy">{$IFDEF UseSetOfChar}</FONT></I>
\r
2297 ret := EmitRange (ANYOF);
\r
2298 EmitRangeStr (WordChars);
\r
2300 <I><FONT COLOR="Navy">{$ELSE}</FONT></I>
\r
2301 ret := EmitNode (ANYLETTER);
\r
2302 <I><FONT COLOR="Navy">{$ENDIF}</FONT></I>
\r
2303 flagp := flagp <B>or</B> HASWIDTH <B>or</B> SIMPLE;
\r
2305 'W': <B>begin</B> <I><FONT COLOR="Navy">// r.e.extension - not english char or '_' </FONT></I>
\r
2306 <I><FONT COLOR="Navy">{$IFDEF UseSetOfChar}</FONT></I>
\r
2307 ret := EmitRange (ANYBUT);
\r
2308 EmitRangeStr (WordChars);
\r
2310 <I><FONT COLOR="Navy">{$ELSE}</FONT></I>
\r
2311 ret := EmitNode (NOTLETTER);
\r
2312 <I><FONT COLOR="Navy">{$ENDIF}</FONT></I>
\r
2313 flagp := flagp <B>or</B> HASWIDTH <B>or</B> SIMPLE;
\r
2315 '1' .. '9': <B>begin</B> <I><FONT COLOR="Navy">//###0.936 </FONT></I>
\r
2316 <B>if</B> (fCompModifiers <B>and</B> MaskModI) = MaskModI
\r
2317 <B>then</B> ret := EmitNode (BSUBEXPCI)
\r
2318 <B>else</B> ret := EmitNode (BSUBEXP);
\r
2319 EmitC (REChar (ord (regparse^) - ord ('0')));
\r
2320 flagp := flagp <B>or</B> HASWIDTH <B>or</B> SIMPLE;
\r
2322 <B>else</B> EmitExactly (UnQuoteChar (regparse));
\r
2323 <B>end</B>; <I><FONT COLOR="Navy">{ of case}</FONT></I>
\r
2326 <B>else</B> <B>begin</B>
\r
2328 len := strcspn (regparse, META);
\r
2329 <B>if</B> len <= 0 <B>then</B>
\r
2330 <B>if</B> regparse^ <> '{' <B>then</B> <B>begin</B>
\r
2331 Error (reeRarseAtomInternalDisaster);
\r
2334 <B>else</B> len := strcspn (regparse + 1, META) + 1; <I><FONT COLOR="Navy">// bad {n,m} - compile as EXATLY </FONT></I>
\r
2335 ender := (regparse + len)^;
\r
2336 <B>if</B> (len > 1)
\r
2337 <B>and</B> ((ender = '*') <B>or</B> (ender = '+') <B>or</B> (ender = '?') <B>or</B> (ender = '{'))
\r
2338 <B>then</B> dec (len); <I><FONT COLOR="Navy">// Back off clear of ?+*{ operand. </FONT></I>
\r
2339 flagp := flagp <B>or</B> HASWIDTH;
\r
2340 <B>if</B> len = 1
\r
2341 <B>then</B> flagp := flagp <B>or</B> SIMPLE;
\r
2342 <B>if</B> (fCompModifiers <B>and</B> MaskModI) = MaskModI
\r
2343 <B>then</B> ret := EmitNode (EXACTLYCI)
\r
2344 <B>else</B> ret := EmitNode (EXACTLY);
\r
2345 <B>while</B> len > 0 <B>do</B> <B>begin</B>
\r
2346 EmitC (regparse^);
\r
2351 <B>end</B>; <I><FONT COLOR="Navy">{ of case else}</FONT></I>
\r
2352 <B>end</B>; <I><FONT COLOR="Navy">{ of case}</FONT></I>
\r
2355 <B>end</B>; <I><FONT COLOR="Navy">{ of function TRegExpr.ParseAtom
\r
2356 --------------------------------------------------------------}</FONT></I>
\r
2358 <B>function</B> TRegExpr.GetCompilerErrorPos : integer;
\r
2361 <B>if</B> (regexpbeg = <B>nil</B>) <B>or</B> (regparse = <B>nil</B>)
\r
2362 <B>then</B> EXIT; <I><FONT COLOR="Navy">// not in compiling mode ? </FONT></I>
\r
2363 Result := regparse - regexpbeg;
\r
2364 <B>end</B>; <I><FONT COLOR="Navy">{ of function TRegExpr.GetCompilerErrorPos
\r
2365 --------------------------------------------------------------}</FONT></I>
\r
2368 <I><FONT COLOR="Navy">{=============================================================}</FONT></I>
\r
2369 <I><FONT COLOR="Navy">{===================== Matching section ======================}</FONT></I>
\r
2370 <I><FONT COLOR="Navy">{=============================================================}</FONT></I>
\r
2372 <I><FONT COLOR="Navy">{$IFNDEF UseSetOfChar}</FONT></I>
\r
2373 <B>function</B> TRegExpr.StrScanCI (s : PRegExprChar; ch : REChar) : PRegExprChar; <I><FONT COLOR="Navy">//###0.928 - now method of TRegExpr </FONT></I>
\r
2375 <B>while</B> (s^ <> #0) <B>and</B> (s^ <> ch) <B>and</B> (s^ <> InvertCase (ch))
\r
2376 <B>do</B> inc (s);
\r
2377 <B>if</B> s^ <> #0
\r
2378 <B>then</B> Result := s
\r
2379 <B>else</B> Result := <B>nil</B>;
\r
2380 <B>end</B>; <I><FONT COLOR="Navy">{ of function TRegExpr.StrScanCI
\r
2381 --------------------------------------------------------------}</FONT></I>
\r
2382 <I><FONT COLOR="Navy">{$ENDIF}</FONT></I>
\r
2384 <B>function</B> TRegExpr.regrepeat (p : PRegExprChar; AMax : integer) : integer;
\r
2385 <I><FONT COLOR="Navy">// repeatedly match something simple, report how many </FONT></I>
\r
2387 scan : PRegExprChar;
\r
2388 opnd : PRegExprChar;
\r
2389 TheMax : integer;
\r
2390 <I><FONT COLOR="Navy">{Ch,}</FONT></I> InvCh : REChar; <I><FONT COLOR="Navy">//###0.931 </FONT></I>
\r
2391 sestart, seend : PRegExprChar; <I><FONT COLOR="Navy">//###0.936 </FONT></I>
\r
2394 scan := reginput;
\r
2395 opnd := p + REOpSz + RENextOffSz; <I><FONT COLOR="Navy">//OPERAND </FONT></I>
\r
2396 TheMax := fInputEnd - scan;
\r
2397 <B>if</B> TheMax > AMax
\r
2398 <B>then</B> TheMax := AMax;
\r
2399 <B>case</B> PREOp (p)^ <B>of</B>
\r
2400 ANY: <B>begin</B>
\r
2401 Result := TheMax;
\r
2402 inc (scan, Result);
\r
2404 EXACTLY: <B>begin</B> <I><FONT COLOR="Navy">// in opnd can be only ONE char !!! </FONT></I>
\r
2405 <I><FONT COLOR="Navy">// Ch := opnd^; // store in register //###0.931 </FONT></I>
\r
2406 <B>while</B> (Result < TheMax) <B>and</B> (opnd^ = scan^) <B>do</B> <B>begin</B>
\r
2411 EXACTLYCI: <B>begin</B> <I><FONT COLOR="Navy">// in opnd can be only ONE char !!! </FONT></I>
\r
2412 <I><FONT COLOR="Navy">// Ch := opnd^; // store in register //###0.931 </FONT></I>
\r
2413 <B>while</B> (Result < TheMax) <B>and</B> (opnd^ = scan^) <B>do</B> <B>begin</B> <I><FONT COLOR="Navy">// prevent unneeded InvertCase //###0.931 </FONT></I>
\r
2417 <B>if</B> Result < TheMax <B>then</B> <B>begin</B> <I><FONT COLOR="Navy">//###0.931 </FONT></I>
\r
2418 InvCh := InvertCase (opnd^); <I><FONT COLOR="Navy">// store in register </FONT></I>
\r
2419 <B>while</B> (Result < TheMax) <B>and</B>
\r
2420 ((opnd^ = scan^) <B>or</B> (InvCh = scan^)) <B>do</B> <B>begin</B>
\r
2426 BSUBEXP: <B>begin</B> <I><FONT COLOR="Navy">//###0.936 </FONT></I>
\r
2427 sestart := startp [ord (opnd^)];
\r
2428 <B>if</B> sestart = <B>nil</B>
\r
2429 <B>then</B> EXIT;
\r
2430 seend := endp [ord (opnd^)];
\r
2431 <B>if</B> seend = <B>nil</B>
\r
2432 <B>then</B> EXIT;
\r
2435 <B>while</B> opnd < seend <B>do</B> <B>begin</B>
\r
2436 <B>if</B> (scan >= fInputEnd) <B>or</B> (scan^ <> opnd^)
\r
2437 <B>then</B> EXIT;
\r
2442 reginput := scan;
\r
2443 <B>UNTIL</B> Result >= AMax;
\r
2445 BSUBEXPCI: <B>begin</B> <I><FONT COLOR="Navy">//###0.936 </FONT></I>
\r
2446 sestart := startp [ord (opnd^)];
\r
2447 <B>if</B> sestart = <B>nil</B>
\r
2448 <B>then</B> EXIT;
\r
2449 seend := endp [ord (opnd^)];
\r
2450 <B>if</B> seend = <B>nil</B>
\r
2451 <B>then</B> EXIT;
\r
2454 <B>while</B> opnd < seend <B>do</B> <B>begin</B>
\r
2455 <B>if</B> (scan >= fInputEnd) <B>or</B>
\r
2456 ((scan^ <> opnd^) <B>and</B> (scan^ <> InvertCase (opnd^)))
\r
2457 <B>then</B> EXIT;
\r
2462 reginput := scan;
\r
2463 <B>UNTIL</B> Result >= AMax;
\r
2466 <B>while</B> (Result < TheMax) <B>and</B>
\r
2467 (scan^ >= '0') <B>and</B> (scan^ <= '9') <B>do</B> <B>begin</B>
\r
2472 <B>while</B> (Result < TheMax) <B>and</B>
\r
2473 ((scan^ < '0') <B>or</B> (scan^ > '9')) <B>do</B> <B>begin</B>
\r
2477 <I><FONT COLOR="Navy">{$IFNDEF UseSetOfChar}</FONT></I> <I><FONT COLOR="Navy">//###0.929 </FONT></I>
\r
2479 <B>while</B> (Result < TheMax) <B>and</B>
\r
2480 <I><FONT COLOR="Navy">// !!!!!?????? if length (fWordChars) <> 0 </FONT></I>
\r
2481 <I><FONT COLOR="Navy">// then Pos (scan^, fWordChars) </FONT></I>
\r
2482 <I><FONT COLOR="Navy">// else </FONT></I>
\r
2483 ((scan^ >= 'a') <B>and</B> (scan^ <= 'z')
\r
2484 <B>or</B> (scan^ >= 'A') <B>and</B> (scan^ <= 'Z') <B>or</B> (scan^ = '_')) <B>do</B> <B>begin</B>
\r
2489 <B>while</B> (Result < TheMax) <B>and</B>
\r
2490 <I><FONT COLOR="Navy">// !!!!!?????? if length (fWordChars) <> 0 </FONT></I>
\r
2491 <I><FONT COLOR="Navy">// then Pos (scan^, fWordChars) </FONT></I>
\r
2492 <I><FONT COLOR="Navy">// else </FONT></I>
\r
2493 <B>not</B> ((scan^ >= 'a') <B>and</B> (scan^ <= 'z')
\r
2494 <B>or</B> (scan^ >= 'A') <B>and</B> (scan^ <= 'Z')
\r
2495 <B>or</B> (scan^ = '_')) <B>do</B> <B>begin</B>
\r
2500 <B>while</B> (Result < TheMax) <B>and</B>
\r
2501 (Pos (scan^, fSpaceChars) > 0) <B>do</B> <B>begin</B>
\r
2506 <B>while</B> (Result < TheMax) <B>and</B>
\r
2507 (Pos (scan^, fSpaceChars) <= 0) <B>do</B> <B>begin</B>
\r
2511 <I><FONT COLOR="Navy">{$ENDIF}</FONT></I>
\r
2512 ANYOFTINYSET: <B>begin</B>
\r
2513 <B>while</B> (Result < TheMax) <B>and</B> <I><FONT COLOR="Navy">//!!!TinySet </FONT></I>
\r
2514 ((scan^ = opnd^) <B>or</B> (scan^ = (opnd + 1)^)
\r
2515 <B>or</B> (scan^ = (opnd + 2)^)) <B>do</B> <B>begin</B>
\r
2520 ANYBUTTINYSET: <B>begin</B>
\r
2521 <B>while</B> (Result < TheMax) <B>and</B> <I><FONT COLOR="Navy">//!!!TinySet </FONT></I>
\r
2522 (scan^ <> opnd^) <B>and</B> (scan^ <> (opnd + 1)^)
\r
2523 <B>and</B> (scan^ <> (opnd + 2)^) <B>do</B> <B>begin</B>
\r
2528 <I><FONT COLOR="Navy">{$IFDEF UseSetOfChar}</FONT></I> <I><FONT COLOR="Navy">//###0.929 </FONT></I>
\r
2529 ANYOFFULLSET: <B>begin</B>
\r
2530 <B>while</B> (Result < TheMax) <B>and</B>
\r
2531 (scan^ <B>in</B> PSetOfREChar (opnd)^) <B>do</B> <B>begin</B>
\r
2536 <I><FONT COLOR="Navy">{$ELSE}</FONT></I>
\r
2538 <B>while</B> (Result < TheMax) <B>and</B>
\r
2539 (StrScan (opnd, scan^) <> <B>nil</B>) <B>do</B> <B>begin</B>
\r
2544 <B>while</B> (Result < TheMax) <B>and</B>
\r
2545 (StrScan (opnd, scan^) = <B>nil</B>) <B>do</B> <B>begin</B>
\r
2550 <B>while</B> (Result < TheMax) <B>and</B> (StrScanCI (opnd, scan^) <> <B>nil</B>) <B>do</B> <B>begin</B>
\r
2555 <B>while</B> (Result < TheMax) <B>and</B> (StrScanCI (opnd, scan^) = <B>nil</B>) <B>do</B> <B>begin</B>
\r
2559 <I><FONT COLOR="Navy">{$ENDIF}</FONT></I>
\r
2560 <B>else</B> <B>begin</B> <I><FONT COLOR="Navy">// Oh dear. Called inappropriately. </FONT></I>
\r
2561 Result := 0; <I><FONT COLOR="Navy">// Best compromise. </FONT></I>
\r
2562 Error (reeRegRepeatCalledInappropriately);
\r
2565 <B>end</B>; <I><FONT COLOR="Navy">{ of case}</FONT></I>
\r
2566 reginput := scan;
\r
2567 <B>end</B>; <I><FONT COLOR="Navy">{ of function TRegExpr.regrepeat
\r
2568 --------------------------------------------------------------}</FONT></I>
\r
2570 <B>function</B> TRegExpr.regnext (p : PRegExprChar) : PRegExprChar;
\r
2571 <I><FONT COLOR="Navy">// dig the "next" pointer out of a node </FONT></I>
\r
2572 <B>var</B> offset : TRENextOff;
\r
2574 <B>if</B> p = @regdummy <B>then</B> <B>begin</B>
\r
2575 Result := <B>nil</B>;
\r
2578 offset := PRENextOff (p + REOpSz)^; <I><FONT COLOR="Navy">//###0.933 inlined NEXT </FONT></I>
\r
2579 <B>if</B> offset = 0
\r
2580 <B>then</B> Result := <B>nil</B>
\r
2581 <B>else</B> Result := p + offset;
\r
2582 <B>end</B>; <I><FONT COLOR="Navy">{ of function TRegExpr.regnext
\r
2583 --------------------------------------------------------------}</FONT></I>
\r
2585 <B>function</B> TRegExpr.MatchPrim (prog : PRegExprChar) : boolean;
\r
2586 <I><FONT COLOR="Navy">// recursively matching routine </FONT></I>
\r
2587 <I><FONT COLOR="Navy">// Conceptually the strategy is simple: check to see whether the current </FONT></I>
\r
2588 <I><FONT COLOR="Navy">// node matches, call self recursively to see whether the rest matches, </FONT></I>
\r
2589 <I><FONT COLOR="Navy">// and then act accordingly. In practice we make some effort to avoid </FONT></I>
\r
2590 <I><FONT COLOR="Navy">// recursion, in particular by going through "ordinary" nodes (that don't </FONT></I>
\r
2591 <I><FONT COLOR="Navy">// need to know whether the rest of the match failed) by a loop instead of </FONT></I>
\r
2592 <I><FONT COLOR="Navy">// by recursion. </FONT></I>
\r
2594 scan : PRegExprChar; <I><FONT COLOR="Navy">// Current node. </FONT></I>
\r
2595 next : PRegExprChar; <I><FONT COLOR="Navy">// Next node. </FONT></I>
\r
2597 opnd : PRegExprChar;
\r
2599 save : PRegExprChar;
\r
2601 BracesMin, BracesMax : integer; <I><FONT COLOR="Navy">// we use integer instead of TREBracesArg for better support */+ </FONT></I>
\r
2602 <I><FONT COLOR="Navy">{$IFDEF ComplexBraces}</FONT></I>
\r
2603 SavedLoopStack : <B>array</B> [1 .. LoopStackMax] <B>of</B> integer; <I><FONT COLOR="Navy">// :(( very bad for recursion </FONT></I>
\r
2604 SavedLoopStackIdx : integer; <I><FONT COLOR="Navy">//###0.925 </FONT></I>
\r
2605 <I><FONT COLOR="Navy">{$ENDIF}</FONT></I>
\r
2610 <B>while</B> scan <> <B>nil</B> <B>do</B> <B>begin</B>
\r
2611 len := PRENextOff (scan + 1)^; <I><FONT COLOR="Navy">//###0.932 inlined regnext </FONT></I>
\r
2612 <B>if</B> len = 0
\r
2613 <B>then</B> next := <B>nil</B>
\r
2614 <B>else</B> next := scan + len;
\r
2616 <B>case</B> scan^ <B>of</B>
\r
2617 BOL: <B>if</B> reginput <> fInputStart
\r
2618 <B>then</B> EXIT;
\r
2619 EOL: <B>if</B> reginput^ <> #0
\r
2620 <B>then</B> EXIT;
\r
2621 ANY: <B>begin</B>
\r
2622 <B>if</B> reginput^ = #0
\r
2623 <B>then</B> EXIT;
\r
2626 ANYDIGIT: <B>begin</B>
\r
2627 <B>if</B> (reginput^ = #0) <B>or</B> (reginput^ < '0') <B>or</B> (reginput^ > '9')
\r
2628 <B>then</B> EXIT;
\r
2631 NOTDIGIT: <B>begin</B>
\r
2632 <B>if</B> (reginput^ = #0) <B>or</B> ((reginput^ >= '0') <B>and</B> (reginput^ <= '9'))
\r
2633 <B>then</B> EXIT;
\r
2636 <I><FONT COLOR="Navy">{$IFNDEF UseSetOfChar}</FONT></I> <I><FONT COLOR="Navy">//###0.929 </FONT></I>
\r
2637 ANYLETTER: <B>begin</B>
\r
2638 <B>if</B> (reginput^ = #0) <B>or</B>
\r
2639 <I><FONT COLOR="Navy">// !!!!!?????? Pos (scan^, fWordChars) </FONT></I>
\r
2640 <B>not</B> ((reginput^ >= 'a') <B>and</B> (reginput^ <= 'z')
\r
2641 <B>or</B> (reginput^ >= 'A') <B>and</B> (reginput^ <= 'Z')
\r
2642 <B>or</B> (reginput^ = '_'))
\r
2643 <B>then</B> EXIT;
\r
2646 NOTLETTER: <B>begin</B>
\r
2647 <B>if</B> (reginput^ = #0) <B>or</B>
\r
2648 <I><FONT COLOR="Navy">// !!!!!?????? Pos (scan^, fWordChars) </FONT></I>
\r
2649 (reginput^ >= 'a') <B>and</B> (reginput^ <= 'z')
\r
2650 <B>or</B> (reginput^ >= 'A') <B>and</B> (reginput^ <= 'Z')
\r
2651 <B>or</B> (reginput^ = '_')
\r
2652 <B>then</B> EXIT;
\r
2655 ANYSPACE: <B>begin</B>
\r
2656 <B>if</B> (reginput^ = #0) <B>or</B> <B>not</B> (Pos (scan^, fSpaceChars) > 0)
\r
2657 <B>then</B> EXIT;
\r
2660 NOTSPACE: <B>begin</B>
\r
2661 <B>if</B> (reginput^ = #0) <B>or</B> (Pos (scan^, fSpaceChars) > 0)
\r
2662 <B>then</B> EXIT;
\r
2665 <I><FONT COLOR="Navy">{$ENDIF}</FONT></I>
\r
2666 EXACTLYCI: <B>begin</B>
\r
2667 opnd := scan + REOpSz + RENextOffSz; <I><FONT COLOR="Navy">// OPERAND </FONT></I>
\r
2668 <I><FONT COLOR="Navy">// Inline the first character, for speed. </FONT></I>
\r
2669 <B>if</B> (opnd^ <> reginput^)
\r
2670 <B>and</B> (InvertCase (opnd^) <> reginput^)
\r
2671 <B>then</B> EXIT;
\r
2672 len := strlen (opnd);
\r
2673 <I><FONT COLOR="Navy">//###0.929 begin </FONT></I>
\r
2675 save := reginput;
\r
2676 <B>while</B> no > 1 <B>do</B> <B>begin</B>
\r
2679 <B>if</B> (opnd^ <> save^)
\r
2680 <B>and</B> (InvertCase (opnd^) <> save^)
\r
2681 <B>then</B> EXIT;
\r
2684 <I><FONT COLOR="Navy">//###0.929 end </FONT></I>
\r
2685 inc (reginput, len);
\r
2687 EXACTLY: <B>begin</B>
\r
2688 opnd := scan + REOpSz + RENextOffSz; <I><FONT COLOR="Navy">// OPERAND </FONT></I>
\r
2689 <I><FONT COLOR="Navy">// Inline the first character, for speed. </FONT></I>
\r
2690 <B>if</B> opnd^ <> reginput^
\r
2691 <B>then</B> EXIT;
\r
2692 len := strlen (opnd);
\r
2693 <I><FONT COLOR="Navy">//###0.929 begin </FONT></I>
\r
2695 save := reginput;
\r
2696 <B>while</B> no > 1 <B>do</B> <B>begin</B>
\r
2699 <B>if</B> opnd^ <> save^
\r
2700 <B>then</B> EXIT;
\r
2703 <I><FONT COLOR="Navy">//###0.929 end </FONT></I>
\r
2704 inc (reginput, len);
\r
2706 BSUBEXP: <B>begin</B> <I><FONT COLOR="Navy">//###0.936 </FONT></I>
\r
2707 no := ord ((scan + REOpSz + RENextOffSz)^);
\r
2708 <B>if</B> startp [no] = <B>nil</B>
\r
2709 <B>then</B> EXIT;
\r
2710 <B>if</B> endp [no] = <B>nil</B>
\r
2711 <B>then</B> EXIT;
\r
2712 save := reginput;
\r
2713 opnd := startp [no];
\r
2714 <B>while</B> opnd < endp [no] <B>do</B> <B>begin</B>
\r
2715 <B>if</B> (save >= fInputEnd) <B>or</B> (save^ <> opnd^)
\r
2716 <B>then</B> EXIT;
\r
2720 reginput := save;
\r
2722 BSUBEXPCI: <B>begin</B> <I><FONT COLOR="Navy">//###0.936 </FONT></I>
\r
2723 no := ord ((scan + REOpSz + RENextOffSz)^);
\r
2724 <B>if</B> startp [no] = <B>nil</B>
\r
2725 <B>then</B> EXIT;
\r
2726 <B>if</B> endp [no] = <B>nil</B>
\r
2727 <B>then</B> EXIT;
\r
2728 save := reginput;
\r
2729 opnd := startp [no];
\r
2730 <B>while</B> opnd < endp [no] <B>do</B> <B>begin</B>
\r
2731 <B>if</B> (save >= fInputEnd) <B>or</B>
\r
2732 ((save^ <> opnd^) <B>and</B> (save^ <> InvertCase (opnd^)))
\r
2733 <B>then</B> EXIT;
\r
2737 reginput := save;
\r
2739 ANYOFTINYSET: <B>begin</B>
\r
2740 <B>if</B> (reginput^ = #0) <B>or</B> <I><FONT COLOR="Navy">//!!!TinySet </FONT></I>
\r
2741 ((reginput^ <> (scan + REOpSz + RENextOffSz)^)
\r
2742 <B>and</B> (reginput^ <> (scan + REOpSz + RENextOffSz + 1)^)
\r
2743 <B>and</B> (reginput^ <> (scan + REOpSz + RENextOffSz + 2)^))
\r
2744 <B>then</B> EXIT;
\r
2747 ANYBUTTINYSET: <B>begin</B>
\r
2748 <B>if</B> (reginput^ = #0) <B>or</B> <I><FONT COLOR="Navy">//!!!TinySet </FONT></I>
\r
2749 (reginput^ = (scan + REOpSz + RENextOffSz)^)
\r
2750 <B>or</B> (reginput^ = (scan + REOpSz + RENextOffSz + 1)^)
\r
2751 <B>or</B> (reginput^ = (scan + REOpSz + RENextOffSz + 2)^)
\r
2752 <B>then</B> EXIT;
\r
2755 <I><FONT COLOR="Navy">{$IFDEF UseSetOfChar}</FONT></I> <I><FONT COLOR="Navy">//###0.929 </FONT></I>
\r
2756 ANYOFFULLSET: <B>begin</B>
\r
2757 <B>if</B> (reginput^ = #0)
\r
2758 <B>or</B> <B>not</B> (reginput^ <B>in</B> PSetOfREChar (scan + REOpSz + RENextOffSz)^)
\r
2759 <B>then</B> EXIT;
\r
2762 <I><FONT COLOR="Navy">{$ELSE}</FONT></I>
\r
2763 ANYOF: <B>begin</B>
\r
2764 <B>if</B> (reginput^ = #0) <B>or</B> (StrScan (scan + REOpSz + RENextOffSz, reginput^) = <B>nil</B>)
\r
2765 <B>then</B> EXIT;
\r
2768 ANYBUT: <B>begin</B>
\r
2769 <B>if</B> (reginput^ = #0) <B>or</B> (StrScan (scan + REOpSz + RENextOffSz, reginput^) <> <B>nil</B>)
\r
2770 <B>then</B> EXIT;
\r
2773 ANYOFCI: <B>begin</B>
\r
2774 <B>if</B> (reginput^ = #0) <B>or</B> (StrScanCI (scan + REOpSz + RENextOffSz, reginput^) = <B>nil</B>)
\r
2775 <B>then</B> EXIT;
\r
2778 ANYBUTCI: <B>begin</B>
\r
2779 <B>if</B> (reginput^ = #0) <B>or</B> (StrScanCI (scan + REOpSz + RENextOffSz, reginput^) <> <B>nil</B>)
\r
2780 <B>then</B> EXIT;
\r
2783 <I><FONT COLOR="Navy">{$ENDIF}</FONT></I>
\r
2787 Succ (OPEN) .. TREOp (Ord (OPEN) + NSUBEXP - 1) : <B>begin</B> <I><FONT COLOR="Navy">//###0.929 </FONT></I>
\r
2788 no := ord (scan^) - ord (OPEN);
\r
2789 <I><FONT COLOR="Navy">// save := reginput; </FONT></I>
\r
2790 save := startp [no]; <I><FONT COLOR="Navy">//###0.936 </FONT></I>
\r
2791 startp [no] := reginput; <I><FONT COLOR="Navy">//###0.936 </FONT></I>
\r
2792 Result := MatchPrim (next);
\r
2793 <B>if</B> <B>not</B> Result <I><FONT COLOR="Navy">//###0.936 </FONT></I>
\r
2794 <B>then</B> startp [no] := save;
\r
2795 <I><FONT COLOR="Navy">// if Result and (startp [no] = nil) </FONT></I>
\r
2796 <I><FONT COLOR="Navy">// then startp [no] := save; </FONT></I>
\r
2797 <I><FONT COLOR="Navy">// Don't set startp if some later invocation of the same </FONT></I>
\r
2798 <I><FONT COLOR="Navy">// parentheses already has. </FONT></I>
\r
2801 Succ (CLOSE) .. TREOp (Ord (CLOSE) + NSUBEXP - 1): <B>begin</B> <I><FONT COLOR="Navy">//###0.929 </FONT></I>
\r
2802 no := ord (scan^) - ord (CLOSE);
\r
2803 <I><FONT COLOR="Navy">// save := reginput; </FONT></I>
\r
2804 save := endp [no]; <I><FONT COLOR="Navy">//###0.936 </FONT></I>
\r
2805 endp [no] := reginput; <I><FONT COLOR="Navy">//###0.936 </FONT></I>
\r
2806 Result := MatchPrim (next);
\r
2807 <B>if</B> <B>not</B> Result <I><FONT COLOR="Navy">//###0.936 </FONT></I>
\r
2808 <B>then</B> endp [no] := save;
\r
2809 <I><FONT COLOR="Navy">// if Result and (endp [no] = nil) </FONT></I>
\r
2810 <I><FONT COLOR="Navy">// then endp [no] := save; </FONT></I>
\r
2811 <I><FONT COLOR="Navy">// Don't set endp if some later invocation of the same </FONT></I>
\r
2812 <I><FONT COLOR="Navy">// parentheses already has. </FONT></I>
\r
2815 BRANCH: <B>begin</B>
\r
2816 <B>if</B> (next^ <> BRANCH) <I><FONT COLOR="Navy">// No choice. </FONT></I>
\r
2817 <B>then</B> next := scan + REOpSz + RENextOffSz <I><FONT COLOR="Navy">// Avoid recursion </FONT></I>
\r
2818 <B>else</B> <B>begin</B>
\r
2820 save := reginput;
\r
2821 Result := MatchPrim (scan + REOpSz + RENextOffSz);
\r
2823 <B>then</B> EXIT;
\r
2824 reginput := save;
\r
2825 scan := regnext (scan);
\r
2826 <B>UNTIL</B> (scan = <B>nil</B>) <B>or</B> (scan^ <> BRANCH);
\r
2830 <I><FONT COLOR="Navy">{$IFDEF ComplexBraces}</FONT></I>
\r
2831 LOOPENTRY: <B>begin</B> <I><FONT COLOR="Navy">//###0.925 </FONT></I>
\r
2832 no := LoopStackIdx;
\r
2833 inc (LoopStackIdx);
\r
2834 <B>if</B> LoopStackIdx > LoopStackMax <B>then</B> <B>begin</B>
\r
2835 Error (reeLoopStackExceeded);
\r
2838 save := reginput;
\r
2839 LoopStack [LoopStackIdx] := 0; <I><FONT COLOR="Navy">// init loop counter </FONT></I>
\r
2840 Result := MatchPrim (next); <I><FONT COLOR="Navy">// execute LOOP </FONT></I>
\r
2841 LoopStackIdx := no; <I><FONT COLOR="Navy">// cleanup </FONT></I>
\r
2843 <B>then</B> EXIT;
\r
2844 reginput := save;
\r
2847 LOOP: <B>begin</B> <I><FONT COLOR="Navy">//###0.925 </FONT></I>
\r
2848 <B>if</B> LoopStackIdx <= 0 <B>then</B> <B>begin</B>
\r
2849 Error (reeLoopWithoutEntry);
\r
2852 opnd := scan + PRENextOff (scan + REOpSz + RENextOffSz + 2 * REBracesArgSz)^;
\r
2853 BracesMin := PREBracesArg (scan + REOpSz + RENextOffSz)^;
\r
2854 BracesMax := PREBracesArg (scan + REOpSz + RENextOffSz + REBracesArgSz)^;
\r
2855 save := reginput;
\r
2856 <B>if</B> LoopStack [LoopStackIdx] >= BracesMin <B>then</B> <B>begin</B>
\r
2857 <I><FONT COLOR="Navy">// greedy way ;) </FONT></I>
\r
2858 <B>if</B> LoopStack [LoopStackIdx] < BracesMax <B>then</B> <B>begin</B>
\r
2859 inc (LoopStack [LoopStackIdx]);
\r
2860 no := LoopStackIdx;
\r
2861 Result := MatchPrim (opnd);
\r
2862 LoopStackIdx := no;
\r
2864 <B>then</B> EXIT;
\r
2865 reginput := save;
\r
2867 dec (LoopStackIdx);
\r
2868 Result := MatchPrim (next);
\r
2869 <B>if</B> <B>not</B> Result
\r
2870 <B>then</B> reginput := save;
\r
2873 <B>else</B> <B>begin</B> <I><FONT COLOR="Navy">// first match a min_cnt times </FONT></I>
\r
2874 inc (LoopStack [LoopStackIdx]);
\r
2875 no := LoopStackIdx;
\r
2876 Result := MatchPrim (opnd);
\r
2877 LoopStackIdx := no;
\r
2879 <B>then</B> EXIT;
\r
2880 dec (LoopStack [LoopStackIdx]);
\r
2881 reginput := save;
\r
2885 <I><FONT COLOR="Navy">{$ENDIF}</FONT></I>
\r
2886 STAR, PLUS, BRACES: <B>begin</B>
\r
2887 <I><FONT COLOR="Navy">// Lookahead to avoid useless match attempts when we know </FONT></I>
\r
2888 <I><FONT COLOR="Navy">// what character comes next. </FONT></I>
\r
2890 <B>if</B> next^ = EXACTLY
\r
2891 <B>then</B> nextch := (next + REOpSz + RENextOffSz)^;
\r
2892 BracesMax := MaxInt; <I><FONT COLOR="Navy">// infinite loop for * and + //###0.92 </FONT></I>
\r
2893 <B>if</B> scan^ = STAR
\r
2894 <B>then</B> BracesMin := 0 <I><FONT COLOR="Navy">// STAR </FONT></I>
\r
2895 <B>else</B> <B>if</B> scan^ = PLUS
\r
2896 <B>then</B> BracesMin := 1 <I><FONT COLOR="Navy">// PLUS </FONT></I>
\r
2897 <B>else</B> <B>begin</B> <I><FONT COLOR="Navy">// BRACES </FONT></I>
\r
2898 BracesMin := PREBracesArg (scan + REOpSz + RENextOffSz)^;
\r
2899 BracesMax := PREBracesArg (scan + REOpSz + RENextOffSz + REBracesArgSz)^;
\r
2901 save := reginput;
\r
2902 opnd := scan + REOpSz + RENextOffSz;
\r
2903 <B>if</B> scan^ = BRACES
\r
2904 <B>then</B> inc (opnd, 2 * REBracesArgSz);
\r
2905 no := regrepeat (opnd, BracesMax); <I><FONT COLOR="Navy">// don't repeat more than max_cnt </FONT></I>
\r
2906 <B>while</B> no >= BracesMin <B>do</B> <B>begin</B>
\r
2907 <I><FONT COLOR="Navy">// If it could work, try it. </FONT></I>
\r
2908 <B>if</B> (nextch = #0) <B>or</B> (reginput^ = nextch) <B>then</B> <B>begin</B>
\r
2909 <I><FONT COLOR="Navy">{$IFDEF ComplexBraces}</FONT></I>
\r
2910 System.Move (LoopStack, SavedLoopStack, SizeOf (LoopStack)); <I><FONT COLOR="Navy">//###0.925 </FONT></I>
\r
2911 SavedLoopStackIdx := LoopStackIdx;
\r
2912 <I><FONT COLOR="Navy">{$ENDIF}</FONT></I>
\r
2913 <B>if</B> MatchPrim (next) <B>then</B> <B>begin</B>
\r
2917 <I><FONT COLOR="Navy">{$IFDEF ComplexBraces}</FONT></I>
\r
2918 System.Move (SavedLoopStack, LoopStack, SizeOf (LoopStack));
\r
2919 LoopStackIdx := SavedLoopStackIdx;
\r
2920 <I><FONT COLOR="Navy">{$ENDIF}</FONT></I>
\r
2922 dec (no); <I><FONT COLOR="Navy">// Couldn't or didn't - back up. </FONT></I>
\r
2923 reginput := save + no;
\r
2924 <B>end</B>; <I><FONT COLOR="Navy">{ of while}</FONT></I>
\r
2927 EEND: <B>begin</B>
\r
2928 Result := true; <I><FONT COLOR="Navy">// Success! </FONT></I>
\r
2931 <B>else</B> <B>begin</B>
\r
2932 Error (reeMatchPrimMemoryCorruption);
\r
2935 <B>end</B>; <I><FONT COLOR="Navy">{ of case scan^}</FONT></I>
\r
2937 <B>end</B>; <I><FONT COLOR="Navy">{ of while scan <> nil}</FONT></I>
\r
2939 <I><FONT COLOR="Navy">// We get here only if there's trouble -- normally "case EEND" is the </FONT></I>
\r
2940 <I><FONT COLOR="Navy">// terminating point. </FONT></I>
\r
2941 Error (reeMatchPrimCorruptedPointers);
\r
2942 <B>end</B>; <I><FONT COLOR="Navy">{ of function TRegExpr.MatchPrim
\r
2943 --------------------------------------------------------------}</FONT></I>
\r
2945 <I><FONT COLOR="Navy">{$IFDEF UseFirstCharSet}</FONT></I> <I><FONT COLOR="Navy">//###0.929 </FONT></I>
\r
2946 <B>procedure</B> TRegExpr.FillFirstCharSet (prog : PRegExprChar);
\r
2948 scan : PRegExprChar; <I><FONT COLOR="Navy">// Current node. </FONT></I>
\r
2949 next : PRegExprChar; <I><FONT COLOR="Navy">// Next node. </FONT></I>
\r
2950 opnd : PRegExprChar;
\r
2951 min_cnt : integer;
\r
2954 <B>while</B> scan <> <B>nil</B> <B>do</B> <B>begin</B>
\r
2955 next := regnext (scan);
\r
2956 <B>case</B> PREOp (scan)^ <B>of</B>
\r
2957 BSUBEXP, BSUBEXPCI: <B>begin</B> <I><FONT COLOR="Navy">//###0.938 </FONT></I>
\r
2958 FirstCharSet := [#0 .. #255]; <I><FONT COLOR="Navy">// :((( we cannot </FONT></I>
\r
2959 <I><FONT COLOR="Navy">// optimize r.e. if it starts with back reference </FONT></I>
\r
2962 BOL: ; <I><FONT COLOR="Navy">// EXIT; //###0.937 </FONT></I>
\r
2963 EOL: ; <I><FONT COLOR="Navy">// EXIT; //###0.937 </FONT></I>
\r
2964 ANY: <B>begin</B>
\r
2965 FirstCharSet := [#0 .. #255]; <I><FONT COLOR="Navy">//###0.930 </FONT></I>
\r
2968 ANYDIGIT: <B>begin</B>
\r
2969 FirstCharSet := FirstCharSet + ['0' .. '9'];
\r
2972 NOTDIGIT: <B>begin</B>
\r
2973 FirstCharSet := [#0 .. #255] - ['0' .. '9'];
\r
2976 EXACTLYCI: <B>begin</B>
\r
2977 Include (FirstCharSet, (scan + REOpSz + RENextOffSz)^);
\r
2978 Include (FirstCharSet, InvertCase ((scan + REOpSz + RENextOffSz)^));
\r
2981 EXACTLY: <B>begin</B>
\r
2982 Include (FirstCharSet, (scan + REOpSz + RENextOffSz)^);
\r
2985 ANYOFFULLSET: <B>begin</B>
\r
2986 FirstCharSet := FirstCharSet + PSetOfREChar (scan + REOpSz + RENextOffSz)^;
\r
2989 ANYOFTINYSET: <B>begin</B>
\r
2990 <I><FONT COLOR="Navy">//!!!TinySet </FONT></I>
\r
2991 Include (FirstCharSet, (scan + REOpSz + RENextOffSz)^);
\r
2992 Include (FirstCharSet, (scan + REOpSz + RENextOffSz + 1)^);
\r
2993 Include (FirstCharSet, (scan + REOpSz + RENextOffSz + 2)^);
\r
2994 <I><FONT COLOR="Navy">// ... // up to TinySetLen </FONT></I>
\r
2997 ANYBUTTINYSET: <B>begin</B>
\r
2998 <I><FONT COLOR="Navy">//!!!TinySet </FONT></I>
\r
2999 FirstCharSet := [#0 .. #255];
\r
3000 Exclude (FirstCharSet, (scan + REOpSz + RENextOffSz)^);
\r
3001 Exclude (FirstCharSet, (scan + REOpSz + RENextOffSz + 1)^);
\r
3002 Exclude (FirstCharSet, (scan + REOpSz + RENextOffSz + 2)^);
\r
3003 <I><FONT COLOR="Navy">// ... // up to TinySetLen </FONT></I>
\r
3009 Succ (OPEN) .. TREOp (Ord (OPEN) + NSUBEXP - 1) : <B>begin</B> <I><FONT COLOR="Navy">//###0.929 </FONT></I>
\r
3010 FillFirstCharSet (next);
\r
3013 Succ (CLOSE) .. TREOp (Ord (CLOSE) + NSUBEXP - 1): <B>begin</B> <I><FONT COLOR="Navy">//###0.929 </FONT></I>
\r
3014 FillFirstCharSet (next);
\r
3017 BRANCH: <B>begin</B>
\r
3018 <B>if</B> (PREOp (next)^ <> BRANCH) <I><FONT COLOR="Navy">// No choice. </FONT></I>
\r
3019 <B>then</B> next := scan + REOpSz + RENextOffSz <I><FONT COLOR="Navy">// Avoid recursion. </FONT></I>
\r
3020 <B>else</B> <B>begin</B>
\r
3022 FillFirstCharSet (scan + REOpSz + RENextOffSz);
\r
3023 scan := regnext (scan);
\r
3024 <B>UNTIL</B> (scan = <B>nil</B>) <B>or</B> (PREOp (scan)^ <> BRANCH);
\r
3028 <I><FONT COLOR="Navy">{$IFDEF ComplexBraces}</FONT></I>
\r
3029 LOOPENTRY: <B>begin</B> <I><FONT COLOR="Navy">//###0.925 </FONT></I>
\r
3030 LoopStack [LoopStackIdx] := 0; <I><FONT COLOR="Navy">// init loop counter </FONT></I>
\r
3031 FillFirstCharSet (next); <I><FONT COLOR="Navy">// execute LOOP </FONT></I>
\r
3034 LOOP: <B>begin</B> <I><FONT COLOR="Navy">//###0.925 </FONT></I>
\r
3035 opnd := scan + PRENextOff (scan + REOpSz + RENextOffSz + REBracesArgSz * 2)^;
\r
3036 min_cnt := PREBracesArg (scan + REOpSz + RENextOffSz)^;
\r
3037 FillFirstCharSet (opnd);
\r
3038 <B>if</B> min_cnt = 0
\r
3039 <B>then</B> FillFirstCharSet (next);
\r
3042 <I><FONT COLOR="Navy">{$ENDIF}</FONT></I>
\r
3044 FillFirstCharSet (scan + REOpSz + RENextOffSz);
\r
3045 PLUS: <B>begin</B>
\r
3046 FillFirstCharSet (scan + REOpSz + RENextOffSz);
\r
3049 BRACES: <B>begin</B>
\r
3050 opnd := scan + REOpSz + RENextOffSz + REBracesArgSz * 2;
\r
3051 min_cnt := PREBracesArg (scan + REOpSz + RENextOffSz)^; <I><FONT COLOR="Navy">// BRACES </FONT></I>
\r
3052 FillFirstCharSet (opnd);
\r
3053 <B>if</B> min_cnt > 0
\r
3054 <B>then</B> EXIT;
\r
3056 EEND: <B>begin</B>
\r
3059 <B>else</B> <B>begin</B>
\r
3060 Error (reeMatchPrimMemoryCorruption);
\r
3063 <B>end</B>; <I><FONT COLOR="Navy">{ of case scan^}</FONT></I>
\r
3065 <B>end</B>; <I><FONT COLOR="Navy">{ of while scan <> nil}</FONT></I>
\r
3066 <B>end</B>; <I><FONT COLOR="Navy">{ of procedure FillFirstCharSet;
\r
3067 --------------------------------------------------------------}</FONT></I>
\r
3068 <I><FONT COLOR="Navy">{$ENDIF}</FONT></I>
\r
3070 <B>function</B> TRegExpr.RegMatch (str : PRegExprChar) : boolean;
\r
3071 <I><FONT COLOR="Navy">// try match at specific point </FONT></I>
\r
3072 <B>var</B> i : integer;
\r
3074 <B>for</B> i := 0 <B>to</B> NSUBEXP - 1 <B>do</B> <B>begin</B>
\r
3075 startp [i] := <B>nil</B>;
\r
3076 endp [i] := <B>nil</B>;
\r
3079 Result := MatchPrim (programm + REOpSz);
\r
3080 <B>if</B> Result <B>then</B> <B>begin</B>
\r
3081 startp [0] := str;
\r
3082 endp [0] := reginput;
\r
3084 <B>end</B>; <I><FONT COLOR="Navy">{ of function TRegExpr.RegMatch
\r
3085 --------------------------------------------------------------}</FONT></I>
\r
3087 <B>function</B> TRegExpr.Exec (<B>const</B> AInputString : RegExprString) : boolean;
\r
3089 InputString := AInputString;
\r
3090 Result := ExecPrim (1);
\r
3091 <B>end</B>; <I><FONT COLOR="Navy">{ of function TRegExpr.Exec
\r
3092 --------------------------------------------------------------}</FONT></I>
\r
3094 <B>function</B> TRegExpr.ExecPrim (AOffset: integer) : boolean;
\r
3096 s : PRegExprChar;
\r
3097 StartPtr: PRegExprChar;
\r
3098 InputLen : integer;
\r
3100 Result := false; <I><FONT COLOR="Navy">// Be paranoid... </FONT></I>
\r
3102 <B>if</B> <B>not</B> IsProgrammOk <I><FONT COLOR="Navy">//###0.929 </FONT></I>
\r
3103 <B>then</B> EXIT;
\r
3105 <I><FONT COLOR="Navy">// Check InputString presence </FONT></I>
\r
3106 <B>if</B> <B>not</B> Assigned (fInputString) <B>then</B> <B>begin</B>
\r
3107 Error (reeNoInpitStringSpecified);
\r
3111 InputLen := length (fInputString);
\r
3113 <I><FONT COLOR="Navy">//Check that the start position is not negative </FONT></I>
\r
3114 <B>if</B> AOffset < 1 <B>then</B> <B>begin</B>
\r
3115 Error (reeOffsetMustBeGreaterThen0);
\r
3118 <I><FONT COLOR="Navy">// Check that the start position is not longer than the line </FONT></I>
\r
3119 <I><FONT COLOR="Navy">// If so then exit with nothing found </FONT></I>
\r
3120 <B>if</B> AOffset > (InputLen + 1) <I><FONT COLOR="Navy">// for matching empty string after last char. </FONT></I>
\r
3121 <B>then</B> EXIT;
\r
3123 StartPtr := fInputString + AOffset - 1;
\r
3125 <I><FONT COLOR="Navy">// If there is a "must appear" string, look for it. </FONT></I>
\r
3126 <B>if</B> regmust <> <B>nil</B> <B>then</B> <B>begin</B>
\r
3129 s := StrScan (s, regmust [0]);
\r
3130 <B>if</B> s <> <B>nil</B> <B>then</B> <B>begin</B>
\r
3131 <B>if</B> StrLComp (s, regmust, regmlen) = 0
\r
3132 <B>then</B> BREAK; <I><FONT COLOR="Navy">// Found it. </FONT></I>
\r
3135 <B>UNTIL</B> s = <B>nil</B>;
\r
3136 <B>if</B> s = <B>nil</B> <I><FONT COLOR="Navy">// Not present. </FONT></I>
\r
3137 <B>then</B> EXIT;
\r
3140 <I><FONT COLOR="Navy">// Mark beginning of line for ^ . </FONT></I>
\r
3141 fInputStart := fInputString;
\r
3143 <I><FONT COLOR="Navy">// Pointer to end of input stream - for </FONT></I>
\r
3144 <I><FONT COLOR="Navy">// pascal-style string processing (may include #0) </FONT></I>
\r
3145 fInputEnd := fInputString + InputLen;
\r
3147 <I><FONT COLOR="Navy">{$IFDEF ComplexBraces}</FONT></I>
\r
3148 <I><FONT COLOR="Navy">// no loops started </FONT></I>
\r
3149 LoopStackIdx := 0; <I><FONT COLOR="Navy">//###0.925 </FONT></I>
\r
3150 <I><FONT COLOR="Navy">{$ENDIF}</FONT></I>
\r
3152 <I><FONT COLOR="Navy">// Simplest case: anchored match need be tried only once. </FONT></I>
\r
3153 <B>if</B> reganch <> #0 <B>then</B> <B>begin</B>
\r
3154 Result := RegMatch (StartPtr);
\r
3158 <I><FONT COLOR="Navy">// Messy cases: unanchored match. </FONT></I>
\r
3160 <B>if</B> regstart <> #0 <B>then</B> <I><FONT COLOR="Navy">// We know what char it must start with. </FONT></I>
\r
3162 s := StrScan (s, regstart);
\r
3163 <B>if</B> s <> <B>nil</B> <B>then</B> <B>begin</B>
\r
3164 Result := RegMatch (s);
\r
3166 <B>then</B> EXIT;
\r
3169 <B>UNTIL</B> s = <B>nil</B>
\r
3170 <B>else</B> <B>begin</B> <I><FONT COLOR="Navy">// We don't - general case. </FONT></I>
\r
3171 <I><FONT COLOR="Navy">{$IFDEF UseFirstCharSet}</FONT></I> <I><FONT COLOR="Navy">//###0.929 </FONT></I>
\r
3172 <B>while</B> s^ <> #0 <B>do</B> <B>begin</B>
\r
3173 <B>if</B> s^ <B>in</B> FirstCharSet
\r
3174 <B>then</B> Result := RegMatch (s);
\r
3176 <B>then</B> EXIT;
\r
3179 <I><FONT COLOR="Navy">{$ELSE}</FONT></I>
\r
3181 Result := RegMatch (s);
\r
3183 <B>then</B> EXIT;
\r
3185 <B>UNTIL</B> s^ = #0;
\r
3186 <I><FONT COLOR="Navy">{$ENDIF}</FONT></I>
\r
3188 <I><FONT COLOR="Navy">// Failure </FONT></I>
\r
3189 <B>end</B>; <I><FONT COLOR="Navy">{ of function TRegExpr.ExecPrim
\r
3190 --------------------------------------------------------------}</FONT></I>
\r
3192 <B>function</B> TRegExpr.ExecNext : boolean;
\r
3193 <B>var</B> offset : integer;
\r
3196 <B>if</B> <B>not</B> Assigned (startp[0]) <B>or</B> <B>not</B> Assigned (endp[0]) <B>then</B> <B>begin</B>
\r
3197 Error (reeExecNextWithoutExec);
\r
3200 <I><FONT COLOR="Navy">// Offset := MatchPos [0] + MatchLen [0]; </FONT></I>
\r
3201 <I><FONT COLOR="Navy">// if MatchLen [0] = 0 </FONT></I>
\r
3202 Offset := endp [0] - fInputString + 1; <I><FONT COLOR="Navy">//###0.929 </FONT></I>
\r
3203 <B>if</B> endp [0] = startp [0] <I><FONT COLOR="Navy">//###0.929 </FONT></I>
\r
3204 <B>then</B> inc (Offset); <I><FONT COLOR="Navy">// prevent infinite looping if empty string match r.e. </FONT></I>
\r
3205 Result := ExecPrim (Offset);
\r
3206 <B>end</B>; <I><FONT COLOR="Navy">{ of function TRegExpr.ExecNext
\r
3207 --------------------------------------------------------------}</FONT></I>
\r
3209 <B>function</B> TRegExpr.ExecPos (AOffset: integer <I><FONT COLOR="Navy">{$IFDEF D4_}</FONT></I>= 1<I><FONT COLOR="Navy">{$ENDIF}</FONT></I>) : boolean;
\r
3211 Result := ExecPrim (AOffset);
\r
3212 <B>end</B>; <I><FONT COLOR="Navy">{ of function TRegExpr.ExecPos
\r
3213 --------------------------------------------------------------}</FONT></I>
\r
3215 <B>function</B> TRegExpr.GetInputString : RegExprString;
\r
3217 <B>if</B> <B>not</B> Assigned (fInputString) <B>then</B> <B>begin</B>
\r
3218 Error (reeGetInputStringWithoutInputString);
\r
3221 Result := fInputString;
\r
3222 <B>end</B>; <I><FONT COLOR="Navy">{ of function TRegExpr.GetInputString
\r
3223 --------------------------------------------------------------}</FONT></I>
\r
3225 <B>procedure</B> TRegExpr.SetInputString (<B>const</B> AInputString : RegExprString);
\r
3230 <I><FONT COLOR="Navy">// clear Match* - before next Exec* call it's undefined </FONT></I>
\r
3231 <B>for</B> i := 0 <B>to</B> NSUBEXP - 1 <B>do</B> <B>begin</B>
\r
3232 startp [i] := <B>nil</B>;
\r
3233 endp [i] := <B>nil</B>;
\r
3236 <I><FONT COLOR="Navy">// need reallocation of input string buffer ? </FONT></I>
\r
3237 Len := length (AInputString);
\r
3238 <B>if</B> Assigned (fInputString) <B>and</B> (Length (fInputString) <> Len) <B>then</B> <B>begin</B>
\r
3239 FreeMem (fInputString);
\r
3240 fInputString := <B>nil</B>;
\r
3242 <I><FONT COLOR="Navy">// buffer [re]allocation </FONT></I>
\r
3243 <B>if</B> <B>not</B> Assigned (fInputString)
\r
3244 <B>then</B> GetMem (fInputString, (Len + 1) * SizeOf (REChar));
\r
3246 <I><FONT COLOR="Navy">// copy input string into buffer </FONT></I>
\r
3247 <I><FONT COLOR="Navy">{$IFDEF UniCode}</FONT></I>
\r
3248 StrPCopy (fInputString, Copy (AInputString, 1, Len)); <I><FONT COLOR="Navy">//###0.927 </FONT></I>
\r
3249 <I><FONT COLOR="Navy">{$ELSE}</FONT></I>
\r
3250 StrLCopy (fInputString, PRegExprChar (AInputString), Len);
\r
3251 <I><FONT COLOR="Navy">{$ENDIF}</FONT></I>
\r
3253 <I><FONT COLOR="Navy">{
\r
3254 fInputString : string;
\r
3255 fInputStart, fInputEnd : PRegExprChar;
\r
3258 fInputString := AInputString;
\r
3259 UniqueString (fInputString);
\r
3260 fInputStart := PChar (fInputString);
\r
3261 Len := length (fInputString);
\r
3262 fInputEnd := PRegExprChar (integer (fInputStart) + Len); ??
\r
3263 !! startp/endp âñå ðàâíî áóäåò îïàñíî èñïîëüçîâàòü ?
\r
3265 <B>end</B>; <I><FONT COLOR="Navy">{ of procedure TRegExpr.SetInputString
\r
3266 --------------------------------------------------------------}</FONT></I>
\r
3268 <B>function</B> TRegExpr.Substitute (<B>const</B> ATemplate : RegExprString) : RegExprString;
\r
3269 <I><FONT COLOR="Navy">// perform substitutions after a regexp match </FONT></I>
\r
3270 <I><FONT COLOR="Navy">// completely rewritten in 0.929 </FONT></I>
\r
3272 TemplateLen : integer;
\r
3273 TemplateBeg, TemplateEnd : PRegExprChar;
\r
3274 p, p0, ResultPtr : PRegExprChar;
\r
3275 ResultLen : integer;
\r
3278 <B>function</B> ParseVarName (<B>var</B> APtr : PRegExprChar) : integer;
\r
3279 <I><FONT COLOR="Navy">// extract name of variable (digits, may be enclosed with </FONT></I>
\r
3280 <I><FONT COLOR="Navy">// curly braces) from APtr^, uses TemplateEnd !!! </FONT></I>
\r
3282 Digits = ['0' .. '9'];
\r
3284 p : PRegExprChar;
\r
3285 Delimited : boolean;
\r
3289 Delimited := (p < TemplateEnd) <B>and</B> (p^ = '{');
\r
3290 <B>if</B> Delimited
\r
3291 <B>then</B> inc (p); <I><FONT COLOR="Navy">// skip left curly brace </FONT></I>
\r
3292 <B>if</B> (p < TemplateEnd) <B>and</B> (p^ = '&')
\r
3293 <B>then</B> inc (p) <I><FONT COLOR="Navy">// this is '$&' or '${&}' </FONT></I>
\r
3295 <B>while</B> (p < TemplateEnd) <B>and</B>
\r
3296 <I><FONT COLOR="Navy">{$IFDEF UniCode}</FONT></I> <I><FONT COLOR="Navy">//###0.935 </FONT></I>
\r
3297 (ord (p^) < 256) <B>and</B> (char (p^) <B>in</B> Digits)
\r
3298 <I><FONT COLOR="Navy">{$ELSE}</FONT></I>
\r
3299 (p^ <B>in</B> Digits)
\r
3300 <I><FONT COLOR="Navy">{$ENDIF}</FONT></I>
\r
3301 <B>do</B> <B>begin</B>
\r
3302 inc (Result, ord (p^) - ord ('0'));
\r
3305 <B>if</B> Delimited <B>then</B>
\r
3306 <B>if</B> (p < TemplateEnd) <B>and</B> (p^ = '}')
\r
3307 <B>then</B> inc (p) <I><FONT COLOR="Navy">// skip right curly brace </FONT></I>
\r
3308 <B>else</B> p := APtr; <I><FONT COLOR="Navy">// isn't properly terminated </FONT></I>
\r
3309 <B>if</B> p = APtr
\r
3310 <B>then</B> Result := -1; <I><FONT COLOR="Navy">// no valid digits found or no right curly brace </FONT></I>
\r
3314 <I><FONT COLOR="Navy">// Check programm and input string </FONT></I>
\r
3315 <B>if</B> <B>not</B> IsProgrammOk
\r
3316 <B>then</B> EXIT;
\r
3317 <B>if</B> <B>not</B> Assigned (fInputString) <B>then</B> <B>begin</B>
\r
3318 Error (reeNoInpitStringSpecified);
\r
3321 <I><FONT COLOR="Navy">// Prepare for working </FONT></I>
\r
3322 TemplateLen := length (ATemplate);
\r
3323 <B>if</B> TemplateLen = 0 <B>then</B> <B>begin</B> <I><FONT COLOR="Navy">// prevent nil pointers </FONT></I>
\r
3327 TemplateBeg := pointer (ATemplate);
\r
3328 TemplateEnd := TemplateBeg + TemplateLen;
\r
3329 <I><FONT COLOR="Navy">// Count result length for speed optimization. </FONT></I>
\r
3331 p := TemplateBeg;
\r
3332 <B>while</B> p < TemplateEnd <B>do</B> <B>begin</B>
\r
3335 <B>if</B> Ch = '$'
\r
3336 <B>then</B> n := ParseVarName (p)
\r
3337 <B>else</B> n := -1;
\r
3338 <B>if</B> n >= 0 <B>then</B> <B>begin</B>
\r
3339 <B>if</B> (n < NSUBEXP) <B>and</B> Assigned (startp [n]) <B>and</B> Assigned (endp [n])
\r
3340 <B>then</B> inc (ResultLen, endp [n] - startp [n]);
\r
3342 <B>else</B> <B>begin</B>
\r
3343 <B>if</B> (Ch = '/') <B>and</B> (p < TemplateEnd)
\r
3344 <B>then</B> inc (p); <I><FONT COLOR="Navy">// quoted or special char followed </FONT></I>
\r
3348 <I><FONT COLOR="Navy">// Get memory. We do it once and it significant speed up work ! </FONT></I>
\r
3349 <B>if</B> ResultLen = 0 <B>then</B> <B>begin</B>
\r
3353 SetString (Result, <B>nil</B>, ResultLen);
\r
3354 <I><FONT COLOR="Navy">// Fill Result </FONT></I>
\r
3355 ResultPtr := pointer (Result);
\r
3356 p := TemplateBeg;
\r
3357 <B>while</B> p < TemplateEnd <B>do</B> <B>begin</B>
\r
3360 <B>if</B> Ch = '$'
\r
3361 <B>then</B> n := ParseVarName (p)
\r
3362 <B>else</B> n := -1;
\r
3363 <B>if</B> n >= 0 <B>then</B> <B>begin</B>
\r
3364 p0 := startp [n];
\r
3365 <B>if</B> (n < NSUBEXP) <B>and</B> Assigned (p0) <B>and</B> Assigned (endp [n]) <B>then</B>
\r
3366 <B>while</B> p0 < endp [n] <B>do</B> <B>begin</B>
\r
3367 ResultPtr^ := p0^;
\r
3372 <B>else</B> <B>begin</B>
\r
3373 <B>if</B> (Ch = '/') <B>and</B> (p < TemplateEnd) <B>then</B> <B>begin</B> <I><FONT COLOR="Navy">// quoted or special char followed </FONT></I>
\r
3377 ResultPtr^ := Ch;
\r
3381 <B>end</B>; <I><FONT COLOR="Navy">{ of function TRegExpr.Substitute
\r
3382 --------------------------------------------------------------}</FONT></I>
\r
3384 <I><FONT COLOR="Navy">(*
\r
3385 function TRegExpr.Substitute (const ATemplate : RegExprString) : RegExprString;
\r
3386 // perform substitutions after a regexp match
\r
3388 src : integer; // PRegExprChar; //###0.927
\r
3394 if not IsProgrammOk //###0.929
\r
3397 src := 1; // PRegExprChar (ATemplate); //###0.927
\r
3398 while src <= Length (ATemplate) { ^ <> #0} do begin //###0.927
\r
3399 c := ATemplate [src]; // src^; //###0.927
\r
3401 c2 := ATemplate [src]; //###0.927
\r
3404 else if (c = '/') and ('0' <= c2) and (c2 <= '9')
\r
3406 no := ord (c2) - ord ('0');
\r
3411 if no < 0 then begin // Ordinary character.
\r
3412 if (c = '/') and ((c2 = '/') or (c2 = '&')) then begin
\r
3413 c := c2; // src^;
\r
3416 Result := Result + c;
\r
3418 else Result := Result + Match [no]; //###0.90
\r
3420 end; { of function TRegExpr.Substitute
\r
3421 --------------------------------------------------------------}
\r
3424 <B>procedure</B> TRegExpr.Split (AInputStr : RegExprString; APieces : TStrings);
\r
3425 <B>var</B> PrevPos : integer;
\r
3428 <B>if</B> Exec (AInputStr) <B>then</B>
\r
3430 APieces.Add (System.Copy (AInputStr, PrevPos, MatchPos [0] - PrevPos));
\r
3431 PrevPos := MatchPos [0] + MatchLen [0];
\r
3432 <B>UNTIL</B> <B>not</B> ExecNext;
\r
3433 APieces.Add (System.Copy (AInputStr, PrevPos, MaxInt)); <I><FONT COLOR="Navy">// Tail </FONT></I>
\r
3434 <B>end</B>; <I><FONT COLOR="Navy">{ of procedure TRegExpr.Split
\r
3435 --------------------------------------------------------------}</FONT></I>
\r
3437 <B>function</B> TRegExpr.Replace (AInputStr : RegExprString; <B>const</B> AReplaceStr : RegExprString) : RegExprString;
\r
3438 <B>var</B> PrevPos : integer;
\r
3442 <B>if</B> Exec (AInputStr) <B>then</B>
\r
3444 Result := Result + System.Copy (AInputStr, PrevPos,
\r
3445 MatchPos [0] - PrevPos) + AReplaceStr;
\r
3446 PrevPos := MatchPos [0] + MatchLen [0];
\r
3447 <B>UNTIL</B> <B>not</B> ExecNext;
\r
3448 Result := Result + System.Copy (AInputStr, PrevPos, MaxInt); <I><FONT COLOR="Navy">// Tail </FONT></I>
\r
3449 <B>end</B>; <I><FONT COLOR="Navy">{ of function TRegExpr.Replace
\r
3450 --------------------------------------------------------------}</FONT></I>
\r
3453 <I><FONT COLOR="Navy">{=============================================================}</FONT></I>
\r
3454 <I><FONT COLOR="Navy">{====================== Debug section ========================}</FONT></I>
\r
3455 <I><FONT COLOR="Navy">{=============================================================}</FONT></I>
\r
3457 <I><FONT COLOR="Navy">{$IFDEF DebugRegExpr}</FONT></I>
\r
3458 <B>function</B> TRegExpr.DumpOp (op : TREOp) : RegExprString;
\r
3459 <I><FONT COLOR="Navy">// printable representation of opcode </FONT></I>
\r
3461 <B>case</B> op <B>of</B>
\r
3462 BOL: Result := 'BOL';
\r
3463 EOL: Result := 'EOL';
\r
3464 ANY: Result := 'ANY';
\r
3465 ANYLETTER: Result := 'ANYLETTER';
\r
3466 NOTLETTER: Result := 'NOTLETTER';
\r
3467 ANYDIGIT: Result := 'ANYDIGIT';
\r
3468 NOTDIGIT: Result := 'NOTDIGIT';
\r
3469 ANYSPACE: Result := 'ANYSPACE';
\r
3470 NOTSPACE: Result := 'NOTSPACE';
\r
3471 ANYOF: Result := 'ANYOF';
\r
3472 ANYBUT: Result := 'ANYBUT';
\r
3473 ANYOFCI: Result := 'ANYOF/CI';
\r
3474 ANYBUTCI: Result := 'ANYBUT/CI';
\r
3475 BRANCH: Result := 'BRANCH';
\r
3476 EXACTLY: Result := 'EXACTLY';
\r
3477 EXACTLYCI: Result := 'EXACTLY/CI';
\r
3478 NOTHING: Result := 'NOTHING';
\r
3479 COMMENT: Result := 'COMMENT';
\r
3480 BACK: Result := 'BACK';
\r
3481 EEND: Result := 'END';
\r
3482 BSUBEXP: Result := 'BSUBEXP';
\r
3483 BSUBEXPCI: Result := 'BSUBEXP/CI';
\r
3484 Succ (OPEN) .. TREOp (Ord (OPEN) + NSUBEXP - 1): <I><FONT COLOR="Navy">//###0.929 </FONT></I>
\r
3485 Result := Format ('OPEN[%d]', [ord (op) - ord (OPEN)]);
\r
3486 Succ (CLOSE) .. TREOp (Ord (CLOSE) + NSUBEXP - 1): <I><FONT COLOR="Navy">//###0.929 </FONT></I>
\r
3487 Result := Format ('CLOSE[%d]', [ord (op) - ord (CLOSE)]);
\r
3488 STAR: Result := 'STAR';
\r
3489 PLUS: Result := 'PLUS';
\r
3490 BRACES: Result := 'BRACES';
\r
3491 <I><FONT COLOR="Navy">{$IFDEF ComplexBraces}</FONT></I>
\r
3492 LOOPENTRY: Result := 'LOOPENTRY'; <I><FONT COLOR="Navy">//###0.925 </FONT></I>
\r
3493 LOOP: Result := 'LOOP'; <I><FONT COLOR="Navy">//###0.925 </FONT></I>
\r
3494 <I><FONT COLOR="Navy">{$ENDIF}</FONT></I>
\r
3495 ANYOFTINYSET: Result:= 'ANYOFTINYSET';
\r
3496 ANYBUTTINYSET:Result:= 'ANYBUTTINYSET';
\r
3497 <I><FONT COLOR="Navy">{$IFDEF UseSetOfChar}</FONT></I> <I><FONT COLOR="Navy">//###0.929 </FONT></I>
\r
3498 ANYOFFULLSET: Result:= 'ANYOFFULLSET';
\r
3499 <I><FONT COLOR="Navy">{$ENDIF}</FONT></I>
\r
3500 <B>else</B> Error (reeDumpCorruptedOpcode);
\r
3501 <B>end</B>; <I><FONT COLOR="Navy">{of case op}</FONT></I>
\r
3502 Result := ':' + Result;
\r
3503 <B>end</B>; <I><FONT COLOR="Navy">{ of function TRegExpr.DumpOp
\r
3504 --------------------------------------------------------------}</FONT></I>
\r
3506 <B>function</B> TRegExpr.Dump : RegExprString;
\r
3507 <I><FONT COLOR="Navy">// dump a regexp in vaguely comprehensible form </FONT></I>
\r
3509 s : PRegExprChar;
\r
3510 op : TREOp; <I><FONT COLOR="Navy">// Arbitrary non-END op. </FONT></I>
\r
3511 next : PRegExprChar;
\r
3513 <I><FONT COLOR="Navy">{$IFDEF UseSetOfChar}</FONT></I> <I><FONT COLOR="Navy">//###0.929 </FONT></I>
\r
3515 <I><FONT COLOR="Navy">{$ENDIF}</FONT></I>
\r
3517 <B>if</B> <B>not</B> IsProgrammOk <I><FONT COLOR="Navy">//###0.929 </FONT></I>
\r
3518 <B>then</B> EXIT;
\r
3522 s := programm + REOpSz;
\r
3523 <B>while</B> op <> EEND <B>do</B> <B>begin</B> <I><FONT COLOR="Navy">// While that wasn't END last time... </FONT></I>
\r
3525 Result := Result + Format ('%2d%s', [s - programm, DumpOp (s^)]); <I><FONT COLOR="Navy">// Where, what. </FONT></I>
\r
3526 next := regnext (s);
\r
3527 <B>if</B> next = <B>nil</B> <I><FONT COLOR="Navy">// Next ptr. </FONT></I>
\r
3528 <B>then</B> Result := Result + ' (0)'
\r
3529 <B>else</B> Result := Result + Format (' (%d) ', [(s - programm) + (next - s)]);
\r
3530 inc (s, REOpSz + RENextOffSz);
\r
3531 <B>if</B> (op = ANYOF) <B>or</B> (op = ANYOFCI) <B>or</B> (op = ANYBUT) <B>or</B> (op = ANYBUTCI)
\r
3532 <B>or</B> (op = EXACTLY) <B>or</B> (op = EXACTLYCI) <B>then</B> <B>begin</B>
\r
3533 <I><FONT COLOR="Navy">// Literal string, where present. </FONT></I>
\r
3534 <B>while</B> s^ <> #0 <B>do</B> <B>begin</B>
\r
3535 Result := Result + s^;
\r
3540 <B>if</B> (op = ANYOFTINYSET) <B>or</B> (op = ANYBUTTINYSET) <B>then</B> <B>begin</B>
\r
3541 <B>for</B> i := 1 <B>to</B> TinySetLen <B>do</B> <B>begin</B>
\r
3542 Result := Result + s^;
\r
3546 <B>if</B> (op = BSUBEXP) <B>or</B> (op = BSUBEXPCI) <B>then</B> <B>begin</B>
\r
3547 Result := Result + ' /' + IntToStr (Ord (s^));
\r
3550 <I><FONT COLOR="Navy">{$IFDEF UseSetOfChar}</FONT></I> <I><FONT COLOR="Navy">//###0.929 </FONT></I>
\r
3551 <B>if</B> op = ANYOFFULLSET <B>then</B> <B>begin</B>
\r
3552 <B>for</B> Ch := #0 <B>to</B> #255 <B>do</B>
\r
3553 <B>if</B> Ch <B>in</B> PSetOfREChar (s)^ <B>then</B>
\r
3554 <B>if</B> Ch < ' '
\r
3555 <B>then</B> Result := Result + '#' + IntToStr (Ord (Ch)) <I><FONT COLOR="Navy">//###0.936 </FONT></I>
\r
3556 <B>else</B> Result := Result + Ch;
\r
3557 inc (s, SizeOf (TSetOfREChar));
\r
3559 <I><FONT COLOR="Navy">{$ENDIF}</FONT></I>
\r
3560 <B>if</B> (op = BRACES) <B>then</B> <B>begin</B>
\r
3561 <I><FONT COLOR="Navy">// show min/max argument of BRACES operator </FONT></I>
\r
3562 Result := Result + Format ('{%d,%d}', [PREBracesArg (s)^, PREBracesArg (s + REBracesArgSz)^]);
\r
3563 inc (s, REBracesArgSz * 2);
\r
3565 <I><FONT COLOR="Navy">{$IFDEF ComplexBraces}</FONT></I>
\r
3566 <B>if</B> op = LOOP <B>then</B> <B>begin</B> <I><FONT COLOR="Navy">//###0.925 </FONT></I>
\r
3567 Result := Result + Format (' -> (%d) {%d,%d}', [
\r
3568 (s - programm - (REOpSz + RENextOffSz)) + PRENextOff (s + 2 * REBracesArgSz)^,
\r
3569 PREBracesArg (s)^, PREBracesArg (s + REBracesArgSz)^]);
\r
3570 inc (s, 2 * REBracesArgSz + RENextOffSz);
\r
3572 <I><FONT COLOR="Navy">{$ENDIF}</FONT></I>
\r
3573 Result := Result + #$d#$a;
\r
3574 <B>end</B>; <I><FONT COLOR="Navy">{ of while}</FONT></I>
\r
3576 <I><FONT COLOR="Navy">// Header fields of interest. </FONT></I>
\r
3578 <B>if</B> regstart <> #0
\r
3579 <B>then</B> Result := Result + 'start ' + regstart;
\r
3580 <B>if</B> reganch <> #0
\r
3581 <B>then</B> Result := Result + 'anchored ';
\r
3582 <B>if</B> regmust <> <B>nil</B>
\r
3583 <B>then</B> Result := Result + 'must have ' + regmust;
\r
3584 <I><FONT COLOR="Navy">{$IFDEF UseFirstCharSet}</FONT></I> <I><FONT COLOR="Navy">//###0.929 </FONT></I>
\r
3585 Result := Result + #$d#$a'FirstCharSet:';
\r
3586 <B>for</B> Ch := #0 <B>to</B> #255 <B>do</B>
\r
3587 <B>if</B> Ch <B>in</B> FirstCharSet
\r
3588 <B>then</B> Result := Result + Ch;
\r
3589 <I><FONT COLOR="Navy">{$ENDIF}</FONT></I>
\r
3590 Result := Result + #$d#$a;
\r
3591 <B>end</B>; <I><FONT COLOR="Navy">{ of function TRegExpr.Dump
\r
3592 --------------------------------------------------------------}</FONT></I>
\r
3593 <I><FONT COLOR="Navy">{$ENDIF}</FONT></I>
\r
3595 <I><FONT COLOR="Navy">{$IFDEF reRealExceptionAddr}</FONT></I>
\r
3596 <I><FONT COLOR="Navy">{$OPTIMIZATION ON}</FONT></I>
\r
3597 <I><FONT COLOR="Navy">// ReturnAddr works correctly only if compiler optimization is ON </FONT></I>
\r
3598 <I><FONT COLOR="Navy">// I placed this method at very end of unit because there are no </FONT></I>
\r
3599 <I><FONT COLOR="Navy">// way to restore compiler optimization flag ... </FONT></I>
\r
3600 <I><FONT COLOR="Navy">{$ENDIF}</FONT></I>
\r
3601 <B>procedure</B> TRegExpr.Error (AErrorID : integer);
\r
3602 <I><FONT COLOR="Navy">{$IFDEF reRealExceptionAddr}</FONT></I>
\r
3603 <B>function</B> ReturnAddr : pointer; <I><FONT COLOR="Navy">//###0.938 </FONT></I>
\r
3607 <I><FONT COLOR="Navy">{$ENDIF}</FONT></I>
\r
3611 fLastError := AErrorID; <I><FONT COLOR="Navy">// dummy stub - useless because will raise exception </FONT></I>
\r
3612 <B>if</B> AErrorID < 1000 <I><FONT COLOR="Navy">// compilation error ? </FONT></I>
\r
3613 <B>then</B> e := ERegExpr.Create (ErrorMsg (AErrorID) <I><FONT COLOR="Navy">// yes - show error pos </FONT></I>
\r
3614 + ' (pos ' + IntToStr (CompilerErrorPos) + ')')
\r
3615 <B>else</B> e := ERegExpr.Create (ErrorMsg (AErrorID));
\r
3616 e.ErrorCode := AErrorID;
\r
3617 e.CompilerErrorPos := CompilerErrorPos;
\r
3619 <I><FONT COLOR="Navy">{$IFDEF reRealExceptionAddr}</FONT></I>
\r
3620 At ReturnAddr; <I><FONT COLOR="Navy">//###0.938 </FONT></I>
\r
3621 <I><FONT COLOR="Navy">{$ENDIF}</FONT></I>
\r
3622 <B>end</B>; <I><FONT COLOR="Navy">{ of procedure TRegExpr.Error
\r
3623 --------------------------------------------------------------}</FONT></I>
\r
3625 <I><FONT COLOR="Navy">// be carefull - placed here code will be always compiled with </FONT></I>
\r
3626 <I><FONT COLOR="Navy">// compiler optimization flag </FONT></I>
\r