ABENREGEX_XPATH_SYNTAX_SPECIALS - REGEX XPATH SYNTAX SPECIALS

This documentation is copyright by SAP AG.

Special Characters

The following tables summarize the special characters in XPath regular expressions.

Pattern Syntax

Single Character Escapes

Syntax Description
\n line feed (0x0A)
\r carriage return (0x0D)
\t tab (0x09)
\\ literal \
\| literal |
\. literal .
\- literal -
\^ literal ^
\$ literal $
\? literal ?
\* literal *
\+ literal +
\{ literal {
\} literal }
\( literal (
\) literal )
\[ literal [
\] literal ]

If the RELAXED_ESCAPES option is enabled when an XPath regular expression is constructed using CL_ABAP_REGEX=>CREATE_XPATH2, a character sequence \x that does not appear in the table above and has no other special meaning matches a literal x. If the RELAXED_ESCAPES option is disabled, such a character sequence \x raises an exception.
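
The strict and relaxed handling of escapes can be illustrated with a small Python sketch. It is not the ABAP implementation: the helper names `unknown_escapes` and `check_pattern` are hypothetical, the set of escapes with "another special meaning" is assumed to be the multi character escapes and backreferences listed on this page, and a Python `ValueError` merely stands in for the exception raised by the ABAP class.

```python
# Escapes from the "Single Character Escapes" table above.
SINGLE_CHAR_ESCAPES = set("nrt\\|.-^$?*+{}()[]")
# Assumption: these are the escapes that have "another special meaning"
# (multi character escapes; digits 1-9 start a backreference).
OTHER_ESCAPES = set("dDpPsSwWiIcC") | set("123456789")


def unknown_escapes(pattern):
    """Return every backslash sequence that appears in neither set."""
    found = []
    i = 0
    while i < len(pattern):
        if pattern[i] == "\\" and i + 1 < len(pattern):
            ch = pattern[i + 1]
            if ch not in SINGLE_CHAR_ESCAPES and ch not in OTHER_ESCAPES:
                found.append("\\" + ch)
            i += 2  # skip the escaped character
        else:
            i += 1
    return found


def check_pattern(pattern, relaxed_escapes):
    """Hypothetical helper: accept unknown escapes only in relaxed mode."""
    unknown = unknown_escapes(pattern)
    if unknown and not relaxed_escapes:
        raise ValueError("unknown escape(s): " + ", ".join(unknown))
    return pattern
```

With this sketch, `check_pattern(r"a\qb", relaxed_escapes=True)` accepts the pattern (where `\q` would stand for a literal q), while the same call with `relaxed_escapes=False` raises an error.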

Multi Character Escapes

Syntax Description
. any character except line feed and carriage return
\d a digit (respecting Unicode character properties)
\D a character that is not a digit
\p{xx} a character with the xx Unicode character property (see below)
\P{xx} a character without the xx Unicode character property (see below)
\s a white space character (respecting Unicode character properties)
\S a character that is not a white space character
\w a "word" character (respecting Unicode character properties)
\W a "non-word" character
\i a character that may be the first character of an XML name
\I a character that may not be the first character of an XML name
\c a character that may occur after the first character in an XML name
\C a character that may not occur after the first character in an XML name

Category Escapes

Syntax Description
\p{xx} a character with the xx Unicode character property (see below)
\P{xx} a character without the xx Unicode character property (see below)

General Categories for Properties \p and \P

Based on the general categories as defined by the Unicode standard.

Category Identifier Description
C Other
Cc Control
Cf Format
Cn Unassigned
Co Private use
L Letter
Ll Lower case letter
Lm Modifier letter
Lo Other letter
Lt Title case letter
Lu Upper case letter
M Mark
Mc Spacing mark
Me Enclosing mark
Mn Non-spacing mark
N Number
Nd Decimal number
Nl Letter number
No Other number
P Punctuation
Pc Connector punctuation
Pd Dash punctuation
Pe Close punctuation
Pf Final punctuation
Pi Initial punctuation
Po Other punctuation
Ps Open punctuation
S Symbol
Sc Currency symbol
Sk Modifier symbol
Sm Mathematical symbol
So Other symbol
Z Separator
Zl Line separator
Zp Paragraph separator
Zs Space separator
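
As a rough illustration of what the category identifiers select, the following Python sketch looks up the general category of a character with the standard `unicodedata` module. The helper name `has_category` is hypothetical, and the Unicode version of the Python runtime may differ from the one used by the regular expression engine, so edge cases can disagree.

```python
import unicodedata


def has_category(ch, identifier):
    # A one-letter identifier such as "L" covers all of its two-letter
    # subcategories ("Ll", "Lu", ...), so a prefix comparison is enough.
    return unicodedata.category(ch).startswith(identifier)


print(has_category("A", "Lu"))       # upper case letter
print(has_category("A", "L"))        # any letter
print(has_category("\u20ac", "Sc"))  # euro sign, a currency symbol
print(has_category("5", "Nd"))       # decimal number
```

Negating the result corresponds to the \P form of the same identifier.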

Block Names for Properties \p and \P

Based on the block names as defined by the Unicode standard.

The following block names can be used regardless of the current UNICODE_HANDLING setting:

Block Identifier Start Code End Code
IsBasicLatin #x0000 #x007F
IsLatin-1Supplement #x0080 #x00FF
IsLatinExtended-A #x0100 #x017F
IsLatinExtended-B #x0180 #x024F
IsIPAExtensions #x0250 #x02AF
IsSpacingModifierLetters #x02B0 #x02FF
IsCombiningDiacriticalMarks #x0300 #x036F
IsGreek #x0370 #x03FF
IsCyrillic #x0400 #x04FF
IsArmenian #x0530 #x058F
IsHebrew #x0590 #x05FF
IsArabic #x0600 #x06FF
IsSyriac #x0700 #x074F
IsThaana #x0780 #x07BF
IsDevanagari #x0900 #x097F
IsBengali #x0980 #x09FF
IsGurmukhi #x0A00 #x0A7F
IsGujarati #x0A80 #x0AFF
IsOriya #x0B00 #x0B7F
IsTamil #x0B80 #x0BFF
IsTelugu #x0C00 #x0C7F
IsKannada #x0C80 #x0CFF
IsMalayalam #x0D00 #x0D7F
IsSinhala #x0D80 #x0DFF
IsThai #x0E00 #x0E7F
IsLao #x0E80 #x0EFF
IsTibetan #x0F00 #x0FFF
IsMyanmar #x1000 #x109F
IsGeorgian #x10A0 #x10FF
IsHangulJamo #x1100 #x11FF
IsEthiopic #x1200 #x137F
IsCherokee #x13A0 #x13FF
IsUnifiedCanadianAboriginalSyllabics #x1400 #x167F
IsOgham #x1680 #x169F
IsRunic #x16A0 #x16FF
IsKhmer #x1780 #x17FF
IsMongolian #x1800 #x18AF
IsLatinExtendedAdditional #x1E00 #x1EFF
IsGreekExtended #x1F00 #x1FFF
IsGeneralPunctuation #x2000 #x206F
IsSuperscriptsandSubscripts #x2070 #x209F
IsCurrencySymbols #x20A0 #x20CF
IsCombiningMarksforSymbols #x20D0 #x20FF
IsLetterlikeSymbols #x2100 #x214F
IsNumberForms #x2150 #x218F
IsArrows #x2190 #x21FF
IsMathematicalOperators #x2200 #x22FF
IsMiscellaneousTechnical #x2300 #x23FF
IsControlPictures #x2400 #x243F
IsOpticalCharacterRecognition #x2440 #x245F
IsEnclosedAlphanumerics #x2460 #x24FF
IsBoxDrawing #x2500 #x257F
IsBlockElements #x2580 #x259F
IsGeometricShapes #x25A0 #x25FF
IsMiscellaneousSymbols #x2600 #x26FF
IsDingbats #x2700 #x27BF
IsBraillePatterns #x2800 #x28FF
IsCJKRadicalsSupplement #x2E80 #x2EFF
IsKangxiRadicals #x2F00 #x2FDF
IsIdeographicDescriptionCharacters #x2FF0 #x2FFF
IsCJKSymbolsandPunctuation #x3000 #x303F
IsHiragana #x3040 #x309F
IsKatakana #x30A0 #x30FF
IsBopomofo #x3100 #x312F
IsHangulCompatibilityJamo #x3130 #x318F
IsKanbun #x3190 #x319F
IsBopomofoExtended #x31A0 #x31BF
IsEnclosedCJKLettersandMonths #x3200 #x32FF
IsCJKCompatibility #x3300 #x33FF
IsCJKUnifiedIdeographsExtensionA #x3400 #x4DB5
IsCJKUnifiedIdeographs #x4E00 #x9FFF
IsYiSyllables #xA000 #xA48F
IsYiRadicals #xA490 #xA4CF
IsHangulSyllables #xAC00 #xD7A3
IsPrivateUse #xE000 #xF8FF
IsCJKCompatibilityIdeographs #xF900 #xFAFF
IsAlphabeticPresentationForms #xFB00 #xFB4F
IsArabicPresentationForms-A #xFB50 #xFDFF
IsCombiningHalfMarks #xFE20 #xFE2F
IsCJKCompatibilityForms #xFE30 #xFE4F
IsSmallFormVariants #xFE50 #xFE6F
IsArabicPresentationForms-B #xFE70 #xFEFE
IsSpecials #xFEFF #xFEFF
IsHalfwidthandFullwidthForms #xFF00 #xFFEF
IsSpecials #xFFF0 #xFFFD

The following block names can only be used when UNICODE_HANDLING is set to STRICT or IGNORE, but not when it is set to RELAXED, because these blocks lie outside the Basic Multilingual Plane:

Block Identifier Start Code End Code
IsByzantineMusicalSymbols #x1D000 #x1D0FF
IsMusicalSymbols #x1D100 #x1D1FF
IsMathematicalAlphanumericSymbols #x1D400 #x1D7FF
IsCJKUnifiedIdeographsExtensionB #x20000 #x2A6D6
IsCJKCompatibilityIdeographsSupplement #x2F800 #x2FA1F
IsTags #xE0000 #xE007F
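
The block tables can be read as simple code point ranges. The Python sketch below copies a few rows from the tables above into a dictionary and checks membership. The names `BLOCKS`, `in_block`, and `outside_bmp` are hypothetical, and the check against #xFFFF is only meant to show which blocks start beyond the Basic Multilingual Plane.

```python
# A few rows copied from the tables above: identifier -> (start, end).
BLOCKS = {
    "IsBasicLatin": (0x0000, 0x007F),
    "IsGreek": (0x0370, 0x03FF),
    "IsCyrillic": (0x0400, 0x04FF),
    "IsMusicalSymbols": (0x1D100, 0x1D1FF),
    "IsTags": (0xE0000, 0xE007F),
}

BMP_END = 0xFFFF  # last code point of the Basic Multilingual Plane


def in_block(ch, identifier):
    """True if the code point of ch lies inside the named block."""
    start, end = BLOCKS[identifier]
    return start <= ord(ch) <= end


def outside_bmp(identifier):
    """True if the named block starts beyond the Basic Multilingual Plane."""
    start, _ = BLOCKS[identifier]
    return start > BMP_END


print(in_block("\u03bb", "IsGreek"))               # Greek small letter lambda
print(in_block("\U0001D11E", "IsMusicalSymbols"))  # musical symbol G clef
print(outside_bmp("IsMusicalSymbols"), outside_bmp("IsGreek"))
```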

Quantifiers

Syntax Description
? 0 or 1, greedy
?? 0 or 1, lazy
* 0 or more, greedy
*? 0 or more, lazy
+ 1 or more, greedy
+? 1 or more, lazy
{n} exactly n
{n,m} at least n, no more than m, greedy
{n,m}? at least n, no more than m, lazy
{n,} n or more, greedy
{n,}? n or more, lazy
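
The difference between greedy and lazy quantifiers can be tried out in any engine that shares this notation. The sketch below uses Python's `re` module, not the XPath engine, so it only demonstrates the general matching behavior of the quantifier forms listed above.

```python
import re

subject = "<a><b>"

# Greedy: + takes as many characters as possible.
greedy = re.search(r"<.+>", subject).group()

# Lazy: +? takes as few characters as possible.
lazy = re.search(r"<.+?>", subject).group()

# Bounded repetition, greedy and lazy.
bounded_greedy = re.search(r"a{2,3}", "aaaa").group()
bounded_lazy = re.search(r"a{2,3}?", "aaaa").group()

# Optional item, greedy and lazy.
optional_greedy = re.match(r"a?", "a").group()
optional_lazy = re.match(r"a??", "a").group()

print(greedy, lazy)                     # <a><b> <a>
print(bounded_greedy, bounded_lazy)     # aaa aa
print(repr(optional_greedy), repr(optional_lazy))  # 'a' ''
```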

Grouping and Capturing

Syntax Description
(...) capture group
(?:...) non-capture group
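
A short sketch in Python's `re` module, which uses the same two group notations, shows the practical difference: only capture groups are numbered and returned, while a non-capture group merely groups. The date-like subject is an arbitrary example.

```python
import re

# The middle group only groups; it is neither numbered nor returned.
match = re.fullmatch(r"(\d{4})-(?:\d{2})-(\d{2})", "2024-04-19")

captured = match.groups()
print(captured)  # ('2024', '19')
```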

Anchors

Syntax Description
^ start of subject; in multiline mode, also after an internal line break, that is, a line feed that does not occur at the end of the subject
$ end of subject; in multiline mode, also before a line break
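
The positions at which the two anchors can match, as described in the table above, can be sketched in plain Python. The functions `caret_positions` and `dollar_positions` are hypothetical and follow only the wording of the table; they were not checked against the actual engine. Other engines can differ in such details, which is why the sketch does not rely on a regular expression library.

```python
def caret_positions(subject, multiline):
    """Offsets where ^ can match, following the table description."""
    positions = [0]
    if multiline:
        for i, ch in enumerate(subject):
            # Only an internal line feed counts, not one at the very end.
            if ch == "\n" and i + 1 < len(subject):
                positions.append(i + 1)
    return positions


def dollar_positions(subject, multiline):
    """Offsets where $ can match, following the table description."""
    positions = []
    if multiline:
        positions += [i for i, ch in enumerate(subject) if ch == "\n"]
    positions.append(len(subject))
    return positions


print(caret_positions("a\nb\n", multiline=True))   # [0, 2]
print(dollar_positions("a\nb\n", multiline=True))  # [1, 3, 4]
```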

Backreferences

Syntax Description
\n reference to the capture group with number n. A capture group cannot be referenced from within itself. A backreference can be followed by more digits; a further digit is only taken into account if the resulting number is less than or equal to the number of opening parentheses seen so far in the pattern. For example, the pattern (a)(b)(c)(d)(e)(f)(g)(h)(i)(j)(k)\11 (which contains 11 capture groups) matches the string abcdefghijkk, but the pattern (a)(b)(c)(d)(e)(f)(g)(h)(i)(j)(k)\12 (which also contains 11 capture groups) matches the string abcdefghijka2, because \12 is read as \1 followed by a literal 2.
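
The digit rule for backreferences can be made concrete with a small Python sketch that resolves each backreference in a pattern. The function `resolve_backrefs` is hypothetical and simplified: it assumes that only unescaped opening parentheses of capture groups are counted (not those of (?:...) groups) and it ignores parentheses inside character classes.

```python
DIGITS = "0123456789"


def resolve_backrefs(pattern):
    """Return (group number, consumed text) for each backreference."""
    refs = []
    groups_open = 0  # capture groups opened so far
    i = 0
    while i < len(pattern):
        ch = pattern[i]
        if ch == "\\" and i + 1 < len(pattern):
            if pattern[i + 1] in "123456789":
                number = int(pattern[i + 1])
                j = i + 2
                # Take a further digit only while the resulting number
                # does not exceed the number of groups opened so far.
                while j < len(pattern) and pattern[j] in DIGITS:
                    candidate = number * 10 + int(pattern[j])
                    if candidate > groups_open:
                        break
                    number = candidate
                    j += 1
                refs.append((number, pattern[i:j]))
                i = j
            else:
                i += 2  # some other escape, skip it
        elif ch == "(":
            # Assumption: groups starting with "(?" do not capture.
            if not pattern.startswith("(?", i):
                groups_open += 1
            i += 1
        else:
            i += 1
    return refs


eleven_groups = "".join("(" + c + ")" for c in "abcdefghijk")
print(resolve_backrefs(eleven_groups + r"\11"))  # [(11, '\\11')]
print(resolve_backrefs(eleven_groups + r"\12"))  # [(1, '\\1')]
```

In the second call the trailing 2 is not consumed by the backreference, so it remains a literal character of the pattern.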

Alternation

Syntax Description
| start of alternative branch

Character Classes

Syntax Description
[...] positive character class
[^...] negative character class
[x-y] range
[a-[b]] character class subtraction (can be nested)
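
Character class subtraction behaves like a set difference. Because Python's `re` module has no such syntax, the sketch below models the classes as Python sets instead. The example classes written in the comments, [a-z-[aeiou]] and [a-z-[a-m-[c]]], follow the subtraction notation from the table, and the helper `char_range` is hypothetical.

```python
def char_range(first, last):
    """All characters from first to last inclusive, like the range [x-y]."""
    return {chr(cp) for cp in range(ord(first), ord(last) + 1)}


# Models [a-z-[aeiou]]: lower case Latin letters except the vowels.
consonants = char_range("a", "z") - set("aeiou")

# Models the nested form [a-z-[a-m-[c]]]: the inner subtraction is
# evaluated first, so c is put back into the result.
nested = char_range("a", "z") - (char_range("a", "m") - {"c"})

print("b" in consonants, "a" in consonants)  # True False
print("c" in nested, "d" in nested)          # True False
```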

Replacement Syntax

The syntax of replacement patterns for XPath regular expressions is the same as for PCRE regular expressions.





