ABENREGEX_XSD_SYNTAX_SPECIALS - REGEX XSD SYNTAX SPECIALS
This documentation is copyright by SAP AG.
- Special Characters
The following tables summarize the special characters in XSD regular expressions.
Pattern Syntax
Single Character Escapes
Syntax | Description |
\n | line feed (0x0A) |
\r | carriage return (0x0D) |
\t | tab (0x09) |
\\ | literal \ |
\| | literal \| (vertical bar) |
\. | literal . |
\- | literal - |
\^ | literal ^ |
\? | literal ? |
\* | literal * |
\+ | literal + |
\{ | literal { |
\} | literal } |
\( | literal ( |
\) | literal ) |
\[ | literal [ |
\] | literal ] |
If the RELAXED_ESCAPES option is enabled when an XSD regular expression is constructed using CL_ABAP_REGEX=>CREATE_XSD, a character sequence \x that does not appear in the table above and has no other special meaning matches a literal x. If the RELAXED_ESCAPES option is disabled, such a character sequence \x raises an exception.
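The difference between the two modes can be sketched in Python. This is only an illustration of the rule described above, not the ABAP implementation: the function name check_escapes and the two character sets are hypothetical, and the second set is assumed from the character escapes listed further down on this page.

```python
# Characters that may follow a backslash according to the table above.
SINGLE_CHAR_ESCAPES = set("nrt\\|.-^?*+{}()[]")
# Assumption: escapes with another special meaning (\d, \p{..}, \s, \w, \i, \c ...).
OTHER_ESCAPES = set("dDpPsSwWiIcC")


def check_escapes(pattern, relaxed):
    """Return the unknown escaped characters; raise ValueError if not relaxed."""
    unknown = []
    i = 0
    while i < len(pattern):
        if pattern[i] == "\\":
            if i + 1 == len(pattern):
                raise ValueError("dangling backslash at end of pattern")
            ch = pattern[i + 1]
            if ch not in SINGLE_CHAR_ESCAPES and ch not in OTHER_ESCAPES:
                if not relaxed:
                    # Strict mode: an unknown escape is an error.
                    raise ValueError("unknown escape \\" + ch)
                # Relaxed mode: the sequence would match the literal character.
                unknown.append(ch)
            i += 2  # skip the backslash and the escaped character
        else:
            i += 1
    return unknown
```

With relaxed=True, the pattern a\qb is accepted and q is reported as a character that would be matched literally. With relaxed=False, the same pattern raises an error.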
Character Escapes
Syntax | Description |
. | any character except line feed and carriage return |
\d | a digit (respecting Unicode character properties) |
\D | a character that is not a digit |
\p{xx} | a character with the xx Unicode character property (see below) |
\P{xx} | a character without the xx Unicode character property (see below) |
\s | a white space character (respecting Unicode character properties) |
\S | a character that is not a white space character |
\w | a "word" character (respecting Unicode character properties) |
\W | a "non-word" character |
\i | a character that may be the first character of an XML name |
\I | a character that may not be the first character of an XML name |
\c | a character that may occur after the first character in an XML name |
\C | a character that may not occur after the first character in an XML name |
Category Escapes
Syntax | Description |
\p{xx} | a character with the xx Unicode character property (see below) |
\P{xx} | a character without the xx Unicode character property (see below) |
General Categories for Properties \p and \P
Based on the general categories as defined by the Unicode standard.
Category Identifier | Description |
C | Other |
Cc | Control |
Cf | Format |
Cn | Unassigned |
Co | Private use |
L | Letter |
Ll | Lower case letter |
Lm | Modifier letter |
Lo | Other letter |
Lt | Title case letter |
Lu | Upper case letter |
M | Mark |
Mc | Spacing mark |
Me | Enclosing mark |
Mn | Non-spacing mark |
N | Number |
Nd | Decimal number |
Nl | Letter number |
No | Other number |
P | Punctuation |
Pc | Connector punctuation |
Pd | Dash punctuation |
Pe | Close punctuation |
Pf | Final punctuation |
Pi | Initial punctuation |
Po | Other punctuation |
Ps | Open punctuation |
S | Symbol |
Sc | Currency symbol |
Sk | Modifier symbol |
Sm | Mathematical symbol |
So | Other symbol |
Z | Separator |
Zl | Line separator |
Zp | Paragraph separator |
Zs | Space separator |
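As a rough illustration of how the one-letter and two-letter identifiers relate, the following Python sketch uses the standard unicodedata module. The helper has_category is a hypothetical name, and Python's Unicode database may differ in version from the one used by ABAP. Python also reports categories that are not listed above (for example Cs), which this sketch does not filter out.

```python
import unicodedata


def has_category(ch, identifier):
    # A one-letter identifier such as "L" covers all of its subcategories
    # ("Ll", "Lm", "Lo", "Lt", "Lu"); a two-letter identifier matches exactly.
    return unicodedata.category(ch).startswith(identifier)


print(has_category("A", "Lu"))  # upper case letter
print(has_category("A", "L"))   # any letter
print(has_category("5", "Nd"))  # decimal number
```

A character satisfies \p{xx} when has_category returns True for xx, and \P{xx} when it returns False.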
Block Names for Properties \p and \P
Based on the block names as defined by the Unicode standard.
The following block names can be used regardless of the current UNICODE_HANDLING setting:
Block Identifier | Start Code | End Code |
IsBasicLatin | #x0000 | #x007F |
IsLatin-1Supplement | #x0080 | #x00FF |
IsLatinExtended-A | #x0100 | #x017F |
IsLatinExtended-B | #x0180 | #x024F |
IsIPAExtensions | #x0250 | #x02AF |
IsSpacingModifierLetters | #x02B0 | #x02FF |
IsCombiningDiacriticalMarks | #x0300 | #x036F |
IsGreek | #x0370 | #x03FF |
IsCyrillic | #x0400 | #x04FF |
IsArmenian | #x0530 | #x058F |
IsHebrew | #x0590 | #x05FF |
IsArabic | #x0600 | #x06FF |
IsSyriac | #x0700 | #x074F |
IsThaana | #x0780 | #x07BF |
IsDevanagari | #x0900 | #x097F |
IsBengali | #x0980 | #x09FF |
IsGurmukhi | #x0A00 | #x0A7F |
IsGujarati | #x0A80 | #x0AFF |
IsOriya | #x0B00 | #x0B7F |
IsTamil | #x0B80 | #x0BFF |
IsTelugu | #x0C00 | #x0C7F |
IsKannada | #x0C80 | #x0CFF |
IsMalayalam | #x0D00 | #x0D7F |
IsSinhala | #x0D80 | #x0DFF |
IsThai | #x0E00 | #x0E7F |
IsLao | #x0E80 | #x0EFF |
IsTibetan | #x0F00 | #x0FFF |
IsMyanmar | #x1000 | #x109F |
IsGeorgian | #x10A0 | #x10FF |
IsHangulJamo | #x1100 | #x11FF |
IsEthiopic | #x1200 | #x137F |
IsCherokee | #x13A0 | #x13FF |
IsUnifiedCanadianAboriginalSyllabics | #x1400 | #x167F |
IsOgham | #x1680 | #x169F |
IsRunic | #x16A0 | #x16FF |
IsKhmer | #x1780 | #x17FF |
IsMongolian | #x1800 | #x18AF |
IsLatinExtendedAdditional | #x1E00 | #x1EFF |
IsGreekExtended | #x1F00 | #x1FFF |
IsGeneralPunctuation | #x2000 | #x206F |
IsSuperscriptsandSubscripts | #x2070 | #x209F |
IsCurrencySymbols | #x20A0 | #x20CF |
IsCombiningMarksforSymbols | #x20D0 | #x20FF |
IsLetterlikeSymbols | #x2100 | #x214F |
IsNumberForms | #x2150 | #x218F |
IsArrows | #x2190 | #x21FF |
IsMathematicalOperators | #x2200 | #x22FF |
IsMiscellaneousTechnical | #x2300 | #x23FF |
IsControlPictures | #x2400 | #x243F |
IsOpticalCharacterRecognition | #x2440 | #x245F |
IsEnclosedAlphanumerics | #x2460 | #x24FF |
IsBoxDrawing | #x2500 | #x257F |
IsBlockElements | #x2580 | #x259F |
IsGeometricShapes | #x25A0 | #x25FF |
IsMiscellaneousSymbols | #x2600 | #x26FF |
IsDingbats | #x2700 | #x27BF |
IsBraillePatterns | #x2800 | #x28FF |
IsCJKRadicalsSupplement | #x2E80 | #x2EFF |
IsKangxiRadicals | #x2F00 | #x2FDF |
IsIdeographicDescriptionCharacters | #x2FF0 | #x2FFF |
IsCJKSymbolsandPunctuation | #x3000 | #x303F |
IsHiragana | #x3040 | #x309F |
IsKatakana | #x30A0 | #x30FF |
IsBopomofo | #x3100 | #x312F |
IsHangulCompatibilityJamo | #x3130 | #x318F |
IsKanbun | #x3190 | #x319F |
IsBopomofoExtended | #x31A0 | #x31BF |
IsEnclosedCJKLettersandMonths | #x3200 | #x32FF |
IsCJKCompatibility | #x3300 | #x33FF |
IsCJKUnifiedIdeographsExtensionA | #x3400 | #x4DB5 |
IsCJKUnifiedIdeographs | #x4E00 | #x9FFF |
IsYiSyllables | #xA000 | #xA48F |
IsYiRadicals | #xA490 | #xA4CF |
IsHangulSyllables | #xAC00 | #xD7A3 |
IsPrivateUse | #xE000 | #xF8FF |
IsCJKCompatibilityIdeographs | #xF900 | #xFAFF |
IsAlphabeticPresentationForms | #xFB00 | #xFB4F |
IsArabicPresentationForms-A | #xFB50 | #xFDFF |
IsCombiningHalfMarks | #xFE20 | #xFE2F |
IsCJKCompatibilityForms | #xFE30 | #xFE4F |
IsSmallFormVariants | #xFE50 | #xFE6F |
IsArabicPresentationForms-B | #xFE70 | #xFEFE |
IsSpecials | #xFEFF | #xFEFF |
IsHalfwidthandFullwidthForms | #xFF00 | #xFFEF |
IsSpecials | #xFFF0 | #xFFFD |
The following block names can only be used when UNICODE_HANDLING is set to STRICT or IGNORE, but not when it is set to RELAXED, because these blocks lie outside the Basic Multilingual Plane:
Block Identifier | Start Code | End Code |
IsByzantineMusicalSymbols | #x1D000 | #x1D0FF |
IsMusicalSymbols | #x1D100 | #x1D1FF |
IsMathematicalAlphanumericSymbols | #x1D400 | #x1D7FF |
IsCJKUnifiedIdeographsExtensionB | #x20000 | #x2A6D6 |
IsCJKCompatibilityIdeographsSupplement | #x2F800 | #x2FA1F |
IsTags | #xE0000 | #xE007F |
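A block name is a test on the code point range. The Python sketch below copies a few rows from the two tables above and shows the range check, together with a check for whether a block starts beyond the Basic Multilingual Plane (code points above #xFFFF). The names BLOCKS, in_block, and outside_bmp are hypothetical and exist only for this illustration.

```python
# A few rows copied from the tables above: name -> (start, end), inclusive.
BLOCKS = {
    "IsBasicLatin": (0x0000, 0x007F),
    "IsGreek": (0x0370, 0x03FF),
    "IsCyrillic": (0x0400, 0x04FF),
    "IsMusicalSymbols": (0x1D100, 0x1D1FF),
}


def in_block(ch, name):
    """True if the code point of ch lies in the named block."""
    start, end = BLOCKS[name]
    return start <= ord(ch) <= end


def outside_bmp(name):
    """True if the block starts above the Basic Multilingual Plane."""
    start, _ = BLOCKS[name]
    return start > 0xFFFF
```

For example, in_block("\u03b1", "IsGreek") is true for the Greek small letter alpha, and outside_bmp("IsMusicalSymbols") is true, which corresponds to the second table.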
Quantifiers
Syntax | Description |
? | 0 or 1, greedy |
* | 0 or more, greedy |
+ | 1 or more, greedy |
{n} | exactly n |
{n,m} | at least n, no more than m, greedy |
{n,} | n or more, greedy |
Lazy quantifiers (also known as reluctant quantifiers) are not supported.
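Greedy matching can be demonstrated with Python's re module, which uses the same notation for these six quantifiers. Python's re is a different regular expression flavor and is used here only to show greedy behavior. The helper match_prefix is a hypothetical name.

```python
import re


def match_prefix(pattern, text):
    """Return the text matched at the start of the input, or None."""
    m = re.match(pattern, text)
    return m.group(0) if m else None


# Each greedy quantifier takes as many repetitions as its bounds allow.
print(match_prefix(r"a?", "aaaa"))      # at most one "a"
print(match_prefix(r"a*", "aaab"))      # all leading "a" characters
print(match_prefix(r"a{2}", "aaaa"))    # exactly two
print(match_prefix(r"a{2,3}", "aaaa"))  # upper bound of three is reached
print(match_prefix(r"a{2,}", "aaaa"))   # no upper bound
```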
Grouping and Capturing
Syntax | Description |
(...) | capture group |
Technically, the XSD standard has no concept of capturing groups, because the standard itself offers no means of referring to the content of a group. Since the ABAP implementation of XSD regular expressions supports a PCRE-style replacement syntax, all groups are treated as capturing.
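What capturing means can be illustrated with a short Python sketch. It uses Python's re module, not CL_ABAP_REGEX, so it shows only the general idea that each parenthesized group records the text it matched. The function split_year_month and its pattern are hypothetical examples.

```python
import re


def split_year_month(text):
    """Return (year, month) captured by the two groups, or None."""
    m = re.fullmatch(r"(\d{4})-(\d{2})", text)
    return m.groups() if m else None


print(split_year_month("2024-04"))
print(split_year_month("2024/04"))  # no match, so nothing is captured
```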
Alternation
Syntax | Description |
| | start of alternative branch |
Character Classes
Syntax | Description |
[...] | positive character class |
[^...] | negative character class |
[x-y] | range |
[a-[b]] | character class subtraction (can be nested) |
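Character class subtraction is a set difference. As a sketch, the XSD class [a-z-[aeiou]] (lowercase ASCII letters without the vowels) can be modeled in Python with sets. Python's re module has no subtraction syntax, so a negative lookahead is assumed here as an equivalent. The names consonants, CONSONANT_RE, and is_consonant are hypothetical.

```python
import re
import string

# [a-z-[aeiou]]: the base class minus the subtracted class.
base = set(string.ascii_lowercase)
subtracted = set("aeiou")
consonants = base - subtracted

# Equivalent in Python's flavor: reject the vowels, then match a-z.
CONSONANT_RE = re.compile(r"(?![aeiou])[a-z]")


def is_consonant(ch):
    """True if the single character ch is in [a-z] but not in [aeiou]."""
    return CONSONANT_RE.fullmatch(ch) is not None
```

Both formulations accept the same 21 characters. Nested subtraction corresponds to applying the set difference again to the subtracted class.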
Replacement Syntax
While the XSD standard only defines the match operation for regular expressions, the ABAP implementation also allows all other operations as defined by CL_ABAP_REGEX and CL_ABAP_MATCHER.
The syntax of replacement patterns for XSD regular expressions is the same as for PCRE regular expressions.