ABENREGEX_XSD_SYNTAX_SPECIALS - REGEX XSD SYNTAX SPECIALS

This documentation is copyright by SAP AG.

Special Characters

The following tables summarize the special characters in XSD regular expressions.

Pattern Syntax

Single Character Escapes

Syntax	Description
\n	line feed (0x0A)
\r	carriage return (0x0D)
\t	tab (0x09)
\\	literal \
\|	literal |
\.	literal .
\-	literal -
\^	literal ^
\?	literal ?
\*	literal *
\+	literal +
\{	literal {
\}	literal }
\(	literal (
\)	literal )
\[	literal [
\]	literal ]

If the RELAXED_ESCAPES option is enabled when an XSD regular expression is constructed with CL_ABAP_REGEX=>CREATE_XSD, a character sequence \x that does not appear in the table above and has no other special meaning matches a literal x. If the RELAXED_ESCAPES option is disabled, such a character sequence \x raises an exception.
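
The rule above can be emulated outside ABAP to check patterns in advance. The following Python sketch is only an illustration under stated assumptions: the function names unknown_escapes and check_pattern are hypothetical, the set of known escapes is taken from the tables in this document, and the real check is performed by CL_ABAP_REGEX=>CREATE_XSD, whose exception class is not named here.

```python
# Escapes listed in the tables of this document: the single character
# escapes plus the character and category escapes (\d, \p{..}, \i, ...).
SINGLE_CHAR_ESCAPES = set("nrt\\|.-^?*+{}()[]")
OTHER_ESCAPES = set("dDsSwWiIcCpP")


def unknown_escapes(pattern: str) -> list[str]:
    """Return the characters x of all sequences \\x without a defined meaning."""
    found = []
    i = 0
    while i < len(pattern):
        if pattern[i] == "\\" and i + 1 < len(pattern):
            x = pattern[i + 1]
            if x not in SINGLE_CHAR_ESCAPES and x not in OTHER_ESCAPES:
                found.append(x)
            i += 2  # skip the escaped character, so "\\\\" counts as one escape
        else:
            i += 1
    return found


def check_pattern(pattern: str, relaxed_escapes: bool) -> str:
    """Hypothetical emulation: reject unknown escapes unless relaxed."""
    bad = unknown_escapes(pattern)
    if bad and not relaxed_escapes:
        raise ValueError(f"unknown escape sequence(s): {bad}")
    return pattern


print(unknown_escapes(r"a\qb\d\."))   # ['q']
print(check_pattern(r"\q", True))     # accepted: \q would match a literal q
```

With relaxed_escapes set to False, the same call raises ValueError, which stands in for the exception mentioned above.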

Character Escapes

Syntax	Description
.	any character except line feed and carriage return
\d	a digit (respecting Unicode character properties)
\D	a character that is not a digit
\p{xx}	a character with the xx Unicode character property (see below)
\P{xx}	a character without the xx Unicode character property (see below)
\s	a white space character (respecting Unicode character properties)
\S	a character that is not a white space character
\w	a "word" character (respecting Unicode character properties)
\W	a "non-word" character
\i	a character that may be the first character of an XML name
\I	a character that may not be the first character of an XML name
\c	a character that may occur after the first character in an XML name
\C	a character that may not occur after the first character in an XML name

Category Escapes

Syntax	Description
\p{xx}	a character with the xx Unicode character property (see below)
\P{xx}	a character without the xx Unicode character property (see below)

General Categories for Properties \p and \P

The category identifiers are based on the general categories defined by the Unicode standard.

Category Identifier	Description
C	Other
Cc	Control
Cf	Format
Cn	Unassigned
Co	Private use
L	Letter
Ll	Lower case letter
Lm	Modifier letter
Lo	Other letter
Lt	Title case letter
Lu	Upper case letter
M	Mark
Mc	Spacing mark
Me	Enclosing mark
Mn	Non-spacing mark
N	Number
Nd	Decimal number
Nl	Letter number
No	Other number
P	Punctuation
Pc	Connector punctuation
Pd	Dash punctuation
Pe	Close punctuation
Pf	Final punctuation
Pi	Initial punctuation
Po	Other punctuation
Ps	Open punctuation
S	Symbol
Sc	Currency symbol
Sk	Modifier symbol
Sm	Mathematical symbol
So	Other symbol
Z	Separator
Zl	Line separator
Zp	Paragraph separator
Zs	Space separator
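
To see which category a given character belongs to, the general category can be looked up outside ABAP. The following Python sketch uses the standard unicodedata module; the helper has_category is a hypothetical name, and the Unicode version shipped with Python may differ from the one used by the ABAP runtime, so results for recently added characters are not guaranteed to agree.

```python
import unicodedata


def has_category(ch: str, ident: str) -> bool:
    """True if ch belongs to the category ident, e.g. "L" or "Lu"."""
    # unicodedata.category returns a two-letter code such as "Lu".
    # A one-letter identifier such as "L" covers every code starting with it.
    return unicodedata.category(ch).startswith(ident)


print(has_category("A", "Lu"))   # True:  what \p{Lu} would accept
print(has_category("A", "L"))    # True:  what \p{L} would accept
print(has_category("a", "Lu"))   # False: what \P{Lu} would accept
print(has_category("7", "Nd"))   # True
print(has_category(" ", "Zs"))   # True
```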

Block Names for Properties \p and \P

The block identifiers are based on the block names defined by the Unicode standard.

The following block names can be used regardless of the current UNICODE_HANDLING setting:

Block Identifier	Start Code	End Code
IsBasicLatin	#x0000	#x007F
IsLatin-1Supplement	#x0080	#x00FF
IsLatinExtended-A	#x0100	#x017F
IsLatinExtended-B	#x0180	#x024F
IsIPAExtensions	#x0250	#x02AF
IsSpacingModifierLetters	#x02B0	#x02FF
IsCombiningDiacriticalMarks	#x0300	#x036F
IsGreek	#x0370	#x03FF
IsCyrillic	#x0400	#x04FF
IsArmenian	#x0530	#x058F
IsHebrew	#x0590	#x05FF
IsArabic	#x0600	#x06FF
IsSyriac	#x0700	#x074F
IsThaana	#x0780	#x07BF
IsDevanagari	#x0900	#x097F
IsBengali	#x0980	#x09FF
IsGurmukhi	#x0A00	#x0A7F
IsGujarati	#x0A80	#x0AFF
IsOriya	#x0B00	#x0B7F
IsTamil	#x0B80	#x0BFF
IsTelugu	#x0C00	#x0C7F
IsKannada	#x0C80	#x0CFF
IsMalayalam	#x0D00	#x0D7F
IsSinhala	#x0D80	#x0DFF
IsThai	#x0E00	#x0E7F
IsLao	#x0E80	#x0EFF
IsTibetan	#x0F00	#x0FFF
IsMyanmar	#x1000	#x109F
IsGeorgian	#x10A0	#x10FF
IsHangulJamo	#x1100	#x11FF
IsEthiopic	#x1200	#x137F
IsCherokee	#x13A0	#x13FF
IsUnifiedCanadianAboriginalSyllabics	#x1400	#x167F
IsOgham	#x1680	#x169F
IsRunic	#x16A0	#x16FF
IsKhmer	#x1780	#x17FF
IsMongolian	#x1800	#x18AF
IsLatinExtendedAdditional	#x1E00	#x1EFF
IsGreekExtended	#x1F00	#x1FFF
IsGeneralPunctuation	#x2000	#x206F
IsSuperscriptsandSubscripts	#x2070	#x209F
IsCurrencySymbols	#x20A0	#x20CF
IsCombiningMarksforSymbols	#x20D0	#x20FF
IsLetterlikeSymbols	#x2100	#x214F
IsNumberForms	#x2150	#x218F
IsArrows	#x2190	#x21FF
IsMathematicalOperators	#x2200	#x22FF
IsMiscellaneousTechnical	#x2300	#x23FF
IsControlPictures	#x2400	#x243F
IsOpticalCharacterRecognition	#x2440	#x245F
IsEnclosedAlphanumerics	#x2460	#x24FF
IsBoxDrawing	#x2500	#x257F
IsBlockElements	#x2580	#x259F
IsGeometricShapes	#x25A0	#x25FF
IsMiscellaneousSymbols	#x2600	#x26FF
IsDingbats	#x2700	#x27BF
IsBraillePatterns	#x2800	#x28FF
IsCJKRadicalsSupplement	#x2E80	#x2EFF
IsKangxiRadicals	#x2F00	#x2FDF
IsIdeographicDescriptionCharacters	#x2FF0	#x2FFF
IsCJKSymbolsandPunctuation	#x3000	#x303F
IsHiragana	#x3040	#x309F
IsKatakana	#x30A0	#x30FF
IsBopomofo	#x3100	#x312F
IsHangulCompatibilityJamo	#x3130	#x318F
IsKanbun	#x3190	#x319F
IsBopomofoExtended	#x31A0	#x31BF
IsEnclosedCJKLettersandMonths	#x3200	#x32FF
IsCJKCompatibility	#x3300	#x33FF
IsCJKUnifiedIdeographsExtensionA	#x3400	#x4DB5
IsCJKUnifiedIdeographs	#x4E00	#x9FFF
IsYiSyllables	#xA000	#xA48F
IsYiRadicals	#xA490	#xA4CF
IsHangulSyllables	#xAC00	#xD7A3
IsPrivateUse	#xE000	#xF8FF
IsCJKCompatibilityIdeographs	#xF900	#xFAFF
IsAlphabeticPresentationForms	#xFB00	#xFB4F
IsArabicPresentationForms-A	#xFB50	#xFDFF
IsCombiningHalfMarks	#xFE20	#xFE2F
IsCJKCompatibilityForms	#xFE30	#xFE4F
IsSmallFormVariants	#xFE50	#xFE6F
IsArabicPresentationForms-B	#xFE70	#xFEFE
IsSpecials	#xFEFF	#xFEFF
IsHalfwidthandFullwidthForms	#xFF00	#xFFEF
IsSpecials	#xFFF0	#xFFFD

The following block names can be used only when UNICODE_HANDLING is set to STRICT or IGNORE, not when it is set to RELAXED, because these blocks lie entirely outside the Basic Multilingual Plane:

Block Identifier	Start Code	End Code
IsByzantineMusicalSymbols	#x1D000	#x1D0FF
IsMusicalSymbols	#x1D100	#x1D1FF
IsMathematicalAlphanumericSymbols	#x1D400	#x1D7FF
IsCJKUnifiedIdeographsExtensionB	#x20000	#x2A6D6
IsCJKCompatibilityIdeographsSupplement	#x2F800	#x2FA1F
IsTags	#xE0000	#xE007F
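
A block test is a range check on the code point. The following Python sketch illustrates this with a few rows copied from the tables above; the names BLOCKS, in_block, and outside_bmp are hypothetical and only serve the illustration. It also shows why the blocks in the second table are special: their start code is greater than #xFFFF.

```python
# A few rows from the block tables above: identifier -> (start, end).
BLOCKS = {
    "IsBasicLatin": (0x0000, 0x007F),
    "IsGreek": (0x0370, 0x03FF),
    "IsCyrillic": (0x0400, 0x04FF),
    "IsMusicalSymbols": (0x1D100, 0x1D1FF),
}


def in_block(ch: str, name: str) -> bool:
    """True if the code point of ch lies in the named block."""
    start, end = BLOCKS[name]
    return start <= ord(ch) <= end


def outside_bmp(name: str) -> bool:
    """True if the block starts beyond the Basic Multilingual Plane."""
    return BLOCKS[name][0] > 0xFFFF


print(in_block("\u03b1", "IsGreek"))              # True, GREEK SMALL LETTER ALPHA
print(in_block("a", "IsGreek"))                   # False
print(in_block("\U0001D11E", "IsMusicalSymbols")) # True, MUSICAL SYMBOL G CLEF
print(outside_bmp("IsMusicalSymbols"))            # True
```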

Quantifiers

Syntax	Description
?	0 or 1, greedy
*	0 or more, greedy
+	1 or more, greedy
{n}	exactly n
{n,m}	at least n, no more than m, greedy
{n,}	n or more, greedy

Lazy quantifiers (also known as reluctant quantifiers) are not supported.
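
Greedy means that a quantifier takes as many repetitions as the rest of the pattern allows. The following Python sketch demonstrates this with Python's re module, which is a different engine from the ABAP XSD implementation but uses the same notation for the quantifiers in the table; the helper match_at_start is a hypothetical name. Python also offers lazy forms such as *?, which are deliberately not used here because XSD does not support them.

```python
import re


def match_at_start(pattern: str, text: str):
    """Return the text matched at the start of `text`, or None."""
    m = re.match(pattern, text)
    return m.group() if m else None


print(match_at_start(r"a?", "aaaa"))       # a
print(match_at_start(r"a*", "aaab"))       # aaa
print(match_at_start(r"a{2,3}", "aaaa"))   # aaa
print(match_at_start(r"a{2,}", "aaaa"))    # aaaa
print(match_at_start(r"<.+>", "<a><b>"))   # <a><b>
```

In the last call, .+ runs to the final > instead of stopping at the first one, which is the typical effect of greedy matching.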

Grouping and Capturing

Syntax	Description
(...)	capture group

Technically, the concept of capturing groups does not exist in the XSD standard, because the standard provides no means of referring to the content of a group. Because the ABAP implementation of XSD regular expressions allows a PCRE-style replacement syntax, all groups are treated as capturing.

Alternation

Syntax	Description
|	start of alternative branch

Character Classes

Syntax	Description
[...]	positive character class
[^...]	negative character class
[x-y]	range
[a-[b]]	character class subtraction (can be nested)
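
Character class subtraction matches every character of the base class that is not in the subtracted class. For example, the XSD class [a-z-[aeiou]] matches lowercase ASCII consonants. Python's re module has no subtraction syntax, so the following sketch only emulates the semantics with a negative lookahead; it is not how the ABAP implementation works internally.

```python
import re

# XSD:    [a-z-[aeiou]]        base class a-z minus the class aeiou
# Python: (?![aeiou])[a-z]     reject the subtracted class, then match the base
CONSONANT = re.compile(r"(?![aeiou])[a-z]")

print(CONSONANT.findall("regex"))   # ['r', 'g', 'x']
```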

Replacement Syntax

While the XSD standard only defines the match operation for regular expressions, the ABAP implementation also allows all other operations as defined by CL_ABAP_REGEX and CL_ABAP_MATCHER.

The syntax of replacement patterns for XSD regular expressions is the same as for PCRE regular expressions.
