Regex

From Wikipedia
The search result is produced by the regex
(?<=\.) {2,}(?=[A-Z])
It matches two or more spaces, but only where they follow a period (.) and precede a capital letter.
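
As a rough sketch of how the pattern from the caption could be put to use, the following Perl snippet collapses such runs of spaces into a single space. The sample text and the variable names are invented for illustration.

```perl
use strict;
use warnings;

# Hypothetical sample text: extra spaces after two sentence-ending periods,
# plus a double space mid-sentence that should be left alone.
my $text = "First sentence.  Second one.   Third  one here.";

# (?<=\.)   lookbehind: a period must come directly before the spaces
# (?=[A-Z]) lookahead: a capital letter must come directly after them
# Neither assertion consumes characters, so only the spaces are replaced.
(my $fixed = $text) =~ s/(?<=\.) {2,}(?=[A-Z])/ /g;

print "$fixed\n";
```

The double space in "Third  one" survives because it follows no period and precedes no capital letter.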

A regex or regular expression (Bavarian: Regulära Ausdruck) is a sequence of characters that defines a search pattern.

Regexes are used in software development and also in text editors, where they serve to search for and replace strings. In a Wikipedia, for example, you can pick out all words that begin with A and end with -bichl, no matter which characters lie in between. A search of this kind is hard to express without a regex.
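
The "-bichl" search described above could look roughly like this in Perl. The word list is made up, and the pattern is only one of several ways to write such a search.

```perl
use strict;
use warnings;

# Hypothetical word list; the names are invented for this sketch.
my @words = ("Aschbichl", "Aubichl", "Bichl", "Anger", "Abichl", "Kirchbichl");

# ^A      the word starts with a capital A
# .*      any characters (possibly none) in between
# bichl$  the word ends with "bichl"
my @hits = grep { /^A.*bichl$/ } @words;

print join(", ", @hits), "\n";
```

Only "Aschbichl", "Aubichl" and "Abichl" satisfy both conditions; the other words fail either the start or the end test.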

The syntax of regexes varies slightly between applications.

Simple regex

Operator  Effect
.         The dot operator matches any single character.
[ ]       A bracket expression (box) matches a single character out of those listed inside the brackets.
[^ ]      A complemented bracket expression (complement box) matches a single character that is not listed inside the brackets.
^         The caret anchor matches the start of a line (or of every line in multiline mode).
$         The dollar anchor matches the end of a line (or of every line in multiline mode).
( )       Parentheses define a marked subexpression. The text it matched can be recalled later.
\n        n is a digit from 1 to 9; matches what the nth marked subexpression matched. This operator does not exist in the extended regex syntax.
*         A single-character expression followed by "*" matches zero or more copies of that expression. For example, "ab*c" matches "ac", "abc", "abbbc", etc.; "[xyz]*" matches "", "x", "y", "zx", "zyx", and so on.
  • \n*, where n is a digit from 1 to 9, matches zero or more iterations of what the nth marked subexpression matched. For example, "(a.)c\1*" matches "abcab" and "abcabab" but not "abcac".
  • An expression enclosed in "\(" and "\)" followed by "*" is considered invalid.
  • "^[MH]uad"
    • Matches "Muad" and "Huad", but only at the start of a line.
  • "[MH]uad$"
    • Matches "Muad" and "Huad", but only at the end of a line.
[egh]     one of the characters "e", "g" or "h"
[0-6]     a digit from "0" to "6" (hyphens indicate a range)
[A-Za-z0-9]  any Latin letter or any digit
[^a]      any character except "a" ("^" at the start of a character class means negation)
[-A-Z], [A-Z-]  (or [A-Z\-a-z], though not under POSIX) the set also contains the hyphen "-"
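
A few rows of the table can be checked with a short Perl sketch. Note that Perl writes groups with plain parentheses rather than the "\(" and "\)" of the basic syntax, and that `full_match` is a hypothetical helper introduced only for this example.

```perl
use strict;
use warnings;

# Hypothetical helper: does the whole string match the pattern?
sub full_match {
    my ($pattern, $string) = @_;
    return $string =~ /\A(?:$pattern)\z/ ? 1 : 0;
}

# "ab*c": an "a", zero or more "b", then a "c".
my @star = map { full_match('ab*c', $_) } ("ac", "abc", "abbbc", "abd");

# "(a.)c\1*": \1 must repeat exactly what the first group matched.
my @backref = map { full_match('(a.)c\1*', $_) } ("abcab", "abcabab", "abcac");

# "[^a]": any single character except "a".
my @negated = map { full_match('[^a]', $_) } ("b", "a");

print "ab*c on ac, abc, abbbc, abd:        @star\n";
print "(a.)c\\1* on abcab, abcabab, abcac:  @backref\n";
print "[^a] on b, a:                       @negated\n";
```

"abcac" is rejected because the group captured "ab", so the backreference can only repeat "ab", never "ac".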

Some character classes are predefined. However, not all implementations support them in the same way. Examples of such character classes are:

\d  digit               a digit, i.e. [0-9] (and possibly further numeric characters, e.g. from Unicode)
\D  non-digit           a character that is not a digit, i.e. [^\d]
\w  word character      a letter, a digit or an underscore, i.e. [a-zA-Z_0-9] (and possibly non-Latin letters too, e.g. umlauts)
\W  non-word character  a character that is neither a letter, a digit nor an underscore, i.e. [^\w]
\s  whitespace          usually at least the space and the control characters \f, \n, \r, \t and \v
\S  non-whitespace      a character that is not whitespace, i.e. [^\s]
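
To make the shorthand classes concrete, the following Perl sketch counts how many characters of a sample string fall into each class. The sample string is invented, and the /a modifier (available in Perl 5.14 and later) is used here to pin the classes to their ASCII meaning.

```perl
use strict;
use warnings;

# Hypothetical sample: letters, digits, an underscore, a tab and a space.
my $line = "id_42\tx 7";

# "my $n = () = ..." counts the matches of a /g pattern in list context.
# /a restricts \d, \w and \s to their ASCII definitions.
my $digits = () = $line =~ /\d/ag;   # 4, 2, 7
my $word   = () = $line =~ /\w/ag;   # i, d, _, 4, 2, x, 7
my $space  = () = $line =~ /\s/ag;   # tab and space
my $other  = () = $line =~ /\W/ag;   # everything that is not \w

print "digits=$digits word=$word space=$space non-word=$other\n";
```

In this sample every non-word character happens to be whitespace, so the \s and \W counts agree; in general the two classes differ.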

Character classes according to the POSIX standard

POSIX       Non-standard  Perl/Tcl  Vim    Java             ASCII                                Description
            [:ascii:][1]                   \p{ASCII}        [\x00-\x7F]                          ASCII characters
[:alnum:]                                  \p{Alnum}        [A-Za-z0-9]                          Alphanumeric characters
            [:word:][1]   \w        \w     \w               [A-Za-z0-9_]                         Alphanumeric characters plus "_"
                          \W        \W     \W               [^A-Za-z0-9_]                        Non-word characters
[:alpha:]                           \a     \p{Alpha}        [A-Za-z]                             Alphabetic characters
[:blank:]                           \s     \p{Blank}        [ \t]                                Space and tab
                          \b        \< \>  \b               (?<=\W)(?=\w)|(?<=\w)(?=\W)          Word boundaries
                          \B                                (?<=\W)(?=\W)|(?<=\w)(?=\w)          Non-word boundaries
[:cntrl:]                                  \p{Cntrl}        [\x00-\x1F\x7F]                      Control characters
[:digit:]                 \d        \d     \p{Digit} or \d  [0-9]                                Digits
                          \D        \D     \D               [^0-9]                               Non-digits
[:graph:]                                  \p{Graph}        [\x21-\x7E]                          Visible characters
[:lower:]                           \l     \p{Lower}        [a-z]                                Lowercase letters
[:print:]                           \p     \p{Print}        [\x20-\x7E]                          Visible characters and the space character
[:punct:]                                  \p{Punct}        [][!"#$%&'()*+,./:;<=>?@\^_`{|}~-]   Punctuation characters
[:space:]                 \s        \_s    \p{Space} or \s  [ \t\r\n\v\f]                        Whitespace characters
                          \S        \S     \S               [^ \t\r\n\v\f]                       Non-whitespace characters
[:upper:]                           \u     \p{Upper}        [A-Z]                                Uppercase letters
[:xdigit:]                          \x     \p{XDigit}       [A-Fa-f0-9]                          Hexadecimal digits
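
Perl also accepts the POSIX class names, provided they appear inside an outer bracket expression. The sketch below collects the characters of an invented sample string that belong to three of the classes from the table.

```perl
use strict;
use warnings;

# Hypothetical sample string.
my $sample = "Hex: 0xBEEF, ok!";

# A POSIX class is only valid inside brackets: [[:name:]], not [:name:].
my @upper  = $sample =~ /[[:upper:]]/g;    # uppercase letters
my @xdigit = $sample =~ /[[:xdigit:]]/g;   # hexadecimal digits
my @punct  = $sample =~ /[[:punct:]]/g;    # punctuation characters

print "upper:  @upper\n";
print "xdigit: @xdigit\n";
print "punct:  @punct\n";
```

The "e" of "Hex" counts as a hexadecimal digit, while "x" does not, which shows that the class tests single characters and knows nothing about the "0x" prefix.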

Quantifiers (repetition factors) specify how often an expression, i.e. the preceding character or the preceding group of characters, may occur.

?          The preceding expression is optional: it may occur, but does not have to. In other words, it occurs zero times or once. (This corresponds to {0,1}.)
+          The preceding expression must occur at least once, but may occur more often. (This is the same as {1,}.)
*          The preceding expression may occur any number of times, including not at all. (This is the same as {0,}.)
{n}        The preceding expression must occur exactly n times. (This is the same as {n,n}.)
{min,}     The preceding expression must occur at least min times.
{min,max}  The preceding expression must occur at least min times and at most max times.
{0,max}    The preceding expression may occur at most max times.
  • a+ matches "a" as well as "aaaa"
  • [0-9]+ matches "0123456789" as well as "072345"
  • [ab]+ matches "a", "b", "aa", "bbaab", etc.
  • [0-9]{2,5} matches at least two and at most five digits, e.g. "91" or "63091"
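
The bounded quantifier from the last bullet can be tried out with a small Perl sketch. The candidate strings are invented, and \A and \z are added so that the whole string has to consist of two to five digits.

```perl
use strict;
use warnings;

# Hypothetical candidates to test against "two to five digits".
my @candidates = ("9", "91", "63091", "630912", "6a");

my %result;
for my $candidate (@candidates) {
    # Anchoring with \A and \z forces the quantifier to cover the whole string.
    $result{$candidate} = $candidate =~ /\A[0-9]{2,5}\z/ ? "match" : "no match";
}

# ? is shorthand for {0,1}: both spellings accept "color" and "colour".
my $optional = "color colour" =~ /\Acolou?r colou{0,1}r\z/ ? 1 : 0;

printf "%-7s %s\n", $_, $result{$_} for @candidates;
print "optional u accepted in both spellings: $optional\n";
```

Without the anchors, "630912" would also match, because the pattern would be satisfied by the first five digits alone.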

Practical examples

Operator Description Example
. Normally matches any character except a newline.
Within square brackets the dot is literal.
$string1 = "Hello World\n";
if ($string1 =~ m/...../) {
  print "$string1 has length >= 5.\n";
}

Output:

Hello World
 has length >= 5.
( ) Groups a series of pattern elements into a single element.
When a pattern within parentheses matches, the matched text can be referred to later with $1, $2, ...
$string1 = "Hello World\n";
if ($string1 =~ m/(H..).(o..)/) {
  print "We matched '$1' and '$2'.\n";
}

Output:

We matched 'Hel' and 'o W'.
+ Matches the preceding pattern element one or more times.
$string1 = "Hello World\n";
if ($string1 =~ m/l+/) {
  print "There are one or more consecutive letter \"l\"'s in $string1.\n";
}

Output:

There are one or more consecutive letter "l"'s in Hello World
.
? Matches the preceding pattern element zero times or once.
$string1 = "Hello World\n";
if ($string1 =~ m/H.?e/) {
  print "There is an 'H' and a 'e' separated by ";
  print "0-1 characters (e.g., He Hue Hee).\n";
}

Output:

There is an 'H' and a 'e' separated by 0-1 characters (e.g., He Hue Hee).
? Modifies the *, +, ? or {M,N} quantifier that comes before it so that it matches as few times as possible (non-greedy match).
$string1 = "Hello World\n";
if ($string1 =~ m/(l.+?o)/) {
  print "The non-greedy match with 'l' followed by one or\n";
  print "more characters is 'llo' rather than 'llo Wo'.\n";
}

Output:

The non-greedy match with 'l' followed by one or
more characters is 'llo' rather than 'llo Wo'.
* Matches the preceding pattern element zero or more times.
$string1 = "Hello World\n";
if ($string1 =~ m/el*o/) {
  print "There is an 'e' followed by zero to many ";
  print "'l' followed by 'o' (e.g., eo, elo, ello, elllo).\n";
}

Output:

There is an 'e' followed by zero to many 'l' followed by 'o' (e.g., eo, elo, ello, elllo).
{M,N} Denotes the minimum M and the maximum N match count.
N can be omitted and M can be 0: {M} matches "exactly" M times; {M,} matches "at least" M times; {0,N} matches "at most" N times.
x* y+ z? is thus equivalent to x{0,} y{1,} z{0,1}.
$string1 = "Hello World\n";
if ($string1 =~ m/l{1,2}/) {
  print "There exists a substring with at least 1 ";
  print "and at most 2 l's in $string1\n";
}

Output:

There exists a substring with at least 1 and at most 2 l's in Hello World
[…] Denotes a set of possible character matches.
$string1 = "Hello World\n";
if ($string1 =~ m/[aeiou]+/) {
  print "$string1 contains one or more vowels.\n";
}

Output:

Hello World
 contains one or more vowels.
| Separates alternative possibilities.
$string1 = "Hello World\n";
if ($string1 =~ m/(Hello|Hi|Pogo)/) {
  print "$string1 contains at least one of Hello, Hi, or Pogo.";
}

Output:

Hello World
 contains at least one of Hello, Hi, or Pogo.
\b Matches a zero-width boundary between a word-class character (see below) and either a non-word-class character or an edge; same as

(^\w|\w$|\W\w|\w\W).

$string1 = "Hello World\n";
if ($string1 =~ m/llo\b/) {
  print "There is a word that ends with 'llo'.\n";
}

Output:

There is a word that ends with 'llo'.
\w Matches an alphanumeric character, including "_";
same as [A-Za-z0-9_] in ASCII, and
[\p{Alphabetic}\p{GC=Mark}\p{GC=Decimal_Number}\p{GC=Connector_Punctuation}]

in Unicode, where Alphabetic covers more than Latin letters and Decimal_Number covers more than Arabic digits.

$string1 = "Hello World\n";
if ($string1 =~ m/\w/) {
  print "There is at least one alphanumeric ";
  print "character in $string1 (A-Z, a-z, 0-9, _).\n";
}

Output:

There is at least one alphanumeric character in Hello World
 (A-Z, a-z, 0-9, _).
\W Matches a non-alphanumeric character, excluding "_";
same as [^A-Za-z0-9_] in ASCII, and
[^\p{Alphabetic}\p{GC=Mark}\p{GC=Decimal_Number}\p{GC=Connector_Punctuation}]

in Unicode.

$string1 = "Hello World\n";
if ($string1 =~ m/\W/) {
  print "The space between Hello and ";
  print "World is not alphanumeric.\n";
}

Output:

The space between Hello and World is not alphanumeric.
\s Matches a whitespace character,
which in ASCII is a tab, a line feed, a form feed, a carriage return or a space; in Unicode it also matches no-break spaces, next line, and the variable-width spaces (among others).
$string1 = "Hello World\n";
if ($string1 =~ m/\s.*\s/) {
  print "In $string1 there are TWO whitespace characters, which may";
  print " be separated by other characters.\n";
}

Output:

In Hello World
 there are TWO whitespace characters, which may be separated by other characters.
\S Matches anything except whitespace.
$string1 = "Hello World\n";
if ($string1 =~ m/\S.*\S/) {
  print "In $string1 there are TWO non-whitespace characters, which";
  print " may be separated by other characters.\n";
}

Output:

In Hello World
 there are TWO non-whitespace characters, which may be separated by other characters.
\d Matches a digit;
same as [0-9] in ASCII;
in Unicode, same as \p{Digit} or \p{GC=Decimal_Number}, which is itself the same as \p{Numeric_Type=Decimal}.
$string1 = "99 bottles of beer on the wall.";
if ($string1 =~ m/(\d+)/) {
  print "$1 is the first number in '$string1'\n";
}

Output:

99 is the first number in '99 bottles of beer on the wall.'
\D Matches a non-digit;
same as [^0-9] in ASCII or \P{Digit} in Unicode.
$string1 = "Hello World\n";
if ($string1 =~ m/\D/) {
  print "There is at least one character in $string1";
  print " that is not a digit.\n";
}

Output:

There is at least one character in Hello World
 that is not a digit.
^ Matches the beginning of a line or string.
$string1 = "Hello World\n";
if ($string1 =~ m/^He/) {
  print "$string1 starts with the characters 'He'.\n";
}

Output:

Hello World
 starts with the characters 'He'.
$ Matches the end of a line or string.
$string1 = "Hello World\n";
if ($string1 =~ m/rld$/) {
  print "$string1 is a line or string ";
  print "that ends with 'rld'.\n";
}

Output:

Hello World
 is a line or string that ends with 'rld'.
\A Matches the beginning of a string (but not an internal line).
$string1 = "Hello\nWorld\n";
if ($string1 =~ m/\AH/) {
  print "$string1 is a string ";
  print "that starts with 'H'.\n";
}

Output:

Hello
World
 is a string that starts with 'H'.
\z Matches the end of a string (but not an internal line).[2]
$string1 = "Hello\nWorld\n";
if ($string1 =~ m/d\n\z/) {
  print "$string1 is a string ";
  print "that ends with 'd\\n'.\n";
}

Output:

Hello
World
 is a string that ends with 'd\n'.
[^…] Matches every character except the ones inside brackets.
$string1 = "Hello World\n";
if ($string1 =~ m/[^abc]/) {
  print "$string1 contains a character other than ";
  print "a, b, and c.\n";
}

Output:

Hello World
 contains a character other than a, b, and c.
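
The difference between the line anchors ^ and $ and the string anchors \A and \z in the rows above is easy to overlook, so here is a small Perl sketch that contrasts them on the two-line string used in the \A and \z examples. The variable names are chosen for this illustration only.

```perl
use strict;
use warnings;

my $text = "Hello\nWorld\n";

# Without /m, ^ only matches at the very start of the string.
my $caret_plain = $text =~ /^World/   ? 1 : 0;   # 0
# With /m (multiline mode), ^ also matches after each internal newline.
my $caret_multi = $text =~ /^World/m  ? 1 : 0;   # 1
# \A never matches at an internal line start, even with /m.
my $abs_start   = $text =~ /\AWorld/m ? 1 : 0;   # 0

# $ may match just before a trailing newline, \z only at the very end.
my $dollar_end  = $text =~ /World$/   ? 1 : 0;   # 1
my $abs_end     = $text =~ /World\z/  ? 1 : 0;   # 0

printf "%-12s %d\n", '/^World/',   $caret_plain;
printf "%-12s %d\n", '/^World/m',  $caret_multi;
printf "%-12s %d\n", '/\AWorld/m', $abs_start;
printf "%-12s %d\n", '/World$/',   $dollar_end;
printf "%-12s %d\n", '/World\z/',  $abs_end;
```

This is why the \z example above has to spell out the trailing newline as "d\n\z", whereas the $ example can simply write "rld$".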
  1. 33.3.1.2 Character Classes. Emacs Lisp manual, version 25.1. In: gnu.org. 2016. Retrieved 13 April 2017.
  2. Damian Conway: Regular Expressions, End of String. In: Perl Best Practices, p. 240, O'Reilly 2005, ISBN 978-0-596-00173-5