How does the ? make a quantifier lazy in regex
Imagine you have the following text: BAAAAAAAAD

The following regexes will return:

/B(A+)/ => 'BAAAAAAAA'
/B(A+?)/ => 'BA'
/B(A*)/ => 'BAAAAAAAA'
/B(A*?)/ => 'B'

Adding "?" to the + and * operators makes them "lazy", i.e. they will match the absolute minimum required for the expression to be true. By default, the * and + operators are "greedy" and try to match AS MUCH AS POSSIBLE for the expression to be true. Remember, + means "one or more", so the minimum will be "one if possible, more if absolutely necessary", whereas the maximum will be "all if possible, one if absolutely necessary". And * means "zero or more", so the minimum will be "nothing if possible, more if absolutely necessary", whereas the maximum will be "all if possible, zero if absolutely n
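The four results listed in this answer can be reproduced with a short Python sketch. The helper name whole_match and the extra pattern ending in D are illustrative additions, not part of the original answer; the last one shows a lazy quantifier being forced to expand.

```python
import re

text = "BAAAAAAAAD"

def whole_match(pattern: str) -> str:
    """Return the full text matched by pattern at the start of `text`."""
    return re.match(pattern, text).group(0)

greedy_plus = whole_match(r"B(A+)")     # B plus every A
lazy_plus = whole_match(r"B(A+?)")      # B plus a single A
greedy_star = whole_match(r"B(A*)")     # B plus every A
lazy_star = whole_match(r"B(A*?)")      # B alone
lazy_forced = whole_match(r"B(A+?)D")   # lazy, but the D forces it to expand
```

The lazy forms only stay short when nothing after them in the pattern demands more input.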

Categories : Regex

regex to lazy match possible empty string
Use .*? rather than .+?. The former matches zero or more characters, while the latter matches one or more.

"Hello A=[] B=[boy] World".match(/A*=*\[(.*?)\]/)[1] # => ""
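A rough Python equivalent makes the difference visible. The pattern below is a simplified, assumed variant (a literal "A=[" followed by a lazy capture), not the exact Ruby regex from the answer.

```python
import re

s = "Hello A=[] B=[boy] World"

# Hypothetical simplified pattern: literal "A=[", a lazy capture, literal "]".
zero_or_more = re.search(r"A=\[(.*?)\]", s).group(1)
one_or_more = re.search(r"A=\[(.+?)\]", s).group(1)
```

With .+? the capture must contain at least one character, so it swallows the closing bracket of the empty pair and runs on to the next one.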

Categories : Ruby

Quantifier follows nothing in regex
The * is a metacharacter called a quantifier. It means "repeat the previous character or character class zero or more times". In your case it follows nothing, and is therefore a syntax error. What you are probably trying to do is match anything, which is .* (a wildcard followed by a quantifier). However, this is already the default behaviour of a regex match unless it is anchored. So all you need is:

my @my_files = grep { /xyz/ } @files;

You could keep your end-of-string anchor xlsx$, but since you have a limited list of file names, that hardly seems necessary. Also, you have used qw() incorrectly: it is space separated, not comma separated:

my @files = qw(file_xyz.xlsx file.xlsx);

However, if you should have a larger set of file names, such as one read from a directory, you can p

Categories : Perl

Not getting * quantifier correctly in regex?
This is where you're going wrong: "first it encounters a which means 0 x is found. So it should replace a with M." No, it means that 0 xs are found and then an a is found. You haven't said that the a should be replaced by M... you've said that any number of xs (including 0) should be replaced by M. If you want every character to be replaced by M, you should just use .:

System.out.println(testStr.replaceAll(".", "M"));

(I would personally have expected a result of MaMbM - I'm looking into why you get MaMMbMM instead - I suspect it's because there's a sequence of 0 xs between the x and the b, but it still seems a little odd to me.)

EDIT: It becomes a bit clearer if you look at where your pattern matches. Here's code to show that:

Pattern pattern = Pattern.compile("x*"); Matcher
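The same inspection can be sketched in Python. The input string "axbx" is an assumption (the original test string is not shown); it was chosen because it reproduces the MaMMbMM result quoted in the answer. The replacement result relies on Python 3.7 or later, where empty matches next to a previous match are also replaced.

```python
import re

test_str = "axbx"  # assumed input; the original test string is not shown

# Every place where x* matches, including the zero-length matches.
spans = [m.span() for m in re.finditer(r"x*", test_str)]

# Replacing each of those matches with M (Python 3.7+ semantics).
replaced = re.sub(r"x*", "M", test_str)
```

Each (n, n) span is a zero-length match, and every one of them gets its own M.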

Categories : Java

Invalid quantifier when using an object property to construct a regex
In regexes, the character * is a quantifier. The expression a* means "a, zero or more times" (a could also be a longer expression). As you are trying to match the * itself and not use it as a quantifier, you should escape it. Because the patterns are written as string literals, the backslash itself has to be doubled:

var styles = { "bold italic" : "\\*\\*\\*", "bold" : "\\*\\*", "italic" : "\\*" };
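In Python the escaping can be delegated to re.escape instead of being typed by hand. The table below is a hypothetical mirror of the styles object, holding the raw markers rather than pre-escaped patterns.

```python
import re

# Hypothetical marker table mirroring the object in the answer.
styles = {"bold italic": "***", "bold": "**", "italic": "*"}

# re.escape adds the backslashes, so the markers are matched literally.
patterns = {name: re.compile(re.escape(marker)) for name, marker in styles.items()}

found = patterns["bold"].search("some **bold** text")
```

Keeping the raw markers in the table and escaping at compile time avoids the double-backslash problem altogether.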

Categories : Javascript

Regex to match paths that don't match a specific pattern: Express Router
The following regex will match any path except those starting with /foo/:

app.get(/^\/([^f][^o][^o]|.{1,2}|.{4,})\/.*$/, routes.index);

I assume that this is a standard JavaScript regex.
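An alternative worth considering is a negative lookahead, sketched here in Python. This is not the answer's approach: the names class_based, lookahead_based and is_allowed are illustrative. The comparison suggests the character-class pattern can let a deeper path such as /foo/x/y through (because . also matches /) and can reject an unrelated path such as /fab/x.

```python
import re

# The pattern from the answer, written as a Python raw string.
class_based = re.compile(r"^/([^f][^o][^o]|.{1,2}|.{4,})/.*$")

# Alternative sketch: reject "/foo/" up front with a negative lookahead.
lookahead_based = re.compile(r"^/(?!foo/).*$")

def is_allowed(path: str) -> bool:
    """True for any path that does not start with /foo/."""
    return lookahead_based.match(path) is not None
```

The lookahead states the exclusion directly, so there is no need to enumerate segment lengths.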

Categories : Regex

Regex to match single new line. Regex to match double new line
To match exactly N repetitions of the same character you need lookaheads and lookbehinds (see Match exactly N repetitions of the same character). Since JavaScript didn't support lookbehind at the time (it was added in ES2018), a pure regexp solution seemed to be impossible. You'll have to use a helper function, for example:

> x = "...a...aa...aaa...aaaa...a...aa"
"...a...aa...aaa...aaaa...a...aa"
> x.replace(/a+/g, function($0) { return $0.length == 2 ? '@@' : $0; })
"...a...@@...aaa...aaaa...a...@@"
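For engines that do support lookbehind, the lookaround approach mentioned at the start of this answer can be sketched as follows. Python is used here purely for illustration, and the name exactly_two is made up.

```python
import re

x = "...a...aa...aaa...aaaa...a...aa"

# (?<!a) and (?!a) forbid an "a" on either side, so only runs of exactly
# two a's match.
exactly_two = re.compile(r"(?<!a)a{2}(?!a)")

result = exactly_two.sub("@@", x)
```

This gives the same output as the helper-function version without inspecting match lengths in code.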

Categories : Javascript

Regex to match only till first occurrence of class match
You were missing the ? that makes the quantifier lazy. Your regex would be:

(?i)(.*?)case[^a-z\d]*(\d+)(.*)

You can toggle case-insensitive matching using (?i) in the regex.

Categories : Regex

Why is Perl lazy when regex matching with * against a group?
This isn't a matter of greedy or lazy repetition. (?:fj)* is greedily matching as many repetitions of "fj" as it can, but it will successfully match zero repetitions. When you try to match it against the string "f fjfj ff", it will first attempt to match at position zero (before the first "f"). The maximum number of times you can successfully match "fj" at position zero is zero, so the pattern successfully matches the empty string. Since the pattern successfully matched at position zero, we're done, and the engine has no reason to try a match at a later position. The moral of the story is: don't write a pattern that can match nothing, unless you want it to match nothing.
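The behaviour is not Perl-specific. A small Python sketch on the same subject string shows the empty match at position zero, and how requiring at least one repetition changes the result.

```python
import re

subject = "f fjfj ff"

star = re.search(r"(?:fj)*", subject)   # may match zero repetitions
plus = re.search(r"(?:fj)+", subject)   # needs at least one "fj"
```

star succeeds immediately with an empty match, while plus has to move forward until it finds real "fj" pairs.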

Categories : Regex

Regex - Find the match that is inside a match
You can try this regex:

/href=[^>]+\.pdf/

Most of the time, when you can avoid .* or .+ (or their lazy versions), it's better :) Also, don't forget to escape periods.

Categories : PHP

Regex that match if the match contains special word
You're kind of on the right track with lookahead assertions:

{{START}}(?:(?!{{END}})[\s\S])*specialword(?:(?!{{END}})[\s\S])*{{END}}

Explanation:

{{START}}        # Match {{START}}
(?:              # Match...
  (?!{{END}})    # ...as long as we haven't reached {{END}} yet:
  [\s\S]         # any character
)*               # any number of times.
specialword      # Match "specialword"
(?:              # Match (as before)...
  (?!{{END}})    # whatever follows, unless it's {{END}}
  [\s\S]
)*
{{END}}          # Then finally match {{END}}
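A minimal Python sketch of this technique follows. The sample text is hypothetical, and the braces are escaped here so that they are unambiguously literal.

```python
import re

# Same structure as the pattern in the answer, with the braces escaped.
pattern = re.compile(
    r"\{\{START\}\}"
    r"(?:(?!\{\{END\}\})[\s\S])*"   # anything, but never run past {{END}}
    r"specialword"
    r"(?:(?!\{\{END\}\})[\s\S])*"
    r"\{\{END\}\}"
)

# Hypothetical sample: one block without the word, one block containing it.
sample = "{{START}} a {{END}} {{START}} b specialword c {{END}}"
matches = pattern.findall(sample)
```

Only the second block is returned, because the lookahead stops the first attempt from crossing its own {{END}}.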

Categories : Regex

Java regex: need one regex to match all the formats specified
Try using a reluctant quantifier: _year:.*?\s

.replaceAll("_year:.*?\\s", "_year:Y ")

System.out.println("utc-hour_of_year:2013-07-30T17 dsfsdgfsgf utc-week_of_year:2013-W31 dsfsdgfsdgf".replaceAll("_year:.*?\\s", "_year:Y "));

Output:

utc-hour_of_year:Y dsfsdgfsgf utc-week_of_year:Y dsfsdgfsdgf

Categories : Java

regex not returning match but there is clearly a match
You need to escape the dollar sign:

start = r'>\$'
end = '</td>'
AnnualDiv = re.search('%s(.*)%s' % (start, end), s).group(1)

The reason is that $ is a special character in regex (it matches at the end of the string or just before a trailing newline). This will set AnnualDiv to the string '0.48'. If you want to add the $ back, you can do it like this:

AnnualDiv = "$%s" % re.search('%s(.*)%s' % (start, end), s).group(1)
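A self-contained sketch of the fix is below. The input string s is hypothetical, since the original markup is not shown, and re.escape is used here as one way to produce the escaped delimiters.

```python
import re

# Hypothetical input; the original page markup is not shown in the answer.
s = '<td class="value">$0.48</td>'

start = re.escape(">$")    # re.escape turns "$" into a literal "\$"
end = re.escape("</td>")

annual_div = re.search("%s(.*)%s" % (start, end), s).group(1)

# Without escaping, "$" still means "end of string", so nothing matches.
unescaped = re.search(">$(.*)</td>", s)
```

Escaping the delimiters programmatically is convenient when they come from variables rather than being typed into the pattern.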

Categories : Python

Regex.Match() won't match a substring
Try removing ^ and $:

Regex regex = new Regex(@"[ABCEGHJKLMNPRSTVXY]{1}\d{1}[A-Z]{1} *\d{1}[A-Z]{1}\d{1}", RegexOptions.None);

^ : The match must start at the beginning of the string or line.
$ : The match must occur at the end of the string or before \n at the end of the line or string.

If you want to match only at word boundaries, you can use \b as suggested by Mike Strobel:

Regex regex = new Regex(@"\b[ABCEGHJKLMNPRSTVXY]{1}\d{1}[A-Z]{1} *\d{1}[A-Z]{1}\d{1}\b", RegexOptions.None);

Categories : C#

Javascript regex to match a regex
A regular expression to match a regular expression is

/\/((?![*+?])(?:[^\r\n\[/\\]|\\.|\[(?:[^\r\n\]\\]|\\.)*\])+)\/((?:g(?:im?|mi?)?|i(?:gm?|mg?)?|m(?:gi?|ig?)?)?)/

To break it down:

\/ matches a literal /
(?![*+?]) is necessary because /* starts a comment, not a regular expression.
[^\r\n\[/\\] matches any non-escape sequence character and non-start of character group
\[...\] matches a character group which can contain an un-escaped /.
\\. matches a prefix of an escape sequence
+ is necessary because // is a line comment, not a regular expression.
(?:g...)? matches any combination of non-repeating regular expression flags.

So ugly. This doesn't attempt to pair parentheses, or check that repetition modifiers are not applied to themselves, but filters out most of the other ways that regular expressions

Categories : Javascript

Haskell Data.Binary: Shouldn't this be lazy? And how do I make it lazy?
decodeFile is not lazy; just look at the source: it calls decodeOrFail, which itself must parse the whole file to determine success or failure. EDIT: So what I believe worked in the original binary is now broken (read: it's now a non-lazy memory hog). One solution, which I doubt is optimally pretty, is to use a lazy readFile and runGetIncremental, then manually push chunks into the decoder:

import Data.Binary
import Data.Binary.Get
import Data.ByteString.Lazy as L
import Data.ByteString as B
import qualified Data.Array.IArray as IA
import Data.Array.Unboxed as A

main = do
  bs <- getListLazy `fmap` L.readFile "bintest2.data"
  mapM_ doStuff bs
  return ()

doStuff b = print $ b IA.! 100000

The important stuff is here:

getListLazy :: L.ByteString -> [UArray Int Float]
getListL

Categories : Haskell

Looking for non-zero property TOs: Can I match a Description with number property, but use a regex match?
It is known that integer-typed properties have to be passed as integers in the description, which unfortunately renders regular expressions useless here. I do not have a QTP installation at hand right now, but to investigate further, what happens if you use

Print Browser("myBrowser").WebElement("height:=11").ChildObjects.Count

and

Print Browser("myBrowser").WebElement("height:=^[1-9][0-9]*$").ChildObjects.Count

where "myBrowser" is your browser definition, of course?

Categories : Regex

Cache that includes both lazy and non-lazy properties in Grails
By default, associations are treated as lazy in Grails. In the particular example above for Person, all address objects will be cached. The above default cache setting can be expanded to look like:

cache usage: 'read-write', include: 'all' // includes lazy and non-lazy

In order to cache only the association inside Person, you would need:

addresses cache: true

In order to exclude the association from caching in Person, you would need:

cache usage: 'read-write', include: 'non-lazy' // usage can be 'read-only', 'read-write', etc., according to need

Categories : Hibernate

What exactly does the *+ quantifier do?
A greedy quantifier matches everything it can and then the pattern backtracks until the match succeeds. A lazy quantifier forward tracks until the match succeeds. A possessive quantifier matches everything it can and never backtracks. A trailing + denotes a possessive quantifier; it can be used as, for example, ++ or *+. This ability to prevent backtracking means it can stop catastrophic backtracking.
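The difference between greedy and possessive matching can be sketched in Python. Python's re module only accepts the *+ syntax from version 3.11 on, so this sketch emulates possessive behaviour with a lookahead and a backreference, a common workaround and not the possessive syntax itself.

```python
import re

subject = "aaaa"

# Greedy: a* first takes all four a's, then gives one back so the final a fits.
greedy = re.search(r"a*a", subject)

# Possessive-like: the lookahead captures every a and is never re-entered on
# backtracking, so \1 consumes them all and the final a can never match.
possessive_like = re.search(r"(?=(a*))\1a", subject)
```

The greedy pattern succeeds by backtracking; the possessive-style pattern fails outright, which is exactly what removes the backtracking cost.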

Categories : Regex

Haskell lazy Bytestring words not lazy?
The use of lazy tuples there is sub-optimal. This is better rewritten as:

main = do
  cont <- words <$> getContents
  putStrLn $ show $ sndT $ foldl opt (T 0 0) $ map (fst.fromJust.readInt) cont

sndT :: T -> Int
sndT (T _ m) = m

opt (T c m) x = T (max 0 (c+x)) (max m (c+x))

data T = T {-# UNPACK #-} !Int {-# UNPACK #-}!Int

So you get a strict, unboxed accumulator. However, you're better off writing this whole thing as an incremental left fold. That's why readInt returns the remaining input in its 2nd parameter. No need for the sum . map . words pipeline. The version you submitted leaks space. Run on a large file, and it uses heap proportional to the file size (on 640k entries).

$ time ./A +RTS -p -s -K50M < input.txt.2
346882
326,337,136 bytes allocated i

Categories : Haskell

Match a^xb^x with regex
The \1 is a backreference and refers to the value of the group, not to the pattern as the recursion (?1) does in Perl. Unfortunately, Java regexes do not support recursion, but the pattern can be expressed using lookarounds and backrefs.

Categories : Java

Match BOL and EOL with std::regex
libstdc++ does not fully support regex (see its implementation status, reproduced below; this applies to GCC releases before 4.9). I tried to compile this code with clang 3.2 and libc++ 3.2, and the result is "true". Use libc++ or Boost.

libstdc++ regex implementation status:

Section  Description                                             Status
28       Regular expressions
28.1     General                                                 N
28.2     Definitions                                             N
28.3     Requirements                                            N
28.4     Header <regex> synopsis                                 N
28.5     Namespace std::regex_constants                          Y
28.6     Class regex_error                                       Y
28.7     Class template regex_traits                             Partial
28.8     Class template basic_regex                              Partial
28.9     Class template sub_match                                Partial
28.10    Class template match_results                            Partial
28.11    Regular expression algorithms                           N
28.12    Regular expression Iterators                            N
28.13    Modified ECMAScript regular expression grammar          N

Categories : C++

Regex right match url with DOT at the end
The simplest fix is to require a non-punctuation character as the last character: /(^|[?s])(www.[^? ]+/[^/ ]*?[^? ]*[^?.,! ]|www.[^? ]*[^?.,! ])/g Note that I removed some of your backslashes, because they were not necessary. However, this is still far from a robust URL pattern. So, why reinvent the wheel instead of just using some established URL pattern?

Categories : Javascript

Given regex does not match to the end
You need to pass the global modifier. I'm not sure which programming language you are using, but the syntax often resembles the following: /$myregex/g

For example, given the following text:

Hello Adam, how are you?
Hello Sarah, how are you?

The regular expression /Hello\s(.*),/g will match both Adam and Sarah.
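Python has no /g modifier; the closest counterpart is re.findall (or re.finditer), while re.search stops at the first hit. A small sketch, assuming the two sentences sit on separate lines:

```python
import re

text = "Hello Adam, how are you?\nHello Sarah, how are you?"

first_only = re.search(r"Hello\s(.*),", text).group(1)  # stops at the first hit
all_names = re.findall(r"Hello\s(.*),", text)           # every match
```

If both sentences were on one line, the greedy (.*) would run to the last comma, so a lazy (.*?) would be the safer choice there.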

Categories : Python

Regex match everything after
Why not use a mix of preg_match() and explode()?

$str = '/events/display/id/featured';
$pattern = '~/events/(?P<method>.*?)/(?P<parameter>.*)~';
preg_match($pattern, $str, $matches);
// explode the params by '/'
$matches['parameter'] = explode('/', $matches['parameter']);
var_dump($matches);

Output:

array(5) {
  [0] => string(27) "/events/display/id/featured"
  'method' => string(7) "display"
  [1] => string(7) "display"
  'parameter' => array(2) {
    [0] => string(2) "id"
    [1] => string(8) "featured"
  }
  [2] => string(11) "id/featured"
}

Categories : PHP

Regex to match [] but not []
http://rubular.com/r/16q3jSPHN0 [^\](?:]?([(.+?)])) should work for most cases. Edit: Seems like this will not match [test][test], as Rory pointed out. For that, I can't really think of a good solution without using multiple regexps, but if you want just one then try this: http://rubular.com/r/QBqFAbqW9E (?:[^\](?:]?([(.+?)]))|((?:]?([(.+?)])))\) Match groups will be populated in the first 3 if a block with escaped brackets occurs after a regular block, and in the last 3 if the opposite occurs. Match 1 1. 2. 3. [test] 4. [test] 5. test Match 2 1. [test] 2. test 3. 4. 5.

Categories : Javascript

Regex to match this
You can try this pattern:

^(?:[^e\n]+|\Be|e(?!xception))+\.php:\d+$

or this pattern, if you don't need to check a specific line format:

^(?>[^e\n]++|\Be|e(?!xception))+$

Note: if you need to select all consecutive lines in one block, you just need to remove \n from the character classes.

Categories : Regex

Get first occurrence of match in Regex
One way to do that is to use a lazy quantifier with the dotall (Singleline) option:

Regex regex = new Regex(@"^.*?(?>dog|mouse)", RegexOptions.Singleline);

Another way is:

Regex regex = new Regex(@"^(?>[^dm]*+|d++(?!og)|m++(?!ouse))*(?>dog|mouse)");

It is longer but more efficient. The idea is to avoid the lazy quantifier, which is slow because it tests at each character to see what follows. Here I describe the beginning as "all that is not a d or an m, OR some d not followed by og, OR some m not followed by ouse", zero or more times. (?>..) is an atomic group; it prevents the regex engine from backtracking into it, a kind of 'all or nothing'. ++ is a possessive quantifier, which prevents backtracking too. Note that the .NET regex engine does not accept possessive quantifiers such as *+ and ++, so with the Regex class this pattern would need to be rewritten using atomic groups only.

Categories : C#

Regex match UK postcode
Wrap your regex in ^ and $ to ensure that the full string is matched:

var re = /^(GIR[ ]?0AA|((AB|AL|B|BA|BB|BD|BH|BL|BN|BR|BS|BT|BX|CA|CB|CF|CH|CM|CO|CR|CT|CV|CW|DA|DD|DE|DG|DH|DL|DN|DT|DY|E|EC|EH|EN|EX|FK|FY|G|GL|GY|GU|HA|HD|HG|HP|HR|HS|HU|HX|IG|IM|IP|IV|JE|KA|KT|KW|KY|L|LA|LD|LE|LL|LN|LS|LU|M|ME|MK|ML|N|NE|NG|NN|NP|NR|NW|OL|OX|PA|PE|PH|PL|PO|PR|RG|RH|RM|S|SA|SE|SG|SK|SL|SM|SN|SO|SP|SR|SS|ST|SW|SY|TA|TD|TF|TN|TQ|TR|TS|TW|UB|W|WA|WC|WD|WF|WN|WR|WS|WV|YO|ZE)(\d[\dA-Z]?[ ]?\d[ABD-HJLN-UW-Z]{2}))|BFPO[ ]?\d{1,4})$/;
console.log(re.test('WD4 9PL'));

^ matches the beginning of the line
$ matches the end of the line

Note, I've also wrapped it in ():

/^abc|def$/ will match abc.... and ....def
/^(abc|def)$/ will match only abc or def

Example:

> /abc/.test("abcd")
true
> /^abc$/.test("abcd")
false
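The point about wrapping the alternation in parentheses can be checked with a short Python sketch. The names loose, strict and matches are illustrative.

```python
import re

loose = re.compile(r"^abc|def$")      # means (^abc) OR (def$)
strict = re.compile(r"^(abc|def)$")   # whole string is abc or def

def matches(pattern, s: str) -> bool:
    """True if the compiled pattern matches anywhere in s."""
    return pattern.search(s) is not None
```

Without the group, each anchor binds to only one branch of the alternation, so strings with extra text on the unanchored side still match.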

Categories : Javascript

regex match for each line
Use preg_match_all with the PREG_SET_ORDER flag. For example:

$text = <<<EOT
#X0 alpha numeric content that I want
#X1 something else
#X26 this one as well
EOT;

preg_match_all('/^(#X\d{1,2}\s+)(.*)/m', $text, $matches, PREG_SET_ORDER);
foreach ($matches as $match) {
    echo $match[0] . "\n";
}

UPDATE corresponding to the edited question:

preg_match_all('/^(#X\d{1,2}\s+)(.*)/m', $text, $matches, PREG_SET_ORDER);
foreach ($matches as $match) {
    echo $match[2] . "\n";
}

Categories : PHP

How to match from the end of the string using regex
Use $ to denote the end of the input string: / void main$/

var pattern = / void main$/;
var pool1 = "abdodfo void main";
var pool2 = "abdodfo void main a";
var pool3 = "abdodfo void main ab void mai";
console.log(pattern.test(pool1)); // => true
console.log(pattern.test(pool2)); // => false
console.log(pattern.test(pool3)); // => false

Categories : Javascript

Regex Exclude match
Use a negative look ahead: ((?!^.*iPad.*$)Mobile)|iP(hone|od)|Android|BlackBerry|IEMobile|Kindle|NetFront|Silk-Accelerated|(hpw|web)OS|Fennec|Minimo|Opera M(obi|ini)|Blazer|Dolfin|Dolphin|Skyfire|Zune

Categories : Regex

regex match whole word only
That doesn't return 'true'. If I run the following code:

public void Main()
{
    string matchstr = "\\bC#BKN00([0-9]{1})\\b";
    string modify = null;
    modify = Regex.Replace("C#BKN005", matchstr, "ceiling $1 hundred broken.");
    Console.WriteLine(modify);
    Console.WriteLine(Regex.Replace("BKN005", matchstr, "ceiling $1 hundred broken."));
    Console.ReadLine();
}

I get:

ceiling $1 hundred broken
BKN005

What would you like this to return?

Categories : C#

How match a paragraph using regex
You can split on double-newline like this:

paragraphs = re.split(r"\n\n", DATA)

Edit: To capture the paragraphs as matches, so you can get their start and end points, do this:

for match in re.finditer(r'(?s)((?:[^\n][\n]?)+)', DATA):
    print(match.start(), match.end())

# Prints:
# 0 214
# 215 298
# 299 589
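Both approaches can be tried on a small sample. The DATA value below is a hypothetical stand-in for the original text, so the offsets differ from the ones printed in the answer.

```python
import re

# Hypothetical sample text with three paragraphs.
DATA = "first line\nsecond line\n\nparagraph two\n\nparagraph three"

paragraphs = re.split(r"\n\n", DATA)

# Each match covers one paragraph (plus one trailing newline, if present).
spans = [(m.start(), m.end())
         for m in re.finditer(r"(?s)((?:[^\n][\n]?)+)", DATA)]
texts = [DATA[a:b].rstrip("\n") for a, b in spans]
```

The finditer version yields the same paragraphs as the split, with the added benefit of start and end offsets.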

Categories : Python

Regex Match string
I think something like this would work: ^C?R?U?D?$ To avoid matching empty strings, you can use a lookahead assertion: ^(?!$)C?R?U?D?$
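A quick way to sanity-check the lookahead version is a small Python sketch. The function name is_valid is an illustrative addition.

```python
import re

crud = re.compile(r"^(?!$)C?R?U?D?$")

def is_valid(flags: str) -> bool:
    """True for a non-empty, in-order subset of the letters C, R, U, D."""
    return crud.match(flags) is not None
```

The (?!$) assertion rejects the empty string, while the optional letters still enforce the C, R, U, D order and forbid repeats.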

Categories : PHP

how to match rules using regex in C#
I'm not sure I understood your question. But if you want to get the number of numeric characters in a string, you can use the following code:

Regex regex = new Regex(@"[^0-9]"); // matches every non-digit character
string digits = regex.Replace(ValidateString, ""); // keep only the digits
if (digits.Length > 4 && digits.Length < 10)
    // this is a customer id
....

Categories : C#

Regex javascript - match either or
You could use

^(\w{5}-){2}\w{5}((-\w{5}){2})?$

(\w{5}-){2}\w{5} matches the 3x5 case, and optionally there can be two more blocks, which would be the 5x5 case. Alternatively you could use

^(\w{5}-){2}((\w{5}-){2})?\w{5}$

i.e. put the optional block in between, or a simple combination of your expressions:

^((\w{5}-){4}\w{5}|(\w{5}-){2}\w{5})$

Categories : Javascript

How to write regex to match URLs?
You can use this:

^/jsp/offer/recr/us/wsj/recoffertemp2flow1\.jsp\?offerId=\d+&promoCode=\d+$

But note that this regex would fail if you change the argument order.

Categories : Regex

regex match within parenthesis
If you print m, you'll see that gregexpr(..., perl = TRUE) gives you the positions and lengths of matches for a) your full pattern, including the leading and closing quotes, and b) the captured (.*). Unfortunately for you, when m is used by regmatches, it uses the positions and lengths of the former. There are two solutions I can think of. Pass your final output through sub:

line <- 'VARIABLES = "First [T]" "Second [L]" "Third [1/T]"'
m <- gregexpr('"(.*?)"', line, perl = TRUE)
z <- regmatches(line, m)[[1]]
sub('"(.*?)"', "\\1", z)

Or use substring with the positions and lengths of the captured expressions:

start.pos <- attr(m[[1]], "capture.start")
end.pos <- start.pos + attr(m[[1]], "capture.length") - 1L
substring(line, start.pos, end.pos)

To further your understandi

Categories : Regex

RegEx match not working
Use

var reg = new RegExp(".*\\." + inputName);

The square brackets mean: one character, which is one of those within the brackets. But you want several characters: first a dot, then the first character of inputName, etc.

Categories : Javascript



© Copyright 2017 w3hello.com Publishing Limited. All rights reserved.