Python regexp notes
import re
# Python regexp: {{{
#   .              matches any char (including end-of-line (EOL) if re.S is on)    vim: . (never matches EOL)
#   ^ $            match start/end of string (also start/end of each line if re.M is on)
#   *              matches 0 or more
#   +              matches 1 or more                         vim: \+
#   ?              matches 0 or 1                            vim: \? \=
#   {}             matches a given count                     vim: \{}
#   *? +? ?? {}?   match as few as possible                  vim: \{-}
#   \              escape
#   []             character set
#   ()             group
#   (?<!...)       not preceded by ...                       vim: \@<!
#   (?<=...)       preceded by ...                           vim: \@<=
#   (?!...)        not followed by ...                       vim: \@!
#   (?=...)        followed by ...                           vim: \@=
#   (?:...)        group without a group number              vim: \%(\)
#   (?P<NAME>...)  define the group 'NAME'
#   (?P=name)      match the text matched by the group 'name'
#   (?(id/name)yes-pattern|no-pattern)
#                  match yes-pattern if the group id/name matched, otherwise no-pattern
#   (?iLmsux)      inline flags: ignorecase/locale/multiline/dotall/unicode/verbose
#
# NOTE
# MODULE:
#   re.compile()   compile a regular expression pattern into a regular expression object
#   {{{ e.g.
#       prog = re.compile(pattern)
#       result = prog.match(string)
#   is equivalent to:
#       result = re.match(pattern, string)
#   }}}
#   finditer()                        returns an iterator of match objects
#   findall()                         returns a list of strings (tuples if ptn has several groups)
#   match(string[, pos[, endpos]])    match object if ptn matches at the start of string
#   search(string[, pos[, endpos]])   match object if ptn matches anywhere in string
#   split(ptn, str)                   returns a list of strings    vim: split(str, ptn)
#   sub(ptn, rpl, str[, count])       vim: substitute(str, ptn, rpl, flag)
#   escape()                          return string with all non-alphanumerics backslashed
#
# Regex Object:
#   groups         number of groups
#   pattern        the pattern string it was compiled from
#   flags          the compiled flags
#
# Match Object:
#   expand(tmpl)   expand references such as '\1' and escapes such as '\n' in tmpl
#   group()        return subgroup string; 0: entire match, 1, 2, ...: subgroup
#   groups()       return tuple of all subgroups
#   groupdict()    return dict of the subgroups that have a NAME
#   start() end()  return index of the group's start/end in the whole string
#   span()         return (start, end) tuple
#   pos            the pos passed to search()/match()
#   endpos         the endpos passed to search()/match()
#   string         the string passed to search()/match()
#   lastgroup      name of the last matched group
#   re             the regex object that produced this match object
#
# ERRORS:
#   XXX look-behind requires fixed-width pattern
#       (?<![0-9a-fA-F]|0[xX])        is wrong!!
#       (?<![0-9a-fA-F])|(?<!0[xX])   is wrong
#       (?<!([\w\#]))                 final use
#   XXX unbalanced parentheses
#       use raw text r'''\(......\) '''
#       or escape twice '\\(......\\)'
# }}}
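A few of the entries above are easier to remember with a concrete run. The sketch below uses a made-up sample string and pattern (`text`, `prog`, and `chained` are illustrative names, not part of the notes) to show match() versus search(), a named group, and the fixed-width look-behind error. The chained look-behind is only one pattern that compiles. It is not the form the notes finally settled on.

```python
import re

# Hypothetical sample text, chosen only for illustration.
text = "id=0x1F, next=42"

# A named group (?P<hex>...) captures the hex digits after the 0x prefix.
prog = re.compile(r"0[xX](?P<hex>[0-9a-fA-F]+)")

# match() only succeeds at the start of the string; search() scans forward.
at_start = prog.match(text)    # None, because text starts with "id="
anywhere = prog.search(text)   # match object for "0x1F"

hex_digits = anywhere.group("hex")   # subgroup by name
span = anywhere.span()               # (start, end) of the entire match
named = anywhere.groupdict()         # only groups that have a NAME

# A look-behind whose alternatives differ in width is rejected at compile time.
try:
    re.compile(r"(?<![0-9a-fA-F]|0[xX])\d+")
    lookbehind_error = None
except re.error as exc:
    lookbehind_error = str(exc)

# Two separate look-behinds in sequence each have a single fixed width,
# so this variant compiles (an assumption-level alternative, not the notes' choice).
chained = re.compile(r"(?<![0-9a-fA-F])(?<!0[xX])\d+")

print(at_start, hex_digits, span, named)
print(lookbehind_error)
```

The raw-string prefix `r"..."` is what keeps the backslashes intact, which is the same point as the "unbalanced parentheses" entry in the ERRORS list.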
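\s in Vim is not the same as \s in Python.
Python's \s is [ \t\n\r\f\v] (for ASCII matching; with Python 3 str patterns it also matches other Unicode whitespace).
Vim's \s is [ \t] (space and tab only).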
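The difference in \s can be checked from the Python side. In this minimal sketch the character class `[ \t]` stands in for Vim's \s. That is an approximation written in Python syntax, not Vim itself, and `sample` is a made-up string.

```python
import re

# Hypothetical sample containing each ASCII whitespace character once.
sample = "a b\tc\nd\re\ff\vg"

# Python's \s: space, tab, newline, carriage return, form feed, vertical tab.
python_ws = re.findall(r"\s", sample)

# Stand-in for Vim's \s: space and tab only.
vim_like_ws = re.findall(r"[ \t]", sample)

# The practical effect: splitting on \s+ also breaks at newlines.
split_python = re.split(r"\s+", "a b\nc")
split_vim_like = re.split(r"[ \t]+", "a b\nc")

print(python_ws)
print(vim_like_ws)
print(split_python, split_vim_like)
```

When porting a pattern from Vim to Python, replacing \s with `[ \t]` keeps the match from crossing line boundaries.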