python - Unable to process this regular expression


I have the following "greeksymbols.txt":

    Α α   alpha
    Β β   beta
    Γ γ   gamma
    Δ δ   delta
    Ε ε   epsilon
    Ζ ζ   zeta
    Η η   eta
    Θ θ   theta
    Ι ι   iota
    Κ κ   kappa
    Λ λ   lambda
    Μ μ   mu
    Ν ν   nu
    Ξ ξ   xi
    Ο ο   omicron
    Π π   pi
    Ρ ρ   rho
    Σ σ   sigma
    Τ τ   tau
    Υ υ   upsilon
    Φ φ   phi
    Χ χ   chi
    Ψ ψ   psi
    Ω ω   omega

I am trying to convert this into an Anki plain text file with a tab delimiter. I am converting each row into 2 cards, where the front is the symbol (in uppercase or lowercase) and the back is the name. I have the following:

    #!/usr/local/bin/python
    import re

    pattern = re.compile(r"(.)\s+(.)\s+(.+)", re.UNICODE)

    input = open("./greeksymbols.txt", "r")
    output = open("./greeksymbolsformated.txt", "w+")

    line = input.readline()
    while line:
        string = line.rstrip()
        m = pattern.match(string)
        if m:
            output.write(m.group(1) + "\t" + m.group(3) + "\n")
            output.write(m.group(2) + "\t" + m.group(3) + "\n")
        else:
            print("I was unable to process line '" + string + "' [" + str(m) + "]")
        line = input.readline()

    input.close(); output.close();

Unfortunately, I am getting the "I was unable to process ..." message for every line, with the value of str(m) being None. What am I doing wrong?

> localhost:anki stephen$ python ./convertgreeksymbols.py
> I was unable to process line 'Α α   alpha' [None]
> I was unable to process line 'Β β   beta' [None]
> ...

You don't need a regex for this:

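    with open("./greeksymbols.txt") as infile, \
         open("./greeksymbolsformated.txt", "w+") as outfile:
        for line in infile:
            # Each row is "UPPER lower name", separated by whitespace.
            up, low, name = line.split()
            # One card per case, terminated by a newline.
            outfile.write("{0}\t{1}\n".format(up, name))
            outfile.write("{0}\t{1}\n".format(low, name))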
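To try the split-based approach without touching any files, the same logic can be sketched as a small function. The name `rows_to_cards` and the sample rows are made up for illustration. This assumes Python 3 and, unlike the snippet above, it skips rows that do not have exactly three whitespace-separated fields instead of raising an error.

```python
def rows_to_cards(lines):
    """Turn 'UPPER lower name' rows into tab-separated Anki card lines."""
    cards = []
    for line in lines:
        fields = line.split()
        if len(fields) != 3:
            # Skip blank or malformed rows instead of raising ValueError.
            continue
        up, low, name = fields
        # Two cards per row: one for each case of the symbol.
        cards.append("{0}\t{1}".format(up, name))
        cards.append("{0}\t{1}".format(low, name))
    return cards


# Hypothetical sample input: two valid rows and one blank line.
sample = ["\u0391 \u03b1   alpha\n", "\u0392 \u03b2   beta\n", "\n"]
for card in rows_to_cards(sample):
    print(card)
```

Keeping the parsing separate from the file handling also makes it easy to check a few rows by hand before writing the full output file.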

If you want to stick with a regex, try the following one instead of yours (which should work, in my opinion, but perhaps isn't explicit enough):

    pattern = re.compile(r"(\S+)\s+(\S+)\s+(.+)", re.UNICODE)
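One likely reason the original `(.)` pattern fails, assuming the script runs under Python 2 where `open()` returns byte strings: each Greek letter takes two bytes in UTF-8, so `.` matches only half of a letter and the following `\s+` cannot match. The sketch below reproduces that in Python 3 by matching the same row as bytes and as decoded text. The variable names are illustrative only.

```python
import re

row_text = "\u0391 \u03b1   alpha"   # the row 'Α α   alpha'
row_bytes = row_text.encode("utf-8")

# On raw bytes, '.' consumes one byte, i.e. half of the two-byte capital alpha.
byte_dot = re.compile(rb"(.)\s+(.)\s+(.+)")
# On decoded text, '.' consumes one whole character.
text_dot = re.compile(r"(.)\s+(.)\s+(.+)")
# '\S+' takes a whole run of non-whitespace, so it also works on bytes.
byte_nonspace = re.compile(rb"(\S+)\s+(\S+)\s+(.+)")

print(byte_dot.match(row_bytes))                # no match on bytes
print(text_dot.match(row_text).groups())        # matches on decoded text
print(byte_nonspace.match(row_bytes).group(3))  # name field, still as bytes
```

If that is indeed the cause, decoding the file while reading it (for example with `io.open(path, encoding="utf-8")`) would be another way to make the original pattern match.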
