python - Unable to process this regular expression
I have the following "greeksymbols.txt":
Α α alpha
Β β beta
Γ γ gamma
Δ δ delta
Ε ε epsilon
Ζ ζ zeta
Η η eta
Θ θ theta
Ι ι iota
Κ κ kappa
Λ λ lambda
Μ μ mu
Ν ν nu
Ξ ξ xi
Ο ο omicron
Π π pi
Ρ ρ rho
Σ σ sigma
Τ τ tau
Υ υ upsilon
Φ φ phi
Χ χ chi
Ψ ψ psi
Ω ω omega
I am trying to convert it into an Anki plain text file with a tab delimiter. I am converting each row into 2 cards, where the front is the symbol (in uppercase or lowercase) and the back is the name. I have the following:
#!/usr/local/bin/python
import re

pattern = re.compile(r"(.)\s+(.)\s+(.+)", re.UNICODE)
input = open("./greeksymbols.txt", "r")
output = open("./greeksymbolsformated.txt", "w+")
line = input.readline()
while line:
    string = line.rstrip()
    m = pattern.match(string)
    if m:
        output.write(m.group(1) + "\t" + m.group(3) + "\n")
        output.write(m.group(2) + "\t" + m.group(3) + "\n")
    else:
        print("I was unable to process line '" + string + "' [" + str(m) + "]")
    line = input.readline()
input.close();
output.close();
Unfortunately, I am getting the "I was unable to process ..." message for every line, with the value of str(m) being None. What am I doing wrong?
> localhost:anki stephen$ python ./convertgreeksymbols.py
> I was unable to process line 'Α α alpha' [None]
> I was unable to process line 'Β β beta' [None]
> ...
You don't need a regex for this:
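# Open both files; they are closed automatically when the block ends.
with open("./greeksymbols.txt") as infile, open("./greeksymbolsformated.txt", "w+") as outfile:
    for line in infile:
        # Each row is "<uppercase> <lowercase> <name>", separated by whitespace.
        up, low, name = line.split()
        # One card per symbol: front is the symbol, back is the name.
        outfile.write("{0}\t{1}\n".format(up, name))
        outfile.write("{0}\t{1}\n".format(low, name))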
If you want to stick with a regex, try the following pattern instead of yours (which should work IMO, but perhaps isn't explicit enough):
pattern = re.compile(r"(\S+)\s+(\S+)\s+(.+)", re.UNICODE)
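One plausible explanation for the failure, assuming the script ran under Python 2 and the file is UTF-8 encoded (neither is stated in the question): the lines are read as raw bytes, each Greek letter takes two bytes, and so `(.)` captures only half a letter before `\s+` fails. The sketch below reproduces that with Python 3 bytes patterns and shows that `\S+` matches either way. The helper name `to_cards` is made up for illustration.

```python
import re

# One sample row: GREEK CAPITAL ALPHA, GREEK SMALL ALPHA, then the name.
line = "\u0391 \u03b1 alpha"

# Pattern from the question: each symbol must be exactly one character.
single = re.compile(r"(.)\s+(.)\s+(.+)")
# Pattern from the answer: each symbol is a run of non-whitespace.
run = re.compile(r"(\S+)\s+(\S+)\s+(.+)")


def to_cards(pattern, text):
    """Return the two tab-separated card lines for a row, or None if no match."""
    m = pattern.match(text)
    if m is None:
        return None
    return [m.group(1) + "\t" + m.group(3), m.group(2) + "\t" + m.group(3)]


# On decoded text, both patterns match and yield the same two cards.
cards_single = to_cards(single, line)
cards_run = to_cards(run, line)

# On raw UTF-8 bytes, each Greek letter is two bytes long, so "(.)" grabs
# only the first byte and the following "\s+" cannot match.
raw = line.encode("utf-8")
single_bytes = re.compile(rb"(.)\s+(.)\s+(.+)")
run_bytes = re.compile(rb"(\S+)\s+(\S+)\s+(.+)")

bytes_single_match = single_bytes.match(raw)   # no match
bytes_run_match = run_bytes.match(raw)         # matches the whole row

print(cards_single)
print(cards_run)
print(bytes_single_match)
print(bytes_run_match.groups())
```

If that assumption holds, decoding each line to text before matching would also let the original one-character pattern work.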