RPython ord() With Non-ASCII Character

I'm making a virtual machine in RPython using PyPy. My problem is that I am converting each character into its numerical representation. For example, converting the letter 'a' pro

Solution 1:

#!/usr/bin/env python
# -*- coding: latin-1 -*-

char = 'á'
print str(int(ord(char)))
# ord() gives the byte value; int(char) would raise ValueError on 'á'
print hex(ord(char))
print char.decode('latin-1')

Gives me:

225
0xe1
á

Solution 2:

You are using Python 2, so your string "á" is a byte string whose contents depend on the encoding of your source file. If the encoding is UTF-8, the bytes are C3 A1: the string contains two bytes.

If you want Unicode code points (that is, characters), or UTF-16 code units on a narrow Python build, convert the byte string to unicode first, for example with .decode('utf-8').
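To make the difference concrete, here is a minimal sketch that compares per-byte values with per-character values for the same text. The variable names are hypothetical. The string is written with explicit escapes so the result does not depend on the source file's encoding, and the snippet is plain Python (2.7 or 3), not checked against RPython's restricted subset.

```python
# Illustrative only: names below are made up for this example.
raw = b"\xc3\xa1"            # UTF-8 encoding of U+00E1, as a byte string
text = raw.decode("utf-8")   # unicode string holding a single code point

# bytearray yields integers on both Python 2 and Python 3
byte_values = list(bytearray(raw))        # one integer per byte
code_points = [ord(ch) for ch in text]    # one integer per character

print(byte_values)   # [195, 161], i.e. 0xc3 0xa1
print(code_points)   # [225], i.e. 0xe1
```

The two-byte result is what ord() sees when iterating over the undecoded string; the single value 225 only appears after decoding.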

# -*- encoding: utf-8 -*-
def stuff(instr):
  for char in instr:
    char = str(int(ord(char)))
    char = hex(int(char))
    # I'd replace those two lines above with char = hex(ord(char))
    char = char[2:]
    print char 

stuff("á")
print("-------")
stuff(u"á")

Outputs:

c3
a1
-------
e1
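One way to get the same result for both kinds of input is to decode byte strings before calling ord(). The helper below is a sketch under assumptions: hex_codes and its encoding parameter are hypothetical names, the default of UTF-8 is a guess about the source data, and it is ordinary Python rather than code verified to translate under RPython.

```python
def hex_codes(data, encoding="utf-8"):
    """Return one hex string (without the '0x' prefix) per character.

    Hypothetical helper: byte strings are decoded with `encoding` first,
    unicode strings are used as they are.
    """
    if isinstance(data, bytes):      # `bytes` is `str` on Python 2
        data = data.decode(encoding)
    return [hex(ord(ch))[2:] for ch in data]


print(hex_codes(b"\xc3\xa1"))           # UTF-8 bytes      -> ['e1']
print(hex_codes(u"\xe1"))               # unicode string   -> ['e1']
print(hex_codes(b"\xe1", "latin-1"))    # Latin-1 byte     -> ['e1']
```

The encoding has to match how the bytes were actually produced; decoding UTF-8 data as Latin-1 would silently yield two characters (c3 and a1) instead of one.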
