exception
>>> "你好".encode("utf8")
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
UnicodeDecodeError: 'ascii' codec can't decode byte 0xe4 in position 0: ordinal not in range(128)
"你好".encode('utf-8')
In Python 2, encode converts a unicode object to a str (byte string) object. Here, however, it has been invoked on a str object, because the literal has no u prefix.
So Python first has to convert the str to a unicode object, using the default codec (ascii). It does the equivalent of
"你好".decode().encode('utf-8')
But the decode fails because the string isn't valid ASCII. That's why the complaint is about not being able to decode, even though the call was to encode.
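The bytes behind "你好" are determined by the surrounding text environment. For example, with # -*- coding: utf-8 -*- at the top of the file, the literal is stored as UTF-8, while the system default encoding is still ascii. So the first step, decoding those UTF-8 bytes as ASCII, is bound to fail.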
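The two-step behaviour described above can be reproduced in a small sketch. It is written for Python 3 so that it runs on a current interpreter, which is an assumption on my part: there, bytes stands in for the Python 2 str type, and the helper py2_style_encode is a hypothetical name that only imitates the implicit decode, it is not a real library function.

```python
# Python 3 sketch: bytes plays the role of Python 2's str (a byte string).
# These are the bytes a UTF-8 source file stores for the literal "你好".
raw = "你好".encode("utf-8")


def py2_style_encode(byte_string, target="utf-8", default="ascii"):
    """Hypothetical helper imitating Python 2's str.encode:
    an implicit decode with the default codec, followed by the encode."""
    return byte_string.decode(default).encode(target)


try:
    py2_style_encode(raw)
    error_message = None
except UnicodeDecodeError as exc:
    # The failure comes from the hidden decode step, not from the encode.
    error_message = str(exc)

print(error_message)

# Decoding with the codec that matches the bytes avoids the error.
fixed = raw.decode("utf-8").encode("utf-8")
print(fixed == raw)
```

In Python 2 itself the usual remedies are to write the literal as u"你好" or to call .decode('utf-8') explicitly before encoding, so that the ascii default is never consulted.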
ASCII
man ascii
For convenience, let us give more compact tables in hex and decimal.
   2 3 4 5 6 7       30 40 50 60 70 80 90 100 110 120
 -------------      ---------------------------------
0:   0 @ P ` p     0:    (  2  <  F  P  Z  d   n   x
1: ! 1 A Q a q     1:    )  3  =  G  Q  [  e   o   y
2: " 2 B R b r     2:    *  4  >  H  R  \  f   p   z
3: # 3 C S c s     3: !  +  5  ?  I  S  ]  g   q   {
4: $ 4 D T d t     4: "  ,  6  @  J  T  ^  h   r   |
5: % 5 E U e u     5: #  -  7  A  K  U  _  i   s   }
6: & 6 F V f v     6: $  .  8  B  L  V  `  j   t   ~
7: ' 7 G W g w     7: %  /  9  C  M  W  a  k   u  DEL
8: ( 8 H X h x     8: &  0  :  D  N  X  b  l   v
9: ) 9 I Y i y     9: '  1  ;  E  O  Y  c  m   w
A: * : J Z j z
B: + ; K [ k {
C: , < L \ l |
D: - = M ] m }
E: . > N ^ n ~
F: / ? O _ o DEL
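As a quick way to check the hex half of the table, the same layout can be regenerated from the code points. This is a minimal sketch, with ascii_hex_table being a hypothetical helper name: each row is the low hex digit (0 to F), each column the high digit (2 to 7), so the character in row 1, column 4 is code 0x41, which is A.

```python
def ascii_hex_table():
    """Build the compact hex table: rows are the low hex digit,
    columns are the high hex digit 2..7 (codes 0x20 to 0x7F)."""
    lines = ["   " + " ".join("234567")]
    for low in range(16):
        cells = []
        for high in range(2, 8):
            code = high * 16 + low
            # 0x7F is the non-printing DEL control character.
            cells.append("DEL" if code == 127 else chr(code))
        lines.append("%X: %s" % (low, " ".join(cells)))
    return lines


table = ascii_hex_table()
print("\n".join(table))
```

Every code in the table is below 128, which is exactly the range(128) limit named in the UnicodeDecodeError message.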
UNICODE
