[Question] Converting UTF-8 to decimal

Author: ReiFu21 (ReiFu)   2016-09-12 15:06:45
Platform: Code::Blocks, C++
The character "A" is encoded in UTF-8 as one byte.
Hexadecimal: 41
Decimal: 65
The character "您" is encoded in UTF-8 as three bytes.
Hexadecimal: E6 82 A8
Decimal: 230 130 168
UTF8.txt contains: 您 \n A
I want to print the contents of UTF8.txt as decimal byte values.
#include <stdlib.h>
#include <fstream>
#include <iostream>
#include <cstdlib>
using namespace std;
int main(void)
{
    int x; int y;
    char txt[80]="";
    ifstream ifile("C:\\Users\\Gon\\Desktop\\UTF8.txt",ios::binary);
    if(ifile.is_open())
    {
        while(!ifile.eof())
        {
            ifile >> txt;
            cout << txt<< endl;
            x=char (txt[0]);
            switch(x)
            {
            case 0-127:
                cout <<"1st byte~ " <<x << endl;
                break;
            case 240-247:
                cout <<"1st byte~ " <<x << endl;
                y=char (txt[1]);
                cout <<"2nd byte~ " <<y << endl;
                y=char (txt[2]);
                cout <<"3rd byte~ " <<y << endl;
                break;
            default:
                cout <<"1st byte~ " <<x << endl;
                y=char (txt[1]);
                cout <<"2nd byte~ " <<y << endl;
                y=char (txt[2]);
                cout << "3rd byte~ " <<y << endl;
                y=char (txt[3]);
                cout << "4th byte~ " <<y << endl;
                y=char (txt[4]);
                cout << "5th byte~ " <<y << endl;
                y=char (txt[5]);
                cout << "6th byte~ " <<y << endl;
            }
        }
    }
    else
        cout << "fail to open file" << endl;
    ifile.close(); // close file
    system("pause");
    return 0;
}
The result I want is:

1st byte~ 230
2nd byte~ 130
3rd byte~ 168
A
1st byte~ 65
But the actual output is:

1st byte~ -26
2nd byte~ -126
3rd byte~ -88
4th byte~ 0
5th byte~ 0
6th byte~ 0
A
1st byte~ 65
2nd byte~ 0
3rd byte~ -88
4th byte~ 0
5th byte~ 0
6th byte~ 0
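A few questions:
1. The first byte of "A" is 65, so it should match case 0-127, but it actually lands in the default case. Why?
2. "A" comes out as a single byte with the correct value 65, but "您" comes out as three bytes whose values are all wrong. How should I fix this?
Could someone please point out where the problem is? Thanks!!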
Author: yvb   2016-09-12 15:15:00
Change x=char (txt[0]); to x = (unsigned char) txt[0];
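To illustrate the suggestion above: whether plain char is signed is implementation-defined, and the negative numbers in the output suggest it is signed on the poster's compiler, so the byte 0xE6 is read as -26. Converting through unsigned char first gives 230. A minimal sketch; the names byte_value and print_example are made up for this illustration and are not from the original post.

```cpp
#include <iostream>

// Hypothetical helper: returns the value of one byte as an int in the
// range 0..255, regardless of whether plain char is signed.
int byte_value(char c)
{
    return static_cast<int>(static_cast<unsigned char>(c));
}

// Prints the three bytes of "您" (E6 82 A8) in decimal, one per line.
void print_example()
{
    const char txt[] = "\xE6\x82\xA8";
    for (int i = 0; i < 3; ++i)
        std::cout << byte_value(txt[i]) << '\n';
}
```

The same conversion would be needed for every y = char (txt[n]) line in the posted code, not only for x.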
Author: PkmX (阿貓)   2016-09-12 15:21:00
C and C++ have no range-case syntax... case m ... n: is a compiler extension.
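This also seems to answer question 1: in standard C++, case 0-127: is the constant expression 0 minus 127, i.e. case -127:, and case 240-247: is case -7:. With x = 65 neither label matches, so control goes to default. A portable alternative is a chain of range checks. The sketch below uses a made-up helper name, utf8_sequence_length, and assumes the 4-byte limit of current UTF-8; note that 3-byte sequences start with 224 to 239 (0xE0 to 0xEF), not 240 to 247.

```cpp
// Hypothetical helper: returns the length in bytes of the UTF-8 sequence
// that starts with the given lead byte, or 0 if the byte cannot start one.
int utf8_sequence_length(unsigned char lead)
{
    if (lead <= 0x7F) return 1;                  // 0xxxxxxx: ASCII
    if (lead >= 0xC2 && lead <= 0xDF) return 2;  // 110xxxxx (0xC0, 0xC1 are overlong)
    if (lead >= 0xE0 && lead <= 0xEF) return 3;  // 1110xxxx
    if (lead >= 0xF0 && lead <= 0xF4) return 4;  // 11110xxx, up to U+10FFFF
    return 0;  // continuation byte (0x80..0xBF) or invalid lead byte
}
```

For the two characters in the question, utf8_sequence_length(65) gives 1 and utf8_sequence_length(230) gives 3.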
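Author: Caesar08 (Caesar)   2016-09-12 16:30:00
Why does the default case print 6 bytes? And why is there a separate case that prints a 3-byte UTF-8 sequence? Also, writing the while loop and the ifstream read that way will cause errors. Uh, standard UTF-8 does not go up to 6 bytes. And I would still suggest handling the 2-byte and 4-byte cases too.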
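One possible way to address these points, as a sketch only: test the read itself in the loop condition instead of eof() (eof() only becomes true after a read has already failed), read byte by byte, and handle sequences of 1 to 4 bytes. The function names below are made up for this illustration, and continuation bytes are not validated.

```cpp
#include <cstddef>
#include <iostream>
#include <sstream>
#include <string>

// Length of the UTF-8 sequence starting with `lead`, or 0 if invalid.
int sequence_length(unsigned char lead)
{
    if (lead <= 0x7F) return 1;
    if (lead >= 0xC2 && lead <= 0xDF) return 2;
    if (lead >= 0xE0 && lead <= 0xEF) return 3;
    if (lead >= 0xF0 && lead <= 0xF4) return 4;
    return 0;
}

// Hypothetical helper: reads UTF-8 text from `in` and writes each character,
// followed by the decimal value of each of its bytes, to `out`.
void dump_utf8(std::istream& in, std::ostream& out)
{
    static const char* const ordinal[] = {"1st", "2nd", "3rd", "4th"};
    char c;
    while (in.get(c)) {  // the loop ends as soon as a read fails
        unsigned char lead = static_cast<unsigned char>(c);
        if (lead == ' ' || lead == '\n' || lead == '\r' || lead == '\t')
            continue;    // skip separators, as operator>> did
        int len = sequence_length(lead);
        if (len == 0) {
            out << "invalid byte " << static_cast<int>(lead) << '\n';
            continue;
        }
        std::string ch(1, c);
        for (int i = 1; i < len && in.get(c); ++i)
            ch += c;     // collect the rest of the sequence (not validated)
        out << ch << '\n';
        for (std::size_t i = 0; i < ch.size(); ++i)
            out << ordinal[i] << " byte~ "
                << static_cast<int>(static_cast<unsigned char>(ch[i])) << '\n';
    }
}

// Convenience wrapper for trying the function on an in-memory string.
std::string dump_utf8_string(const std::string& text)
{
    std::istringstream in(text);
    std::ostringstream out;
    dump_utf8(in, out);
    return out.str();
}
```

For the file in the question, something like std::ifstream ifile(path, std::ios::binary); dump_utf8(ifile, std::cout); should work, where path stands for the poster's file name.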
Author: uranusjr (←這人是超級笨蛋)   2016-09-13 19:28:00
UTF-8's encoding rules can encode up to 6 bytes, but only encodings of up to 4 bytes are actually used, because the longer ones are never needed.
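For context: the original UTF-8 design allowed sequences of up to 6 bytes, while RFC 3629 limits it to 4 bytes because Unicode code points stop at U+10FFFF. The sketch below encodes a code point under that 4-byte limit; encode_utf8 is a made-up name for this illustration, and it returns an empty string for surrogates and out-of-range values.

```cpp
#include <string>

// Hypothetical helper: encodes one Unicode code point as UTF-8.
// Returns an empty string for surrogates or values above U+10FFFF.
std::string encode_utf8(unsigned long cp)
{
    std::string out;
    if (cp <= 0x7F) {                       // 1 byte: 0xxxxxxx
        out += static_cast<char>(cp);
    } else if (cp <= 0x7FF) {               // 2 bytes: 110xxxxx 10xxxxxx
        out += static_cast<char>(0xC0 | (cp >> 6));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    } else if (cp <= 0xFFFF) {              // 3 bytes: 1110xxxx 10xxxxxx 10xxxxxx
        if (cp >= 0xD800 && cp <= 0xDFFF)
            return out;                     // surrogates are not encodable
        out += static_cast<char>(0xE0 | (cp >> 12));
        out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    } else if (cp <= 0x10FFFF) {            // 4 bytes: 11110xxx + 3 continuation bytes
        out += static_cast<char>(0xF0 | (cp >> 18));
        out += static_cast<char>(0x80 | ((cp >> 12) & 0x3F));
        out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    }
    return out;
}
```

"您" is U+60A8, so encode_utf8(0x60A8) yields the bytes E6 82 A8 from the question, and the largest code point, U+10FFFF, still fits in 4 bytes.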
