[Question] inspect() for text mining

Author: shihs (shih) 2015-07-01 16:36:44

Hi everyone,
I want to do text mining in R,
but I got stuck right at the start...
mycorpus = Corpus(DirSource("test", encoding="UTF-8"), readerControl =
list(reader=readPlain, language = NA))
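I first built the corpus with Corpus, using DirSource to read in the plain-text files from the directory.
Then I wanted to look at what is in the corpus, so I called inspect(mycorpus).
But for some reason it only shows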
<<VCorpus>>
Metadata: corpus specific: 0, document level (indexed): 0
Content: documents: 3
[[1]]
<<PlainTextDocument>>
Metadata: 7
Content: chars: 718
Content: chars: 703
Content: chars: 820
Content: chars: 85
Content: chars: 984
Content: chars: 785
Content: chars: 449
Content: chars: 0
...
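None of the content of my txt files is shown. I have googled for a long time and still have no solution...
Also, I used insertWords() to add vocabulary,
but some words still seem to get split apart. Is that because they are Traditional Chinese characters?
Thanks, everyone!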

Author: shihs (shih) 2015-07-01 17:02:00

I found the fix for insertWords(): insertWords(toTrad(iconv(c("詞彙1","詞彙2"),"big5", "UTF-8"), rev=TRUE))
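
On the inspect() question, which the thread leaves open: the output above looks like the summary that newer versions of tm print for a VCorpus, where only metadata and character counts are shown. A minimal sketch of one way to see the actual text, assuming tm 0.6 or later. The sample strings and the use of VectorSource are hypothetical stand-ins so the snippet runs without the "test" directory; with the real data, mycorpus would come from the DirSource call in the original post.

```r
library(tm)

# Hypothetical in-memory documents standing in for the txt files in "test"
texts <- c("first document text", "second document text")
mycorpus <- VCorpus(VectorSource(texts))

# as.character() on a single document returns its text as a character vector
doc1 <- as.character(mycorpus[[1]])
writeLines(doc1)

# Collect the text of every document in the corpus into a list
all_text <- lapply(mycorpus, as.character)
```

content(mycorpus[[1]]) should return the same text as as.character(mycorpus[[1]]); either can be used depending on preference.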
