Word Cloud of the Biography of Han Xin

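Over the past few days I have been revisiting the figures of the early Han dynasty, and Han Xin (韓信) is one whose fate is especially regrettable. I did some rough processing of the "Biography of the Marquis of Huaiyin" (《淮陰侯列傳》) as translated by Wang Xuemeng (王學孟). I downloaded a portrait of Han Xin from the web. It started out as a pencil drawing, which gave very poor results; after converting the picture to black and white and painting the blank areas black, it became just about usable.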
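The post does not say which tool was used to clean up the portrait, so the following is only a minimal sketch of one way to binarize a grayscale image with NumPy. The helper name `binarize_mask` and the threshold of 128 are assumptions made for illustration. It relies on the documented behavior of `WordCloud`'s `mask` parameter: pure-white entries are masked out and every other entry is free to draw on.

```python
import numpy as np


def binarize_mask(gray, threshold=128):
    """Return a uint8 mask: 0 where gray < threshold, 255 elsewhere.

    Hypothetical helper: pixels darker than the threshold become black
    (drawable by WordCloud), all others become pure white (masked out).
    """
    gray = np.asarray(gray)
    return np.where(gray < threshold, 0, 255).astype(np.uint8)


# A tiny 2x3 array standing in for a grayscale portrait.
demo = np.array([[10, 200, 127],
                 [128, 255, 0]])
mask = binarize_mask(demo)
print(mask)
```

A faint pencil drawing is mostly light gray, so lowering or raising the threshold changes how much of the sketch survives as drawable area.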

#!/usr/bin/env python3
# -*- coding: utf-8 -*-

import jieba
from wordcloud import WordCloud, STOPWORDS, ImageColorGenerator
from imageio import imread
import matplotlib.pyplot as plt

# Read the translated text of the biography.
with open('淮陰侯列傳.txt', encoding='utf-8') as f:
    text = f.read()

# The portrait doubles as the mask; pure-white pixels are left empty.
back_color = imread('韓信.png')

wc = WordCloud(background_color='white',
               max_words=100000,
               mask=back_color,  # with a mask, its shape sets the canvas size
               min_font_size=2,
               max_font_size=100,
               # STOPWORDS is a set; the union adds the name without mutating it
               stopwords=STOPWORDS | {'韓信'},
               font_path='simfang.ttf',
               random_state=42)

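def process_words(text):
    """Segment the text with jieba and drop the words listed in stopwords.txt."""
    jieba.add_word('韓信')  # keep the name as a single token
    # Assumes stopwords.txt is UTF-8 with whitespace-separated entries.
    with open('stopwords.txt', encoding='utf-8') as f:
        stop_set = set(f.read().split())
    words_list = []
    for word in jieba.cut(text, cut_all=False):
        word = word.strip()
        # Exact set membership: a word is dropped only if it is itself a stopword.
        if word and word not in stop_set:
            words_list.append(word)
    return ' '.join(words_list)

text = process_words(text)
wc.generate(text)

# First figure: the word cloud in WordCloud's default colors.
plt.imshow(wc, interpolation='bilinear')
plt.axis('off')

# Second figure: the same cloud recolored from the mask image's pixels.
image_colors = ImageColorGenerator(back_color)
plt.figure()
plt.imshow(wc.recolor(color_func=image_colors))
plt.axis('off')
plt.show()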
Figure: Marquis of Huaiyin word cloud (color)

Figure: Marquis of Huaiyin word cloud (black and white)

plt.imshow(back_color)
plt.show()
Figure: The Marquis of Huaiyin

# For comparison, a word cloud drawn without the mask.
plain_wc = WordCloud(max_font_size=100,
                     font_path='simfang.ttf').generate(text)
plt.figure()
plt.imshow(plain_wc, interpolation='bilinear')
plt.axis('off')
plt.show()
Figure: Word cloud of the Biography of the Marquis of Huaiyin

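The words that stand out most in the text are 漢王 (King of Han), 天下 (the realm), and 軍隊 (army), which is what common sense would lead one to expect. Because the processing was fairly rough, filler words such as 不能 ("cannot") and 他們 ("they") also come out rather large.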
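One way to tame such filler words is to count the segmented tokens directly and skip an extra stopword set before drawing. The sketch below is illustrative only: the helper `word_frequencies`, the `min_length=2` cutoff, and the sample token list are assumptions, not part of the original script.

```python
from collections import Counter


def word_frequencies(tokens, stopwords, min_length=2):
    """Count tokens, skipping stopwords, blanks, and very short tokens.

    Hypothetical helper: min_length=2 drops single-character tokens,
    which in segmented Chinese text are often particles.
    """
    counts = Counter()
    for token in tokens:
        word = token.strip()
        if len(word) < min_length or word in stopwords:
            continue
        counts[word] += 1
    return dict(counts)


# Made-up sample standing in for the output of jieba.cut().
tokens = ['漢王', '不能', '天下', '他們', '漢王', '的', ' ', '軍隊', '天下', '漢王']
extra_stopwords = {'不能', '他們'}
freqs = word_frequencies(tokens, extra_stopwords)
print(freqs)
```

The resulting dictionary can be handed to `WordCloud.generate_from_frequencies`, which sizes the words from the counts instead of re-tokenizing a string.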

