
Concise 0.3.0 Release: The 'Dead-End' Collocational Network Feature


As suggested in my previous article, Collocation and Interactive Collocational Network, collocational networks are networks of words that co-occur in a statistically significant way in a text.  Concise 0.2.x introduced an interactive way to explore these co-occurrence relationships.  Now Concise 0.3.x adds a 'dead-end' collocational network.

The 'dead-end' collocational network gives you the whole picture around your chosen 'core' word.  It keeps expanding the network until there is nothing left to add.  However, the 'core' word you start from is not necessarily the actual 'core' of its network.
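The expansion described above can be sketched as a breadth-first search that stops only when no new words turn up. This is a minimal illustration, not Concise's actual implementation: the function names `collocates` and `dead_end_network`, the window size, and the raw co-occurrence threshold (standing in for a proper significance test) are all assumptions made for the example.

```python
from collections import Counter, deque


def collocates(tokens, node, window=2, min_count=2):
    """Words seen at least `min_count` times within `window` tokens of `node`.

    A raw count threshold is used here as a simple stand-in for a
    statistical significance test (an assumption of this sketch).
    """
    counts = Counter()
    for i, tok in enumerate(tokens):
        if tok != node:
            continue
        lo, hi = max(0, i - window), min(len(tokens), i + window + 1)
        for j in range(lo, hi):
            if j != i and tokens[j] != node:
                counts[tokens[j]] += 1
    return {w for w, c in counts.items() if c >= min_count}


def dead_end_network(tokens, core, window=2, min_count=2):
    """Expand from `core` until no unseen collocate remains (the 'dead end')."""
    seen = {core}
    edges = set()
    queue = deque([core])
    while queue:
        node = queue.popleft()
        for word in collocates(tokens, node, window, min_count):
            edges.add(frozenset((node, word)))  # undirected edge
            if word not in seen:
                seen.add(word)
                queue.append(word)
    return seen, edges


tokens = "farmer policy farmer policy policy subsidy policy subsidy rain".split()
nodes, edges = dead_end_network(tokens, "farmer", window=1, min_count=2)
```

In this toy input, 'farmer' only reaches 'subsidy' through 'policy', and 'policy' ends up with the most links. It is a small-scale version of the point above: the word you start from need not be the centre of the network.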

Take the collocational network at the top of this post, for instance.  I was looking for the collocational network of 'farmer' (農民) in some of the Council of Agriculture's (農委會) official documents.  The top five collocates (sorted by co-occurrence) are 'agriculture' (農業), 'counsel' (輔導), 'conduct' (辦理), 'promote' (推動), and 'develop' (發展).  These nodes would be expected to form the central part (the 'core') of the network if the documents were randomly selected.  But these official documents lean very strongly toward agricultural policy.  That is why this dead-end network is made up mostly of policy words.

Camilla Magnusson, in her Text Visualization for Competitive Intelligence, argues that the collocational network method is useful for handling sequential text.  To test her claim, I split my documents into two collections by time.  The first collection ranges from 1996 to 2002, and the second from 2003 to 2009.
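A split like this is easy to reproduce. The sketch below assumes each document is available as a `(year, text)` pair, which is a hypothetical layout chosen for illustration. The helper name `split_by_period` and the sample records are likewise invented.

```python
def split_by_period(docs, periods):
    """Group (year, text) pairs into named collections by inclusive year range."""
    collections = {name: [] for name in periods}
    for year, text in docs:
        for name, (start, end) in periods.items():
            if start <= year <= end:
                collections[name].append(text)
                break  # each document belongs to at most one collection
    return collections


# Hypothetical sample records; real documents would be loaded from files.
docs = [(1998, "doc a"), (2002, "doc b"), (2003, "doc c"), (2009, "doc d")]
periods = {"collection_1": (1996, 2002), "collection_2": (2003, 2009)}
collections = split_by_period(docs, periods)
```

Each collection can then be tokenized and explored separately, so that the two networks reflect only the texts from their own period.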

Figure 1: Collocational Network of Collection 1


Figure 2: Collocational Network of Collection 2

Something interesting did show up.  Figure 1 remains simple compared with the top figure, but Figure 2 is much more complicated.  Lots of things come together: bird flu, foot-and-mouth disease, mudslides...  Of course, they are not directly related to the word 'farmer'.  But Figure 2 does show the potential foci of the second collection.

Magnusson may be right.  The collocational network, as a form of text visualization, did show the differences between the two periods.  But does it work in the field of agriculture?  Maybe!  Maybe not!  I have not tested it.


If you are interested, the latest Concise can be found at SourceForge: https://sourceforge.net/projects/concise-text/files/
