Concise now has an alpha release. As promised, I have added several features, including keyword analysis and data export (in both text and Excel formats). I have also bundled two small Chinese tokenizers with Concise, which interface with the CKIP and Yahoo CAS services. I know this alpha version still has plenty of problems. It's an alpha, after all.
You can get Concise at http://code.google.com/p/concise-text/.
Features:
- Simple and Clear
- Works with different encodings (e.g. Big5, Big5-HKSCS, UTF-8), which is useful when dealing with Chinese texts
- Concordancer: keyword in context search
- Concordance Plotter: visualize keywords' distribution in the text
- Collocator: collocational analysis
- Cluster: cluster analysis
- Word Lister: displaying all types and tokens in the text
- Keyword Lister: keyword analysis
- Collocational Network data generator
- Some useful little tools (currently aimed at Chinese users)
  - CKIP Tokenizer *
  - Yahoo CAS Tokenizer **
  - Token Joiner
* You have to register for the CKIP service at http://ckipsvr.iis.sinica.edu.tw/ before using the CKIP Tokenizer.
** You need to get an appid from Yahoo! at http://tw.developer.yahoo.com/cas/ before using the Yahoo CAS Tokenizer.
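For readers new to concordancing, here is a rough idea of what a keyword-in-context search does. This is only a minimal Python sketch and not Concise's actual code. The function name `kwic`, the `window` parameter, and the sample sentence are all made up for illustration, and it assumes the text has already been split into tokens (for Chinese, that is the tokenizers' job).

```python
def kwic(tokens, keyword, window=3):
    """Return (left context, keyword, right context) for each hit.

    `tokens` is a list of already-tokenized words; `window` is the
    number of tokens kept on each side of the keyword.
    """
    rows = []
    for i, tok in enumerate(tokens):
        if tok == keyword:
            left = tokens[max(0, i - window):i]
            right = tokens[i + 1:i + 1 + window]
            rows.append((" ".join(left), tok, " ".join(right)))
    return rows


# Hypothetical sample text, split on whitespace.
tokens = "the cat sat on the mat and the dog sat on the rug".split()

for left, node, right in kwic(tokens, "sat", window=2):
    # Right-align the left context so the keywords line up in a column.
    print(f"{left:>12} | {node} | {right}")
```

Lining the keyword up in one column is what makes a concordance easy to scan. A real concordancer would also deal with case, punctuation, and sorting by context, all of which are left out here.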
Some screenshots:
Concordancer
Concordance Plotter
Collocator
Cluster
Word Lister
Keyword Lister