Indexing and Database Design
There are several options for indexing a corpus:
Lucene/Elasticsearch
CWB
Emdros
We decided to use CWB; the reasons are, in the end, linguistic ones.
"The IMS Open Corpus Workbench (CWB) is a collection of open-source tools for managing and querying large text corpora (ranging from 10 million to 2 billion words) with linguistic annotations. Its central component is the flexible and efficient query processor CQP."
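To make "linguistic annotations" concrete, here is a minimal sketch of how tokenized text might be prepared for CWB. It writes the verticalized text layout that CWB indexes: one token per line, annotation columns separated by tabs, and XML-style tags marking structural regions. The function name `to_vrt`, the `(word, pos, lemma)` column set, and the `text`/`s` tag names are assumptions made for illustration, not something this project has settled on.

```python
from typing import Iterable, List, Tuple
from xml.sax.saxutils import quoteattr

# Hypothetical annotation layout: (word form, part-of-speech tag, lemma).
Token = Tuple[str, str, str]


def to_vrt(sentences: Iterable[List[Token]], text_id: str) -> str:
    """Render tokenized sentences as verticalized text.

    Each token becomes one line of tab-separated attributes; sentences and
    the enclosing text are wrapped in XML-style structural tags.
    """
    lines = [f"<text id={quoteattr(text_id)}>"]
    for sentence in sentences:
        lines.append("<s>")
        for token in sentence:
            lines.append("\t".join(token))
        lines.append("</s>")
    lines.append("</text>")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    sample = [[("The", "DT", "the"), ("cats", "NNS", "cat"), ("sleep", "VBP", "sleep")]]
    print(to_vrt(sample, "demo"), end="")
```

Once a corpus in this shape has been encoded with these attribute names, a CQP query such as `[lemma = "cat" & pos = "NNS"];` can match tokens by their annotations rather than by surface form alone, which is the kind of linguistically motivated search that a plain full-text index does not offer directly.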