Commit Graph

4 Commits

Author SHA1 Message Date
Gea-Suan Lin
6247ed36cd Add a simple test case. 2024-02-16 20:43:26 +08:00
Gea-Suan Lin
ce79d2b245 Implement tokenize(). 2024-02-09 11:46:19 +08:00
Gea-Suan Lin
a5b6a3c7a1 Rewrite splitter.
Merge all English characters (like "apple", not "ap" "pp" "pl" "le"),
but keep splitting on Chinese words.
2024-02-09 11:25:26 +08:00
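
The splitter behavior described in the a5b6a3c7a1 message can be sketched roughly as below. This is a minimal Python illustration, not the repository's code: the name `split_text`, the regular expression, the CJK range used, and the choice of overlapping bigrams for Chinese text are all assumptions made for the example.

```python
import re

# Hypothetical pattern: a run of ASCII letters/digits, or a run of
# CJK unified ideographs (basic block only, for illustration).
_RUN = re.compile(r"[A-Za-z0-9]+|[\u4e00-\u9fff]+")


def split_text(text, n=2):
    """Keep English runs whole; split Chinese runs into overlapping n-grams."""
    tokens = []
    for match in _RUN.finditer(text):
        run = match.group(0)
        if run[0].isascii():
            # English word: emit as a single token ("apple", not "ap" "pp" ...).
            tokens.append(run)
        elif len(run) < n:
            # Chinese run shorter than n: emit it unchanged.
            tokens.append(run)
        else:
            # Chinese run: emit every overlapping window of n characters.
            tokens.extend(run[i:i + n] for i in range(len(run) - n + 1))
    return tokens
```

Under these assumptions, `split_text("apple 好吃的蘋果")` would keep "apple" as one token and split the Chinese run into two-character windows.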
Gea-Suan Lin
9e455bb15a Implement gram-related functions. 2024-01-31 09:43:04 +08:00