Author | Commit | Message | Date
Gea-Suan Lin | 4a1e2b9c5e | Fix test function naming and the actual test. | 2024-02-19 10:42:23 +08:00
Gea-Suan Lin | 14e54f2393 | Fix test function naming. | 2024-02-19 10:41:38 +08:00
Gea-Suan Lin | 2d1c6f161a | Check arguments. | 2024-02-18 22:00:43 +08:00
Gea-Suan Lin | 64a2507631 | Add link so that I can validate quickly. | 2024-02-18 21:45:29 +08:00
Gea-Suan Lin | de47a7fab3 | Import my own blog. | 2024-02-18 05:06:29 +08:00
Gea-Suan Lin | 3dcd171227 | Rename. | 2024-02-16 21:02:40 +08:00
Gea-Suan Lin | 1de46569e8 | Add a simple test case for tokenizer. | 2024-02-16 20:59:15 +08:00
Gea-Suan Lin | 57c153a6c3 | Add more test about bigram. | 2024-02-16 20:55:23 +08:00
Gea-Suan Lin | 55ad14e790 | Add test cases for bigram. | 2024-02-16 20:54:28 +08:00
Gea-Suan Lin | 2b1e514431 | Add "make test". | 2024-02-16 20:52:49 +08:00
Gea-Suan Lin | e4d66e501d | Use :: in GNUmakefile in general. | 2024-02-16 20:52:20 +08:00
Gea-Suan Lin | c2637af9ec | Add go.sum. | 2024-02-16 20:49:42 +08:00
Gea-Suan Lin | 8c3985c386 | Use testify. | 2024-02-16 20:49:00 +08:00
Gea-Suan Lin | 6247ed36cd | Add a simple test case. | 2024-02-16 20:43:26 +08:00
Gea-Suan Lin | 79b23c32f2 | Show article id only on score > 0. | 2024-02-12 05:21:17 +08:00
Gea-Suan Lin | 1e5b1dcf9a | Use os.Args[1] as query string, also lowercase all the time. | 2024-02-11 14:32:09 +08:00
Gea-Suan Lin | d6e1c1dbf5 | Implement query to tf-idf score. | 2024-02-09 15:16:47 +08:00
Gea-Suan Lin | 2d447ad45b | Change TF from [id][term] to [term][id]. | 2024-02-09 14:33:07 +08:00
Gea-Suan Lin | ade2049093 | Implement TF & DF in tf-idf. | 2024-02-09 14:20:08 +08:00
Gea-Suan Lin | 18fbfa7292 | Rename tokenize to tokenizer. | 2024-02-09 11:47:13 +08:00
Gea-Suan Lin | ce79d2b245 | Implement tokenize(). | 2024-02-09 11:46:19 +08:00
Gea-Suan Lin | b133273065 | Add more data. | 2024-02-09 11:29:49 +08:00
Gea-Suan Lin | f6469cc843 | Say it's experimental. | 2024-02-09 11:28:16 +08:00
Gea-Suan Lin | d5d696c8ad | Add license section. | 2024-02-09 11:27:21 +08:00
Gea-Suan Lin | a5b6a3c7a1 | Rewrite splitter. Merge all english characters (like "apple", not "ap" "pp" "pl" "le"), but keep splitting on Chinese words. | 2024-02-09 11:25:26 +08:00
Gea-Suan Lin | 28c1df566d | Implement the first part of tfidf. | 2024-01-31 09:43:30 +08:00
Gea-Suan Lin | 9e455bb15a | Implement gram-related functions. | 2024-01-31 09:43:04 +08:00
Gea-Suan Lin | 86bf78c762 | Read artifact. | 2024-01-31 09:42:42 +08:00
Gea-Suan Lin | 07ebff32f8 | Set internal/** as dependencies. | 2024-01-31 09:03:11 +08:00
Gea-Suan Lin | 043a95631b | Add one more entry including some English. | 2024-01-31 09:01:37 +08:00
Gea-Suan Lin | 093e3d65fd | Import article data. | 2024-01-29 05:12:30 +08:00
Gea-Suan Lin | f3da8c3be3 | Add LICENSE file. | 2024-01-29 00:55:28 +08:00
Gea-Suan Lin | a51f716a23 | Add skeleton of ir-bm25. | 2024-01-29 00:50:23 +08:00
Gea-Suan Lin | 6ee597dc7f | Add a skeleton of ir-tfidf and its related settings. | 2024-01-29 00:49:19 +08:00
Gea-Suan Lin | ee7ccd6887 | Run go mod init. | 2024-01-29 00:46:20 +08:00
Gea-Suan Lin | 6d8fb3d837 | Init. | 2024-01-29 00:46:01 +08:00