Commit Graph

  • d72fe86325 Use id instead of article.Id. (main) Gea-Suan Lin 2024-02-28 21:54:15 +08:00
  • 7290880a10 Add news-cna. Gea-Suan Lin 2024-02-28 15:37:00 +08:00
  • 12df5d0f5e Add news-cna. Gea-Suan Lin 2024-02-28 15:19:29 +08:00
  • 9a7d93867a Change to use wildcard. Gea-Suan Lin 2024-02-28 15:19:16 +08:00
  • 0104b2d396 Update license section. Gea-Suan Lin 2024-02-28 08:37:31 +08:00
  • 61793326a3 Implement BM25 version. Gea-Suan Lin 2024-02-28 06:16:56 +08:00
  • 53d0b162d1 Avoid getting article number repeatedly. Gea-Suan Lin 2024-02-28 06:06:38 +08:00
  • f6c595d0f0 Update. Gea-Suan Lin 2024-02-27 05:37:25 +08:00
  • 4a1e2b9c5e Fix test function naming and the actual test. Gea-Suan Lin 2024-02-19 10:42:23 +08:00
  • 14e54f2393 Fix test function naming. Gea-Suan Lin 2024-02-19 10:41:38 +08:00
  • 2d1c6f161a Check arguments. Gea-Suan Lin 2024-02-18 22:00:43 +08:00
  • 64a2507631 Add link so that I can validate quickly. Gea-Suan Lin 2024-02-18 21:45:29 +08:00
  • de47a7fab3 Import my own blog. Gea-Suan Lin 2024-02-18 04:59:55 +08:00
  • 3dcd171227 Rename. Gea-Suan Lin 2024-02-16 21:02:40 +08:00
  • 1de46569e8 Add a simple test case for tokenizer. Gea-Suan Lin 2024-02-16 20:59:15 +08:00
  • 57c153a6c3 Add more test about bigram. Gea-Suan Lin 2024-02-16 20:55:23 +08:00
  • 55ad14e790 Add test cases for bigram. Gea-Suan Lin 2024-02-16 20:54:28 +08:00
  • 2b1e514431 Add "make test". Gea-Suan Lin 2024-02-16 20:52:49 +08:00
  • e4d66e501d Use :: in GNUmakefile in general. Gea-Suan Lin 2024-02-16 20:52:20 +08:00
  • c2637af9ec Add go.sum. Gea-Suan Lin 2024-02-16 20:49:42 +08:00
  • 8c3985c386 Use testify. Gea-Suan Lin 2024-02-16 20:49:00 +08:00
  • 6247ed36cd Add a simple test case. Gea-Suan Lin 2024-02-16 20:43:26 +08:00
  • 79b23c32f2 Show article id only on score > 0. Gea-Suan Lin 2024-02-12 05:21:17 +08:00
  • 1e5b1dcf9a Use os.Args[1] as query string, also lowercase all the time. Gea-Suan Lin 2024-02-11 14:32:09 +08:00
  • d6e1c1dbf5 Implement query to tf-idf score. Gea-Suan Lin 2024-02-09 15:16:47 +08:00
  • 2d447ad45b Change TF from [id][term] to [term][id]. Gea-Suan Lin 2024-02-09 14:33:07 +08:00
  • ade2049093 Implement TF & DF in tf-idf. Gea-Suan Lin 2024-02-09 14:20:08 +08:00
  • 18fbfa7292 Rename tokenize to tokenizer. Gea-Suan Lin 2024-02-09 11:47:13 +08:00
  • ce79d2b245 Implement tokenize(). Gea-Suan Lin 2024-02-09 11:46:19 +08:00
  • b133273065 Add more data. Gea-Suan Lin 2024-02-09 11:29:49 +08:00
  • f6469cc843 Say it's experimental. Gea-Suan Lin 2024-02-09 11:28:16 +08:00
  • d5d696c8ad Add license section. Gea-Suan Lin 2024-02-09 11:27:21 +08:00
  • a5b6a3c7a1 Rewrite splitter. Gea-Suan Lin 2024-02-09 11:25:26 +08:00
  • 28c1df566d Implement the first part of tfidf. Gea-Suan Lin 2024-01-31 09:43:30 +08:00
  • 9e455bb15a Implement gram-related functions. Gea-Suan Lin 2024-01-31 09:43:04 +08:00
  • 86bf78c762 Read artifact. Gea-Suan Lin 2024-01-31 09:42:42 +08:00
  • 07ebff32f8 Set internal/** as dependencies. Gea-Suan Lin 2024-01-31 09:03:11 +08:00
  • 043a95631b Add one more entry including some English. Gea-Suan Lin 2024-01-31 09:01:37 +08:00
  • 093e3d65fd Import article data. Gea-Suan Lin 2024-01-29 05:12:30 +08:00
  • f3da8c3be3 Add LICENSE file. Gea-Suan Lin 2024-01-29 00:55:28 +08:00
  • a51f716a23 Add skeleton of ir-bm25. Gea-Suan Lin 2024-01-29 00:50:23 +08:00
  • 6ee597dc7f Add a skeleton of ir-tfidf and its related settings. Gea-Suan Lin 2024-01-29 00:49:19 +08:00
  • ee7ccd6887 Run go mod init. Gea-Suan Lin 2024-01-29 00:46:20 +08:00
  • 6d8fb3d837 Init. Gea-Suan Lin 2024-01-29 00:46:01 +08:00