- Published on: 2025-04-25 to joshleeb's blog
- Purpose of Kar: A testbed for developing Pinto's core idea and exploring technical approaches for critical components like search.
- Pinto Overview: A system for curating one's own subgraph of the web; early stages of figuring out specifics.
- Kar as a Command Line Tool: Helps assess Pinto's core idea and figure out details from product and technical perspectives.
Goals of Kar Program:
- Install, refresh, and remove documents with minimal info (only URL).
- Query documents with full-text search and filters.
Non-Goals:
- Kar will not explore or implement an RSS reader, provide a document reading list, or use document content for offline viewing.
Design Proposal:
- API inspired by Pacman with top-level operations: -S (sync), -Q (query), -R (remove), -V (version), -h (help).
- Data stored in `statedir` (default: `${XDG_STATE_HOME}/kar/`).
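A minimal sketch of how the top-level operations and the default state directory might be resolved. The function names and the long operation names are illustrative, not Kar's actual code. Only a bare operation flag is handled here, and the fallback to `~/.local/state` follows the XDG Base Directory Specification rather than anything stated in the proposal.

```python
import os
from pathlib import Path

# Top-level operations listed in the proposal; the long names are illustrative.
OPERATIONS = {
    "-S": "sync",
    "-Q": "query",
    "-R": "remove",
    "-V": "version",
    "-h": "help",
}


def parse_operation(argv):
    """Split argv into (operation, remaining args).

    Assumption: the first argument is exactly one operation flag.
    """
    if not argv or argv[0] not in OPERATIONS:
        raise SystemExit("usage: kar {-S|-Q|-R|-V|-h} [args...]")
    return OPERATIONS[argv[0]], argv[1:]


def default_statedir(env=None):
    """Return ${XDG_STATE_HOME}/kar, falling back to ~/.local/state/kar."""
    env = os.environ if env is None else env
    base = env.get("XDG_STATE_HOME")
    if not base:
        # XDG spec: an unset or empty XDG_STATE_HOME means $HOME/.local/state.
        home = env.get("HOME") or os.path.expanduser("~")
        base = os.path.join(home, ".local", "state")
    return Path(base) / "kar"
```

For example, `parse_operation(["-S", "https://example.com"])` yields the `sync` operation with the URL as its only remaining argument.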
Sync Operation:
- Install new documents by downloading content and updating database and index.
- Refresh existing documents to update content and search index.
- Validates URLs according to [URL Standard] and applies normalizations.
- Follows redirects based on specific conditions.
- Identifies duplicates based on normalized URLs and uses `ext_equiv_url` for extended equivalency.
- Can rebuild metadata database from downloaded content.
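The duplicate check above can be sketched as follows. The summary does not say which normalizations Kar applies, so the ones here are assumptions: lowercase scheme and host, drop the default port, drop the fragment and any userinfo, and treat an empty path as `/`. The sketch uses Python's `urllib.parse`, which is not a URL Standard parser, and it leaves `ext_equiv_url` out because its semantics are not described.

```python
from urllib.parse import urlsplit, urlunsplit

# Assumption: only http(s) documents are installable.
DEFAULT_PORTS = {"http": 80, "https": 443}


def normalize_url(url):
    """Return a canonical form of url under the assumed normalizations."""
    parts = urlsplit(url)
    scheme = parts.scheme.lower()
    if scheme not in DEFAULT_PORTS or not parts.hostname:
        raise ValueError(f"unsupported or invalid URL: {url!r}")

    host = parts.hostname  # urlsplit already lowercases the hostname
    if ":" in host:
        host = f"[{host}]"  # restore brackets around IPv6 literals

    # Drop the port when it is the scheme's default.
    port = parts.port
    netloc = host if port in (None, DEFAULT_PORTS[scheme]) else f"{host}:{port}"

    path = parts.path or "/"
    # The empty last component drops the fragment.
    return urlunsplit((scheme, netloc, path, parts.query, ""))


def is_duplicate(url, installed):
    """True if url normalizes to a URL already in the installed set."""
    return normalize_url(url) in installed
```

Under these assumptions, `HTTP://Example.COM:80/a#frag` and `http://example.com/a` count as the same document.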
Query Operation:
- Centers around full-text search of document content with additional filters and projections.
- Supports filters like `--url`, `--domain`, and `--limit`.
- Projections include `--only-url`.
- Query results are printed as static text.
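To illustrate how filters and projections could compose with a text search, here is a minimal in-memory sketch. It substitutes a naive case-insensitive substring match for a real full-text index. The `Document` type, the `query` function, and the exact-match semantics chosen for `--url` and `--domain` are all assumptions, not Kar's implementation.

```python
from dataclasses import dataclass
from urllib.parse import urlsplit


@dataclass
class Document:
    url: str
    content: str


def query(docs, terms, url=None, domain=None, limit=None, only_url=False):
    """Return printable result lines for documents matching every term."""
    results = []
    for doc in docs:
        # --limit: stop once enough results have been collected.
        if limit is not None and len(results) >= limit:
            break
        text = doc.content.lower()
        if not all(term.lower() in text for term in terms):
            continue
        # --url (assumed semantics): exact match on the stored URL.
        if url is not None and doc.url != url:
            continue
        # --domain (assumed semantics): exact match on the hostname.
        if domain is not None and urlsplit(doc.url).hostname != domain:
            continue
        # --only-url projection: print the URL without a content preview.
        if only_url:
            results.append(doc.url)
        else:
            results.append(f"{doc.url}\n  {doc.content[:60]}")
    return results
```

Each returned string is one static block of output, so the caller only has to print the list in order.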
Remove Operation:
- Removes documents from the graph by updating multiple locations in the state directory; only hard deletion is supported.
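A hard delete that touches several locations might look like the sketch below. The state directory layout is entirely hypothetical: a `content/` directory keyed by a hash of the URL, a `metadata.json` map, and an `index.json` map from terms to document keys. Kar's real layout is not described here.

```python
import hashlib
import json
from pathlib import Path


def doc_key(url):
    """Hypothetical storage key: SHA-256 hex digest of the URL."""
    return hashlib.sha256(url.encode("utf-8")).hexdigest()


def remove_document(statedir, url):
    """Hard-delete a document; return True if anything was removed."""
    statedir = Path(statedir)
    key = doc_key(url)
    removed = False

    # 1. Drop the metadata entry.
    meta_path = statedir / "metadata.json"
    if meta_path.exists():
        meta = json.loads(meta_path.read_text())
        if meta.pop(key, None) is not None:
            meta_path.write_text(json.dumps(meta))
            removed = True

    # 2. Delete the downloaded content.
    content_path = statedir / "content" / key
    if content_path.exists():
        content_path.unlink()
        removed = True

    # 3. Prune the document from every posting list in the index.
    index_path = statedir / "index.json"
    if index_path.exists():
        index = json.loads(index_path.read_text())
        pruned = {}
        for term, keys in index.items():
            kept = [k for k in keys if k != key]
            if kept:
                pruned[term] = kept
        if pruned != index:
            index_path.write_text(json.dumps(pruned))
            removed = True

    return removed
```

Because the three updates are not atomic in this sketch, a crash midway could leave the locations out of sync; a real implementation would need to decide how to recover from that.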
Appendix - Future Work:
- Extend support for more mime types like `text/plain` and `application/pdf`.
- Better handle rate limiting per domain.
- Improve search architecture for better query performance.
- Extend support for document display options like adding filters and projections.
- Explore a browser extension to install documents with the active page's URL.