Extracting recent Roam Research notes with Cursor

First, download the Python wrapper from the roam-backend-api SDK.

Save the file below as roamsdk.py in the same directory as your script:
https://github.com/Roam-Research/backend-sdks/blob/master/python/roam_client/client.py

Create an API token in Roam Research; read-only access is enough.
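Since the token grants access to your graph, one option is to keep it out of the script and read it from the environment instead. The following is a minimal sketch of that idea; the function name and the variable names ROAM_API_TOKEN and ROAM_GRAPH are assumptions made for this example, not something the SDK defines.

```python
import os


def load_roam_credentials(env=os.environ):
    """Return (token, graph) read from environment variables.

    ROAM_API_TOKEN and ROAM_GRAPH are hypothetical names chosen for this sketch.
    """
    token = env.get("ROAM_API_TOKEN")
    graph = env.get("ROAM_GRAPH")
    if not token or not graph:
        raise RuntimeError("Set ROAM_API_TOKEN and ROAM_GRAPH before running the script")
    return token, graph
```

The returned pair can then be passed to RoamBackendClient(token, graph) in place of the hard-coded strings.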

The script is below. It prints the contents of the 30 most recently edited daily notes pages.

from roamsdk import RoamBackendClient, q, pull, pull_many
import requests
from datetime import datetime
import re

# Initialize the RoamBackendClient
token = 'roam-graph-token-xxxxxxxx'
graph = 'your-graph'
client = RoamBackendClient(token, graph)

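# Proxy settings (only needed if you reach Roam through a local proxy)
proxies = {
    'http': 'http://127.0.0.1:1087',
    'https': 'http://127.0.0.1:1087'
}

# Create a new session; uncomment the next line to route it through the proxy
client.session = requests.Session()
# client.session.proxies = proxies

# Intended as a longer timeout. Note that requests.Session does not apply a
# session-wide timeout on its own, so this only helps if the SDK passes it along.
client.session.timeout = 30

# Set a custom User-Agent header
client.session.headers.update({
    'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/91.0.4472.124 Safari/537.36'
})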

# Datalog query: every page with its title, create time, edit time, and UID
query = '''
[:find ?title ?createtime ?edittime ?uid 
 :keys title createtime edittime uid
 :where
 [?e :node/title ?title]
 [?e :create/time ?createtime]
 [?e :edit/time ?edittime]
 [?e :block/uid ?uid]
 [(> ?edittime 0)]]
'''

try:
    # Run the query through the SDK's q function
    result = q(client, query)

    # print(result)

    # Daily notes pages have UIDs in MM-DD-YYYY form; keep only those pages
    date_pattern = re.compile(r'\d{2}-\d{2}-\d{4}')
    filtered_results = [r for r in result if date_pattern.fullmatch(r['uid'])]

    # Convert a millisecond timestamp to a readable date string
    def format_timestamp(timestamp):
        return datetime.fromtimestamp(timestamp/1000).strftime('%Y-%m-%d %H:%M:%S')

    # Sort by edit time, newest first, and keep the first 30 pages
    sorted_results = sorted(filtered_results, key=lambda x: x['edittime'], reverse=True)
    recent_pages = sorted_results[:30]

    # Format the timestamps for each page (used by the debug print below)
    for page in recent_pages:
        create_time = format_timestamp(page['createtime'])
        edit_time = format_timestamp(page['edittime'])
        # print(f"{page['uid']} Title: {page['title']}, Create: {create_time}, Edit: {edit_time}")


        query2 = '''
        [:find ?e-id
        :keys e-id
        :where
        [?e-id :block/uid "{}"]]
        '''.format(page['uid'])

        result2 = q(client, query2)
        eid = result2[0]['e-id']
        # print(eid)

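        def print_block_strings(block_data, level=0):
            text = ''
            # Emit this block's :block/string; top-level blocks (level 1) get no indent
            if ':block/string' in block_data:
                indent = "  " * (level - 1)  # two spaces per nesting level
                text = f"{indent}{block_data[':block/string']}\n"

            # Recurse into child blocks, sorted by :block/order because pull
            # may not return children in display order
            if ':block/children' in block_data:
                children = sorted(block_data[':block/children'],
                                  key=lambda c: c.get(':block/order', 0))
                for child in children:
                    text += print_block_strings(child, level + 1)
            return text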

        # Pull the page with up to seven levels of nested children, then print it
        result3 = pull(client=client, pattern='[* {:block/children [* {:block/children [* {:block/children [* {:block/children [* {:block/children [* {:block/children [* {:block/children [*]}]}]}]}]}]}]}]', eid=eid)
        markdown = print_block_strings(result3)
        if markdown:  # skip pages that have no block text
            print(markdown)

        
except requests.exceptions.ProxyError as e:
    print(f"Proxy error: {e}")
except requests.exceptions.RequestException as e:
    print(f"Request error: {e}")
except Exception as e:
    print(f"Other error: {e}")

The output looks like this:

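Roam's developer support is poor. The API itself is simple, but you need to know Datalog syntax as well as the data structures that people who write Roam JS plugins share among themselves.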
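One example of that syntax is the pull pattern, which has to nest one {:block/children [...]} map per level of the block tree. Writing it by hand makes it easy to miscount brackets, so here is a minimal sketch of generating it instead. The helper name nested_children_pattern is hypothetical and not part of the SDK.

```python
def nested_children_pattern(depth):
    """Build a pull pattern that follows :block/children `depth` levels deep."""
    if depth < 0:
        raise ValueError("depth must be non-negative")
    pattern = "[*]"
    # Wrap the pattern once per level, from the innermost level outward
    for _ in range(depth):
        pattern = "[* {:block/children " + pattern + "}]"
    return pattern


print(nested_children_pattern(1))
```

With depth=1 this yields [* {:block/children [*]}]; a larger depth gives the same shape as a deeply nested hand-written pattern, with brackets that are balanced by construction.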

You also have to process the returned data yourself.
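As one example of that processing, daily notes pages can be selected by calendar date rather than by edit time, because their UIDs encode the date as MM-DD-YYYY. The sketch below assumes each page is a dict with a 'uid' key, like the rows the query returns; both function names are hypothetical.

```python
from datetime import date, datetime, timedelta


def daily_note_date(uid):
    """Parse a daily-note UID such as '01-31-2025'; return None for other UIDs."""
    try:
        return datetime.strptime(uid, "%m-%d-%Y").date()
    except ValueError:
        return None


def pages_in_last_days(pages, days=30, today=None):
    """Keep daily notes pages dated within the last `days` days, newest first."""
    today = today or date.today()
    cutoff = today - timedelta(days=days)
    dated = [(daily_note_date(p["uid"]), p) for p in pages]
    recent = [(d, p) for d, p in dated if d is not None and cutoff < d <= today]
    recent.sort(key=lambda pair: pair[0], reverse=True)
    return [p for _, p in recent]
```

Passing today explicitly keeps the function easy to test; in a script it can be left at its default.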

Yet Cursor handled all of it automatically.

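Note: DeepSeek R1 and ChatGPT 4o both did poorly here; they seemed not to understand this niche syntax at all. claude-3.5-sonnet, however, worked surprisingly well. It still took some trial and error, but it understood the following problem and gave an accurate answer, which is quite impressive: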

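When trying AI-assisted programming against a niche language and API, I ran into plenty of trouble, especially since I did not understand the special syntax or data structures at all. Fortunately, my friend Steve (founder of 葫芦笔记) helped. He kept encouraging me throughout our conversations and gave me confidence, and in the end I patiently finished the script. The whole process felt like starting a fire by rubbing sticks together: I wanted to give up several times along the way, but I stuck with it.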

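What is extracting notes good for?
Feed them straight to GPT so it can work out what I have been focusing on lately, then use it as an information filter for RSS.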
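A rough sketch of what that hand-off could look like: the extracted notes and a list of RSS item titles are combined into one plain-text prompt. The function name, the prompt wording, and the character limit are all assumptions made for illustration, and the call to an actual model is deliberately left out.

```python
def build_filter_prompt(notes_markdown, feed_titles, max_note_chars=4000):
    """Assemble a plain-text prompt; sending it to a model is not part of this sketch."""
    # Truncate the notes so the prompt stays within a rough size budget
    notes = notes_markdown.strip()[:max_note_chars]
    numbered = "\n".join(f"{i}. {title}" for i, title in enumerate(feed_titles, start=1))
    return (
        "Here are my recent notes:\n"
        f"{notes}\n\n"
        "From the notes, infer what I have been focusing on lately. "
        "Then list the numbers of the RSS items below that match those interests.\n\n"
        f"{numbered}\n"
    )
```

The markdown printed by the extraction script can be collected into a single string and passed in as notes_markdown.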

To be continued.

Bruce
