
This tutorial loads the AWQ-quantized Qwen2.5-3B-Instruct with vLLM on an RTX 4090.

  • For each test question, we use the training data to retrieve a set of similar questions that "support" it.
    • Similarity considers things like the construct and the subject.
  • Using this set of similar questions, we build a conversation that can be fed to our model.
    • The conversation uses the recently supported chat() feature.
    • We generate n responses at a slightly higher temperature to create diverse outputs.
  • For each question/answer pair we now have n inferred misconceptions, and for each of them we retrieve the top 25 misconceptions by embedding similarity (this notebook loads nomic-embed-text-v1.5).
  • The 25 closest misconceptions for each of the n inferred misconceptions per question/answer pair can then be combined with Borda ranking, which is about the simplest possible form of ensembling.

Tutorial link: https://go.openbayes.com/suzcp
Cloud platform: OpenBayes
Sign-up link: http://openbayes.com/console/signup?r=sony_0m6v

Log in at http://OpenBayes.com. On the 「公共教程」 (Public Tutorials) page, select the tutorial 「使用 vLLM 加载大模型进行少样本学习」 (Few-Shot Learning with a Large Model Loaded via vLLM).


After the page opens, click 「克隆」 (Clone) in the upper-right corner to clone the tutorial into your own container.


Select 「NVIDIA GeForce RTX 4090」 and the 「vLLM」 image. OpenBayes has introduced new billing options; choose 「按量付费」 (pay-as-you-go) or a daily/weekly/monthly package as needed, then click 「继续执行」 (Continue). You can use the invitation link at the beginning of this article to get free RTX 4090 compute time!


Wait a moment while the system allocates resources. Once the status changes to 「运行中」 (Running), click 「打开工作空间」 (Open Workspace).


Inside the workspace, open the 「README.ipynb」 file in the directory on the left to see the tutorial's run steps.


The detailed run steps are as follows:

1. Import the required libraries
import os
import gc
import ctypes
import numpy as np
import pandas as pd

from random import sample
from tqdm.auto import tqdm
from eedi_metrics import mapk, apk
from scipy.spatial.distance import cdist
from sklearn.metrics.pairwise import cosine_similarity

import torch
from vllm import LLM, SamplingParams
from transformers import AutoTokenizer, AutoModel
os.environ["CUDA_VISIBLE_DEVICES"]   = "0"
os.environ["TOKENIZERS_PARALLELISM"] = "false"

def clean_memory(deep=False):
    gc.collect()                                # collect Python-level garbage
    if deep:
        ctypes.CDLL("libc.so.6").malloc_trim(0) # return freed heap pages to the OS (glibc only)
    torch.cuda.empty_cache()                    # release cached, unused CUDA memory

2. Load the data

k = 3                   # number of similar support questions to retrieve per row

train_eval = True       # evaluate on a sample of train.csv instead of the test set
n_train_eval_rows = 100

comp_dir = './eedi-mining-misconceptions-in-mathematics'

llm_model_pth   = '/input0/Qwen2.5-3B-Instruct-AWQ'
embed_model_pth = '/input0/nomic-embed-text-v1.5'

if os.getenv("KAGGLE_IS_COMPETITION_RERUN"):
    train_eval = False
if train_eval:
    test       = pd.read_csv(f'{comp_dir}/train.csv').sample(n_train_eval_rows, random_state=3)
    test       = test.sort_values(['QuestionId'], ascending=True).reset_index(drop=True)
else:
    test       = pd.read_csv(f'{comp_dir}/test.csv')

train          = pd.read_csv(f'{comp_dir}/train.csv')
sample_sub     = pd.read_csv(f'{comp_dir}/sample_submission.csv')
misconceptions = pd.read_csv(f'{comp_dir}/misconception_mapping.csv')

len(train), len(test), len(misconceptions)
(1869, 100, 2587)

3. Launch Qwen2.5-3B-Instruct-AWQ with vLLM

If you run into OOM errors, reducing max_num_seqs to 4 or 8, or even 1, may help (the default is 256).

llm = LLM(
    llm_model_pth,
    trust_remote_code=True,
    dtype="half", max_model_len=4096,
    tensor_parallel_size=1, gpu_memory_utilization=0.95, 
)

tokenizer = llm.get_tokenizer()

INFO 11-28 10:39:42 awq_marlin.py:97] The model is convertible to awq_marlin during runtime. Using awq_marlin kernel.
INFO 11-28 10:39:42 llm_engine.py:237] Initializing an LLM engine (v0.6.3.post1) with config: model='/input0/Qwen2.5-3B-Instruct-AWQ', speculative_config=None, tokenizer='/input0/Qwen2.5-3B-Instruct-AWQ', skip_tokenizer_init=False, tokenizer_mode=auto, revision=None, override_neuron_config=None, rope_scaling=None, rope_theta=None, tokenizer_revision=None, trust_remote_code=True, dtype=torch.float16, max_seq_len=4096, download_dir=None, load_format=LoadFormat.AUTO, tensor_parallel_size=1, pipeline_parallel_size=1, disable_custom_all_reduce=False, quantization=awq_marlin, enforce_eager=False, kv_cache_dtype=auto, quantization_param_path=None, device_config=cuda, decoding_config=DecodingConfig(guided_decoding_backend='outlines'), observability_config=ObservabilityConfig(otlp_traces_endpoint=None, collect_model_forward_time=False, collect_model_execute_time=False), seed=0, served_model_name=/input0/Qwen2.5-3B-Instruct-AWQ, num_scheduler_steps=1, chunked_prefill_enabled=False multi_step_stream_outputs=True, enable_prefix_caching=False, use_async_output_proc=True, use_cached_outputs=False, mm_processor_kwargs=None)
INFO 11-28 10:39:43 model_runner.py:1056] Starting to load model /input0/Qwen2.5-3B-Instruct-AWQ...


INFO 11-28 10:39:44 model_runner.py:1067] Loading model weights took 1.9550 GB
INFO 11-28 10:39:44 gpu_executor.py:122] # GPU blocks: 75545, # CPU blocks: 7281
INFO 11-28 10:39:44 gpu_executor.py:126] Maximum concurrency for 4096 tokens per request: 295.10x
INFO 11-28 10:39:46 model_runner.py:1395] Capturing the model for CUDA graphs. This may lead to unexpected consequences if the model is not static. To run the model in eager mode, set 'enforce_eager=True' or use '--enforce-eager' in the CLI.
INFO 11-28 10:39:46 model_runner.py:1399] CUDA graphs can take additional 1~3 GiB memory per GPU. If you are running out of memory, consider decreasing gpu_memory_utilization or enforcing eager mode. You can also reduce the max_num_seqs as needed to decrease memory usage.
INFO 11-28 10:39:59 model_runner.py:1523] Graph capturing finished in 13 secs.
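
If OOM does occur, a more conservative engine configuration might look like the following sketch (max_num_seqs and gpu_memory_utilization are standard vLLM engine arguments; the exact values here are illustrative, not tuned):

llm = LLM(
    llm_model_pth,
    trust_remote_code=True,
    dtype="half", max_model_len=4096,
    tensor_parallel_size=1,
    gpu_memory_utilization=0.90,  # reserve a little less memory for the KV cache
    max_num_seqs=8,               # default is 256; fewer concurrent sequences means less memory pressure
)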

4. Post-process the data

answer_cols         = ["AnswerAText", "AnswerBText", "AnswerCText", "AnswerDText"]
misconception_cols  = ["MisconceptionAId", "MisconceptionBId", "MisconceptionCId", "MisconceptionDId"]

keep_cols           = ["QuestionId", "CorrectAnswer", "ConstructName", "SubjectName", "QuestionText" ]

def wide_to_long(df: pd.DataFrame) -> pd.DataFrame:

    # Melt the answer columns
    answers_df = pd.melt(
        df[keep_cols + answer_cols],
        id_vars=keep_cols,
        var_name='Answer', value_name='Value'
    ).sort_values(["QuestionId", "Answer"]).reset_index(drop=True)

    if misconception_cols[0] not in df.columns:  # If test set
        return answers_df

    # Melt the misconception columns
    misconceptions_df = pd.melt(
        df[keep_cols + misconception_cols],
        id_vars=keep_cols,
        var_name='Misconception', value_name='MisconceptionId'
    ).sort_values(["QuestionId", "Misconception"]).reset_index(drop=True)

    # Rows align because both melts sort the A-D columns in the same order
    answers_df[['Misconception', 'MisconceptionId']] = misconceptions_df[['Misconception', 'MisconceptionId']]

    return answers_df

test  = wide_to_long(test)
train = wide_to_long(train)

test['AnswerId']  = test.Answer.str.replace('Answer', '').str.replace('Text', '')
train['AnswerId'] = train.Answer.str.replace('Answer', '').str.replace('Text', '')

train = pd.merge(train, misconceptions, on='MisconceptionId', how='left')
if train_eval:
    test = pd.merge(test, misconceptions, on='MisconceptionId', how='left')
train.head(3)
| | QuestionId | CorrectAnswer | ConstructName | SubjectName | QuestionText | Answer | Value | Misconception | MisconceptionId | AnswerId | MisconceptionName |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 0 | A | Use the order of operations to carry out calcu... | BIDMAS | [ 3 \times 2+4-5 ] Where do the brackets ... | AnswerAText | ( 3 \times(2+4)-5 ) | MisconceptionAId | NaN | A | NaN |
| 1 | 0 | A | Use the order of operations to carry out calcu... | BIDMAS | [ 3 \times 2+4-5 ] Where do the brackets ... | AnswerBText | ( 3 \times 2+(4-5) ) | MisconceptionBId | NaN | B | NaN |
| 2 | 0 | A | Use the order of operations to carry out calcu... | BIDMAS | [ 3 \times 2+4-5 ] Where do the brackets ... | AnswerCText | ( 3 \times(2+4-5) ) | MisconceptionCId | NaN | C | NaN |
test.head(3)
| | QuestionId | CorrectAnswer | ConstructName | SubjectName | QuestionText | Answer | Value | Misconception | MisconceptionId | AnswerId | MisconceptionName |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 31 | A | Convert between cm and m | Length Units | [450 \mathrm{~cm}=] [\square \mathrm{~m}] | AnswerAText | ( 4.5 ) | MisconceptionAId | NaN | A | NaN |
| 1 | 31 | A | Convert between cm and m | Length Units | [450 \mathrm{~cm}=] [\square \mathrm{~m}] | AnswerBText | ( 45 ) | MisconceptionBId | 704 | B | Thinks there are 10cm in a metre |
| 2 | 31 | A | Convert between cm and m | Length Units | [450 \mathrm{~cm}=] [\square \mathrm{~m}] | AnswerCText | ( 5 ) | MisconceptionCId | 1272 | C | Gives a rounded whole number instead of a decimal |

5. Helper functions

Get the most similar question_ids given a subject and construct

The function below first returns question IDs whose construct and subject both match. If that does not reach top_k questions, it falls back to questions with a matching subject or construct. If we are still short of question IDs, random questions are selected for the remaining top_k slots.

def get_topk_similar_rows(question_id: int, construct: str, subject: str, top_k: int) -> list[int]:
    """ Gets the top k ids of questions that are most similar to the given construct and subject """

    # Rows with similar construct and subject
    similar_cs_rows = train[(train.ConstructName == construct) & (train.SubjectName == subject)]
    similar_cs_qids = list(set(similar_cs_rows.QuestionId.values.tolist()))

    if train_eval and question_id in similar_cs_qids:
        similar_cs_qids.remove(question_id)

    if len(similar_cs_qids) >= top_k:
        k_similar_cs_qids = sample(similar_cs_qids, top_k)
        return k_similar_cs_qids

    # Rows with similar construct or subject for remainder of top_k
    similar_s_rows = train[(train.ConstructName != construct) & (train.SubjectName == subject)]
    similar_c_rows = train[(train.ConstructName == construct) & (train.SubjectName != subject)]
    similar_c_or_s_qids = list(set(similar_s_rows.QuestionId.values.tolist() + similar_c_rows.QuestionId.values.tolist()))

    if train_eval and question_id in similar_c_or_s_qids:
        similar_c_or_s_qids.remove(question_id)

    if len(similar_c_or_s_qids) >= top_k - len(similar_cs_qids):
        n_similar_c_or_s_qids = sample(similar_c_or_s_qids, top_k - len(similar_cs_qids))
        return similar_cs_qids + n_similar_c_or_s_qids

    # Random rows for remainder of top_k
    not_so_similar_rows = train[(train.ConstructName != construct) & (train.SubjectName != subject)]
    not_so_similar_rows_qids = list(set(not_so_similar_rows.QuestionId.values.tolist()))

    if train_eval and question_id in not_so_similar_rows_qids:
        not_so_similar_rows_qids.remove(question_id)

    # Keep everything collected so far and top up with random questions
    n_not_so_similar_rows_qids = sample(not_so_similar_rows_qids, top_k - len(similar_cs_qids) - len(similar_c_or_s_qids))
    return similar_cs_qids + similar_c_or_s_qids + n_not_so_similar_rows_qids
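
For example, an illustrative call against the frames defined above (the returned IDs depend on your sampled split):

row = test.iloc[0]
support_qids = get_topk_similar_rows(row['QuestionId'], row['ConstructName'], row['SubjectName'], top_k=k)
print(support_qids)  # e.g. a list of k question ids such as [691, 1119, 1774]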

Get the chat conversation for each question

def get_conversation_msgs(question, correct_ans, incorrect_ans, misconception):
    msgs = [
        {'role': 'user',      'content': 'Question: ' + question.strip()},
        {'role': 'assistant', 'content': 'Provide me with the correct answer for a baseline.'},
        {'role': 'user',      'content': 'Correct Answer: ' + correct_ans.strip()},
        {'role': 'assistant', 'content': 'Now provide the incorrect answer and I will analyze the difference to infer the misconception.'},
        {'role': 'user',      'content': 'Incorrect Answer: ' + incorrect_ans.strip()},
    ]
    
    if misconception is not None:
        msgs += [{'role': 'assistant', 'content': 'Misconception for incorrect answer: ' + misconception}]
        
    return msgs
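
To inspect the prompt the model will actually see, you can render a conversation with the tokenizer's chat template (an illustrative snippet: the toy question is made up, and add_generation_prompt=True appends the assistant header so the model completes the missing misconception):

msgs = get_conversation_msgs(
    question = 'What is 3 + 4 x 2?',  # hypothetical toy question
    correct_ans = '11',
    incorrect_ans = '14',
    misconception = None,             # None: this is the pair whose misconception we want inferred
)
print(tokenizer.apply_chat_template(msgs, tokenize=False, add_generation_prompt=True))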

6. Use llm.chat

Note: llm.chat() is fairly new and only available in recent vLLM versions.
We generate n outputs with a higher temperature to create diverse candidate misconceptions, which we can later use to rank the results.

sampling_params = SamplingParams(
    n=10,                     # Number of output sequences to return for each prompt
    # top_p=0.5,              # Cumulative probability of the top tokens to consider
    temperature=0.7,          # Randomness of the sampling
    seed=1,                   # Seed for reproducibility
    skip_special_tokens=True, # Whether to skip special tokens in the output
    max_tokens=64,            # Maximum number of tokens to generate per output sequence
    stop=['\n\n', '. '],      # Strings that stop the generation when they are generated
)
submission = []
for idx, row in tqdm(test.iterrows(), total=len(test)):
    
    if idx % 50 == 0:
        clean_memory()
        clean_memory()
    
    if row['CorrectAnswer'] == row['AnswerId']: continue          # correct answers carry no misconception
    if train_eval and not row['MisconceptionId'] >= 0: continue   # skip rows without a labelled misconception (NaN)
        
    context_qids   = get_topk_similar_rows(row['QuestionId'], row['ConstructName'], row['SubjectName'], k)
    correct_answer = test[(test.QuestionId == row['QuestionId']) & (test.CorrectAnswer == test.AnswerId)].Value.tolist()[0]
    
    messages = []
    for qid in context_qids:
        correct_option = train[(train.QuestionId == qid) & (train.CorrectAnswer == train.AnswerId)]
        incorrect_options = train[(train.QuestionId == qid) & (train.CorrectAnswer != train.AnswerId)]
        for _, incorrect_option in incorrect_options.iterrows():  # '_' avoids shadowing the outer loop's idx
            if type(incorrect_option['MisconceptionName']) == str: # Filter out NaNs
                messages += get_conversation_msgs(
                    question = correct_option.QuestionText.tolist()[0],
                    correct_ans = correct_option.Value.tolist()[0],
                    incorrect_ans = incorrect_option['Value'],
                    misconception = incorrect_option['MisconceptionName'],
                )
                
    # Conversation for the incorrect answer whose misconception we want to infer
    messages += get_conversation_msgs(
        question = row['QuestionText'],
        correct_ans = correct_answer,
        incorrect_ans = row['Value'],
        misconception = None,
    )
    
    output = llm.chat(messages, sampling_params, use_tqdm=False)
    inferred_misconceptions = [imc.text.split(':')[-1].strip() for imc in output[0].outputs]  # keep text after any 'Misconception ...:' prefix
    if not train_eval:
        submission.append([f"{row['QuestionId']}_{row['AnswerId']}", inferred_misconceptions])
    else:
        submission.append([
            f"{row['QuestionId']}_{row['AnswerId']}", 
            inferred_misconceptions, 
            context_qids,
            [int(row['MisconceptionId'])],
            row['MisconceptionName']
        ])
submission = pd.DataFrame(submission, columns=['QuestionId_Answer', 'InferredMisconception', 'TopKQuestionIDs', 
                                               'MisconceptionIdGT', 'MisconceptionNameGT'][:len(submission[0])])

len(submission)
227
submission.head()
| | QuestionId_Answer | InferredMisconception | TopKQuestionIDs | MisconceptionIdGT | MisconceptionNameGT |
|---|---|---|---|---|---|
| 0 | 31_B | [Incorrectly divided by 100 (or multiplied by ... | [691, 1119, 1774] | [704] | Thinks there are 10cm in a metre |
| 1 | 31_C | [Incorrectly divided by 100 (or used the wrong... | [691, 1119, 1774] | [1272] | Gives a rounded whole number instead of a decimal |
| 2 | 31_D | [Multiplied when converting to a larger unit, ... | [691, 1119, 257] | [1651] | Multiplies when converting to a larger unit |
| 3 | 61_D | [Not realizing that the star is halfway betwee... | [457, 1587, 696] | [990] | Does not realise you can use equivalent fracti... |
| 4 | 65_B | [Believes the value under the square root (the... | [1196, 807, 509] | [2316] | Mixes up squaring and multiplying by 2 or doub... |

7. Find the most similar misconceptions

Delete the LLM and clean up memory so the embedding model can be loaded

del llm

clean_memory(deep=True)
clean_memory(deep=True)
tokenizer   = AutoTokenizer.from_pretrained(embed_model_pth, trust_remote_code=True)
embed_model = AutoModel.from_pretrained(embed_model_pth, trust_remote_code=True).to("cuda:0")
<All keys matched successfully>
def generate_embeddings(texts, batch_size=8):
    all_embeddings = []
    for i in range(0, len(texts), batch_size):
        batch_texts = texts[i:i+batch_size]
        inputs = tokenizer(batch_texts, padding=True, truncation=True, return_tensors="pt", max_length=1024).to('cuda:0')
        with torch.no_grad():
            outputs = embed_model(**inputs)
        embeddings = outputs.last_hidden_state[:, 0, :]  # CLS token
        embeddings = torch.nn.functional.normalize(embeddings, p=2, dim=1)
        all_embeddings.append(embeddings.cpu().numpy())
        
    return np.concatenate(all_embeddings, axis=0)
all_ctx_vector  = generate_embeddings(list(misconceptions.MisconceptionName.values))

all_ctx_vector.shape
(2587, 768)
n_results = []

for results in tqdm(pd.DataFrame(submission.InferredMisconception.values.tolist()).T.values):
    all_text_vector = generate_embeddings(list(results))
    cosine_similarities = cosine_similarity(all_text_vector, all_ctx_vector)
    test_sorted_indices = np.argsort(-cosine_similarities, axis=1)
    n_results.append(test_sorted_indices)

n_results = np.array(n_results)
n_results.shape
(10, 227, 2587)
n_results = np.transpose(n_results, (1, 0, 2))
n_results.shape
(227, 10, 2587)

Combine the rankings from each generated output for every question

Borda count is a very simple ranking mechanism: in each of the n rankings, the item in position i (0-based) earns num_elements - i points, and items are re-ranked by their total points across all rankings.

def borda_count(rankings):
    scores = {}
    num_elements = len(next(iter(rankings)))
    
    for model_ranking in rankings:
        for idx, item in enumerate(model_ranking):
            points = num_elements - idx
            scores[item] = scores.get(item, 0) + points
            
    # Sort the misconceptions based on total points
    final_ranking = sorted(scores.items(), key=lambda x: x[1], reverse=True)
    ranked_results = [r for r, score in final_ranking]
    return ranked_results

# Compute the final ranking
final_rankings = np.array([borda_count(result) for result in n_results])

final_rankings.shape
(227, 2587)
submission['MisconceptionId'] = final_rankings[:, :25].tolist()
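
To make the Borda mechanics concrete, here is a toy run of borda_count (made-up IDs, not competition data):

rankings = [
    [10, 20, 30],  # generation 1: 10 earns 3 points, 20 earns 2, 30 earns 1
    [20, 10, 30],  # generation 2: 20 earns 3 points, 10 earns 2, 30 earns 1
]
print(borda_count(rankings))  # [10, 20, 30] -- 10 and 20 tie on 5 points, 30 trails with 2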

8. Submission

if train_eval:
    submission['apk@25'] = submission.apply(lambda row: apk(row['MisconceptionIdGT'], row['MisconceptionId']), axis=1)
    submission.to_csv('submission_debug.csv', index=False)
    
    print(submission['apk@25'].mean())

0.1415299510916358
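
Here apk is average precision at k, matching the competition's MAP@25 metric. For reference, a minimal sketch of the standard definition (the imported eedi_metrics implementation may differ in details):

def apk_sketch(actual, predicted, k=25):
    """Average precision at k for a single row (standard definition)."""
    predicted = predicted[:k]
    score, hits = 0.0, 0
    for i, p in enumerate(predicted):
        if p in actual and p not in predicted[:i]:  # count each relevant id once
            hits += 1
            score += hits / (i + 1)                 # precision at cut-off i+1
    return score / min(len(actual), k) if actual else 0.0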

submission["MisconceptionId"] = submission["MisconceptionId"].apply(lambda x: ' '.join(map(str, x)))
submission[['QuestionId_Answer', 'MisconceptionId']].to_csv('submission.csv', index=False)
submission.head(25)
| | QuestionId_Answer | InferredMisconception | TopKQuestionIDs | MisconceptionIdGT | MisconceptionNameGT | MisconceptionId | apk@25 |
|---|---|---|---|---|---|---|---|
| 0 | 31_B | [Multiplies by 100 instead of dividing by 100,... | [691, 1119, 1774] | [704] | Thinks there are 10cm in a metre | 2187 1035 2350 1579 2335 2408 2481 752 1408 33... | 0 |
| 1 | 31_C | [Believes there are 100 cm in a metre, Assumes... | [691, 1119, 1774] | [1272] | Gives a rounded whole number instead of a decimal | 613 447 1801 1151 1795 2408 752 2187 1579 566 ... | 0 |
| 2 | 31_D | [Multiplies by 100 instead of dividing by 100,... | [691, 1119, 257] | [1651] | Multiplies when converting to a larger unit | 1341 39 61 2187 2335 1035 2481 975 2134 2350 2... | 0 |
| 3 | 61_D | [Does not recognize that the star is halfway b... | [457, 1587, 696] | [990] | Does not realise you can use equivalent fracti... | 1212 2134 1119 916 1184 684 1309 1807 579 1206... | 0 |
| 4 | 65_B | [Believes the coefficient of ( h ) in the eq... | [1196, 807, 509] | [2316] | Mixes up squaring and multiplying by 2 or doub... | 1743 2372 341 2070 1904 2256 540 2324 1390 116... | 0 |
| 5 | 65_C | [Does not correctly identify the value under t... | [1196, 807, 634] | [2245] | When using the formula to solve a quadratic eq... | 170 340 1735 2245 3 2256 341 265 994 245 1987 ... | 0.25 |
| 6 | 69_A | [Assumes that the sample size is solely respon... | [830, 1606, 1700] | [906] | Does not know that sample size affects reliabi... | 2325 880 1923 1600 63 2065 453 2207 163 2299 4... | 0 |
| 7 | 69_C | [Assumes the sample sizes are equal or that th... | [622, 977, 734] | [906] | Does not know that sample size affects reliabi... | 1923 2065 63 2325 880 1600 906 1225 724 2309 1... | 0.142857 |
| 8 | 69_D | [Assumes reliability is independent of sample ... | [1195, 1827, 1860] | [906] | Does not know that sample size affects reliabi... | 2325 906 1681 1923 2561 880 2065 1912 2207 453... | 0.5 |
| 9 | 70_A | [The student might have added the percentage v... | [59, 1507, 548] | [2023] | Thinks when finding a percentage you divide by... | 388 2276 2408 329 1601 2138 1955 2191 403 2518... | 0 |
| 10 | 81_A | [Orders the numbers from smallest to largest b... | [1834, 1169, 473] | [1468] | Orders integers based on first digits without ... | 2546 561 399 1999 1941 2540 1672 1016 22 1119 ... | 0 |
| 11 | 81_C | [Orders the numbers incorrectly by not conside... | [657, 714, 480] | [1365] | When ordering integers, orders from the digits... | 1365 561 1999 2262 1124 388 1941 1378 1672 251... | 1 |
| 12 | 83_B | [Rounds up to the next significant figure inst... | [920, 1080, 1059] | [1988] | Rounds up instead of down | 1165 1105 794 1591 1157 1705 2116 1988 1817 14... | 0.125 |
| 13 | 83_C | [Rounds to a degree of accuracy that is not ne... | [920, 1059, 1080] | [1744] | Rounded to nearest 100 instead of 1sf | 1529 739 1591 1165 2392 1105 1157 1817 1705 20... | 0 |
| 14 | 85_A | [The correct answer for the first term of the ... | [89, 1029, 437] | [1240] | Thinks the first term of a sequence must be 1 | 1240 108 2475 2472 1354 2376 2139 1716 936 162... | 1 |
| 15 | 85_B | [Assumes the first term is the coefficient of ... | [89, 1029, 456] | [2376] | When finding the nth term of a linear sequence... | 2252 2475 2139 2513 1821 528 2376 849 1240 109... | 0.142857 |
| 16 | 103_A | [Multiplied the slanted height by the length t... | [353, 1538, 1161] | [867] | When finding the area of a parallelogram does ... | 2332 1788 1883 1985 669 2105 307 700 1175 590 ... | 0 |
| 17 | 103_B | [Multiplied the slanted height by the length i... | [1161, 991, 353] | [669] | Has used slant height and base to find area of... | 2332 1788 669 2105 590 1883 1985 1698 342 1926... | 0.333333 |
| 18 | 103_D | [Uses the slanted height instead of the perpen... | [991, 1538, 1161] | [695] | Has found the area of a triangle rather than a... | 2332 669 1788 2105 1883 459 590 1780 2300 396 ... | 0 |
| 19 | 112_A | [Confuses division with subtraction when think... | [258, 1777, 580] | [2093] | Thinks the fraction bar means subtract rather ... | 1672 1941 15 752 566 1971 493 481 2134 357 240... | 0.043478 |
| 20 | 112_C | [Believes the calculation is simply the subtra... | [1281, 1162, 1131] | [2093] | Thinks the fraction bar means subtract rather ... | 752 566 848 1795 1431 1088 1297 1482 2512 477 ... | 0 |
| 21 | 112_D | [Division by a whole number does not equate to... | [759, 1457, 1257] | [1542] | Believes that a fraction means dividing the de... | 58 1042 812 2559 151 839 232 2525 371 1619 200... | 0 |
| 22 | 140_A | [When factorising a quadratic without a non-va... | [847, 1291, 1057] | [838] | When factorising a quadratic without a non var... | 2240 838 2581 2479 1432 265 2142 2068 1871 102... | 0.5 |
| 23 | 140_B | [When factorising the expression ( p^2 - 99p ... | [680, 200, 455] | [838] | When factorising a quadratic without a non var... | 2240 628 2581 2479 838 319 1666 1432 320 2142 ... | 0.2 |
| 24 | 146_A | [Assumes the fraction is represented as a deci... | [47, 1690, 818] | [1637] | Has used the decimal point to separate numerat... | 1825 78 257 1166 2406 72 318 1759 169 1478 157... | 0 |
