Configuration Parsing Warning:In adapter_config.json: "peft.task_type" must be a string

Semantic-Perinucleus-v1

TinyLoRA adapter on meta-llama/Llama-3.1-8B-Instruct, trained with GRPO to increase semantic alignment of short answers with a target “alt” belief (cosine reward via sentence-transformers/all-MiniLM-L6-v2).

Training

Method: Group Relative Policy Optimization (TRL GRPOTrainer)
Adapter: TinyLoRA — 13 trainable parameters (u=13, weight_tying=1.0, r=2, targets q_proj, v_proj)
Reward: Cosine similarity between each sampled completion and the target Answer_Alt string
Base model: Llama 3.1 8B Instruct (frozen; only adapter weights trained)

Usage

import torch
from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer

base = "meta-llama/Llama-3.1-8B-Instruct"
adapter = "<YOUR_HF_USERNAME>/Semantic-Perinucleus-v1"

tokenizer = AutoTokenizer.from_pretrained(base)
model = AutoModelForCausalLM.from_pretrained(
    base, torch_dtype=torch.bfloat16, device_map="auto"
)
model = PeftModel.from_pretrained(model, adapter)

You need a HF Mirror token with access to the Llama 3.1 gated model.

Limitations

Extremely small adapter; effects on downstream answers are subtle. Evaluate on your task before relying on it.
Intended for research on semantic / preference nudges, not factual guarantees.

Citation

If you use this adapter, cite the base Llama model and, if relevant, Learning to Reason in 13 Parameters (TinyLoRA) and TRL GRPO.

Downloads last month: 3

Inference Providers NEW

This model isn't deployed by any Inference Provider. 🙋 Ask for provider support

Model tree for zjianyi/Semantic-Perinucleus-v1

Base model

meta-llama/Llama-3.1-8B

Finetuned

meta-llama/Llama-3.1-8B-Instruct

Adapter

(2407)

this model

Paper for zjianyi/Semantic-Perinucleus-v1

Learning to Reason in 13 Parameters

Paper • 2602.04118 • Published Feb 4 • 6