---
base_model: meta-llama/Llama-3.1-8B-Instruct
library_name: peft
tags:
- base_model:adapter:meta-llama/Llama-3.1-8B-Instruct
- grpo
- tinylora
- belief-shift
- trl
- transformers
license: llama3.1
---

# Semantic-Perinucleus-v1

**TinyLoRA** adapter on [`meta-llama/Llama-3.1-8B-Instruct`](https://huggingface.co/meta-llama/Llama-3.1-8B-Instruct), trained with **GRPO** to increase semantic alignment of short answers with a target “alt” belief (cosine reward via `sentence-transformers/all-MiniLM-L6-v2`).

## Training

- **Method:** Group Relative Policy Optimization ([TRL `GRPOTrainer`](https://huggingface.co/docs/trl/main/en/grpo_trainer))
- **Adapter:** [TinyLoRA](https://huggingface.co/docs/peft/main/en/package_reference/tinylora) — 13 trainable parameters (`u=13`, `weight_tying=1.0`, `r=2`, targets `q_proj`, `v_proj`)
- **Reward:** Cosine similarity between each sampled completion and the target `Answer_Alt` string
- **Base model:** Llama 3.1 8B Instruct (frozen; only adapter weights trained)

## Usage

```python
import torch
from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer

base = "meta-llama/Llama-3.1-8B-Instruct"
adapter = "<YOUR_HF_USERNAME>/Semantic-Perinucleus-v1"

tokenizer = AutoTokenizer.from_pretrained(base)
model = AutoModelForCausalLM.from_pretrained(
    base, torch_dtype=torch.bfloat16, device_map="auto"
)
model = PeftModel.from_pretrained(model, adapter)
```

You need a HF Mirror token with access to the Llama 3.1 gated model.

## Limitations

- Extremely small adapter; effects on downstream answers are subtle. Evaluate on your task before relying on it.
- Intended for research on semantic / preference nudges, not factual guarantees.

## Citation

If you use this adapter, cite the base Llama model and, if relevant, [Learning to Reason in 13 Parameters](https://arxiv.org/abs/2602.04118) (TinyLoRA) and TRL GRPO.