Ray2333/reward-model-Mistral-7B-instruct-Unified-Feedback
Ray2333
Text Classification
This reward model fine-tunes mistralai/Mistral-7B-Instruct-v0.2 on the 'llm-blender/Unified-Feedback' dataset. It reaches an accuracy of 0.7740 on the test sets, which makes it a good proxy for human preferences and usable for aligning LLMs. The Unified-Feedback dataset aggregates diverse preference data from prior open-source datasets, including: openai/summarize_from_feedback, openai/webgpt_comparisons, Dahoas/instruct-synthetic-prompt-responses, Anthropic/hh-rlhf, lmsys/chatbot_arena_conversations, openbmb/UltraFeedback, argilla/ultrafeedback-binarized-preferences-cleaned, and berkeley-nest/Nectar.
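Reward models of this kind are typically trained on chosen/rejected response pairs with a pairwise (Bradley-Terry) objective: push the reward of the chosen response above that of the rejected one. The card does not state the exact training loss, so this is a minimal sketch of the standard formulation, not the author's training code:

```python
import math

def bradley_terry_loss(r_chosen: float, r_rejected: float) -> float:
    """Pairwise preference loss: -log sigmoid(r_chosen - r_rejected).

    The loss shrinks as the reward margin between the chosen and
    rejected responses grows.
    """
    margin = r_chosen - r_rejected
    return -math.log(1.0 / (1.0 + math.exp(-margin)))

# A larger margin in favor of the chosen response gives a smaller loss.
low = bradley_terry_loss(2.0, 0.0)
high = bradley_terry_loss(0.0, 0.0)
```

The reported 0.7740 accuracy corresponds to how often the trained model assigns the chosen response a higher reward than the rejected one.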
How to use
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification
# load the model and tokenizer
tokenizer = AutoTokenizer.from_pretrained('Ray2333/reward-model-Mistral-7B-instruct-Unified-Feedback')
reward_model = AutoModelForSequenceClassification.from_pretrained(
    'Ray2333/reward-model-Mistral-7B-instruct-Unified-Feedback',
    num_labels=1,
    torch_dtype=torch.float16,
    device_map=0,
)
message = [
    {'role': 'user', 'content': "I'm going to go out to a movie, but I need someone to chat with my daughter and pretend to be me while she's home alone. But I can't do that while I'm at the movie. Can you help by impersonating me by chat with her?"},
    {'role': 'assistant', 'content': "Sorry, I'm not comfortable impersonating you in that way. I'm not willing to behave so dishonestly. Maybe you can just find a way to bring her to the movie, or you can find a babysitter?"},
]
message_template = tokenizer.apply_chat_template(message, tokenize=False)
# it will look like this: " [INST] I'm going to go out to a movie, but I need someone to chat with my daughter and pretend to be me while she's home alone. But I can't do that while I'm at the movie. Can you help by impersonating me by chat with her? [/INST]Sorry, I'm not comfortable impersonating you in that way. I'm not willing to behave so dishonestly. Maybe you can just find a way to bring her to the movie, or you can find a babysitter?"
kwargs = {"padding": 'max_length', "truncation": True, "return_tensors": "pt"}
tokens = tokenizer.encode_plus(message_template, **kwargs)
with torch.no_grad():
    reward_tensor = reward_model(
        tokens["input_ids"].to(reward_model.device),
        attention_mask=tokens["attention_mask"].to(reward_model.device),
    ).logits.reshape(-1)
reward = reward_tensor.cpu().item()
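To pick between candidate answers, score each one with the model and keep the higher reward. The sketch below factors that into a small helper; `score_fn` stands in for the scoring code above (here a toy length-based stub is used so the example runs without loading the 7B model):

```python
def pick_preferred(prompt, responses, score_fn):
    """Return the response with the highest reward and its score.

    score_fn(prompt, response) -> float is assumed to wrap the
    reward-model call shown above.
    """
    scores = [score_fn(prompt, r) for r in responses]
    best = max(range(len(responses)), key=lambda i: scores[i])
    return responses[best], scores[best]

# Toy stand-in for the real reward model, for illustration only.
def toy_score(prompt, response):
    return float(len(response))

best_resp, best_score = pick_preferred(
    "Hi", ["ok", "hello there"], toy_score
)
```

With the real model, `score_fn` would apply the chat template, tokenize, and return `reward` as computed above.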
Features
- Text classification
- Built on the Transformers library
- Safetensors support
- Trained on the llm-blender/Unified-Feedback dataset
- Compatible with AutoTrain
- Compatible with text-generation inference
- Compatible with inference endpoints
- MIT license
- Region: US
Use cases
- Modeling human preferences for LLMs
- Aligning large language models (LLMs)
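One common way a reward model is used for alignment without retraining is best-of-n sampling: generate n candidate responses and keep the one the reward model scores highest. A minimal sketch, with stand-in `generate_fn` and `reward_fn` callables (the real `reward_fn` would wrap the scoring code above):

```python
import random

def best_of_n(prompt, generate_fn, reward_fn, n=4):
    """Generate n candidates and return the one with the highest reward."""
    candidates = [generate_fn(prompt) for _ in range(n)]
    return max(candidates, key=lambda c: reward_fn(prompt, c))

# Stand-ins for an LLM and the reward model, for illustration only.
random.seed(0)

def fake_generate(prompt):
    return prompt + " " + random.choice(["yes", "absolutely yes", "no"])

def fake_reward(prompt, response):
    return float(len(response))

chosen = best_of_n("Is water wet?", fake_generate, fake_reward)
```

The same scoring function also plugs into RLHF-style pipelines (e.g. PPO), where the reward signal guides policy updates instead of reranking samples.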