Improved Value Alignment in Large Language Models Using Variational Best-of-N Techniques

Abstract

Large language models have demonstrated strong capabilities in generating human-like text and performing complex language tasks, yet they face significant value-alignment challenges in preventing the generation of harmful or biased content. This work integrates the Variational Best-of-N technique into the Llama model, enhancing its ability to generate ethically aligned content by evaluating multiple candidate outputs and selecting the most appropriate one against predefined ethical criteria. The research involved modifying Llama's core architecture, introducing additional layers for variational inference, and implementing a scoring mechanism to evaluate ethical alignment. Comprehensive preprocessing, balanced training data, and rigorous fine-tuning were employed to optimize the model's performance, yielding significant improvements in coherence, relevance, and adherence to ethical standards. The modified model was evaluated using perplexity, BLEU, ROUGE, and a custom ethicality score, and the results were compared with baseline models such as GPT-3 and BERT; statistical analyses confirmed that the observed improvements were significant. The findings demonstrate the effectiveness of the proposed modifications and their potential to enhance the ethical alignment of language models, contributing to the development of more trustworthy and reliable AI systems and setting a precedent for future innovations in ethical AI that serve the broader good of society.
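At its core, Best-of-N selection samples several candidate completions and keeps the one that scores highest under the alignment criterion. The sketch below illustrates only that selection loop; the `generate` and `score` callables are hypothetical stand-ins for the paper's Llama sampler and custom ethicality scorer, and the variational component (which amortizes this search into the model itself) is not shown.

```python
import random
from typing import Callable, List

def best_of_n(
    prompt: str,
    generate: Callable[[str], str],  # draws one sampled continuation for the prompt
    score: Callable[[str], float],   # higher = better alignment with the ethical criteria
    n: int = 8,
) -> str:
    """Plain Best-of-N: sample n candidates and keep the highest-scoring one."""
    candidates: List[str] = [generate(prompt) for _ in range(n)]
    return max(candidates, key=score)

# Toy usage with stand-in generator and scorer. The paper's Llama-based
# sampler and learned ethicality scorer are assumptions not reproduced here.
if __name__ == "__main__":
    canned_replies = [
        "I can't help with that request.",
        "Here is some harmful advice...",
        "Let me explain a safe alternative.",
    ]
    generate = lambda prompt: random.choice(canned_replies)
    score = lambda text: -1.0 if "harmful" in text else 1.0
    print(best_of_n("How do I bypass a lock?", generate, score, n=4))
```

A learned reward model would typically replace the keyword heuristic above; the keep-the-argmax structure of the loop is unchanged either way.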
