Harald Klinke
@HxxxKxxx@det.social · 2 days ago

pairwiseLLM v1.1.0 offers a unified framework for generating and analyzing pairwise comparisons of writing quality with LLMs. It supports live and batch workflows across multiple providers and models, and derives quality scores from the results using Bradley–Terry or Elo methods. The package includes introductory and advanced vignettes.

https://cran.r-project.org/package=pairwiseLLM

#RStats #LLM #TextEvaluation
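
To make the scoring step concrete, here is a minimal sketch in base R of how a table of pairwise judgments can be turned into Elo-based quality scores. This is not the pairwiseLLM API: the elo_scores() helper and the sample_a / sample_b / winner columns are assumptions for illustration only.

# Minimal sketch (base R, not the pairwiseLLM API): derive quality scores
# from pairwise judgments with standard Elo updates.
elo_scores <- function(comparisons, k = 32, start = 1500) {
  ids <- unique(c(comparisons$sample_a, comparisons$sample_b))
  rating <- setNames(rep(start, length(ids)), ids)
  for (i in seq_len(nrow(comparisons))) {
    a <- comparisons$sample_a[i]
    b <- comparisons$sample_b[i]
    expected_a <- 1 / (1 + 10^((rating[b] - rating[a]) / 400))  # expected score of a
    score_a <- as.numeric(comparisons$winner[i] == a)           # 1 if a won, else 0
    rating[a] <- rating[a] + k * (score_a - expected_a)
    rating[b] <- rating[b] + k * ((1 - score_a) - (1 - expected_a))
  }
  sort(rating, decreasing = TRUE)
}

# Three writing samples judged pairwise by an LLM (hypothetical results)
judgments <- data.frame(
  sample_a = c("essay1", "essay1", "essay2"),
  sample_b = c("essay2", "essay3", "essay3"),
  winner   = c("essay1", "essay1", "essay3")
)
elo_scores(judgments)

A Bradley–Terry fit handles the same kind of data with a logistic model rather than sequential updates, so the resulting scores do not depend on the order in which comparisons arrive.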

pairwiseLLM: Pairwise Comparison Tools for Large Language Model-Based Writing Evaluation

Provides a unified framework for generating, submitting, and analyzing pairwise comparisons of writing quality using large language models (LLMs). The package supports live and/or batch evaluation workflows across multiple providers ('OpenAI', 'Anthropic', 'Google Gemini', 'Together AI', and locally-hosted 'Ollama' models), includes bias-tested prompt templates and a flexible template registry, and offers tools for constructing forward and reversed comparison sets to analyze consistency and positional bias. Results can be modeled using Bradley–Terry (1952) <doi:10.2307/2334029> or Elo rating methods to derive writing quality scores. For information on the method of pairwise comparisons, see Thurstone (1927) <doi:10.1037/h0070288> and Heldsinger & Humphry (2010) <doi:10.1007/BF03216919>. For information on Elo ratings, see Clark et al. (2018) <doi:10.1371/journal.pone.0190393>.
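
To illustrate the forward and reversed comparison sets mentioned in the description, again in plain base R rather than the package's own functions, every unordered pair of samples can be presented to the model in both positions:

# Minimal sketch (base R, not the pairwiseLLM API): build forward and
# reversed comparison sets so every pair is judged in both positions,
# which makes judgment consistency and positional bias measurable.
samples  <- c("essay1", "essay2", "essay3")
forward  <- t(combn(samples, 2))          # each unordered pair once, first element in position 1
reversed <- forward[, 2:1, drop = FALSE]  # same pairs with positions swapped
comparison_set <- rbind(
  data.frame(position_1 = forward[, 1],  position_2 = forward[, 2],  set = "forward"),
  data.frame(position_1 = reversed[, 1], position_2 = reversed[, 2], set = "reversed")
)
comparison_set

If the judge systematically prefers whichever text appears in position 1, the forward and reversed passes will disagree, which is the kind of positional bias this design is meant to surface.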