# Evidence Grading Methodology: How We Rate Every Peptide Claim

> The trust cornerstone of PeptideVox — how we assign an A-to-D evidence grade to every peptide claim, anchored to GRADE, the Oxford CEBM levels, USPSTF, and Cochrane RoB 2, and why human evidence is never blended with animal, in-vitro, or anecdotal data.

*Published 2026-07-01 · Updated 2026-07-01 · By The PeptideVox Editorial Desk*

**The short answer.**
Every efficacy claim on PeptideVox carries a letter grade from **A** (proven in humans by randomized trials) to **D** (unproven — anecdote, mechanism, or marketing), anchored to internationally recognized appraisal systems and paired with an inline citation to a real primary source. The single most important discipline is that **human evidence is never blended with animal, in-vitro, or anecdotal evidence**, and a compound's legal and anti-doping status is graded on a separate axis from its efficacy.[3](https://peptidevox.com/#r3)[5](https://peptidevox.com/#r5)

The peptide space has a structural honesty problem. The most heavily marketed compounds — BPC-157, TB-500, MOTS-c, epitalon, dihexa, and the bioregulator family — have their *entire* efficacy story built on animal studies, cell cultures, or single-lab literature, yet they are routinely discussed as though those findings were established human therapies. Meanwhile a smaller set of peptides — semaglutide, teriparatide, bremelanotide — genuinely do have rigorous human trial evidence for specific indications.[17](https://peptidevox.com/#r17)[18](https://peptidevox.com/#r18) A reader has no way to tell these apart unless someone does the appraisal work and shows it transparently. This page is how we do that work.

*This article is informational and editorial content for research and educational purposes only. It is not medical advice, not a protocol, and not a sourcing guide. Most peptides discussed on this site are not FDA-approved; many are sold as "research chemicals not for human use" and several are prohibited in sport. Consult a licensed clinician before any health decision.*

## Why grade every claim at all?

The purpose of an evidence grade is to compress "how much should I trust this claim?" into a single, defensible letter — and then to show our work with an inline citation so the reader never has to take our word for it. This mirrors the discipline of the best evidence-appraisal bodies in medicine, who long ago abandoned ungraded expert pronouncements in favor of explicit, reproducible grading of the *body* of evidence.[3](https://peptidevox.com/#r3)[5](https://peptidevox.com/#r5)

The peptide field is exactly the kind of subject matter — health, safety, and money — where standards must be highest. Google's Search Quality Rater framework classifies health content as "Your Money or Your Life" (YMYL), where misleading or low-quality content can cause real-world harm, and holds it to a higher bar for Experience, Expertise, Authoritativeness, and Trust (E-E-A-T).[1](https://peptidevox.com/#r1)[2](https://peptidevox.com/#r2) This methodology is how we meet that bar. From a functional and integrative-medicine standpoint we are sympathetic to root-cause, regenerative thinking, and we cover compounds the pharma-default literature often ignores — but that lens governs what we investigate and how we frame it, never the evidentiary bar. A mechanistically elegant, root-cause-friendly hypothesis with only rat data is still a Grade C claim, and we label it as such.

## What does each grade actually require?

Our four-tier scheme is deliberately simple for readers, but each tier is defined against established methodology so the grade is reproducible rather than arbitrary. A claim earns **Grade A** only when it is supported, for the specific indication and population, by human randomized controlled trials (RCTs) or by meta-analyses or systematic reviews of RCTs. This is the top of every recognized hierarchy: in the Oxford CEBM 2011 levels, Level 1 is a systematic review of randomized trials,[5](https://peptidevox.com/#r5) and in GRADE, randomized trials start as high-certainty evidence.[4](https://peptidevox.com/#r4) Grade A corresponds roughly to a USPSTF Grade A or B recommendation, where there is high certainty that the net benefit is substantial or moderate.[6](https://peptidevox.com/#r6) Canonical A-grade examples on this site are semaglutide for chronic weight management,[17](https://peptidevox.com/#r17) teriparatide for osteoporotic fracture reduction,[18](https://peptidevox.com/#r18) and bremelanotide for premenopausal hypoactive sexual desire disorder.[19](https://peptidevox.com/#r19)

**Grade B** is for genuine human evidence below the RCT bar: prospective cohort and observational studies, and small, open-label, single-arm, or early-phase (Phase 1/2) human trials. In GRADE, observational studies start at low certainty and must earn their way up.[4](https://peptidevox.com/#r4) A Grade B claim says humans have been studied, a signal exists, but the evidence is preliminary and could change with a proper trial.

**Grade C** is the most consequential grade in the peptide field, because it is where most of the popular "healing" and "longevity" compounds sit: the only supporting evidence is animal and/or in-vitro, with no qualifying human efficacy data. BPC-157 is the textbook case: a large, consistent preclinical literature but no completed human RCT, leaving its highest grade at C.[21](https://peptidevox.com/#r21)

**Grade D** is reserved for claims resting on anecdote, expert opinion, mechanism-only reasoning, or marketing copy with no controlled evidence. This is the situation a USPSTF "I statement" describes, where evidence is lacking, of poor quality, or conflicting.[6](https://peptidevox.com/#r6) A D is not a statement that something is false; it is a statement that it is unproven.
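The four tiers can be read as a lookup from the strongest qualifying study design to a letter. The Python sketch below is illustrative only: the design labels and the `grade_claim` function are hypothetical names, not a description of our actual tooling, and it assumes every design passed in addresses the same indication and population. It also ignores the study-quality adjustments that can hold a grade down.

```python
# Hypothetical design labels mapped to the highest grade each can support.
DESIGN_TO_GRADE = {
    "meta_analysis_of_rcts": "A",
    "systematic_review_of_rcts": "A",
    "human_rct": "A",
    "cohort": "B",
    "observational": "B",
    "open_label_trial": "B",
    "single_arm_trial": "B",
    "phase_1_2_trial": "B",
    "animal": "C",
    "in_vitro": "C",
    "anecdote": "D",
    "expert_opinion": "D",
    "mechanism_only": "D",
    "marketing": "D",
}


def grade_claim(designs):
    """Return the best grade supported by any design in `designs`.

    An empty list, or a list of unrecognized labels, yields "D" (unproven).
    """
    grades = [DESIGN_TO_GRADE.get(design, "D") for design in designs]
    # "A" < "B" < "C" < "D" alphabetically, so min() picks the strongest grade.
    return min(grades, default="D")
```

Under these assumptions, a claim backed only by `"animal"` and `"in_vitro"` entries stays at C however many such studies exist, because no preclinical label maps above C.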

How our A-D scale maps to recognized frameworks

| Our grade | GRADE certainty | Oxford CEBM 2011 | USPSTF analogue | Examine.com |
| --- | --- | --- | --- | --- |
| A | High / Moderate | Level 1-2 (SR of RCTs / RCTs) | A / B (high certainty, substantial-moderate net benefit) | A (multiple consistent studies) |
| B | Low-Moderate | Level 3-4 (cohort, case-series) | C (moderate certainty, small net benefit) | B-C (fewer studies, possible/small effect) |
| C | Very Low (for human use) | Level 5 (mechanism-based) | I statement (insufficient human evidence) | D (very little / inconsistent research) |
| D | Below GRADE (no qualifying study) | Below Level 5 | I statement | D-F (no / contrary evidence) |

These mappings are approximate and directional. GRADE rates the certainty of a body of evidence as High, Moderate, Low, or Very Low;[10](https://peptidevox.com/#r10) Examine.com grades interventions A-F on a per-outcome basis;[9](https://peptidevox.com/#r9) and AHRQ similarly grades a body of evidence by study limitations, consistency, directness, and precision.[8](https://peptidevox.com/#r8) We borrow their logic, not their exact arithmetic.

## Which sources count, and which do not?

Not all sources are equal, and we rank them explicitly before any claim is graded. We draw on primary literature first (PubMed/MEDLINE-indexed RCTs, [Cochrane Library](https://www.cochranelibrary.com/) systematic reviews, and meta-analyses in peer-reviewed journals) and treat everything downstream of it as context, never as the basis for an efficacy grade. The tiers run from T1 (primary human RCT/meta-analytic evidence, which can support Grade A) down through cohort and early-phase human data (T2, Grade B), regulatory and official sources such as the FDA and the [WADA Prohibited List](https://www.wada-ama.org/en/prohibited-list) (which establish legal status, graded separately), pharmacology and reference databases, trial registries such as [ClinicalTrials.gov](https://clinicaltrials.gov/), specialty-society guidance, and finally preclinical, mechanistic, and single-lab literature (which can support Grade C at most and never establishes a human claim).[22](https://peptidevox.com/#r22)[23](https://peptidevox.com/#r23)

Three operating rules follow. The **cross-check rule**: every significant claim is verified against at least two independent sources, preferring primary over secondary. The **registry-is-not-evidence rule**: a trial appearing on ClinicalTrials.gov proves only that it was registered — until it reports results, it cannot raise a grade, so a registered-but-not-reporting Phase 2 trial leaves an existing Grade C verdict unchanged.[23](https://peptidevox.com/#r23) The **date-check rule**: regulatory and legal facts are time-stamped and re-verified against the current year, because they move fast — we do not rely on stale 2023-era status claims for a 2026 page.
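The three rules are simple enough to express as checks. The following is a minimal sketch under stated assumptions: the `Source` class, its fields, and all three function names are hypothetical, independence is approximated by distinct publisher names, and "re-verified against the current year" is read as "last verified in the current calendar year".

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Source:
    publisher: str                     # used as a rough proxy for independence
    is_registry_entry: bool = False    # e.g. a ClinicalTrials.gov record
    has_reported_results: bool = True


def passes_cross_check(sources, minimum=2):
    """Cross-check rule: at least `minimum` distinct publishers back the claim."""
    return len({s.publisher for s in sources}) >= minimum


def can_raise_grade(source):
    """Registry-is-not-evidence rule: a registration without results cannot raise a grade."""
    return not source.is_registry_entry or source.has_reported_results


def needs_reverification(last_verified, today):
    """Date-check rule: a regulatory fact last verified in an earlier year is stale."""
    return last_verified.year < today.year
```

For example, a `Source` with `is_registry_entry=True` and `has_reported_results=False` fails `can_raise_grade`, which corresponds to the registered-but-not-reporting Phase 2 trial that leaves a Grade C verdict unchanged.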

## How is quality judged within a grade?

A grade reflects more than study design; it reflects how well the studies were done. For human trials we weigh the same five bias domains the revised Cochrane Risk of Bias 2 tool uses — the randomization process, deviations from intended interventions, missing outcome data, measurement of the outcome, and selection of the reported result — each judged low risk, some concerns, or high risk.[7](https://peptidevox.com/#r7) A small, unblinded, industry-funded trial with selective outcome reporting does not carry the weight of a large, pre-registered, double-blind RCT, even though both are technically "RCTs."
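As an illustration of how five per-domain judgements might collapse into one overall rating, the sketch below uses a worst-domain rule. This rule is a simplifying assumption made for the example: the text above does not say how the domains are combined, and the published tool also leaves room for reviewer judgement (for instance, several "some concerns" ratings can together justify a "high" rating). The domain keys and the function name are hypothetical.

```python
ROB2_DOMAINS = (
    "randomization_process",
    "deviations_from_intended_interventions",
    "missing_outcome_data",
    "measurement_of_the_outcome",
    "selection_of_the_reported_result",
)

LEVELS = ("low", "some concerns", "high")


def overall_risk_of_bias(judgements):
    """Collapse per-domain judgements into one rating using the worst domain."""
    missing = [d for d in ROB2_DOMAINS if d not in judgements]
    if missing:
        raise ValueError(f"missing domains: {missing}")
    values = [judgements[d] for d in ROB2_DOMAINS]
    invalid = [v for v in values if v not in LEVELS]
    if invalid:
        raise ValueError(f"unknown judgement values: {invalid}")
    if "high" in values:
        return "high"
    if "some concerns" in values:
        return "some concerns"
    return "low"
```

Under this rule a trial that is clean on four domains but rated "high" on selection of the reported result is rated "high" overall, which matches the point that two studies can both be RCTs and still carry very different weight.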

Following the GRADE approach, we treat RCT evidence as starting high and observational evidence as starting low, then move it based on five factors that lower certainty — risk of bias, inconsistency, indirectness, imprecision, and publication bias — and factors that can raise it, chiefly a large effect size.[4](https://peptidevox.com/#r4)[3](https://peptidevox.com/#r3) Imprecision matters acutely here, where many "human" peptide studies enroll only a handful of subjects: an n=2 pilot is reported as a safety signal, not as efficacy. And like GRADE and AHRQ, we grade the body of evidence for a claim, not a single favorable study — one positive small trial against a backdrop of null or conflicting trials does not earn an A, and reliance on a single lab is a caution flag that holds a grade down.[8](https://peptidevox.com/#r8)
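The start-then-move logic can be sketched as simple arithmetic on a four-step scale. This is a deliberately crude illustration, not the GRADE procedure itself: it assumes each flagged concern lowers certainty by exactly one step and a large effect raises it by one, whereas real appraisals weigh how serious each concern is. The function and parameter names are hypothetical.

```python
CERTAINTY = ("Very Low", "Low", "Moderate", "High")

DOWNGRADE_FACTORS = {
    "risk_of_bias",
    "inconsistency",
    "indirectness",
    "imprecision",
    "publication_bias",
}


def grade_certainty(design, concerns=(), large_effect=False):
    """Rate certainty for a body of evidence (simplified, one step per factor)."""
    if design == "rct":
        level = 3          # randomized evidence starts High
    elif design == "observational":
        level = 1          # observational evidence starts Low
    else:
        raise ValueError(f"unknown design: {design!r}")

    unknown = set(concerns) - DOWNGRADE_FACTORS
    if unknown:
        raise ValueError(f"unknown concerns: {sorted(unknown)}")

    level -= len(set(concerns))     # each distinct concern lowers certainty
    if large_effect:
        level += 1                  # a large effect can raise certainty
    level = max(0, min(level, len(CERTAINTY) - 1))
    return CERTAINTY[level]
```

With these assumptions, an RCT body flagged for both imprecision and risk of bias lands at "Low", the same place observational evidence starts.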

## Where is the bright line between human, preclinical, and anecdotal?

The single most important discipline on this site is refusing to let evidence "level up" as it crosses categories. A result in rats, mice, or a cell line is reported in those exact terms and graded C — we never write "X heals tendons" on the strength of a rodent Achilles study; we write "in a rat Achilles model, X improved healing on functional, biomechanical, and histological measures (Grade C, preclinical)." The reason RCTs sit atop every hierarchy is precisely that animal and mechanistic data systematically over-predict human benefit.[5](https://peptidevox.com/#r5)

Mechanism is a hypothesis, not an outcome: a plausible receptor interaction explains how a compound might work, not that it does work in people, and in the Oxford scheme mechanism-based reasoning is the lowest level of evidence.[5](https://peptidevox.com/#r5) We also apply an **extrapolation ban** — an open-label signal in older adults is not evidence for athletes, an intravenous safety pilot is not evidence for subcutaneous efficacy, and a microgram animal dose is not a human dosing recommendation. When the literature only supports a narrow, qualified statement, that is the only statement we make; and when a popular claim has no qualifying evidence, the page says so plainly, because naming the gap is itself part of the grade.
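One way to picture the extrapolation ban is as a field-by-field comparison between what a study covered and what a claim asserts. The sketch below is hypothetical throughout: the `Context` fields, the `extrapolation_gaps` function, and the use of exact string matching are assumptions chosen to keep the example short, not a description of an editorial tool.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Context:
    species: str      # e.g. "human", "rat"
    population: str   # e.g. "older adults", "athletes"
    route: str        # e.g. "intravenous", "subcutaneous"
    outcome: str      # e.g. "safety", "efficacy"


def extrapolation_gaps(evidence, claim):
    """Return the fields on which the claim goes beyond the evidence."""
    fields = ("species", "population", "route", "outcome")
    return [name for name in fields if getattr(evidence, name) != getattr(claim, name)]
```

A claim is only as wide as the evidence when the returned list is empty. An intravenous safety pilot in older adults compared against a subcutaneous efficacy claim for athletes would return `["population", "route", "outcome"]`, each entry naming a gap the page has to state rather than bridge.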

Finally, legal and anti-doping status is graded on a separate axis. We track each compound's federal status from primary FDA sources — the 503A bulk-substance categories, and the live 2026 timeline in which the FDA removed twelve peptides from Category 2 on April 15, 2026 (because nominations were withdrawn, not because the agency found them safe) ahead of a Pharmacy Compounding Advisory Committee review.[11](https://peptidevox.com/#r11)[12](https://peptidevox.com/#r12)[13](https://peptidevox.com/#r13)[14](https://peptidevox.com/#r14) For athletes, most research peptides fall under WADA category S0 (non-approved substances), prohibited at all times, with GLP-1 agonists moving to full prohibition in 2026.[15](https://peptidevox.com/#r15)[16](https://peptidevox.com/#r16)
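Keeping the two axes separate is easiest to see as a data-model choice. In the hypothetical sketch below, regulatory status lives in its own record with its own verification date, so updating it cannot touch an efficacy grade. All class, field, and function names are assumptions for illustration, and the status fields are free text meant to be filled from the primary sources.

```python
from dataclasses import dataclass, field, replace
from datetime import date


@dataclass(frozen=True)
class RegulatoryStatus:
    fda_note: str        # summary taken from a primary FDA source
    wada_note: str       # summary taken from the WADA Prohibited List
    verified_on: date    # when this status was last checked


@dataclass(frozen=True)
class CompoundRecord:
    name: str
    regulatory: RegulatoryStatus
    # Efficacy grades keyed by claim, each "A" through "D".
    efficacy_grades: dict = field(default_factory=dict)


def update_regulatory(record, new_status):
    """Return a copy with new regulatory data; efficacy grades are left untouched."""
    return replace(record, regulatory=new_status)
```

The design choice is that no function reads `regulatory` when assigning a grade and no function reads `efficacy_grades` when recording legal status, so a change in one axis never moves the other.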

**Bottom line.** This methodology exists so that a reader can trust a single letter. Grade A means human randomized-trial evidence stands behind the claim; B means real but preliminary human evidence; C means the science so far is animal- or cell-based only; and D means the claim rests on anecdote, mechanism, or marketing. Human evidence is never silently merged with preclinical or anecdotal evidence, every nontrivial claim carries a verifiable citation, and a compound's legal and anti-doping status is reported separately from its efficacy. We grade the evidence honestly, show our sources, and say so plainly when the evidence is weak or absent — because in a field this heavily marketed and this consequential to health, transparency is the product.

---
Source: https://peptidevox.com/the-science/evidence-grading-methodology
