
High-Likelihood Peptide Combination Library

Published June 2026.

The High-Likelihood Peptide Combination Library is a public, machine-readable catalog of computationally enumerated peptide combinations. It concentrates enumeration in the regions of sequence space most likely to yield functional peptides — the dense neighborhoods around known active scaffolds — and publishes each candidate in a form that people, search engines, and machines can inspect and reproduce.

The public library currently spans 147 peptide packages across twelve static releases. Each row describes a materialized peptide identity and exact form variant using deterministic identifiers, hashes, shard locations, and source provenance. The purpose is to make these high-likelihood combinations openly enumerable and reproducible, not to recommend any biological use.

169,650,046 Enumerated combinations
27,726,923 Backbone sequences
147 Uploaded packages
48.83 GB Public library data

Why this exists

Most of peptide sequence space is inert. The combinations worth examining cluster tightly around scaffolds already known to fold and function. This library enumerates those high-likelihood neighborhoods exhaustively — every close variant of a known active scaffold — rather than sampling sequence space uniformly, so the catalog stays dense exactly where biological relevance is most probable.

Publishing the enumeration in bulk, with enough structure that any reader can identify exactly what was catalogued and where it lives, turns a private search problem into a public, inspectable resource. The project is not a legal, medical, or clinical opinion and does not assert that any row is safe, active, or fit for any use.

What is in the library

How each record is specified

Every row is a complete structural specification of one molecule, not a bare string. The record is meant to stand on its own terms: a reader skilled in peptide chemistry can determine exactly which compound is described and, using only routine and well-established methods, make and verify it.

Each record fixes:

Because each enumerated member carries its own deterministic identifier and content hash, the library specifies every member individually and exactly. It is not an undifferentiated genus: a given target molecule either matches a catalogued identifier exactly or it does not.
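As a rough illustration of how a deterministic identifier and content hash turn membership into an exact-match question, the sketch below canonicalizes a record and hashes it. The field names (`backbone`, `n_terminus`, `c_terminus`, `disulfides`), the JSON canonicalization, the `hlp-` prefix, and the choice of SHA-256 are all assumptions made for illustration; the library's actual record schema and hashing scheme are not reproduced here.

```python
import hashlib
import json


def canonical_bytes(record: dict) -> bytes:
    """Serialize a record deterministically: sorted keys, compact separators, UTF-8."""
    text = json.dumps(record, sort_keys=True, separators=(",", ":"), ensure_ascii=True)
    return text.encode("utf-8")


def content_hash(record: dict) -> str:
    """SHA-256 hex digest of the canonical form (hash algorithm is an assumption)."""
    return hashlib.sha256(canonical_bytes(record)).hexdigest()


def record_id(record: dict, prefix: str = "hlp") -> str:
    """Hypothetical identifier: a fixed prefix plus the first 16 hex characters of the hash."""
    return f"{prefix}-{content_hash(record)[:16]}"


def is_catalogued(record: dict, known_ids: set) -> bool:
    """Exact-match membership: the identifier is either in the set or it is not."""
    return record_id(record) in known_ids


# Placeholder records; the sequence and field values are arbitrary examples.
catalogued = {"backbone": "ACDEFGHIK", "n_terminus": "free",
              "c_terminus": "amide", "disulfides": []}
reordered = {"disulfides": [], "c_terminus": "amide",
             "n_terminus": "free", "backbone": "ACDEFGHIK"}
variant = dict(catalogued, c_terminus="acid")

known_ids = {record_id(catalogued)}
print(is_catalogued(reordered, known_ids))  # same content, different key order
print(is_catalogued(variant, known_ids))    # different form variant
```

Because the serialization sorts keys before hashing, two descriptions of the same molecule in the same form produce the same identifier, while any change to a field produces a different one.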

Reproducing and verifying a record

Releases are deterministic. Given a source scaffold and the published enumeration parameters, regeneration yields the same record set in the same canonical form, and that output validates against the package hash manifest. Any reader can therefore independently confirm what was catalogued and when.
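The validation step could look something like the sketch below. It assumes the hash manifest is a text file of `<sha256-hex>  <relative-path>` lines (the `sha256sum` convention). The manifest layout, the hash algorithm, and the function names are assumptions for illustration; the published packages may use a different format.

```python
import hashlib
from pathlib import Path


def sha256_file(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so large shards do not need to fit in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def parse_manifest(text: str) -> dict:
    """Parse '<hex digest>  <relative path>' lines; skip blanks and '#' comments."""
    entries = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        expected, name = line.split(maxsplit=1)
        # sha256sum marks binary mode with a leading '*'; drop it if present.
        entries[name.lstrip("*")] = expected.lower()
    return entries


def verify_package(root: Path, manifest_text: str) -> dict:
    """Return {relative path: True/False} for every file listed in the manifest."""
    results = {}
    for name, expected in parse_manifest(manifest_text).items():
        target = root / name
        results[name] = target.is_file() and sha256_file(target) == expected
    return results
```

A caller would download a package directory, read its hash manifest as text, and pass both to `verify_package`. A `False` entry means the file is missing or its bytes differ from what the manifest records.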

To check whether a specific molecule is in the public record:

A match establishes that the exact compound, in that exact form, was publicly and verifiably catalogued as of the release date. The members are accessible by methods already within ordinary skill — solid-phase synthesis for linear and configured peptides, recombinant expression for longer backbones, with established terminal and disulfide chemistries — so no novel or operational protocol is required, and none is published here.

Public releases

The releases are hosted as static files. Every package has a wrapper page, package manifest, hash manifest, artifact index, citation metadata, sample assets, and shard index.