
About the Knowledge Commons

The RMBL Knowledge Commons is a unified search and discovery platform for environmental research at the Rocky Mountain Biological Laboratory in Gothic, Colorado. It connects scientific publications, community documents, research datasets, news stories, and a knowledge graph of species, concepts, protocols, and places studied at one of the longest-running field biology stations in North America.

At a Glance

151,728 Entity Mentions

151,746 Citation Links

Frequently Asked Questions

What is the RMBL Knowledge Commons?

The Knowledge Commons is a search and discovery tool that brings together the scientific output of RMBL and the Gunnison Basin into one searchable platform. It includes peer-reviewed publications dating back to 1928, community and policy documents from the Sustainable Living Library, and research datasets from multiple repositories. A knowledge graph connects these resources through shared species, concepts, research methods, and geographic locations.

Who is this for?

The Knowledge Commons is designed for researchers, students, land managers, community members, and policymakers interested in the environmental research and stewardship of the Gunnison Basin. It is equally useful for scientists looking for related work and for community members exploring how research connects to local policy issues.

What are Knowledge Neighborhoods?

Knowledge Neighborhoods are research communities detected automatically by analyzing the connections in the knowledge graph. Using a community-detection algorithm (Louvain), the system identifies clusters of tightly connected authors, publications, species, concepts, and places. Each neighborhood represents a distinct research theme — from marmot behavioral ecology to watershed biogeochemistry to federal land management policy. Many neighborhoods include AI-generated research primers that summarize the key findings and cite specific publications.
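
To give a feel for what "tightly connected" means here, the sketch below computes modularity, the score that Louvain tries to maximize when it groups nodes into communities. It is an illustration under stated assumptions, not the Commons pipeline: the graph is a toy example, edges are unweighted, and the function and variable names are made up for this page.

```typescript
// Sketch only: Newman modularity for an unweighted, undirected graph.
// Louvain searches for the community assignment that maximizes this score.
type Edge = [string, string];

function modularity(edges: Edge[], community: Map<string, number>): number {
  const m = edges.length; // total number of edges
  if (m === 0) return 0;
  const internal = new Map<number, number>(); // edges with both ends in one community
  const degree = new Map<number, number>(); // summed node degree per community
  for (const [u, v] of edges) {
    const cu = community.get(u);
    const cv = community.get(v);
    if (cu === undefined || cv === undefined) {
      throw new Error(`no community assigned for edge ${u}-${v}`);
    }
    degree.set(cu, (degree.get(cu) ?? 0) + 1);
    degree.set(cv, (degree.get(cv) ?? 0) + 1);
    if (cu === cv) internal.set(cu, (internal.get(cu) ?? 0) + 1);
  }
  let q = 0;
  degree.forEach((d, c) => {
    const fractionInside = (internal.get(c) ?? 0) / m;
    const expected = (d / (2 * m)) * (d / (2 * m));
    q += fractionInside - expected;
  });
  return q;
}

// Toy graph: two triangles joined by a single bridge edge (c-d).
const edges: Edge[] = [
  ["a", "b"], ["b", "c"], ["a", "c"],
  ["d", "e"], ["e", "f"], ["d", "f"],
  ["c", "d"],
];
const split = new Map<string, number>([
  ["a", 0], ["b", 0], ["c", 0], ["d", 1], ["e", 1], ["f", 1],
]);
const lumped = new Map<string, number>(
  Array.from(split.keys()).map((k): [string, number] => [k, 0]),
);

console.log(modularity(edges, split).toFixed(3)); // 0.357
console.log(modularity(edges, lumped).toFixed(3)); // 0.000
```

Splitting the toy graph into its two triangles scores higher than lumping everything together, which is the kind of structure a neighborhood reflects.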

What are Research Frontiers?

Research Frontiers are synthesized boundaries between what scientists know and what they don't, with identifiable paths to push the boundary forward. The system extracts atomic gap-statements from neighborhood research primers, clusters them by semantic similarity, and uses a language model to weave each cluster into a narrative with context, key questions, barriers, opportunities, and concrete actions. Each action is tagged with a category (data, experiment, model, synthesis, framework, etc.) and an effort tier (near-term, ambitious, major, consortium). Each frontier links back to its contributing neighborhoods, source statements, and the strongest concepts, species, places, and protocols involved, so you can trace any claim back to the underlying evidence.

How do I use the API or MCP server?

AI Integration

The Knowledge Commons can be queried by AI assistants via the REST API or the MCP (Model Context Protocol) server. This allows tools like Claude Desktop, ChatGPT, and custom scripts to search publications, explore research neighborhoods, and access the knowledge graph programmatically.

REST API

All API endpoints are at /api/v1/ and support ?format=text for LLM-friendly plain text. See /llms.txt for a complete list. Examples:

# Search for publications about alpine pollination
curl "https://rmblknowledgecommons.org/api/v1/search?q=alpine+pollination&format=text"

# Get publication details
curl "https://rmblknowledgecommons.org/api/v1/publications/13?format=text"

# Explore a research neighborhood with primer
curl "https://rmblknowledgecommons.org/api/v1/neighborhoods/620?format=text"

# Look up a species
curl "https://rmblknowledgecommons.org/api/v1/entities/species/8426?format=text"

# Find related works
curl "https://rmblknowledgecommons.org/api/v1/related/publications/13?format=text"
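
For custom scripts, the same calls can be made from TypeScript. The sketch below assumes a runtime with the standard URL and fetch globals (for example Node 18 or later); buildUrl and fetchText are illustrative helper names, not part of the API.

```typescript
// Sketch only: mirrors the curl calls above from a script.
// buildUrl and fetchText are hypothetical helpers, not part of the API.
const BASE = "https://rmblknowledgecommons.org";

function buildUrl(path: string, params: Record<string, string> = {}): string {
  const url = new URL(`/api/v1/${path}`, BASE);
  for (const key of Object.keys(params)) {
    url.searchParams.set(key, params[key] ?? "");
  }
  url.searchParams.set("format", "text"); // ask for LLM-friendly plain text
  return url.toString();
}

async function fetchText(path: string, params: Record<string, string> = {}): Promise<string> {
  const res = await fetch(buildUrl(path, params));
  if (!res.ok) throw new Error(`HTTP ${res.status} for ${path}`);
  return res.text();
}

console.log(buildUrl("search", { q: "alpine pollination" }));
// https://rmblknowledgecommons.org/api/v1/search?q=alpine+pollination&format=text
console.log(buildUrl("publications/13"));
// https://rmblknowledgecommons.org/api/v1/publications/13?format=text
```

Calling fetchText("search", { q: "alpine pollination" }) would then return the plain-text response for that search.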

MCP Server for Claude Desktop (recommended)

The easiest way to connect: add the Knowledge Commons as a Custom Connector in Claude Desktop. No installation required — just a URL.

Option A: Remote connector (no install):

Open Claude Desktop → Settings → Connectors
Click Add custom connector
Enter URL: https://rmblknowledgecommons.org/api/mcp
10 Knowledge Commons tools are immediately available

Option B: Local server (for development):

git clone https://github.com/ikb-rmbl/RMBL_knowledge_hub.git
cd RMBL_knowledge_hub/mcp
npm install && npm run build

Then add the server to the Claude Desktop config file (on macOS, ~/Library/Application Support/Claude/claude_desktop_config.json), replacing /path/to with the location of your clone:

{
  "mcpServers": {
    "rmbl-knowledge-commons": {
      "command": "node",
      "args": ["/path/to/RMBL_knowledge_hub/mcp/dist/index.js"],
      "env": {
        "RMBL_API_URL": "https://rmblknowledgecommons.org"
      }
    }
  }
}

Futures methodology — how the scenarios and stories were made

The Futures collection is a set of planning artifacts, not forecasts or RMBL institutional commitments. The artifacts are produced through a pipeline that pairs a human-authored specification with an AI model that drafts the prose.

The specification

The Future Scenarios Framework (currently v0.16, version-controlled at specification/Future_scenarios_framework.md) sets the rules: what a scenario is, what fields it must include, what register the prose should be in, what is forbidden, and how multiple scenarios within a set must be distinguishable from each other. Each scenario starts as a structured YAML entry with a distinguishing thesis, a magnitude bracket, a frontier portfolio, and (for upside/downside sets) the favorable or unfavorable conditions it depends on. RMBL staff write these inputs.
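
As a rough picture of what such a structured entry might carry, the sketch below models the inputs named above as a TypeScript type with a small completeness check. The field names, the id, and the placeholder values are assumptions made for illustration; the real YAML schema is defined by the framework document, not by this example.

```typescript
// Sketch only: field names are guesses based on the prose, not the real schema.
interface ScenarioEntry {
  id: string;
  set: "central" | "upside" | "downside";
  distinguishingThesis: string;
  magnitudeBracket: string;
  frontierPortfolio: string[]; // ids of linked Commons frontiers
  conditions?: string[]; // favorable or unfavorable conditions (upside/downside sets)
}

// Returns a list of human-readable problems; an empty list means the entry looks complete.
function validateScenario(s: ScenarioEntry): string[] {
  const problems: string[] = [];
  if (s.distinguishingThesis.trim() === "") problems.push("missing distinguishing thesis");
  if (s.magnitudeBracket.trim() === "") problems.push("missing magnitude bracket");
  if (s.frontierPortfolio.length === 0) problems.push("empty frontier portfolio");
  const needsConditions = s.set !== "central";
  if (needsConditions && (s.conditions === undefined || s.conditions.length === 0)) {
    problems.push(`${s.set} scenario must list the conditions it depends on`);
  }
  return problems;
}

// Placeholder entry: every value here is invented for the example.
const draft: ScenarioEntry = {
  id: "example-upside-1",
  set: "upside",
  distinguishingThesis: "placeholder thesis",
  magnitudeBracket: "placeholder bracket",
  frontierPortfolio: ["frontier-a"],
};

console.log(validateScenario(draft));
// [ 'upside scenario must list the conditions it depends on' ]
```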

The drafting

Claude Opus drafts each scenario by reading the spec sections, the YAML inputs, the RMBL institutional grounding, the candidate research frontiers from the Commons, and the existing era primers. The model's output is the prose body of each scenario .md file. Stories are drafted the same way against a separate story-prompt that reads the scenario as context plus a linked Commons frontier the protagonist is pushing.

Three sets: central, upside, downside

The central set (centennial-2027, twelve scenarios) is contingency-honest about realistic-bracket campaign outcomes: each scenario names what would invalidate it and which other scenarios exist as alternatives. The upside companion set (centennial-2027-upside, three scenarios) explores what becomes possible when several favorable conditions stack. The downside companion set (centennial-2027-downside, three scenarios) explores what the resulting future looks and feels like when unfavorable conditions stack. Each set has its own forbidden-pattern rules: the upside set must avoid a utopian register, and the downside set must avoid a collapse register.

Stories as companions, not documentary

Each scenario has at least one companion story — short literary fiction (1,400–1,600 words) grounded in the scenario. Characters are fictional roles, not real RMBL staff or guest scientists. The fictional voice helps readers inhabit possibilities at a register the strategic-planning artifacts cannot reach.

Neighborhood primers — how the research-neighborhood summaries were made

The research primer at the top of each knowledge neighborhood is an AI-synthesized literature-review-style narrative. The neighborhood itself is detected algorithmically (Louvain community detection over the knowledge graph); the primer is then drafted by Claude Opus from the publications, datasets, and documents that landed in the neighborhood.

Inputs the model sees

The prompt assembles a tiered context: ~15 landmark publications (abstracts + key findings), ~15 frontier publications from the last few years, ~60 breadth publications (one finding each), plus all linked concepts with definitions. For policy-flavored neighborhoods, a parallel prompt includes management documents and federal-register notices. The prompt asks for citation-grounded prose — every claim should be traceable to a specific publication in the input.
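
One way to picture the tiering is as three passes over the neighborhood's publications. The sketch below is an assumption-laden illustration: it treats "landmark" as most cited and "frontier" as most recent, uses a hypothetical recentSince cutoff, and ignores abstracts, findings, and concepts entirely. The real selection rules live in the prompt-assembly code.

```typescript
// Sketch only: the selection rules (citation count, a recency cutoff) are assumptions.
interface Pub {
  id: number;
  year: number;
  citations: number;
}

interface TieredContext {
  landmark: Pub[];
  frontier: Pub[];
  breadth: Pub[];
}

function assembleTiers(
  pubs: Pub[],
  recentSince: number,
  sizes = { landmark: 15, frontier: 15, breadth: 60 },
): TieredContext {
  const used = new Set<number>();
  // Take up to n candidates in order, skipping anything already placed in a tier.
  const take = (candidates: Pub[], n: number): Pub[] => {
    const picked: Pub[] = [];
    for (const p of candidates) {
      if (picked.length >= n) break;
      if (used.has(p.id)) continue;
      used.add(p.id);
      picked.push(p);
    }
    return picked;
  };
  const byCitations = pubs.slice().sort((a, b) => b.citations - a.citations);
  const recent = pubs
    .filter((p) => p.year >= recentSince)
    .sort((a, b) => b.year - a.year || b.citations - a.citations);
  const landmark = take(byCitations, sizes.landmark);
  const frontier = take(recent, sizes.frontier);
  const breadth = take(byCitations, sizes.breadth);
  return { landmark, frontier, breadth };
}

// Invented sample records, with small tier sizes so the result is easy to read.
const samplePubs: Pub[] = [
  { id: 1, year: 1975, citations: 900 },
  { id: 2, year: 1998, citations: 400 },
  { id: 3, year: 2022, citations: 12 },
  { id: 4, year: 2024, citations: 3 },
  { id: 5, year: 2010, citations: 150 },
  { id: 6, year: 2023, citations: 40 },
];
const tiers = assembleTiers(samplePubs, 2021, { landmark: 2, frontier: 2, breadth: 1 });
console.log(tiers.landmark.map((p) => p.id)); // [ 1, 2 ]
console.log(tiers.frontier.map((p) => p.id)); // [ 4, 6 ]
console.log(tiers.breadth.map((p) => p.id)); // [ 5 ]
```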

How to read it

A primer is a synthesized map of what is in this neighborhood and how the work connects, not a peer-reviewed literature review. The cited publications are real and grounded in the Commons; the synthesis itself is the model's reading. Verification is a click away — every citation links to the publication's detail page.

View the actual prompt

The exact prompt templates Claude Opus receives live in generate-primers.ts (RESEARCH_PROMPT at L42, POLICY_PROMPT at L85). Reading them is the most direct way to verify what the model was asked to do.

Era primers — how the period summaries were made

Each entry in the Eras collection has an AI-synthesized period primer summarizing the dominant research questions, methods, and findings during that window. Two prompt registers are used: one for decade-or-bucket eras and one for the century-scale primer.

Inputs the model sees

The prompt assembles publications and documents whose year falls in the era's window, ranked by citation count and topic-distinctiveness. The model is asked to characterize the period's questions and methods without pretending to be authoritative — to surface trends, not write a history.
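
The window-and-rank step can be sketched as a filter followed by a blended score. Everything specific below is an assumption: the distinctiveness field (a 0 to 1 number), the equal weighting of the two signals, and the sample records are invented to make the idea concrete.

```typescript
// Sketch only: the scoring formula and the distinctiveness field are hypothetical.
interface EraPub {
  id: number;
  year: number;
  citations: number;
  distinctiveness: number; // assumed to be in the range 0..1
}

function selectForEra(pubs: EraPub[], start: number, end: number, limit: number): EraPub[] {
  const inWindow = pubs.filter((p) => p.year >= start && p.year <= end);
  // Normalize citations against the most-cited paper in the window.
  const maxCitations = Math.max(1, ...inWindow.map((p) => p.citations));
  const score = (p: EraPub): number =>
    0.5 * (p.citations / maxCitations) + 0.5 * p.distinctiveness;
  return inWindow.sort((a, b) => score(b) - score(a)).slice(0, limit);
}

// Invented sample records.
const eraSample: EraPub[] = [
  { id: 1, year: 1965, citations: 200, distinctiveness: 0.2 },
  { id: 2, year: 1968, citations: 50, distinctiveness: 0.9 },
  { id: 3, year: 1972, citations: 500, distinctiveness: 0.5 },
  { id: 4, year: 1961, citations: 10, distinctiveness: 0.1 },
];
console.log(selectForEra(eraSample, 1960, 1969, 2).map((p) => p.id)); // [ 1, 2 ]
```

In this toy run the 1972 paper is excluded by the window despite its high citation count, and the lightly cited but distinctive 1968 paper still makes the cut.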

How to read it

Read it as a synthesized characterization of a research period, not as an authoritative history. The publications cited are real and linked; the period framing — what was distinctive, what shifted — is the model's reading of the corpus.

View the actual prompt

The exact prompt templates Claude Opus receives live in generate-era-primers.ts (PROMPT at L87, PROMPT_CENTURY at L154 for the century-scale variant).

Frontier syntheses — how the knowledge boundaries were named

Each entry in the Frontiers collection is an AI-synthesized articulation of a knowledge boundary — a coherent gap between what scientists know and what they don't. The default view shows paper-grounded frontiers: every key question and data gap on a grounded frontier cites at least one primary paper with a verbatim snippet you can verify, and the system periodically asks the LLM whether newer literature has addressed each open question (the “currency” tag on each item).

The grounded pipeline

Five stages: (1) per neighborhood, the LLM reads top papers and emits atomic frontier statements, each with a verbatim cite snippet — statements that can't be grounded in source text are dropped; (2) the statements are embedded with Voyage AI and clustered by greedy-centroid similarity with recency weighting; (3) the LLM synthesizes each cluster into a named frontier with key questions and data gaps that carry the verbatim cite snippets through; (4) the frontier loads to the DB with snapshot history; (5) a currency-validation pass asks per question whether newer papers in the same neighborhoods have addressed it.
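
Stage 2 can be sketched as a single greedy pass: each statement joins the most similar existing cluster if it clears a threshold, and otherwise starts a new one. The version below is a minimal illustration, not the production code. In particular, the recency weighting shown (newer statements pull the centroid harder) is one plausible reading of the description, and the threshold, weights, and two-dimensional vectors are made up.

```typescript
// Sketch only: one plausible form of greedy-centroid clustering with recency weighting.
interface Statement {
  id: string;
  vector: number[]; // embedding (toy 2-D vectors in the example below)
  year: number;
}

interface Cluster {
  centroid: number[];
  weight: number; // total recency weight absorbed so far
  members: string[];
}

function cosine(a: number[], b: number[]): number {
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    const x = a[i] ?? 0;
    const y = b[i] ?? 0;
    dot += x * y;
    normA += x * x;
    normB += y * y;
  }
  return normA === 0 || normB === 0 ? 0 : dot / Math.sqrt(normA * normB);
}

function greedyCentroidCluster(items: Statement[], threshold: number, currentYear: number): Cluster[] {
  const clusters: Cluster[] = [];
  for (const s of items) {
    // Assumed weighting: a statement from this year counts 1, older ones count less.
    const w = 1 / (1 + Math.max(0, currentYear - s.year));
    let best: Cluster | undefined;
    let bestSim = threshold;
    for (const c of clusters) {
      const sim = cosine(s.vector, c.centroid);
      if (sim >= bestSim) {
        best = c;
        bestSim = sim;
      }
    }
    if (best === undefined) {
      clusters.push({ centroid: s.vector.slice(), weight: w, members: [s.id] });
      continue;
    }
    const target: Cluster = best;
    const total = target.weight + w;
    // Move the centroid toward the new statement in proportion to its weight.
    target.centroid = target.centroid.map(
      (v, i) => (v * target.weight + (s.vector[i] ?? 0) * w) / total,
    );
    target.weight = total;
    target.members.push(s.id);
  }
  return clusters;
}

const statements: Statement[] = [
  { id: "s1", vector: [1, 0], year: 2024 },
  { id: "s2", vector: [0.9, 0.1], year: 2020 },
  { id: "s3", vector: [0, 1], year: 2023 },
];
const clusters = greedyCentroidCluster(statements, 0.8, 2024);
console.log(clusters.map((c) => c.members)); // [ [ 's1', 's2' ], [ 's3' ] ]
```

Greedy passes like this depend on input order, so a real pipeline would need to fix the order in which statements are visited.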

Legacy frontiers (toggle)

An earlier (“legacy”) set of 98 frontiers was synthesized from neighborhood-primer prose rather than direct paper grounding. Those are kept in the DB because they anchor the downstream planning corpus, but they're hidden from the default /frontiers view; surface them with the Include legacy filter chip.

How to read it

Read a frontier as a synthesized articulation of where the literature points toward a knowledge boundary, not as an authoritative research agenda. Click any cite chip on a key question to open the source paper and verify the snippet word-for-word. The contributing neighborhoods are listed so you can trace the frontier back to its evidence base.
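
The word-for-word check is simple enough to sketch. The version below assumes that differences in whitespace (line breaks, double spaces from PDF extraction) should not count against a snippet, which is a choice made for this example; the sample text is invented.

```typescript
// Sketch only: checks that a cited snippet appears verbatim in the source text,
// ignoring differences in whitespace. Function names are illustrative.
function normalizeWhitespace(text: string): string {
  return text.replace(/\s+/g, " ").trim();
}

function isVerbatim(snippet: string, sourceText: string): boolean {
  const needle = normalizeWhitespace(snippet);
  if (needle.length === 0) return false; // an empty snippet grounds nothing
  return normalizeWhitespace(sourceText).indexOf(needle) !== -1;
}

// Invented sample text standing in for a paper's full text.
const sourceText = "Flowering onset advanced\n  as snowmelt came earlier.";

console.log(isVerbatim("advanced as snowmelt", sourceText)); // true
console.log(isVerbatim("advanced because snowmelt", sourceText)); // false
```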

View the actual prompts

The grounded extractor lives in extract-frontiers-grounded.ts, the synthesizer in , and the currency validator in . Full design spec: .

Frontier planning — cluster and theme syntheses

The frontier planning pipeline produces several layers of AI-synthesized content that surface in board and leadership planning conversations rather than in the public detail pages — cluster descriptions of related planning items, cross-lens theme syntheses, and long-reach opportunity statements that draw across themes.

Cluster descriptions

Each cluster of related planning items (actions, questions, data gaps, barriers, or impacts) gets a title and summary synthesized by Claude Opus. View the prompt: describe-frontier-planning-clusters.ts.

Cross-lens theme syntheses

A second-order Louvain pass over the cluster descriptions surfaces twelve cross-lens themes. Each theme gets an invitational opportunity statement synthesized for planning conversations. View the prompt: describe-planning-themes.ts.

Long-reach opportunities

A final cross-theme synthesis stage surfaces strategic opportunities that scale beyond the basin. View the prompt: synthesize-long-reach-opportunities.ts.

Technical Deep-Dive

The sections below describe how data flows into the Knowledge Commons and how the knowledge graph is constructed.

Data Sources

Publications are sourced from the RMBL publications database, with additional discovery via OpenAlex and CrossRef. Each record is enriched with metadata from CrossRef (authors, DOIs, abstracts, citation counts) and Unpaywall (open access links). Full text is extracted from PDFs using pdftotext with OCR fallback via Tesseract.

Datasets are discovered from eight repositories: EDI, DataONE, Dryad, Zenodo, USGS ScienceBase, Pangaea, NCBI, and Figshare. Each dataset is enriched with EML/DataCite metadata including temporal and spatial coverage, creator information, and licensing.

Documents come from the Sustainable Living Library, a collection of community and policy documents relevant to the Gunnison Basin. These include management plans, environmental impact statements, water quality reports, and local planning documents.

Stories are news articles about RMBL and the Gunnison Basin from local newspapers (Crested Butte News, Gunnison Country Times) and national/international outlets via LexisNexis. Full text is stored for search indexing and entity extraction but is not displayed on detail pages to respect copyright. Each story links to its original source when available.

Author Deduplication

Authors are deduplicated across all collections using a two-phase process. First, authors with matching ORCID identifiers are merged. Then, authors sharing the same family name are compared by given name initials, with checks to prevent false merges when middle initials differ (e.g., “R. J. Smith” is kept separate from “R. A. Smith”). Author ordering on publications is repaired from CrossRef metadata to ensure correct first-author attribution.
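
The two-phase comparison can be sketched as a single predicate. This is an illustration under assumptions, not the actual merge code: it treats a missing middle initial as compatible with any middle initial, treats two different ORCIDs as proof of two different people, and uses made-up records.

```typescript
// Sketch only: a simplified version of the two-phase author comparison.
interface Author {
  family: string;
  given: string;
  orcid?: string;
}

// "R. J." -> ["R", "J"]; "Robert J." -> ["R", "J"]
function initials(given: string): string[] {
  return given
    .split(/[\s.\-]+/)
    .filter((part) => part.length > 0)
    .map((part) => part.charAt(0).toUpperCase());
}

function sameAuthor(a: Author, b: Author): boolean {
  // Phase 1: when both records carry an ORCID, the ORCID decides.
  if (a.orcid !== undefined && b.orcid !== undefined) return a.orcid === b.orcid;
  // Phase 2: same family name, then compare initials position by position.
  if (a.family.toLowerCase() !== b.family.toLowerCase()) return false;
  const ia = initials(a.given);
  const ib = initials(b.given);
  if (ia.length === 0 || ib.length === 0) return false;
  const shared = Math.min(ia.length, ib.length);
  for (let i = 0; i < shared; i++) {
    if (ia[i] !== ib[i]) return false; // e.g. "R. J." vs "R. A."
  }
  return true;
}

console.log(sameAuthor({ family: "Smith", given: "R. J." }, { family: "Smith", given: "R. A." })); // false
console.log(sameAuthor({ family: "Smith", given: "R." }, { family: "Smith", given: "Robert J." })); // true
```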

Entity Extraction & Knowledge Graph

Entities (species, concepts, protocols, places, and stakeholders) are extracted from publication and document full text using Claude vision models (VLM extraction). Each entity mention is linked to its source item with a confidence score and extraction method. Entities are then deduplicated using embedding-based clustering (Voyage AI voyage-4, 1024 dimensions) with type-specific similarity thresholds.
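
The merge decision behind that clustering can be sketched as a cosine-similarity test against a per-type cutoff. The threshold values below are invented for illustration (the real ones are not stated on this page), and the three-dimensional vectors stand in for the 1024-dimensional embeddings.

```typescript
// Sketch only: thresholds are hypothetical; real embeddings have 1024 dimensions.
type EntityType = "species" | "concept" | "protocol" | "place" | "stakeholder";

const MERGE_THRESHOLD: Record<EntityType, number> = {
  species: 0.95,
  concept: 0.85,
  protocol: 0.88,
  place: 0.92,
  stakeholder: 0.9,
};

interface Entity {
  name: string;
  type: EntityType;
  embedding: number[];
}

function cosineSimilarity(a: number[], b: number[]): number {
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    const x = a[i] ?? 0;
    const y = b[i] ?? 0;
    dot += x * y;
    normA += x * x;
    normB += y * y;
  }
  return normA === 0 || normB === 0 ? 0 : dot / Math.sqrt(normA * normB);
}

// Two mentions are merge candidates only if they share a type and clear that type's cutoff.
function shouldMerge(a: Entity, b: Entity): boolean {
  if (a.type !== b.type) return false;
  return cosineSimilarity(a.embedding, b.embedding) >= MERGE_THRESHOLD[a.type];
}

const marmot: Entity = { name: "marmot", type: "species", embedding: [1, 0, 0] };
const marmots: Entity = { name: "marmots", type: "species", embedding: [0.99, 0.1, 0] };
const snowmelt: Entity = { name: "snowmelt timing", type: "concept", embedding: [1, 0, 0] };

console.log(shouldMerge(marmot, marmots)); // true
console.log(shouldMerge(marmot, snowmelt)); // false
```

Keeping the cutoff per type lets strict categories such as species names demand near-identical embeddings while looser categories such as concepts tolerate more variation.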

Feedback & Contact

The Knowledge Commons is an evolving platform and we welcome feedback from the community. If you notice missing publications, incorrect data, broken links, or have ideas for new features, there are two ways to get in touch:

Report an issue on GitHub: github.com/ikb-rmbl/RMBL_knowledge_hub/issues — best for bug reports, data corrections, and feature requests.
Contact the developer: Ian Breckheimer — ikb@rmbl.org

Acknowledgments

The RMBL Knowledge Commons was developed with support from the Clark Family Foundation. It was built by RMBL using data from CrossRef, OpenAlex, Unpaywall, ITIS, GNIS, and multiple data repositories.

Available MCP Tools

Tool	Description
search_rmbl	Full-text search across all collections
get_publication	Publication detail with authors, abstract, entities, citations
get_dataset	Dataset detail with creators and entities
get_document	Document detail with entities and stakeholders
get_entity	Entity lookup (species, concept, protocol, place, stakeholder)
find_related	Related works via semantic similarity, shared entities, co-authorship, citations
explore_neighborhood	Research neighborhood detail with primer
list_neighborhoods	Browse or search 146 research neighborhoods
get_frontier	Research frontier detail: key questions (with verbatim primary-paper cites), data gaps, currency state, contributing neighborhoods
list_frontiers	Browse or search paper-grounded research frontiers (sortable by breadth/leverage)
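
Under the hood, an MCP client invokes these tools with JSON-RPC 2.0 messages whose method is tools/call. The sketch below builds such a message for search_rmbl. MCP clients such as Claude Desktop normally construct these messages (and the initialization handshake) for you, and the argument name query is an assumption: the real parameter names come from each tool's published input schema.

```typescript
// Sketch only: the shape of an MCP tools/call request.
// The "query" argument name is hypothetical; check the tool's input schema.
interface ToolCallRequest {
  jsonrpc: "2.0";
  id: number;
  method: "tools/call";
  params: { name: string; arguments: Record<string, unknown> };
}

let nextId = 1;

function toolCall(name: string, args: Record<string, unknown>): ToolCallRequest {
  return {
    jsonrpc: "2.0",
    id: nextId++,
    method: "tools/call",
    params: { name, arguments: args },
  };
}

const request = toolCall("search_rmbl", { query: "alpine pollination" });
console.log(JSON.stringify(request));
// {"jsonrpc":"2.0","id":1,"method":"tools/call","params":{"name":"search_rmbl","arguments":{"query":"alpine pollination"}}}
```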