Our Methodology

Full transparency. Every data source, weight, and formula documented.

For journalists: plain-English explanation

LangPop tracks real developer activity across 7 independent sources: GitHub (code repositories and contributor activity), job postings (employer demand), Stack Overflow (technical Q&A), Google Trends (search interest), package registry downloads (npm, PyPI, crates.io), Reddit community discussions, and tutorial platform enrollments. Unlike indexes that rely on a single signal, the composite reflects how a language is actually used across the industry.

Sources are weighted to reflect hands-on signals more heavily than passive ones. GitHub and job postings together account for 45% of the score because they represent developers actively writing code and employers actively hiring. Search queries and tutorial interest carry less weight — someone searching for a language may be a beginner or troubleshooter, not a practitioner. This prevents beginner-search spikes (the main criticism of TIOBE) from distorting the rankings.

The composite score updates every Tuesday. Python is anchored at 100 — every other language is scored relative to it. For citation purposes: “According to LangPop (langpop.com), a weekly programming language popularity index tracking 7 data sources.”

Philosophy

Existing language popularity indexes have significant flaws:

  • TIOBE relies on search queries, an approach critics say “measures desperation, not popularity”; hence Scratch ranking #12, above Rust.
  • PYPL only tracks tutorial searches and limits to 22 languages.
  • RedMonk relies heavily on Stack Overflow, whose question volume has fallen by more than half year over year since AI assistants became mainstream (verifiable in the Stack Exchange Data Explorer).
  • IEEE Spectrum and GitHub Octoverse update annually, so their data is stale before publication.

LangPop addresses these issues with a multi-source composite that updates weekly, with full transparency on how every score is calculated.

Data Sources

GitHub

Daily · 25% weight

Active repositories, stars, forks, and contributor activity

We track repositories with activity in the last 90 days, weighted by stars (30%), forks (20%), issues (20%), and recent commits (30%).
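
The sub-weights above can be read as a weighted sum. The following is a minimal sketch, not the production code: it assumes each signal has already been normalized to a 0-100 scale before weighting (the text does not say), and the signal names and example values are hypothetical.

```python
# Sub-weights quoted in the text above; the signal names are illustrative.
GITHUB_SUBWEIGHTS = {
    "stars": 0.30,
    "forks": 0.20,
    "issues": 0.20,
    "recent_commits": 0.30,
}


def github_score(signals: dict[str, float]) -> float:
    """Weighted sum of per-signal values (assumed already on a 0-100 scale)."""
    return sum(weight * signals[name] for name, weight in GITHUB_SUBWEIGHTS.items())


# Made-up values for illustration only (not real LangPop data).
example = {"stars": 80.0, "forks": 60.0, "issues": 50.0, "recent_commits": 90.0}
print(round(github_score(example), 2))  # 73.0
```

Because stars and recent commits each carry 30%, a language with many starred but dormant repositories would score lower here than one with the same stars and steady commit activity.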

Job Postings

Weekly · 20% weight

Aggregated job listings mentioning each language

Data aggregated from major job boards. We normalize for job board size and filter out false positives (e.g., "Java" in "JavaScript").
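
One simple way to avoid the "Java" in "JavaScript" false positive is to require that a language name is not embedded in a longer token. The sketch below illustrates that idea with a regular expression; it is an assumption about how such a filter could work, not a description of the actual pipeline, and the function name `mentions_language` is hypothetical.

```python
import re


def mentions_language(text: str, language: str) -> bool:
    """Return True if `language` appears as a standalone token in `text`.

    The lookbehind rejects matches preceded by a letter or digit; the
    lookahead rejects matches followed by a letter, digit, '+', or '#',
    so "Java" does not match inside "JavaScript" and "C" does not match
    inside "C++" or "C#".
    """
    pattern = r"(?<![A-Za-z0-9])" + re.escape(language) + r"(?![A-Za-z0-9+#])"
    return re.search(pattern, text, flags=re.IGNORECASE) is not None


print(mentions_language("Senior JavaScript developer", "Java"))   # False
print(mentions_language("Java and Kotlin backend role", "Java"))  # True
print(mentions_language("Modern C++ engineer", "C"))              # False
print(mentions_language("Modern C++ engineer", "C++"))            # True
```

A token check like this only handles substring collisions. Ambiguous names that are also ordinary words (for example "Go" or "Swift") would need additional context rules.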

Stack Overflow

Weekly · 15% weight

Questions, answers, and tag activity

Note: Stack Overflow question volume has fallen by more than half year over year since AI assistants became mainstream (verifiable in the public Stack Exchange Data Explorer). We weight recent activity more heavily and supplement it with answer-quality metrics.

Google Trends

Weekly · 15% weight

Search interest relative to peak popularity

Normalized search volume with geographic weighting. Filtered for programming context (excludes "python snake" searches).

npm/PyPI/crates.io

Weekly · 10% weight

Package downloads and ecosystem activity

Only applicable to languages with package managers. Measures ecosystem health through download velocity and package count growth.

Reddit

Weekly · 10% weight

Subreddit activity and community engagement

Tracks r/programming mentions and language-specific subreddit growth, and applies sentiment analysis to discussions.

Tutorial Platforms

Weekly · 5% weight

Course enrollments and learning interest

Data from major learning platforms. Indicates new developer interest and language accessibility.

Composite Score Calculation

CompositeScore =
    (GitHub × 0.25) +
    (JobPostings × 0.20) +
    (StackOverflow × 0.15) +
    (GoogleTrends × 0.15) +
    (PackageManagers × 0.10) +
    (Reddit × 0.10) +
    (Tutorials × 0.05)

Each component is normalized to a 0-100 scale
based on the maximum value in that category.

Final scores are relative to Python = 100
(the current top-ranked language).
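
The three steps above (normalize each source, apply the weights, anchor to Python) can be sketched as follows. This is an illustrative reading of the formula, not the production implementation: it follows the "maximum value in that category" wording by dividing by the category maximum (a min-max variant would subtract the category minimum first), and the source keys, function names, and raw numbers are hypothetical.

```python
# Weights from the formula above; the snake_case source keys are illustrative.
WEIGHTS = {
    "github": 0.25,
    "job_postings": 0.20,
    "stack_overflow": 0.15,
    "google_trends": 0.15,
    "package_managers": 0.10,
    "reddit": 0.10,
    "tutorials": 0.05,
}


def normalize_by_max(raw: dict[str, float]) -> dict[str, float]:
    """Scale one source's raw values to 0-100 relative to the category maximum."""
    peak = max(raw.values())
    if peak <= 0:
        return {language: 0.0 for language in raw}
    return {language: 100.0 * value / peak for language, value in raw.items()}


def composite_scores(
    raw_by_source: dict[str, dict[str, float]], anchor: str = "Python"
) -> dict[str, float]:
    """Normalize each source, apply the weights, then rescale so anchor = 100."""
    normalized = {
        source: normalize_by_max(values) for source, values in raw_by_source.items()
    }
    languages = list(next(iter(raw_by_source.values())))
    weighted = {
        language: sum(WEIGHTS[source] * normalized[source][language] for source in WEIGHTS)
        for language in languages
    }
    return {
        language: 100.0 * score / weighted[anchor]
        for language, score in weighted.items()
    }


# Made-up raw counts for illustration only (not real LangPop data).
raw = {source: {"Python": 200.0, "OtherLang": 100.0} for source in WEIGHTS}
scores = composite_scores(raw)
print({language: round(score, 1) for language, score in scores.items()})
# {'Python': 100.0, 'OtherLang': 50.0}
```

The sketch expects every language to have a value in every source. A language with no data for a source would be passed in with a raw value of 0 for that source.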

Known Limitations

  • Package manager bias: Languages without popular package managers (C, C++) score 0 in that category.
  • Enterprise blind spots: Proprietary/closed-source usage is not captured by our sources.
  • Geographic weighting: Data skews toward English-speaking markets; we're working on regional adjustments.
  • AI assistant impact: As AI changes how developers seek help, Stack Overflow data becomes less reliable.
  • Job posting noise: Some listings mention multiple languages; we use NLP to weight primary language requirements.
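
To make the package manager bias concrete, the sketch below computes the highest weighted composite a language can reach, before anchoring to Python, when it scores 0 in one or more sources. The weights come from the published formula; the source keys and the function name `max_raw_composite` are illustrative assumptions.

```python
# Weights from the published formula; the snake_case keys are illustrative.
WEIGHTS = {
    "github": 0.25,
    "job_postings": 0.20,
    "stack_overflow": 0.15,
    "google_trends": 0.15,
    "package_managers": 0.10,
    "reddit": 0.10,
    "tutorials": 0.05,
}


def max_raw_composite(missing_sources: set[str]) -> float:
    """Upper bound on the weighted composite (before anchoring to Python)
    for a language that scores 0 in every source in `missing_sources`
    and a perfect 100 in all the others."""
    return sum(
        100.0 * weight
        for source, weight in WEIGHTS.items()
        if source not in missing_sources
    )


print(round(max_raw_composite({"package_managers"}), 2))  # 90.0
```

In other words, a language such as C starts with a 10-point handicap on the weighted composite even if it leads every other category.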

Update Schedule

  • Daily: GitHub data collection
  • Weekly (Tuesdays): All other sources + composite score recalculation + newsletter
  • Monthly: Methodology review and weight adjustments
  • Quarterly: Major reports with trend analysis

Methodology Changelog

v1.0 — Launch Release

February 2026

Current
  • Initial 7-source composite formula established
  • Weekly update cadence implemented
  • GitHub (25%), Job Postings (20%), Stack Overflow (15%), Google Trends (15%), Package Managers (10%), Reddit (10%), Tutorials (5%)
  • Min-max normalization applied to all sources
  • Python baseline = 100 score anchor
  • Enterprise gaps and geographic bias documented

v1.1 — Planned

Q3 2026

Regional market segmentation, enterprise job board integration, AI/LLM language tracking

v2.0 — Roadmap

2026-2027

Machine learning-based trend prediction, proprietary enterprise data integration, industry/vertical-specific indexes

Note on transparency: All methodology changes will be announced in advance and the changelog updated weekly. No weight adjustments are made without explaining the rationale publicly.

Want the methodology programmatically? GET /api/v1/methodology returns the source list, weights, and formula as JSON.
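
A minimal sketch of calling that endpoint from Python's standard library is shown below. Only the path `/api/v1/methodology` comes from this page; the base URL `https://langpop.com` is an assumption based on the citation domain, and the response is returned as parsed JSON without assuming any particular field names.

```python
import json
from urllib.parse import urljoin
from urllib.request import urlopen

BASE_URL = "https://langpop.com"  # assumption: the API is served from the main site
METHODOLOGY_PATH = "/api/v1/methodology"  # path as documented above


def methodology_url(base_url: str = BASE_URL) -> str:
    """Build the full URL of the methodology endpoint."""
    return urljoin(base_url, METHODOLOGY_PATH)


def fetch_methodology(base_url: str = BASE_URL, timeout: float = 10.0):
    """GET the methodology document and return it as parsed JSON.

    The structure of the returned object is not assumed here;
    inspect it before relying on specific keys.
    """
    with urlopen(methodology_url(base_url), timeout=timeout) as response:
        return json.load(response)


print(methodology_url())  # https://langpop.com/api/v1/methodology
```

Calling `fetch_methodology()` performs the network request; it raises `urllib.error.URLError` if the host cannot be reached.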