2026-06-10
- SEO/discoverability pass. Fixed work-page meta descriptions that were hard-cut mid-word: `build_work_pages.py` now clips on a word boundary with an ellipsis via the new `clip_description()` in `code/src/site_nav.py` (145 of 165 work descriptions corrected; rendered length ≤160).
- Added Twitter Card (`summary_large_image`) and `og:image:alt` tags site-wide. Generators (`build_work_pages`, `build_domain_pages`, `build_catalog`, `build_exports_page`, `build_evidence_page`, `build_updates_page`, `build_github_inventory`) emit them; hand-maintained pages (index, publications, art, videos, collaborators, search, discovery, cite-verify, media, software) are covered by a new idempotent `code/orchestrators/ensure_social_meta.py`.
- Added the sixth research-domain landing page `domain-biomedicine.html` (Genetics & Biomedicine, 15 works) with `og-biomedicine.jpg`; added it to `sitemap_policy.py`; relinked the homepage card from a raw `pages/BIBLIOGRAPHY.md#…` anchor to the new page.
- Polished homepage: removed the duplicate `theme-color` and standardized to `#0c0c0e` (matches manifest); tightened the meta/og description to 153 chars; added word separators between publication-card title/venue/citation spans so text extractors and screen readers no longer read them run-together.
- New SEO invariants in `code/src/seo_invariants.py` (`check_social_meta`, `check_work_descriptions`) with tests in `test_seo_invariants.py` and `test_site_nav.py`; full suite 88 passing.
- Deep-scan follow-ups: work-page fallback meta descriptions now include the title (eliminated 17 duplicate/templated descriptions across same-type works; 162/165 work descriptions now unique). Enriched `ScholarlyArticle.author` JSON-LD with inline `@type`/`name`/`url` (not just a cross-document `@id`) so search engines reliably attribute authorship for rich results. Applied the same word-boundary `clip_description()` to `build_paper_pages.py` (148 paper-folder pages no longer truncate mid-word). Verified site-wide: 373 JSON-LD blocks all valid, full image-alt coverage (incl. the JS-rendered art gallery via `artAlt()`), no broken internal links.
- Reviewed Google Search Console (3-month window): 74 clicks / 3.25K impressions / 2.3% CTR / avg position 9; 118 indexed vs 111 not (43 "crawled-not-indexed" + 50 "discovered-not-indexed" thin/templated work pages; the description/author fixes target these). Confirmed the lone "Not found (404)" (`papers/2024_PopulationSearch/`) is stale (crawled before publish; now live, noindex, canonicalized). Findings + roadmap recorded in `reports/seo-discoverability-audit-2026-06-10.md`.
- Deduplicated the CEREBRUM pair: work `…118` (`papers/2025_CEREBRUM2`, the v1.4 deposit) now sets `rel=canonical` + `og:url` to the primary entry `…010` (`papers/2025_CEREBRUM`), consolidating ranking signals for the same paper. Added a shared `WORK_CANONICAL_OVERRIDES` / `canonical_work_key` in `code/src/site_nav.py`, used by both the work-page generator and the `check_work_pages` invariant, with a regression test.
- Corrected three mis-attributed paper abstracts (source READMEs had the wrong paper's text), sourced from authoritative records: `2023_HoneyBeeGeneExpression` (Zenodo TSGE meta-analysis abstract), `2023_AII_v1` (AII overview, recovered from the file's own schema block; the body had TrustFinder text), and `2023_ToComment` (a *Physics of Life Reviews* commentary on Manrique & Walker's "To copy or not to copy?", per Semantic Scholar, not the digital-memes text it carried). All 165 work-page meta descriptions are now unique. Confirmed `www`→apex 301 redirect and self-canonical (no duplicate-content split). Flagged: works `…010`/`…118` are the same CEREBRUM paper (DOI `zenodo.15170907` resolves to `15231156`, v1.4); the bibliography dedup/curation decision is left to the maintainer.
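
For reference, the word-boundary clipping described in the first entry could look roughly like the sketch below. This is not the code in `code/src/site_nav.py`: the signature, the `limit` and `ellipsis` parameters, the single-character ellipsis, and the whitespace/punctuation handling are all assumptions made for illustration. Only the behaviour (clip on a word boundary, append an ellipsis, rendered length ≤160) comes from the entry above.

```python
def clip_description(text: str, limit: int = 160, ellipsis: str = "…") -> str:
    """Clip text to at most `limit` characters without cutting a word in half.

    Hypothetical sketch: the signature and defaults are assumed, not taken
    from the real site_nav.py.
    """
    # Collapse runs of whitespace so the length check matches the rendered text.
    text = " ".join(text.split())
    if len(text) <= limit:
        return text

    # Reserve room for the ellipsis, then look one character past the budget
    # so a word that ends exactly at the budget is kept whole.
    budget = limit - len(ellipsis)
    window = text[: budget + 1]
    last_space = window.rfind(" ")
    if last_space > 0:
        clipped = window[:last_space]
    else:
        # A single word longer than the budget: a hard cut is unavoidable.
        clipped = window[:budget]

    # Drop trailing punctuation so the result does not end in ",…" or ".…".
    return clipped.rstrip(" ,;:.-") + ellipsis
```

With these assumptions, `clip_description("alpha beta gamma", limit=12)` returns `"alpha beta…"`, and text that already fits within the limit is returned unchanged.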