API Reference

This section documents all modules and functions in the sw-metadata-bot package.

Analysis Runtime Module

Low-level analysis workflow helpers for pipeline orchestration.

sw_metadata_bot.analysis_runtime.extract_previous_commit(record)[source]

Return previous commit id from report records with compatibility fallback.

Parameters:

record (dict)

Return type:

str | None

sw_metadata_bot.analysis_runtime.resolve_per_repo_paths(analysis_root, repo_url)[source]

Compute per-repository output paths within the analysis root.

Parameters:
  • analysis_root (Path)

  • repo_url (str)

Return type:

dict[str, Path]

sw_metadata_bot.analysis_runtime.copy_previous_repo_artifacts(previous_repo_folder, current_repo_folder)[source]

Copy previous snapshot repository artifacts into current snapshot folder.

Parameters:
  • previous_repo_folder (Path)

  • current_repo_folder (Path)

Return type:

None

sw_metadata_bot.analysis_runtime.load_previous_repo_record(previous_snapshot_root, repo_url)[source]

Load previous per-repo record from previous snapshot if available.

Parameters:
  • previous_snapshot_root (Path | None)

  • repo_url (str)

Return type:

dict | None

sw_metadata_bot.analysis_runtime.standardize_metacheck_outputs(repo_folder)[source]

Normalize metacheck output names to stable per-repo filenames.

Parameters:

repo_folder (Path)

Return type:

None

sw_metadata_bot.analysis_runtime.run_metacheck_for_repo(repo_url, repo_folder, metacheck_command)[source]

Run metacheck for a single repository URL into its own folder.

Parameters:
  • repo_url (str)

  • repo_folder (Path)

  • metacheck_command

Return type:

None

sw_metadata_bot.analysis_runtime.build_analysis_counters(records)[source]

Build analysis counters using the unified report schema.

Parameters:

records (list[dict[str, object]])

Return type:

dict[str, int]

sw_metadata_bot.analysis_runtime.build_analysis_run_report(records, *, dry_run, run_root, analysis_summary_file, previous_report)[source]

Build run-level report payload from analysis decision records.

Parameters:
  • records

  • dry_run

  • run_root

  • analysis_summary_file

  • previous_report

Return type:

dict[str, object]

sw_metadata_bot.analysis_runtime.detect_platform_from_repo_url(repo_url)[source]

Detect publish platform from repository URL.

Parameters:

repo_url (str)

Return type:

str | None

sw_metadata_bot.analysis_runtime.is_previous_issue_open(previous_record)[source]

Infer whether previous issue was open from stored metadata only.

Parameters:

previous_record (dict[str, object])

Return type:

bool

sw_metadata_bot.analysis_runtime.build_record_entry(*, run_root, repo_url, platform, pitfalls_count, warnings_count, analysis_date, metacheck_version, pitfalls_ids, warnings_ids, action, reason_code, findings_signature, current_commit_id, previous_commit_id, previous_issue_url, previous_issue_state, dry_run, issue_persistence, issue_url, file_path, error=None)[source]

Build a per-repository analysis record payload.

Parameters:
  • run_root (Path)

  • repo_url (str)

  • platform (str | None)

  • pitfalls_count (int)

  • warnings_count (int)

  • analysis_date (str)

  • metacheck_version (str)

  • pitfalls_ids (list[str])

  • warnings_ids (list[str])

  • action (str)

  • reason_code (str)

  • findings_signature (str)

  • current_commit_id (str | None)

  • previous_commit_id (str | None)

  • previous_issue_url (str | None)

  • previous_issue_state (str | None)

  • dry_run (bool)

  • issue_persistence (str)

  • issue_url (str | None)

  • file_path (Path)

  • error (str | None)

Return type:

dict[str, object]

sw_metadata_bot.analysis_runtime.write_analysis_repo_report(repo_folder, record, *, dry_run, run_root, analysis_summary_file, previous_report)[source]

Write per-repository analysis report using analysis-stage counters.

Parameters:
  • repo_folder

  • record

  • dry_run

  • run_root

  • analysis_summary_file

  • previous_report

Return type:

None

sw_metadata_bot.analysis_runtime.create_analysis_record(*, run_root, repo_url, repo_folder, previous_record, current_commit_id, dry_run, custom_message)[source]

Create a decision record for a repository without platform API calls.

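Parameters:
  • run_root

  • repo_url

  • repo_folder

  • previous_record

  • current_commit_id

  • dry_run

  • custom_message

Return type: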

dict[str, object]

Check Parsing Module

Shared parsing helpers for RSMetacheck check identifiers.

sw_metadata_bot.check_parsing.get_check_catalog_id(check)[source]

Return full RSMetacheck catalog ID URL for a check when available.

Preferred source is the new schema key assessesIndicator.@id when it points to the RSMetacheck catalog. For backward compatibility, this falls back to the legacy pitfall key.

Parameters:

check (dict)

Return type:

str

sw_metadata_bot.check_parsing.get_short_check_code(check)[source]

Return short check code such as P001 or W004.

Parameters:

check (dict)

Return type:

str

sw_metadata_bot.check_parsing.is_check_reported(check)[source]

Return True only when a check is explicitly reported by metacheck.

Verbose metacheck output marks each evaluated check with an output key. Only values representing true are considered reported findings.

Parameters:

check (dict)

Return type:

bool

sw_metadata_bot.check_parsing.extract_check_ids(checks)[source]

Extract ordered unique pitfall and warning codes from check entries.

Parameters:

checks (list[dict])

Return type:

tuple[list[str], list[str]]
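The three helpers above can be read as one small pipeline: keep only reported checks, reduce each to a short code, and collect unique codes in order. The following is a minimal sketch of that idea, not the package's implementation. The catalog prefix, the regular expression, the exact true-like values, and the function names short_code, is_reported, and extract_ids are assumptions made for illustration; only the key names assessesIndicator, @id, pitfall, and output come from the descriptions above.

```python
import re

# Hypothetical catalog prefix; the real RSMetacheck catalog URL is not stated here.
CATALOG_PREFIX = "https://example.org/rsmetacheck/catalog/"


def short_code(check: dict) -> str:
    """Return a code such as P001 or W004, or "" when none is found."""
    indicator = check.get("assessesIndicator")
    candidate = indicator.get("@id", "") if isinstance(indicator, dict) else ""
    if not candidate.startswith(CATALOG_PREFIX):
        # Legacy fallback: older outputs carry the identifier under "pitfall".
        candidate = str(check.get("pitfall", ""))
    match = re.search(r"([PW]\d{3})", candidate)
    return match.group(1) if match else ""


def is_reported(check: dict) -> bool:
    """Treat only true-like "output" values as reported findings (assumed rule)."""
    value = check.get("output")
    if isinstance(value, bool):
        return value
    return str(value).strip().lower() == "true"


def extract_ids(checks: list[dict]) -> tuple[list[str], list[str]]:
    """Collect unique pitfall (P...) and warning (W...) codes in first-seen order."""
    pitfalls: list[str] = []
    warnings: list[str] = []
    for check in checks:
        if not is_reported(check):
            continue
        code = short_code(check)
        if code.startswith("P") and code not in pitfalls:
            pitfalls.append(code)
        elif code.startswith("W") and code not in warnings:
            warnings.append(code)
    return pitfalls, warnings
```

Lists rather than sets hold the results so that the first-seen order of codes is preserved, which matches the "ordered unique" wording above.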

Commit Lookup Module

Repository head commit lookup utilities.

sw_metadata_bot.commit_lookup.parse_github_repo(repo_url)[source]

Parse owner/repo from a GitHub repository URL.

Parameters:

repo_url (str)

Return type:

tuple[str, str] | None
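As an illustration of the owner/repo parsing described above, here is a hedged sketch built on the standard library. The function name parse_github_repo_sketch, the accepted hosts, and the handling of a trailing .git suffix are assumptions; the package's own rules may differ, for example for SSH-style URLs, which this sketch rejects.

```python
from typing import Optional
from urllib.parse import urlparse


def parse_github_repo_sketch(repo_url: str) -> Optional[tuple[str, str]]:
    """Return (owner, repo) for an https GitHub URL, or None if it does not match."""
    parsed = urlparse(repo_url.strip())
    # Assumption: only github.com and www.github.com count as GitHub hosts.
    if parsed.netloc.lower() not in {"github.com", "www.github.com"}:
        return None
    parts = [segment for segment in parsed.path.split("/") if segment]
    if len(parts) < 2:
        return None
    owner, repo = parts[0], parts[1]
    if repo.endswith(".git"):
        repo = repo[: -len(".git")]
    return owner, repo
```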

sw_metadata_bot.commit_lookup.resolve_gitlab_project_path(repo_url)[source]

Parse host and project path for GitLab repositories.

Parameters:

repo_url (str)

Return type:

tuple[str, str] | None

sw_metadata_bot.commit_lookup.is_commit_hash(value)[source]

Return True if value looks like a commit hash.

Parameters:

value (str)

Return type:

bool
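What "looks like a commit hash" means is not spelled out above. One plausible reading, sketched below, accepts 7 to 40 hexadecimal characters, which covers abbreviated and full SHA-1 identifiers. The length bounds and the name looks_like_commit_hash are assumptions.

```python
import re

# Assumption: abbreviated (7+) up to full-length (40) hexadecimal SHA-1 strings.
_HEX_SHA = re.compile(r"[0-9a-fA-F]{7,40}")


def looks_like_commit_hash(value: str) -> bool:
    """Return True when the whole string is 7 to 40 hex characters."""
    return bool(_HEX_SHA.fullmatch(value.strip()))
```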

sw_metadata_bot.commit_lookup.get_github_head_commit(repo_url, token=None)[source]

Fetch current head commit from GitHub API.

Parameters:
  • repo_url (str)

  • token (str | None)

Return type:

str | None

sw_metadata_bot.commit_lookup.get_gitlab_head_commit(repo_url, token=None)[source]

Fetch current head commit from GitLab API for gitlab* hosts.

Parameters:
  • repo_url (str)

  • token (str | None)

Return type:

str | None

sw_metadata_bot.commit_lookup.get_generic_git_head_commit(repo_url)[source]

Fetch current head commit via git ls-remote as generic fallback.

Parameters:

repo_url (str)

Return type:

str | None
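The generic fallback relies on git ls-remote, which prints one "<sha><TAB><ref>" line per matching ref. A minimal sketch of that approach follows. The function names, the timeout, and the choice to return None on any failure are assumptions rather than the package's documented behavior.

```python
import subprocess
from typing import Optional


def parse_ls_remote_head(output: str) -> Optional[str]:
    """Extract the HEAD commit id from `git ls-remote <url> HEAD` output."""
    for line in output.splitlines():
        parts = line.split()
        if len(parts) == 2 and parts[1] == "HEAD":
            return parts[0]
    return None


def head_commit_via_git(repo_url: str, timeout: float = 30.0) -> Optional[str]:
    """Ask the remote for its HEAD commit; return None when git fails."""
    try:
        result = subprocess.run(
            ["git", "ls-remote", repo_url, "HEAD"],
            capture_output=True,
            text=True,
            timeout=timeout,
            check=True,
        )
    except (OSError, subprocess.SubprocessError):
        # Covers a missing git binary, non-zero exit codes, and timeouts.
        return None
    return parse_ls_remote_head(result.stdout)
```

Keeping the parsing separate from the subprocess call makes the parsing testable without network access.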

sw_metadata_bot.commit_lookup.get_repo_head_commit(repo_url)[source]

Fetch current head commit using API-first and git fallback strategies.

Parameters:

repo_url (str)

Return type:

str | None
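An "API-first with git fallback" strategy can be sketched as an ordered chain of lookup functions where the first non-empty result wins. The helper below is hypothetical: its name, its signature, and the decision to swallow exceptions from individual strategies are assumptions, and the real function takes only repo_url.

```python
from typing import Callable, Optional, Sequence

Lookup = Callable[[str], Optional[str]]


def resolve_head_commit(repo_url: str, strategies: Sequence[Lookup]) -> Optional[str]:
    """Try each lookup in order and return the first commit id found."""
    for strategy in strategies:
        try:
            commit = strategy(repo_url)
        except Exception:
            # Assumption: a failing strategy should not abort the whole chain.
            commit = None
        if commit:
            return commit
    return None
```

In such a design the platform API lookups would come first in the sequence and the git ls-remote lookup last.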

Config Utils Module

Helpers for the unified configuration file.

sw_metadata_bot.config_utils.normalize_repo_url(url)[source]

Normalize repository URLs for matching and persistence.

Parameters:

url (str)

Return type:

str

sw_metadata_bot.config_utils.detect_platform(url)[source]

Detect publishing platform from repository URL.

Returns "github" for GitHub URLs, "gitlab" for any GitLab URL, or None when the URL does not match a known platform.

Parameters:

url (str)

Return type:

str | None
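The return contract above ("github", "gitlab", or None) can be illustrated with a host-based check. This is a sketch under assumptions: the substring match on "gitlab" and the name detect_platform_sketch are illustrative, and the package may use different matching rules.

```python
from typing import Optional
from urllib.parse import urlparse


def detect_platform_sketch(url: str) -> Optional[str]:
    """Return "github", "gitlab", or None based on the URL host."""
    host = (urlparse(url.strip()).hostname or "").lower()
    if host == "github.com" or host.endswith(".github.com"):
        return "github"
    # Assumption: any host containing "gitlab" is a GitLab instance.
    if "gitlab" in host:
        return "gitlab"
    return None
```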

sw_metadata_bot.config_utils.load_config(config_path)[source]

Load and validate a unified configuration file.

Parameters:

config_path (Path)

Return type:

dict

sw_metadata_bot.config_utils.get_repositories(config)[source]

Return normalized repositories preserving order and uniqueness.

Parameters:

config (dict)

Return type:

list[str]
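Normalizing before de-duplicating is what makes two spellings of the same repository collapse into one entry. The sketch below shows the idea. The "repositories" config key, the specific normalization steps (trimming whitespace, a trailing slash, and a .git suffix), and the function names are assumptions, not the package's documented behavior.

```python
def normalize_url(url: str) -> str:
    """Assumed normalization: trim whitespace, trailing slashes, and a .git suffix."""
    cleaned = url.strip().rstrip("/")
    if cleaned.endswith(".git"):
        cleaned = cleaned[: -len(".git")]
    return cleaned


def unique_repositories(config: dict) -> list[str]:
    """Return normalized URLs in first-seen order without duplicates."""
    seen: set[str] = set()
    ordered: list[str] = []
    # "repositories" is a hypothetical key name for this sketch.
    for raw in config.get("repositories", []):
        url = normalize_url(raw)
        if url and url not in seen:
            seen.add(url)
            ordered.append(url)
    return ordered
```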

sw_metadata_bot.config_utils.get_custom_message(config)[source]

Return the configured issue custom message if present.

Parameters:

config (dict)

Return type:

str | None

sw_metadata_bot.config_utils.get_opt_out_repositories(config)[source]

Return normalized repository URLs configured as inline opt-outs.

Parameters:

config (dict)

Return type:

set[str]

sw_metadata_bot.config_utils.append_opt_out_repository(config_path, repo_url)[source]

Persist a repository to the inline opt-outs list when not already present.

Parameters:
  • config_path (Path)

  • repo_url (str)

Return type:

bool

sw_metadata_bot.config_utils.resolve_output_root(config, config_path)[source]

Return the configured output root, resolving relative paths from project root.

Parameters:
  • config

  • config_path

Return type:

Path

sw_metadata_bot.config_utils.resolve_run_name(config, config_path)[source]

Return the configured run name or a sensible default.

Parameters:
  • config

  • config_path

Return type:

str

sw_metadata_bot.config_utils.resolve_snapshot_tag(config, explicit_snapshot_tag)[source]

Resolve the snapshot tag from CLI override or config defaults.

Parameters:
  • config (dict)

  • explicit_snapshot_tag (str | None)

Return type:

str | None

sw_metadata_bot.config_utils.sanitize_repo_name(repo_url)[source]

Sanitize repository URL to a safe folder name format.

Uses a generic URL-safe transformation so non-standard URLs still map to deterministic folder names.

Parameters:

repo_url (str) – Repository URL or identifier string

Returns:

Sanitized folder name (lowercase, underscores only)

Return type:

str
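A deterministic, URL-safe folder name of the kind described above (lowercase, underscores only) could be produced as follows. This is one possible transformation, not necessarily the one the package uses. Dropping the URL scheme and collapsing runs of separators are assumptions, as is the name sanitize_name.

```python
import re


def sanitize_name(repo_url: str) -> str:
    """Map a URL or identifier to a lowercase, underscore-separated folder name."""
    # Assumption: the scheme (https://, git://, ...) carries no useful information.
    without_scheme = re.sub(r"^[a-zA-Z][a-zA-Z0-9+.-]*://", "", repo_url.strip())
    # Replace every run of characters outside a-z and 0-9 with one underscore.
    collapsed = re.sub(r"[^a-z0-9]+", "_", without_scheme.lower())
    return collapsed.strip("_")
```

Because the mapping depends only on the input string, the same URL always yields the same folder name across runs.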

sw_metadata_bot.config_utils.copy_config_to_analysis_root(config_path, analysis_root)[source]

Copy the configuration file to the analysis root directory.

Parameters:
  • config_path (Path) – Path to the input configuration file

  • analysis_root (Path) – Root analysis directory where config will be copied

Raises:

IOError – If copying fails

Return type:

None

Create Issues Module

GitHub API Module

GitHub API client.

class sw_metadata_bot.github_api.GitHubAPI(token=None, dry_run=False)[source]

Bases: object

Simple GitHub API client.

Parameters:
  • token (str | None)

  • dry_run (bool)

__init__(token=None, dry_run=False)[source]

Initialize GitHub API client.

Parameters:
  • token (str | None)

  • dry_run (bool)

static parse_url(url)[source]

Parse GitHub URL to extract owner and repo.

Returns:

Tuple of (owner, repo_name)

Parameters:

url (str)

Return type:

tuple[str, str]

test_auth()[source]

Test if authentication works.

Return type:

bool

verify_auth()[source]

Verify authentication and return detailed information.

Returns:

Dictionary with authentication details including user, scopes, and permissions.

Return type:

dict

create_issue(repo_url, title, body)[source]

Create an issue on GitHub.

Returns:

URL of created issue (or fake URL in dry-run mode)

Parameters:
  • repo_url

  • title

  • body

Return type:

str

static parse_issue_url(issue_url)[source]

Parse a GitHub issue URL and return owner/repo/number.

Parameters:

issue_url (str)

Return type:

tuple[str, str, int]
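For illustration, splitting an issue URL into owner, repository, and issue number can be done with one regular expression. The pattern, the name parse_issue_url_sketch, and the choice to raise ValueError on a mismatch are assumptions; the method's actual error behavior is not documented here.

```python
import re

# Assumption: issue URLs have the form https://github.com/<owner>/<repo>/issues/<n>.
_ISSUE_URL = re.compile(r"^https?://github\.com/([^/]+)/([^/]+)/issues/(\d+)/?$")


def parse_issue_url_sketch(issue_url: str) -> tuple[str, str, int]:
    """Return (owner, repo, issue_number) or raise ValueError."""
    match = _ISSUE_URL.match(issue_url.strip())
    if match is None:
        raise ValueError(f"Not a GitHub issue URL: {issue_url}")
    owner, repo, number = match.groups()
    return owner, repo, int(number)
```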

get_issue(issue_url)[source]

Fetch issue details from GitHub.

Parameters:

issue_url (str)

Return type:

dict

get_issue_comments(issue_url)[source]

Fetch issue comments and return bodies.

Parameters:

issue_url (str)

Return type:

list[str]

add_issue_comment(issue_url, body)[source]

Add a comment to an issue.

Parameters:
  • issue_url

  • body

Return type:

None

close_issue(issue_url)[source]

Close an existing issue.

Parameters:

issue_url (str)

Return type:

None

GitLab API Module

GitLab API client.

class sw_metadata_bot.gitlab_api.GitLabAPI(token=None, dry_run=False)[source]

Bases: object

Simple GitLab API client.

Parameters:
  • token (str | None)

  • dry_run (bool)

__init__(token=None, dry_run=False)[source]

Initialize GitLab API client.

Parameters:
  • token (str | None)

  • dry_run (bool)

static parse_url(url)[source]

Parse GitLab URL to extract host, owner, and repo.

Returns:

Tuple of (host, owner, repo_name)

Parameters:

url (str)

Return type:

tuple[str, str, str]

get_base_url(host)[source]

Get API base URL for GitLab host.

Parameters:

host (str)

Return type:

str

test_auth(host='gitlab.com')[source]

Test if authentication works.

Parameters:

host (str)

Return type:

bool

verify_auth(host='gitlab.com')[source]

Verify authentication and return detailed information.

Returns:

Dictionary with authentication details including user, scopes, and permissions.

Parameters:

host (str)

Return type:

dict

create_issue(repo_url, title, body)[source]

Create an issue on GitLab.

Returns:

URL of created issue (or fake URL in dry-run mode)

Parameters:
  • repo_url

  • title

  • body

Return type:

str

static parse_issue_url(issue_url)[source]

Parse a GitLab issue URL and return host/owner/repo/iid.

Parameters:

issue_url (str)

Return type:

tuple[str, str, str, int]

get_issue(issue_url)[source]

Fetch issue details from GitLab.

Parameters:

issue_url (str)

Return type:

dict

get_issue_comments(issue_url)[source]

Fetch issue comments and return note bodies.

Parameters:

issue_url (str)

Return type:

list[str]

add_issue_comment(issue_url, body)[source]

Add a comment to an issue.

Parameters:
  • issue_url

  • body

Return type:

None

close_issue(issue_url)[source]

Close an existing issue.

Parameters:

issue_url (str)

Return type:

None

History Module

Helpers for loading and querying previous issue reports.

sw_metadata_bot.history.load_previous_report(report_path)[source]

Load report.json and index issue-lifecycle entries by repository URL.

Parameters:

report_path (Path | None)

Return type:

dict[str, dict]

sw_metadata_bot.history.load_previous_commit_report(report_path)[source]

Load report.json and index entries by repository for commit-based pre-skip.

Parameters:

report_path (Path | None)

Return type:

dict[str, dict]

sw_metadata_bot.history.findings_signature(pitfall_ids, warning_ids)[source]

Build a deterministic findings signature from pitfall and warning IDs.

Parameters:
  • pitfall_ids

  • warning_ids

Return type:

str
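A deterministic signature has to ignore the order and repetition of IDs so that identical findings compare equal between runs. One way to get that property is sketched below. The canonical string layout, the use of SHA-256, and the function name signature are assumptions and may not match the format the package stores.

```python
import hashlib
from typing import Iterable


def signature(pitfall_ids: Iterable[str], warning_ids: Iterable[str]) -> str:
    """Hash the sorted, de-duplicated IDs so equal findings give equal signatures."""
    canonical = (
        "P:" + ",".join(sorted(set(pitfall_ids)))
        + "|W:" + ",".join(sorted(set(warning_ids)))
    )
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()
```

Prefixing the two groups separately keeps a pitfall ID from being confused with a warning ID in the hashed string.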

Incremental Module

Decision engine for incremental issue lifecycle handling.

class sw_metadata_bot.incremental.Decision(action, reason)[source]

Bases: object

Decision outcome for a repository in incremental mode.

Parameters:
  • action (str)

  • reason (str)

action: str

reason: str

__init__(action, reason)

Parameters:
  • action (str)

  • reason (str)

Return type:

None

sw_metadata_bot.incremental.evaluate(*, previous_exists, unsubscribed, repo_updated, has_findings, identical_findings, previous_issue_open)[source]

Evaluate the configured decision tree and return action + reason.

Parameters:
  • previous_exists (bool)

  • unsubscribed (bool)

  • repo_updated (bool)

  • has_findings (bool)

  • identical_findings (bool)

  • previous_issue_open (bool)

Return type:

Decision

Main Module

CLI entry point for sw-metadata-bot.

sw_metadata_bot.main.main()[source]

Entry point for the CLI.

MetaCheck Wrapper Module

Pipeline Module

Pipeline command to run analysis workflows.

sw_metadata_bot.pipeline.find_latest_previous_report(output_root, run_name, current_snapshot_tag)[source]

Find latest previous report path from same run folder.

Parameters:
  • output_root (Path)

  • run_name (str)

  • current_snapshot_tag (str | None)

Return type:

Path | None

sw_metadata_bot.pipeline.run_pipeline(config_file, dry_run, snapshot_tag, previous_report)[source]

Run analysis and write issue decision records without API side effects.

Parameters:
  • config_file (Path)

  • dry_run (bool)

  • snapshot_tag (str | None)

  • previous_report (Path | None)

Return type:

None

Pitfalls Module

Pitfalls data loading and parsing.

sw_metadata_bot.pitfalls.load_pitfalls(file_path)[source]

Load pitfalls from JSON-LD file.

Parameters:

file_path (Path)

Return type:

dict

sw_metadata_bot.pitfalls.get_repository_url(data)[source]

Extract repository URL from pitfalls data.

Parameters:

data (dict)

Return type:

str

sw_metadata_bot.pitfalls.get_pitfalls_list(data)[source]

Get list of pitfall checks from data.

Parameters:

data (dict)

Return type:

list[dict]

sw_metadata_bot.pitfalls.get_warnings_list(data)[source]

Get list of warning checks from data.

Parameters:

data (dict)

Return type:

list[dict]

sw_metadata_bot.pitfalls.get_metacheck_version(data)[source]

Get the version of RSMetacheck used for analysis.

New schema (0.2.1+): the version is stored in checkingSoftware.softwareVersion. Falls back to “unknown” if not found.

Parameters:

data (dict)

Return type:

str

sw_metadata_bot.pitfalls.format_report(repo_url, data)[source]

Format pitfalls data into a readable report.

Parameters:
  • repo_url

  • data

Return type:

str

sw_metadata_bot.pitfalls.create_issue_body(report, custom_message=None)[source]

Wrap report in issue template using optional custom message or default greetings.

Parameters:
  • report (str)

  • custom_message (str | None)

Return type:

str

Publish Module

Publish issues from an existing analysis snapshot.

sw_metadata_bot.publish.publish_analysis(analysis_root, retry_failed=False)[source]

Publish issues from an existing analysis snapshot without re-running analysis.

Parameters:
  • analysis_root (Path)

  • retry_failed (bool)

Return type:

None

Token Resolver Module

Token resolution helpers for API clients.

sw_metadata_bot.token_resolver.resolve_token(*, explicit_token, env_var_name, dry_run)[source]

Resolve token with precedence: explicit > env > .env fallback.

Parameters:
  • explicit_token (str | None)

  • env_var_name (str)

  • dry_run (bool)

Return type:

str | None
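The precedence stated above (explicit, then environment, then .env file) can be sketched with the standard library alone. The dotenv_path parameter, the simple KEY=VALUE parser, and the function names are assumptions for illustration. The real function also takes dry_run, whose effect is not described here, so the sketch leaves it out.

```python
import os
from pathlib import Path
from typing import Optional


def read_dotenv_value(dotenv_path: Path, name: str) -> Optional[str]:
    """Read NAME=value from a .env-style file, ignoring comments and blank lines."""
    if not dotenv_path.is_file():
        return None
    for line in dotenv_path.read_text(encoding="utf-8").splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        if key.strip() == name:
            return value.strip().strip("\"'") or None
    return None


def resolve_token_sketch(
    *,
    explicit_token: Optional[str],
    env_var_name: str,
    dotenv_path: Path = Path(".env"),  # hypothetical parameter for this sketch
) -> Optional[str]:
    """Apply the precedence: explicit token, then environment, then .env file."""
    if explicit_token:
        return explicit_token
    from_env = os.environ.get(env_var_name)
    if from_env:
        return from_env
    return read_dotenv_value(dotenv_path, env_var_name)
```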

Verify Tokens Module

Token verification command - check if tokens have correct permissions.