Pencheff

Code security

SAST and secrets

Semgrep, Bandit, gosec, Brakeman, ESLint security, tree-sitter rules, and gitleaks.

Scope: Security Surfaces

Run web, API, code, dependency, cloud, AI, and internal-network assessments from one queue with unified findings, evidence, remediation, and audit output.

Output: Unified evidence

Findings, reports, dashboards, exports, integrations, and retests all read from the same normalized record.

Method: Deterministic first

Pencheff favors repeatable checks, then uses AI for triage, enrichment, orchestration, and remediation where it adds signal.

From the Pencheff docs

Scanners

/repos/scanners

Each repo scan fans out to several scanners. Every match is normalised into a shared RepoFinding row so the UI and the API don't care which engine produced it.

CodeQL was removed in v0.7 — the CodeQL CLI is not licensed for commercial use on third-party code, and Pencheff scans customer code. The SAST role is now filled by the five permissively-licensed tools listed below, all run as subprocesses (no static linking).

Semgrep OSS — multi-language SAST

Pinned to an explicit allowlist of OSS Semgrep Registry packs — never --config=auto, never any Semgrep Pro Engine / Pro rules. Default pack list:

p/owasp-top-ten p/security-audit p/cwe-top-25 p/secrets p/jwt
p/django p/flask p/express p/nodejs p/golang p/r2c-security-audit

Override per-deployment with the PENCHEFF_SEMGREP_PACKS env var (comma-separated). The runner script lives at bench/runners/semgrep.sh. License: LGPL-2.1 (subprocess-only).
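A minimal sketch of how the pack override could be resolved into a Semgrep command line. The function names are hypothetical and this is not the contents of bench/runners/semgrep.sh; it only assumes the standard Semgrep --config and --json flags and the comma-separated env var described above.

```python
import os

# Default allowlist, copied from the pack list above.
DEFAULT_PACKS = [
    "p/owasp-top-ten", "p/security-audit", "p/cwe-top-25", "p/secrets", "p/jwt",
    "p/django", "p/flask", "p/express", "p/nodejs", "p/golang",
    "p/r2c-security-audit",
]


def resolve_packs(env=None):
    """Return the pack list, honouring PENCHEFF_SEMGREP_PACKS when it is set."""
    env = os.environ if env is None else env
    raw = env.get("PENCHEFF_SEMGREP_PACKS", "")
    packs = [p.strip() for p in raw.split(",") if p.strip()]
    return packs or list(DEFAULT_PACKS)


def semgrep_argv(repo_dir, env=None):
    """Build the argv with one --config per pack; 'auto' is refused outright."""
    argv = ["semgrep", "scan", "--json"]
    for pack in resolve_packs(env):
        if pack == "auto":
            raise ValueError("--config=auto is not allowed")
        argv += ["--config", pack]
    argv.append(str(repo_dir))
    return argv
```

Passing the packs as separate --config arguments keeps the allowlist explicit in the process arguments, which makes it easy to audit what a given scan actually ran.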

Severity maps via the existing _canonical_severity helper — ERROR/WARNING/INFO collapse to our five-level scale.

Bandit — Python SAST

Apache-2.0; runs bandit -r <repo> skipping B101 (assert in tests). Captures CWE ids when Bandit emits them.

gosec — Go SAST

Apache-2.0; only fires when the staged tree contains .go files outside vendor/. Reports CWE id + confidence on every issue.
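The gating check can be sketched as below. The helper name is hypothetical, and treating any path component named vendor as excluded (rather than only a top-level vendor/) is an assumption.

```python
from pathlib import Path


def has_go_sources(staged_root):
    """True when at least one .go file exists outside any vendor/ directory."""
    root = Path(staged_root)
    for path in root.rglob("*.go"):
        if not path.is_file():
            continue
        # Inspect only the directory components, not the file name itself.
        if "vendor" in path.relative_to(root).parts[:-1]:
            continue
        return True
    return False
```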

Brakeman — Ruby on Rails SAST

MIT; auto-skips when the tree isn't a Rails app (no app/ + config/ directories). Confidence levels collapse to severity: high → high, medium → medium, weak → low.
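A short sketch of the Rails detection and the confidence mapping described above. The names are hypothetical, and falling back to low for an unrecognised confidence label is an assumption.

```python
from pathlib import Path

# Brakeman confidence label -> severity, per the mapping described above.
CONFIDENCE_TO_SEVERITY = {"high": "high", "medium": "medium", "weak": "low"}


def looks_like_rails(staged_root):
    """A tree counts as a Rails app only when both app/ and config/ exist."""
    root = Path(staged_root)
    return (root / "app").is_dir() and (root / "config").is_dir()


def brakeman_severity(confidence):
    """Map a Brakeman confidence label (case-insensitive) to a severity."""
    # Assumption: unknown labels are treated as the lowest mapped severity.
    return CONFIDENCE_TO_SEVERITY.get(confidence.strip().lower(), "low")
```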

ESLint + eslint-plugin-security — JS / TS SAST

Both MIT. Invoked via npx --no-install eslint against a pinned flat config at bench/runners/eslint_security.config.cjs; it ignores any .eslintrc in the target repo so the security ruleset is identical on every scan. Only security/* rule hits surface as findings.
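The "only security/* rule hits" filter might look like the sketch below. It assumes ESLint was run with its JSON formatter, whose output is a list of per-file results carrying filePath and messages; the function name and the output keys are illustrative rather than the actual RepoFinding schema.

```python
import json


def security_hits(eslint_json):
    """Keep only messages whose ruleId starts with 'security/'."""
    hits = []
    for file_result in json.loads(eslint_json):
        for msg in file_result.get("messages", []):
            # ruleId is null for parse errors, so guard before startswith().
            rule_id = msg.get("ruleId") or ""
            if not rule_id.startswith("security/"):
                continue
            hits.append({
                "file_path": file_result["filePath"],
                "rule_id": rule_id,
                "line_start": msg.get("line"),
                "line_end": msg.get("endLine", msg.get("line")),
                "message": msg.get("message", ""),
            })
    return hits
```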

Tree-sitter pack — niche-language SAST

Phase 2.3: per-language sub-packs under plugins/pencheff/pencheff/modules/sast/treesitter_pack/ cover languages that Semgrep OSS / Bandit / gosec / Brakeman / ESLint don't reach cleanly. Solidity ships at v0.7 (4 hand-curated rules: tx.origin auth, weak-randomness, deprecated selfdestruct, unchecked low-level calls). The Lua, Scala, Dart, Kotlin, Swift, COBOL, and Erlang sub-packs are scaffold-ready: drop a queries.scm + rules.json pair into a sibling directory. Each sub-pack is gracefully skipped when the language grammar isn't installed.
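As an illustration of the "drop a pair into a sibling directory" convention, sub-pack discovery could be sketched as follows. The function is hypothetical and assumes each directory name is the language name; the grammar-availability check mentioned above is not modelled here.

```python
from pathlib import Path


def discover_subpacks(pack_root):
    """Return {language: (queries_path, rules_path)} for complete sub-packs."""
    found = {}
    for child in sorted(Path(pack_root).iterdir()):
        queries = child / "queries.scm"
        rules = child / "rules.json"
        # A sub-pack only counts when both files of the pair are present.
        if child.is_dir() and queries.is_file() and rules.is_file():
            found[child.name] = (queries, rules)
    return found
```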

GHSA Advisory DB — SCA

Dependency-vulnerability scan against the GitHub Advisory Database, sourced via osv-scanner (which mirrors GHSA along with PyPA, RustSec, Go Vulndb, and several other ecosystem feeds).

Walks every manifest the engine recognises:

  • package-lock.json, yarn.lock, pnpm-lock.yaml
  • requirements.txt, Pipfile.lock, poetry.lock
  • Gemfile.lock, Cargo.lock, composer.lock
  • go.sum, pom.xml, build.gradle

Findings include package, installed_version, fixed_version, and the GHSA-prefixed alias as rule_id when present (otherwise the OSV ID). CVE aliases populate the cve field. Severity maps from the CVSS v3 score: 9+ critical, 7+ high, 4+ medium, else low.
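The CVSS thresholds above translate directly into a small function. The name is hypothetical, and mapping a missing score to low is an assumption that follows from the "else low" rule.

```python
def severity_from_cvss(score):
    """Map a CVSS v3 base score to a severity using the thresholds above."""
    if score is None:
        # Assumption: advisories without a score fall through to "low".
        return "low"
    if score >= 9.0:
        return "critical"
    if score >= 7.0:
        return "high"
    if score >= 4.0:
        return "medium"
    return "low"
```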

For App-installed repos, Dependabot push webhooks deliver alerts straight into the same bucket — they merge with the on-disk scan.

gitleaks — secrets

Scans the working tree for credential patterns: AWS keys, GCP service accounts, Slack tokens, private SSH keys, generic high-entropy strings. Every match is high severity — the right call is almost always to revoke and rotate.

YARA — malware / backdoor patterns

Runs the YARA engine against every file using Pencheff's bundled rule pack at bench/rules/yara/. Targets that actually appear in real source trees:

  • Minimal PHP webshells (eval($_GET[…]) families)
  • Obfuscated JS loaders (eval(atob(…)), Function(decodeURIComponent(…)))
  • Crypto-miner pool configs (stratum+tcp://, xmrig)
  • Python pickle RCE gadgets
  • Classic reverse-shell oneliners

Drop your own *.yar files into bench/rules/yara/ to extend the pack without touching Pencheff code.

Trivy IaC — infrastructure misconfigurations

Runs trivy config over the staged repo. Picks up Terraform, CloudFormation, Helm charts, Kubernetes manifests, and Dockerfiles without configuration. Includes CIS benchmarks and AWS / Azure / GCP provider-specific rules.

Checkov — policy-as-code

1,000+ policy-as-code rules across the same IaC surface as Trivy plus ARM, Bicep, Serverless, OpenAPI. Useful complement when an organisation cares about specific compliance frameworks (Trivy is broader, Checkov is opinionated).

Filtering — what gets scanned

Before any scanner runs, the repo is staged into a clean directory using hardlinks (cheap, no byte copy on the same filesystem). Staging respects:

  • .gitignore (root and nested)
  • A default-deny list: .git, .env*, node_modules, .venv, build / dist directories, __pycache__, …

stats.filter on each RepoScan records included / excluded counts and the method (git ls-files if available, fallback walk).
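A simplified sketch of the fallback-walk staging path: it applies a default-deny list and hardlinks the surviving files, copying instead when hardlinking fails (for example across filesystems). The names are hypothetical, the deny list includes only the entries named above, and .gitignore handling and the git ls-files path are deliberately left out. It assumes the destination lies outside the source tree.

```python
import fnmatch
import os
import shutil
from pathlib import Path

# Subset of the default-deny list named above; the real list is longer.
DENY = [".git", ".env*", "node_modules", ".venv", "build", "dist", "__pycache__"]


def _denied(name):
    return any(fnmatch.fnmatch(name, pattern) for pattern in DENY)


def stage_repo(src, dst):
    """Hardlink allowed files from src into dst and return filter stats."""
    src, dst = Path(src), Path(dst)
    stats = {"included": 0, "excluded": 0, "method": "walk"}
    for dirpath, dirnames, filenames in os.walk(src):
        # Prune denied directories in place so os.walk never descends into them.
        kept = [d for d in dirnames if not _denied(d)]
        stats["excluded"] += len(dirnames) - len(kept)  # one per pruned directory
        dirnames[:] = kept
        for name in filenames:
            if _denied(name):
                stats["excluded"] += 1
                continue
            source = Path(dirpath) / name
            target = dst / source.relative_to(src)
            target.parent.mkdir(parents=True, exist_ok=True)
            try:
                os.link(source, target)  # no byte copy on the same filesystem
            except OSError:
                shutil.copy2(source, target)  # fallback when links are unavailable
            stats["included"] += 1
    return stats
```

Note that in this sketch a pruned directory counts as a single exclusion regardless of how many files it contains.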

From the Pencheff docs

GitHub Check Run + SARIF + Pencheff Suggest

/features/github-check-runs

When the Pencheff GitHub App is installed on a repo, every PR scan posts a Pencheff Check Run on the head commit with inline annotations on the diff, and uploads a SARIF v2.1.0 document to Security → Code scanning.

A separate bot, Pencheff Suggest, reads PR comments and acts on pencheff: suppress … directives so reviewers can mark findings as noise without leaving GitHub.

(The bot name is provisional pending the Phase 0.6 trademark search; final name TBD.)

Check Run surface

Layer | What you see
Per-commit check | Pencheff check appears alongside lint / test / build on every PR. Conclusion is success when no critical/high; failure otherwise.
Inline annotations | One annotation per finding, anchored at (file_path, line_start..line_end) with severity → failure / warning / notice. GitHub caps at 50 per Check-Run POST; Pencheff pages remaining annotations via PATCH.
Summary | Count strip (N critical · N high · N medium · N low · N info) rendered in the check's output.summary.
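The three layers can be sketched in a few lines. This is an illustration, not the code in github_check_runs.py: the function names and the title field are hypothetical, and the exact severity-to-level mapping (critical/high → failure, medium → warning, low/info → notice) is an assumption, since the text gives only the three target levels.

```python
from collections import Counter

MAX_ANNOTATIONS_PER_REQUEST = 50  # GitHub's per-request annotation cap
SEVERITIES = ["critical", "high", "medium", "low", "info"]
# Assumed mapping onto GitHub's three annotation levels.
LEVEL = {"critical": "failure", "high": "failure", "medium": "warning",
         "low": "notice", "info": "notice"}


def conclusion(findings):
    """'failure' if any critical/high finding exists, else 'success'."""
    blocking = any(f["severity"] in ("critical", "high") for f in findings)
    return "failure" if blocking else "success"


def summary(findings):
    """Render the count strip for the check's output.summary."""
    counts = Counter(f["severity"] for f in findings)
    return " · ".join(f"{counts[s]} {s}" for s in SEVERITIES)


def annotation_batches(findings):
    """Build annotations and split them into batches of at most 50."""
    annotations = [{
        "path": f["file_path"],
        "start_line": f["line_start"],
        "end_line": f["line_end"],
        "annotation_level": LEVEL[f["severity"]],
        "message": f["title"],
    } for f in findings]
    step = MAX_ANNOTATIONS_PER_REQUEST
    return [annotations[i:i + step] for i in range(0, len(annotations), step)]
```

Under this sketch the first batch would go out with the Check Run POST and each later batch with a follow-up PATCH.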

SARIF upload

A separate path uploads the same findings as a SARIF v2.1.0 document to GitHub's Code Scanning ingest endpoint. The findings then show up under the repo's Security → Code scanning alerts tab and inherit the standard GitHub triage UI (dismiss, mark resolved, alert routing).

POST /repos/{owner}/{repo}/code-scanning/sarifs
Authorization: Bearer <installation-token>
Content-Type: application/json

{ "commit_sha": "...", "ref": "refs/heads/main",
  "sarif": "<base64-gzip>", "tool_name": "Pencheff",
  "checkout_uri": "https://github.com/owner/repo" }

The Pencheff GitHub App requires the security_events permission (write) for SARIF upload. Customers using the PAT path need a token scoped to security_events.
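The sarif field in the request above carries the SARIF document gzip-compressed and then Base64-encoded. A minimal sketch of building that body with the standard library (function names are hypothetical; the HTTP call itself is omitted):

```python
import base64
import gzip
import json


def encode_sarif(sarif_doc):
    """Serialise, gzip, then Base64-encode a SARIF document."""
    raw = json.dumps(sarif_doc).encode("utf-8")
    return base64.b64encode(gzip.compress(raw)).decode("ascii")


def build_upload_body(sarif_doc, commit_sha, ref, checkout_uri):
    """Assemble the JSON body with the fields shown in the request above."""
    return {
        "commit_sha": commit_sha,
        "ref": ref,
        "sarif": encode_sarif(sarif_doc),
        "tool_name": "Pencheff",
        "checkout_uri": checkout_uri,
    }
```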

Pencheff Suggest — PR-comment suppression

Reviewers can suppress a finding directly from a PR comment:

Looks fine to me — running on staging only.

pencheff: suppress 47bf3c92 reason="accepted_risk" notes="staging-only test fixture"

The bot parses the directive, validates the reason against the allowlist, and calls POST /findings/{id}/suppress on your behalf. Valid reasons: accepted_risk, wont_fix, false_positive, duplicate, out_of_scope. Anything else is rejected silently — there's no way to inject a custom reason via the comment surface.
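A sketch of the parse-and-validate step, based only on the example directive above. The regular expression encodes assumptions: the finding id is lowercase hex, the directive sits on its own line, and notes is optional. Returning None models the silent rejection.

```python
import re

VALID_REASONS = {"accepted_risk", "wont_fix", "false_positive",
                 "duplicate", "out_of_scope"}

# Assumed grammar, inferred from the single example directive.
DIRECTIVE = re.compile(
    r'^\s*pencheff:\s*suppress\s+(?P<id>[0-9a-f]+)'
    r'\s+reason="(?P<reason>[^"]*)"'
    r'(?:\s+notes="(?P<notes>[^"]*)")?\s*$',
    re.MULTILINE,
)


def parse_suppress(comment):
    """Return the parsed directive, or None when absent or invalid."""
    match = DIRECTIVE.search(comment)
    if match is None or match.group("reason") not in VALID_REASONS:
        return None  # rejected silently: no custom reasons get through
    return {
        "finding_id": match.group("id"),
        "reason": match.group("reason"),
        "notes": match.group("notes") or "",
    }
```

Checking the reason against a fixed set before any API call is what keeps free-form text from the comment out of the suppression record's reason field.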

How to enable

  1. Install the Pencheff GitHub App on the org or specific repos (see Connect a repo).
  2. Grant the Checks permission (write) and security_events permission (write) when the app installer prompts.
  3. The next push or PR triggers an automatic Check Run + SARIF upload alongside the existing scan.

For repos connected via PAT (no GitHub App), the Check Run / SARIF features require a PAT with security_events write — the standard fine-grained PAT path doesn't expose this scope, so most PAT-only deployments use the unified findings stream + DOCX report instead.

Source

apps/api/pencheff_api/services/github_check_runs.py

FAQ

Common questions

What is SAST and why does it matter?
SAST (Static Application Security Testing) analyses source code, bytecode, or binaries without executing the application. It finds injection flaws, hardcoded secrets, insecure library use, and logic errors earlier in the development cycle than DAST.
Which programming languages does Pencheff SAST support?
Pencheff runs Semgrep OSS (multi-language), Bandit (Python), gosec (Go), Brakeman (Ruby on Rails), ESLint security rules (JavaScript/TypeScript), and a tree-sitter pack for niche languages, with Solidity shipping at v0.7. CodeQL was removed in v0.7.
How does Pencheff find hardcoded secrets in code?
Pencheff runs gitleaks over the staged working tree, detecting AWS keys, GCP service accounts, Slack tokens, private SSH keys, and generic high-entropy strings. YARA rules additionally flag malware patterns and backdoors.
Does SAST replace DAST, or do they complement each other?
They complement each other. SAST finds flaws in code that may not be reachable at runtime, while DAST finds runtime vulnerabilities that may not be apparent from reading the source. Pencheff combines both into a unified findings stream with de-duplication.
