Run web, API, code, dependency, cloud, AI, and internal-network assessments from one queue with unified findings, evidence, remediation, and audit output.
Code security
SAST and secrets
Semgrep, Bandit, gosec, Brakeman, ESLint security, tree-sitter rules, and gitleaks.
Findings, reports, dashboards, exports, integrations, and retests all read from the same normalized record.
Pencheff favors repeatable checks, then uses AI for triage, enrichment, orchestration, and remediation where it adds signal.
From the Pencheff docs
Scanners
/repos/scanners
Each repo scan fans out to several scanners. Every match is normalised
into a shared RepoFinding row so the UI and the API don't care which
engine produced it.
CodeQL was removed in v0.7: the CodeQL CLI is not licensed for commercial use on third-party code, and Pencheff scans customer code. The SAST role is now filled by the five open-source tools listed below (Semgrep OSS, Bandit, gosec, Brakeman, ESLint), all run as subprocesses (no static linking).
Semgrep OSS — multi-language SAST
Pinned to an explicit allowlist of OSS Semgrep Registry packs — never
--config=auto, never any Semgrep Pro Engine / Pro rules. Default
pack list:
p/owasp-top-ten p/security-audit p/cwe-top-25 p/secrets p/jwt
p/django p/flask p/express p/nodejs p/golang p/r2c-security-audit
Override per-deployment with the PENCHEFF_SEMGREP_PACKS env var
(comma-separated). The runner script lives at
bench/runners/semgrep.sh. License: LGPL-2.1 (subprocess-only).
Severity maps via the existing _canonical_severity helper —
ERROR/WARNING/INFO collapse to our five-level scale.
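As a rough illustration of the allowlist and override described above, the sketch below resolves the pack list from PENCHEFF_SEMGREP_PACKS and builds a Semgrep command line with one --config per pack. The function names are hypothetical, and rejecting a literal `auto` entry is an assumption; the real runner is the shell script at bench/runners/semgrep.sh, which this does not reproduce.

```python
import os

# Default allowlist, copied from the pack list above.
DEFAULT_PACKS = [
    "p/owasp-top-ten", "p/security-audit", "p/cwe-top-25", "p/secrets", "p/jwt",
    "p/django", "p/flask", "p/express", "p/nodejs", "p/golang",
    "p/r2c-security-audit",
]


def resolve_packs(env=None):
    """Return the override from PENCHEFF_SEMGREP_PACKS, else the defaults."""
    env = os.environ if env is None else env
    raw = env.get("PENCHEFF_SEMGREP_PACKS", "")
    packs = [p.strip() for p in raw.split(",") if p.strip()]
    return packs or list(DEFAULT_PACKS)


def semgrep_argv(repo_dir, env=None):
    """Build the argv: one explicit --config per pack, never 'auto'."""
    argv = ["semgrep", "scan", "--json"]
    for pack in resolve_packs(env):
        if pack == "auto":  # assumption: an 'auto' override is refused outright
            raise ValueError("--config=auto is not allowed")
        argv += ["--config", pack]
    argv.append(repo_dir)
    return argv
```

Passing the environment as a plain mapping keeps the helper easy to exercise without touching the process environment.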
Bandit — Python SAST
Apache-2.0; runs bandit -r <repo> skipping B101 (assert in tests).
Captures CWE ids when Bandit emits them.
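A minimal sketch of that invocation and of CWE capture, assuming Bandit's JSON formatter and its `issue_cwe` object (present in recent Bandit releases). The helper names are hypothetical and are not Pencheff's own.

```python
def bandit_argv(repo_dir):
    """Recursive scan, skipping B101, with JSON output for parsing."""
    return ["bandit", "-r", repo_dir, "--skip", "B101", "--format", "json"]


def cwe_of(issue):
    """Return 'CWE-<id>' when Bandit emitted a CWE for this issue, else None."""
    cwe = issue.get("issue_cwe") or {}
    return f"CWE-{cwe['id']}" if "id" in cwe else None
```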
gosec — Go SAST
Apache-2.0; only fires when the staged tree contains .go files
outside vendor/. Reports CWE id + confidence on every issue.
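The gating rule can be sketched as a small check over the staged tree. This is an illustration only: `has_go_sources` is a hypothetical name, and treating any path component named `vendor` (not just a top-level one) as vendored is an assumption.

```python
from pathlib import Path


def has_go_sources(staged_root):
    """True when at least one .go file lives outside a vendor/ directory."""
    root = Path(staged_root)
    for path in root.rglob("*.go"):
        parent_parts = path.relative_to(root).parts[:-1]
        if path.is_file() and "vendor" not in parent_parts:
            return True
    return False
```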
Brakeman — Ruby on Rails SAST
MIT; auto-skips when the tree isn't a Rails app (no app/ + config/
directories). Confidence levels collapse to severity:
high→high, medium→medium, weak→low.
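The skip rule and the confidence mapping are simple enough to sketch directly. The names below are hypothetical, and falling back to low for an unrecognised confidence value is an assumption rather than documented behaviour.

```python
from pathlib import Path

# Mapping stated above: high -> high, medium -> medium, weak -> low.
CONFIDENCE_TO_SEVERITY = {"high": "high", "medium": "medium", "weak": "low"}


def looks_like_rails(root):
    """A tree counts as a Rails app only if both app/ and config/ exist."""
    root = Path(root)
    return (root / "app").is_dir() and (root / "config").is_dir()


def brakeman_severity(confidence):
    """Collapse a Brakeman confidence label to a severity (assumed default: low)."""
    return CONFIDENCE_TO_SEVERITY.get(confidence.strip().lower(), "low")
```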
ESLint + eslint-plugin-security — JS / TS SAST
Both MIT. Invoked via npx --no-install eslint against a pinned flat
config at bench/runners/eslint_security.config.cjs — ignores any
.eslintrc in the target repo so the security ruleset is identical
on every scan. Only security/* rule hits surface as findings.
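Filtering to security/* hits might look like the sketch below, which assumes ESLint's standard JSON formatter output (a list of per-file results, each with a `messages` array). The output field names `file_path`, `line_start`, and `rule_id` are illustrative, not the confirmed RepoFinding schema.

```python
def security_hits(eslint_results):
    """Keep only messages whose ruleId starts with 'security/'."""
    hits = []
    for file_result in eslint_results:
        for msg in file_result.get("messages", []):
            rule = msg.get("ruleId") or ""  # ruleId is null for parse errors
            if rule.startswith("security/"):
                hits.append({
                    "file_path": file_result["filePath"],
                    "line_start": msg.get("line"),
                    "rule_id": rule,
                    "message": msg.get("message", ""),
                })
    return hits
```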
Tree-sitter pack — niche-language SAST
Phase 2.3 — per-language sub-packs under
plugins/pencheff/pencheff/modules/sast/treesitter_pack/ cover
languages that Semgrep OSS / Bandit / gosec / Brakeman / ESLint don't
reach cleanly. Solidity ships at v0.7 (4 hand-curated rules:
tx.origin auth, weak-randomness, deprecated selfdestruct,
unchecked low-level calls). Lua, Scala, Dart, Kotlin, Swift, COBOL,
Erlang sub-packs scaffold-ready — drop a queries.scm + rules.json
pair into a sibling directory. Each sub-pack is gracefully skipped
when the language grammar isn't installed.
GHSA Advisory DB — SCA
Dependency-vulnerability scan against the GitHub Advisory Database,
sourced via osv-scanner
(which mirrors GHSA along with PyPA, RustSec, Go Vulndb, and several
other ecosystem feeds).
Walks every manifest the engine recognises:
- package-lock.json, yarn.lock, pnpm-lock.yaml
- requirements.txt, Pipfile.lock, poetry.lock
- Gemfile.lock, Cargo.lock, composer.lock
- go.sum, pom.xml, build.gradle
Findings include package, installed_version, fixed_version, and
the GHSA-prefixed alias as rule_id when present (otherwise the OSV
ID). CVE aliases populate the cve field. Severity maps from the CVSS
v3 score: 9+ critical, 7+ high, 4+ medium, else low.
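The thresholds and the rule_id choice above can be written out as a short sketch. The function names are hypothetical, and taking only the first CVE alias for the cve field is an assumption (the docs do not say whether the field holds one id or several).

```python
def severity_from_cvss(score):
    """Map a CVSS v3 base score to severity: 9+ critical, 7+ high, 4+ medium."""
    if score >= 9.0:
        return "critical"
    if score >= 7.0:
        return "high"
    if score >= 4.0:
        return "medium"
    return "low"


def pick_ids(osv_id, aliases):
    """Prefer a GHSA-prefixed id as rule_id, else fall back to the OSV id."""
    ids = [osv_id, *aliases]
    ghsa = next((i for i in ids if i.startswith("GHSA-")), None)
    cve = next((i for i in ids if i.startswith("CVE-")), None)
    return {"rule_id": ghsa or osv_id, "cve": cve}
```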
For App-installed repos, Dependabot push webhooks deliver alerts straight into the same bucket — they merge with the on-disk scan.
gitleaks — secrets
Scans the working tree for credential patterns: AWS keys, GCP service accounts, Slack tokens, private SSH keys, generic high-entropy strings. Every match is high severity — the right call is almost always to revoke and rotate.
YARA — malware / backdoor patterns
Runs the YARA engine against every file using Pencheff's bundled rule
pack at bench/rules/yara/. Targets that actually appear in real
source trees:
- Minimal PHP webshells (eval($_GET[…]) families)
- Obfuscated JS loaders (eval(atob(…)), Function(decodeURIComponent(…)))
- Crypto-miner pool configs (stratum+tcp://, xmrig)
- Python pickle RCE gadgets
- Classic reverse-shell oneliners
Drop your own *.yar files into bench/rules/yara/ to extend the pack
without touching Pencheff code.
Trivy IaC — infrastructure misconfigurations
Runs trivy config over the staged repo. Picks up Terraform,
CloudFormation, Helm charts, Kubernetes manifests, and Dockerfiles
without configuration. Includes CIS benchmarks and AWS / Azure / GCP
provider-specific rules.
Checkov — policy-as-code
1,000+ policy-as-code rules across the same IaC surface as Trivy plus ARM, Bicep, Serverless, OpenAPI. Useful complement when an organisation cares about specific compliance frameworks (Trivy is broader, Checkov is opinionated).
Filtering — what gets scanned
Before any scanner runs, the repo is staged into a clean directory using hardlinks (cheap, no byte copy on the same filesystem). Staging respects:
- .gitignore (root and nested)
- A default-deny list: .git, .env*, node_modules, .venv, build / dist directories, __pycache__, …
stats.filter on each RepoScan records included / excluded
counts and the method (git ls-files if available, fallback walk).
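To make the staging step concrete, here is a simplified sketch of the fallback walk: it hardlinks files into a clean directory and applies a deny list. It deliberately omits .gitignore handling and the git ls-files path, the deny list is only the subset named above, and the stats keys and function names are hypothetical.

```python
import fnmatch
import os
from pathlib import Path

# Subset of the default-deny list named above; the real list is longer.
DENY = [".git", ".env*", "node_modules", ".venv", "build", "dist", "__pycache__"]


def is_denied(name):
    """True when a file or directory name matches a deny pattern."""
    return any(fnmatch.fnmatch(name, pattern) for pattern in DENY)


def stage(src, dst):
    """Hardlink allowed files from src into dst; return include/exclude counts."""
    src, dst = Path(src), Path(dst)
    stats = {"included": 0, "excluded": 0, "method": "walk"}
    for dirpath, dirnames, filenames in os.walk(src):
        kept = []
        for d in dirnames:
            if is_denied(d):
                stats["excluded"] += 1  # a pruned directory counts once
            else:
                kept.append(d)
        dirnames[:] = kept  # prune in place so os.walk does not descend
        for name in filenames:
            if is_denied(name):
                stats["excluded"] += 1
                continue
            source = Path(dirpath) / name
            target = dst / source.relative_to(src)
            target.parent.mkdir(parents=True, exist_ok=True)
            os.link(source, target)  # hardlink: no byte copy on one filesystem
            stats["included"] += 1
    return stats
```

The destination must sit outside the source tree and on the same filesystem, since hardlinks cannot cross filesystem boundaries.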
From the Pencheff docs
GitHub Check Run + SARIF + Pencheff Suggest
/features/github-check-runs
When the Pencheff GitHub App is installed on a repo, every PR scan posts a Pencheff Check Run on the head commit with inline annotations on the diff, and uploads a SARIF v2.1.0 document to Security → Code scanning.
A separate bot — Pencheff Suggest — reads PR comments and
acts on pencheff: suppress … directives so reviewers can mark
findings noise without leaving GitHub.
(The bot name is provisional pending the Phase 0.6 trademark search; final name TBD.)
Check Run surface
| Layer | What you see |
|---|---|
| Per-commit check | Pencheff check appears alongside lint / test / build on every PR. Conclusion is success when no critical/high; failure otherwise. |
| Inline annotations | One annotation per finding, anchored at (file_path, line_start..line_end) with severity → failure / warning / notice. GitHub caps at 50 per Check-Run POST; Pencheff pages remaining annotations via PATCH. |
| Summary | Count strip — N critical · N high · N medium · N low · N info — rendered in the check's output.summary. |
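The three layers in the table can be sketched as plain data transformations. The exact severity-to-level mapping is an assumption (critical and high to failure, medium to warning, low and info to notice), and the finding keys `severity`, `file_path`, `line_start`, `line_end`, and `title` are illustrative. The annotation fields themselves (`path`, `start_line`, `end_line`, `annotation_level`, `message`) are the ones GitHub's Checks API expects.

```python
from collections import Counter

SEVERITIES = ["critical", "high", "medium", "low", "info"]
# Assumed mapping; the table only says severity maps to these three levels.
LEVEL = {"critical": "failure", "high": "failure", "medium": "warning",
         "low": "notice", "info": "notice"}


def conclusion(findings):
    """failure when any critical/high finding exists, success otherwise."""
    blocking = any(f["severity"] in ("critical", "high") for f in findings)
    return "failure" if blocking else "success"


def annotation(finding):
    """Shape one finding as a Checks API annotation object."""
    return {
        "path": finding["file_path"],
        "start_line": finding["line_start"],
        "end_line": finding["line_end"],
        "annotation_level": LEVEL[finding["severity"]],
        "message": finding["title"],
    }


def batches(annotations, size=50):
    """Split annotations into chunks of at most 50, one per API request."""
    return [annotations[i:i + size] for i in range(0, len(annotations), size)]


def summary(findings):
    """Render the count strip used in output.summary."""
    counts = Counter(f["severity"] for f in findings)
    return " · ".join(f"{counts[s]} {s}" for s in SEVERITIES)
```

In this sketch the first batch would go out with the Check Run POST and each later batch with a follow-up PATCH.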
SARIF upload
A separate path uploads the same findings as a SARIF v2.1.0 document to GitHub's Code Scanning ingest endpoint. The findings then show up under the repo's Security → Code scanning alerts tab and inherit the standard GitHub triage UI (dismiss, mark resolved, alert routing).
POST /repos/{owner}/{repo}/code-scanning/sarifs
Authorization: Bearer <installation-token>
Content-Type: application/json
{ "commit_sha": "...", "ref": "refs/heads/main",
"sarif": "<base64-gzip>", "tool_name": "Pencheff",
"checkout_uri": "https://github.com/owner/repo" }
The Pencheff GitHub App requires the security_events permission
(write) for SARIF upload. Customers using the PAT path need a token
scoped to security_events.
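The `sarif` field in the request above is the SARIF JSON gzip-compressed and then Base64-encoded. A minimal sketch of building that body follows; the function names are hypothetical, and the HTTP call itself is left out.

```python
import base64
import gzip
import json


def encode_sarif(sarif_doc):
    """Serialise, gzip, then Base64-encode a SARIF document."""
    raw = json.dumps(sarif_doc).encode("utf-8")
    return base64.b64encode(gzip.compress(raw)).decode("ascii")


def upload_body(sarif_doc, commit_sha, ref, checkout_uri):
    """Build the JSON body for the code-scanning SARIF upload request."""
    return {
        "commit_sha": commit_sha,
        "ref": ref,
        "sarif": encode_sarif(sarif_doc),
        "tool_name": "Pencheff",
        "checkout_uri": checkout_uri,
    }
```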
Pencheff Suggest — PR-comment suppression
Reviewers can suppress a finding directly from a PR comment:
Looks fine to me — running on staging only.
pencheff: suppress 47bf3c92 reason="accepted_risk" notes="staging-only test fixture"
The bot parses the directive, validates the reason against the
allowlist, and calls
POST /findings/{id}/suppress
on your behalf. Valid reasons: accepted_risk, wont_fix,
false_positive, duplicate, out_of_scope. Anything else is
rejected silently — there's no way to inject a custom reason via
the comment surface.
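Directive parsing along these lines could be sketched with a single regular expression. The grammar here is inferred from the one example above, so treat it as an assumption: a hexadecimal finding id, a required quoted reason, and an optional quoted notes field. `parse_suppress` is a hypothetical name.

```python
import re

ALLOWED_REASONS = {"accepted_risk", "wont_fix", "false_positive",
                   "duplicate", "out_of_scope"}

# Assumed grammar, inferred from the single example in the docs.
DIRECTIVE = re.compile(
    r'^\s*pencheff:\s*suppress\s+(?P<id>[0-9a-fA-F]+)'
    r'\s+reason="(?P<reason>[^"]*)"'
    r'(?:\s+notes="(?P<notes>[^"]*)")?\s*$',
    re.MULTILINE,
)


def parse_suppress(comment):
    """Return the parsed directive, or None when absent or reason not allowed."""
    match = DIRECTIVE.search(comment)
    if match is None or match.group("reason") not in ALLOWED_REASONS:
        return None  # rejected silently, as described above
    return {
        "finding_id": match.group("id"),
        "reason": match.group("reason"),
        "notes": match.group("notes") or "",
    }
```

Checking the reason against a fixed set, rather than passing it through, is what keeps custom reasons from being injected via a comment.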
How to enable
- Install the Pencheff GitHub App on the org or specific repos (see Connect a repo).
- Grant the Checks permission (write) and security_events permission (write) when the app installer prompts.
- The next push or PR triggers an automatic Check Run + SARIF upload alongside the existing scan.
For repos connected via PAT (no GitHub App), the Check Run / SARIF
features require a PAT with security_events write — the standard
fine-grained PAT path doesn't expose this scope, so most PAT-only
deployments use the unified findings stream + DOCX report instead.
FAQ
Common questions
- What is SAST and why does it matter?
- SAST (Static Application Security Testing) analyses source code, bytecode, or binaries without executing the application. It finds injection flaws, hardcoded secrets, insecure library use, and logic errors earlier in the development cycle than DAST.
- Which programming languages does Pencheff SAST support?
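- Pencheff runs Semgrep OSS (multi-language), Bandit (Python), gosec (Go), Brakeman (Ruby on Rails), ESLint security rules (JavaScript/TypeScript), and a tree-sitter pack for languages those tools don't reach cleanly. Solidity ships in the tree-sitter pack at v0.7; Lua, Scala, Dart, Kotlin, Swift, COBOL, and Erlang sub-packs are scaffold-ready. CodeQL was removed in v0.7.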
- How does Pencheff find hardcoded secrets in code?
- Pencheff runs gitleaks over the full git history and working tree, detecting API keys, tokens, passwords, and private keys across all commits — not just the current HEAD. YARA rules additionally flag malware patterns and backdoors.
- Does SAST replace DAST, or do they complement each other?
- They complement each other. SAST finds flaws in code that may not be reachable at runtime, while DAST finds runtime vulnerabilities that may not be apparent from reading the source. Pencheff combines both into a unified findings stream with de-duplication.