39 changes: 30 additions & 9 deletions vulnfeeds/cmd/combine-to-osv/README.md
@@ -1,30 +1,51 @@
# combine-to-osv

## What

Combine [`PackageInfo`](https://github.com/google/osv.dev/blob/2c22e9534a521c6c6350275427f80e481065ca39/vulnfeeds/vulns/vulns.go#L165-L171) file fragments into a single OSV record.

This tool combines CVEs from CVE5 and NVD security advisories of the same ID into a single OSV record.
## Why

To address the generation of CVE records from multiple disparate sources (all requiring a common record prefix):

* Alpine, by [this code](../alpine)
* the NVD, by [this code](../converters/cve/nvd-cve-osv)
* CVEList, by [this code](../converters/cve/cve5)
* NVD, by [this code](../converters/cve/nvd-cve-osv)

## How

See [`run_combine_to_osv_convert.sh`](run_combine_to_osv_convert.sh):

* Reads from [`gs://cve-osv-conversion/parts`](https://storage.googleapis.com/cve-osv-conversion/index.html?prefix=parts/)
* Merges with CVE data from NVD (obtained from GCS mirror maintained by [`download-cves`](../mirrors/download-cves/mirror_nvd.sh))
* Reads from [`gs://cve-osv-conversion/nvd-osv`](https://storage.googleapis.com/cve-osv-conversion/index.html?prefix=nvd-osv/) and [`gs://cve-osv-conversion/cve5`](https://storage.googleapis.com/cve-osv-conversion/index.html?prefix=cve5/)
* Writes an OSV record to [`gs://cve-osv-conversion/osv-output`](https://storage.googleapis.com/cve-osv-conversion/index.html?prefix=osv-output/)
* This is the import source for [`cve-osv`](https://github.com/google/osv.dev/blob/2c22e9534a521c6c6350275427f80e481065ca39/source.yaml#L96)
* What gets written can be overridden by OSV records in [`gs://cve-osv-conversion/osv-output-overrides`](https://storage.googleapis.com/cve-osv-conversion/index.html?prefix=osv-output-overrides/)

## Operational matters

* Runs every hour (on the half hour) as a [Kubernetes CronJob](https://github.com/google/osv.dev/blob/master/deployment/clouddeploy/gke-workers/base/combine-to-osv.yaml)

## Usage

```bash
go run main.go [flags]
```

### Flags

- `-cve5-path`: Path to CVE5 OSV files (default: "cve5")
- `-nvd-path`: Path to NVD OSV files (default: "nvd")
- `-osv-output-path`: Local output path of combined OSV files, or GCS prefix if uploading (default: "osv-output")
- `-output-bucket`: The GCS bucket to write to (default: "osv-test-cve-osv-conversion")
- `-overrides-bucket`: The GCS bucket to read overrides from (default: "osv-test-cve-osv-conversion")
- `-upload-to-gcs`: If true, upload to GCS bucket instead of writing to local disk (default: false)
- `-workers`: Number of workers to process records (default: 64)
- `-sync-deletions`: If true, delete files in the bucket that do not exist locally (default: false)

## Description

The tool performs the following steps:
1. Loads CVE5 OSVs from the specified path.
2. Loads NVD OSVs from the specified path.
3. Lists Debian and Alpine CVEs from GCS buckets to ensure mandatory CVEs are created.
4. Combines the loaded data into OSV records.
5. Uploads the results to GCS or writes them to the local filesystem.

### Overriding an OSV record

#### Situation
25 changes: 25 additions & 0 deletions vulnfeeds/cmd/converters/alpine/README.md
@@ -0,0 +1,25 @@
# Alpine Converter

This tool converts Alpine security database records to OSV format and uploads them to GCS.

## Usage

```bash
go run main.go [flags]
```

### Flags

- `-output-path`: Path to output general Alpine affected package information (default: "alpine")
- `-output-bucket`: The GCS bucket to write to (default: "osv-test-cve-osv-conversion")
- `-workers`: Number of workers to process records (default: 64)
- `-upload-to-gcs`: If true, upload to GCS bucket instead of writing to local disk (default: false)
- `-sync-deletions`: If true, delete files in the bucket that do not exist locally (default: false)

## Description

The tool performs the following steps:
1. Downloads the Alpine SecDB data from `https://secdb.alpinelinux.org/`.
2. Loads existing NVD CVE data to extract human-readable information such as details and severity.
3. Generates OSV vulnerabilities by mapping Alpine security fixes to CVEs.
4. Uploads the results to GCS or writes them to the local filesystem.
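
The `-workers` flag implies that records are processed concurrently. The following is a minimal sketch of a fixed-size worker pool, not the tool's actual implementation: `processAll` and the `convert` callback are hypothetical names introduced for illustration only.

```go
package main

import (
	"fmt"
	"sort"
	"sync"
)

// processAll fans records out to a fixed number of workers and collects the
// converted results. convert is a hypothetical stand-in for the real
// per-record conversion.
func processAll(records []string, workers int, convert func(string) string) []string {
	if workers < 1 {
		workers = 1 // at least one worker is needed to drain the jobs channel
	}
	jobs := make(chan string)
	results := make(chan string)

	var wg sync.WaitGroup
	for i := 0; i < workers; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for r := range jobs {
				results <- convert(r)
			}
		}()
	}

	// Feed the jobs, then close results once every worker has finished.
	go func() {
		for _, r := range records {
			jobs <- r
		}
		close(jobs)
		wg.Wait()
		close(results)
	}()

	var out []string
	for r := range results {
		out = append(out, r)
	}
	sort.Strings(out) // completion order is nondeterministic, so sort for stable output
	return out
}

func main() {
	ids := []string{"CVE-2024-0002", "CVE-2024-0001", "CVE-2024-0003"} // sample IDs
	out := processAll(ids, 2, func(id string) string { return id + ".json" })
	fmt.Println(out)
}
```

Results are sorted only to make the output deterministic; a real converter would more likely write each record as soon as it is ready.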
47 changes: 47 additions & 0 deletions vulnfeeds/cmd/converters/cve/cve5/README.md
@@ -0,0 +1,47 @@
# CVE5 Converters

In this directory you will find two tools to convert CVEs from the CVEListV5 repository to OSV format. The bulk converter is designed to convert a large number of CVEs in parallel from the CVEListV5 repository, while the single converter is designed to convert a single CVE.

These converters are a continuation of the work described in the blog post [Introducing broad C/C++ vulnerability management support](https://osv.dev/blog/posts/introducing-broad-c-c++-support/).

See [bulk-converter/run_cvelist-converter.sh](https://github.com/google/osv.dev/blob/master/vulnfeeds/cmd/converters/cve/cve5/bulk-converter/run_cvelist-converter.sh) for how this is invoked in production.

## Usage

### Bulk Converter

```bash
go run bulk-converter/main.go [flags]
```

#### Flags

- `-cve5-repo`: CVEListV5 directory path (default: "cvelistV5")
- `-out-dir`: Path to output results (default: "cvelist2osv")
- `-start-year`: The first in-scope year to process (default: "2022")
- `-workers`: The number of concurrent workers to use for processing CVEs (default: 30)
- `-cnas-allowlist`: A comma-separated list of CNAs to process. If not provided, defaults to `cna_allowlist.txt`.

#### Description

The tool performs the following steps:
1. Walks the specified CVEListV5 directory for JSON files starting from the `start-year`.
2. Filters CVEs based on the CNA allowlist and state ("PUBLISHED").
3. Converts valid CVEs to OSV format using `cvelist2osv`.
4. Outputs the OSV records and metrics to the specified output directory.
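
The filtering in step 2 can be sketched in Go as follows. This is an illustration under stated assumptions, not the converter's code: it assumes the allowlist is matched against the `cveMetadata.assignerShortName` field of the CVE JSON 5 record, and `parseAllowlist`, `inScope`, the CNA names, and the sample record are all hypothetical.

```go
package main

import (
	"encoding/json"
	"fmt"
	"strings"
)

// cveMetadata holds the few CVE JSON 5 fields this sketch inspects.
type cveMetadata struct {
	ID       string `json:"cveId"`
	State    string `json:"state"`
	Assigner string `json:"assignerShortName"`
}

type cveRecord struct {
	Metadata cveMetadata `json:"cveMetadata"`
}

// parseAllowlist turns a comma-separated CNA list into a lookup set.
func parseAllowlist(list string) map[string]bool {
	allowed := make(map[string]bool)
	for _, cna := range strings.Split(list, ",") {
		if cna = strings.TrimSpace(cna); cna != "" {
			allowed[cna] = true
		}
	}
	return allowed
}

// inScope reports whether a raw CVE JSON document is published and was
// assigned by an allowlisted CNA (assumed matching rule).
func inScope(raw []byte, allowed map[string]bool) (bool, error) {
	var rec cveRecord
	if err := json.Unmarshal(raw, &rec); err != nil {
		return false, err
	}
	return rec.Metadata.State == "PUBLISHED" && allowed[rec.Metadata.Assigner], nil
}

func main() {
	allowed := parseAllowlist("example-cna-a, example-cna-b") // made-up CNA names
	// Invented sample record, trimmed to the fields used above.
	raw := []byte(`{"cveMetadata":{"cveId":"CVE-2024-99999","state":"PUBLISHED","assignerShortName":"example-cna-b"}}`)
	ok, err := inScope(raw, allowed)
	fmt.Println(ok, err)
}
```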

### Single Converter

```bash
go run single-converter/main.go <path/to/cve.json> [flags]
```

#### Flags
- `-out-dir`: Path to output results (default: "cvelist2osv")

#### Description

The tool performs the following steps:
1. Reads the specified CVE JSON file.
2. Converts the CVE to OSV format using `cvelist2osv`.
3. Outputs the OSV record to the specified output directory.
26 changes: 26 additions & 0 deletions vulnfeeds/cmd/converters/debian/README.md
@@ -0,0 +1,26 @@
# Debian Converter

This tool converts Debian Security Tracker information to OSV format.

## Usage

```bash
go run main.go [flags]
```

### Flags

- `-output-path`: Path to output OSV files (default: "debian-cve-osv").
- `-output-bucket`: The GCS bucket to write to (default: "debian-osv").
- `-workers`: Number of workers to process records (default: 64).
- `-upload-to-gcs`: If true, upload to GCS bucket instead of writing to local disk (default: false).
- `-sync-deletions`: If true, delete files in the bucket that do not exist locally (default: false).

## Description

The tool performs the following steps:
1. Downloads the Debian Security Tracker data from `https://security-tracker.debian.org/tracker/data/json`.
2. Downloads Debian Distro Info data to map release names to version numbers.
3. Loads existing CVEs from `cve_jsons`.
4. Generates OSV vulnerabilities by mapping Debian security tracker entries to CVEs.
5. Uploads the results to GCS or writes them to the local filesystem.
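
Step 2, mapping release names to version numbers, could look roughly like the Go sketch below. It assumes the distro info is CSV with `version` and `series` columns, which is an assumption about the data layout rather than something this README states. `releaseVersions` and the inline sample rows are illustrative only.

```go
package main

import (
	"encoding/csv"
	"fmt"
	"strings"
)

// releaseVersions maps a release series name (e.g. "bookworm") to its version
// number, reading CSV that has "version" and "series" columns.
func releaseVersions(data string) (map[string]string, error) {
	rows, err := csv.NewReader(strings.NewReader(data)).ReadAll()
	if err != nil {
		return nil, err
	}
	if len(rows) == 0 {
		return nil, fmt.Errorf("empty CSV")
	}
	// Locate the columns by header name rather than by position.
	col := map[string]int{}
	for i, name := range rows[0] {
		col[name] = i
	}
	vi, okV := col["version"]
	si, okS := col["series"]
	if !okV || !okS {
		return nil, fmt.Errorf("missing version or series column")
	}
	versions := make(map[string]string)
	for _, row := range rows[1:] {
		if row[vi] == "" {
			continue // skip rows without a version number (e.g. a rolling release)
		}
		versions[row[si]] = row[vi]
	}
	return versions, nil
}

func main() {
	// Illustrative sample rows, not a copy of the real data file.
	const sample = "version,codename,series\n11,Bullseye,bullseye\n12,Bookworm,bookworm\n,Sid,sid\n"
	versions, err := releaseVersions(sample)
	if err != nil {
		panic(err)
	}
	fmt.Println(versions["bookworm"]) // 12
}
```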
25 changes: 25 additions & 0 deletions vulnfeeds/cmd/ids/README.md
@@ -0,0 +1,25 @@
# IDs Tool

This utility assigns IDs to OSV records in a directory. It ensures that IDs are unique and follow a specified prefix and year format.

It is predominantly used by [PYSEC](https://github.com/pypa/advisory-database/blob/main/.github/workflows/automation.yaml) and [Malicious Packages](https://github.com/ossf/malicious-packages/blob/7b1ba332528dba6b0a2df23e9a43b384623c0251/.github/workflows/assign-osv-ids.yml#L35).

## Usage

```bash
go run main.go [flags]
```

### Flags

- `-prefix`: Vulnerability prefix (e.g., "PYSEC").
- `-dir`: Path to the directory containing the vulnerability records.
- `-format`: Format of OSV reports in the repository. Must be "json" or "yaml" (default: "yaml").

## Description

The tool performs the following steps:
1. Walks the specified directory to find unassigned vulnerabilities (files starting with `PREFIX-0000-`).
2. Determines the maximum allocated ID for each year.
3. Assigns new IDs to unassigned vulnerabilities, incrementing the counter for the respective year.
4. Renames the files to match the new IDs.
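
Steps 2 and 3 above can be sketched in Go as shown below. This is a simplified illustration, not the tool's source: `maxIDs` and `nextID` are hypothetical helpers, and the sketch assumes IDs take the form `PREFIX-YYYY-N` and that the caller already knows which year an unassigned record belongs to.

```go
package main

import (
	"fmt"
	"regexp"
	"strconv"
)

// maxIDs returns the highest allocated sequence number per year among IDs of
// the form PREFIX-YYYY-N. Unassigned IDs (year 0000) are ignored.
func maxIDs(prefix string, ids []string) map[int]int {
	re := regexp.MustCompile(`^` + regexp.QuoteMeta(prefix) + `-(\d{4})-(\d+)$`)
	maxByYear := make(map[int]int)
	for _, id := range ids {
		m := re.FindStringSubmatch(id)
		if m == nil {
			continue // different prefix or not in the expected format
		}
		year, _ := strconv.Atoi(m[1])
		n, _ := strconv.Atoi(m[2])
		if year != 0 && n > maxByYear[year] {
			maxByYear[year] = n
		}
	}
	return maxByYear
}

// nextID allocates the next ID for the given year and records it in maxByYear.
func nextID(prefix string, year int, maxByYear map[int]int) string {
	maxByYear[year]++
	return fmt.Sprintf("%s-%d-%d", prefix, year, maxByYear[year])
}

func main() {
	existing := []string{"PYSEC-2023-7", "PYSEC-2024-1", "PYSEC-2024-3", "PYSEC-0000-abc"}
	maxByYear := maxIDs("PYSEC", existing)
	fmt.Println(nextID("PYSEC", 2024, maxByYear)) // PYSEC-2024-4
	fmt.Println(nextID("PYSEC", 2024, maxByYear)) // PYSEC-2024-5
	fmt.Println(nextID("PYSEC", 2025, maxByYear)) // PYSEC-2025-1
}
```

Keeping the per-year counters in one map means several unassigned records from the same year receive consecutive IDs within a single run.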
42 changes: 20 additions & 22 deletions vulnfeeds/cmd/mirrors/cpe-repo-gen/README.md
@@ -9,7 +9,7 @@ for example output.
It can utilise Debian copyright metadata for additional inference. Populate that
metadata mirror with:

```
```bash
wget \
--directory debian_copyright \
--mirror \
@@ -18,31 +18,29 @@ wget \
https://metadata.ftp-master.debian.org/changelogs/main
```

```
cpe-repo-gen analyzes the NVD CPE Dictionary for Open Source repository information.
`cpe-repo-gen` analyzes the NVD CPE Dictionary for Open Source repository information.
It reads the NVD CPE Dictionary JSON files and outputs a JSON map of CPE products
to discovered repository URLs.

It can also write additional data about colliding CPE package names to stdout.

Usage:

go run cmd/cpe-repo-gen/main.go [flags]

The flags are:

--cpe_dictionary_dir
The path to the directory of NVD CPE Dictionary JSON files, see https://nvd.nist.gov/products/cpe
## Usage
```bash
go run main.go [flags]
```

--debian_metadata_path
The path to a directory containing a local mirror of Debian copyright metadata, see README.md
### Flags

--output_dir
The directory to output cpe_product_to_repo.json and cpe_reference_description_frequency.csv in
- `-cpe-dictionary-dir`: The path to the directory of NVD CPE Dictionary JSON files (default: "cve_json/nvdcpe-2.0-chunks"). See https://nvd.nist.gov/products/cpe
- `-debian-metadata-path`: The path to a directory containing a local mirror of Debian copyright metadata.
- `-output-dir`: The directory to output `cpe_product_to_repo.json` and `cpe_reference_description_frequency.csv` in (default: ".").
- `-validate`: Perform remote validation of repositories and only include ones that validate successfully (default: true).
- `-verbose`: Output additional telemetry to stdout (default: false).
- `-gcp-logging-project`: GCP project ID to use for logging (default: "oss-vdb"). Set to the empty string to log to stdout.

--gcp_logging_project
The GCP project ID to utilise for Cloud Logging. Set to the empty string to log to stdout
## Description

--verbose
Output additional telemetry to stdout
```
The tool performs the following steps:
1. Loads CPEs from the specified dictionary directory.
2. Analyzes the CPEs to find repository URLs in their references.
3. Optionally attempts to derive repository URLs from Debian copyright metadata.
4. Validates the discovered repositories by checking if they are reachable and have usable refs.
5. Outputs the product-to-repo map and a description frequency CSV.
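
As a rough illustration of step 2, the Go sketch below trims a reference URL down to a candidate repository URL. The heuristic (keep only `owner/name` on an allowlisted code host) and the names `repoFromReference` and `hosts` are assumptions made for this example. The tool's real inference logic may differ considerably.

```go
package main

import (
	"fmt"
	"net/url"
	"strings"
)

// repoFromReference trims a reference URL down to a candidate repository URL
// (https://host/owner/name) when it points at one of the given code hosts.
func repoFromReference(ref string, hosts map[string]bool) (string, bool) {
	u, err := url.Parse(ref)
	if err != nil || !hosts[u.Hostname()] {
		return "", false
	}
	parts := strings.Split(strings.Trim(u.Path, "/"), "/")
	if len(parts) < 2 || parts[0] == "" || parts[1] == "" {
		return "", false // no owner/name pair in the path
	}
	name := strings.TrimSuffix(parts[1], ".git")
	return fmt.Sprintf("https://%s/%s/%s", u.Hostname(), parts[0], name), true
}

func main() {
	hosts := map[string]bool{"github.com": true} // assumed host allowlist
	repo, ok := repoFromReference("https://github.com/google/osv.dev/issues/1", hosts)
	fmt.Println(repo, ok) // https://github.com/google/osv.dev true
}
```

A candidate produced this way would still need the remote validation described in step 4 before being trusted.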
20 changes: 20 additions & 0 deletions vulnfeeds/cmd/mirrors/download-cves/README.md
@@ -0,0 +1,20 @@
# Download CVEs

This tool downloads CVE data from the NVD 2.0 API data dumps.

## Usage

```bash
go run main.go [flags]
```

### Flags

- `-cve-path`: Where to download CVEs to (default: "cve_jsons").

## Description

The tool performs the following steps:
1. Downloads CVE JSON files for each year from 2002 to the current year.
2. Downloads "modified" and "recent" CVE feeds.
3. Saves the downloaded files to the specified directory.
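
The set of feeds described in steps 1 and 2 can be enumerated with a short Go sketch. `feedNames` is a hypothetical helper that only lists feed labels. How those labels map to download URLs and file names is deliberately left out, since this README does not specify it.

```go
package main

import (
	"fmt"
	"strconv"
	"time"
)

// feedNames lists the feeds to fetch: one per year from 2002 through
// currentYear, followed by "modified" and "recent".
func feedNames(currentYear int) []string {
	var names []string
	for y := 2002; y <= currentYear; y++ {
		names = append(names, strconv.Itoa(y))
	}
	return append(names, "modified", "recent")
}

func main() {
	names := feedNames(time.Now().Year())
	fmt.Println(len(names), names[0], names[len(names)-1])
}
```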