
--archive-cdx

Value: PATH
License: STD

Write a CDXJ index of the WARC’s records. Each line carries a canonicalized (SURT) URL key, a 14-digit timestamp, and a JSON block with the record’s URL, MIME, HTTP status, payload digest, byte offset, length, and the WARC’s filename. Lines are sorted, so replay tools like pywb can serve the archive by binary search instead of re-indexing it:

zshot -f site.warc --archive-cdx site.cdxj https://zshot-cli.com
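As a rough illustration of why sorted lines matter, the sketch below parses CDXJ lines and finds all captures of one SURT key by binary search. The JSON field names (`url`, `mime`, `status`, `digest`, `offset`, `length`, `filename`), the string-typed numbers, and the sample values are assumptions borrowed from common CDXJ conventions, not confirmed zshot output. `make_line`, `parse_cdxj_line`, and `lookup` are hypothetical helpers.

```python
import bisect
import json


def make_line(key, timestamp, **fields):
    """Build one CDXJ line: SURT key, 14-digit timestamp, JSON block."""
    return f"{key} {timestamp} {json.dumps(fields)}"


def parse_cdxj_line(line):
    """Split one CDXJ line into (surt_key, timestamp, fields)."""
    key, timestamp, raw_json = line.rstrip("\n").split(" ", 2)
    return key, timestamp, json.loads(raw_json)


def lookup(lines, surt_key):
    """Return every parsed entry for surt_key from a sorted list of lines."""
    # Matching on "key " (with the trailing space) avoids hitting longer
    # keys that merely start with the same characters.
    prefix = surt_key + " "
    start = bisect.bisect_left(lines, prefix)
    hits = []
    for line in lines[start:]:
        if not line.startswith(prefix):
            break  # sorted input: all captures of one key are contiguous
        hits.append(parse_cdxj_line(line))
    return hits


# Hypothetical sample index, already in sorted order.
SAMPLE = [
    make_line("com,zshot-cli)/", "20240101000000",
              url="https://zshot-cli.com/", mime="text/html", status="200",
              digest="sha1:AAAA", offset="0", length="1024",
              filename="site.warc"),
    make_line("com,zshot-cli)/", "20240201000000",
              url="https://zshot-cli.com/", mime="text/html", status="200",
              digest="sha1:BBBB", offset="1024", length="1024",
              filename="site.warc"),
    make_line("com,zshot-cli)/about", "20240101000002",
              url="https://zshot-cli.com/about", mime="text/html",
              status="200", digest="sha1:CCCC", offset="2048", length="900",
              filename="site.warc"),
]

for key, timestamp, fields in lookup(SAMPLE, "com,zshot-cli)/about"):
    print(key, timestamp, fields["offset"], fields["length"])
```

For a real index, `lines` could be the result of `open("site.cdxj").read().splitlines()`. Replay tools typically search the file on disk instead of loading it whole.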

Requires a WARC: either warc output, as in the example above, or a --warc sidecar alongside another output. In the default per-record-gzip WARC, each offset and length address a whole gzip member; with --warc-no-gzip they address a raw byte span. Response, revisit, and resource records are indexed; request and warcinfo records are not.
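The offset and length semantics can be sketched as follows: seek to the offset, read exactly `length` bytes, and decompress that span if the WARC is gzipped per record. `read_record` is a hypothetical helper, not part of zshot. It assumes the offset and length have already been converted to integers.

```python
import gzip


def read_record(warc_path, offset, length, gzipped=True):
    """Return the raw bytes of one WARC record addressed by offset/length."""
    with open(warc_path, "rb") as f:
        f.seek(offset)
        span = f.read(length)
    # Per-record gzip: the span is one complete gzip member, so it can be
    # decompressed on its own. With --warc-no-gzip the span is the record.
    return gzip.decompress(span) if gzipped else span
```

With values taken from an index line, a call might look like `read_record("site.warc", int(fields["offset"]), int(fields["length"]))`, where `fields` is the parsed JSON block of that line. Pass `gzipped=False` for a WARC written with --warc-no-gzip.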

This is a Standard-tier flag.

On the HTTP server, pass with_archive_cdx=true instead of a path. The index is returned as an additional asset in a Link response header, in the same way as with_warc. See the API reference for reading those links.