How to Create HTML/ZIP/PNG Polyglot Files

  • Introduction: The web archiving tool SingleFile stores web page resources as data URIs, which can be inefficient for large resources. Combining the ZIP format with HTML and encapsulating the result in a PNG is a more elegant solution.
  • The Power of ZIP: ZIP format has an organized structure with file entries and a central directory. It's flexible in data placement, allowing prepending and appending data. This makes it suitable for creating polyglot files.
  • Creating HTML/ZIP Polyglot Files: A self-extracting archive is created by storing the page and its resources in a ZIP file and embedding the ZIP data in an HTML comment. The assets/main.js script reads the ZIP data using the lib/zip.min.js library. However, when the page is opened from the filesystem, the same-origin policy prevents the script from retrieving the ZIP data directly.
  • Reading ZIP Data from the DOM: To overcome the filesystem limitation, ZIP data is read directly from the DOM. This requires handling character encoding issues like decoding to the wrong character set and converting carriage returns. An association table and "consolidation data" are used to fix these issues.
  • Adding PNG to the Mix: The PNG format consists of a signature and chunks. The final implementation combines HTML, ZIP, and PNG into a single file. HTML's fault tolerance allows this complex structure, but it introduces challenges like visible signature chunks and rendering in quirks mode.
  • Optimization Through Image Reuse: As an optimization, the main image is removed from the ZIP file, and the page file itself, which is also a PNG, is reused to replace that image in the displayed page.
  • Resulting File: The resulting file is demo.png.zip.html. It can also be viewed at demo.png.zip.html. The article mentions an issue with the macOS "Archive Utility"; unzip can be used instead.
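
The ZIP flexibility described above (data can be prepended and appended because readers locate the central directory from the end of the file) can be illustrated with a minimal Python sketch. This is not SingleFile's implementation: the helper names build_zip and build_polyglot, the sample file contents, and the HTML wrapper are assumptions made for illustration. It relies on Python's standard zipfile module, which compensates for data placed before the archive. The sketch also ignores the case where the compressed bytes happen to contain "-->", which would end the HTML comment early.

```python
import io
import zipfile


def build_zip(files: dict[str, bytes]) -> bytes:
    """Return an in-memory ZIP archive containing the given files."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as archive:
        for name, data in files.items():
            archive.writestr(name, data)
    return buffer.getvalue()


def build_polyglot(zip_bytes: bytes) -> bytes:
    """Wrap ZIP bytes in an HTML comment (hypothetical wrapper markup)."""
    prefix = b"<!DOCTYPE html><html><body><p>Loading archive...</p><!--"
    suffix = b"--></body></html>"
    return prefix + zip_bytes + suffix


# Sample page resources (made up for this example).
page_files = {
    "index.html": b"<h1>Hello</h1>",
    "assets/style.css": b"h1 { color: teal; }",
}
polyglot = build_polyglot(build_zip(page_files))

# The same bytes start like an HTML document...
print(polyglot[:15])

# ...and still open as a ZIP archive, because the reader finds the
# end-of-central-directory record by scanning back from the end of the data
# and adjusts entry offsets for the prepended HTML.
with zipfile.ZipFile(io.BytesIO(polyglot)) as archive:
    names = archive.namelist()
    restored = archive.read("index.html")

print(names)
print(restored)
```

Saving polyglot to a file with an .html extension would let a browser treat it as a page while unzip-style tools treat it as an archive; how tolerant a given ZIP tool is of the surrounding bytes varies, which is consistent with the "Archive Utility" issue mentioned above.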