Here is a Go program that downloads the file, extracts its contents, and computes the required information (MD5, SHA1, and SHA256 hashes, filename, and size) for each file in the ZIP archive:
```go
package main

import (
	"archive/zip"
	"bytes"
	"crypto/md5"
	"crypto/sha1"
	"crypto/sha256"
	"fmt"
	"io"
	"net/http"
)

func main() {
	// Step 1: Download the ZIP file
	resp, err := http.Get("https://github.com/hungptran/sample-files/raw/main/2_Levels_Archive.zip")
	if err != nil {
		panic(err)
	}
	defer resp.Body.Close()

	// Read the body into a buffer
	buf, err := io.ReadAll(resp.Body)
	if err != nil {
		panic(err)
	}

	// Step 2: Read the ZIP file
	zipReader, err := zip.NewReader(bytes.NewReader(buf), int64(len(buf)))
	if err != nil {
		panic(err)
	}

	// Step 3: Iterate through each file
	for _, file := range zipReader.File {
		f, err := file.Open()
		if err != nil {
			panic(err)
		}

		// Read file content
		content, err := io.ReadAll(f)
		if err != nil {
			panic(err)
		}
		f.Close()

		// Compute hashes
		md5Hash := md5.Sum(content)
		sha1Hash := sha1.Sum(content)
		sha256Hash := sha256.Sum256(content)

		// Step 4: Output the information
		fmt.Printf("Filename: %s\n", file.Name)
		fmt.Printf("Size: %d bytes\n", file.UncompressedSize64)
		fmt.Printf("MD5: %x\n", md5Hash)
		fmt.Printf("SHA1: %x\n", sha1Hash)
		fmt.Printf("SHA256: %x\n\n", sha256Hash)
	}
}
```
To check the information of the ZIP file itself (i.e., the hash values and size of the 2_Levels_Archive.zip file), you don't need to extract the contents. Instead, you can compute the hashes and the size of the downloaded ZIP file directly. Here's the updated Go code for this purpose:
```go
package main

import (
	"crypto/md5"
	"crypto/sha1"
	"crypto/sha256"
	"fmt"
	"io"
	"net/http"
)

func main() {
	// Step 1: Download the ZIP file
	resp, err := http.Get("https://github.com/hungptran/sample-files/raw/main/2_Levels_Archive.zip")
	if err != nil {
		panic(err)
	}
	defer resp.Body.Close()

	// Create hash writers
	md5Hash := md5.New()
	sha1Hash := sha1.New()
	sha256Hash := sha256.New()

	// Create a multi-writer to write to all hash writers simultaneously
	writer := io.MultiWriter(md5Hash, sha1Hash, sha256Hash)

	// Copy the response body to the multi-writer
	size, err := io.Copy(writer, resp.Body)
	if err != nil {
		panic(err)
	}

	// Step 2: Output the information
	fmt.Printf("Filename: 2_Levels_Archive.zip\n")
	fmt.Printf("Size: %d bytes\n", size)
	fmt.Printf("MD5: %x\n", md5Hash.Sum(nil))
	fmt.Printf("SHA1: %x\n", sha1Hash.Sum(nil))
	fmt.Printf("SHA256: %x\n", sha256Hash.Sum(nil))
}
```
The use of `io.MultiWriter` in combination with `io.Copy`, versus direct hash computation with `md5.Sum(content)`, `sha1.Sum(content)`, and `sha256.Sum256(content)`, represents two different approaches to calculating hashes in Go, each suited to different scenarios:
- Using `io.MultiWriter` with `io.Copy`:
  - This approach is stream-oriented and particularly efficient for large data sets, or when the data arrives as a stream (such as a network response, as in your case).
  - With `io.MultiWriter`, you write the data once, and it is passed simultaneously to multiple writers (in this case, the hash functions). This is memory-efficient because it does not require loading the entire data set into memory; the data is processed as it is read.
  - It's ideal when you are reading from an `io.Reader` (such as an HTTP response or a file) and want to process the data in multiple ways while reading it, such as computing several different hashes.
- Using `md5.Sum(content)`, `sha1.Sum(content)`, and `sha256.Sum256(content)`:
  - This approach is more straightforward, but it requires the entire data to be available in a byte slice (`content` in your example).
  - It's suitable for small data, or when the data is already in memory: you can pass the slice directly to the hash function and get the result.
  - However, it can be less memory-efficient for large data sets, because everything must be loaded into memory first.
In your original context (downloading a file and computing its hash), using `io.MultiWriter` with `io.Copy` is more efficient, especially for large files, because it processes the data as a stream without loading the entire file into memory. The direct hash computation method (`md5.Sum(content)` and the like) is simpler and more direct, but better suited to situations where the data is already in memory, or where the amount of data is small.