First of all, thanks for the wonderful project! colly has saved our team a lot of time!!
Context
ref: #8, #33
According to RFC7578 section 4.3:
4.3. Multiple Files for One Form Field
The form data for a form field might include multiple files.
[RFC2388] suggested that multiple files for a single form field be transmitted using a nested "multipart/mixed" part. This usage is deprecated.
To match widely deployed implementations, multiple files MUST be sent by supplying each file in a separate part but all with the same "name" parameter.
Receiving applications intended for wide applicability (e.g., multipart/form-data parsing libraries) SHOULD also support the older method of supplying multiple files.
This practice is unsurprisingly common, and I am facing exactly this case.
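To make the quoted rule concrete, here is a minimal sketch using only Go's standard `mime/multipart` package. It writes two files as separate parts under the same field name, then parses the body back to show that both end up under that one name. The `file` type and the `encodeFiles` and `countFiles` helpers are names I made up for illustration; none of this is colly code.

```go
package main

import (
	"bytes"
	"fmt"
	"mime/multipart"
)

// file is a hypothetical helper type for this sketch only.
type file struct {
	name string
	data []byte
}

// encodeFiles writes every file as its own part, all sharing the same
// field name, which is the layout RFC 7578 section 4.3 asks for.
func encodeFiles(field string, files []file) (*bytes.Buffer, string, error) {
	body := &bytes.Buffer{}
	w := multipart.NewWriter(body)
	for _, f := range files {
		part, err := w.CreateFormFile(field, f.name)
		if err != nil {
			return nil, "", err
		}
		if _, err := part.Write(f.data); err != nil {
			return nil, "", err
		}
	}
	// Close writes the terminating boundary.
	if err := w.Close(); err != nil {
		return nil, "", err
	}
	return body, w.Boundary(), nil
}

// countFiles parses the body and reports how many files arrived under field.
func countFiles(body *bytes.Buffer, boundary, field string) (int, error) {
	form, err := multipart.NewReader(body, boundary).ReadForm(1 << 20)
	if err != nil {
		return 0, err
	}
	defer form.RemoveAll()
	return len(form.File[field]), nil
}

func main() {
	files := []file{
		{name: "a.txt", data: []byte("first")},
		{name: "b.txt", data: []byte("second")},
	}
	body, boundary, err := encodeFiles("files", files)
	if err != nil {
		panic(err)
	}
	n, err := countFiles(body, boundary, "files")
	if err != nil {
		panic(err)
	}
	fmt.Println("files received under the same name:", n)
}
```

A `map[string][]byte` cannot describe this request, because the second entry for `files` would overwrite the first.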
The issue
The name field does not have to be unique. There are a few common cases where a duplicate name field is required (e.g., when uploading an array of files), and these cases should be properly covered.
colly/colly.go, lines 551 to 559 in 99b7fb1:

    // PostMultipart starts a collector job by creating a Multipart POST request
    // with raw binary data. PostMultipart also calls the previously provided callbacks
    func (c *Collector) PostMultipart(URL string, requestData map[string][]byte) error {
        boundary := randomBoundary()
        hdr := http.Header{}
        hdr.Set("Content-Type", "multipart/form-data; boundary="+boundary)
        hdr.Set("User-Agent", c.UserAgent)
        return c.scrape(URL, "POST", 1, createMultipartReader(boundary, requestData), nil, hdr, true)
    }

colly/colly.go, lines 1461 to 1469 in 99b7fb1:

    buffer.WriteString("Content-type: multipart/form-data; boundary=" + boundary + "\n\n")
    for contentType, content := range data {
        buffer.WriteString(dashBoundary + "\n")
        buffer.WriteString("Content-Disposition: form-data; name=" + contentType + "\n")
        buffer.WriteString(fmt.Sprintf("Content-Length: %d \n\n", len(content)))
        buffer.Write(content)
        buffer.WriteString("\n")
    }
    buffer.WriteString(dashBoundary + "--\n\n")

Unfortunately, the current implementation accepts map[string][]byte, which forces each name to be unique.
Suggestion
Maybe we can accept []Subpart so that:
- The order of subparts is guaranteed
- filename and other metadata can be optionally included
- Duplicate name fields are allowed

and so on.
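To give the discussion something concrete, here is a rough sketch of what I have in mind, built on the standard `mime/multipart` and `net/textproto` packages. Everything named here (`Subpart` and its fields, `encodeSubparts`, `partNames`) is hypothetical and only illustrates the idea; it is not an existing colly API, and the final shape is open to discussion.

```go
package main

import (
	"bytes"
	"fmt"
	"io"
	"mime"
	"mime/multipart"
	"net/textproto"
	"strings"
)

// Subpart is a hypothetical type: one part of a multipart/form-data body.
type Subpart struct {
	Name        string // form field name; may repeat across subparts
	Filename    string // optional
	ContentType string // optional
	Data        []byte
}

// quoteEscaper escapes backslashes and double quotes in header parameters.
var quoteEscaper = strings.NewReplacer("\\", "\\\\", `"`, "\\\"")

// encodeSubparts writes the parts in slice order and returns the body
// together with the Content-Type header value (including the boundary).
func encodeSubparts(parts []Subpart) (*bytes.Buffer, string, error) {
	body := &bytes.Buffer{}
	w := multipart.NewWriter(body)
	for _, p := range parts {
		disp := fmt.Sprintf(`form-data; name="%s"`, quoteEscaper.Replace(p.Name))
		if p.Filename != "" {
			disp += fmt.Sprintf(`; filename="%s"`, quoteEscaper.Replace(p.Filename))
		}
		h := textproto.MIMEHeader{}
		h.Set("Content-Disposition", disp)
		if p.ContentType != "" {
			h.Set("Content-Type", p.ContentType)
		}
		pw, err := w.CreatePart(h)
		if err != nil {
			return nil, "", err
		}
		if _, err := pw.Write(p.Data); err != nil {
			return nil, "", err
		}
	}
	if err := w.Close(); err != nil {
		return nil, "", err
	}
	return body, w.FormDataContentType(), nil
}

// partNames parses the body back and lists the field names in wire order.
func partNames(body *bytes.Buffer, contentType string) ([]string, error) {
	_, params, err := mime.ParseMediaType(contentType)
	if err != nil {
		return nil, err
	}
	r := multipart.NewReader(body, params["boundary"])
	var names []string
	for {
		p, err := r.NextPart()
		if err == io.EOF {
			return names, nil
		}
		if err != nil {
			return nil, err
		}
		names = append(names, p.FormName())
	}
}

func main() {
	body, ct, err := encodeSubparts([]Subpart{
		{Name: "files", Filename: "a.txt", ContentType: "text/plain", Data: []byte("first")},
		{Name: "title", Data: []byte("my upload")},
		{Name: "files", Filename: "b.txt", ContentType: "text/plain", Data: []byte("second")},
	})
	if err != nil {
		panic(err)
	}
	names, err := partNames(body, ct)
	if err != nil {
		panic(err)
	}
	fmt.Println(names) // prints [files title files]
}
```

A slice keeps the order the caller chose and lets `files` appear twice, while the optional fields cover filename and content type. A new method could take the slice and leave the existing map-based `PostMultipart` untouched, though that is only one possible design.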
I would love to hear your opinion! If you think this is feasible, I will start working on it.