Optimize data cluster format for VHD and QCOW#6895
Optimize data cluster format for VHD and QCOW#6895last-genius wants to merge 11 commits intoxapi-project:26.1-lcmfrom
Conversation
|
Opened the corresponding xs-opam PR: xapi-project/xs-opam#755 |
| | x :: y :: _ -> | ||
| (to_int x, to_int y) | ||
| | _ -> | ||
| raise (Invalid_argument "Invalid JSON") |
There was a problem hiding this comment.
You might want to report the json.
There was a problem hiding this comment.
It can be rather large and pollute the logs... This shouldn't happen unless you have a version incompatibility, really
d41347a to
bcf47a8
Compare
|
Changed the approach:
This way, XS (which only uses VHD) can keep using the legacy format until they can ensure the new format didn't accidentally break anything else. XCP-ng will flip the feature flag to make VHD consistent with QCOW - we will use the interval-based format. |
| cohttp | ||
| cohttp-lwt | ||
| conf-libssl | ||
| (conf-qemu-img :with-test) |
There was a problem hiding this comment.
There's no conf package for qemu-img, even though for some reason I thought it existed.
Another way of configuring it is:
depexts: [
["qemu-img"] {with-test}
]
There was a problem hiding this comment.
I added it last year, seems to still be there: https://github.com/ocaml/opam-repository/blob/master/packages/conf-qemu-img/conf-qemu-img.1/opam
There was a problem hiding this comment.
I don't see it in xs-opam, though
There was a problem hiding this comment.
Is this waiting for a review from Edwin, then? I think it was him that raised concerns
d431ca9 to
b2b80aa
Compare
This allows to switch on the more efficient interval format later. (QCOW always uses the new format) Signed-off-by: Andrii Sultanov <andriy.sultanov@vates.tech>
This is just an easy way to make sure the semantics are preserved in any future refactorings, without having to run full VHD exports. Signed-off-by: Andrii Sultanov <andriy.sultanov@vates.tech>
This command returns a more efficient representation of allocated clusters (when compared to read_headers), utilizing a sparse interval format instead of returning every single allocated cluster. This is the more efficient option, decreasing the filesize and memory usage in vhd-tool, but it's currently under a feature flag, so it's added as a new command instead of replacing read_headers immediately. Cram test for read_headers is still passing, so this refactoring has preserved the legacy format. Signed-off-by: Andrii Sultanov <andriy.sultanov@vates.tech>
Signed-off-by: Andrii Sultanov <andriy.sultanov@vates.tech>
Since the runtime feature flag vhd_legacy_blocks_format determines which block format is used to describe allocated VHD clusters, this requires duplicate parse_header_interval functions for VHD and QCOW. The right functions are selected in stream_vdi based on the feature flag. Signed-off-by: Andrii Sultanov <andriy.sultanov@vates.tech>
…er allocation Instead of using a set with every individual allocated cluster index as a member, use a sorted list of intervals to verify if cluster is allocated - this uses much less memory and directly follows from the JSON format qcow-stream-tool and vhd-tool output now. Signed-off-by: Andrii Sultanov <andriy.sultanov@vates.tech>
…_clusters nonzero_clusters no longer contain every single allocated cluster and instead are intervals of allocated clusters. Signed-off-by: Andrii Sultanov <andriy.sultanov@vates.tech>
Signed-off-by: Andrii Sultanov <andriy.sultanov@vates.tech>
b2b80aa to
581edae
Compare
|
I've switched approaches after more thorough testing. Now This (along with I've dropped I've fixed more bugs in preceding commits (two in |
| else | ||
| match left_block with | ||
| | Some x -> | ||
| (`List [`Int x; `Int (i - 1)] :: acc, None) |
There was a problem hiding this comment.
Is the JSON our own design or do we need to conform to something? If it's our own choice, I would prefer an object with two entries over a list if this JSON is consumed by software. If it's just for display and never parsed it's fine to do it this way because it is more compact.
There was a problem hiding this comment.
it is our design, it's parsed in vhd_qcow_parsing. this two-element list represents the start and the end of the interval, so I don't see it expanding to more elements (which could be a reason to turn it into an object). we still want the compactness, though
| (cluster_size + chunk_size - 1) / chunk_size | ||
| in | ||
| (* Iterate over allocated intervals, copying every cluster inside *) | ||
| let _ = |
There was a problem hiding this comment.
Is a fold the right way if we ignore the result?
There was a problem hiding this comment.
couldn't think of a nicer way to thread through a value that's updated on every iteration (without refs)
This proves much more reliable than code based on ocaml-qcow. Since qemu-img has a different format (with the needed information spread across two files resulting from calls to 'qemu-img info' and 'qemu-img map'), change the parsing code in vhd_qcow_parsing and qcow2-to-stdout. Signed-off-by: Andrii Sultanov <andriy.sultanov@vates.tech>
qemu-img is now used to determine allocated clusters, so this command is no longer needed. Signed-off-by: Andrii Sultanov <andriy.sultanov@vates.tech>
581edae to
d1beba5
Compare
|
Apparently |
psafont
left a comment
There was a problem hiding this comment.
We need to coordinate the merge and release with xs-opam.
xs-opam's PR should be merged first.
Then pass the CI here.
Then merge this, and then we can make a release in xs-opam.
As usual with the breaking updates, this will break the CI in all PRs and feature branches, triggering a rebase.
Since this is done for LCM, I don't think there will be much of an impact, but this will happening when merging to master
Signed-off-by: Andrii Sultanov <andriy.sultanov@vates.tech>
d1beba5 to
15afce2
Compare
|
I've dropped the conf-qemu-img dependency from xapi, kept it in vhd-tool. xapi's spec file already has a dependency on qemu. |
mirage/ocaml-qcow#134 changes the type of the data structure containing info on allocated data clusters, returning allocated intervals instead of all the virtual cluster addresses. Change
qcow2-to-stdoutto the new interval-based format.Add
vhd-tool read_headers_intervalcommand which also conforms to this new format, and change the parsing code instream_vdito accept both formats depending on a feature flag. Add cram tests verifying legacy format is preserved as-is.I've ran vm export and vdi integrity quicktests and tested this extensively locally. The PR will only build once the new ocaml-qcow version is packaged into xs-opam, so keeping this as draft for now.