Replace the ray-to-dem intersection algorithm with the secant method#5408
Conversation
|
Just to keep you in the loop with tests on this @oleg-alexandrov, here is partial output from one of the camstats tests: we are exceptionally close to our test tolerance, but I don't know if there is much we can do about it |
|
One may try to see if decreasing the tolerance in the stopping criterion of the secant method changes anything, or one could just call these the new reference results. Given that one looks all the way from orbit, 1e-8 is small, whether these are pixels or lon-lat. |
|
I realized that jigsaw also uses this function. I am not sure what it is used for there, but there's a possibility that the current stopping tolerance, of 1/100th of a pixel, may be too high, especially if one has to do numerical differentiation based on the result of this. In CSM ground-to-image, the tolerance used is about 1e-8. Here 1/1000th of a pixel may be a better tolerance, and will likely result in at most one more iteration, given that the method converges very fast. |
|
@acpaquette What is the status on this? More work needed or waiting on bandwidth for additional review? |
|
@jlaura @oleg-alexandrov I still need to look into all of the failing tests here and do some manual verification. I should be able to get to this under continuous support, but it will take a bit longer to fully explore the changes |
|
Just to loop back around here, this seems like a good change for ISIS as it allows ISIS to use more local DEMs for both landed and orbital data. Still need to test on a network through jigsaw to see how the results differ. |
|
Sounds good. I would suggest making the tolerance for convergence smaller, maybe by a factor of 10, and seeing the effect. For ground-level DEMs, whose grid spacing can grow with the distance from the rover, the current tolerance looked to be on the high side. |
|
@oleg-alexandrov I tested it against orbital data/networks and everything seems nominal. I was also able to replicate the images produced through ASP for the landed data. I think this PR, from a functional standpoint, is good to go. When you have time Oleg, would you be able to update the tests? No rush on our end as the grounded support hasn't been widely publicized. Thanks for an amazing addition to the code base! |
|
I will not be coming back to CSM / ISIS work till middle of June. I have never worked with the ISIS test suite and know very little about its structure. Can the ISIS maintainers pick up the slack? :) Likely not the first time the regression suite broke. |
|
@oleg-alexandrov we've always asked PR authors to write and update their own tests. People would be pretty upset if we suddenly made an exception. The process for updating tests is pretty well documented and new outside contributors seem to be able to pick it up pretty fast: https://astrogeology.usgs.gov/docs/how-to-guides/isis-developer-guides/writing-isis-tests-with-ctest-and-gtest/ |
|
Thanks for the user guide. I believe I updated all the tests. I also pushed a change making the tolerance for convergence 1/1000 pixel rather than 1/100 pixel. This had the added bonus that more tests passed even before I started looking at them. I brought up an issue in #5459. Currently some unit tests take a very long time while doing little more than computing a number or two. Multiply that by the very large number of tests and potentially several iterations and it becomes a pain to deal with. |
|
Any thoughts on this? |
|
@oleg-alexandrov Apologies for the little activity on this; Jenkins has been dead for some time now. I will work in the background this week to double check tests and provide a review here |
|
Looking into this now that I have a handle on test failures on mac with a generic mac test config from #5517 |
|
This is a reminder to perhaps attempt to pull this in. This pull request has been around for 6 months. A few months ago I fixed all the tests I could see failing. That meant a large amount of editing of files and plugging in new numbers. Not fun. I can do that a second time, if provided with a list of failures (after merging into latest), but then I hope it will be merged promptly afterwards. |
|
@oleg-alexandrov I am unable to run tests until merge conflicts have been resolved. |
|
I fixed the conflict. Note that the test results changed in different ways in my branch and the main one, so surely some tests will fail now. When you have the test results I will fix what is failing.
I will run the tests now. It takes about an hour. I will post the results in the PR. |
|
|
Were these tests passing before? Most of these have no relation to the function I changed, which projects points from image to ground. |
|
@oleg-alexandrov I have been running tests for all the PRs today and have not run across these test failures. I can run the tests again and see if anything changes. Currently, there are 19 test failures in the dev branch that I did not include in this list. |
|
Thanks for confirming. I am running the tests locally with ctest (both before and after my changes) as I need to see the precise failures and the numbers. I will try to fix what I can.
|
|
A problem here is that it is hard to ensure I have the same dependencies, machine, and test data as you folks. Would it be possible to paste all the failures in detail? Such as which test had what value before and what value now, which is what ctest prints when run in verbose mode, I believe. That way I can plug the right numbers into the failing tests (if the change is small enough), and then on your side you can rerun them and see if they pass.
|
|
Here are the test logs: tests-part1.log Please ignore the following tests that are failing on dev: |
|
Thank you. The useful information I was able to extract from those is:

EVENTS 1721165603679 The difference between hist->Average() and 127.46759871644132 is 3.0619243602814095e-06, which exceeds .000001

These differences are much larger than would be expected. I don't understand what they mean, or even whether they have anything to do with the code I changed. Last time around I spent at least a day on this. I regretfully no longer have time to go through this process again. The tests take 2-3 hours to run on my machine. Then one has to plug and chug the right numbers into the right files. Mistakes are unavoidable, so one has to rerun the tests. One also has to configure rclone and fetch gigabytes of test data, set up the environment, fetch ISIS, and wait a couple of hours for it to build. I hope the effort I spent modifying the image-to-ground code, which required a new algorithm, is appreciated, and that under the umbrella of the relevant project the ISIS developers can take it from there. |
|
This is an important contribution to have merged in support of work that I am doing. Given the challenges getting access to the test data #5550, the inability of external contributors to merge updates to the test data, and the fact that @oleg-alexandrov had GTests passing on or about April 8, I would propose that the remaining updates be addressed under an upcoming continuous support sprint. Additionally, I would love to see process improvements that (1) make the test data more accessible, e.g., the proposed solution from #5550, and (2) make the turnaround between contribution and review significantly shorter. |
|
Thank you for your contribution! Unfortunately, this pull request hasn't received much attention lately, so it is labeled as 'stale.' If no additional action is taken, this pull request will be automatically closed in 180 days. If you want to participate in our support prioritization meetings or be notified when support sprints are happening, you can sign up for the support sprint notification emails here. Read more about our support process here |
|
This should stay on. I should fix it when back on the project. |
|
The build and test suite have started for your pull request. To view your build log, please reference the build with source version: "PR_5408". Additionally, check the latest "dev" source version to identify existing test failures. Please note that you are not responsible for the test failures that exist on both your PR and the dev branch. |
|
I fixed all tests except for the "app" and "module" tests. Those require the latest "truth" to be stored in your data repo, to which I don't have write access. Kelvin is saying that what is there now may do, and this can be merged in, if Adam agrees. Here's the build that ran after squashing all commits: https://us-west-2.codebuild.aws.amazon.com/project/eyJlbmNyeXB0ZWREYXRhIjoiNDJNZ2MxbHFKTkwxV1RyQUxJekdJY3FIanNqU29rMHB4Nk1YUzk4REIrZUZDeEtEaW9HQlZ1dTZOSHpML2VUTGVDekYydmVFcU9sUHJKN20wQzd1Q0UzSzJscnB0MElDb1M3Ti9GTlJYR1RuMWJTV3V1SkJTa3NoYmc9PSIsIml2UGFyYW1ldGVyU3BlYyI6IjF3U2NTSGlDcEtCc29YVnEiLCJtYXRlcmlhbFNldFNlcmlhbCI6MX0%3D/build/048c6416-0022-410e-bd8f-b9e01532b358

Note that I made some minor changes to some tests. There were some comparing PVL tables, and these would bail out at the first failure, requiring around 8 cycles of edit/build/test/fix per test, multiplied by 3 such tests, which I think is counterproductive. I replaced those with a handful of EXPECT_NEAR statements. I am quite confident the changes due to this are very reasonable. All sensible tests give results comparable to before, including the test where the logic being implemented is meant to intentionally fail if the ray points away from the planet. Thanks.

To add, I wrote a little Python script to auto-edit tests and plug the right numbers into EXPECT_NEAR and EXPECT_EQ statements. Plugging in numbers by hand is a bit of a giant chore, and this may help going forward if more tests are standardized. |
acpaquette left a comment
A couple of discussion points. I haven't checked the changes to tests/data, as my suggestions may impact the test results
// Compute the position along the ray
for (size_t i = 0; i < 3; i++)
  intersectionPoint[i] = observerPos[i] + t * lookDirection[i];
I know it's not ideal, but can this be moved out of the demError function? Reading over the new algorithm, I could not find how the intersection point was being updated. Maybe into a small function that just updates the intersection point. That would also simplify the function signature here to just take the intersection point and success.
It is essential that the demError() function takes a parameter named t, which parameterizes the line. Later, the secant method calls this with values of t that help it find the root.
The intersection point is being updated in the same function, demError(), a little further down. If I recall, I had to work within the constraints of the previous logic, in which the updated intersection point is a member of the class.
Right, I think I understand what you mean. If updating the point cannot be separated from the error generation, I think it would be helpful to update the name of the function. Maybe updateIntersection, this way it's more obvious that the intersection is being updated.
Sure. The function could be called calcDemErrUpdateIntersection().
 * Returned value is in km. A more robust method would return a couple
 * of values, and later try one if the other one fails.
 */
double DemShape::findDemValue() {
Is there a particular reason this deviates from using the ellipsoid to do an initial DEM value guess?
I don't know for sure, but this method seems to interact with global DEMs oddly. In that it would start at the upper left corner, and traverse the cube until a valid point is found. It would be nice to avoid this for global DEMs.
I think we can avoid this by using the ellipsoid intersection as an initial guess and getting a 5 x 5 pixel square around the initial intersection point. Then iterating over that to get an initial DEM value.
The reason the ellipsoid is not used to find the initial DEM value guess is that this would fail for a rover that has a small DEM in front of it, seen at an oblique angle, possibly 8,000 m below the datum.
So, this is an intentional fix for problems with the previous method.
Much effort went into ensuring the algorithm searches along the ray, by querying the DEM, until an intersection point is found.
Maybe I am missing something but this function seems to step through the DEM until it finds a valid value/radii. For global DEMs, the first valid values would likely be found at high latitudes around the poles.
If my data is closer to the equator, it seems like a significant jump for the algorithm to have to converge toward the radii at the equator from a radii obtained closer to the poles. I just wonder if that is contributing to a larger difference in intersection for closer to nadir data
When the ray-to-DEM algorithm starts, it needs some estimate of the DEM height which is better than just the zero datum elevation, which can be really off, especially for a rover.
Yes, the initial estimate is found by just sampling the DEM and stopping at the first good height value. The intersection with the ellipsoid at that value usually succeeds, even if the true value we need is off by a few km.
If that fails, the algorithm initiates a search along the ray. It could have started with just that, but it is somewhat slower than seeding with a valid DEM height. So, the strategy above is an optimization; the algorithm works without it.
Eventually, whichever way the initial estimate is found, rigorous root-searching starts, so it should not produce a large difference in intersection no matter what. It will stop when the tolerance is reached, or it will fail.
|
|
Suggested change implemented: tweaked the tolerance for the stopping criterion. Note: this resulted in a few more failed tests, but by small amounts, which is fully expected given the change in tolerance. That only a few extra tests failed means the change was not major overall. The new tolerance forces the solver to work a little harder in some circumstances. |
|
This is a proposed fix for the issue of intersecting a ray from a rover camera with a DEM. There were several problems:
This fix first finds a representative elevation based on sampling the DEM, then uses the secant method.
One issue that needs looking into is the stopping criterion. I left it the same, at 1/100th of the pixel resolution. For MSL the computed resolution ranged between 2 m and 10 m, which I think is a bit too coarse for MSL. I am not sure if the provided resolution() function is accurate enough.
To test this, one should first merge the ALE code, regenerate the ISD, and reattach it to the cub file with the same shapefile as before. One must fetch the latest ASP if doing mapprojection, as there's a bug fix there too.
I tested both the old fixed-point method and the new secant method after initializing with a representative elevation, rather than the zero datum. The old approach does well with campt when line and sample are 200 or more, when the ray looks more vertically, but it fails for line and sample between 120 and 200 (approx.), when rays are more oblique. The new secant method still does well then. But it breaks down when line and sample are 120 and below (approx.), when one looks closer to the horizon. This was done with dataset NLB_450492465EDR_F0310634NCAM00357M1.cub with the fixed ISD and the same shape file as what @acpaquette prepared.
This is a large change. It is suggested to have a build with and another without this change and make some sanity checks for orbital sensors. This is my last week on CSM work and I will not have much time for a large amount of work if this change has a big impact.
Licensing
This project is mostly composed of free and unencumbered software released into the public domain, and we are unlikely to accept contributions that are not also released into the public domain. Somewhere near the top of each file there should be these words: