Skip to content

test cholesky_solve ... FAILED #91

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
termoshtt opened this issue Oct 26, 2017 · 7 comments
Closed

test cholesky_solve ... FAILED #91

termoshtt opened this issue Oct 26, 2017 · 7 comments

Comments

@termoshtt
Copy link
Member

Sometimes this test fails
https://app.wercker.com/termoshtt/ndarray-linalg/runs/test-openblas/59f195128e888f0001cf3ef5?step=59f19584791109000146172c

@termoshtt
Copy link
Member Author

termoshtt commented Jan 12, 2018

Error in #98
https://app.wercker.com/termoshtt/ndarray-linalg/runs/test-openblas/5a576c74f591740001562d94?step=5a576c9dda96a800019a4d42

thread 'rcond' panicked at 'called `Result::unwrap()` on an `Err` value: 0.23086353780922303'
---- rcond stdout ----
	thread 'rcond' panicked at 'called `Result::unwrap()` on an `Err` value: 0.23086353780922303', /checkout/src/libcore/result.rs:906:4
note: Some details are omitted, run with `RUST_BACKTRACE=full` for a verbose backtrace.
stack backtrace:
   0: std::sys::imp::backtrace::tracing::imp::unwind_backtrace
             at /checkout/src/libstd/sys/unix/backtrace/tracing/gcc_s.rs:49
   1: std::sys_common::backtrace::_print
             at /checkout/src/libstd/sys_common/backtrace.rs:69
   2: std::panicking::default_hook::{{closure}}
             at /checkout/src/libstd/sys_common/backtrace.rs:58
             at /checkout/src/libstd/panicking.rs:381
   3: std::panicking::default_hook
             at /checkout/src/libstd/panicking.rs:391
   4: std::panicking::rust_panic_with_hook
             at /checkout/src/libstd/panicking.rs:577
   5: std::panicking::begin_panic
             at /checkout/src/libstd/panicking.rs:538
   6: std::panicking::begin_panic_fmt
             at /checkout/src/libstd/panicking.rs:522
   7: rust_begin_unwind
             at /checkout/src/libstd/panicking.rs:498
   8: core::panicking::panic_fmt
             at /checkout/src/libcore/panicking.rs:71
   9: core::result::unwrap_failed
             at /checkout/src/libcore/macros.rs:23
  10: <core::result::Result<T, E>>::unwrap
             at /checkout/src/libcore/result.rs:772
  11: solve::rcond
             at tests/solve.rs:40
  12: <F as test::FnBox<T>>::call_box
             at /checkout/src/libtest/lib.rs:1480
             at /checkout/src/libcore/ops/function.rs:223
             at /checkout/src/libtest/lib.rs:141
  13: __rust_maybe_catch_panic
             at /checkout/src/libpanic_unwind/lib.rs:99

@davenza
Copy link

davenza commented Apr 27, 2018

If it helps, this is an instance problem that always fails. Copied from the cholesky test, but without randomness:

use ndarray::{arr1,arr2};
#[test]
fn cholesky_solve_fail() {

    let a: Array2<f64> = arr2(&[[2.7518138512052546, 0.0, 0.0],
                                    [-3.0030534432813334, 0.5503895871967794, 0.0],
                                    [0.0, 0.0, 0.4472135954999579]]);
    let x: Array1<f64> = arr1(&[-0.3633966736383765772089, -9.2503560214093027980198, 0.0]);
    let b = a.dot(&x);
    println!("a = \n{:#?}", a);
    println!("x = \n{:#?}", x);
    println!("b = \n{:#?}", b);
    println!("Solution = \n{:#?}", &a.solvec(&b).unwrap());
    assert_close_l2!(&a.solvec(&b).unwrap(), &x, 1e-9);
    assert_close_l2!(&a.solvec_into(b.clone()).unwrap(), &x, 1e-9);
    assert_close_l2!(&a.solvec_inplace(&mut b.clone()).unwrap(), &x, 1e-9);
    assert_close_l2!(&a.factorizec(UPLO::Upper).unwrap().solvec(&b).unwrap(), &x, 1e-9);
    assert_close_l2!(&a.factorizec(UPLO::Lower).unwrap().solvec(&b).unwrap(), &x, 1e-9);
    assert_close_l2!(&a.factorizec(UPLO::Upper).unwrap().solvec_into(b.clone()).unwrap(), &x, 1e-9);
    assert_close_l2!(&a.factorizec(UPLO::Lower).unwrap().solvec_into(b.clone()).unwrap(), &x, 1e-9);
    assert_close_l2!(&a.factorizec(UPLO::Upper).unwrap().solvec_inplace(&mut b.clone()).unwrap(), &x, 1e-9);
    assert_close_l2!(&a.factorizec(UPLO::Lower).unwrap().solvec_inplace(&mut b.clone()).unwrap(), &x, 1e-9);
}

Output:

---- cholesky_solve_fail stdout ----
	a = 
[[2.7518138512052546, 0.0, 0.0],
 [-3.0030534432813334, 0.5503895871967794, 0.0],
 [0.0, 0.0, 0.4472135954999579]] shape=[3, 3], strides=[3, 1], layout=C (0x1)
x = 
[-0.3633966736383766, -9.250356021409303, 0.0] shape=[3], strides=[1], layout=C | F (0x3)
b = 
[-1.0, -4.0, 0.0] shape=[3], strides=[1], layout=C | F (0x3)
Solution = 
[-0.36339667363837663, -7.26757935296819, 0.0] shape=[3], strides=[1], layout=C | F (0x3)
thread 'cholesky_solve_fail' panicked at 'called `Result::unwrap()` on an `Err` value: 0.21418077766308938', libcore/result.rs:945:5
note: Run with `RUST_BACKTRACE=1` for a backtrace.

First (and third) component are OK though.

@jturner314
Copy link
Member

@davenza Thanks for taking a look at this. This particular example fails because a is asymmetric. (Solving with Cholesky decomposition requires a Hermitian (or real symmetric) positive definite coefficient matrix.) Did you get that a matrix from running the cholesky_solve test (which calls random_hpd to generate a random Hermitian positive-definite matrix), or did you create it yourself? In other words, is random_hpd the real problem here?

This does bring up another question: In this particular example, a bad value is returned since a is asymmetric. Should we explicitly check if a is Hermitian and positive definite, and if not return an Err?

@davenza
Copy link

davenza commented Apr 27, 2018

@jturner314 Thanks, I didn't notice the symmetric requirement. I was trying to do a different thing, and I think I misunderstood the documentation.

I had a covariance matrix (not created with random_hpd), let's call U:

[7.572479471685095, -8.263844061131207, 0.0]
[-8.263844061131206, 9.321258680898515, 0.0]
[0.0, 0.0, 0.2]

I found the lower cholesky decomposition U = LL^T. L is the matrix that I used in the code above:

[2.7518138512052546, 0.0, 0.0]
[-3.0030534432813334, 0.5503895871967794, 0.0]
[0.0, 0.0, 0.4472135954999579]

Then, I tried to solve Lx = [1 4 0] using solvec.


However, related with the problem: while I was doing some cargo test runs, I sometimes had errors. I could try to reproduce the errors again, printing a failing matrix if it is interesting to you.

@jturner314
Copy link
Member

Fwiw, if you need both the triangular factor (L matrix) and a solution (with .solvec()), you can factor once with .factorizec() or .factorizec_into() to get a CholeskyFactorized instance. You can then solve, take the inverse, take the determinant, get the lower/upper triangular factor, etc., using the CholeskyFactorized instance without having to factor the matrix multiple times.

Regarding the test failures: I think they're due to floating-point errors. (In the cases I've observed the tests failing, the errors have always been pretty small. If you see a case where the error is large, please post it here.) One idea to minimize issues associated with floating-point errors in the tests is to throw out randomly-generated matrices that are ill-conditioned (and generate new ones that are well-conditioned), but I haven't had a chance to try this.

@davenza
Copy link

davenza commented May 3, 2018

In two of my executions of the cholesky_solve() test, the test failed with these matrices:

Fail 1

---- cholesky_solve stdout ----
a =
[[0.33597532169190203, -0.8317558045767911, -0.33804679493344636],
[-0.8317558045767911, 7.026626194373293, 5.091368342750945],
[-0.33804679493344636, 5.091368342750945, 6.387675256291089]] shape=[3, 3], strides=[3, 1], layout=C (0x1)
x =
[-1.3640677969149184, 1.4365643322080148, -0.2864128482272794] shape=[3], strides=[1], layout=C | F (0x3)
a =
[[3.147686, -0.6140107, -1.1712251],
[-0.6140107, 3.554442, -0.27945665],
[-1.1712251, -0.27945665, 0.51092684]] shape=[3, 3], strides=[3, 1], layout=C (0x1)
x =
[0.2852047, -0.2701597, -0.70091474] shape=[3], strides=[1], layout=C | F (0x3)
thread 'cholesky_solve' panicked at 'called Result::unwrap() on an Err value: 0.0027820447', libcore/result.rs:945:5``

Fail 2 (with backtrace)

---- cholesky_solve stdout ----
a =
[[1.2219888536170944, -1.3044421701103923, 1.0844884489891133],
[-1.3044421701103923, 1.622040998971165, -0.25782460724863565],
[1.0844884489891133, -0.25782460724863565, 4.503214907761472]] shape=[3, 3], strides=[3, 1], layout=C (0x1)
x =
[2.3896768802521486, -0.5822467373409934, 1.1002035853153536] shape=[3], strides=[1], layout=C | F (0x3)
a =
[[5.918905, 4.401211, 1.5132039],
[4.401211, 4.052539, -0.07915869],
[1.5132039, -0.07915869, 2.2470446]] shape=[3, 3], strides=[3, 1], layout=C (0x1)
x =
[-0.01866009, -0.6332641, -0.04560638] shape=[3], strides=[1], layout=C | F (0x3)
thread 'cholesky_solve' panicked at 'called Result::unwrap() on an Err value: 0.00379905', libcore/result.rs:945:5
note: Some details are omitted, run with RUST_BACKTRACE=full for a verbose backtrace.
stack backtrace:
0: std::sys::unix::backtrace::tracing::imp::unwind_backtrace
at libstd/sys/unix/backtrace/tracing/gcc_s.rs:49
1: std::sys_common::backtrace::_print
at libstd/sys_common/backtrace.rs:71
2: std::panicking::default_hook::{{closure}}
at libstd/sys_common/backtrace.rs:59
at libstd/panicking.rs:380
3: std::panicking::default_hook
at libstd/panicking.rs:390
4: std::panicking::rust_panic_with_hook
at libstd/panicking.rs:576
5: std::panicking::begin_panic
at libstd/panicking.rs:537
6: std::panicking::begin_panic_fmt
at libstd/panicking.rs:521
7: rust_begin_unwind
at libstd/panicking.rs:497
8: core::panicking::panic_fmt
at libcore/panicking.rs:71
9: core::result::unwrap_failed
at /checkout/src/libcore/macros.rs:23
10: <core::result::Result<T, E>>::unwrap
at /checkout/src/libcore/result.rs:782
11: cholesky::cholesky_solve
at tests/cholesky.rs:146
12: <F as alloc::boxed::FnBox>::call_box
at libtest/lib.rs:1462
at /checkout/src/libcore/ops/function.rs:223
at /checkout/src/liballoc/boxed.rs:788
13: __rust_maybe_catch_panic
at libpanic_unwind/lib.rs:102

It seems that both failures happen with the f32 type and due to floating-point errors, as you said.

@termoshtt
Copy link
Member Author

One idea to minimize issues associated with floating-point errors in the tests is to throw out randomly-generated matrices that are ill-conditioned

It will be a right way. Since the implementations of LAPACK are trusted, we do not have to check the singular cases.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants