Extremely slow output of julia arrays in Python with juliacall #263

Closed
VPetukhov opened this issue Feb 3, 2023 · 3 comments

VPetukhov commented Feb 3, 2023

I'm wrapping a Julia package for use in a Python Jupyter notebook with juliacall. However, the data returned from Julia can't be displayed in Python in reasonable time. The minimal example below takes 19 s to render when run in console IPython (and even longer in VSCode, for whatever reason):

from juliacall import Main as jl
jl.rand(10000, 1000)

Same goes for large DataFrames.

Julia version 1.8.5, juliacall version 0.9.7, Python v3.8.0.

Update
From what I understand, the problem comes from pyjlany_repr, which calls repr, which interpolates the whole array into a single string:

import juliacall
len(juliacall.PythonCall.repr(jl.rand(10000, 1000)))

Out[12]: 192706630

So the cost of rendering grows linearly with array size. On the other hand, calling juliacall.PythonCall.display(jl.rand(10000, 1000)) works perfectly fine and at a reasonable speed.
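To illustrate the linear growth without needing Julia at all, here is a minimal sketch that uses a plain Python nested list as a stand-in for the Julia matrix. The helper name `full_repr_length` is made up for this example; it only measures how long an untruncated repr string gets.

```python
import random

random.seed(0)  # fixed seed so the numbers are reproducible


def full_repr_length(rows, cols):
    """Length of the full, untruncated repr of a rows x cols nested list."""
    data = [[random.random() for _ in range(cols)] for _ in range(rows)]
    return len(repr(data))


small = full_repr_length(100, 10)    # 1,000 elements
large = full_repr_length(1000, 10)   # 10,000 elements
ratio = large / small                # close to 10: ten times the data, ten times the text

print(small, large, round(ratio, 2))
```

Every element contributes a roughly constant number of characters, so a 10000 x 1000 array ends up as a string of hundreds of millions of characters before anything is shown.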

Is there any way to override this behavior? Honestly, I'd even be fine if we could disable output of Julia objects altogether and require them to be converted to Python first. But the current situation is extremely unfriendly to potential users of my package, as any accidental output of a Julia object causes the REPL to hang without any possibility to even interrupt it...
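One possible user-side workaround, sketched here under assumptions: wrap returned objects in a small class whose `__repr__` never stringifies the contents. The `Summarized` class and its `unwrap` method are hypothetical and not part of juliacall; the sketch also assumes the wrapped object supports `len()`, and falls back to the type name alone when it does not. A plain list stands in for the Julia array.

```python
class Summarized:
    """Hypothetical wrapper (not part of juliacall) with a cheap repr.

    The repr shows only the type name and, when available, the length,
    so accidentally echoing the object in a REPL stays fast.
    """

    def __init__(self, obj):
        self._obj = obj

    def __repr__(self):
        name = type(self._obj).__name__
        try:
            size = f", len={len(self._obj)}"
        except TypeError:
            # Object has no length; show the type name only.
            size = ""
        return f"<{name}{size}; call .unwrap() for the object>"

    def unwrap(self):
        """Return the wrapped object unchanged."""
        return self._obj


big = Summarized(list(range(1_000_000)))
text = repr(big)
print(text)
```

A package wrapping Julia code could return such wrappers from its public functions, at the cost of users having to call `unwrap()` before working with the underlying object.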

Collaborator

cjdoris commented Feb 9, 2023

Right now there's no override, but I'm open to changing the implementation of __repr__ to return a truncated representation instead.

Note that list(range(1000000)) will also print a ton of junk, so this isn't unprecedented behaviour, but I agree that the truncated output that numpy returns is nicer. Plus you can always call jl.repr(x) if you really want the untruncated string.
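For comparison on the pure-Python side, the standard library's `reprlib` module already produces truncated representations of large containers. This is only a sketch of what truncated output looks like for a list; it says nothing about how juliacall implements `__repr__`.

```python
import reprlib

values = list(range(1_000_000))

# Default limits: at most 6 list items are shown, then an ellipsis.
short = reprlib.repr(values)
print(short)  # [0, 1, 2, 3, 4, 5, ...]

# The limits are configurable on a Repr instance.
custom = reprlib.Repr()
custom.maxlist = 3
shorter = custom.repr(values)
print(shorter)  # [0, 1, 2, ...]
```

The full string is still available through the built-in `repr(values)` when it is really wanted.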

Collaborator

cjdoris commented Feb 9, 2023

FYI you can do jl.rand(1000,1000)._jl_display() to display the object as you would in the Julia REPL.

Collaborator

cjdoris commented Feb 28, 2023

This has been implemented on the main branch, to appear in the next release.

cjdoris closed this as completed Feb 28, 2023