Runtime speed improvements #75
Need to do some speed tests, and also see if there are any changes that can be made to speed up the code. The file reading and/or parsing seems rather slow. See also #37.
Are you using big.json to do the testing?
Yes, see test case 9. I've also got some larger ones that I haven't committed.
OK. I was thinking a good place to start would be to run some of the tests through gprof and gprof2dot. (This will create pretty call graphs with timing info.) gprof doesn’t work on Mac, so I may muck around and see if I can get travis-ci to do it for me. The timings won’t be great since it’s running in a busy VM, but it’s at least a starting point. If you want to muck around with it, you can install gprof2dot with `pip install gprof2dot`.
@jacobwilliams and @zbeekman
@szaghi I am “poorer” than @jacobwilliams: no VS for me. BTW, @szaghi did you see my post on cpp-coveralls about what I think the issue is that you’re having? Any luck getting it to work?
@zbeekman
@szaghi no, we don’t use container-based builds for this project due to the gfortran 4.9 dependency. There is no sane way, that I know of, to use a container-based build and improve the build speed. An insane way would be to:
But I don’t particularly fancy scripting the installation of GCC from source on a remote VM, which is why I don’t think that this is a sane approach. I just have…
@zbeekman thank you. I agree with you: I do not like compiling gfortran locally. Thank you for your suggestions. P.S. If I find the time, I will try to support gprof in FoBiS and improve the gcov support. Reading your nice idea of maintaining a wiki page of untested procedures gave me the idea that this could be accomplished automatically by FoBiS: running the build with the coverage flag would build the targets (your tests in this case) with coverage instrumentation, run the tests themselves, collect the coverage data, and produce a markdown report. In FoBiS it is simple to produce markdown pages... Do you have suggestions?
Hi Stefano, I’m not sure what the current FoBiS.py support for gcov is beyond compiling/linking with coverage flags. I seem to recall you mentioning some additional support of some variety… if so, can you remind me what the additional functionality does? As for generating markdown pages for the coverage info… that is quite an interesting idea! I guess ideally it should be stylistically compatible with, or similar looking to, FORD, since FoBiS.py and FORD use similar Markdown engines, yes? (Of course the ability to customize the appearance, for example to be more similar to ROBODoc output, would be wonderful.) As far as the actual output content goes, it would be great to:
Hi Izaak, I’ll try to summarize some points.

**FoBiS.py coverage support**

Presently, there are a…
**Unexecuted procedures**

First of all, FoBiS.py does not use any markdown inputs or outputs. FORD uses markdown for its input and Jinja2 as the template engine for its HTML outputs. I said that it is simple for FoBiS.py to generate markdown reports because I have strong experience with Python+markdown from another project of mine, MaTiSSe. Consequently, I could very quickly integrate a markdown generator into FoBiS if necessary. As far as the algorithm for finding uncovered procedures is concerned, I am very confused. Yesterday, in the few minutes of my lunch break... I tried to extract the uncovered information by means of different tools, but it was very frustrating:
Indeed, my failure is surely due to the few minutes I spent testing the tools, but I think I need your help to find a simple way to extract the uncovered-procedures information:
Today I should find some more minutes to dig into this deeper. Thank you for your help.
Yes, I think this is a good idea. I was confused and surprised a few times when coverage info wasn’t created after adding the coverage flags.
I’m not sure what a good way to do this would be. Who knows if the targets will require special command line arguments or input files in a particular location?
On my system there are two ways to get info about procedures. If you check out the master branch of json-fortran and then…
Note that the gcov man page says that percentages will never be listed as exactly 0 or 100 unless NONE or ALL of the lines are executed, respectively. Either way, my suggestions will very likely be quite challenging to implement, since the output of gcov needs to be parsed to extract the relevant information. I think this is what cpp-coveralls does, which then encodes all this info in a JSON payload to send to coveralls.io. FoBiS.py is a great tool, and anything beyond what you have already done is extra, in my mind. If you are up for this challenge, great, but if it’s too complicated, it doesn’t add that much compared to the other great functionality you’ve already implemented. I just thought I’d give you my “Christmas wish list” for what the ideal markdown output would look like.
I wouldn’t worry too much about gprof and callgrind. I think that extracting coverage information should be done with gcov or similar. Profiling info can be nice, but beyond sending gprof output through gprof2dot and then Graphviz’s dot…
As discussed above, the best way is to parse the *.f90.gcov file for function and subroutine declarations preceded by the `#####` (never executed) marker; see the sketch below.
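Something like this rough sketch (in Fortran rather than Python, with an invented file name; FoBiS.py would do the equivalent). It flags declaration lines whose gcov execution count is `#####` and prints each one as a markdown list item:

```fortran
program find_unexecuted
  ! Rough sketch: list procedures whose declaration line has a "#####"
  ! execution count in a gcov report (i.e. the line was never executed).
  implicit none
  character(len=512) :: line
  integer :: lun, ios, colon
  ! the file name here is just an example
  open(newunit=lun, file='json_module.f90.gcov', status='old', action='read')
  do
    read(lun, '(A)', iostat=ios) line
    if (ios /= 0) exit
    ! gcov lines look like "    count:  lineno:   source"
    colon = index(line, ':')
    if (colon == 0) cycle
    if (index(line(1:colon-1), '#####') == 0) cycle  ! executed, or not executable
    if (index(line, ' function ') > 0 .or. index(line, ' subroutine ') > 0) then
      ! emit a markdown list item (includes the source line number);
      ! a real version would also skip "end function"/"end subroutine"
      ! lines, comments, etc.
      write(*, '(A)') '- '//trim(adjustl(line(colon+1:)))
    end if
  end do
  close(lun)
end program find_unexecuted
```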
I’m not sure whether or not this is critical. It should be clear to the user how to find the procedure in the source, since functions and subroutines cannot have the same name in the same scoping unit. However, if you parse the *.f90.gcov file, the line with the function or subroutine declaration will include exactly that information.
No, ROBODoc doesn’t know anything about markdown or do any parsing of markdown. What I meant was that it would be nice to be able to have output in a format that is stylistically compatible with other documentation, whether it is FORD or ROBODoc. The ability to have HTML output would also be great; then the…
I’m happy to help, but as I mentioned above, all of these are bonus features. FoBiS.py is great as is! Only work on this if you are inspired to do so and have the time to spare.
Perfect! I never took a look at the .gcov outputs; I had wrongly supposed that there was only line coverage, without info about procedures... Your Christmas wish list will be completed before next Christmas... It should be very simple to parse the .gcov files. As a first step I will try to produce a very simple markdown report; in a second phase I will try to support other formats. Following your wise suggestions, I am freezing (for the moment) any other experiments with gprof (indeed I have already added a…

Tomorrow I hope to push a first FoBiS.py version with your…

P.S. The missing coverage flags in the cflags inheritance is really a bug! Thank you very much!
You guys have taken over my issue ticket! Ha ha! Notes on the original topic: two things are making the parsing generally way too slow…
I have some fixes in the works for both of these that greatly speed it up... but I'm still doing some experiments, and there are some other things I want to try. Stay tuned.
sorry 😥 but looking forward to seeing what you come up with.
Update: to parse the big.json file (~7 MB) on my laptop:
So there is some promise here. Still working on it. I need to see how it can work with unicode files (I'm using an unformatted stream read).
@jacobwilliams Please forgive me for polluting your issue... For the pie chart I must select the best markdown plug-in. The per-procedure analysis (metrics inside each procedure) will come later. See you soon.
awesome, looks great!
@jacobwilliams I did some more research on unicode/utf-8 encoding as it pertains to speeding up reading in large files. (I also posted some more questions on your Intel forums thread.) One thing that may be helpful to us in diagnosing whether files have utf-8 encoded characters is the inquire statement…

A reference-counting-like scheme could be used to keep track of how many sub-objects contain non-ascii characters, which can then be used to determine whether or not to use ‘utf-8’ encoding if the object is ever written out to a file.

I think it is possible to use unformatted stream IO with utf-8 files, by reading into a string of 1-byte characters. Then you can look at the leading bit of each byte to spot non-ascii characters, as in the sketch below.
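Something along these lines (just a sketch; `raw` stands for the buffer filled by the stream read):

```fortran
! Sketch: detect whether a buffer read via unformatted stream IO
! contains any non-ascii bytes. In UTF-8 every byte of a multi-byte
! sequence has its leading (high) bit set, so a single pass suffices.
pure function has_non_ascii(raw) result(found)
  implicit none
  character(len=*), intent(in) :: raw  ! buffer from a stream read
  logical :: found
  integer :: i
  found = .false.
  do i = 1, len(raw)
    ! bit 7 is the leading bit of the byte; iachar gives 0-255 on
    ! gfortran/ifort (strictly, values > 127 are processor dependent)
    if (btest(iachar(raw(i:i)), 7)) then
      found = .true.
      return
    end if
  end do
end function has_non_ascii
```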
Here’s another thought in terms of managing speed and UTF8/UCS4:
Well, those were all nice theories, but after doing some tests, it seems (at least on Mac OS X) that there is no way to determine the encoding or contents of a file simply using the inquire statement. So it seems that, other than examining the leading bit of each byte read in (which I imagine would be slow… or at least difficult to code), there is no good way other than formatted IO to deal with UCS4 characters. So, in light of this, this might be a good thing to let the client code control: provide a safe but slow API to read files with non-ascii, UCS4 characters in them, as well as a fast API that will read files containing only ASCII characters correctly, but will fail for UCS4. Something like the sketch below. Thoughts?
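To make that concrete, here is a purely hypothetical sketch of what the split could look like (the names are invented, not the actual json-fortran API, and it assumes the compiler supports ISO 10646 characters, as gfortran does):

```fortran
! Hypothetical sketch only: these names are invented, not json-fortran's API.
module json_load_sketch
  implicit none
  integer, parameter :: CK = selected_char_kind('ISO_10646')  ! assumes UCS4 support
contains

  ! Fast path: one unformatted stream read of the whole file.
  ! Correct only for pure-ASCII files; multi-byte UTF-8 will be mangled.
  subroutine json_load_ascii(filename, raw)
    character(len=*), intent(in) :: filename
    character(len=:), allocatable, intent(out) :: raw
    integer :: lun, sz
    open(newunit=lun, file=filename, access='stream', form='unformatted', status='old')
    inquire(unit=lun, size=sz)               ! file size in bytes
    allocate(character(len=sz) :: raw)
    read(lun) raw                            ! one shot: no per-line overhead
    close(lun)
  end subroutine json_load_ascii

  ! Safe path: formatted, line-by-line read with encoding='utf-8'.
  ! Slower, but UCS4 characters are decoded correctly by the runtime.
  subroutine json_load_unicode(filename, raw)
    character(len=*), intent(in) :: filename
    character(kind=CK, len=:), allocatable, intent(out) :: raw
    character(kind=CK, len=4096) :: line     ! assumes lines < 4096 chars
    integer :: lun, ios
    raw = CK_''
    open(newunit=lun, file=filename, form='formatted', encoding='utf-8', status='old')
    do
      read(lun, '(A)', iostat=ios) line
      if (ios /= 0) exit
      raw = raw // trim(line) // new_line(line)
    end do
    close(lun)
  end subroutine json_load_unicode

end module json_load_sketch
```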
btw, here is the code I was using to test:

```fortran
program main
  use iso_fortran_env, only: file_storage_size
  implicit none
  ! Use ISO 10646 (UCS4) characters if supported, otherwise the default kind
  integer, parameter :: CK = merge(tsource=selected_char_kind('ISO_10646'), &
                                   fsource=selected_char_kind('DEFAULT'), &
                                   mask=selected_char_kind('ISO_10646') /= -1)
  integer :: lun
  character(kind=CK,len=:), allocatable :: ucs4str
  character(len=:), allocatable :: plainstr
  character(len=16) :: strm, enc, frm, unfrm
  integer :: sz

  ucs4str  = CK_'Hello World!'
  plainstr = 'Hello ascii world!'

  open(newunit=lun,file='utf8.txt',access='stream',encoding='utf-8',form='formatted')
  write(lun,'(A)') ucs4str
  close(lun)

  open(newunit=lun,file='ascii.txt',access='stream',form='formatted')
  write(lun,'(A)') plainstr
  close(lun)

  ucs4str = CK_'\u3053\u3093\u306b\u3061\u306f\u4e16\u754c' ! hello world in Japanese
  open(newunit=lun,file='hello-jp-utf8.txt',access='stream',encoding='utf-8',form='formatted')
  write(lun,'(A)') ucs4str
  close(lun)

  open(newunit=lun,file='ascii-utf-8.txt',access='stream',encoding='utf-8',form='formatted')
  write(lun,'(A)') plainstr
  close(lun)

  ! report what INQUIRE says about each file
  print'(A)','stream '//'encoding '//'formatted '//'unformatted '//'size '//'file'
  inquire(file='utf8.txt',stream=strm,encoding=enc,formatted=frm,unformatted=unfrm,size=sz)
  write(*,'(A,I4,A)') strm//enc//frm//unfrm, sz, ' utf8.txt'
  inquire(file='ascii.txt',stream=strm,encoding=enc,formatted=frm,unformatted=unfrm,size=sz)
  write(*,'(A,I4,A)') strm//enc//frm//unfrm, sz, ' ascii.txt'
  inquire(file='hello-jp-utf8.txt',stream=strm,encoding=enc,formatted=frm,unformatted=unfrm,size=sz)
  write(*,'(A,I4,A)') strm//enc//frm//unfrm, sz, ' hello-jp-utf8.txt'
  inquire(file='ascii-utf-8.txt',stream=strm,encoding=enc,formatted=frm,unformatted=unfrm,size=sz)
  write(*,'(A,I4,A)') strm//enc//frm//unfrm, sz, ' ascii-utf-8.txt'
end program main
```

If you compile with gfortran, add the -fbackslash flag so the \u escape sequences in the string literals are interpreted.
This is still on my to-do list... just haven't had time to do it yet.
no worries, I’ve been slammed too
Update: Good news and bad news! I'm getting this all merged together now (see the unicode-speed branch)... not finished yet, but I think I have most of what I did a couple months ago working again. It seems to compile/run fine with the gfortran 5.0 build I was using before. However, I also updated my laptop with the latest gfortran 5.1 build (from http://coudert.name) and with that one the unicode tests all fail (even for the master branch). Haven't looked into it in detail yet...
Oh, nice find: gfortran 5.1 for OS X. I’ll try to take a look at the test failures in a few days. I’m guessing this is a regression on gcc’s part; if it is, it would be great to get them a bug report & reproducer ASAP so they can fix it before the compiler is released.
I went ahead and merged this into master. Note: the STREAM read mode is not enabled when using Unicode... still need to look into that further.
I think that due to the nature of UTF-8 encoding, other than rolling your own UTF-8 parser, it will be hard/impossible to use stream reads for unicode, since UTF-8 encoding is variable length per character in order to maintain backwards compatibility with ASCII (see the sketch below). Not that I’m trying to discourage you, but I think at the end of the day we need compiler support for fast reads of UTF-8 encoded characters…
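For reference, the lead byte alone determines how many continuation bytes follow, so a stream reader cannot know character boundaries without inspecting every byte. A sketch of the rules (illustrative only, not project code):

```fortran
! Sketch of UTF-8's variable-length layout:
!   0xxxxxxx                      1 byte  (plain ASCII)
!   110xxxxx 10xxxxxx             2 bytes
!   1110xxxx 10xxxxxx 10xxxxxx    3 bytes
!   11110xxx 10xxxxxx (x3)        4 bytes
pure function utf8_seq_len(lead) result(n)
  implicit none
  character(len=1), intent(in) :: lead   ! first byte of a sequence
  integer :: n, b
  b = iachar(lead)           ! assumes 0-255, as on gfortran/ifort
  if (b < 128) then          ! 0xxxxxxx
    n = 1
  else if (b >= 240) then    ! 11110xxx
    n = 4
  else if (b >= 224) then    ! 1110xxxx
    n = 3
  else if (b >= 192) then    ! 110xxxxx
    n = 2
  else                       ! 10xxxxxx: continuation byte, not a lead byte
    n = 0
  end if
end function utf8_seq_len
```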