Skip to content

Work on symbolic accesses to structure fields #162

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 34 commits into from
Feb 25, 2018

Conversation

pkgw
Copy link
Collaborator

@pkgw pkgw commented Feb 25, 2018

To understand the line_break routine so that I can futz with it, it helps a lot to turn opaque memory_word references into named structure field accesss. As far as I can tell, this has to largely be done manually. So, let's just get started.

Coccinelle helps a lot. Almost none of the field accesses are uniquely interpretable in the current code on their own, but contextual hints can be very useful for figuring out what's being done.

pkgw added 30 commits February 1, 2018 10:03
Researching how to do this will be generally useful for cleaning the codebase,
and will help me be more confident in how to futz with this routine for
semantic pagebreaking.
…eferences

I've started exploring Coccinelle since it seems like it should be a great
refactoring tool. But, I have a PhD in astrophysics and I still find the
documentation to be confusing. I did get this example patch to work, though:

```
@@
expression E;
@@

- mem[E + 1].b32.s1 = INF_PENALTY
+ PENALTY_NODE_penalty(E) = INF_PENALTY
```

Applied with:

```
spatch --sp-file structs.sp --in-place --dir tectonic
```
When Coccinelle works, it works well!

```
@@
expression E;
identifier I =~ ".*_NODE";
@@

- mem[E].b16.s1 = I
+ NODE_type(E) = I

@@
expression E;
binary operator O;
identifier I =~ ".*_NODE";
@@

- mem[E].b16.s1 O I
+ NODE_type(E) O I
```

Run as:

```
spatch --sp-file structs.sp --in-place --dir tectonic
```
**The following Coccinelle patch was over-eager** and matched some expressions
that were clearly NODE_subtypes:

```
@@
expression E;
binary operator O;
@@

- mem[E].b16.s0 O NORMAL
+ GLUE_SPEC_shrink_order(E) O NORMAL

@@
expression E;
@@

- mem[E].b16.s0 = NORMAL
+ GLUE_SPEC_shrink_order(E) = NORMAL
```
Almost all usages of MIN_HALFWORD in the original WEB code come through the
NULL macro, which defines a NULL "pointer" into the memory array. I've tried
to preserve the places that genuinely use the MIN_HALFWORD symbolic constant;
there are only a dozen or so.
Thanks Coccinelle!

```
@@
expression E;
@@

- E >= hi_mem_min
+ is_char_node(E)
```
```
@@
expression E;
@@

- f = mem[E + 1].b16.s1;
+ f = LIGATURE_NODE_lig_font(E);
```
```
@@
expression E;
@@

- f = mem[E].b16.s1;
+ f = CHAR_NODE_font(E);
```
```
@@
expression E;
@@

- c = mem[E].b16.s0;
+ c = CHAR_NODE_character(E);
```
```
@@
expression E1, E2, E3;
@@

- effective_char(E1, E2, mem[E3 + 1].b16.s0)
+ effective_char(E1, E2, LIGATURE_NODE_lig_char(E3))
```
```
@@
expression E1, E2, E3;
@@

- effective_char(E1, E2, mem[E3].b16.s0)
+ effective_char(E1, E2, CHAR_NODE_character(E3))
```
It's packed in a funky way. This is a start:

```
@@
expression E1, E2;
@@

- font_info[char_base[E1] + E2]
+ FONT_CHARACTER_INFO(E1, E2)
```
We interact with the existing code more smoothly if the charinfo is the
`memory_word.b32` member, not `memory_word` itself.
Some manual tweaking, plus:

```
@@
expression F, I;
@@

- font_info[height_base[F] + (I.s2) / 16].b32.s1
+ FONT_CHARINFO_HEIGHT(F, I)
```
Again:

```
@@
expression F, I;
@@

- font_info[depth_base[F] + (I.s2) % 16].b32.s1
+ FONT_CHARINFO_DEPTH(F, I)
```
Again:

```
@@
expression F, I;
@@

- font_info[italic_base[F] + (I.s1) / 4].b32.s1
+ FONT_CHARINFO_ITALCORR(F, I)
```
```
@@
expression E;
@@

- delete_glue_ref(mem[E + 4].b32.s1)
+ delete_glue_ref(INSERTION_NODE_split_top_ptr(E))
```
```
@@
expression E;
@@

- mem[E + 6].gr
+ BOX_glue_set(E)
```
pkgw added 4 commits February 25, 2018 13:39
```
@@
expression E;
binary operator O;
@@

- mem[E].b16.s0 O EXPLICIT
+ NODE_subtype(E) O EXPLICIT

@@
expression E;
@@

- mem[E].b16.s0 = EXPLICIT;
+ NODE_subtype(E) = EXPLICIT;
```
With:

```
@@
expression E;
binary operator O;
@@

- mem[E].b16.s0 O SPACE_ADJUSTMENT
+ NODE_subtype(E) O SPACE_ADJUSTMENT

@@
expression E;
@@

- mem[E].b16.s0 = SPACE_ADJUSTMENT;
+ NODE_subtype(E) = SPACE_ADJUSTMENT;

@@
expression E;
binary operator O;
@@

- mem[E].b16.s0 O ACC_KERN
+ NODE_subtype(E) O ACC_KERN

@@
expression E;
@@

- mem[E].b16.s0 = ACC_KERN;
+ NODE_subtype(E) = ACC_KERN;
```
@pkgw
Copy link
Collaborator Author

pkgw commented Feb 25, 2018

Rust stable CI failure due to this bug, triggered by this PR. Looks like it can safely be ignored for now. (Although it prevents us from getting our code coverage report.)

@pkgw pkgw merged commit f8918d4 into tectonic-typesetting:master Feb 25, 2018
@pkgw pkgw deleted the symbolic branch February 25, 2018 20:02
@Mrmaxmeier
Copy link
Contributor

Nice! 👍
Slowly but steadily approaching the original code quality.

Mrmaxmeier pushed a commit to Mrmaxmeier/tectonic that referenced this pull request Oct 28, 2019
…or-486

xetex get_date_and_time: pull from SOURCE_DATE_EPOCH
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants