
Commit 7875362

[flang] Add the proposal document and rationale for the internal naming module that was previously added.
Summary: This document describes how uniquing of internal names is done. This name uniquing is done to support the constraints and invariants of the FIR dialect of MLIR.

Reviewers: jeanPerier, mehdi_amini, DavidTruby, jdoerfert, sscalpone, kiranchandramohan

Reviewed By: jeanPerier, sscalpone, kiranchandramohan

Subscribers: tskeith, kiranchandramohan, rriddle, llvm-commits

Tags: #llvm, #flang

Differential Revision: https://reviews.llvm.org/D79089
1 parent 5d46e4b commit 7875362

1 file changed: +118 -0 lines changed
@@ -0,0 +1,118 @@
## Bijective Internal Name Uniquing

FIR has a flat namespace. No two objects may have the same name at
the module level. (These would be functions, globals, etc.)
This necessitates some sort of encoding scheme to unique
symbols from the front-end into FIR.

Another requirement is to be able to reverse these unique names and
recover the associated symbol in the symbol table.

Fortran is case insensitive, which allows the compiler to convert the
user's identifiers to all lower case. Such a universal conversion implies
that all upper case letters are available for use in uniquing.
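
Because user identifiers are lowercased, an uppercase letter inside a uniqued name can only be a marker introduced by the scheme. The Python sketch below illustrates how that makes reversal mechanical: it strips the `_Q` prefix and splits the remainder at each run of uppercase letters. The function name `split_uniqued` is hypothetical, and this is not the flang implementation; the sample name is the one derived in the Scope Building section of this document.

```python
import re

# A marker is a run of uppercase letters; its payload is whatever lowercase
# letters, digits, and underscores follow it (possibly nothing).
_TOKEN = re.compile(r"([A-Z]+)([a-z0-9_]*)")

def split_uniqued(name):
    """Split a uniqued name into (marker, payload) pairs (illustrative only)."""
    if not name.startswith("_Q"):
        raise ValueError("not a uniqued name: " + name)
    rest = name[2:]
    pairs = _TOKEN.findall(rest)
    # Make sure the tokens account for every character of the name.
    if "".join(marker + payload for marker, payload in pairs) != rest:
        raise ValueError("unexpected characters in: " + name)
    return pairs

print(split_uniqued("_QMmodSs1modSs2modFsubPfun"))
# [('M', 'mod'), ('S', 's1mod'), ('S', 's2mod'), ('F', 'sub'), ('P', 'fun')]
```

Adjacent markers (for example a two-letter prefix) come back as a single uppercase run, so a real decoder would still need to interpret each run against the prefix tables.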

### Prefix `_Q`

All uniqued names have the prefix sequence `_Q` to indicate the name has
been uniqued. (Q is chosen because it is a
[low-frequency letter](http://pi.math.cornell.edu/~mec/2003-2004/cryptography/subs/frequencies.html)
in English.)

### Scope Building

Symbols can be scoped by the module, submodule, or procedure that contains
that symbol. After the `_Q` sigil, names are constructed from outermost to
innermost scope as

* Module name prefixed with `M`
* Submodule name prefixed with `S`
* Procedure name prefixed with `F`

Given:
```
submodule (mod:s1mod) s2mod
  ...
  subroutine sub
    ...
  contains
    function fun
```

The uniqued name of `fun` becomes:
```
_QMmodSs1modSs2modFsubPfun
```

Here `sub` is an enclosing scope and so takes the `F` prefix, while `fun` is
the procedure being named and takes the `P` prefix described under
Procedures/Subprograms.
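
The construction above can be sketched in a few lines of Python. This is an illustrative sketch, not the flang implementation: the names `SCOPE_MARKERS` and `unique_procedure` are hypothetical, and the caller is assumed to supply the scopes already ordered from outermost to innermost.

```python
# Scope markers from the list above. "P" is the marker this document
# assigns to the procedure being named (see Procedures/Subprograms).
SCOPE_MARKERS = {"module": "M", "submodule": "S", "procedure": "F"}

def unique_procedure(scopes, name):
    """Build a uniqued procedure name (illustrative sketch only).

    scopes: (kind, identifier) pairs ordered outermost to innermost.
    """
    parts = ["_Q"]
    for kind, ident in scopes:
        # Fortran is case insensitive, so identifiers are lowercased first.
        parts.append(SCOPE_MARKERS[kind] + ident.lower())
    parts.append("P" + name.lower())
    return "".join(parts)

scopes = [("module", "mod"), ("submodule", "s1mod"),
          ("submodule", "s2mod"), ("procedure", "sub")]
print(unique_procedure(scopes, "fun"))  # _QMmodSs1modSs2modFsubPfun
print(unique_procedure([], "sub"))      # _QPsub
```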

### Common blocks

* A common block name will be prefixed with `B`

### Module scope global data

* A global data entity is prefixed with `E`
* A global entity that is constant (parameter) will be prefixed with `EC`
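
The document gives no worked example for these prefixes, so the following Python sketch rests on an assumption: the marker is placed after any module scope prefix and directly before the lowercased entity name, in the same way `P` is placed for procedures. The helper names `unique_global` and `unique_common_block` are hypothetical.

```python
def unique_global(module, name, is_constant=False):
    """Name a module-scope data entity (assumed layout, illustrative only)."""
    # "EC" marks a constant (parameter); "E" marks any other global entity.
    marker = "EC" if is_constant else "E"
    return "_QM" + module.lower() + marker + name.lower()

def unique_common_block(name):
    """Name a common block (assumed layout, illustrative only)."""
    return "_QB" + name.lower()

print(unique_global("m", "x"))                    # _QMmEx under this assumption
print(unique_global("m", "pi", is_constant=True)) # _QMmECpi under this assumption
print(unique_common_block("blk"))                 # _QBblk under this assumption
```

Since identifiers are lowercased, the two-letter marker `EC` cannot be confused with `E` followed by an entity name that begins with `c`.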

### Procedures/Subprograms

* A procedure/subprogram is prefixed with `P`

Given:
```
subroutine sub
```
The uniqued name of `sub` becomes:
```
_QPsub
```

### Derived types and related

* A derived type is prefixed with `T`
* If a derived type has KIND parameters, they are listed in a consistent
  canonical order where each takes the form `Ki` and where _i_ is the
  compile-time constant value. (All type parameters are integer.) If _i_
  is a negative value, the prefix `KN` will be used and _i_ will reflect
  the magnitude of the value.

Given:
```
module mymodule
  type mytype
    integer :: member
  end type
  ...
```
The uniqued name of `mytype` becomes:
```
_QMmymoduleTmytype
```

Given:
```
type yourtype(k1,k2)
  integer, kind :: k1, k2
  real :: mem1
  complex :: mem2
end type
```

The uniqued name of `yourtype` where `k1=4` and `k2=-6` (at compile time) becomes:
```
_QTyourtypeK4KN6
```
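
The KIND parameter rule can be sketched as follows. This is a minimal Python illustration, not the flang implementation: `encode_kinds` and `unique_derived_type` are hypothetical names, and the caller is assumed to pass the KIND values already in the canonical order, which this document does not define further.

```python
def encode_kinds(kinds):
    """Encode KIND values as K<i>, or KN<magnitude> when the value is negative."""
    out = []
    for value in kinds:
        out.append(("KN" if value < 0 else "K") + str(abs(value)))
    return "".join(out)

def unique_derived_type(name, kinds=(), module=None):
    """Build a uniqued derived type name (illustrative sketch only)."""
    scope = "M" + module.lower() if module else ""
    return "_Q" + scope + "T" + name.lower() + encode_kinds(kinds)

print(unique_derived_type("yourtype", [4, -6]))          # _QTyourtypeK4KN6
print(unique_derived_type("mytype", module="mymodule"))  # _QMmymoduleTmytype
```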

* A derived type dispatch table is prefixed with `D`. The dispatch table
  for `type t` would be `_QDTt`
* A type descriptor instance is prefixed with `C`. Intrinsic types can
  be encoded with their names and kinds. The type descriptor for the
  type `yourtype` above would be `_QCTyourtypeK4KN6`. The type
  descriptor for `REAL(4)` would be `_QCrealK4`.
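
The three examples above share one pattern: the `D` or `C` marker is inserted directly after `_Q`, in front of the encoding of the type itself. The Python sketch below reproduces only that pattern for unscoped types; how the marker combines with a module scope prefix is not stated here, so the sketch does not attempt it. All function names are hypothetical.

```python
def dispatch_table_name(type_encoding):
    """type_encoding is the text that follows `_Q` in the type's uniqued name."""
    return "_QD" + type_encoding

def type_descriptor_name(type_encoding):
    """Name the type descriptor instance for an already encoded type."""
    return "_QC" + type_encoding

def intrinsic_encoding(type_name, kind):
    """Encode an intrinsic type by its lowercased name and kind, e.g. realK4."""
    return type_name.lower() + "K" + str(kind)

print(dispatch_table_name("Tt"))                            # _QDTt
print(type_descriptor_name("TyourtypeK4KN6"))               # _QCTyourtypeK4KN6
print(type_descriptor_name(intrinsic_encoding("REAL", 4)))  # _QCrealK4
```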

### Compiler-generated names

Compiler-generated names do not have to be mapped back to Fortran. These
names will be prefixed with `_QQ` and followed by a unique
compiler-generated identifier. There is, of course, no mapping back to a
symbol derived from the input source in this case as no such symbol exists.
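
One way to picture such names is a per-compilation counter appended to `_QQ`. The Python sketch below is only an illustration under that assumption: the document does not specify the form of the unique identifier, and the class `GeneratedNames` and helper `is_compiler_generated` are hypothetical.

```python
import itertools

class GeneratedNames:
    """Hand out `_QQ` names from a running counter (assumed scheme)."""

    def __init__(self):
        self._counter = itertools.count()

    def fresh(self):
        # Each call consumes the next integer, so no name is issued twice.
        return "_QQ" + str(next(self._counter))

def is_compiler_generated(name):
    """True when the name carries the `_QQ` prefix described above."""
    return name.startswith("_QQ")

names = GeneratedNames()
first, second = names.fresh(), names.fresh()
print(first, second)                    # _QQ0 _QQ1
print(is_compiler_generated(first))     # True
print(is_compiler_generated("_QPsub"))  # False
```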
