Commit 00c545b

Merge pull request #281 from WorksApplications/pre/v0.6.9
Pre v0.6.9
2 parents ea794e3 + aa72b20 commit 00c545b

16 files changed, +457 -311 lines changed

.github/workflows/build-python-wheels.yml

Lines changed: 1 addition & 1 deletion
@@ -69,7 +69,7 @@ jobs:
     strategy:
       matrix:
         os: [windows-latest, macOS-latest]
-        python-version: [ "3.8", "3.9", "3.10", "3.11", "3.12" ]
+        python-version: [ "3.9", "3.10", "3.11", "3.12", "3.13" ]

     steps:
       - uses: actions/checkout@v4

.github/workflows/python-upload-test.yml

Lines changed: 25 additions & 15 deletions
@@ -19,10 +19,10 @@ jobs:
       - name: Install dependencies
         run: |
           python -m pip install --upgrade pip
-          python -m pip install --upgrade setuptools setuptools-rust build
+          python -m pip install --upgrade setuptools setuptools-rust build packaging

-      - name: Make .devXX version
-        run: python ./python/latest_dev_version.py
+      - name: Modify version for TestPyPI upload
+        run: python ./python/modify_version_for_testpypi.py

       - name: Build sdist
         working-directory: ./python
@@ -52,8 +52,18 @@ jobs:
             target/
           key: ${{ runner.os }}-cargo-${{ hashFiles('**/Cargo.lock') }}

-      - name: Make .devXX version
-        run: python ./python/latest_dev_version.py
+      - name: Setup python
+        uses: actions/setup-python@v5
+        with:
+          python-version: '3.11'
+
+      - name: Install dependencies
+        run: |
+          python -m pip install -U pip
+          python -m pip install -U packaging
+
+      - name: Modify version for TestPyPI upload
+        run: python ./python/modify_version_for_testpypi.py

       - uses: eiennohito/gha-manylinux-build@master
         with:
@@ -70,7 +80,7 @@ jobs:
     strategy:
       matrix:
         os: [windows-latest, macOS-latest]
-        python-version: [ "3.8", "3.9", "3.10", "3.11", "3.12" ]
+        python-version: [ "3.9", "3.10", "3.11", "3.12", "3.13" ]

     steps:
       - uses: actions/checkout@v4
@@ -89,17 +99,17 @@ jobs:
             target/
           key: ${{ runner.os }}-cargo-${{ hashFiles('**/Cargo.lock') }}

-      - name: Make .devXX version
-        run: python ./python/latest_dev_version.py
-
-      - name: Add aarch64 target for Rust
-        run: rustup target add aarch64-apple-darwin
-        if: startsWith(matrix.os, 'macOS')
-
       - name: Install dependencies
         run: |
           python -m pip install -U pip
-          python -m pip install -U setuptools setuptools_rust build
+          python -m pip install -U setuptools setuptools_rust build packaging
+
+      - name: Modify version for TestPyPI upload
+        run: python ./python/modify_version_for_testpypi.py
+
+      - name: Add aarch64/x86 target for Rust
+        run: rustup target add aarch64-apple-darwin x86_64-apple-darwin
+        if: startsWith(matrix.os, 'macOS')

       - name: Build wheel
         working-directory: ./python
@@ -139,7 +149,7 @@ jobs:
     strategy:
       matrix:
         os: [ ubuntu-latest, windows-latest, macOS-latest ]
-        python-version: [ "3.8", "3.9", "3.10", "3.11" ]
+        python-version: [ "3.9", "3.10", "3.11", "3.12", "3.13" ]
       fail-fast: false
     runs-on: ${{ matrix.os }}

CHANGELOG.md

Lines changed: 131 additions & 100 deletions
@@ -1,148 +1,179 @@
-# [0.6.8](https://github.com/WorksApplications/sudachi.rs/releases/tag/v0.6.8) (2023-12-14)
+# Changelog

-## Highlights
+The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.1.0/).

-* Produce builds for Python 3.12 (#236)
-* Add a simple [configuration API](https://worksapplications.github.io/sudachi.rs/python/api/sudachipy.html#config-config)
-* Add surface projections (#230)
+Also check [python changelog](python/CHANGELOG.md).

-## Surface Projections
+## [Unreleased]

-* For chiTra compatibility SudachiPy can now directly produce different tokens in the surface field.
-* Original surface is accessible via [`Morheme.raw_surface()`](https://worksapplications.github.io/sudachi.rs/python/api/sudachipy.html#sudachipy.Morpheme.raw_surface) method
-* It is possible to customize projection dictionary-wise, via Config object, passing it on a dictionary creation, or for a single pre-tokenizer.
-* [Config API](https://worksapplications.github.io/sudachi.rs/python/api/sudachipy.html#sudachipy.config.Config.projection)
-* [Pretokenizer API](https://worksapplications.github.io/sudachi.rs/python/api/sudachipy.html#sudachipy.Dictionary.pre_tokenizer)
+## [0.6.9](https://github.com/WorksApplications/sudachi.rs/releases/tag/v0.6.9) (2024-11-20)

-# [0.6.7](https://github.com/WorksApplications/sudachi.rs/releases/tag/v0.6.7) (2023-02-16)
+### Added

-## Highlights
+- freebsd support (#222 by @KonstantinDjairo, #251)
+- Add rust minimum support version (#255)
+- Add option for embedded config and fallback resources (#262 by @Kuuuube)

-* Provide binary wheels for Python 3.11
-* Add `Dictionary.lookup()` method which allows you to enumerate morphemes from the dictionary without performing analysis.
+### Changed

-# [0.6.6](https://github.com/WorksApplications/sudachi.rs/releases/tag/v0.6.6) (2022-07-25)
+- `fetch_dictionary.sh` targets latest dictionary by default (#240)
+- update dependencies (#241, #246)
+- Migrate from structopt to clap (#248 by @tkhshtsh0917)

-## Highlights
-* Add [boundary matching mode](https://github.com/WorksApplications/Sudachi/blob/develop/docs/oov_handlers.md) to regex oov handler
-* macOS binary builds are now unversal2 (arm+x64)
+## [0.6.8](https://github.com/WorksApplications/sudachi.rs/releases/tag/v0.6.8) (2023-12-14)

-## MacOS
-* Binary builds are universal2
-* Caveat: we don't run tests on arm because there are no public arm instances, so builds may be broken without any warning
+### Highlights

-# [0.6.5](https://github.com/WorksApplications/sudachi.rs/releases/tag/v0.6.5) (2022-06-21)
+- Produce builds for Python 3.12 (#236)
+- Add a simple [configuration API](https://worksapplications.github.io/sudachi.rs/python/api/sudachipy.html#config-config)
+- Add surface projections (#230)

-## Highlights
+### Surface Projections

-* Fixed invalid POS tags which appeared when using user-defined POS tags both in user dictionaries and OOV handlers.
+- For chiTra compatibility SudachiPy can now directly produce different tokens in the surface field.
+- Original surface is accessible via [`Morheme.raw_surface()`](https://worksapplications.github.io/sudachi.rs/python/api/sudachipy.html#sudachipy.Morpheme.raw_surface) method
+- It is possible to customize projection dictionary-wise, via Config object, passing it on a dictionary creation, or for a single pre-tokenizer.
+- [Config API](https://worksapplications.github.io/sudachi.rs/python/api/sudachipy.html#sudachipy.config.Config.projection)
+- [Pretokenizer API](https://worksapplications.github.io/sudachi.rs/python/api/sudachipy.html#sudachipy.Dictionary.pre_tokenizer)
+
+## [0.6.7](https://github.com/WorksApplications/sudachi.rs/releases/tag/v0.6.7) (2023-02-16)
+
+### Highlights
+
+- Provide binary wheels for Python 3.11
+- Add `Dictionary.lookup()` method which allows you to enumerate morphemes from the dictionary without performing analysis.
+
+## [0.6.6](https://github.com/WorksApplications/sudachi.rs/releases/tag/v0.6.6) (2022-07-25)
+
+### Highlights
+
+- Add [boundary matching mode](https://github.com/WorksApplications/Sudachi/blob/develop/docs/oov_handlers.md) to regex oov handler
+- macOS binary builds are now unversal2 (arm+x64)
+
+### MacOS
+
+- Binary builds are universal2
+- Caveat: we don't run tests on arm because there are no public arm instances, so builds may be broken without any warning
+
+## [0.6.5](https://github.com/WorksApplications/sudachi.rs/releases/tag/v0.6.5) (2022-06-21)
+
+### Highlights
+
+- Fixed invalid POS tags which appeared when using user-defined POS tags both in user dictionaries and OOV handlers.
 You are not affected by this bug if you did not use user-defined POS in OOV handlers.

-# [0.6.4](https://github.com/WorksApplications/sudachi.rs/releases/tag/v0.6.3) (2022-06-16)
+## [0.6.4](https://github.com/WorksApplications/sudachi.rs/releases/tag/v0.6.4) (2022-06-16)

-## Highlights
+### Highlights

-* Remove Python 3.6 support which reached end-of-life status on [2021-12-23](https://endoflife.date/python)
-* OOV handler plugins support user-defined POS, [similar to Java version](https://github.com/WorksApplications/Sudachi/releases/tag/v0.6.0)
-* Added Regex OOV handler
+- Remove Python 3.6 support which reached end-of-life status on [2021-12-23](https://endoflife.date/python)
+- OOV handler plugins support user-defined POS, [similar to Java version](https://github.com/WorksApplications/Sudachi/releases/tag/v0.6.0)
+- Added Regex OOV handler

-## Regex OOV Handler
+### Regex OOV Handler

-* For details, see [Java version changelog](https://github.com/WorksApplications/Sudachi/releases/tag/v0.6.0)
-* In Rust/Python Regexes do not support backtracking and backreferences
-* `maxLength` setting defines maximum length in unicode codepoints, not in utf-8 bytes as in Java (will be changed to codepoints later)
+- For details, see [Java version changelog](https://github.com/WorksApplications/Sudachi/releases/tag/v0.6.0)
+- In Rust/Python Regexes do not support backtracking and backreferences
+- `maxLength` setting defines maximum length in unicode codepoints, not in utf-8 bytes as in Java (will be changed to codepoints later)

-# [0.6.3](https://github.com/WorksApplications/sudachi.rs/releases/tag/v0.6.3) (2022-02-10)
+## [0.6.3](https://github.com/WorksApplications/sudachi.rs/releases/tag/v0.6.3) (2022-02-10)

-## Highlights
+### Highlights

-* Fixed path resolution algorithm for resources. They are now resolved in the following order (first existing file wins):
+- Fixed path resolution algorithm for resources. They are now resolved in the following order (first existing file wins):
 1. Absolute paths stay as they are
 2. Relative to "path" value of the config file
 3. Relative to "resource_dir" parameter of the config object during creation
-* For SudachiPy it is the parameter of `Dictionary` constructor
+- For SudachiPy it is the parameter of `Dictionary` constructor
 4. Relative to the location of the configuration file
 5. Relative to the current directory

-## Python
+### Python
+
+- `Dictionary` now has `__repr__()` function which displays absolute paths to dictionaries in use.
+- `Dictionary` now has `pos_of()` function which returns a POS tuple for a given POS id.
+- `PosMatcher` supports set operations
+- union (`m1 | m2`)
+- intersection (`m1 & m2`)
+- difference (`m1 - m2`)
+- negation (`~m1`)
+
+## [0.6.2](https://github.com/WorksApplications/sudachi.rs/releases/tag/v0.6.2) (2021-12-09)
+
+### Fixes
+
+- Fix analysis differences with 0.5.4
+
+## [0.6.1](https://github.com/WorksApplications/sudachi.rs/releases/tag/v0.6.1) (2021-12-08)
+
+### Highlights
+
+- Added Fuzzing (see `sudachi-fuzz` subdirectory), Sudachi.rs seems to be pretty robust towards arbitrary inputs (no crashes and panics)
+- Issues like https://github.com/WorksApplications/sudachi.rs/issues/182 should never occur more
+- ~5% analysis speed improvement over 0.6.0
+- Added support for Unicode combining symbols, now Sudachi.rs/py should be much better with emoji (🎅🏾) and more complex Unicode (İstanbul)

-* `Dictionary` now has `__repr__()` function which displays absolute paths to dictionaries in use.
-* `Dictionary` now has `pos_of()` function which returns a POS tuple for a given POS id.
-* `PosMatcher` supports set operations
-* union (`m1 | m2`)
-* intersection (`m1 & m2`)
-* difference (`m1 - m2`)
-* negation (`~m1`)
+### Rust

-# [0.6.2](https://github.com/WorksApplications/sudachi.rs/releases/tag/v0.6.2) (2021-12-09)
+- Added partial dictionary read functionality, it is now possible to skip reading certain fields if they are not needed
+- Improved startup times, especially for debug builds

-## Fixes
+### Python

-* Fix analysis differences with 0.5.4
+- See [Python changelog](./python/CHANGELOG.md)

-# [0.6.1](https://github.com/WorksApplications/sudachi.rs/releases/tag/v0.6.1) (2021-12-08)
+## [0.6.0](https://github.com/WorksApplications/sudachi.rs/releases/tag/v0.6.0) (2021-11-11)

-## Highlights
-* Added Fuzzing (see `sudachi-fuzz` subdirectory), Sudachi.rs seems to be pretty robust towards arbitrary inputs (no crashes and panics)
-* Issues like https://github.com/WorksApplications/sudachi.rs/issues/182 should never occur more
-* ~5% analysis speed improvement over 0.6.0
-* Added support for Unicode combining symbols, now Sudachi.rs/py should be much better with emoji (🎅🏾) and more complex Unicode (İstanbul)
+### Highlights

-## Rust
-* Added partial dictionary read functionality, it is now possible to skip reading certain fields if they are not needed
-* Improved startup times, especially for debug builds
+- Full feature parity with Java version
+- ~15% analysis speed improvement over 0.6.0-rc1

-## Python
-* See [Python changelog](./python/CHANGELOG.md)
+### Rust

-# [0.6.0](https://github.com/WorksApplications/sudachi.rs/releases/tag/v0.6.0) (2021-11-11)
-## Highlights
-* Full feature parity with Java version
-* ~15% analysis speed improvement over 0.6.0-rc1
+- Added dictionary build functionality
+- https://github.com/WorksApplications/sudachi.rs/pull/143
+- Added an option to perform analysis without sentence splitting
+- Use it with `--split-sentences=no`

-## Rust
-* Added dictionary build functionality
-* https://github.com/WorksApplications/sudachi.rs/pull/143
-* Added an option to perform analysis without sentence splitting
-* Use it with `--split-sentences=no`
+### Python

-## Python
-* Added bindings for dictionary build (undocumented and not supported as API).
-* See https://github.com/WorksApplications/sudachi.rs/issues/157
-* `sudachipy build` and `sudachipy ubuild` should work once more
-* Report on build times and dictionary part sizes can differ from the original SudachiPy
+- Added bindings for dictionary build (undocumented and not supported as API).
+- See https://github.com/WorksApplications/sudachi.rs/issues/157
+- `sudachipy build` and `sudachipy ubuild` should work once more
+- Report on build times and dictionary part sizes can differ from the original SudachiPy

+## [0.6.0-rc1](https://github.com/WorksApplications/sudachi.rs/releases/tag/v0.6.0-rc1) (2021-10-26)

-# [0.6.0-rc1](https://github.com/WorksApplications/sudachi.rs/releases/tag/v0.6.0-rc1) (2021-10-26)
-## Highlights
+### Highlights

-* First release of Sudachi.rs
-* SudachiPy compatible Python bindings
-* ~30x speed improvement over original SudachiPy
-* Dictionary build mode will be done before 0.6.0 final (See #13)
+- First release of Sudachi.rs
+- SudachiPy compatible Python bindings
+- ~30x speed improvement over original SudachiPy
+- Dictionary build mode will be done before 0.6.0 final (See #13)

-## Rust
+### Rust

-* Analysis: feature parity with Python and Java version
-* Dictionary build is not supported in rc1
-* ~2x faster than Java version (with sentence splitting)
-* No public API at the moment (contact us if you want to use Rust version directly, internals will significantly change and names are not finalized)
+- Analysis: feature parity with Python and Java version
+- Dictionary build is not supported in rc1
+- ~2x faster than Java version (with sentence splitting)
+- No public API at the moment (contact us if you want to use Rust version directly, internals will significantly change and names are not finalized)

-## Python
+### Python

-* Mostly compatible with SudachiPy 0.5.4
-* We provide binary wheels for popular platforms
-* ~30x faster than 0.5.4
-* IgnoreYomigana input text plugin is now supported (and enabled by default)
-* We provide [binary wheels for convenience (and additional speed on Linux)](https://worksapplications.github.io/sudachi.rs/python/wheels.html)
+- Mostly compatible with SudachiPy 0.5.4
+- We provide binary wheels for popular platforms
+- ~30x faster than 0.5.4
+- IgnoreYomigana input text plugin is now supported (and enabled by default)
+- We provide [binary wheels for convenience (and additional speed on Linux)](https://worksapplications.github.io/sudachi.rs/python/wheels.html)

-## Known Issues
+### Known Issues

-* List of deprecated SudachiPy API:
-* `MorphemeList.empty(dict: Dictionary)`
-* This also needs a dictionary as an argument.
-* `Morpheme.split(mode: SplitMode)`
-* `Morpheme.get_word_info()`
-* Most of instance attributes are not exported: e.g. `Dictionary.grammar`, `Dictionary.lexicon`.
-* See [API reference page](https://worksapplications.github.io/sudachi.rs/python/) for supported APIs.
-* Dictionary Build is not supported: `sudachipy build` and `sudachipy ubuild` will not work, please use 0.5.3 in another virtual environment for the time being until the feature is implemented: #13
+- List of deprecated SudachiPy API:
+- `MorphemeList.empty(dict: Dictionary)`
+- This also needs a dictionary as an argument.
+- `Morpheme.split(mode: SplitMode)`
+- `Morpheme.get_word_info()`
+- Most of instance attributes are not exported: e.g. `Dictionary.grammar`, `Dictionary.lexicon`.
+- See [API reference page](https://worksapplications.github.io/sudachi.rs/python/) for supported APIs.
+- Dictionary Build is not supported: `sudachipy build` and `sudachipy ubuild` will not work, please use 0.5.3 in another virtual environment for the time being until the feature is implemented: #13
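
Note on the 0.6.3 entry in the CHANGELOG.md diff: the five-step resource lookup order ("first existing file wins") can be illustrated with a short Python sketch. This is not the Sudachi.rs implementation. The function name `resolve_resource`, its parameter names, and the choice to return `None` when nothing is found are all assumptions made for illustration only.

```python
from pathlib import Path
from typing import Optional


def resolve_resource(
    name: str,
    config_path_value: Optional[Path] = None,  # "path" value of the config file
    resource_dir: Optional[Path] = None,       # "resource_dir" parameter
    config_file: Optional[Path] = None,        # location of the config file itself
    cwd: Optional[Path] = None,                # current directory (defaults to Path.cwd())
) -> Optional[Path]:
    """Hypothetical sketch of the lookup order listed in the 0.6.3 entry."""
    candidate = Path(name)

    # 1. Absolute paths stay as they are.
    if candidate.is_absolute():
        return candidate

    # 2-5. Try each base directory in the documented order.
    bases = [
        config_path_value,
        resource_dir,
        config_file.parent if config_file is not None else None,
        cwd if cwd is not None else Path.cwd(),
    ]
    for base in bases:
        if base is None:
            continue
        resolved = base / candidate
        if resolved.exists():
            # First existing file wins.
            return resolved

    # Assumption: the changelog does not say what happens when nothing matches.
    return None
```

For example, when the same file name exists under both the config "path" directory and `resource_dir`, this sketch returns the copy under the config "path" directory because that base is tried first.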
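
Note on the 0.6.3 Python entry in the CHANGELOG.md diff: the four `PosMatcher` set operations (`|`, `&`, `-`, `~`) map onto Python's operator methods. The class below is a hypothetical stand-in written for illustration, not sudachipy's `PosMatcher`. It assumes a matcher is just a predicate over POS tuples, which may differ from how the real class is constructed and called.

```python
from typing import Callable, Tuple

Pos = Tuple[str, ...]


class SketchPosMatcher:
    """Hypothetical stand-in for illustration; not sudachipy's PosMatcher."""

    def __init__(self, predicate: Callable[[Pos], bool]) -> None:
        self._predicate = predicate

    def __call__(self, pos: Pos) -> bool:
        return self._predicate(pos)

    def __or__(self, other: "SketchPosMatcher") -> "SketchPosMatcher":
        # union: m1 | m2
        return SketchPosMatcher(lambda p: self(p) or other(p))

    def __and__(self, other: "SketchPosMatcher") -> "SketchPosMatcher":
        # intersection: m1 & m2
        return SketchPosMatcher(lambda p: self(p) and other(p))

    def __sub__(self, other: "SketchPosMatcher") -> "SketchPosMatcher":
        # difference: m1 - m2
        return SketchPosMatcher(lambda p: self(p) and not other(p))

    def __invert__(self) -> "SketchPosMatcher":
        # negation: ~m1
        return SketchPosMatcher(lambda p: not self(p))


# Assumed example POS tuples: nouns, and the proper-noun subset of nouns.
nouns = SketchPosMatcher(lambda p: len(p) > 0 and p[0] == "名詞")
proper = SketchPosMatcher(lambda p: len(p) > 1 and p[1] == "固有名詞")
common_nouns = nouns - proper
```

Here `common_nouns` accepts any noun that is not a proper noun, and `~nouns` accepts everything that is not a noun.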
