`is_type_comment` now specifically deals with general type comments for a leaf.
`is_type_ignore_comment` now handles type comments contains ignore annotation for a leaf
`is_type_ignore_comment_string` used to determine if a string is an ignore type comment
Avoids a Python 3.12 deprecation warning.
Subtle difference: previously, timestamps in diff filenames had the
`+0000` separated from the timestamp by space. With this, the space is
there no more, and there is a colon, as in `+00:00`.
This patch changes the preview style so that string splitters respect
Unicode East Asian Width[^1] property. If you are not familiar to CJK
languages it is not clear immediately. Let me elaborate with some
examples.
Traditionally, East Asian characters (including punctuation) have
taken up space twice than European letters and stops when they are
rendered in monospace typeset. Compare the following characters:
```
abcdefg.
글、字。
```
The characters at the first line are half-width, and the second line
are full-width. (Also note that the last character with a small
circle, the East Asian period, is also full-width.) Therefore, if we
want to prevent those full-width characters to exceed the maximum
columns per line, we need to count their *width* rather than the number
of characters. Again, the following characters:
```
글、字。
```
These are just 4 characters, but their total width is 8.
Suppose we want to maintain up to 4 columns per line with the following
text:
```
abcdefg.
글、字。
```
How should it be then? We want it to look like:
```
abcd
efg.
글、
字。
```
However, Black currently turns it into like this:
```
abcd
efg.
글、字。
```
It's because Black currently counts the number of characters in the line
instead of measuring their width. So, how could we measure the width?
How can we tell if a character is full- or half-width? What if half-width
characters and full-width ones are mixed in a line? That's why Unicode
defined an attribute named `East_Asian_Width`. Unicode grouped every
single character according to their width in fixed-width typeset.
This partially addresses #1197, but only for string splitters. The other
parts need to be fixed as well in future patches.
This was implemented by copying rich's own approach to handling wide
characters: generate a table using wcwidth, check it into source
control, and use in to drive helper functions in Black's logic. This
gets us the best of both worlds: accuracy and performance (and let's us
update as per our stability policy too!).
Co-authored-by: Jelle Zijlstra <jelle.zijlstra@gmail.com>
Fixes#3506
We can't simply escape the quotes in a naked f-string when merging string groups, because backslashes are invalid.
The quotes in f-string expressions should be toggled (this is safe since quotes can't be reused).
This fix also means implicitly concatenated f-strings with different quotes can now be merged or quote-normalized by changing the quotes used in expressions. e.g.:
```diff
raise sa_exc.UnboundExecutionError(
"Could not locate a bind configured on "
- f'{", ".join(context)} or this Session.'
+ f"{', '.join(context)} or this Session."
)
```
When trying to format a project from the outside, the verbose output
shows says that there are symbolic links that points outside of the
project, but displays the wrong project path, meaning that these
messages are false positives.
This bug is triggered when the command is executed from outside a
project on a folder inside it, causing an inconsistency between the
path to the detected project root and the relative path to the target
contents.
The fix is to normalize the target path using the project root before
processing the sources, which removes the presence of the incorrect
messages.
---
The test attemps to emulate the behavior of the CLI as closely as
posible by patching some `pathlib.Path` methods and passing certain
reference paths to the context object and `black.get_sources`.
Before the associated fix was introduced, this test failed because
some of the captured files reported the presence of a symlink due to
an incorrectly formated path. The test also asserts that only a single
file is reported as ignored, which is part of the expected behavior.
Signed-off-by: Antonio Ossa Guerra <aaossa@uc.cl>
Co-authored-by: Jordan Ephron <JEphron@users.noreply.github.com>
Co-authored-by: Richard Si <63936253+ichard26@users.noreply.github.com>
Co-authored-by: Jelle Zijlstra <jelle.zijlstra@gmail.com>
The bug is in the `get_leaves_inside_matching_brackets` on the third line below:
```python
assert xxxxxxxxx.xxxxxxxxx.xxxxxxxxx(
xxxxxxxxx
).xxxxxxxxxxxxxxxxxx(), (
"xxx {xxxxxxxxx} xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx"
)
```
Including the invisible paren, third line is `).xxxxxxxxxxxxxxxxxx()), (`, that it has a matched pair then an unmatched closing paren afterwards. This PR ensures the returned leaves are actually matched.
Fixes#3414.
Currently, empty and whitespace-only (with or without newlines) are
not modified. In some discussions (issues and pull requests) consensus
was to reformat whitespace-only files to empty or single-character
files, preserving line endings when possible. With that said, this
commit introduces the following behaviors:
* Empty files are left as is
* Whitespace-only files (no newline) reformat into empty files
* Whitespace-only files (1 or more newlines) reformat into a single
newline character
To implement these changes, we moved the initial check at
`format_file_contents` that raises `NothingChanged` if the source
(with no whitespaces) is an empty string. In the case of *.ipynb
files, `format_ipynb_string` checks a similar condition and removed
whitespaces. In the case of Python files, `format_str_once` includes a
check on the output that returns the correct newline character if
possible or an empty string otherwise.
Signed-off-by: Antonio Ossa Guerra <aaossa@uc.cl>