scribe/CHANGELOG
Edward Loveall 7d0bc37efd
Fix markup errors caused by UTF-16/8 differences
Medium uses UTF-16 character offsets (likely to make it easier to parse
in JavaScript) but Crystal uses UTF-8. Converting strings to UTF-16 for
offset calculation, then back to UTF-8, fixes some markup bugs.

---

Medium calculates markup offsets using UTF-16 encoding. Some characters,
like emoji, count as multiple code units, which affects those offsets.
For example, in UTF-16 💸 takes two code units, but Crystal strings count
it as one character. This is a problem for markup generation because it
can shift the markup and even cause out-of-range errors.

Take the following example:

💸💸!

Imagine that `!` is bold but the emoji aren't. For Crystal, the bold
text starts at char index 2 and ends at char index 3. Medium's markup
will say it goes from character 4 to 5. In a 3-character string like
this, trying to access character range 4...5 is an error because both
4 and 5 are past the end of the string.

My theory is that this is meant to be compatible with JavaScript's
string length calculations, as Medium is primarily a platform built for
the web:

```js
"a".length // 1
"💸".length // 2
"👩‍❤️‍💋‍👩".length // 11
```
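
The same check can be run on the `💸💸!` example from above. This is a
small sketch, not code from Scribe, and the variable names are made up
for illustration. It suggests that offsets 4 to 5 are valid when
counting UTF-16 code units, while only three characters exist when
counting code points:

```js
const text = "💸💸!";

// JavaScript strings are indexed by UTF-16 code units,
// so each 💸 occupies two positions.
const codeUnits = text.length; // 5

// Array.from iterates by code point, which matches how
// Crystal counts the characters in this string.
const codePoints = Array.from(text).length; // 3

// The markup range 4...5 selects the "!" when counting code units.
const bold = text.slice(4, 5); // "!"

console.log(codeUnits, codePoints, bold);
```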

To get these same numbers in Crystal, strings must be converted to
UTF-16:

```crystal
"a".to_utf16.size # 1
"💸".to_utf16.size # 2
"👩‍❤️‍💋‍👩".to_utf16.size # 11
```

The MarkupConverter now converts text into UTF-16 code unit arrays on
initialization. Once it has figured out the range of code units needed
for each piece of markup, it converts that range back into a UTF-8
string.
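
That round trip can be sketched in JavaScript as well. This is not the
MarkupConverter's actual code: `toCodeUnits`, `fromCodeUnits`, and
`sliceByMarkup` are hypothetical helpers. The explicit conversion only
mirrors what Crystal needs, since JavaScript strings are already
indexed by UTF-16 code units.

```js
// Hypothetical helper: copy a string into an array of UTF-16 code units.
function toCodeUnits(text) {
  const units = new Uint16Array(text.length);
  for (let i = 0; i < text.length; i++) {
    units[i] = text.charCodeAt(i);
  }
  return units;
}

// Hypothetical helper: turn a run of code units back into a string.
function fromCodeUnits(units) {
  return String.fromCharCode(...units);
}

// Cut out the piece of text covered by one markup range,
// where start and end are UTF-16 offsets (end exclusive).
function sliceByMarkup(text, start, end) {
  const units = toCodeUnits(text);
  return fromCodeUnits(units.subarray(start, end));
}

console.log(sliceByMarkup("💸💸!", 4, 5)); // !
console.log(sliceByMarkup("💸💸!", 0, 4)); // 💸💸
```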
2022-01-30 11:53:22 -05:00

2022-01-30
* Fix bug in markup generation for text with characters that span multiple UTF-16 code units
2022-01-29
* Provide list of instances as JSON file
2022-01-23
* Proxy GitHub gists with rate limiting
* Add CHANGELOG
2022-01-16
* Add scribe.bus-hit.me instance
2022-01-15
* Add instructions for Lucky config variables
* Ensure that src/version is up-to-date when building
2022-01-08
* Use FAQ entry to explain custom domains
2022-01-04
* Improve Redirector extension instructions
* Home page instructions for custom domains
* Add visible version
2021-12-16
* Update Crystal version in Dockerfile
2021-12-12
* Upgrade Crystal to 1.2.1 and Lucky to 0.29.0
2021-12-04
* Add FAQ on how to use Scribe with custom domains
2021-11-20
* Add citizen4.eu instance
* Update readme
2021-11-11
* Add instance docs
2021-11-07
* Add project goals to README
2021-11-06
* Support null image widths and heights
2021-10-23
* Add FAQ
2021-10-20
* Add SECTION_CAPTION paragraph type
* Support null text on paragraphs
2021-10-16
* Parsing Fix: Add H2 Paragraph type
* Add Dockerfile
* Point source link at sr.ht project page
2021-10-15
* Further improve proposed pattern for Redirector
* Add support for development with Nix
2021-10-14
* Use app_domain in Redirector example
* Add mailing list for patches to README
2021-10-12
* Improve proposed pattern for Redirector extension
2021-10-11
* README updates
* Support Medium's redirectUrl query param
* Homepage Example
2021-10-03
* Initial release