109 commits

Author SHA1 Message Date
Edward Loveall
5df9c44a5c
Support null text on paragraphs
I think this was an old feature on medium, but you can see examples of
null text on this post:

https://medium.com/message/the-joy-of-typing-fd8d091ab8ef
2021-10-20 20:40:43 -04:00
Edward Loveall
7166b7d834
Add SECTION_CAPTION paragraph type
This doesn't seem to be rendered on medium.com. Here's a post that has
one, but the text is nowhere on the page:
https://medium.com/message/the-joy-of-typing-fd8d091ab8ef

This help article hints that it might have been a feature at one point
that they don't allow anymore:
https://medium.com/@Medium/images-652ee60abea6
2021-10-20 20:39:58 -04:00
Edward Loveall
513d590ce3
Point source link at sr.ht project page
Instead of the git page. That way it's easier to find the mailing lists
and whatnot.
2021-10-16 16:23:15 -04:00
Edward Loveall
f7ad92f4bf
Parsing Fix: Add H2 Paragraph type
The post id 34dead42a28 contained a new paragraph type: H2. Previously
the only known header types were H3 and H4. In this case, the paragraph
doesn't actually get rendered because it's the page title which is
removed from the page nodes (see commits 6baba803 and then fba87c10).
However, if somehow an author is able to get an H2 paragraph into the
page, it will display as an <h1> just as H3 displays as <h2> and H4
displays as <h3>.
2021-10-16 16:23:15 -04:00
tnp
eaf25ef23a
Add Dockerfile 2021-10-16 10:56:15 -04:00
Martin Puppe
0c018a898a
Add support for development with Nix
This patch adds support for development with the Nix package manager. In
order to support the traditional nix-shell tool as well as the (still
experimental) Nix Flakes feature of the upcoming version of Nix, this
patch adds shell.nix *and* flake.nix/flake.lock.  Usage instructions
have been added to the README.
2021-10-15 08:56:15 -04:00
Martin Puppe
56b6d546db
Further improve proposed pattern for Redirector
This patch further improves the proposed pattern for the Redirector
extension. In contrast to the old pattern, …

* … it will redirect the URL https://medium.com.
* … it will *not* redirect URLs with top-level domains like mediumXcom.
  (This point is purely theoretical, but it makes the regular expression
  more correct and consistent.)
* … it will *not* redirect URLs like https://link.medium.com/AXEtCilplkb
  which Scribe currently cannot handle. These are shortened URLs that
  users get when they use the Twitter button on Medium to share a post.

In order to implement the last point (not matching link.medium.com), the
pattern uses negative lookbehind. This feature of regular expressions is
supported by all recent browsers for which Redirector is available
(Firefox, Chrome, Edge, Opera)[^1], including the current version of
Firefox ESR (Extended Support Release).

[^1]: https://caniuse.com/js-regexp-lookbehind
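
The three properties listed above can be checked against a sketch of such a pattern. The expression below is an illustrative assumption, not the pattern that ships in the README, and `MEDIUM_URL` and `redirectable?` are hypothetical names. It is exercised here with Crystal's Regex rather than in a browser; a fixed-length negative lookbehind like this one should behave the same in both for these inputs.

```crystal
# Hypothetical pattern (an assumption for illustration, not the README's):
# * an optional run of sub-domain labels before medium.com,
# * a negative lookbehind that rejects the link.medium.com host,
# * an escaped dot so that "mediumXcom" cannot match.
MEDIUM_URL = %r{^https?://(?:[^/]*\.)?(?<!//link\.)medium\.com(?:/.*)?$}

# Returns true when the URL is one the pattern would redirect.
def redirectable?(url : String) : Bool
  !MEDIUM_URL.match(url).nil?
end

puts redirectable?("https://medium.com")                  # => true
puts redirectable?("https://link.medium.com/AXEtCilplkb") # => false
```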
2021-10-15 08:55:26 -04:00
Edward Loveall
472b0092c8
Add mailing list for patches to README 2021-10-14 21:15:19 -04:00
Amolith
9fcf37f416
Use app_domain in Redirector example
In the current redirector example, "scribe.rip" is hardcoded as the
destination. This patch simply changes that to use the app_domain
environment variable, so people wanting to use a community instance
aren't mistakenly redirected to the main scribe.rip instance.
2021-10-14 18:10:46 -04:00
Martin Puppe
0d9170b8d6
Improve proposed pattern for Redirector extension
The old pattern matches all host names that end with medium.com. The new
pattern matches only medium.com and its sub-domains. For example, the
old pattern would have matched
https://foomedium.com/@user/post-123456abcdef.
2021-10-13 21:51:55 -04:00
Edward Loveall
e127a67c6b
Ensure gists display well at all device widths 2021-10-11 20:09:58 -04:00
Edward Loveall
91f4aae0bc
Add an example and tagline to homepage 2021-10-11 20:03:31 -04:00
Edward Loveall
bb94fb41b1
Support medium's redirectUrl query param
When a post has a gi= query param, Medium makes a global_identifier
"query". This redirects via a 307 temporary redirect to a url that
looks like this:

https://medium.com/m/global-identity?redirectUrl=https%3A%2F%2Fexample.com%2Fmy-post-000000000000

Previously, scribe looked for the Medium post id in the url's path, not
its query params, since query params can include other garbage like
medium_utm (not related to medium.com). Now it looks first for the post
id in the path, then looks to the redirectUrl as a fallback.
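
A minimal sketch of that lookup order, assuming a post id is the last 12 hex characters of a path. `POST_ID`, `id_from` and `post_id` are hypothetical names for illustration, not the ones used in the app.

```crystal
require "uri"

# Assumption: the post id is the final 12 hex characters of a URL path.
POST_ID = /([0-9a-f]{12})$/

# Returns the id found at the end of the path, or nil.
def id_from(path : String) : String?
  path.match(POST_ID).try(&.[1])
end

def post_id(url : String) : String?
  uri = URI.parse(url)

  # First choice: the id in the path itself.
  if id = id_from(uri.path)
    return id
  end

  # Fallback: the path of the (already decoded) redirectUrl query param.
  if redirect = uri.query_params["redirectUrl"]?
    return id_from(URI.parse(redirect).path)
  end

  nil
end

url = "https://medium.com/m/global-identity?redirectUrl=https%3A%2F%2Fexample.com%2Fmy-post-000000000000"
puts post_id(url) # => 000000000000
```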
2021-10-11 12:04:17 -04:00
Edward Loveall
91687bb689
Add automatic redirect instructions to homepage 2021-10-10 15:05:56 -04:00
Edward Loveall
dbddfc9cb4
Update README 2021-10-10 14:52:37 -04:00
Edward Loveall
0998a87622
Remove postgres stuff from script/setup
This app doesn't use a database so there's no point.
2021-10-10 14:52:14 -04:00
Edward Loveall
fba87c1076
Improve title parsing
The subtitle has been removed because it's difficult to find and error
prone to guess at. It is somewhat accessible from the post's
previewContent field in GraphQL but that can be truncated.
2021-10-03 18:14:46 -04:00
Edward Loveall
2808505b4e
Add instructions on how to view a post 2021-10-03 17:21:17 -04:00
Edward Loveall
aacef34a14
Accept all known medium post path types
Including:

* https://example.com/my-cool-post-123456abcdef
* https://example.com/123456abcdef
* https://medium.com/@user/my-cool-post-123456abcdef
* https://medium.com/user/my-cool-post-123456abcdef
* https://medium.com/p/my-cool-post-123456abcdef
* https://medium.com/posts/my-cool-post-123456abcdef
* https://medium.com/p/123456abcdef

Replace the domain in any of those URLs with the scribe domain and it
should resolve.
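
As a rough check, every path type listed above yields the same id when the id is taken to be the last 12 hex characters of the path. That rule and the `post_id` helper are assumptions for illustration, not the app's actual parser.

```crystal
require "uri"

# Assumption: the post id is the last 12 hex characters of the URL path.
def post_id(url : String) : String?
  URI.parse(url).path.match(/([0-9a-f]{12})$/).try(&.[1])
end

urls = [
  "https://example.com/my-cool-post-123456abcdef",
  "https://example.com/123456abcdef",
  "https://medium.com/@user/my-cool-post-123456abcdef",
  "https://medium.com/user/my-cool-post-123456abcdef",
  "https://medium.com/p/my-cool-post-123456abcdef",
  "https://medium.com/posts/my-cool-post-123456abcdef",
  "https://medium.com/p/123456abcdef",
]

# Print the id extracted from each known path type.
urls.each { |url| puts "#{url} -> #{post_id(url)}" }
```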
2021-10-03 16:45:20 -04:00
Edward Loveall
0f6a2a3e1e
Fix GitHub Gist width 2021-09-25 13:26:24 -04:00
Edward Loveall
bd56bfdd9f
Embed widths are now the same width as all content 2021-09-25 13:26:10 -04:00
Edward Loveall
561483cf9f
Link to the author's page
Right now this links to the user's medium page. It may link to an
internal page in the future.

Instead of the Page taking the author as a string, it now takes a
PostResponse::Creator object. The Articles::ShowPage then converts the
Creator (a name and user_id) to an author link.

Finally, I did some refactoring of UserAnchor (which I thought I was
going to use for this) to change its userId attribute to user_id as is
Crystal convention.
2021-09-15 16:03:36 -04:00
Edward Loveall
1c20c81d06
Fix Blockquotes
In tufte.css blockquotes should contain a <p> that holds the content
and an optional <footer> for the source of the quote. Otherwise the
block quote text is unbounded and is way too wide. This wraps the
content in a paragraph.
2021-09-15 15:25:34 -04:00
Edward Loveall
a6cafaa1fc
Render embedded content
PostResponse::Paragraphs that are of type IFRAME have extra data in the
iframe attribute to specify what's in the iframe. Not all data is the
same, however. I've identified three types and am using the new
EmbeddedConverter class to convert them:

* EmbeddedContent, the full iframe experience
* GithubGist, because medium or github treat embeds differently for
  whatever reason
* EmbeddedLink, the old style, just a link to the content. Effectively
  a fallback

The size of the original iframe is also specified as an attribute. This
code resizes it. The resizing is determined by figuring out the
width/height ratio and setting the width to 800.

EmbeddedContent can be displayed if we have an embed.ly url, which most
iframe response data has. GitHub gists are a notable exception. Gists
instead can be embedded simply by taking the gist URL and attaching .js
to the end. That becomes the iframe's src attribute.
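
The resizing and the gist URL handling described above can be sketched as follows. `EMBED_WIDTH`, `resized_height` and `gist_src` are hypothetical names, and the rounding is an assumption.

```crystal
EMBED_WIDTH = 800

# Keeps the original width/height ratio once the width is set to 800.
def resized_height(width : Int32, height : Int32) : Int32
  (height * EMBED_WIDTH / width).round.to_i
end

# A gist is embedded by appending .js to its URL.
def gist_src(gist_url : String) : String
  "#{gist_url}.js"
end

puts resized_height(400, 300) # => 600
puts gist_src("https://gist.github.com/user/abc123")
```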

The PostResponse::Paragraph's iframe attribute is nillable. Previous
code used lots of if-statements with variable bindings to work with the
possible nil values:

```crystal
if foo = obj.nillable_value
  # obj.nillable_value was not nil and foo contains the value
else
  # obj.nillable_value was nil so do something else
end
```

See https://crystal-lang.org/reference/syntax_and_semantics/if_var.html
for more info

In the EmbeddedConverter the monads library has been introduced to get
rid of at least one level of nillability. This wraps values in Maybe
which allows for a cleaner interface:

```crystal
Monads::Try(Value).new(->{ obj.nillable_value })
  .to_maybe
  .fmap(->(value : Value) {
    # do something with value
  })
  .value_or(fallback) # value was nil; fallback is a placeholder for something else
```

This worked to get the iframe attribute from a Paragraph:

```crystal
Monads::Try(PostResponse::IFrame).new(->{ paragraph.iframe })
  .to_maybe
  .fmap(->(iframe : PostResponse::IFrame) {
    # iframe is not nil! Further .fmap calls can be chained after this one.
  })
  .value_or(Empty.new)
```

iframe only has one attribute: mediaResource which contains the iframe
data. That was used to determine one of the three types above.

Finally, Tufte.css has options for iframes. They mostly look good except
for tweets which are too small and weirdly in the center of the page
which actually looks off-center. That's for another day though.
2021-09-15 15:18:08 -04:00
Edward Loveall
903f3f4b38
Add License 2021-09-12 17:34:48 -04:00
Edward Loveall
7851434952
Add script to build object file (.o) for Ubuntu
This ubuntu_server.o file then needs to be copied to the server and
linked.
2021-09-07 22:00:20 -04:00
Edward Loveall
9770ff5c7a
Add MIXTAPE_EMBED paragraph type 2021-09-07 21:13:28 -04:00
Edward Loveall
0cab1a11ed
Add home page 2021-09-06 13:38:18 -04:00
Edward Loveall
04b8d90b8f
Improve author/timestamp 2021-09-04 22:05:58 -04:00
Edward Loveall
b3166102c7
Parse medium URLs
As far as I can tell, the post id for all medium posts is always 12 hex
characters. We'll find out if that's true.
2021-09-04 21:31:48 -04:00
Edward Loveall
8939772b12
Add post creation date/time 2021-09-04 17:32:27 -04:00
Edward Loveall
c681d2e2ee
Add author to post
Instead of passing Paragraphs to the PageConverter, it now receives all
the data from the response. This has the author so it can be parsed out.
2021-09-04 17:15:30 -04:00
Edward Loveall
083abc5ef1
Add page title to <header> <title> 2021-09-04 14:44:05 -04:00
Edward Loveall
1dae8e2254
Move compression-webpack-plugin and postcss to prod 2021-08-29 17:08:55 -04:00
Edward Loveall
a5e49209a5
Move laravel-mix to production dependencies 2021-08-29 17:02:20 -04:00
Edward Loveall
d850eafbf2
Add tufte.css
These styles can also be added manually, but it's so much easier to
install them via NPM and have laravel mix take care of installing them.
2021-08-29 15:19:40 -04:00
Edward Loveall
533c297019
Only query for the attributes you need 2021-08-29 15:19:40 -04:00
Edward Loveall
6726dff526
Display figure captions as margin notes
On a thin viewport like a phone these show up as hidden at first until
the user expands them by interacting with the "writing hand" icon. Each
margin note needs a small bit of markup near it to enable the toggle.
Each also needs a unique ID to ensure it doesn't interact with
alternate content. The `hash` value of the FigureCaption's `children`
provides this unique value.
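
A sketch of that markup, following the margin-note pattern documented by tufte.css. `margin_note` is a hypothetical helper, and the children are simplified to plain strings here, which is an assumption; the real FigureCaption children are nodes.

```crystal
# Hypothetical helper: builds the toggle markup tufte.css expects for a
# margin note. The label and checkbox share an id derived from `hash`.
def margin_note(children : Array(String)) : String
  id = "mn-#{children.hash}"
  <<-HTML
  <label for="#{id}" class="margin-toggle">&#9997;</label>
  <input type="checkbox" id="#{id}" class="margin-toggle"/>
  <span class="marginnote">#{children.join}</span>
  HTML
end

puts margin_note(["A caption for the figure"])
```

The `hash` value is only stable within one run of the program, which is enough to keep the ids on a single rendered page from colliding.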
2021-08-29 15:19:40 -04:00
Edward Loveall
6baba80309
Display title and subtitle
Also wrap the content in an article for semantic formatting

tufte.css requires that content is wrapped in an <article> and at least
one <section>. There's no way of determining new semantic sections so
there is only one.
2021-08-29 15:19:39 -04:00
Edward Loveall
05c18f6451
Extract title and subtitle from initial paragraphs
Medium guides each post to have a Title and Subtitle. They are rendered
as the first two paragraphs: H3 and H4 respectively. If they exist, a
new PageConverter class extracts them and sets them on the page.

However, they aren't required. If the first two paragraphs aren't H3
and H4, the PageConverter falls back to using the first paragraph as
the title, and setting the subtitle to blank.

The remaining paragraphs are passed into the ParagraphConverter as
normal.
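
The fallback described above can be sketched like this. `Paragraph`, `Titles` and `extract_titles` are simplified, hypothetical stand-ins for the real PostResponse and PageConverter types, and it assumes the title paragraph is dropped from the remainder in both cases.

```crystal
# Hypothetical, simplified types for illustration only.
record Paragraph, kind : String, text : String
record Titles, title : String, subtitle : String, rest : Array(Paragraph)

def extract_titles(paragraphs : Array(Paragraph)) : Titles
  first = paragraphs[0]?
  second = paragraphs[1]?

  if first && second && first.kind == "H3" && second.kind == "H4"
    # Both present: H3 is the title, H4 is the subtitle.
    Titles.new(first.text, second.text, paragraphs[2..])
  elsif first
    # Fallback: the first paragraph is the title, the subtitle is blank.
    Titles.new(first.text, "", paragraphs[1..])
  else
    Titles.new("", "", paragraphs)
  end
end

page = extract_titles([
  Paragraph.new("H3", "Title"),
  Paragraph.new("H4", "Subtitle"),
  Paragraph.new("P", "Body"),
])
puts page.title    # => Title
puts page.subtitle # => Subtitle
```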
2021-08-29 15:19:39 -04:00
Edward Loveall
f48f7c2932
Use H2/H3 instead of H3/H4 respectively
General CSS hygiene dictates that you shouldn't go beyond an H3 tag. H1
for the document title, H2 for section headings, and H3 for low-level
headings.
2021-08-14 16:12:01 -04:00
Edward Loveall
5c05086cbd
Don't render image heights explicitly
The CSS itself will take care of scaling the image height based on the
width. We still need to know the height to fetch the image because the
height is in the URL, but we don't need to render it in the HTML.
2021-08-14 16:07:31 -04:00
Edward Loveall
bf43c7f467
Add PQ (pullquote) type
This appears for something like medium's "top highlight". It's like a
blockquote but bigger
2021-08-08 18:18:07 -04:00
Edward Loveall
e64e9f0853
Use href from iframe media response
Turns out, href exists in the mediaResponse query. I can use that
instead of fetching that separately.
2021-08-08 16:49:02 -04:00
Edward Loveall
09995cde5c
Overlapping refactor
Example:

* Text: "strong and emphasized only"
* Markups:
  * Strong: 0..10
  * Emphasis: 7..21

First, get all the borders of the markups, including the start (0) and
end (text.size) indexes of the text in order:

```
[0, 7, 10, 21, 26]
```

Then attach markups to each range. Note that the ranges are exclusive;
they don't include the final number:

* 0...7: Strong
* 7...10: Strong, Emphasized
* 10...21: Emphasized
* 21...26: N/A

Bundle each range and its related markups into a value object
RangeWithMarkup and return the list.

Loop through that list and recursively apply each markup to each
segment of text:

* Apply a `Strong` markup to the text "strong "
* Apply a `Strong` markup to the text "and"
  * Wrap that in an `Emphasis` markup
* Apply an `Emphasis` markup to the text " emphasized"
* Leave the text " only" as is
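
The steps above can be sketched as follows. The types are simplified, hypothetical stand-ins (the real markups are nodes, not tag names), and `finish` is exclusive like the ranges above.

```crystal
# Hypothetical, simplified types for illustration only.
record Markup, tag : String, start : Int32, finish : Int32
record RangeWithMarkup, range : Range(Int32, Int32), markups : Array(Markup)

def split_ranges(text : String, markups : Array(Markup)) : Array(RangeWithMarkup)
  # Collect every border, including the start and end of the text.
  borders = [0, text.size]
  markups.each { |markup| borders << markup.start << markup.finish }

  # Pair neighbouring borders into exclusive ranges and attach the
  # markups that cover each range.
  ranges = [] of RangeWithMarkup
  borders.uniq.sort.each_cons(2) do |pair|
    start, finish = pair[0], pair[1]
    covering = markups.select { |m| m.start <= start && finish <= m.finish }
    ranges << RangeWithMarkup.new(start...finish, covering)
  end
  ranges
end

def render(text : String, markups : Array(Markup)) : String
  split_ranges(text, markups).map do |segment|
    # Wrap the segment in each covering markup, innermost first.
    segment.markups.reduce(text[segment.range]) do |html, markup|
      "<#{markup.tag}>#{html}</#{markup.tag}>"
    end
  end.join
end

markups = [Markup.new("strong", 0, 10), Markup.new("em", 7, 21)]
puts render("strong and emphasized only", markups)
# => <strong>strong </strong><em><strong>and</strong></em><em> emphasized</em> only
```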

---

This has the side effect of breaking up the nodes more than they need
to be broken up. For example right now the algorithm creates this HTML:

```
<strong>strong </strong><em><strong>and</strong></em>
```

instead of:

```
<strong>strong <em>and</em></strong>
```

But that's a task for another day.
2021-08-08 15:08:43 -04:00
Edward Loveall
31f7d6956c
Anchor and UserAnchor nodes can contain children
The impetus for this change was to help make the MarkupConverter code
more robust. However, it's also possible that an Anchor can contain
styled text. For example, in markdown someone might write a link that
contains some <strong> text:

```markdown
[this link is so **good**](https://example.com)
```

This setup will now allow that. Unknown if UserAnchor can ever contain
any text that isn't just the user's name, but it's easy to deal with
and makes the typing much easier.
2021-08-08 14:34:40 -04:00
Edward Loveall
130b235a6c
crystal tool format 2021-08-08 14:23:38 -04:00
Edward Loveall
210f212116
Add .nova to gitignore
To enable the crystal formatting, the extension saves the crystal path
to a .nova folder. These paths are specific to my computer so I don't
need to store them in the repo
2021-08-08 14:22:34 -04:00
Edward Loveall
7cda16cef1
Show the host for the iframe link
Instead of showing only: Click to visit embedded content

An embedded link now displays with the domain it's linking to: Embedded
content at example.com

This hopefully breaks up the links a bit so it's easier to distinguish
between a bunch of them in a row (as long as they are on different
domains).
2021-07-05 15:36:38 -04:00
Edward Loveall
d863cc27a5
Fetch the resized image
Instead of getting the full size image, the image can be fetched with a
width and height parameter so that only the resized data is
transferred. The url looks like this:

https://cdn-images-1.medium.com/fit/c/<width>/<height>/<media-id>

I picked a max image width of 800px. If the image width is more than
that, it scales the width down to 800, then applies that ratio to the
height. If it's smaller than that, the image is displayed as the
original.
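
A minimal sketch of that calculation. `MAX_WIDTH`, `resized_dimensions` and `image_url` are hypothetical names, and the rounding is an assumption.

```crystal
MAX_WIDTH = 800

# Scales the width down to MAX_WIDTH and applies the same ratio to the
# height. Images that are already narrow enough keep their size.
def resized_dimensions(width : Int32, height : Int32) : Tuple(Int32, Int32)
  return {width, height} if width <= MAX_WIDTH

  ratio = MAX_WIDTH / width
  {MAX_WIDTH, (height * ratio).round.to_i}
end

def image_url(media_id : String, width : Int32, height : Int32) : String
  w, h = resized_dimensions(width, height)
  "https://cdn-images-1.medium.com/fit/c/#{w}/#{h}/#{media_id}"
end

puts image_url("abc", 1600, 900) # => https://cdn-images-1.medium.com/fit/c/800/450/abc
```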
2021-07-05 14:56:10 -04:00